Comments on Doing Bayesian Data Analysis: "Prior for normality (df) parameter in t distribution"

John K. Kruschke (2015-12-02):
See the post about the updated programs here: http://doingbayesiandataanalysis.blogspot.com/2015/12/prior-on-df-normality-parameter-in-t.html

aloctavodia (2015-11-16):
Maybe the whole nu < 1 thing is a frequentist contamination! :-)

Probably restricting nu to values above 1 is a good idea: on the one hand, I think it is easier on the sampler, and on the other, as you say, in general we are interested in the mu and sigma parameters, not the nu parameter.

I hope the guy on the other post agrees to expand his comment.

John K. Kruschke (2015-11-16):
Hmmm... Past self had learned that nu ranged from 1 to infinity (not from 0 to infinity).
That "knowledge" found its way into both editions of the book. After all these years, you are the first person to question the restriction! Present self recognizes that past self was just plain wrong.

The fact that the error has (apparently) made virtually no difference in any practical applications reflects the fact that most applications focus on the mu and sigma parameters, not the nu parameter, AND in most applications the kurtosis is moderate, not extreme. Therefore the bias in the estimate of nu is (a) mild and (b) not very consequential for the estimates of mu and sigma.

I will go through all the programs and modify them. Basically, any program whose file name contains "-Mrobust" will need some attention. I want to be sure that the MCMC sampling won't break if nu is ever probed at zero or values very close to zero. The only other concern is that the plots of the posterior use log10(nu), which will break if nu is actually zero (or maybe very close to zero). Hopefully I can do this this week...

Thanks for bringing this to my attention.

(By the way, the comment about "very important to be smaller than 1" is strange to me; it would depend very much on how extreme the non-normality of the data actually is. I would think that it's very important to allow nu < 1 only if the data have extreme non-normality that can only be adequately captured with nu < 1. I suspect that this sort of data is extremely rare, but of course that depends on the domain of application.)
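[Editor's note: a quick scipy sketch, not code from the book, of how extreme the tails get as nu drops below 1 — the kind of non-normality that would be needed to justify allowing nu < 1.]

```python
from scipy import stats

# Tail mass P(|T| > 10) under a Student-t for several nu values.
# The tails fatten dramatically as nu drops below 1; for moderate
# nu the mass beyond 10 scale units is essentially negligible.
tails = {nu: 2 * stats.t.sf(10.0, df=nu) for nu in (0.5, 1.0, 4.0, 30.0)}
for nu, p in tails.items():
    print(f"nu = {nu:>4}: P(|T| > 10) = {p:.5f}")
```

At nu = 1 (the Cauchy) roughly 6% of the mass already lies beyond |T| = 10, and below 1 it grows further, which matches the observation that only extremely heavy-tailed data could demand nu < 1.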
aloctavodia (2015-11-16):
Hi,

Thanks for your answer. I found the Gelman post you mention:

http://andrewgelman.com/2015/05/17/do-we-have-any-recommendations-for-priors-for-student_ts-degrees-of-freedom-parameter/

Why do you say that the df parameter in a t distribution is not defined below 1? Is it not defined in the sense that it is no longer a "degree of freedom"? The post starts with the suggestion to use the prior gamma(2, 0.1), and this will give you values below 1, right? Also, in Gelman's post there is a comment that states "It is unnecessary to allow nu to be larger than 30 or 50, but it is very important to allow nu to be smaller than 1." A very surprising comment!

John K. Kruschke (2015-11-12):
Yes, I still think that a shifted-exponential prior with mean 30 is a good choice for practical applications.

Nu (the df parameter in a t distribution) is not defined below 1, so whatever prior you choose, it should not allow values below 1.

There was some discussion on Gelman's blog about appropriate priors for the df parameter in a t distribution, but now I can't find the thread. Maybe you can. Basically, it's about an article that shows the exponential prior works reasonably well, although it does shrink extreme df values toward the mean a little bit.
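[Editor's note: a concrete sketch of that shifted-exponential prior, assuming the shift places the lower bound at nu = 1, so the exponential part needs mean 29 to make the overall mean 30.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Shifted-exponential prior: nu = 1 + Exponential(mean = 29),
# so E[nu] = 30 and nu never falls below 1 (which also keeps
# log10(nu) plots of the posterior well-defined).
nu = 1.0 + rng.exponential(scale=29.0, size=100_000)

print(nu.min() >= 1.0)                  # True: support is [1, inf)
print(f"mean of nu draws: {nu.mean():.1f}")
```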
aloctavodia (2015-11-12):
Hi John,

I would like to know your opinion on these (and maybe other) priors for the normality parameter, and whether you still think that a shifted-exponential with mean 30 is a good choice (as you recommend in the second edition of your book).

* gamma(2, 0.1) (i.e., mean = 20)

* nu.y <- 1/nu0
  nu0 ~ dunif(0, .5) (i.e., uniform between 2 and infinity)

It seems that some people would like to avoid nu < 1, others nu < 2, and others have no problem with values below 1.

Theoretically, I guess that avoiding values below 1 is related to the mean being undefined, and avoiding values below 2 to the variance being undefined.

From a computational point of view (at least in PyMC3) I have found that it is better to avoid values below 1; otherwise the sampler keeps jumping to values far away from the posterior mean and the sampling process gets really slow (using Metropolis or NUTS). I guess this is the result of allowing "ridiculously" fat tails when nu < 1.

Thanks in advance.

Rasmus Bååth (2014-08-13):
Oh, don't get me started on Past self. He is this really irritating guy who makes these strange decisions, never does the dishes, and never comments his code...

John K. Kruschke (2014-08-12):
Why did the book do it the strange indirect way? I have no idea. Past self did not leave enough bread crumbs for present self to follow. Maybe some motivation can be constructed post hoc.
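[Editor's note: a quick numpy/scipy sketch, not code from any of the discussed implementations, of how the two proposed priors treat the region below nu = 1.]

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# gamma(shape=2, rate=0.1): mean = shape/rate = 20.
# (scipy parameterizes by scale = 1/rate.)
gamma_prior = stats.gamma(a=2, scale=1 / 0.1)
p_below_1 = gamma_prior.cdf(1.0)   # prior mass on nu < 1
print(f"P(nu < 1) under gamma(2, 0.1) = {p_below_1:.4f}")

# nu = 1/nu0 with nu0 ~ Uniform(0, 0.5): support is (2, inf),
# so values below 2 (and hence below 1) are excluded entirely.
nu = 1.0 / rng.uniform(0.0, 0.5, size=100_000)
print(nu.min() > 2.0)              # True
```

So the gamma(2, 0.1) prior does put a little mass (about half a percent) below 1, while the reciprocal-uniform construction rules out everything below 2.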
Daniel Heck (2014-08-11):
Just a note: to save some computations, it's possible to replace log(1-udf) with log(udf), since 1-udf and udf are both distributed uniformly on [0,1].

Not that it matters so much today...

Rasmus Bååth (2014-08-10):
So is there a reason why you use one version in the book and another version in the paper? Performance differences?
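[Editor's note: the log(1-udf) vs. log(udf) point can be checked numerically; this sketch uses a stand-in uniform draw for udf.]

```python
import numpy as np

rng = np.random.default_rng(0)
udf = rng.uniform(size=200_000)   # stand-in for the model's uniform draw

# If udf ~ Uniform(0,1), then 1 - udf is also Uniform(0,1), so
# -log(1 - udf) and -log(udf) are identically distributed (both
# Exponential(1)); either one can drive the indirect prior on nu.
a = -np.log(1.0 - udf)
b = -np.log(udf)

print(f"means: {a.mean():.2f} vs {b.mean():.2f}")          # both near 1
print(f"95th pct: {np.quantile(a, 0.95):.2f} vs {np.quantile(b, 0.95):.2f}")
```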