Saturday, March 22, 2014

Potpourri of recent inquiries about doing Bayesian data analysis


I get a lot of email asking about the book and the blog. Too often, I say to myself, "That's interesting. I'll have to think about it and reply later." And then later never comes. My apologies to everyone to whom I have not yet replied. Below is a sample of the messages I've received in the last few days, this time actually with replies.

Topics:
  • Initializing chains for multi-factor ANOVA-style models.
  • Errors in BUGS (another reason to use JAGS).
  • Getting more decimal places for the HDI limits than appear in the graphs.
  • Effect size measure in ANOVA-style log-linear models.
  • Variable selection in multiple regression.



March 22, 2014:
Dear Prof. Kruschke,

By now I've opened and closed your contact form many times, and each time I've been able to tackle the problem at hand using your book. Now, though, I am stumped. I am setting up a Poisson exponential "ANOVA" with 4 nominal predictors, and the code for initializing the chains troubles me. For that matter, my question applies to any situation in which more than two nominal predictors are present. The code to initialize the linear predictor sum is this:

linpred = as.vector( outer( a1 , a2 , "+" ) + a0 )

I understand the concept of the outer() function for two variables, but could you please explain how you would deal with adding a3 and a4 into this mix, assuming the rest of the code has been updated accordingly?

Kind regards,

Reply:

Thanks very much for your interest in the book.

One approach is not to initialize the chains manually at all, and instead let JAGS initialize them for you. You then have to check that the chains have burned in well and converged, but if it works, problem solved.

But if you really want to manually initialize, you have a few choices. One option is not to bother with the interaction terms. That is, initialize the main effects with lines like
a0 = mean( theData$y )                                             # grand mean as baseline
a1 = aggregate( theData$y , list( theData$x1 ) , mean )[,2] - a0   # factor-1 deflections
a2 = aggregate( theData$y , list( theData$x2 ) , mean )[,2] - a0   # factor-2 deflections

and analogously for a3 and a4, but leave the interaction terms all initialized at zero.

If you really, really want to initialize the interactions, you can use the outer() function to build higher dimensional arrays. To get help, type
?outer
at R's command line. As an example, consider this:

> a=1:2
> b=3:5
> c=6:9
> outer(a,b,"+")
     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    5    6    7
> outer(c,outer(a,b,"+"),"+")
, , 1

     [,1] [,2]
[1,]   10   11
[2,]   11   12
[3,]   12   13
[4,]   13   14

, , 2

     [,1] [,2]
[1,]   11   12
[2,]   12   13
[3,]   13   14
[4,]   14   15

, , 3

     [,1] [,2]
[1,]   12   13
[2,]   13   14
[3,]   14   15
[4,]   15   16
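
So, with four predictors, the same idea nests outer() calls to build a four-dimensional array of all the main-effect sums. Here is a minimal sketch that extends the initialization above (the names theData$x3 and theData$x4 are my assumptions about your data; you should also check that the ordering produced by as.vector() matches how linpred is indexed in your model):

a3 = aggregate( theData$y , list( theData$x3 ) , mean )[,2] - a0   # factor-3 deflections
a4 = aggregate( theData$y , list( theData$x4 ) , mean )[,2] - a0   # factor-4 deflections
# Nested outer() calls yield an n1 x n2 x n3 x n4 array with elements
# a1[i]+a2[j]+a3[k]+a4[l], to which the baseline a0 is added:
linpred = as.vector( outer( outer( outer( a1 , a2 , "+" ) , a3 , "+" ) , a4 , "+" ) + a0 )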


Hope that helps!
--John


March 20, 2014:
Hi John,

First of all I would like to mention that I am really enjoying reading the book.

I have studied a bit of Bayesian statistics in a university course, and I find it really interesting to see real-world applications of MCMC.

I have been doing the exercises okay, but I have come across a stumbling block in Chapter 9.

In Exercise 9.2, when I change the data parameters as per the exercise solutions, I get the following messages. These messages also appear when I run the FilconBrugs.R script in its original form:

'Initializing chain 1:
initial values loaded and chain initialized but another chain contain uninitialized variables
Initializing chain 2:
initial values loaded and chain initialized but another chain contain uninitialized variables
Initializing chain 3:
model is initialized'

After this, I encounter the following error:

"update error for node <theta[4]> algorithm conjugate beta updater error argument two  has too small a value"

Basically, the model does not produce any posterior. If you could let me know what the error could be due to, it would be much appreciated. I'm using RStudio, and I haven't come across any errors in the rest of the exercises and code that I have run so far.

Kind Regards,

Reply:

Thanks very much for your interest in the book.

First off, I strongly recommend using JAGS instead of BUGS. I have repeatedly found that BUGS runs into trouble where JAGS works fine. Here is a link to a blog post about using JAGS instead of BUGS:
http://doingbayesiandataanalysis.blogspot.com/2012/01/complete-steps-for-installing-software.html

Both of the messages you quote are from BUGS. The statements about initializing the chains are not errors, but merely status reports: BUGS is simply going through the chains and initializing them, letting you know that after chain 1 there are still two more chains to go, and after chain 2 there is still one more chain to go.

The bit about "update error for node..." is a genuine error. As I recall, I do not get that sort of error in JAGS. There are work-arounds in BUGS, such as truncating the distribution or the values of higher-level parameters, as in the sketch below. But give JAGS a try and see if that solves the problem.
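
In BUGS model language, truncation is written with I(lower,upper). Here is a sketch of the idea only (the names mu and kappa follow the book's filcon model, but these exact lines, the hypothetical names shapeK and rateK, and the bound shown are illustrative assumptions, not the book's code):

# The error indicates that argument two of a dbeta node, here
# (1-mu[c])*kappa[c], collapsed toward zero. Truncating the higher-level
# concentration parameter keeps it in a safe range (BUGS syntax I(lower,)):
kappa[c] ~ dgamma( shapeK , rateK ) I( 0.01 , )
theta[i] ~ dbeta( mu[c]*kappa[c] , (1-mu[c])*kappa[c] )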

Thanks again.
--John



March 20, 2014:
Good afternoon Mr. Kruschke,

I have a question about two-way ANOVA from your book. On page 522, you run an example using salary data, and you mention that the means and HDI limits are displayed with only three significant digits, but more precise values can be obtained directly from the program. How do I get those more precise values?

Thank you for your help,

Reply:
Thanks very much for your interest in the book.

The 22-Jun-2012 version of plotPost.R (see programs at http://www.indiana.edu/~kruschke/DoingBayesianDataAnalysis/Programs/ ) produces a summary of the posterior as its returned value. For example, if you use the R statement

postInfo = plotPost( ... )

then postInfo has the mean, median, mode, HDI limits, etc.
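
For instance, something like this (a sketch only: paramSampleVec stands for whatever vector of MCMC values you are plotting, and the exact component names of the summary may differ across versions of plotPost.R):

postInfo = plotPost( paramSampleVec )   # plots the histogram and returns the summary
print( postInfo )                       # displays mean, median, mode, HDI limits, etc.
print( postInfo , digits=8 )            # shows more decimal places explicitly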

Don't forget that even though many decimal places are available, they are not necessarily meaningful. The values are estimates from a random MCMC sample of the posterior, so they are noisy!

Thanks again.
--John



March 19, 2014:
Dear Prof. Kruschke, I'm working with Chapter 22 of DBDA to analyze contingency tables. I'm curious to know where to look for a measure of effect size, both for the whole table and for individual interactions. Any tips you have would be much appreciated!
Thanks,

Reply:

Thanks for your interest in the book.

My short answer is that I don’t have an adequate answer. There are many relevant measures of effect size in the frequentist literature, of course, but we seek something for hierarchical Bayesian log-linear models. A measure of effect size is supposed to be an aggregated difference (i.e., the “effect”) standardized relative to some indicator of variance or noise in the data. Moreover, in a Bayesian setting, the effect size has a posterior distribution; it’s not just a point estimate. An example of posterior effect size for a difference between groups is given here: http://www.indiana.edu/~kruschke/BEST/
One possibility for describing effect size in hierarchical Bayesian ANOVA-style models (including log-linear models) is a ratio of estimated variance parameters. For example, the main effect of factor A might have an “effect size” given by sigma_A divided by sigma_y, where sigma_A is the scale parameter of the distribution of the factor-A deflections, and sigma_y is the scale parameter of the within-cell noise distribution. But that’s just off the top of my head.
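
As a concrete sketch in R (the object mcmcChain and the column names "sigmaA" and "sigmaY" are hypothetical; use whatever names your MCMC output gives those scale parameters):

# Compute the posterior distribution of the effect-size ratio, sample by sample:
effSzA = mcmcChain[ , "sigmaA" ] / mcmcChain[ , "sigmaY" ]
plotPost( effSzA , xlab="sigma_A / sigma_y" )   # examine its posterior distribution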
Thanks again,
--John




March 18, 2014:
John, I really enjoyed reading your post [http://doingbayesiandataanalysis.blogspot.com/2014/01/bayesian-variable-selection-in-multiple.html] as, like you, I feel that when it comes to real data there is a lot of fine-tuning that is usually inadequately addressed in papers and books.

I have a problem where the posterior probability of delta=1 is always above 0.5 (even when the respective variable should clearly not be selected). I was wondering whether you have made progress with your issues and perhaps can shed some light on mine.

Thanks,

Reply:

Thanks for the comment.

I’m not sure how you know that the “respective variable should clearly not be selected.” If you already know that, does that mean that you have prior knowledge that is not being expressed in the prior distribution of the model?

My impression of variable selection is that it’s a bit of an art, in the sense that the prior matters! But that’s true of Bayesian model selection in general. Real prior knowledge should be implemented if possible, and, in the case of variable selection, shrinkage from hierarchical structure should be implemented if appropriate.
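
As one concrete illustration of how the prior matters (a sketch in JAGS-style model language; the inclusion indicator delta follows the blog post linked above, but the specific prior probability shown is only an example, not a recommendation):

# Prior probability that predictor j is included in the model.
# dbern(0.5) is agnostic; a smaller value builds in real prior skepticism,
# so that a posterior inclusion probability above 0.5 is more telling:
delta[j] ~ dbern( 0.2 )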

Friday, March 14, 2014

Bayesian estimation and precision as the goal for data collection (expanded)

Precision as the goal for data collection, talk at U.C. Irvine, March 14, 2014.

Part 1: Rejecting the null is not enough; we also need estimates and their precision. Bayesian estimation supersedes confidence intervals and "the new statistics".


Part 2: Two Bayesian ways to assess a null value. Highest density interval with region of practical equivalence. Bayesian model comparison and Bayes factor.


Part 3: Biased estimation in sequential testing and optional stopping.


Part 4: Monte Carlo study of biased estimation in sequential testing and optional stopping.


Parts 3 and 4 are elaborations of this previous blog post.

Saturday, March 8, 2014

Recent Advances in Bayesian Inference at U.C. Irvine, March 14



Recent Advances in Bayesian Inference

Venue:
University of California at Irvine,
1517 Social and Behavioral Sciences Gateway building

Date:
Friday, March 14, 2014

Speakers:
  • 9:30 John Kruschke (Indiana University). Precision as the goal for data collection.
  • 10:30 Hal Stern (UC Irvine). A statistical approach to detecting patterns in behavioral event sequences.
  • 11:30 Jeff Rouder (University of Missouri). The p < .05 rule and the hidden costs of the free lunch in inference.
  • 2:00 Wes Johnson (UC Irvine). Introduction to Bayesian Nonparametrics.
  • 3:00 Mario Peruggia (Ohio State University). Introduction to Bayesian model averaging.
Registration:
Please contact Dr. Joachim Vandekerckhove, joachim@uci.edu