Monday, July 25, 2011

Autocorrelation in Bayesian ANOVA: Fixing the book's overkill

In hierarchical Bayesian ANOVA (Ch's 18, 19, and 22 of the book), the model expresses the cell means in terms of deflections from baseline, such that the deflections sum to zero. The sum-to-zero constraint is only imposed after BUGS does its work; i.e., the deflection parameters in the BUGS model are not intrinsically constrained to sum to zero. The deflection parameters in the model can therefore become strongly autocorrelated, and extensive thinning may be necessary to attenuate autocorrelation in the unconstrained parameters.

But we don't care about the unconstrained parameters. What we care about is the sum-to-zero versions of the parameters, because those are the parameters we use to interpret the data. It turns out that autocorrelation in the sum-to-zero parameters is much less severe. I've known about this for quite a while, but haven't gotten around to modifying the programs until now.

Here's an example. This comes from the two-factor hierarchical Bayesian ANOVA in Section 19.1.3 (p. 520) of the book, regarding salaries. The deflection parameters in the BUGS model are labeled a1[], a2[], and a1a2[,]. The sum-to-zero versions of the parameters are labeled b1[], b2[], and b1b1[,]. Here are the a1[] chains when the thinning is set to 75:

Obviously there is nasty autocorrelation, and much more thinning would be needed if we cared about these parameters. (Added Nov. 7, 2011: Regarding thinning, see a later post which points out that thinning is not advisable. But the remainder of this post, below, is still important for better interpretation of the parameters in Bayesian ANOVA.)

But we don't care about a1[], instead we care about the sum-to-zero reparameterization, b1[]. Here are the b1[] chains derived from the chains above:

There is minimal autocorrelation in the sum-to-zero versions of the parameters.

Therefore, the versions of the programs that accompany the book, which assess and attenuate autocorrelation in the non-sum-to-zero parameters, thin far more than is necessary. The results are not wrong, merely overkill.

I have created new versions of two of the ANOVA programs such that the sum-to-zero constraint is computed inside the BUGS model statement, and therefore the sum-to-zero parameters can be assayed using the plotChains() command. The two new programs are ANOVAonewayBRugsSTZ.R and ANOVAtwowayBRugsSTZ.R . These programs use far less thinning than the original programs to achieve the same results. Other ANOVA programs --such as those that allow heterogeneous variances, and Poisson exponential ANOVA-- can be modified analogously.


  1. Nice. MCMC usually "likes" deviation contrasts better than dummy coding because the X matrix is orthogonal. Similarly, if you want something like polynomial contrasts it would make a great deal of sense to set them up as orthogonal contrasts.

  2. Jay: I probably misunderstand your comment, but I wanted to be sure that the method used in the programs is clear. The underlying MCMC chain is completely unaffected by the reparameterization, and the "a" parameters in the MCMC chain remain badly autocorrelated. It's just that we assess autocorrelation by considering the reparameterized "b" values, not the "a" values.

    With respect to contrasts, again those are computed after the MCMC chain is generated, and the contrasts have no influence on the MCMC chain. In the programs many different contrasts are examined, without concern for their orthogonality. In principle, we should check that the chains for the contrasts are not badly autocorrelated, but in practice they are not, so it's not a problem.