## Saturday, April 8, 2017

### Trade-off of between-group and within-group variance (and implosive shrinkage)

Background: Consider data that would traditionally be analyzed as single-factor ANOVA; that is, a continuous metric predicted variable, $y$, and a nominal predictor, "Group." In particular, consider the data plotted as red dots here:

A Bayesian approach easily allows a hierarchical model in which both the within-group and between-group variance are estimated. The hierarchical structure imposes shrinkage on the estimated group means. All of this is explained in Chapter 19 of DBDA2E.
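
The structure of such a model can be written compactly. What follows is only a sketch, assuming the homogeneous-variance normal model suggested by the script name `Jags-Ymet-Xnom1fac-MnormalHom.R`. The symbols $\beta_0$, $\beta_j$, $\sigma_y$, and $\sigma_\beta$ are illustrative notation, with $\sigma_y$ and $\sigma_\beta$ standing for the within-group and between-group standard deviations (the second argument of each normal distribution is a standard deviation). Priors on the top-level parameters are omitted:

$$
\begin{aligned}
y_i &\sim \mathrm{Normal}\left( \beta_0 + \beta_{j[i]} ,\; \sigma_y \right) \\
\beta_j &\sim \mathrm{Normal}\left( 0 ,\; \sigma_\beta \right)
\end{aligned}
$$

Under these assumptions, marginalizing over the group deflections gives $\mathrm{Var}(y_i) = \sigma_\beta^2 + \sigma_y^2$. The data mainly constrain that sum, so credible parameter values with a smaller $\sigma_\beta$ tend to have a larger $\sigma_y$, and vice versa.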

The data above come from Exercise 19.1, which is designed to illustrate "implosive shrinkage." Because there are only a few data points in each group, and there are lots of groups with little variance from baseline, a reasonable description of the data merely collapses the group means close to baseline while expanding the estimate of within-group variance.

The purpose of the present post is to show the trade-off in the posterior distribution of the estimated within-group variance and between-group variance, while also providing another view of implosive shrinkage.

After running the data through the generic hierarchical model in Exercise 19.1, we can look at the posterior distribution of the within-group and between-group standard deviations. The code below produces the following plot:

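```r
# Convert the coda object returned by genMCMC() into a matrix (one row per MCMC step)
mcmcMat = as.matrix( mcmcCoda )
openGraph()  # graphics-window helper defined in DBDA2E-utilities.R
# Scatter plot of between-group SD (aSigma) against within-group SD (ySigma)
plot( mcmcMat[,"aSigma"] , mcmcMat[,"ySigma"] ,
      xlab="Between-Group Sigma" , ylab="Within-Group Sigma" ,
      col="skyblue" , cex.lab=1.5 ,
      main="Trade-Off of Variances (and implosive shrinkage)" )
# Superimpose the least-squares regression line
abline( lm( mcmcMat[,"ySigma"] ~ mcmcMat[,"aSigma"] ) )
```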

Notice in the plot that as the between-group standard deviation gets larger, the within-group standard deviation gets smaller. Notice also the implosive shrinkage: The estimate of between-group standard deviation is "piled up" against zero.
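
The visual impression can also be put into numbers. The sketch below is a minimal illustration, not output from the actual analysis: because the real `mcmcCoda` is not available here, it builds a synthetic stand-in matrix, `mcmcMatDemo`, with the same column names, and every simulated value in it is an assumption chosen only to mimic the qualitative pattern. The cutoff `nearZeroCutoff` is likewise a hypothetical choice. With a real posterior, the same three summaries could be computed from `as.matrix( mcmcCoda )`.

```r
# Synthetic stand-in for the posterior sample (NOT real MCMC output).
# Replace mcmcMatDemo with as.matrix( mcmcCoda ) in an actual analysis.
set.seed(47405)
nSteps = 2000
# Between-group SD: nonnegative and piled up against zero (assumed shape)
aSigma = abs( rnorm( nSteps , mean=0 , sd=1.0 ) )
# Within-group SD: assumed to decrease as between-group SD increases
ySigma = 5.0 - 0.5*aSigma + rnorm( nSteps , mean=0 , sd=0.3 )
mcmcMatDemo = cbind( aSigma=aSigma , ySigma=ySigma )

# Correlation of the two standard deviations across steps
sigmaCor = cor( mcmcMatDemo[,"aSigma"] , mcmcMatDemo[,"ySigma"] )

# Slope of the least-squares line, as drawn by abline( lm( ... ) )
sigmaSlope = coef( lm( mcmcMatDemo[,"ySigma"] ~ mcmcMatDemo[,"aSigma"] ) )[2]

# Proportion of steps with between-group SD below a hypothetical small cutoff
nearZeroCutoff = 0.5
propNearZero = mean( mcmcMatDemo[,"aSigma"] < nearZeroCutoff )

print( c( cor=sigmaCor , slope=unname(sigmaSlope) , propNearZero=propNearZero ) )
```

A negative correlation and slope correspond to the trade-off, and a large proportion of steps below the cutoff corresponds to the between-group standard deviation piling up against zero.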

1. Hello, just starting this chapter and encountered this error below. Is it something obvious I have misspelt? Thanks, Graeme

```r
yName="Longevity"
xName="CompanionNumber"
source("Jags-Ymet-Xnom1fac-MnormalHom.R")
mcmcCoda=genMCMC(datFrm=myDataFrame,yName=yName, xName=xName,
                 numSavedSteps=1100, thinSteps=10, saveName=fileNameRoot)
```

Error in genMCMC(datFrm = myDataFrame, yName = yName, xName = xName, numSavedSteps = 1100, :

OUTPUT
```
. Initializing model
-------------------------------------------------| 500
++++++++++++++++++++++++++++++++++++++++++++++++++ 100%
. Updating 1000
-------------------------------------------------| 1000
************************************************** 100%
. . . . . . Updating 3670
-------------------------------------------------| 3650
************************************************** 100%
* 100%
. . . . Updating 0
. Deleting model
.
All chains have finished
```