(I first messed around with analyzing Anscombe's quartet back in October 2010, for personal interest. A conversation I had last week in Basel, Switzerland, suggested that this analysis might interest a larger audience. Hence this blog post.)
Below are the four data sets, shown as black circles. They are modeled with typical linear regression that assumes normality:
y ~ dnorm( b0+b1*x , sigma )with vague priors on the intercept b0, the slope b1, and the standard deviation sigma.
Actually, for easy consistency with the subsequent analysis, I used a t distribution with the normality (df) parameter given a prior that insists it is very large, near 500, which is essentially normal. Thus, the model is
y ~ dt( b0+b1*x , sigma , nu )with vague priors on b0, b1, and sigma, and the prior on normality being
nu ~ dnorm( 500 , sd=0.1 )
The figures below show the data in the left panels, with a smattering of credible regression lines superimposed, and veritical bars that indicate the 95% HDI of posterior predicted y values at selected x values:
Notice that the marginal parameter estimates are the same for all four data sets (up to MCMC sampling error), which is exactly the point that Anscombe wanted to emphasize for least-squares regression. Here we see that it is also true for Bayesian estimation, and this is no surprise when we remember from formal Bayesian analysis of linear regression (with normally distributed noise) that the "sufficient statistics" of the data determine the posterior distribution, and the sufficient statistics are identical for the four data sets.
One of the beauties of the flexibility of JAGS/BUGS is that it is easy to implement other models. In particular, it is easy to implement regression that is robust to outliers, using the t distribution. I used the same model as above,
y ~ dt( b0+b1*x , sigma , nu )with vague priors on b0, b1, and sigma, but I put a vague prior on the normality of the t distribution:
nu <- nuMinusOne + 1This is the same prior on nu as is used in the "BEST" model for robust comparison of two groups. The resulting posteriors look like this:
nuMinusOne ~ dexp( 1/29 )
nuMinusOne ~ dexp( 1/29 )
Analogously, we could easily do a robust quadratic trend analysis, and thereby detect the non-linear trend in Type 2. See this previous blog post for how to extend a JAGS/BUGS linear regression to include a quadratic trend.
To reiterate and conclude, the Anscombe quartet not only makes its original point, that we should look at the posterior predictions to make sure that the model is a reasonable description of the data, but it also highlights that for Bayesian analysis using JAGS/BUGS, it is easy to try other models that may be better descriptions of the data.
ADDED 18 JUNE 2013: Rasmus Baath, after reading the above post, has actually implemented the extensions that I merely suggested above. Please continue reading his very interesting follow up.