Using the log-normal density can be confusing because it is parameterized in terms of the mean and precision of the log-scale data, not the original-scale data. Thus, if your data, y, are nicely described by a log-normal distribution, the estimated mean and precision are for log(y), not y. To get the corresponding parameters for the original-scale data, you have to transform the estimated parameters. And the priors on the mean and precision have to be appropriate for log(y), not for y. This post shows an example of a JAGS program (with rjags) for estimating the parameters of log-normally distributed data.
UPDATED CODE CAN BE FOUND AT THIS MORE RECENT POST.
The data for the example are shown as the (pink) histogram in the top left panel. A smattering of credible log-normal distributions is superimposed (as blue curves) on the data:
The upper-middle panel shows that the mean of log(y) is estimated to be about 5.0. Notice that a value of 5.0 is nowhere near the bulk of the data, because it is the mean of the log data, not of the original-scale data. The transformed parameters are shown in the lower panels: the mode of the log-normal distribution on the original scale is about 115, and its mean on the original scale is about 169.
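As a quick check of the transformation formulas, the values reported above can be reproduced directly in R. This snippet uses muOfLogY = 5.0 (the estimated mean of log(y)) and sigmaOfLogY = 0.5 (the generating value of the SD of log(y) in the program below):

muOfLogY = 5.0      # mean of log(y), per the upper-middle panel
sigmaOfLogY = 0.5   # SD of log(y); the generating value (trueSD below)
modeOfY = exp( muOfLogY - sigmaOfLogY^2 )     # exp(4.750) , about 115.6
muOfY   = exp( muOfLogY + sigmaOfLogY^2/2 )   # exp(5.125) , about 168.2
sigmaOfY = sqrt( exp(2*muOfLogY+sigmaOfLogY^2)*(exp(sigmaOfLogY^2)-1) )  # about 89.6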
The complete program I used is included below. Notice, in particular, that the priors are set to be vague on the scale of the log data.
graphics.off()
rm(list=ls(all=TRUE))
fileNameRoot="YmetricXsingleJagsDLNORM" # for constructing output filenames
source("openGraphSaveGraph.R")
source("plotPost.R")
require(rjags) # Kruschke, J. K. (2011). Doing Bayesian Data Analysis:
# A Tutorial with R and BUGS. Academic Press / Elsevier.
#------------------------------------------------------------------------------
# THE MODEL.
modelstring = "
model {
  for ( i in 1:N ) {
    y[i] ~ dlnorm( muOfLogY , 1/sigmaOfLogY^2 )
  }
  sigmaOfLogY ~ dunif( 0.001*sdOfLogY , 1000*sdOfLogY )
  muOfLogY ~ dnorm( meanOfLogY , 0.001*1/sdOfLogY^2 )
  muOfY <- exp( muOfLogY + sigmaOfLogY^2/2 )
  modeOfY <- exp( muOfLogY - sigmaOfLogY^2 )
  sigmaOfY <- sqrt( exp( 2*muOfLogY + sigmaOfLogY^2 ) * ( exp(sigmaOfLogY^2) - 1 ) )
}
" # close quote for modelstring
writeLines(modelstring,con="model.txt")
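# N.B.: JAGS parameterizes dlnorm by the mean and PRECISION of log(y),
# hence the 1/sigmaOfLogY^2 in the model above. R's dlnorm, by contrast,
# takes meanlog and sdlog.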
#------------------------------------------------------------------------------
# THE DATA.
# Generate random data from known parameter values:
set.seed(47405)
trueM = 5.0
trueSD = 0.5
y = rnorm( n=125 ) # R's rnorm uses mean and SD (defaults: mean=0, sd=1)
LogY = (y-mean(y))/sd(y)*trueSD + trueM
y = exp(LogY)
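# Standardizing and then rescaling (above) makes the SAMPLE mean and SD of
# log(y) exactly equal to trueM and trueSD, not merely equal in expectation.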
dataList = list(
y = y ,
N = length(LogY) ,
meanOfLogY = mean(LogY) ,
sdOfLogY = sd(LogY)
)
#------------------------------------------------------------------------------
# INITIALIZE THE CHAINS.
# Let JAGS do it
#------------------------------------------------------------------------------
# RUN THE CHAINS
parameters = c("muOfLogY" , "sigmaOfLogY" , "muOfY" , "modeOfY" , "sigmaOfY" )
adaptSteps = 1000 # Number of steps to "tune" the samplers.
burnInSteps = 1000 # Number of steps to "burn-in" the samplers.
nChains = 3 # Number of chains to run.
numSavedSteps=30000 # Total number of steps in chains to save.
thinSteps=1 # Number of steps to "thin" (1=keep every step).
nPerChain = ceiling( ( numSavedSteps * thinSteps ) / nChains ) # Steps per chain.
# Create, initialize, and adapt the model:
jagsModel = jags.model( "model.txt" , data=dataList ,
n.chains=nChains , n.adapt=adaptSteps )
# Burn-in:
cat( "Burning in the MCMC chain...\n" )
update( jagsModel , n.iter=burnInSteps )
# The saved MCMC chain:
cat( "Sampling final MCMC chain...\n" )
codaSamples = coda.samples( jagsModel , variable.names=parameters ,
n.iter=nPerChain , thin=thinSteps )
# resulting codaSamples object has these indices:
# codaSamples[[ chainIdx ]][ stepIdx , paramIdx ]
#------------------------------------------------------------------------------
# EXAMINE THE RESULTS
checkConvergence = FALSE
if ( checkConvergence ) {
openGraph(width=7,height=7)
autocorr.plot( codaSamples[[1]] , ask=FALSE )
show( gelman.diag( codaSamples ) )
effectiveChainLength = effectiveSize( codaSamples )
show( effectiveChainLength )
}
# Convert coda-object codaSamples to matrix object for easier handling.
# But note that this concatenates the different chains into one long chain.
# Result is mcmcChain[ stepIdx , paramIdx ]
mcmcChain = as.matrix( codaSamples )
chainLength = NROW(mcmcChain)
openGraph(width=10,height=6)
layout(matrix(1:6,nrow=2,byrow=TRUE))
# posterior predictive
hist( dataList$y , xlab="y" , main="Data w. Post. Pred." , breaks=30 ,
col="pink" , border="white" , prob=TRUE , cex.lab=1.5)
pltIdx = floor(seq(1,chainLength,length=20))
xComb = seq( min(dataList$y) , max(dataList$y) , length=501 )
for ( stepIdx in pltIdx ) {
  lines( xComb ,
         dlnorm( xComb , mcmcChain[stepIdx,"muOfLogY"] , mcmcChain[stepIdx,"sigmaOfLogY"] ) ,
         col="skyblue" )
}
# param's of log(y)
postInfo = plotPost( mcmcChain[,"muOfLogY"] , xlab="mu of log(y)" )
postInfo = plotPost( mcmcChain[,"sigmaOfLogY"] , xlab="sigma of log(y)" )
# param's of y
postInfo = plotPost( mcmcChain[,"modeOfY"] , xlab="mode of y" )
postInfo = plotPost( mcmcChain[,"muOfY"] , xlab="mu of y" )
postInfo = plotPost( mcmcChain[,"sigmaOfY"] , xlab="sigma of y" , showMode=TRUE)
saveGraph(file=fileNameRoot,type="jpg")
#------------------------------------------------------------------------------
Monday, April 15, 2013
Four-day course in doing Bayesian data analysis, June 10-13
There will be a four-day introductory course in doing Bayesian data analysis, June 10-13 (2013), at the University of St. Gallen, Switzerland. The course is offered through the Summer School in Empirical Research Methods. Complete info is at this link:
http://www.indiana.edu/~jkkteach/WorkshopStGallen2013.html
Saturday, April 13, 2013
Bayesian robust heteroscedastic hierarchical two-factor "ANOVA"
A great advantage of Bayesian analysis, in software such as JAGS/BUGS, is that you can create models that are appropriate to the data (instead of shoe-horning your data into inappropriate models). For example, suppose you have a design that would be analyzed traditionally by two-factor ANOVA. In traditional ANOVA, it is assumed that the data are distributed normally within cells and that the variances are the same for all cells. But suppose you have real data. In real data, there may be many outliers (i.e., distributions within cells have heavy non-normal tails) and the cells may have very different variances (i.e., there is "heteroscedasticity" across cells). In JAGS/BUGS, we can create a model that (1) accommodates outliers by using heavy-tailed distributions to describe data within cells, and (2) accommodates heteroscedasticity by using distinct variance parameters in each cell. And, in the program I present here, there is hierarchical structure that provides shrinkage on the cell means and cell variances.
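To convey the flavor of the model, here is a minimal sketch of the likelihood and shrinkage structure. This is not the actual model in AnovaTwoFactor.R, which decomposes the cell means into a baseline, main effects, and an interaction; all the variable names here are merely illustrative:

modelstring = "
model {
  for ( i in 1:Ntotal ) {
    # Heavy-tailed t likelihood accommodates outliers; a distinct sigma
    # for each cell accommodates heteroscedasticity:
    y[i] ~ dt( mu[x1[i],x2[i]] , 1/sigma[x1[i],x2[i]]^2 , nu )
  }
  for ( j1 in 1:Nx1Lvl ) { for ( j2 in 1:Nx2Lvl ) {
    # Hierarchical priors produce shrinkage on cell means and cell SDs:
    mu[j1,j2] ~ dnorm( muGrand , 1/sigmaGrand^2 )
    sigma[j1,j2] ~ dgamma( sGamSh , sGamRa )
  } }
  nu <- nuMinusOne + 1
  nuMinusOne ~ dexp( 1/29 )                # tail-heaviness (normality) parameter
  muGrand ~ dnorm( yMean , 1/(10*ySD)^2 )  # top-level priors set from data scale
  sigmaGrand ~ dunif( ySD/1000 , ySD*1000 )
  # Gamma shape and rate re-expressed via the mode and SD of the cell-SD
  # distribution, so that prior can be stated in intuitive terms:
  sGamSh <- 1 + sGamMode * sGamRa
  sGamRa <- ( sGamMode + sqrt( sGamMode^2 + 4*sGamSD^2 ) ) / ( 2*sGamSD^2 )
  sGamMode ~ dunif( ySD/1000 , ySD*1000 )
  sGamSD ~ dunif( ySD/1000 , ySD*1000 )
}
" # close quote for modelstring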
The program is a variant of programs in DBDA, so I will not provide a self-contained tutorial in this post, instead relying on readers to consult Chapter 19 of DBDA for background information. Since writing DBDA, the ANOVA-style programs have been iteratively improved, as explained in various posts in this blog; search the blog with the term "ANOVA".
Important: The hierarchical structure in the model can produce implosive shrinkage for small data sets and/or priors that allow (or encourage) zero differences between cells. Implosive shrinkage is merely what the model implies, given the data, but it might not be what you think is a meaningful description of the data. If you get implosive shrinkage that you deem inappropriate or not useful, then you can change the prior structure so it more accurately reflects your modeling assumptions. For example, you could simply change the model so it has no hierarchical structure, but then, of course, you get no sharing of information across cells. The program presently uses prior distributions that de-emphasize zero differences between cells. This helps prevent implosive shrinkage, but it is also an expression of a prior belief that exactly-zero differences are improbable. In particular, the prior is not a half-Cauchy or folded-t distribution that some statisticians have recommended as a generic prior for hierarchical variance terms. Please see the comments inserted throughout the program.
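To make the contrast concrete, here is the flavor of the two choices of prior on a hierarchical SD parameter, with illustrative constants that are not the exact values used in the program:

# A gamma prior with mode away from zero de-emphasizes exactly-zero
# differences and helps prevent implosive shrinkage:
aSD ~ dgamma( 1.64 , 0.32 )        # mode about 2.0, SD about 4.0 (illustrative)
# The generic alternative, a folded-t (half-Cauchy when df=1), piles prior
# mass near zero and can encourage implosive shrinkage:
# aSD ~ dt( 0 , 1/2.5^2 , 1 ) T(0,)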
As an example, consider a 3-by-4 two-factor between-subjects design. (And no, I don't have any within-subjects versions; sorry.) A link to the data file is included below. Here is a plot of the sample means:
Although not visible in the plot of the cell means, the cells have outliers and heteroscedastic variances. A conventional ANOVA yields:
             Df Sum Sq Mean Sq F value   Pr(>F)
x1            2    658   329.1   2.375   0.0939 .
x2            3   3661  1220.2   8.804 1.02e-05 ***
x1:x2         6   1446   241.0   1.739   0.1096
Residuals   588  81495   138.6
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Notice, above, the non-significant (overall) interaction. Using the Bayesian program, we find, among other things, that there is a credibly non-zero interaction contrast (depending on your choice of ROPE):
The Bayesian program provides estimates of all the cell means and cell variances (etc.), so you can do lots of comparisons of cell means and cell variances.
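For example, a 2-by-2 interaction contrast (a difference of differences) can be computed directly from the MCMC chain. The cell-mean parameter names below are hypothetical stand-ins for whatever the program actually monitors, and the ROPE limits are arbitrary:

intContrast = ( mcmcChain[,"m[1,1]"] - mcmcChain[,"m[1,2]"] ) -
              ( mcmcChain[,"m[2,1]"] - mcmcChain[,"m[2,2]"] )
postInfo = plotPost( intContrast , xlab="Interaction contrast" ,
                     compVal=0 , ROPE=c(-1.5,1.5) )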
The program (AnovaTwoFactor.R) and data file (AnovaTwoFactorData.csv) are available at the usual program repository.
Monday, April 8, 2013
Week-long course in doing Bayesian data analysis, July 15-19
There will be a week-long introductory course in doing Bayesian data analysis, July 15-19 (2013), at Indiana University. The course is offered through the Inter-university Consortium for Political and Social Research (ICPSR). Complete info is at this link:
http://www.indiana.edu/~jkkteach/WorkshopICPSR2013.html
* Please forward this announcement to interested persons or lists. *