Doing Bayesian Data Analysis: Negative Binomial Reparameterization

Saturday, April 7, 2012

Negative Binomial Reparameterization

In a previous post, I showed that direct estimation of the p and r parameters in a negative binomial distribution could involve bad autocorrelation in the MCMC chains, and I suggested that there must be some standard reparameterization to solve the problem, and asked for a pointer. Dr. John Davey of the University of Edinburgh was good enough to point the way. The solution is indeed straight forward: In the direct estimation, the p and r parameters are given priors, while the mean m (and variance v) is derived from p and r. In the reparameterization, the m and r parameters are given priors, while the p (and variance v) is derived from m and r. Autocorrelation in the chains is greatly reduced. Here is an example.

The model specifications:

Parameterized by p and r:

model {

    for( i in 1 : N ) {

      y[i] ~ dnegbin( p , r )

    }

    p ~ dbeta(1.001,1.001)

    r ~ dgamma(0.01,0.01)

    m <- r*(1-p)/p

    v <- r*(1-p)/(p*p)

}

Parameterized by m and r:

model {

    for( i in 1 : N ) {

      y[i] ~ dnegbin( p , r )

    }

    p <- r/(r+m)

    r ~ dgamma(0.01,0.01)

    m ~ dgamma(0.01,0.01)

    v <- r*(1-p)/(p*p)

}

Results:

Parameterized by p and r:

Parameterized by m and r:

The posteriors are a bit different for the two parameterizations, because I used "generic" priors without any attempt to transform them to be equivalent.

Thanks to John Davey for this pointer!

5 comments:

Daniel HockingNovember 6, 2012 at 12:11 PM
This is great. I'm going to try it with my data now. Thanks for the tip since this seems like a common problem I will encounter with my count data.

Cheers,
Dan
ReplyDelete
Replies
AnonymousDecember 25, 2013 at 7:42 PM
Thanks, Professor Kruschke. You inspired me to try this -- and JAGS as a vehicle for learning BUGS -- on another problem, available for public inspection at: http://hypergeometric.wordpress.com/2013/12/23/jags-for-finding-highs-and-lows-in-a-week-of-wikipedia-accesses/
ReplyDelete
Replies
GraemeAugust 9, 2015 at 1:23 AM
G'day, What would a full script look like when comparing two vectors both of which have a negbinom distribution? I have been trying to transpose this post with BEST.R function, but not very well.

ReplyDelete
Replies
John K. KruschkeAugust 9, 2015 at 9:46 AM
Graeme:

I won't have a chance to write a full script, but the key thing to do is use nested indexing for the group identifier. Row i of the data file contains the count y[i] and the group identifier x[i]. Then the model statement uses

for ( i in 1:Nrows ) {
y[i] ~ dnegbin( p[x[i]] , r[x[i]] )
}
for ( grpIdx in 1:2 ) {
p[grpIdx] <- r[grpIdx]/(r[grpIdx]+m[grpIdx])
m[grpIdx] ~ dgamma(0.01,0.01)
# etc. ...
}
ReplyDelete
Replies
AnonymousSeptember 27, 2015 at 4:10 AM
This seems to be working to here. Anything obvioulsy wrong?

y[i] ~ dnegbin( p[x[i]] , r[x[i]] )
}
for ( grpIdx in 1:2) {
p[grpIdx] <- r[grpIdx]/(r[grpIdx]+m[grpIdx])
m[grpIdx] ~ dgamma(0.01, 0.01)
r[grpIdx] ~ dgamma(0.01, 0.01)
v[grpIdx] <- r[grpIdx]*(1-p[grpIdx])/(p[grpIdx]*p[grpIdx])
}
}
"

snail<-read.table('alive14-15.txt',header=T)
attach(snail)

# Load the data:
y = c( snail$alive14 , snail$alive15) # combine data into one vector
x = c( rep(1,length(alive14)) , rep(2,length(alive15)) ) # create group membership code
Ntotal = length(y)
# Specify the data in a list, for later shipment to JAGS:
dataList <- list(
x = x,
y = y ,
Ntotal = Ntotal
)
length(y) #check
length(x)

paramaters<-c("m","r","p","v") #collect these
jagsModel <- jags.model(textConnection(modelString), data=dataList, n.chains=3, n.adapt=1e3)

codaSamples <- coda.samples(jagsModel, variable.names=parameters, n.iter=3000, thin=1)
m <- as.matrix(codaSamples)

thanks Graeme
ReplyDelete
Replies

Add comment

Doing Bayesian Data Analysis

Saturday, April 7, 2012

Negative Binomial Reparameterization

The model specifications:

Results:

5 comments:

Total Pageviews

Blog Roll