Saturday, February 24, 2018

Compose JAGS model statements for human readability

There's a great new book by Farrell and Lewandowsky, Computational Modeling of Cognition and Behavior (at the publisher, at Amazon.com), that includes some chapters on Bayesian methods. Each chapter includes a little "in vivo" commentary by an outside contributor. My commentary accompanies their chapter regarding JAGS. The commentary is posted here in a succession of three blog posts; this is 2 of 3. (Part 1 is here.) Do check out their book!

Compose JAGS model statements for human readability

All mathematical models are designed to describe structure in data. Logically, to comprehend a model, we must first know what the data are that the model is supposed to describe. We begin with describing how the data are probabilistically distributed according to some likelihood function. The likelihood function has parameters, which typically describe some trend or relation in the data. The parameters might be expressed in terms of higher-level parameters. Finally, the parameters have uncertainty, expressed as prior distributions on the parameters. The JAGS model-specification language lets us write models in this logical and comprehensible way: Start with the data, write the likelihood function, then write any dependencies among parameters, and finish with the prior distribution on the parameters. This makes it easy to write the model, and, importantly, easy for readers of the model specification to make sense of the model.

For example, consider a JAGS model specification for describing a set of data with a normal distribution (cf. Listing 8.3 [in Farrell and Lewandowsky's book]):
model {
  for ( i in 1:N ) { y[i] ~ dnorm( mu , 1/sigma^2 ) }
  mu ~ dunif( -100 , 100 )
  sigma ~ dunif( 0 , 100 )
}
   (Listing 8.11. Describe data with a normal distribution in JAGS.)
The model specification (above) is easy to comprehend sequentially in reading order.

JAGS does not execute the lines of the model specification as if they were procedural R commands, but instead JAGS examines the overall model statement for structural consistency. The three lines in the model specification (above) could be put in any order and JAGS would not care. For example, JAGS would also allow the following:
model {
  sigma ~ dunif( 0 , 100 )
  mu ~ dunif( -100 , 100 )
  for ( i in 1:N ) { y[i] ~ dnorm( mu , 1/sigma^2 ) }
}
   (Listing 8.12. Alternative JAGS description of a normal distribution.)
In terms of information content, it does not matter if you say “the knee bone’s connected to the thigh bone, and the thigh bone’s connected to the hip bone,” or instead say “the thigh bone’s connected to the hip bone, and the knee bone’s connected to the thigh bone.”

But for human readers trying to comprehend the statements, order does matter. Especially for complicated models with unfamiliar or arbitrary parameter names, it can be very difficult to understand model specifications that begin by specifying priors on parameters before specifying what distributions those parameters play a role in, and what the relation of the data to the parameters is. Therefore, be kind to your readers, and to your future self who will look back on your code months later. Specify JAGS models starting with the data likelihood then working through the parameters and their priors. These ideas are expressed with more examples on p. 199 and p. 414 of Kruschke (2015).

No comments:

Post a Comment