Sunday, November 30, 2014

An example of hierarchical conditional-logistic Bayesian estimation, applied to punishment choice in a public goods game.

Conditional-logistic regression and softmax regression (a.k.a. multinomial logit regression) are covered in Chapter 22 of DBDA2E, but no examples of hierarchical versions are given in that chapter. An example of hierarchical conditional-logistic Bayesian estimation, applied to punishment choice in a public goods game, is provided in a new article. The appendix of the article (linked below) provides complete details of the model.

Abstract: Punishment is an important method for discouraging uncooperative behavior. We use a novel design for a public goods game in which players have explicit intended contributions with accidentally changed actual contributions, and in which players can apply costly fines or ostracism. Moreover, all players except the subject are automated, whereby we control the intended contributions, actual contributions, costly fines, and ostracisms experienced by the subject. We assess subject’s utilization of other players’ intended and actual contributions when making decisions to fine or ostracize. Hierarchical Bayesian logistic regression provides robust estimates. We find that subjects emphasize actual contribution more than intended contribution when deciding to fine, but emphasize intended contribution more than actual contribution when deciding to ostracize. We also find that the efficacy of past punishment, in terms of changing the contributions of the punished player, influences the type of punishment selected. Finally, we find that the punishment norms of the automated players affect the punishments performed by the subject. These novel paradigms and analyses indicate that punishment is flexible and adaptive, contrary to some evolutionary theories that predict inflexible punishments that emphasize outcomes. Keywords: punishment, public goods game, ostracism, trembling-hand, intention, outcome bias.

Liddell, T. M., and Kruschke, J. K. (2014). Ostracism and fines in a public goods game with accidental contributions: The importance of punishment type. Judgment and Decision Making, 9(6), 523-547. The article is here.

Friday, November 28, 2014

How can I learn Bayesian modeling?

An email inquires:

Dear Sir, Greetings. I am a PhD ... student based in [country]. I work in the area of [...] and much of the new models in this area are based on Bayesian approach. I too feel this approach needs to [be] adopted ... . I Sir would too like to learn this approach. Unfortunately here in [country] we in Social Science are [not] trained too well in methodology. This mail is to request you to guide me as to how I can learning Bayesian Modeling relevant to [...] by self-study. There are currently no courses in [country] (to the best of my knowledge) that offer Bayesian Statistics/Bayesian Modeling. Please suggest resources (books, articles, videos) and methodology. I sincerely thank you for going through my mail and taking out you valuable time to help me. Thanks and Regards,

My reply:

Dear [...]:

Thank you for your interest.

Well, there's a vast literature out there, but since you're asking me, I'll recommend the following sequence from my own work.

For an introduction to Bayesian data analysis:
1. Read Chapter 2 of the 2nd Edition of my book. You can preview the book at various sites listed here.
2. Read the article, "Bayesian estimation supersedes the t test," linked here.
3. Read the article, "The time has come: Bayesian methods for data analysis in the organizational sciences," linked here.

For an introduction to Bayesian models of perception and cognition:
1. Read the blog post, "Bayesian models of mind, psychometric models, and data analytic models," linked here.
2. Read the article, "Bayesian learning theory applied to human cognition," linked here.
3. Read the article, "Bayesian approaches to associative learning: From passive to active learning," linked here.

Good luck with your work, and thanks again for your interest.


Tuesday, November 25, 2014

Warning message in runjags turned off in new version of DBDA2E utilities

A reader pointed out that programs that accompany DBDA2E and use runjags produce a disconcerting warning message about "starting parallel chains without setting different PRNG for each chain. Different values have been added to each set of initial values." This is merely runjags saying that it is doing the right thing, and not to worry. But it seems to cause worry. Therefore, in a new version of the DBDA2E utilities program I have set the options in runjags not to produce the warning message. To get the new version, go to the software web page and scroll down to the bottom where you will find the programs for the book. Click the download arrow on the right-hand side. Be sure to unzip/extract the file.
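For readers who would rather set this themselves than download the new utilities file, a minimal sketch of the two approaches follows. The option names are taken from the runjags documentation (check ?runjags.options), and the inits list is the standard JAGS way of assigning each chain its own random-number generator so runjags has nothing to add:

```r
library(runjags)

# Approach 1: turn off the informational warnings globally
# (option names assumed from the runjags documentation):
runjags.options( inits.warning=FALSE , rng.warning=FALSE )

# Approach 2: give each parallel chain its own RNG explicitly
# in the initial values, so the message never triggers:
initsList = list(
  list( .RNG.name="base::Wichmann-Hill"        , .RNG.seed=1 ) ,
  list( .RNG.name="base::Marsaglia-Multicarry" , .RNG.seed=2 ) ,
  list( .RNG.name="base::Super-Duper"          , .RNG.seed=3 )
)
```

Either way the chains get distinct pseudo-random number generators, which is exactly what the warning message was announcing.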

Monday, November 24, 2014

eBook and Kindle of DBDA2E now available

The eBook (PDF) and Kindle editions of Doing Bayesian Data Analysis, 2nd Edition, are now available. The book web page has relevant links.

Friday, November 21, 2014

Ordinal probit regression: Transforming polr() parameter values to make them more intuitive

In R, the polr function in the MASS package does ordinal probit regression (and ordinal logistic regression, but I focus here on probit). The polr function yields parameter estimates that are difficult to interpret intuitively because they assume a baseline intercept of 0 and a noise standard deviation of 1, which produces slopes and thresholds on an underlying scale with little direct relation to the ordinal values in the data. It can be more intuitive to fix the low and high thresholds at values that make sense on the ordinal data scale, and to estimate the other parameters relative to those low and high thresholds. This intuitive parameterization is used in Chapter 23 of DBDA2E. This blog post presents a function that transforms the polr parameter values to the intuitive parameterization.

Background: Ordinal probit regression.

See Chapter 23 of DBDA2E for a complete introduction to and explanation of ordinal probit regression. The figure below provides a quick reminder of the model.
The data to be predicted are ordinal, shown on the left side of the figure. The predictor in this case is metric, shown at the lower-right of the figure. The model assumes there exists an underlying metric scale for the ordinal outcome, and this underlying scale is chopped into subintervals whose boundaries are thresholds denoted theta (see the diagram). The model assumes a noisy mapping from the predictor to the underlying metric response scale, as in ordinary linear regression, and the probability of each discrete ordinal response is the area under the normal curve between the thresholds for that response category.

From the data we estimate the intercept, the slope, the standard deviation, and the thresholds. But the underlying scale has arbitrary location and magnification, so two of the parameters can be arbitrarily fixed. In polr, the intercept is fixed at 0 and the standard deviation is fixed at 1. But that yields thresholds and slopes on a scale with no intuitive relation to the response values, which go from 1 to K, where K is the maximum ordinal response level.

A more intuitive parameterization fixes the low threshold at 1.5 (halfway between 1 and 2), and the high threshold at K-0.5 (halfway between K-1 and K). Then the thresholds are directly interpretable with respect to those anchors, and the standard deviation can also be interpreted relative to the distance between those anchors.
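The transformation is just the linear map of the underlying scale determined by sending the two extreme thresholds to those anchors. A minimal sketch of the arithmetic in R, using the polr threshold values from the example below:

```r
# Anchors: lowest threshold -> 1.5, highest threshold -> K-0.5.
# A line through those two points gives the rescaling factor and offset.
K <- 5                    # number of ordinal levels in the example
thetaLow  <- -2.048053    # polr threshold 1|2 (from the example below)
thetaHigh <-  1.544575    # polr threshold 4|5
a <- ( (K-0.5) - 1.5 ) / ( thetaHigh - thetaLow )  # = (K-2)/(thetaHigh-thetaLow)
b <- 1.5 - a * thetaLow
# All parameters move with the scale: the noise SD becomes a*1 (about 0.835),
# the intercept becomes a*0 + b (about 3.21), and slopes are multiplied by a.
```

Every other threshold, the intercept, the slope, and the standard deviation are then pushed through the same map y = a*x + b (the standard deviation picks up only the factor a, since offsets do not affect spread).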

Example: Happiness and financial assets.

Consider subjective ratings of happiness on a 1 to 5 ordinal scale, where 1 is bummed out and 5 is ecstatic (I forget the exact wording used to label the response items). For each respondent we also get their total assets. Some real data are shown in the top panel of the figure below:
See the figure caption for details. The lower panels of the figure show aspects of the Bayesian posterior distribution. Notice that the lowest threshold is fixed at 1.5, and the highest threshold is fixed at 4.5. This means that the underlying metric output scale is aligned with the ordinal responses, and the other parameters are relative to the low and high anchors. In particular, the intercept is at about 3.16, which suggests a happiness rating of about 3 even with zero assets. The slope is about 4.14e-6 happiness units per yuan. The standard deviation of the noise is about 0.854 happiness units, suggesting large variability on a 5-point scale.

The output of polr, however, looks like this (where Happiness is Yord and Assets is X):

> polrInfo = polr( Yord ~ X , method="probit" )
> print( polrInfo )
Coefficients:
           X 
4.025474e-06 

Intercepts:
      1|2       2|3       3|4       4|5 
-2.048053 -1.180089 -0.127664  1.544575 

What do those parameter estimates mean? The coefficient on X is the slope, and the intercepts are the thresholds. While we can use those estimates to derive predictions of outcome probabilities, they aren't very directly interpretable, at least not for me!
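For instance, the category probabilities at a given asset value come from areas under a standard normal curve between successive thresholds. A sketch using the polr estimates above (the asset value x is arbitrary, chosen just for illustration):

```r
# Category probabilities under polr's parameterization (intercept 0, SD 1):
# P(Y = k | x) = pnorm( zeta[k] - slope*x ) - pnorm( zeta[k-1] - slope*x )
zeta  <- c( -2.048053 , -1.180089 , -0.127664 , 1.544575 )  # thresholds
slope <- 4.025474e-06                                       # coefficient on X
x     <- 50000                    # an arbitrary asset value for illustration
cumProb  <- c( pnorm( zeta - slope*x ) , 1 )  # P(Y <= k) for k = 1..5
respProb <- diff( c( 0 , cumProb ) )          # P(Y = k); sums to 1
```

So the estimates do fully determine the predicted response probabilities; it is only the parameter values themselves that lack intuitive meaning.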

The polr estimates can be transformed into the parameterization I explained above, and then we get the following:

> polrInfoOrd = polrToOrdScale( polrInfo )
> print( polrInfoOrd )
$sigma
[1] 0.8350432

$b0
[1] 3.210213

$zeta
     1|2      2|3      3|4      4|5 
1.500000 2.224788 3.103608 4.500000 

The thresholds are the zeta values; notice that the lowest threshold is 1.5 and the highest threshold is 4.5. The sigma, intercept (i.e., b0), and slope (i.e., coefficient) are very close to the modal posterior values of the Bayesian estimation.
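Because the transformation is linear, the two parameterizations imply exactly the same response probabilities, which provides a quick sanity check on the function. A sketch using the values above:

```r
zetaPolr  <- c( -2.048053 , -1.180089 , -0.127664 , 1.544575 )
slopePolr <- 4.025474e-06
# Transformed parameters, computed as in polrToOrdScale (K-2 = 3 for K = 5):
sigmaOrd <- 3 / ( zetaPolr[4] - zetaPolr[1] )
b0Ord    <- 1.5 - sigmaOrd * zetaPolr[1]
zetaOrd  <- sigmaOrd * zetaPolr + b0Ord
slopeOrd <- sigmaOrd * slopePolr
x <- 100000  # arbitrary predictor value
# Cumulative probabilities agree to machine precision:
pPolr <- pnorm( zetaPolr - slopePolr * x )
pOrd  <- pnorm( ( zetaOrd - ( b0Ord + slopeOrd * x ) ) / sigmaOrd )
max( abs( pPolr - pOrd ) )  # effectively zero
```

The underlying scale is arbitrary, so rescaling it changes the parameter values but not the model's predictions.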

The function:

Here is the specification of the function in R:

polrToOrdScale = function( polrObject ) {
  polrThresh = polrObject$zeta
  polrSlopes = polrObject$coefficients
  polrInter = 0.0
  polrSigma = 1.0
  K = length(polrThresh) + 1  # K is number of ordinal levels
  sigmaMult = unname( (K-2)/(polrThresh[K-1]-polrThresh[1]) )
  inter = unname( 1.5 - ( sigmaMult*polrThresh[1] ) )
  respThresh = sigmaMult*polrThresh + inter
  respSigma = sigmaMult*polrSigma
  respB0 = sigmaMult*polrInter + inter
  respSlopes = sigmaMult*polrSlopes
  return( list( sigma=respSigma ,
                b0=respB0 ,
                coefficients=respSlopes ,
                zeta=respThresh ) )
}

I have only tested the function on a few examples; there are no guarantees! Feel free to reply with refinements.

Monday, November 10, 2014

2nd Edition has shipped (Doing Bayesian Data Analysis)

I am told by some readers that they have received a physical copy of the 2nd Edition of Doing Bayesian Data Analysis, but I have yet to see it myself. I hope the paper is heavy and opaque, but the book lifts the spirits and the concepts are transparent! Info about the 2nd Edition is here.