Thursday, November 3, 2016

Bayesian meta-analysis of two proportions in random control trials


For an article that's accepted pending final revision (available here at OSF), I developed a Bayesian meta-analysis of two proportions in random control trials. This blog post summarizes and links to the complete R scripts.

We consider scenarios in which the data consist of the number of occurrences and the number of opportunities in a control group and in a treatment group.  The number of occurrences in the treatment group is denoted \(z_T\) and the number of opportunities in the treatment group is denoted \(n_T\), and analogously \(z_C\) and \(n_C\) in control group. The proportion of occurrences in the treatment group is \(z_T/n_T\), and the proportion of occurrences in the control group is \(z_C/n_C\).

For example, perhaps we are interested in the occurrence of mortality after myocardial infarction (death after heart attack), in a control group and in a group treated with beta-blockers (heart muscle relaxant). In this case, if beta-blockers have a beneficial effect, \(z_T/n_T\) should be less than \(z_C/n_C\).

As another example, perhaps we are interested in re-use of towels by patrons of hotels (instead of having towels changed every day for the same patron, which wastes laundering electricity and detergent). We consider towel re-use in a control group and in a group treated with a sign that indicates it's normal for people to re-use their towels. In this case, if the treatment has a beneficial effect, \(z_T/n_T\) should be greater than \(z_C/n_C\).


In meta-analysis, there are several studies that each test the treatment. The data from study \(s\) are denoted \(z_{T[s]}, n_{T[s]}, z_{C[s]}, n_{C[s]}\). Each study is conducted at its own site (e.g., hotel, hospital). Because each site has its own specific attributes, we do not expect the underlying proportions of occurrence to be identical across sites. But we do expect them to be similar and mutually informative, so we treat the data from different sites as representative of higher-level parameters in the model that describe what's typical across sites and how much variability there is across sites. This approach is the usual random-effects model for meta-analysis.

Here are the parameters I'll use to describe the data:
  • \(\theta_{C[s]}\) is the probability of occurrence in the control group for study \(s\).
  • \(\theta_{T[s]}\) is the probability of occurrence in the treatment group for study \(s\).
  • \(\rho_{[s]}\) is the difference of log-odds between groups:
       \(\rho_{[s]} = logit(\theta_{T[s]}) - logit(\theta_{C[s]})\)
Re-arranged, the equation for \(\rho_{[s]}\) expresses the relation of \(\theta_{T[s]}\) to \(\theta_{C[s]}\): \[ \theta_{T[s]} = logistic( \rho_{[s]} + logit( \theta_{C[s]} ) )\] This relation is a natural way to represent the dependency of the probabilities between groups because the relation is (i) symmetric with respect to what outcome is defined as success or failure because \(logit(\theta) = -logit(1-\theta)\), and (ii) symmetric with respect to which group is defined as the treatment by reversing the sign of \(\rho\). Note that \(\rho_{[s]}\) is the so-called log odds ratio across groups: \(\rho_{[s]} = log( [\theta_{T[s]}/(1-\theta_{T[s]})] / [\theta_{C[s]}/(1-\theta_{C[s]})] )\). I hope this little explanation and motivation of \(\rho_{[s]}\) is helpful.

I'll describe the distribution of  \(\theta_{C[s]}\) across studies as a beta distribution, parameterized by its mode and concentration:
  • \(\omega_{\theta C}\) is the modal value (of the beta description) of \(\theta_{C[s]}\)
  • \(\kappa_{\theta C}\) is the concentration (of the beta description) of \(\theta_{C[s]}\)
For beta distributions parameterized by the usual \(a,b\) shape parameters, we convert mode and concentration to \(a,b\), and the above specification becomes \(\theta_{C[s]} \sim beta( \omega_{\theta C}(\kappa_{\theta C}-2)+1 , (1-\omega_{\theta C})(\kappa_{\theta C}-2)+1 )\).

I'll describe the distribution of  \(\rho_{[s]}\) across studies as a normal distribution, parameterized by its mean and standard deviation:
  • \(\mu_{\rho}\) is the mean (of the normal description) of \(\rho_{[s]}\)
  • \(\sigma_{\rho}\) is the standard deviation (of the normal description) of \(\rho_{[s]}\) 
In other words, for a normal distribution parameterized by mean and precision as in JAGS, \( \rho_{[s]} \sim normal( \mu_{\rho} , 1/\sigma_{\rho}^2 ) \). Usually, the primary focus of research is the value of \(\mu_{\rho}\), that is, the estimate of the treatment effect across studies.

This type of hierarchical model is a typical random-effects model for meta-analysis, because the model gives each study its own individual parameter values which are assumed to be ("exchangably") representative of a common underlying tendency.

I'll set vague priors on the top-level parameters. An implementation of the model in R, JAGS, and runjags is provided below.

Notice that an alternative model that estimated \(\theta_{C[s]}\) independently from \(\theta_{T[s]}\), and then computed \(\rho_{[s]}\) afterwards, would not produce the same results. Nor would it be an appropriate model! It would not be appropriate because it would treat the control group and treatment group in the same study as being completely unrelated --- if we independently permuted the study indices in the treatment and control groups (i.e., re-arranged which control groups go with which treatment groups) the results of this alternative model would be unchanged. Instead, the model I am using assumes that the treatment probability is linked to the control probability. For example, if one  hospital has a low rate of heart attacks in the control condition but another hospital has a high rate of heart attacks in the control condition, the treatment should reduce heart attacks relative to the particular hospital's base rate, not to some absolute rate independent of hospital.


I'll apply the model to two sets of data. First, some data on death after heart attack, summarized on pp. 124-128 of Gelman et al., 2014, Bayesian Data Analysis, Third Edition. There were 22 studies, involving as few as 77 patients and as many as 3,887 patients. The treatment group received beta-blockers. If the treatment is effective, the log-odds-ratio will be less than 0. Below is a forest plot of the results:

In the plot above, each of the 22 lower rows shows an individual study's observed log-odds-ratio with a gray triangle. Notice that the gray triangle is greater than 0 in 6 of the 22 studies. The size of the triangle indicates the sample size in the study. The blue distribution is the posterior distribution for \(\rho_{[s]}\). The distribution is marked with its modal value and its 95% highest density interval (HDI). The numerical values of the mode and HDI are indicated at the right margin. At the top of the plot is shown the posterior distribution of \(\mu_{\rho}\). In words, this means that across studies the typical effect of treatment has a log-odds-ratio of about \(-0.25\), with a range of uncertainty from \(-0.39\) to \(-0.12\), well below 0. (These values are very similar to those reported by Gelman et al., BDA3, Table 5.3, p. 127.)

Notice also in the forest plot above that there is strong shrinkage of the individual study estimates toward the modal value across studies. For example, Study 22 has a greatly reduced rate of heart attack in its treatment group (gray triangle is at a low value), but the posterior estimate of its treatment effect is not so extreme, and its posterior distribution shows a skew to accommodate the "pull" of the extreme data from the shrunken distribution. Complementary skew and shrinkage can be seen, for example, in Study 14. The posterior distributions of the individual studies also show different uncertainties depending on the sample size in the study. For example, Study 10, with a large sample size, has a much narrower HDI than Study 19 with its small sample size.

Here is another application. In this case the data come from studies of towel re-use (Scheibehenne, Jamil, and Wagenmakers (2016). Bayesian evidence synthesis can reconcile seemingly inconsistent results: The case of hotel towel reuse. Psychological Science, 27, 1043-1046). At 7 different hotels, patrons in the treatment group were told it is normal to re-use towels (see article for details). If the treatment is effective, the log-odds-ratio should be greater than 0. Here is a forest plot of the results from the analysis:
(In Study 1 above, the gray triangle is so small that it falls outside of the plot range. It had N=162.) In this case, because there were only 7 studies and wide variation in the results across studies, the overall estimate of the log-odds-ratio is fairly uncertain: Its 95% HDI goes from \(-0.12\) to \(+0.47\). While the mode of the overall estimate is positive (at \(0.21\)), the uncertainty is great enough that we would want to do more studies to nail down the magnitude of the effect of treatment. Notice also the posterior distributions of the individual studies: There is evident shrinkage, but also lots of uncertainty, again with smaller studies showing more uncertainty than larger studies.

(In Scheibehenne et al.'s published analysis, a fixed-effects model was used, which is tantamount to using a single \(\rho\) and single \(\theta_C\) for all studies. This can be approximated in the model used here by specifying a prior that forces there to be tiny variance across \(\rho_{[s]}\) and tiny variance across \(\theta_{C[s]}\).)

As mentioned at the beginning, this model was developed for an article that is accepted pending final revision, available here. I also talked about Bayesian meta-analysis, and these applications in particular, in a presentation about Bayesian methods for replication analysis, that you can watch here.

The complete R script is available here.

16 comments:

  1. Hello John,

    Thanks a lot for posting your script, I’ve been wondering for some time how one could get such nice forest plots with posterior distributions from each study included in the meta-analysis.

    However, I’m encountering some problem with running the script, hopefully you could help me. When I try to run the model with JAGS, here is what I get (see below). Apparently it must be some problem with writing the model for JAGS, but I cannot really find where the problem is. I’d appreciate a lot some help !

    Best,
    Wojciech

    > runJagsOut <- run.jags( method=c("rjags","parallel")[2] ,
    +                         model="TEMPmodel.txt" ,
    +                         monitor=parameters ,
    +                         data=dataList , 
    +                         inits=initsList ,
    +                         n.chains=nChains ,
    +                         adapt=adaptSteps ,
    +                         burnin=burnInSteps ,
    +                         sample=ceiling(numSavedSteps/nChains) ,
    +                         thin=thinSteps ,
    +                         summarise=FALSE ,
    +                         plots=FALSE )
    Welcome to JAGS 3.4.0 on Sat Nov 5 08:20:52 2016
    JAGS is free software and comes with ABSOLUTELY NO WARRANTY
    Loading module: basemod: ok
    Loading module: bugs: ok
    . <
    Error parsing model file:
    syntax error on line 19 near "¬"
    >. .
    Error in setup.jags(model = outmodel, monitor = outmonitor, data = outdata, :
    Error reading model (see output above for more details, and use failed.jags('model') to see model syntax with line numbers)

    ReplyDelete
  2. Dear Wojciech:

    Two things come to mind. First, I notice you're using an old version of JAGS. Update everything to the latest versions, including R, RStudio, JAGS, and runjags. Second, if that doesn't work, it might be a font mismatch problem, if you've copied and pasted from the screen to your R file. Check the model specification string, near line 19, to be sure that all the characters are correct.

    Let us know how it goes.

    John

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Thanks a lot for help. I followed your recommendations (updated the software) and I also copied the code for the model manually, without pasting. For sure it solved the previous problem, but quite strangely it still doesn’t work : I get an error message, and basically what seems to be the cause is this :

    Compilation error on line 7.
    Unknown distribution: ilogit

    Somehow my software does not recognise this distribution… Would you have some ideas how to solve it ?

    Thanks a lot,
    Wojciech

    ReplyDelete
  5. Did you copy and paste the code exactly? Seems that your run is trying to treat ilogit as a distribution, not as a function. Did you type in a "~" instead of a "<-"?

    ReplyDelete
  6. Dear John,
    Exactly, I would not have seen it on my own ! ;) Thank you for your time and help. Everything is working now.
    Best,
    Wojciech

    ReplyDelete
  7. Another "gold piece" of Bayesian meta-analysis, thank you very much.

    When you have time, I'd like to also have a version for dependent proportions i.e. obtained in the same participant or unit.

    Best
    Patrizio

    ReplyDelete
  8. I thought this analysis method might be useful in the manufacturing operation in which I work. I would treat each product batch as a study, model the pass/fail test at the end of the process, and there are many ways to divide the product into treatment/control populations since we have about 100 process steps. For example, there are steps where a batch gets split between two nominally identical machines and we would be interested in characterizing the resulting yield difference, if any. Is there any obvious problem with applying the meta-analysis code in that way?

    Working for a company that is very strict about proper licensing (as they should be), what is the license, if any, of the code published here and at the book's web site? I haven't found anything other than a copyright notice for the book. I hope I didn't miss an obvious notice!

    (Sorry if this is a double post. I tried to post yesterday but it seems to have failed.)

    ReplyDelete
  9. Hi Dr. Kruschke,

    Thank you for this post. I think it highlights the important point that random effects meta-analyses are easily conceptualized as Bayesian multilevel models.

    I would like to point out that these models, including the one you're fitting in the post, are now very easily implemented in Stan (rstan) through the R package bmrs. Hopefully they will get more attention within people applying meta-analyses to data!

    I wrote a short blog post on how to fit meta-analytic models with rstan (with brms) here.

    Cheers,

    ReplyDelete
    Replies
    1. Thanks for the link to your interesting blog post. The brms package for Stan is nice. I wish there were something comparable for JAGS. I understand that the runjags package by Matt Denwood has some comparable facilities, but not as extensive as brms.

      Delete
  10. @ Francis Campos, regarding copyright and license:

    I have posted the code and article at the Open Science Foundation:
    https://osf.io/j6364/
    For the code, there is a modified MIT license, posted with files at OSF.
    Thanks for your interest.

    ReplyDelete
  11. @ Patrizio Tressoldi, regarding dependent proportions:

    I'm not sure what you mean. The present model does assume that the treatment and control groups are "dependent" in the sense of being conducted at the same site. That assumption is the key for defining

    theta_T[s] = logistic( rho[s] + logit( theta_C[s] ) )

    Perhaps you meant something different?

    ReplyDelete
  12. Hi Dr.Kruschke

    Would you also provide an example for the cases that only the differences of means and variances are available?

    Thank you

    ReplyDelete
    Replies
    1. Sorry, I don't have an existing version of a model for data with that sort of structure. It would be straight forward to create, but it would also be time consuming (not only to program, but also to write the blog post), and time is short...

      Delete
  13. Hi John,
    as dependent proportions I meant proportions observed within the same participants, for example pre and post intervention.

    Best
    Patrizio

    ReplyDelete
    Replies
    1. That's a curious data structure, but in principle it would be straight forward to create a model that would describe it. But it would also be time consuming (not only to program, but also to write the blog post), and time is short...

      Delete