Monday, August 16, 2021

Bayesian Analysis Reporting Guidelines

Just published (open access) in Nature Human Behaviour:

Bayesian Analysis Reporting Guidelines

Abstract: Previous surveys of the literature have shown that reports of statistical analyses often lack important information, causing lack of transparency and failure of reproducibility. Editors and authors agree that guidelines for reporting should be encouraged. This Review presents a set of Bayesian analysis reporting guidelines (BARG). The BARG encompass the features of previous guidelines, while including many additional details for contemporary Bayesian analyses, with explanations. An extensive example of applying the BARG is presented. The BARG should be useful to researchers, authors, reviewers, editors, educators and students. Utilization, endorsement and promotion of the BARG may improve the quality, transparency and reproducibility of Bayesian analyses.

The open access article is available at https://www.nature.com/articles/s41562-021-01177-7

The Supplementary Information is available at https://osf.io/w7cph/

Citation: Kruschke, J.K. Bayesian Analysis Reporting Guidelines. Nat Hum Behav (2021). https://doi.org/10.1038/s41562-021-01177-7

(In the original version of the manuscript, I made a few puns involving BARG and BORG. The final published version retained only one allusion to the BORG: "The BARG have assimilated many previous checklists...")

Update: See also the blog post at Nature.



Friday, July 23, 2021

DBDA2E R Scripts Updated for R 4.1

The R scripts that accompany DBDA2E have been updated so they work with R 4.1. Please go to the book's software page at
https://sites.google.com/site/doingbayesiandataanalysis/software-installation
and scroll down to the bottom of that page to find the link to the zip file for the updated scripts. 

I changed some scripts that use the R function read.csv() and that relied on the old default of converting string columns to factors. That default was changed in R 4.0, and the global option stringsAsFactors=TRUE no longer works for read.csv() in R 4.1.
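
For readers patching their own scripts, here is a minimal sketch of one way to get factors back for a single call, by passing stringsAsFactors=TRUE directly to read.csv(). The CSV text and column names below are made up for illustration and are not taken from the DBDA2E scripts, and this is not necessarily the exact change made in the updated zip file.

```r
# Hypothetical CSV content, for illustration only (not DBDA2E data).
csvText <- "Group,Score
A,1.2
B,3.4
A,5.6"

# Default call: in R 4.0.0 and later, the Group column stays a character vector.
dfDefault <- read.csv( text=csvText )

# Explicit argument: string columns are converted to factors for this call,
# whatever the global option is set to.
dfFactor <- read.csv( text=csvText , stringsAsFactors=TRUE )

class( dfDefault$Group )
class( dfFactor$Group )
levels( dfFactor$Group )
```

Setting the argument per call keeps the behavior visible in the script itself rather than hidden in session-wide state.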

Friday, April 16, 2021

Benchmark Bayes factors for uncertain prior model probability

I've posted a new manuscript titled "Uncertainty of prior and posterior model probability: Implications for interpreting Bayes factors." Here's a summary and examples to stimulate your interest.

Summary: 

In most applications of Bayesian model comparison or Bayesian hypothesis testing, the results are reported in terms of the Bayes factor only, not in terms of the posterior probabilities of the models. Posterior model probabilities are not reported because researchers are reluctant to declare prior model probabilities, a reluctance that in turn stems from uncertainty about the prior. Fortunately, Bayesian formalisms are designed to embrace prior uncertainty, not ignore it. This article provides:

  • novel formal derivations expressing the prior and posterior distribution of model probability
  • a candidate decision rule that incorporates posterior uncertainty
  • numerous illustrative examples
  • benchmark BFs using the uncertainty-based decision rule, including benchmarks for a conventional uniform prior
  • computational tools in R that are freely available at https://osf.io/36527/

I hope that this article provides both a conceptual framework and useful tools for better interpreting Bayes factors in all their many applications.

Examples:

Suppose you're doing a model comparison or hypothesis test, and you have a well-constructed Bayes factor of, say, BF=5. What do you conclude about the models or hypotheses? Your conclusion will depend on the posterior probabilities of the models, which in turn depend on the prior probabilities of the models. And what are the prior probabilities of the models? You're probably uncertain about the prior probabilities. Instead of pretending to have some specific point value for the prior model probabilities (as is usually done if it's done at all), we can represent the uncertainty as a distribution. The distribution of prior model probability becomes a distribution of posterior model probability, and we consider the entire distribution to decide about the models.

Notation: M1 is model 1, M2 is model 2. p(M1) is the prior probability of M1. BF is the Bayes factor for M1 relative to M2.
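
To make the dependence on the prior concrete, here is a small sketch using the standard identity that posterior odds equal the Bayes factor times the prior odds. It handles only point values of p(M1), so it does not implement the distributional derivations in the manuscript; the function name postProbM1 and the three prior values are my own choices for illustration.

```r
# Posterior probability of M1 from a Bayes factor and a point prior p(M1).
# Uses: posterior odds = BF * prior odds, with p(M2) = 1 - p(M1).
postProbM1 <- function( BF , priorM1 ) {
  priorOdds <- priorM1 / ( 1 - priorM1 )
  postOdds <- BF * priorOdds
  postOdds / ( 1 + postOdds )
}

# Same BF, different point priors (arbitrary illustrative values).
priorM1 <- c( 0.2 , 0.5 , 0.8 )
round( postProbM1( BF=5 , priorM1=priorM1 ) , 3 )
# 0.556 0.833 0.952
```

With BF=5, the posterior probability of M1 ranges from about 0.56 to about 0.95 across these priors, which is why a single BF does not by itself settle the conclusion.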

Figure 1, below, shows an example with a high-certainty (a.k.a., narrow, high-concentration) prior distribution at p(M1)=0.5. This prior distribution (see Panel A of Figure 1) represents a belief that the prior odds, p(M1)/p(M2), are almost certainly 50/50. When people assume any prior odds at all, this is the usual convention for representing neutrality. Panel B shows the posterior distribution for BF=5 in favor of M1. Notice the probability of M1 has increased. Panel C shows the posterior distribution for BF=11.3, which is sufficient for the 95% HDI of the posterior distribution to exceed a decision criterion indicated by the vertical dashed line.

Figure 1. Highly concentrated prior.
Figure 1. A: High-certainty prior on p(M1). B: Posterior when BF=5. C: Posterior when BF=11.3.

Figure 2, by contrast, shows an example with a very uncertain (a.k.a., broad, low-concentration) prior distribution. This prior distribution (see Panel A of Figure 2) represents a much more typical state of prior knowledge, or at least is a much better representation of neutrality between models. Panel B shows the posterior distribution for BF=5. Notice it is very spread out. Panel C shows the posterior distribution for BF=38.9, which is sufficient for the 95% HDI of the posterior distribution to exceed a decision criterion (again indicated by the vertical dashed line). This BF might be treated as a benchmark when assuming a more realistic "neutral" prior for the model probabilities.

Figure 2. Uniform prior.
Figure 2. A: Broad, uncertain prior on p(M1). B: Posterior when BF=5. C: Posterior when BF=38.9.

All the details are in the manuscript at https://osf.io/36527/. Please send me an email if you have comments!

Sunday, October 25, 2020

DBDA2E in brms and tidyverse


Solomon Kurz has been re-doing all the examples of DBDA2E with the brms package for ease of specifying models (in Stan) and with the tidyverse suite of packages for data manipulation and graphics. His extensive re-write of DBDA2E can be found here. It's definitely worth a look!

He has extensive re-writes of other books, too.

I've been meaning to make a post about this for ages, and have finally gotten around to it. Big thanks to Solomon Kurz!

Monday, September 28, 2020

Fixing a new problem in some DBDA2E scripts caused by a change in R 4.0.0

UPDATE: Scripts are now changed to work with R 4.1; see this blog post

The scripts that accompany DBDA2E have worked fine, "out of the box," for years. But recently some scripts have had problems. Why? R has changed. With R 4.0.0, various functions such as read.csv() no longer automatically convert strings to factors. Some DBDA2E scripts assumed the results of those functions contained factors, but if you're now using R 4.0.0 (or more recent) those scripts will balk. 

So, what to do? Here's a temporary fix. When you open your R session, type in this global option:

options(stringsAsFactors = TRUE)

Unfortunately this option will eventually be deprecated. I'll have to modify every affected script and post updated versions. This will happen someday. I hope.
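
If you would rather not depend on the global option, another possibility is to convert string columns to factors after the data are read. The following is only a sketch under that assumption; the helper name charColsToFactors and the example data frame are hypothetical and do not come from the DBDA2E scripts.

```r
# Convert every character column of a data frame to a factor,
# leaving all other columns untouched.
charColsToFactors <- function( df ) {
  for ( colName in names(df) ) {
    if ( is.character( df[[colName]] ) ) {
      df[[colName]] <- factor( df[[colName]] )
    }
  }
  df
}

# Hypothetical data, built with strings kept as character vectors.
myData <- data.frame( Group=c("A","B","A") , Score=c(1.2,3.4,5.6) ,
                      stringsAsFactors=FALSE )
myData <- charColsToFactors( myData )
str( myData )
```

Note that factor() orders the levels alphabetically by default, so a script that needs a particular level order should still set the levels explicitly.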

Why did R change? You can read about it here: 

https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/index.html


Friday, August 14, 2020

Need help finding corrigenda for DBDA2E


UPDATE: Now solved. Big thanks to Kent Johnson for re-constructing the table of corrigenda!


The host of the DBDA2E website (Google Sites) mandated a formatting change. Turns out the automatic reformatting mangled the table of Corrigenda. You can see it here:

https://sites.google.com/site/doingbayesiandataanalysis/corrigenda

I'd really like a properly formatted version!

Did you print or save or copy the previously formatted Corrigenda from the DBDA2E website sometime between October 2018 and August 2020? If so, please send it to me, and I'll attach it to the website.

(I got a version from the wayback machine at web.archive.org dated 2016, but there were subsequent modifications made until Sept 2018.)

Or, do you know of a way to re-format the mangled version so it appears properly?

Thanks!


Thursday, May 28, 2020

Teach (and learn) Bayesian and frequentist side by side

Teach (and learn) Bayesian and frequentist side by side: a talk and an app.

A talk explaining why that's a good idea:

(Talk delivered Saturday May 18, 2019.)

The interactive Shiny App with Bayesian and frequentist side by side: click HERE.
If you try the app, especially for teaching, please let me know how it goes.