Wednesday, February 25, 2015

"Is the call to abandon p-values the red herring of the replicability crisis?"


In an opinion article [here] titled "Is the call to abandon p-values the red herring of the replicability crisis?", Victoria Savalei and Elizabeth Dunn concluded, "at present we lack empirical evidence that encouraging researchers to abandon p-values will fundamentally change the credibility and replicability of psychological research in practice. In the face of crisis, researchers should return to their core, shared value by demanding rigorous empirical evidence before instituting major changes."

I posted a comment which said in part, "people have been promoting a transition away from null hypothesis significance testing to Bayesian methods for decades, long before the recent replicability crisis made headlines. The main reasons to switch to Bayesian have little directly to do with the replicability crisis." Moreover, "It is important for readers not to think that Bayesian analysis merely amounts to using Bayes factors for hypothesis testing instead of using p values for hypothesis testing. In fact, the larger part of Bayesian analysis is a rich framework for estimating the magnitudes of parameters (such as effect size) and their uncertainties. Bayesian methods are also rich tools for meta-analysis and cumulative analysis. Therefore, Bayesian methods achieve all the goals of the New Statistics (Cumming, 2014) but without using p values and confidence intervals."


See the full article and comment at the link above.


Monday, February 23, 2015

Journal bans null hypothesis significance tests

In a recent editorial [here], the journal Basic and Applied Social Psychology has banned the null hypothesis significance testing procedure (NHSTP). "We hope and anticipate that banning the NHSTP will have the effect of increasing the quality of submitted manuscripts by liberating authors from the stultified structure of NHSTP thinking thereby eliminating an important obstacle to creative thinking. The NHSTP has dominated psychology for decades; we hope that by instituting the first NHSTP ban, we demonstrate that psychology does not need the crutch of the NHSTP, and that other journals follow suit."

In a short bit about Bayesian analysis, the editorial says, "The usual problem with Bayesian procedures is that they depend on some sort of Laplacian assumption to generate numbers where none exist." I think here the editors are too focused on Bayesian hypothesis tests instead of on the much broader application of Bayesian methods to parameter estimation. For example, in the 750 pages of DBDA2E, I never mention the Laplacian assumption because the procedures do not depend on it. Despite their narrow view of Bayesian methods, I am encouraged by the bold move that might help dislodge NHST.

Sunday, February 8, 2015

I've got variable Y that I want to predict from variables X1, X2, etc. What should I do?


For questions like yours -- I've got variable Y that I want to predict from variables X1, X2, etc. What should I do? -- the best answer is usually informed by background knowledge of the domain. Generic models, like multiple linear regression, don't always yield the most meaningful answer.

For example, suppose you're trying to predict the amount of fencing (Y) you'll need for rectangular lots of length X1 and width X2. Then a linear regression would serve you well. Why? Because we know (from background knowledge) that perimeter is a linear function of length and width.

But suppose you're trying to predict how much grass seed you'll need for the same lot. Then you'd want a model that includes the multiplicative product of X1 and X2, because that provides the area of the lot.
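To make the contrast concrete, here is a rough (deliberately non-Bayesian) least-squares sketch of the two model forms on simulated lots; all numbers and variable names are invented for illustration, and ordinary least squares stands in for a full Bayesian estimation:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(10, 50, 200)   # lot lengths (hypothetical units)
x2 = rng.uniform(10, 50, 200)   # lot widths

perimeter = 2 * x1 + 2 * x2     # fencing needed: linear in X1 and X2
area = x1 * x2                  # grass seed needed: multiplicative in X1 and X2

# Additive linear model: columns [1, x1, x2]
X_add = np.column_stack([np.ones_like(x1), x1, x2])
# Model that also includes the product term: columns [1, x1, x2, x1*x2]
X_mult = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Least-squares fits; each recovers its generating coefficients
beta_perim, *_ = np.linalg.lstsq(X_add, perimeter, rcond=None)
beta_area, *_ = np.linalg.lstsq(X_mult, area, rcond=None)

print(np.round(beta_perim, 3))  # close to [0, 2, 2]
print(np.round(beta_area, 3))   # close to [0, 0, 0, 1]
```

The point is that the fencing data are fit perfectly by the additive model, while the grass-seed data need the X1*X2 product term; no amount of tuning an additive model substitutes for knowing that area, not perimeter, drives the outcome.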

As another example, suppose you're trying to predict the installed length of a piece of pipe (Y) as a function of the date (X). You know that pipe expands and contracts as some function of temperature. And you also know that temperature cycles sinusoidally (across the seasons of a year) as a function of date. So, to predict pipe length as a function of date, you'd use some trend that incorporates the expansion function on top of a sinusoidal function of date.
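A sketch of that idea, again with invented numbers and ordinary least squares standing in for Bayesian estimation: because sin(wt + phi) = a*sin(wt) + b*cos(wt), a sinusoidal trend with unknown phase can still be fit with a model that is linear in its coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
day = rng.uniform(0, 3 * 365, 300)   # observation dates spanning three years

# Hypothetical truth: temperature cycles annually, and pipe length is
# (approximately) linear in temperature, so length is sinusoidal in date.
true_length = 10.0 + 0.02 * np.sin(2 * np.pi * day / 365 + 0.5)
length = true_length + rng.normal(0, 0.001, day.size)   # measurement noise

# Linear-in-coefficients design matrix: [1, sin(w*day), cos(w*day)]
w = 2 * np.pi / 365
X = np.column_stack([np.ones_like(day), np.sin(w * day), np.cos(w * day)])
beta, *_ = np.linalg.lstsq(X, length, rcond=None)

baseline = beta[0]                  # recovered mean length, near 10.0
amp = np.hypot(beta[1], beta[2])    # recovered seasonal amplitude, near 0.02
```

The same model structure drops straight into JAGS or Stan by putting priors on the baseline, the two sinusoid coefficients, and the noise scale.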

Whatever model you end up wanting, it can probably be implemented in JAGS (or BUGS or Stan). That's one of the beauties of the Bayesian approach with its general purpose MCMC software.

Monday, January 26, 2015

Institutionalized publication thresholds, p values, and XKCD

XKCD today is about p values (see image at right). I think that what XKCD is pointing out is not so much a problem with p values as with strongly institutionalized publication thresholds and the ritual of mindless statistics, as Gigerenzer would say. The same problem could arise with strongly institutionalized publication thresholds for Bayes factors, or even for HDI-and-ROPEs. One thing that's nice about the HDI-and-ROPE approach is that it's explicitly about magnitude and uncertainty, to help nudge thinking away from mindless decision thresholds.
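As a minimal sketch of the HDI-and-ROPE decision logic (simulated posterior samples and a hypothetical ROPE of ±0.1 around zero; this illustrates the mechanics, not a recommended default ROPE):

```python
import numpy as np

def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` proportion of the samples."""
    s = np.sort(samples)
    n = int(np.ceil(mass * len(s)))                # points inside the interval
    widths = s[n - 1:] - s[:len(s) - n + 1]        # width of each candidate
    i = np.argmin(widths)                          # narrowest candidate wins
    return s[i], s[i + n - 1]

rng = np.random.default_rng(2)
# Pretend MCMC output: posterior for a standardized effect size
posterior = rng.normal(0.30, 0.05, 50_000)

lo, hi = hdi(posterior)
rope = (-0.1, 0.1)   # region of practical equivalence around zero

if lo > rope[1] or hi < rope[0]:
    decision = "HDI excludes the ROPE: effect is credibly non-negligible"
elif lo >= rope[0] and hi <= rope[1]:
    decision = "HDI falls inside the ROPE: practically equivalent to zero"
else:
    decision = "HDI overlaps the ROPE: withhold a decision"
```

Even when a threshold decision is reported, the HDI limits themselves convey the magnitude and uncertainty of the effect, which is the part a bare p value hides.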

(Thank you to Kevin J. McCann for pointing me to XKCD today.)

P.S. added 30-January-2015: Gigerenzer has a new article, extending the one linked above to Bayes factors: Gerd Gigerenzer and Julian N. Marewski, "Surrogate Science: The Idol of a Universal Method for Scientific Inference," Journal of Management, Vol. 41, No. 2, February 2015, pp. 421-440. Links: http://jom.sagepub.com/content/41/2/421 and http://jom.sagepub.com/content/41/2/421.full.pdf+html
In this article, we make three points.
1. There is no universal method of scientific inference but, rather, a toolbox of useful statistical methods. In the absence of a universal method, its followers worship surrogate idols, such as significant p values. The inevitable gap between the ideal and its surrogate is bridged with delusions—for instance, that a p value of 1% indicates a 99% chance of replication. These mistaken beliefs do much harm: among others, by promoting irreproducible results.
2. If the proclaimed “Bayesian revolution” were to take place, the danger is that the idol of a universal method might survive in a new guise, proclaiming that all uncertainty can be reduced to subjective probabilities. And the automatic calculation of significance levels could be revived by similar routines for Bayes factors. That would turn the revolution into a re-volution— back to square one.
These first two points are not “philosophical” but have very practical consequences, because
3. Statistical methods are not simply applied to a discipline; they change the discipline itself, and vice versa. In the social sciences, statistical tools have changed the nature of research, making inference its major concern and degrading replication, the minimization of measurement error, and other core values to secondary importance.

Friday, January 16, 2015

Free introduction to doing Bayesian data analysis - Share with friends!

The goal of Chapter 2 is to introduce the conceptual framework of Bayesian data analysis. Bayesian data analysis has two foundational ideas. The first idea is that Bayesian inference is reallocation of credibility across possibilities. The second foundational idea is that the possibilities, over which we allocate credibility, are parameter values in meaningful mathematical models. These two fundamental ideas form the conceptual foundation for every analysis in this book. Simple examples of these ideas are presented in this chapter. The rest of the book merely fills in the mathematical and computational details for specific applications of these two ideas. This chapter also explains the basic procedural steps shared by every Bayesian analysis. The chapter is here.
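A toy illustration of "reallocation of credibility across possibilities": suppose the possibilities are three candidate biases for a coin, each equally credible beforehand. Observing data reallocates credibility toward the candidates consistent with it. (The candidate values and data here are invented for the example.)

```python
import numpy as np
from math import comb

# The "possibilities": three candidate values of the coin's bias
theta = np.array([0.25, 0.50, 0.75])
prior = np.array([1/3, 1/3, 1/3])   # equal credibility a priori

# Observed data: 7 heads in 10 flips; binomial likelihood per candidate
heads, flips = 7, 10
likelihood = np.array([
    comb(flips, heads) * t**heads * (1 - t)**(flips - heads)
    for t in theta
])

# Bayes' rule: multiply prior by likelihood, then renormalize.
# Credibility flows away from theta = 0.25 toward theta = 0.75.
posterior = prior * likelihood
posterior /= posterior.sum()
```

The posterior puts most credibility on theta = 0.75, some on 0.50, and almost none on 0.25: the same total credibility, reallocated by the data.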