In this post, I describe how it is easier to intuit the beta distribution in terms of its mode than its mean. This is especially handy when specifying a prior beta distribution.
(In a previous post, I explained how it is easier to intuit the *gamma* distribution in terms of its mode instead of its mean.)

A problem with using the mean to describe a distribution is that, for skewed distributions, the mean may be far from the mode. The mode is often what we intuitively want as the "descriptive handle" on the distribution, so the mean is a poor surrogate for describing central tendency. Especially when we are specifying a prior distribution, we may want to express our intuition in terms of the mode of the prior instead of the mean.

For a beta distribution with shape parameters a and b, the mode is (a-1)/(a+b-2), provided a≥1, b≥1, and a+b>2.

**Suppose we have a desired mode, and we want to determine the corresponding shape parameters.** Here's the solution. First, we express the "certainty" of the estimate in terms of the equivalent prior sample size,

k=a+b, with k≥2.

The certainty must be at least 2 because it essentially assumes that the prior contains at least one "head" and one "tail," which is to say that we know each outcome is at least possible. Then a little algebra reveals:

a = mode * (k-2) + 1

b = (1-mode) * (k-2) + 1
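The "little algebra" can be spelled out as follows. This is only a sketch of one way to get there; the symbol ω for the mode is my own shorthand and does not appear elsewhere in the post. It substitutes k = a + b into the mode formula, solves for a, and then gets b from b = k - a.

```latex
\begin{align*}
\omega &= \frac{a-1}{a+b-2} = \frac{a-1}{k-2}
  && \text{(substitute } k = a+b \text{, assuming } k > 2\text{)} \\
a &= \omega\,(k-2) + 1 \\
b &= k - a = (k-2) - \omega\,(k-2) + 1 = (1-\omega)\,(k-2) + 1
\end{align*}
```

When k = 2 the formulas give a = b = 1, which is the uniform distribution; it has no unique mode, so the mode value has no effect in that case.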

Here are a few examples:
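As a rough numerical sketch of the conversion above, here is a small Python version. The function names `beta_shapes_from_mode` and `beta_mode` and the particular (mode, k) pairs are my own illustrative choices, not values taken from the original examples.

```python
def beta_shapes_from_mode(mode, k):
    """Return shape parameters (a, b) for a desired mode and certainty k = a + b."""
    if not 0.0 <= mode <= 1.0:
        raise ValueError("mode must lie in [0, 1]")
    if k < 2:
        raise ValueError("k must be at least 2")
    a = mode * (k - 2) + 1
    b = (1 - mode) * (k - 2) + 1
    return a, b


def beta_mode(a, b):
    """Mode of a beta(a, b) distribution; needs a, b >= 1 and a + b > 2."""
    if a < 1 or b < 1 or a + b <= 2:
        raise ValueError("mode formula needs a >= 1, b >= 1 and a + b > 2")
    return (a - 1) / (a + b - 2)


# Illustrative (mode, k) pairs, chosen only for this sketch.
for mode, k in [(0.5, 10), (0.75, 10), (0.25, 6)]:
    a, b = beta_shapes_from_mode(mode, k)
    # Round-trip check: the recovered mode should match the requested one.
    print(f"mode={mode}, k={k} -> a={a}, b={b}, recovered mode={beta_mode(a, b)}")
```

The round trip through `beta_mode` is just a sanity check that the two formulas are inverses of each other for k > 2.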

The book expressed beta distributions in terms of mean and certainty instead of mode and certainty; cf. Eqn. 5.5, p. 83, where m denoted the mean and n denoted the certainty instead of k used here.
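To make the contrast concrete, here is a hedged sketch comparing the two parameterizations. It assumes the usual relations for the mean-based form, m = a/(a+b) and n = a+b, which give a = m*n and b = (1-m)*n; I have not reproduced the book's equation here, and the function names and input values are hypothetical.

```python
def beta_shapes_from_mean(m, n):
    """Shapes from mean m = a/(a+b) and certainty n = a + b (assumed relations)."""
    return m * n, (1 - m) * n


def beta_shapes_from_mode(mode, k):
    """Shapes from mode and certainty k = a + b, as in the formulas above."""
    return mode * (k - 2) + 1, (1 - mode) * (k - 2) + 1


central, certainty = 0.75, 10  # illustrative values only

# Treat 0.75 as the mean, then see where the mode ends up.
a_mean, b_mean = beta_shapes_from_mean(central, certainty)
mode_of_mean_based = (a_mean - 1) / (a_mean + b_mean - 2)

# Treat 0.75 as the mode, then see where the mean ends up.
a_mode, b_mode = beta_shapes_from_mode(central, certainty)
mean_of_mode_based = a_mode / (a_mode + b_mode)

print("mean-based:", a_mean, b_mean, "mode =", mode_of_mean_based)
print("mode-based:", a_mode, b_mode, "mean =", mean_of_mode_based)
```

With these inputs, the mean-based prior is beta(7.5, 2.5) with its mode at 0.8125, while the mode-based prior is beta(7, 3) with its mean at 0.7, so the same "central value" of 0.75 yields noticeably different priors depending on which handle is used.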