*Bayesian probability theory* is a formal method of reasoning about evidence. Its probabilities are typically *subjective* and *personal* measures. They represent either a real person’s felt confidence, or a hypothetical person’s theoretically justified confidence. Please do not be put off by the word *subjective*. Justified confidence is the foundation of prudent belief, action and behavior.

Richard Carrier is a serious independent scholar and internet celebrity who earned his doctorate in ancient history from Columbia University. He uses Bayesian methods to study history, especially the question of whether Jesus was a real historical person. Carrier professes serene assurance about the objectivity and validity of his Bayesian approach to history (link),

“I don’t think I’ll convince everyone, but the only people who won’t be convinced are people who are irrationally, dogmatically opposed to what I’m arguing.”

This post discusses how well Bayesian methods can resolve historical controversies, in the sense of achieving consensus founded on objective analysis of evidence. Within a community of Bayesians, objectivity and near-unanimity aren’t completely out of reach, but they tend to be elusive except when most people would be convinced whether or not they appeal to Bayes.

**Is being partially objective like being partially pregnant?**

As the word will be used here, *objective* means “accurately assessed independently of anybody’s opinions and preferences.” That’s what consumers of historical scholarship are concerned about when we ponder the odds of “Did that man really live?”

Since a subjective probability expresses somebody’s real or hypothetical opinion, there might seem to be no such thing as an objective subjective probability. Yet there is. There is an objective sense in which the probability of a fair coin coming up heads *is* 50-50. That 50-50 is *also* a useful expression of anybody’s confidence that the next toss will land heads. The subjective estimate and the objective value *coincide*.

There is a heuristic of Bayesian practice, called the “Principal Principle” (searchable). It advises subjectivists to adopt objective probabilities as their personal probability estimates, when reliable objective probabilities are available and are uncontroversially applicable to the uncertainty in question.

Statisticians often work in situations where some probabilities are measured objectively because there is a mathematical model of how the data are gathered, for example, by a random sample. Subjective judgment is still required to decide whether a messy real-world sampling procedure sufficiently resembles the ideal to apply a model without controversy, but once a model is adopted, the probability values it recommends have the desired impersonal objective quality. The values are nearly correct or not, independent of anybody’s opinion or preference.

Unfortunately, that is generally insufficient for complete objectivity all the way through to a conclusion. The characteristic Bayesian analysis (what “Bayes’ Theorem” governs) formalizes not confidence so much as *changes* in confidence. That is, when a Bayesian sees a piece of evidence, Bayes tells her to compare:

How likely (expected, unsurprising, …) that observed evidence would be if one uncertain hypothesis or explanation of the evidence were true

*versus*

How likely that same outcome would be if some alternative hypothesis or explanation were true instead.

A hypothesis under which the observed evidence is less surprising gains confidence while a hypothesis under which the evidence is more surprising loses confidence. The direction and “amount” of change prescribed by Bayes are objective if the hypothetical probabilities of the observed outcome are objective. Nevertheless, if the original confidence levels were subjective, then no matter how objective the basis for the change is, the end point remains subjective.
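The update rule just described can be sketched numerically. Here is a minimal Python illustration with invented numbers: the prior odds between two rival hypotheses are multiplied by the ratio of how unsurprising the observed evidence is under each.

```python
# A sketch of the Bayesian update described above, using made-up numbers.
# prior_odds: confidence in hypothesis H1 versus rival H2 before the evidence.
# The two likelihoods say how expected the observed evidence is under each.

def bayes_update(prior_odds, likelihood_h1, likelihood_h2):
    """Return posterior odds of H1 vs H2 after seeing the evidence."""
    return prior_odds * (likelihood_h1 / likelihood_h2)

# Suppose we start indifferent (odds 1:1), and the evidence would occur
# with probability 0.8 under H1 but only 0.2 under H2.
posterior_odds = bayes_update(1.0, 0.8, 0.2)
print(posterior_odds)  # about 4.0, i.e. H1 is now favored roughly 4:1
```

Note that the direction of the change depends only on which hypothesis made the evidence less surprising, exactly as the prose says; the starting odds merely scale the result.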

**Reference classes**

The maternal womb of probabilistic and statistical theory was the casino, and what happens in a casino is objective. Gambling apparatus are engineered to have orderly probabilistic properties. So, too, are sampling procedures, laboratory instruments and other data sources which real-life statisticians use.

History has no engineered apparatus making data for it, but there is another way to achieve objectivity. Shortly after the flowering of practical probability theory based on gambling apparatus, another breakthrough discovery was made. The dismal study of Seventeenth Century bills of mortality revealed a remarkable undesigned rule-likeness that exists “in the wild,” outside the casino or laboratory. The same probability mathematics that tells us about roulette spins will also tell us, with some precision, how many Londoners will die within a year. Many other kinds of groupings and many happier attributes also display a mathematically regular and usefully predictive stability of measured proportions. Such a grouping is called a *reference class*.

There remains some art to assembling useful reference classes. Art is judgment, and all judgment is subjective, but judgment can be principled and commonsensical. Like the profitability of a casino, the predictable profits of insurance companies prove that people can construct reference classes with objective probabilistic characteristics. The Principal Principle counsels Bayesians to adopt these proportions as their personal probabilities whenever that is possible and uncontroversially applicable.

Using reference classes, Bayes can be applied to history, with objective validity. For example, what proportion of known Fifth Century Egyptian papyri use a specific style of calligraphy? Interpreted as a probability, that measured proportion can help estimate a defensible date for a newly discovered ancient Egyptian papyrus with that style.
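As a hypothetical sketch of that papyrus example (all proportions invented for illustration), the measured reference-class proportions can serve as likelihoods in a textbook Bayesian update over candidate dates:

```python
# Hypothetical illustration of the papyrus-dating example: the measured
# proportion of dated papyri showing the calligraphy style in each century
# serves as the likelihood of observing that style under each dating.

def posterior_over_dates(priors, style_proportions):
    """Multiply prior by likelihood for each candidate date, then normalize."""
    unnormalized = {c: priors[c] * style_proportions[c] for c in priors}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}

# Invented reference-class proportions: 60% of known 5th-century papyri
# show the style, versus 10% of 6th-century ones; priors are even.
priors = {"5th century": 0.5, "6th century": 0.5}
proportions = {"5th century": 0.60, "6th century": 0.10}
print(posterior_over_dates(priors, proportions))
```

With these invented numbers the style shifts confidence strongly toward the fifth-century dating, which is just the Principal Principle plus Bayes’ Theorem at work.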

**Limits to unanimity and objectivity in ancient history**

The best news of Bayes is that if there is enough opportunity to gather high-quality evidence, the original starting confidence eventually becomes irrelevant, while the ending confidence becomes nearly fully determined by objective stuff alone, the actually observed and modeled outcomes. Whichever hypothesis describes the true state of affairs almost surely comes to acquire vastly more confidence than any of its false rivals. The end point thus becomes more and more one of supreme confidence in what’s really true. Since that happy ending depends very little on anybody’s earlier confidence in the winning hypothesis, disagreement among people decreases or disappears altogether.
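That convergence claim can be shown with a small simulation (a toy model with an engineered data source, not a historical method): two Bayesians who start with very different priors watch the same stream of evidence and end up agreeing.

```python
import random

# Toy illustration of priors "washing out": a coin with true heads rate 0.7
# is flipped repeatedly. Two observers compare H1 (heads rate 0.7) against
# H2 (heads rate 0.5), starting from very different prior confidences.

def posterior_h1(prior_h1, flips, p1=0.7, p2=0.5):
    """Posterior probability of H1 after an odds-form Bayesian update."""
    odds = prior_h1 / (1 - prior_h1)
    for heads in flips:
        odds *= (p1 if heads else 1 - p1) / (p2 if heads else 1 - p2)
    return odds / (1 + odds)

random.seed(0)
flips = [random.random() < 0.7 for _ in range(500)]  # simulated evidence
optimist = posterior_h1(0.9, flips)  # started confident in H1
skeptic = posterior_h1(0.1, flips)   # started doubtful of H1
print(optimist, skeptic)  # both end up near 1 after enough evidence
```

With 500 pieces of evidence the true hypothesis dominates and the two posteriors become nearly indistinguishable, despite the 0.9-versus-0.1 starting disagreement. The next paragraph explains why ancient historians rarely get 500 fresh pieces of evidence.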

The less joyous news for Bayesian ancient historians is that they have little opportunity to gather new good-quality evidence on some issues. There may be a Dead Sea Scrolls discovery or a Nag Hammadi find, but not often, and not on demand. Bayesian methods do the most possible with sparse evidence, but the best methods cannot make up for a shortage of evidence. So long as the available evidence is meager, Bayes *predicts* that existing principled differences of opinion will persist, not disappear.

As to achieving objectivity through reference classes, the ugliest fly in that ointment is that any single object often belongs to several different validly constructed reference classes. Yes, this is an Egyptian papyrus, but it was found in the north of Egypt, and its style of calligraphy was much rarer among Fifth Century northern Egyptian examples than in the country as a whole. So which is it? Does the document show a typical Fifth Century style or a rare and atypical one?

There are heuristics that offer advice on which reference class among several should be used. However, selection among competing heuristics is itself an exercise of judgment. If there is wide disagreement among the available reference classes, then it is predictable that experts will differ about which reference class is really the right one. *Objectively* the datum belongs to *all* the reference classes whose membership requirements it satisfies. Disagreement among human beings, who might each choose a different reference class as a personal favorite, is *subjective*, a garden-variety difference of opinion.
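A toy numerical version of this reference-class conflict, with invented counts: the same papyrus belongs to both classes, and each class recommends a different, yet objectively measured, probability.

```python
# Invented counts for the reference-class problem described above. The
# papyrus satisfies the membership requirements of both classes, and each
# class yields a different objectively measured proportion for the style.

all_egypt = {"members": 1000, "with_style": 300}  # 30% countrywide
northern = {"members": 120, "with_style": 6}      # 5% in the north

def style_probability(ref_class):
    """The measured proportion of class members showing the style."""
    return ref_class["with_style"] / ref_class["members"]

print(style_probability(all_egypt))  # 0.3
print(style_probability(northern))   # 0.05
# Both proportions are objective measurements; nothing in the data itself
# says which one a Bayesian "should" adopt for this particular document.
```

The arithmetic is trivial; the disagreement is not. Choosing between 0.3 and 0.05 as one’s personal probability is exactly the subjective judgment the paragraph above describes.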

The upshot, then, is that the potential of reference-class analysis to offer objective probability advice outside the artificial worlds of the casino and laboratory will sometimes remain unrealized. The reference class idea is unlikely to transcend personal opinion unless there’s already something similar to unanimity about which reference classes are applicable.

**Somebody always asks, “Where do the numbers come from?”**

Pierre-Simon Laplace, who invented much of what now goes by the name “Bayesian,” described the technique as “Common sense reduced to calculation.” In a field like ancient history there is often no definite source for any specific numbers to calculate with. Serious people also wonder exactly what calculation adds to common sense even when numbers are readily available.

A Bayesian historian may wish to consider qualitative probabilistic methods, especially where data are sparse. These techniques preserve the persuasive reasonableness of Bayesian analysis, without becoming bogged down in personal numerical values. Qualitative Bayesian practice has more the look and feel of a logical argument than of an algebra assignment.

The work of George Polya (an important Twentieth Century mathematician, friend and debate partner of the Bayesian hero Bruno de Finetti, link) is a good place for a novice qualitative Bayesian to start. Polya’s short, accessible but masterful book on evidential reasoning inspired by probability theory is available for free at archive dot org (link).

In theory I think it’s great; in practice, I’m not that good with figures, so I don’t practice.