Historians and probability: Is Bayes a blunder?

Professor James F. McGrath (Clarence L. Goodwin Chair in New Testament Language and Literature at Butler University in Indianapolis, Indiana) blogs as Religion Prof at Patheos. In a recent post (link), McGrath reviews another blogger’s review of Richard Carrier’s work concerning Saint Paul’s mention of James as “the brother of the Lord” (Galatians 1:19). McGrath alleges:

In essence, Carrier’s approach commits the same blunder that undergraduate students sometimes do before coming to grips with how historians work.

Your obedient servant holds no brief from Dr. Carrier, but the essence of Carrier’s approach is that Bayesian methods can and should be applied to historical questions. I agree with that essence (link).

This post considers whether Professor McGrath has identified some hidden incompatibility between “common sense reduced to calculation,” as Laplace described Bayesian techniques, and normative post-graduate history.

Let’s hope not.

What McGrath thinks is relevant, what Carrier did, and what McGrath would do instead

Here’s an example of reasoning about the human past of which McGrath approves:

If surveillance video footage and fingerprints place you at a crime scene, the fact that your fingerprints were not found on the exterior door handle of the building in which the crime occurred, or one particular camera failed to record you, is irrelevant – or should be.

Irrelevant? At trial, the defense attorney should not be allowed to ask where fingerprints were looked for, where your fingerprints were found, and whether all the cameras agree? I’m glad I don’t live in that jurisdiction. In any real case, the devil would romp in the details: are there reasons why fingerprints might be found some places inside but not on the door handle, even if you were present? Is the coverage of the cameras hardly overlapping, so that even if you were present, you’d likely show up on some, but not others?

Even if the answers to both questions were yes, the questions would still be relevant. Among other reasons, that would be because the answers might be no. Anomalies in the evidence might point to planted or fabricated “evidence,” or the removal of exculpatory evidence (e.g. perhaps people besides the defendant were also present at the crime scene). If not yours, then whose fingerprints were found on that door handle, and if not you, then what do those other cameras show?

Professor McGrath’s example is especially unfortunate, because faked and edited are adjectives that come up often in discussions of New Testament literature. Some people wonder whether “Paul’s” mention of this very James is an interpolation. That’s professional New Testament language for a faked or edited passage within some larger work. That possibility imposes a serious limitation on the probative value of this evidence.

Anyway, Carrier wasn’t writing crime fiction, so what wrong did he do? According to Professor McGrath, with respect to Galatians 1:19 (emphasis in original):

… Richard Carrier (who has a PhD) begrudgingly decides that this piece of evidence is 2:1 in favor of the historicity of Jesus. What the rest of his Bayesian enterprise seeks to obscure is that that ought to settle the matter.

Why ought an unremarkable 2:1 likelihood ratio settle the historicity of Jesus? McGrath explains:

If you know my sibling and they mentioned me, but you have also heard a number of improbable things about me (whether that my parents won the lottery just in time to pay the medical bills after I was born, that I have been interviewed by MTV News and E! Online, or that I have a tenure track position at a university), the latter details should not be evaluated as reasons to doubt my historicity… once my existence is established, even ludicrous claims that turn out to be false do not make my existence less likely.

Paul doesn’t write that he knows the Lord’s sibling. Paul uses the same word for “brother” in verse 1:19 as he uses several times in the same letter in a variety of senses other than “sibling.” Sibling is an interpretation. The tenability of that interpretation isn’t in dispute: Carrier recognizes the verse as favoring a hypothesis adverse to his own. What is in dispute is whether Paul’s remark “establishes” Jesus’ existence as a historical person.

What’s at stake for the usefulness of Bayesian methods in history?

There is nothing unusual about two people disputing their interpretations of something they’ve both read. There is nothing at stake there for the “Bayesian enterprise” or for the practice of history.

Where things turn fraught is McGrath’s articulation of a heuristic principle that supposedly separates the naive undergrad from the savvy crowd in the Senior Common Room:

Each piece of evidence needs to be evaluated on its own merits. And the fact that some evidence does not confirm something should never be treated as undermining what the positive evidence shows.

The first sentence of that principle is fine as far as it goes. What Bayes reminds the commonsensical is that each piece of evidence needs to be evaluated not only on its own merits but also for how much it contributes to the whole body of evidence. In probabilistic terms, we ask: is this observation conditionally independent of the rest of the evidence?

Example: Continuing Professor McGrath’s “sibling” hypothetical, is it the same person who tells me about meeting somebody’s sibling and then goes on to narrate the parents’ timely win, the cable TV interviews, or the cushy job at the university? If different persons, do they share any common interest in how the uncertainty about somebody’s existence is resolved?

False in one thing, then more likely false in all things is a heuristic known to the ancients. Bayes simply gives that venerable heuristic a mathematical representation, which is just one practical application of attending to probabilistic conditional dependency among pieces of evidence.
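For readers who like to see the arithmetic, here is a minimal sketch of how that heuristic falls out of conditional dependence. Everything in it is an assumption made up for illustration: a single source who is either reliable or not, the prior of 4/5 that the source is reliable, and the two rates at which each kind of source tells the truth. None of these numbers comes from McGrath, Carrier or Paul. The only point is that catching a source in one falsehood lowers the probability that the same source's next claim is true.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only to illustrate the heuristic.
P_RELIABLE = Fraction(4, 5)             # prior that the source is reliable
P_TRUE_IF_RELIABLE = Fraction(19, 20)   # a reliable source's claims are usually true
P_TRUE_IF_UNRELIABLE = Fraction(1, 2)   # an unreliable source is a coin flip


def p_claim_true(p_reliable):
    """Probability that a fresh claim from the source is true,
    averaging over whether the source is reliable."""
    return (p_reliable * P_TRUE_IF_RELIABLE
            + (1 - p_reliable) * P_TRUE_IF_UNRELIABLE)


def p_reliable_after_false_claim(p_reliable):
    """Bayes' rule: probability the source is reliable, given that
    one of its claims has been found false."""
    reliable_and_false = p_reliable * (1 - P_TRUE_IF_RELIABLE)
    unreliable_and_false = (1 - p_reliable) * (1 - P_TRUE_IF_UNRELIABLE)
    return reliable_and_false / (reliable_and_false + unreliable_and_false)


# Confidence in a second claim before anything is known to be false.
before = p_claim_true(P_RELIABLE)

# One claim turns out false, so confidence in the source drops...
posterior_reliable = p_reliable_after_false_claim(P_RELIABLE)

# ...and with it the confidence in the same source's other claim.
after = p_claim_true(posterior_reliable)

print(before, posterior_reliable, after)
```

With these made-up inputs the script prints 43/50 2/7 22/35: confidence in the second claim falls from 86% to about 63%. Had the second claim come from a different and unconnected source, the falsehood would have left it at 43/50, which is why it matters who is telling us what.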

In a modern inference problem, not only can each piece of evidence get evaluated on its own merits and what it adds to the existing evidence, but it can also be evaluated for its potential to develop leads to new evidence. Those lottery winners, media interviews or the academic job should probably each leave some traces on the internet. We cannot corroborate Paul’s remark, stemming as it does from the remote past, and we would be remiss not to consider that when estimating its proper weight.

The second sentence of the proposed principle is just wrong from a Bayesian perspective. If each piece of evidence is evaluated properly, then it may increase, decrease or leave unchanged the confidence in any uncertain proposition. Increasing or decreasing confidence is ruled out only when the new evidence is effectively redundant, implied by the existing evidence, or there is no uncertainty about which one of the hypotheses is true. Otherwise, conclusions based on positive evidence (as much as on any evidence) ought to be “undermined” by observing things that are unexpected if the favorite hypothesis were true.
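A small sketch may make this concrete. It uses the odds form of Bayes' rule, in which prior odds are multiplied by the likelihood ratio of each observation. The 2:1 ratio echoes the figure quoted above; every other number (the priors, the 1:4 ratio for an unexpected observation) is a hypothetical chosen only for illustration, and the simple multiplication assumes the observations are conditionally independent given each hypothesis.

```python
from fractions import Fraction


def update_odds(prior_odds, likelihood_ratios):
    """Odds form of Bayes' rule: multiply the prior odds by each
    likelihood ratio in turn. Assumes the pieces of evidence are
    conditionally independent given each hypothesis."""
    odds = Fraction(prior_odds)
    for ratio in likelihood_ratios:
        odds *= Fraction(ratio)
    return odds


def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis into a probability."""
    return odds / (1 + odds)


# Hypothetical inputs, for illustration only.
favorable = Fraction(2, 1)      # twice as expected if the hypothesis is true
unfavorable = Fraction(1, 4)    # four times as expected if it is false
redundant = Fraction(1, 1)      # equally expected either way

even_prior = Fraction(1, 1)
skeptical_prior = Fraction(1, 10)

# A 2:1 ratio moves an even prior to 2:1, a probability of 2/3.
after_favorable = update_odds(even_prior, [favorable])

# An unexpected observation then pulls the odds back down to 1:2.
after_both = update_odds(even_prior, [favorable, unfavorable])

# Redundant evidence leaves the odds where they were.
after_redundant = update_odds(even_prior, [favorable, redundant])

# From a skeptical prior, the same 2:1 ratio yields odds of only 1:5.
skeptic_after = update_odds(skeptical_prior, [favorable])

print(odds_to_probability(after_favorable))   # 2/3
print(odds_to_probability(after_both))        # 1/3
print(odds_to_probability(after_redundant))   # 2/3
print(odds_to_probability(skeptic_after))     # 1/6
```

On these assumptions, favorable evidence raises confidence, unexpected evidence lowers it again, and redundant evidence changes nothing. The last line shows why a 2:1 ratio by itself settles little: where a reasoner ends up depends on where that reasoner started and on what else has been observed.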


Although Richard Carrier, the object of McGrath’s wrath, is both a Bayesian and a skeptic, Bayesian methods are neither skeptical nor credulous. The methods are, however, constitutionally undogmatic, representing a norm of belief and belief change where only in extreme circumstances does confidence ever rise to moral certainty, the same circumstances where initial disagreement about the uncertain facts spontaneously converges onto nearly unanimous well-warranted agreement.

McGrath may have identified an impediment to the adoption of Bayesian and similar methods in the field of academic history. Maybe there is a heuristic accepted throughout that field which prescribes something like “When there is positive evidence for the existence of a hypothetical person, then that person shall be treated for investigative purposes as a confidently real person.”

Although there may be a case for organizing a collective investigation along those lines, the heuristic offers little or nothing to anybody whose confidence in the hypothetical person isn’t already high. As long as there are only a few such people, the heuristic should present little difficulty. Such unpersuaded people as there are, however, will be inhibited from participating in a collective investigation constituted around the heuristic.

That is troublesome if the scale of the collective investigation is an entire field of inquiry. Inhibition of the non-credulous from entering the field and encouragement of the over-confident instead would be a bigger problem than Bayes can fix.

Photo credit: Detail of an image of a mosaic by photographer Dimitris Vetsikas, who has generously donated this photograph to the public.



