Friday, December 16, 2005

Bernoullian Combination

Jakob Bernoulli's posthumous Ars Conjectandi (1713) is a classic of probability theory that deserves to be better known among students of early modern thought. One of the interesting things about the work is its proposal of a non-Bayesian way of dealing with probability of arguments. We tend to forget that originally there was no notion of numerical probability; our tendency to think of probabilities numerically is precisely one of our inheritances from the early modern period. Originally a probability was an argument; and much early probability theory gets its major impulse from the first exploratory attempts to see if this sense of probability could be clarified by algebra.

Against this background, it was natural that people would begin seeing what analogies there were between gambling, an area where decision and argument were very easily mathematized, and other areas. Bernoulli explored a way to mathematize probable arguments by seeing them as analogous to a game of chance. A probable argument, Bernoulli suggested, can be broken down into 'cases of argument', where each case is a possible outcome of using the argument. So, as just one example, if I put forward a given argument, I could break this down into a case in which the argument yields A and a case in which the argument does not yield A. That is, if I have a probable argument, there will be cases in which the probable argument will give a given possible conclusion and cases in which it won't.

If we break down a probable argument into its cases, we can then combine arguments in an interesting probabilistic way. Here's an example. Suppose someone has been murdered. We know there was only one murderer, we have a profile of the murderer, and we know that p people fit the profile. Since the profile fits p suspects, for each suspect we can regard the profile as a probable argument for the conclusion that that suspect is the murderer. These are the cases of the argument. Suppose that one of the suspects is called Gracchus; then there are p-1 cases of the argument in which the argument does not prove Gracchus guilty, and 1 case in which it does. So far, so good; but let us add another argument, since we're interested here in the combination of arguments. Suppose that Gracchus, on being questioned about the murder, turns pale. Let's put on our sleuthing hats. Gracchus's pallor can form the basis of a probable argument about his participation in the murder. If we break this down into its cases, there would be a case in which this argument would prove his guilt, because the guilt and the pallor would actually be linked; and there would be cases in which it would not (because the pallor would not be linked to the guilt). Let the total number of these cases be called q; then the number of cases which leave the matter open is q-1.

By combining this argument with the previous one, we could reason in this way. We can use our first probable argument to divide the q-1 open cases into cases in which Gracchus is guilty and cases in which he is innocent. In accordance with that first argument, a proportion of these cases, 1/p, yields the conclusion of guilt, because one case out of p will lead to guilt; the rest, (p-1)/p, will yield the conclusion of innocence. So if we put together all the cases yielding the conclusion that Gracchus is guilty, we get the following number:

1 + (q-1)/p

Think it through a moment, and you'll see why: we have 1 case of guilt, and we add to that a proportion 1/p of the q-1 cases. The combined probability that Gracchus is guilty, then, is as follows:

[1 + (q-1)/p]/q

which can be simplified to:

(p + q -1)/pq

Suppose there were 20 suspects; and suppose that we judge on the basis of our experience that guilt can be inferred from pallor one out of every hundred times. Assuming I haven't made a stupid addition mistake, the probability of guilt is a little under 3/50 (more precisely, 119/2000). Suppose there were only 4 suspects, and we could correctly infer guilt from pallor one out of every ten times. Then the probability of guilt would be 13/40.
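For anyone who wants to check the arithmetic, here is a minimal Python sketch of the combination described above. The function name combined_guilt and its argument checks are my own illustrative choices, not anything from Bernoulli; it simply computes [1 + (q-1)/p]/q with exact fractions.

```python
from fractions import Fraction


def combined_guilt(p: int, q: int) -> Fraction:
    """Combine the profile argument (p suspects) with the pallor
    argument (q cases, one of which proves guilt).

    Hypothetical helper for illustration: returns [1 + (q-1)/p] / q,
    which simplifies to (p + q - 1) / (p * q).
    """
    if p < 1 or q < 1:
        raise ValueError("p and q must be positive integers")
    # One case of the pallor argument proves guilt outright; the
    # remaining q-1 open cases are split by the profile argument,
    # which assigns a share 1/p of them to guilt.
    guilty_cases = 1 + Fraction(q - 1, p)
    return guilty_cases / q


print(combined_guilt(20, 100))  # 119/2000
print(combined_guilt(4, 10))    # 13/40
```

Using Fraction rather than floating point keeps the results in the same form as the figures in the text.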

One of the interesting features of this approach is that, depending on the arguments we are combining, we can't assume that the probabilities for and against a conclusion sum to one. This isn't actually surprising, given that we are dealing with probable arguments; we would hardly expect it always to be the case that the arguments we are combining always cover all possible cases.

Bernoulli's work was only published posthumously in 1713; however, it was an attempt to bring mathematics to the understanding of probability that was then common. It's not surprising, then, that if we look at what people in the seventeenth and eighteenth centuries say about probabilities and chances, we find that they think of them in terms that are (broadly) like those of Bernoulli, and not really like the way we tend to think of them. Barry Gower had a nice article in the April 1990 Hume Studies in which he pointed out that recognizing this explains one of the common objections to Hume's argument against miracles. It was common for people to insist that whatever the probability against miracles may be, it has no effect on the probability for miracles. Their conception of probability is certainly not Bayesian; it is not even directly probability-theoretical (since it's based on a different, albeit related, notion of probability). Rather, it is much closer to the Bernoullian combination of arguments (as Hume's own argument also is). Pretty much everyone in the period has this view of the matter; they differ on details, but they tend to think along the same lines.

Glenn Shafer has a really good article on the significance of Bernoulli's Ars Conjectandi for probability theory (PDF format).