On Probability

Over this winter vacation, I’ve been intellectually focused on some background mathematics supporting my work in detection. One of my projects is reading through a rather introductory text on probability by Athanasios Papoulis (1965). I’ve found the opening sections of this book very philosophically relevant, and the book itself looks like a promising textbook – if I can ignore the philosophy being presented.

Papoulis opens the text with a very blunt statement, which instantly alerts me to the conceptual framework in which this mind is operating: “Scientific theories deal with concepts, never with reality”. What follows in his introduction is a justification for approaching the subject purely from a deductive process starting with a set of axioms, and not worrying – much – about the relationship between the theory and “the real world, whatever that means”. Among the blunt statements in the introduction is this:

To conclude, we repeat that the probability P(A) of an event A must be interpreted as a number assigned to this event, as mass is assigned to a body. In the development of the theory, one should not worry about the “physical meaning” of P(A). This is what is done in all theories.

After this very open confession of intellectual sterility, the author gets down to work by asking how probability – which he has just declared to be arbitrarily defined numbers – should be defined. He offers three strawman definitions, and then settles on a fourth. The strawmen are of some interest as well:

(1) Relative Frequency Definition. This is probably the definition most commonly assumed – that probability is the ratio of the number of trials producing the outcome of interest to the total number of trials. This needs to be more carefully constructed as the limit of this ratio as the number of trials “goes to infinity”, or, as I’ll even more carefully say it, as the number of trials becomes arbitrarily large. Papoulis rejects this approach as too cumbersome, because no matter how many actual trials are made, we never approach “infinity”, and therefore can never say that we have a sufficient number of trials to establish meaning for this ratio.
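To make the relative frequency idea concrete, here is a minimal sketch in Python that simulates rolls of a fair die and reports the ratio for a few trial counts. The function name `relative_frequency`, the fixed seed, and the trial counts are assumptions chosen for illustration; nothing here comes from Papoulis.

```python
import random

def relative_frequency(n_trials, target=1, seed=0):
    """Fraction of n_trials simulated fair-die rolls that show `target`."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same rolls
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == target)
    return hits / n_trials

if __name__ == "__main__":
    # The ratio wanders for small n and settles near 1/6 as n grows,
    # but no finite n ever reaches the limit itself.
    for n in (10, 1_000, 100_000):
        print(n, relative_frequency(n))
```

The ratio for any finite number of trials is only an estimate, which is exactly the objection raised above: the simulation can suggest the value 1/6 but cannot establish it.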

(2) Classical Definition. Here the probability of an event is determined by a-priori reasoning about the situation at hand. For example (Papoulis’ example) a six-sided (fair) die has six equally possible outcomes, so the probability of getting a one on a die roll is 1/6. No experiment need be made to reach this (rational) conclusion. This strawman is knocked down by stating that this really only can work for simple cases, and works most easily only when the outcomes are of equal likelihood. A couple of examples are given to show that determining the proper probability in this manner is very error prone. For example, one could attempt to determine the likelihood of rolling a total of 2 with two dice by saying the number of possible outcomes is 11 (2,3,…,12), and we are interested in the value 2, so the probability is 1/11, which is of course wrong: the equally likely outcomes are the 36 ordered pairs of faces, and only the pair (1,1) gives a total of 2, so the correct value is 1/36.
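The two-dice example can be checked by brute-force counting. The sketch below enumerates the ordered pairs of faces and counts those that give a chosen total; the helper name `prob_of_sum` is hypothetical, and the code assumes two fair six-sided dice.

```python
from fractions import Fraction
from itertools import product

# The equally likely outcomes are the 36 ordered pairs (first die, second die),
# not the 11 possible totals.
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob_of_sum(total):
    """Classical probability that two fair dice add up to `total`."""
    favorable = sum(1 for a, b in OUTCOMES if a + b == total)
    return Fraction(favorable, len(OUTCOMES))

if __name__ == "__main__":
    for total in range(2, 13):
        print(total, prob_of_sum(total))
```

Counting over the totals 2 through 12 treats eleven unequally likely events as if they were equally likely, which is where the 1/11 answer goes wrong.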

(3) Measure of Belief. This one seems to be thrown in as an easily dismissed psychologically based argument.

Papoulis chooses to define probability from three axioms, and claims to then proceed to develop the entire theory of probability from these axioms (plus one more minor extension). The axioms seem ridiculously primitive:

The probability of an event A is a number P(A) assigned to this event. This number obeys the following three postulates, but is otherwise unspecified:
I. P(A) is positive or zero
II. The probability of the certain event is 1 [the certain event always occurs]
III. If A and B are mutually exclusive [they both cannot happen in the same trial] then P(A+B)=P(A)+P(B). [The probability that A or B happens is equal to the sum of the probability that A happens and the probability that B happens].

And that’s it! That’s his “definition” for probability, which clearly lacks any meaningful tie to the “real world, whatever that means”. So he can proceed in philosophical comfort.

Or can he?

Interestingly, in order to start his progression from these axioms, he needs to introduce a large segment of set theory – otherwise he cannot define what an “event” is, which is contained in this definition of probability. Without belaboring set theory here, an “outcome” is one possible result of a trial of the process we are trying to test. The set of all possible outcomes is the “certain event” mentioned in the definition – one element of the certain event will always be the outcome of any trial, so the probability of the certain event is 1. An “event” is then any subset of the set of all possible outcomes.
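For an experiment with finitely many outcomes, these definitions and the three postulates can be checked mechanically. The following is a rough sketch, not anything from the book: `all_events`, `satisfies_axioms`, and the two sample assignments are hypothetical names introduced for illustration, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations

def all_events(outcomes):
    """Every subset of the outcome set, i.e. every event of a finite experiment."""
    items = list(outcomes)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def satisfies_axioms(P, outcomes):
    """Check postulates I-III for an assignment P(event) over a finite outcome set."""
    events = all_events(outcomes)
    certain = frozenset(outcomes)            # the certain event: all outcomes
    axiom_1 = all(P(a) >= 0 for a in events)
    axiom_2 = P(certain) == 1
    axiom_3 = all(P(a | b) == P(a) + P(b)    # a | b is the event "A or B"
                  for a in events for b in events
                  if not (a & b))            # only mutually exclusive pairs
    return axiom_1 and axiom_2 and axiom_3

DIE = range(1, 7)

def fair_die(event):
    """The usual assignment for a fair die: number of faces in the event over 6."""
    return Fraction(len(event), 6)

def squared(event):
    """A made-up assignment that is non-negative and gives 1 to the certain event."""
    return Fraction(len(event), 6) ** 2

if __name__ == "__main__":
    print(satisfies_axioms(fair_die, DIE))
    print(satisfies_axioms(squared, DIE))
```

The second assignment satisfies postulates I and II but not III, which shows that the postulates do constrain the numbers even though they say nothing about where the numbers come from.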

Next, we encounter a very bizarre twist in this “axiomatic” probability theory. Not all events, says Papoulis, can have a probability assigned to them. That is, not all sets of possible outcomes can be given a number that will meet the axiomatic conditions from which the theory of probability will be developed. Papoulis does not confront this problem directly, treating it as merely an annoyance, and referring the reader to measure theory for a better explanation, but he does give a major example to indicate where the problem lies. Suppose the outcome of a process that we are interested in could be any real number (real numbers are all of the common numbers from “negative infinity” to “positive infinity”, including all rational and irrational numbers). Then the “certain event” is the set of all real numbers. But consider the event which is the set consisting of a single number, say the set {3}. If every set of this type is given a probability, there will be an “infinite” number of these probabilities, and in order for axiom III to remain true, each of these probabilities would need to be zero. Then for any specific outcome of a trial of the process, call it A, the event {A} would have a probability zero – which is a clear contradiction.
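The additivity step in this argument can be written out as a short derivation. This is only a sketch under an extra assumption made here for illustration: that every single-number event is assigned the same probability p. The symbols p, n, and x_1 through x_n are not from the book.

```latex
% Assumption (illustration only): every singleton {x} receives the same probability p.
% Take n distinct real numbers x_1, ..., x_n; the events {x_i} are mutually exclusive.
\begin{align*}
P(\{x_1\} + \cdots + \{x_n\}) &= P(\{x_1\}) + \cdots + P(\{x_n\}) = n\,p
  && \text{(axiom III, applied repeatedly)} \\
n\,p &\le 1
  && \text{(no event is more probable than the certain event)} \\
p &\le \tfrac{1}{n} \ \text{for every } n \quad\Longrightarrow\quad p = 0
  && \text{(axiom I gives } p \ge 0\text{)}
\end{align*}
```

Nothing in the derivation depends on which numbers are chosen, only on there being more of them than any finite n.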

The escape from this contradiction is itself quite bizarre, and only partially explained (we are referred to measure theory for a more complete explanation). For this example, only events that can be formed from the union or intersection of a countable number of continuous intervals or isolated points will be given a probability. There are sets that cannot be so formed (we are told they are complicated to construct, and I recall similar constructions from my days of formal math training, and I agree with him), and these will not be given probabilities.

If this escape seems to make little sense – and Papoulis seems to understand that the reader will not be able to make sense of this – he offers a better escape: “…one can construct certain pathological sets that are not countable intersections or unions of intervals. Sets of this kind have no probabilities, but are of no importance in applications, and we can forget them”.

Now this is a simply amazing demonstration of a rationalist getting boxed into a corner, and then escaping inelegantly by referring us back to the real world (whatever that means) which he has already denied can be consulted in developing a proper mathematical “theory”.
