BINOMIAL PROBABILITY.

In some statistics textbooks, the application of the “binomial theorem” to probability theory is dealt with, but in an obscure and cursory manner. In this treatment I aim to deal with the topic at considerable length, but with absolute clarity; “dotting the i’s and crossing the t’s”, and leaving no room for ambiguity or misunderstanding. Anyone reading this tutorial who has difficulty with any point is welcome to email me and tell me what I have failed to clarify adequately, so that I can make the necessary amendments to future editions. My email address is as follows:-

elliottroger99@googlemail.com

I hope that this tutorial will be useful to you.

All the best! Roger Elliott

Here is the type of question that frequently comes up in probability theory:-

QUESTION:- If you throw a (fair) dice five times, what are the statistical odds against getting (for instance) a THREE on all five throws? Before answering this question, we must define a few terms.

If you throw a dice once, the probability (p) that SOME number will be thrown is 1.

(If an event is absolutely certain to happen, then the probability of that event happening is 1)

If you throw a dice, the probability (p) that NO number will be thrown is 0.

(The probability (p) of an impossible event happening is 0.)

In other words, by convention, probability is expressed as a number between 0 and 1.

If you throw a fair (six sided) dice, the probability of throwing one specific number (for instance a THREE) = p = 1/6 = 0.166666

The probability of NOT throwing a THREE = q = 1 – p = 1 – 1/6 = 5/6 = 1 – 0.166666 = 0.833333

In other words, if the probability that an event will happen is p, then the probability that the event will NOT happen is 1 – p, which is always denoted as q

Before we go any further, let’s state a generalized “formula” for calculating the probability p of an event occurring:-

The probability of an event having a successful outcome is defined as the number of possible successful outcomes divided by the “possibility space”, provided that all the possible outcomes are equally likely (as they are with a fair dice). The “possibility space” is the total number of all the possible outcomes. Or to express it differently, the probability that an event will have a successful outcome is the number of possible successful outcomes divided by the total number of possible outcomes.

For instance, the possibility space (ie:- the number of possible outcomes) of a dice is 6, because the dice has 6 faces, any of which can be uppermost. With this “formula” in mind, let’s look at a number of examples:-

QUESTION:- If you throw a dice once, what is the probability of throwing a THREE?

ANSWER:- The number of possible successful outcomes = 1 (because there is only 1 number that “counts” as being a “success” ie:- throwing the number THREE). The possibility space (ie:- the total number of possible outcomes) = 6, because there are 6 faces on the dice. The probability of a successful outcome = p = the number of possible successful outcomes divided by the possibility space = (1 ÷ 6) = 1/6 = 0.16666

QUESTION:- If you throw a dice once, what is the probability of throwing an even number?

ANSWER:- The number of possible successful outcomes = 3 (because there are 3 even numbers on a six sided dice, ie:- 2, 4, and 6). The possibility space (ie:- the total number of possible outcomes) = 6 (because there are 6 faces on the dice). The probability of a successful outcome = p = the number of possible successful outcomes divided by the possibility space = 3/6 = 1/2 = 0.5

QUESTION:- If you throw a dice once, what is the probability of throwing a number that is less than SIX?

ANSWER:- The number of possible successful outcomes = 5 (because a dice has 5 numbers that are less than SIX, ie:- 1, 2, 3, 4, and 5). The possibility space (ie:- the total number of possible outcomes) = 6. The probability of a successful outcome = number of possible successful outcomes divided by the possibility space = 5/6 = 0.833333
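The three answers above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `probability` and the use of the standard `fractions` module are my own choices for the illustration, and the code assumes that all six faces of the dice are equally likely.

```python
from fractions import Fraction

def probability(successful_outcomes, possibility_space):
    """Number of successful outcomes divided by the total number of outcomes.

    Assumes every outcome in the possibility space is equally likely.
    """
    return Fraction(len(successful_outcomes), len(possibility_space))

faces = [1, 2, 3, 4, 5, 6]   # the possibility space of a six sided dice

p_three = probability([f for f in faces if f == 3], faces)
p_even = probability([f for f in faces if f % 2 == 0], faces)
p_less_than_six = probability([f for f in faces if f < 6], faces)

print(p_three, float(p_three))                  # 1/6, roughly 0.1667
print(p_even, float(p_even))                    # 1/2, exactly 0.5
print(p_less_than_six, float(p_less_than_six))  # 5/6, roughly 0.8333
```

Working with exact fractions avoids the small rounding differences that appear when 1/6 is written as 0.16666.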

Now let’s continue with our original question:- If you throw a (fair) dice five times, what are the statistical odds against getting (for instance) a THREE on all five throws?

Let these five throws be considered as five separate events, with each event being designated by a letter of the alphabet, ie:-

Event A.

Event B

Event C

Event D.

Event E.

The probability that the dice throw of Event A will produce a THREE = 1/6 = 0.166666

The probability that the dice throw of Event B will produce a THREE = 1/6 = 0.166666

The probability that the dice throw of Event C will produce a THREE = 1/6 = 0.166666

The probability that the dice throw of Event D will produce a THREE = 1/6 = 0.166666

The probability that the dice throw of Event E will produce a THREE = 1/6 = 0.166666

The question is:- What is the probability that Events A, B, C, D, and E will ALL produce a THREE?

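In order to calculate this, we use a rule known as “The Law of Compound Probability”. This rule states as follows:- The probability that two independent events will BOTH have “successful” outcomes is the PRODUCT of the probability of a successful outcome for the first event and the probability of a successful outcome for the second event. In other words, you MULTIPLY the probabilities together. (Two events are independent if the outcome of one has no effect on the outcome of the other, as is the case with separate throws of a dice.) The same rule extends to three or more independent events.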

The “Law of Compound Probability” is a fundamental law of probability, and a foundation stone upon which almost the whole of probability theory rests.

Now we will apply The Law of Compound Probability to our question:- What is the probability that Events A, B, C, D, and E will ALL produce (for example) a THREE?

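The answer to this question is:- 1/6 x 1/6 x 1/6 x 1/6 x 1/6 = 0.16666 x 0.16666 x 0.16666 x 0.16666 x 0.16666 = $0.16666^{5}$ = 0.0001286, or statistical odds against success of 1 chance in (1 ÷ 0.0001286) = approximately 1 chance in 7,776.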
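As a cross-check on this multiplication, the following Python sketch lists every possible sequence of five throws and counts how many of them consist entirely of THREEs. The variable names are my own, and the code assumes a fair six sided dice with independent throws.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 6)
p_all_five = p ** 5     # multiply the five probabilities together

# Cross-check: list every possible sequence of five throws.
sequences = list(product(range(1, 7), repeat=5))
all_threes = [s for s in sequences if all(face == 3 for face in s)]

print(p_all_five)                        # 1/7776
print(len(all_threes), len(sequences))   # 1 7776
print(round(float(p_all_five), 7))       # 0.0001286
```

Exactly one sequence out of 7,776 is a success, which agrees with multiplying 1/6 by itself five times.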

That was an easy question. Now we will try a harder question.

What is the probability that (the five) Events A, B, C, D, and E will produce exactly 4 successful throws (where getting a THREE constitutes a “success”)?

The probability that the dice throw of Event A will produce a THREE = 1/6 = 0.16666

The probability that the dice throw of Event B will produce a THREE = 1/6 = 0.16666

The probability that the dice throw of Event C will produce a THREE = 1/6 = 0.16666

The probability that the dice throw of Event D will produce a THREE = 1/6 = 0.16666

The probability that the dice throw of Event E will NOT produce a THREE = 1 – 1/6 = 1 – 0.166666 = 5/6 = 0.83333

In that case (using The Law of Compound Probability), the probability of Events A, B, C, and D being “successful” (ie:- throwing a THREE) and Event E being unsuccessful (ie:- FAILING to throw a THREE) = 0.16666 x 0.16666 x 0.16666 x 0.166666 x 0.833333 = 0.0006429

In that case, the probability of throwing (the number) THREE four times in five throws = 0.0006429 – RIGHT????

No! WRONG!!!! We have omitted one small detail.

In the above example, we have assumed that specifically Event E is the event (dice throw) that “fails” (ie:- fails to throw a THREE). However, the “failure” might be with Event A, or with Event B, or with Event C, or with Event D, or with Event E. There are FIVE possible ways for the failure to occur. In that case, we have to multiply our preliminary result by 5 in order to get the correct result.

0.0006429 x 5 = 0.0032145

In that case, the probability of a “success” (ie:- throwing the number THREE) 4 times in 5 dice throws = 0.0032145

(Let me remind you again that the probability (p) of an event being “successful” is expressed as a number between 0 and 1, where 1 represents absolute certainty of success, and 0 represents absolute impossibility of success.)

If the probability of success = 0.0032145, then the statistical odds AGAINST success are 1 chance in (1 ÷ 0.0032145) = approximately 1 chance in 311. (In other words, when the odds are expressed as “1 chance in N”, N is the RECIPROCAL of the probability of success.)
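The “multiply by 5” step can be confirmed by brute force. The sketch below (my own illustration, assuming a fair six sided dice) lists all 7,776 possible sequences of five throws and counts those that contain exactly four THREEs.

```python
from fractions import Fraction
from itertools import product

sequences = list(product(range(1, 7), repeat=5))    # all 6**5 = 7776 sequences
exactly_four = [s for s in sequences if s.count(3) == 4]

p_exactly_four = Fraction(len(exactly_four), len(sequences))

# 25 = 5 possible positions for the failure x 5 faces that are not a THREE
print(len(exactly_four))
print(float(p_exactly_four))       # roughly 0.003215
print(float(1 / p_exactly_four))   # roughly 311
```

The exact value is 25/7776. The last digits differ very slightly from the figure worked out by hand only because 1/6 and 5/6 were rounded to 0.16666 and 0.83333 there.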

It becomes slightly more complicated when we ask the following question:- What are the statistical odds against getting 3 successes (where a “success” means throwing a THREE) with 5 throws of the dice?

Again, we will think of the 5 dice throws as being Event A, Event B, Event C, Event D, and Event E.

The probability that the dice throw of Event A will produce a THREE = 0.166666

The probability that the dice throw of Event B will produce a THREE = 0.166666

The probability that the dice throw of Event C will produce a THREE = 0.166666

The probability that the dice throw of Event D will NOT produce a THREE = 0.83333

The probability that the dice throw of Event E will NOT produce a THREE = 0.833333

In that case, the probability of achieving just 3 successes (where a “success” is throwing a THREE) in 5 dice throws = 0.166666 x 0.166666 x 0.166666 x 0.833333 x 0.833333 = 0.0032150

THE ABOVE ANSWER IS WRONG! Again, we omitted one small detail.

We have assumed that the two “failures” are Events D and E. However, the two failures could be A and B

or A and C

or A and D

or A and E

or B and C

or B and D

or B and E

or C and D

or C and E

or D and E.

In other words, there are 10 possible ways to fail. In that case, we must multiply our initial result by 10.

0.0032150 x 10 = 0.032150, or odds of 1 chance in (1 ÷ 0.032150) = odds of approximately 1 chance in 31.
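The list of ten failure pairs does not have to be written out by hand. As a minimal sketch (the variable names are my own), Python’s standard `itertools.combinations` produces the same list, and the probability of one arrangement can then be multiplied by the number of arrangements:

```python
from itertools import combinations

trials = "ABCDE"
failure_pairs = list(combinations(trials, 2))   # every way to pick the 2 failures

for pair in failure_pairs:
    print(" and ".join(pair))                   # A and B, A and C, ... D and E

p, q = 1 / 6, 5 / 6
one_arrangement = p ** 3 * q ** 2               # 3 successes and 2 failures
total = len(failure_pairs) * one_arrangement

print(len(failure_pairs))   # 10
print(total)                # roughly 0.03215
print(1 / total)            # roughly 31
```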

We can now derive a “formula” from these above results – a formula that will “solve” ALL similar questions.

Let’s ask the following question:- If you throw a dice 5 times, what is the probability of throwing a specific number (let’s say a FOUR – for example) 3 times (in those 5 throws)?

Throwing a FOUR constitutes a “success”. In one dice throw, the probability of success = p = 1/6 or 0.166666

Failing to throw a FOUR constitutes a “failure”. In one dice throw, the probability of failure = q = 1 – p = 5/6 or 0.833333

We throw the dice five times. A throw of the dice is referred to as a “trial”. In that case, the number of trials = n = 5

The number of times we hope to have a success (ie:- to throw a FOUR) = r = 3

We have just introduced four algebraic symbols that are ALWAYS used in this type of question.

The algebraic symbol n always refers to the number of trials (in this case, the number of dice throws).

The algebraic symbol r always refers to the number of successes that we are hoping for. (In this case, we are hoping for 3 successes in 5 dice throws).

The algebraic symbol p always refers to the probability of success in any single specific trial. (With a dice, p = 1/6 or 0.166666, and with a coin, p = 1/2 or 0.5, and with a (fair) 14 sided dice (which can be created), p = 1/14 or 0.071429 etc etc)

The algebraic symbol q always refers to the probability of failure in any single specific trial, so that q is always equal to (1 – p) because 1 represents the probability associated with certainty (or inevitability of outcome).

Now we can state the problem in algebraic terms.

What is the probability of getting 3 successes in 5 trials, where the probability of success = 0.16666 and the probability of failure = 0.8333?

(Note:- 0.166666 + 0.833333 = 1)

Stated algebraically:-

n = 5 and r = 3 and p = 0.16666 and q = 0.83333

The “formula” for solving this question is as follows:-

$^{5}C_{3} \times 0.83333^{2} \times 0.16666^{3}$

Or the formula in its generalized algebraic format:-

$^{n}C_{r} \times q^{n-r} \times p^{r}$

(Notice the superscripts and the subscripts).

(Did you notice the fact that the exponent 2 (in the term $0.83333^{2}$) = 5 – 3, which is equivalent to n – r in the generalized algebraic version of the formula?)

Let’s look at the individual terms in this formula:-

QUESTION:- Why $0.83333^{2}$?

ANSWER:- Because the probability of “failure” for one single trial is 0.83333 and there are 2 failed trials.

QUESTION:- Why $0.16666^{3}$?

ANSWER:- Because the probability of “success” for one single trial is 0.16666 and there are 3 successes.

QUESTION:- Why $^{5}C_{3}$?

ANSWER:- The notation $^{5}C_{3}$ means the number of possible combinations of 5 items taken 3 at a time; in this case, the number of possible ways of choosing which 3 of the 5 trials are the successes. Choosing the 3 successes automatically fixes the 2 failures, so this is the same as the number of possible ways of choosing 2 items from a group of 5 items, ie:- the number of possible ways of failing when there are 5 trials and 2 failures. You can do this on most good pocket calculators using the following series of 4 keystrokes:-

Keystroke 5

Keystroke nCr

Keystroke 3

Keystroke =

The result will be 10

Alternatively, you can set aside the pocket calculator and work out the number of possible ways of failing when you have 5 trials, with 3 (hoped for) successes and 2 (predicted) failures. Let the trials be denoted by letters of the alphabet thus:- Trial A, and Trial B, and Trial C, and Trial D, and Trial E.

In that case, your two failures can be as follows:-

Trials A and B

Trials A and C

Trials A and D

Trials A and E

Trials B and C

Trials B and D

Trials B and E

Trials C and D

Trials C and E

Trials D and E

The above list of ten possible ways of failing represents the number of possible combinations of 5 things taken 2 at a time, which is expressed algebraically as $^{5}C_{2}$

Notice that in the term $^{5}C_{2}$ the 5 is superscript and the 2 is subscript. This is a convention of notation commonly used in these types of problems.

In the formula $^{5}C_{3} \times 0.83333^{2} \times 0.16666^{3}$ the “reason” for the term $^{5}C_{3}$ is that you have to multiply your initial result by the number of possible “ways” of “failing”. The number of possible ways of placing 2 failures among 5 trials ($^{5}C_{2}$) is the same as the number of possible ways of placing 3 successes among 5 trials ($^{5}C_{3}$); both are equal to 10.
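If no pocket calculator is to hand, the nCr key can be imitated in Python with the standard function `math.comb` (available from Python 3.8). The sketch below also evaluates the usual factorial expression n! ÷ (r! x (n – r)!), which is not derived in this tutorial but gives the same count; the variable names are my own.

```python
import math

n, r = 5, 3

ways_to_place_successes = math.comb(n, r)       # 5 items taken 3 at a time
ways_to_place_failures = math.comb(n, n - r)    # 5 items taken 2 at a time

# The standard factorial expression for the same quantity.
by_factorials = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(ways_to_place_successes, ways_to_place_failures, by_factorials)   # 10 10 10
```

All three results agree, which is why it makes no difference whether you count the ways of succeeding or the ways of failing.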

This formula can be set out in algebraic format with the standard notation, as follows:-

The number of trials = n

The number of successful trials = r

The probability that one single trial will be successful = p

The probability that one single trial will be unsuccessful = q

The four algebraic symbols used in this type of question are n, r, p, and q, which are always written in lower case.

The probability of getting r successful trials in a total of n trials, when the probability of success for one single trial = p and the probability of failure for one single trial = q is calculated with the following formula:-

$^{n}C_{r} \times q^{n-r} \times p^{r}$

This formula can be applied however many trials there are, however many successes are required, and whatever the probability of success for a single trial, provided that the trials are independent of one another and that p is the same for every trial. Many situations in probability theory can be viewed as a number of trials, of which some will be successful trials, and some unsuccessful trials.
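The general formula translates almost word for word into code. The following is a minimal sketch rather than a definitive implementation; the function name `binomial_probability` is my own, and the trials are assumed to be independent with the same p for every trial.

```python
import math

def binomial_probability(n, r, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p                                   # probability of failure in one trial
    return math.comb(n, r) * q ** (n - r) * p ** r

# The dice example: 3 successes in 5 throws, with p = 1/6.
print(binomial_probability(5, 3, 1 / 6))        # roughly 0.03215

# Sanity check: the probabilities of 0, 1, 2, 3, 4 and 5 successes
# cover every possibility, so they should add up to 1.
print(sum(binomial_probability(5, r, 1 / 6) for r in range(6)))
```

The second print statement is a useful habit: if the probabilities for every possible number of successes do not add up to 1 (apart from tiny rounding errors), something has gone wrong.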

EXAMPLE:- It is possible to create a (fair) 14 sided dice. If you throw this dice 8 times, what is the probability that you will throw (for instance) the number ELEVEN exactly 5 times? (ie:- throwing the number ELEVEN is considered as a “success”, and throwing any other number is considered as a “failure”).

n = 8, and r = 5, and p = 1/14 = 0.07143 and q = 1 – 1/14 = 1 – 0.07143 = 0.92857

In that case, using the standard formula, the calculation is as follows:-

$^{8}C_{5} \times 0.92857^{3} \times 0.07143^{5}$ = 0.000083375

In that case, the statistical odds against getting exactly 5 successes in 8 throws of a 14 sided dice are 1 chance in (1 ÷ 0.000083375) = 1 chance in 11,994

In the preceding question, we asked – What is the probability that (with the 14 sided dice) you will throw the number ELEVEN exactly 5 times? However, a variant of this question is as follows:-

What is the probability that you will throw the number ELEVEN at least 5 times? In other words, you need to calculate the probability of throwing the number ELEVEN exactly 5 times PLUS the probability of throwing the number ELEVEN exactly 6 times PLUS the probability of throwing the number ELEVEN exactly 7 times PLUS the probability of throwing the number ELEVEN exactly 8 times.

You have to calculate these four probabilities separately, and then add the four results together in the following manner:-

$^{8}C_{5} \times 0.92857^{3} \times 0.07143^{5}$ = 0.000083375

+ $^{8}C_{6} \times 0.92857^{2} \times 0.07143^{6}$ = 0.0000032068

+ $^{8}C_{7} \times 0.92857^{1} \times 0.07143^{7}$ = 0.000000070481

+ $^{8}C_{8} \times 0.92857^{0} \times 0.07143^{8}$ = 0.00000000068

The SUM of these four probabilities = 0.00008665

Incidentally, $^{8}C_{8}$ = 1, and in general, $^{n}C_{n}$ always = 1

And $0.92857^{0}$ = 1, and in general ANY non-zero number raised to the power of zero is equal to 1.

Many probability theory problems require the probability of getting a “success” AT LEAST so many times, rather than EXACTLY so many times.
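Adding up several “exactly r” probabilities by hand is tedious, so here is a hedged Python sketch of the “at least” calculation for the 14 sided dice. The function names `binomial_probability` and `at_least` are my own, and the trials are assumed to be independent.

```python
import math

def binomial_probability(n, r, p):
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * (1 - p) ** (n - r) * p ** r

def at_least(n, r_min, p):
    """Probability of r_min or more successes: add up the 'exactly r' cases."""
    return sum(binomial_probability(n, r, p) for r in range(r_min, n + 1))

n, p = 8, 1 / 14
for r in range(5, n + 1):
    print(r, binomial_probability(n, r, p))     # the four separate probabilities

print(at_least(n, 5, p))                        # roughly 0.0000866

# Cross-check: "at least 5" is the same as "not 4 or fewer".
print(1 - sum(binomial_probability(n, r, p) for r in range(5)))
```

Because the code uses 1/14 without rounding it to 0.07143, its last digits may differ slightly from the figures worked out by hand.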

The above is an expanded version of pages 278 to 286 of – A Concise Course in Advanced Level Statistics by Crawshaw and Chambers, fourth edition, 2001, published by Nelson Thornes Ltd. I believe that my version offers greater clarity to the student than Crawshaw and Chambers’ version, although my version discusses the topic at far greater length.

THE SAMPLE SPACE.

If you throw a dice three times and it comes up (for example) SIX three times in a row, you can calculate the improbability of this event in various ways, depending on which dice throws you choose to take into account. You can choose as your SAMPLE SPACE just these three dice throws. Alternatively you can factor in all the dice throws in that room that night, thereby choosing a larger SAMPLE SPACE. Alternatively you can factor in only the dice throws made that night WITH THAT SAME DICE (on the basis that other dice have also been thrown that night), which restricts your SAMPLE SPACE somewhat. Alternatively you can choose a much larger SAMPLE SPACE: all the dice throws that night in this room and in the adjoining room.

In each case, with each different choice of SAMPLE SPACE, you will get a different result for your probability calculations. The “tighter” (or narrower) your SAMPLE SPACE, the longer will be your calculated odds against getting SIX three times in a row. The larger your SAMPLE SPACE, the shorter will be your odds against getting SIX three times in a row. Getting three SIXES in three dice throws may seem vastly improbable; but if you factor in the dice throws for the whole night, it seems less improbable.

There are often differences of opinion as to the validity of calculated odds (against chance occurrence) for a particular series of events. These differences of opinion very often concern the size of the SAMPLE SPACE. If the calculated odds are very long, it might be argued that the SAMPLE SPACE was unreasonably small, and that a larger SAMPLE SPACE should have been adopted. The rule for choosing the size of the SAMPLE SPACE is to restrict it as far as is reasonable, but not to restrict it so far that an “unfair” or misleading result is obtained.

Now here are a few more worked examples.

QUESTION:- A chessboard contains 64 squares. If you throw a dart blindfold at a chessboard, what is the probability of hitting the white king’s square?

ANSWER:- There is just one “white king’s square”, and there are 64 squares on the chessboard. These 64 squares represent the (so called) “possibility space”. The “POSSIBILITY SPACE” means the sum total of all the possible outcomes. For instance, the possibility space for a dice is 6, the possibility space for a chessboard is 64, and the possibility space for a coin is 2. The general formula for calculating the probability of achieving a successful outcome is the number of possible successful outcomes divided by the possibility space. In this case (assuming that the dart always lands somewhere on the board, and is equally likely to land on any square), there is just one possible successful outcome (ie:- landing on the white king’s square), and the possibility space is 64. So the probability of the dart landing on the white king’s square is 1/64, or 0.015625, ie:- the statistical odds against hitting the white king’s square are 1/0.015625 = 1 chance in 64.

A further question, using the same chessboard:- What is the probability of hitting either of the two king’s squares? There are 2 possible successful outcomes, and the possibility space or total number of possible outcomes is 64. So the answer is 2/64 = 0.03125 or statistical odds against success of 1 chance in (1/0.03125) = 1 chance in 32.

A further question:- What is the probability of hitting one of the knight’s squares? (Note:- There are 4 knights in all, two white, and two black.) There are 4 possible successful outcomes, and the possibility space (or the number of possible outcomes) is 64. In that case, the probability of a successful outcome = 4/64 = 0.0625, or statistical odds against success of 1 chance in (1/0.0625) = 1 chance in 16.

A further question:- What is the probability of hitting the squares of the black king or his two knights? In this case, there are 3 possible successful outcomes, and the possibility space is 64. So the answer is 3/64 = 0.046875 and the statistical odds against success are 1 chance in (1/0.046875) = 1 chance in 21.333
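A calculated probability can also be tested by simulation. The sketch below throws a large number of imaginary darts at the chessboard and counts how often one of the knights’ squares is hit. It rests on several assumptions of my own: the knights stand on their usual starting squares (b1, g1, b8 and g8), every dart lands on the board, and every square is equally likely to be hit.

```python
import random

random.seed(1)   # fixed seed, so that the run can be repeated

squares = [file + rank for file in "abcdefgh" for rank in "12345678"]
knight_squares = {"b1", "g1", "b8", "g8"}   # assumed: the starting squares

throws = 200_000
hits = sum(random.choice(squares) in knight_squares for _ in range(throws))

exact = len(knight_squares) / len(squares)
estimate = hits / throws

print(exact)      # 0.0625
print(estimate)   # close to 0.0625; the exact figure depends on the seed
```

The more darts that are thrown, the closer the estimate tends to come to 4/64.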

QUESTION:- We have a dartboard with 20 numbered segments. You throw one dart blindfold. What is the probability of hitting a specific number, for instance the number six?

ANSWER:- 1/20 = 0.05

QUESTION:- You throw 1 dart blindfold at the dartboard. What is the probability that you will NOT hit a six?

ANSWER:- 19/20 = 0.95

QUESTION:- You throw 7 darts at the dartboard blindfold. What is the probability that you will hit (for instance) number six exactly 3 times?

ANSWER:- Now we use our formula, with the 4 variables, n, r, p, and q.

n = number of trials = 7.

r = the number of successful trials = 3.

p = 1/20 = 0.05

q = 19/20 = 0.95

Now we substitute these values into our formula (ie:- we replace algebraic symbols in our formula with numbers).

The algebraic version of our formula is as follows:-

$^{n}C_{r} \times q^{n-r} \times p^{r}$

With the substituted numerical values, our formula reads as follows:-

$^{7}C_{3} \times 0.95^{(7-3)} \times 0.05^{3}$

This is equal to 0.003563, or statistical odds against success of 1 chance in (1/0.003563) = 1 chance in 281.
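It can be instructive to see the answer for 3 hits alongside the answers for every other possible number of hits. The Python sketch below (variable names my own, darts assumed independent and equally likely to hit any of the 20 segments) works out the whole distribution for 7 darts.

```python
import math

n, p = 7, 1 / 20
q = 1 - p

# Probability of exactly r hits on the chosen number, for r = 0, 1, ..., 7.
distribution = [math.comb(n, r) * q ** (n - r) * p ** r for r in range(n + 1)]

for r, probability in enumerate(distribution):
    print(r, round(probability, 6))   # the line for r = 3 shows 0.003563

print(sum(distribution))              # all the cases together: 1, apart from rounding
```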

ANOTHER QUESTION:- Imagine that you have a large sheet of paper fixed to the wall. On this sheet of paper is drawn the representation of a large compass, with The Eight Cardinal Points of the Compass (ie:- North, Northeast, East, Southeast, South, Southwest, West, and Northwest) included. You throw 1 dart blindfold at this image. What is the probability that your dart will land no further than 5.7 degrees away from Due Southeast (135 degrees)? (ie:- no more than 5.7 degrees less than 135 degrees, and no more than 5.7 degrees more than 135 degrees; in other words, for the throw to be successful, the dart must fall in a segment between 129.3 degrees and 140.7 degrees.)

ANSWER:- the probability of a successful throw = p = (5.7 x 2) ÷ 360 = 11.4 ÷ 360 = 0.03167

(Note:- This is because there are 360 degrees in a circle; and also because there are 2 “success” options; the dart can land in a place anything up to 5.7 degrees LESS than 135 degrees, OR the dart can land in a place anything up to 5.7 degrees MORE than 135 degrees.)

ANOTHER QUESTION:- With the same image of a compass on the wall, you throw 1 dart. What is the probability that the dart will land no further than 5.7 degrees from one of The Eight Cardinal Points of The Compass? (ie:- no more than 5.7 degrees less than one of The Eight Cardinal Points of The Compass, and no more than 5.7 degrees more than one of The Eight Cardinal Points of The Compass.)

The probability of success = p = (5.7 x 2 x 8) ÷ 360 = 91.2 ÷ 360 = 0.25333

The reason for this is as follows:-

5.7 because the dart must land no more than 5.7 degrees from one of The Eight Cardinal Points of The Compass.

x 2 because there are 2 options – 5.7 degrees LESS or 5.7 degrees MORE than one of The Eight Cardinal Points of The Compass.

x 8 because there are 8 Cardinal Points of The Compass.

÷ 360 because there are 360 degrees in a circle.
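The two compass calculations can be wrapped in one small function. This is a sketch under stated assumptions: the name `arc_probability` is my own, the compass points are taken to be evenly spaced around the circle, and the dart is taken to be equally likely to land in any direction. The function refuses to calculate if the segments are wide enough to overlap, because the simple multiplication would then count some directions twice.

```python
def arc_probability(half_width_deg, number_of_points=1):
    """Chance of landing within half_width_deg of any one of the points."""
    spacing = 360 / number_of_points            # angle between neighbouring points
    if 2 * half_width_deg > spacing:
        raise ValueError("the segments overlap, so simple multiplication fails")
    return (half_width_deg * 2 * number_of_points) / 360

p_one = arc_probability(5.7)        # within 5.7 degrees of Due Southeast only
p_eight = arc_probability(5.7, 8)   # within 5.7 degrees of any of the eight points

print(round(p_one, 5))     # 0.03167
print(round(p_eight, 5))   # 0.25333
```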

NOW LET’S GO BACK TO OUR ORIGINAL QUESTION:-

QUESTION:- How improbable is it that 6 randomly chosen angles would all be no further than 7.9 degrees from one of the Eight Cardinal Points of The Compass purely by chance?

Number of trials = n = 6

Number of successful trials = r = 6

Probability of success for one single trial = p = (7.9 x 2 x 8) ÷ 360 = 0.35111

Probability of failure for one single trial = q = (1 – p) = 1 – 0.35111 = 0.64889

Our general formula is expressed as follows:-

$^{n}C_{r} \times q^{n-r} \times p^{r}$

Now we replace the algebraic symbols with numerical values.

$^{6}C_{6} \times 0.64889^{(6-6)} \times 0.35111^{6}$ = 1 x 1 x $0.35111^{6}$ = 0.0018735

Or statistical odds against success of 1 chance in (1/0.0018735) = statistical odds of approximately 1 chance in 534.
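For completeness, here is the same calculation as a short Python sketch. The variable names are my own, and the six angles are assumed to be independent of one another and equally likely to point in any direction.

```python
import math

half_width = 7.9    # degrees either side of a compass point
points = 8          # the eight compass points
n = r = 6           # six angles, all six required to be "successes"

p = (half_width * 2 * points) / 360
q = 1 - p

probability = math.comb(n, r) * q ** (n - r) * p ** r

print(round(p, 5))       # 0.35111
print(probability)       # roughly 0.00187
print(1 / probability)   # roughly 534
```

Because every trial must succeed, the first two terms of the formula are both equal to 1, and the whole expression reduces to p multiplied by itself six times.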

I hope that this tutorial has been instructive. Good luck with your calculations! All the best! Roger Elliott.