Normal Approximation to Binomial Distributions
Charlie Vollmer
Department of Statistics
Colorado State University
Fort Collins, CO
[email protected]
September 18, 2015
Abstract
This document is a supplement to class lectures for STAT 307-003, Fall, 2015. It
details how the Normal Distribution can approximate the Binomial Distribution as the
number of trials, n, gets large.
How large does n need to be?
How well does the Normal Distribution approximate a Binomial Distribution?
Let us find out....
1 Setup: Defining some terms
1.1 Expected Value
If we go to Wikipedia, the following is the very first sentence that we’ll see:
In probability theory, the expected value of a random variable is intuitively the long-run
average value of repetitions of the experiment it represents.
Great! It is simply what we expect to see most often if we did something over and over
and over again! And if we go down a few more sentences on the wikipedia page, we find
something even more useful:
The expected value is also known as the expectation, mathematical expectation, EV,
mean, or first moment.
Bam! Look at that fourth synonym: the mean! That is exactly what I would expect to
see most often if I did an experiment over and over and over lots of times!
Note: if a Random Variable is Binomially Distributed, its mean is: np.
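To see this in action, here is a minimal Python sketch (numpy is my assumption; the notes themselves contain no code) that simulates many Binomial draws and compares their long-run average to np:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 0.5                            # 100 tosses of a fair coin
draws = rng.binomial(n, p, size=100_000)   # many repetitions of the experiment

print("long-run average of heads:", draws.mean())  # approaches 50
print("theoretical mean, np:     ", n * p)         # exactly 50.0
```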
1.2 Standard Error, SE
If we go to Wikipedia, the following are the first two sentences that we’ll see:
The standard error (SE) is the standard deviation of the sampling distribution of a
statistic,[1] most commonly of the mean. The term may also be used to refer to an estimate
of that standard deviation, derived from a particular sample used to compute the estimate.
Ok, this is wordy but it’s actually very accurate and descriptive. It’s just saying that
the standard error, SE, is the standard deviation of our statistic.
So... if our statistic is:
$$S_n = \sum_{i=1}^{n} X_i$$
where $X_i$ is simply a 1 or 0, in the case of a coin toss (heads or tails), then the SE of this statistic is its standard deviation.
Now, we know -from class- that this statistic, $S_n$, is a Binomially Distributed Random Variable (it follows a Binomial Distribution). In the case of a binomial, we (humans... and now you, too!) know that the variance of a Binomially Distributed Random Variable is simply:
$$npq,$$
where $p$ is the probability of heads and $q = 1 - p$.
Do you remember how to find the standard deviation from the variance? You take the square root: $\sqrt{npq}$. And if you need the SE, it’s just the standard deviation.
So, now we know how to get the variance of a binomial, which means we have the standard deviation or -in other words- the standard error, SE!
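As a quick sanity check, a small Python sketch (again assuming numpy) comparing the empirical standard deviation of simulated $S_n$ values to $\sqrt{npq}$:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 0.5
q = 1 - p

sims = rng.binomial(n, p, size=100_000)   # many realizations of S_n

print("empirical standard deviation:", sims.std())         # roughly 5
print("sqrt(npq):                   ", np.sqrt(n * p * q))  # exactly 5.0
```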
2 The Approximation: Toss a coin 100 times
The object of this section is to illustrate how the histogram of the number of heads from a coin-toss experiment becomes well approximated by a Normal Distribution as the number of tosses, $n$, gets “large.”
For instance, say that we toss a fair coin 100 times and see how many times that we get
heads. We could do this and get 88 heads. We could also do it and get 45 heads. Let’s say
we do the entire experiment (toss the coin 100 times) 50 times. Thus, we’ll get 50 different
numbers. Let’s see what that plot looks like:
[Figure: histogram of the number of heads in 50 experiments of 100 tosses each; x-axis: Number of Heads (40 to 65), y-axis: count]
It looks like on one experiment we got 40 heads and one time we got 41 heads. On
another experiment we got 61 heads. Yet again, on 6 experiments we got 48 heads, and 6
more experiments we got 59 heads.
You get the picture.
So, in this situation, we only did this experiment (toss a coin 100 times) 50 times. And
the plot above shows our results from those 50 experiments.
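A minimal Python sketch that reproduces this kind of plot (numpy and matplotlib are my assumptions, not tools named in the notes):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

n_tosses = 100    # tosses per experiment
n_repeats = 50    # how many times we run the whole experiment

# number of heads observed in each of the 50 experiments
heads = rng.binomial(n_tosses, 0.5, size=n_repeats)

plt.hist(heads, bins=range(heads.min(), heads.max() + 2))
plt.xlabel("Number of Heads")
plt.ylabel("count")
plt.show()
```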
What happens if we do this experiment 500 times? Or a thousand times? Or five thousand??
Let us see...
Below, we see what happens when we do this experiment 500, 1000, and 5000 times:
[Figure: density histograms of the number of heads in 100-toss experiments repeated 50 (fifty), 500 (five_hundred), 1000 (thousand), and 5000 (five_thousand) times; x-axis: Number of Heads, y-axis: density]
Whoa! We see that our histograms start to look like a bell curve!
Clearly, this is no coincidence!
This is because a Binomial Random Variable begins to look like a Normally Distributed
Random Variable as the number of trials, n, grows large!
Careful!! Take notice that we did NOT increase n yet, only the number of
times that we did the experiment!
So, now if we increase n, we would expect to see this bell-shaped-looking curve actually
start to look more and more like a Normal Distribution. As of now, you can notice that it
doesn’t quite look like a normal distribution, but rather just a similar-looking curve.
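For the curious, a sketch extending the snippet above to the four repetition counts shown in the panels (again assuming numpy and matplotlib):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

fig, axes = plt.subplots(2, 2)
for ax, n_repeats in zip(axes.flat, [50, 500, 1000, 5000]):
    heads = rng.binomial(100, 0.5, size=n_repeats)   # 100 tosses each time
    ax.hist(heads, bins=25, density=True)            # density scale, as in the panels
    ax.set_title(f"{n_repeats} experiments")
plt.tight_layout()
plt.show()
```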
3 The Approximation: Toss a coin 1000 times
Now, we do the same thing as above, but each experiment is tossing the coin 1000 times.
What do you think this does to the Expected Value? Ponder this question:
Is it easier to get all heads if I only toss the coin 10 times?
Would it be harder to get all heads if I tossed the coin 1000 times?
These questions have us think about the expected value and the standard error.
As we do more and more trials, do we expect the mean of our sample to get closer to the
“true mean” more often?
So, let’s do the experiment where we toss the coin 1,000 times. And let’s do this
experiment 50 times, as we did before. These are our results:
[Figure: histogram of the number of heads in 50 experiments of 1000 tosses each; x-axis: Number of Heads (460 to 520), y-axis: count]
And we see that it’s centered around 500 heads, as per our intuition of the outcome,
and goes from around 450 heads in some experiments to about 550 heads in others.
Does it look -upon quick glance- that it’s about the same as our first plot??
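The same kind of sketch produces this plot; only the number of tosses changes:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

heads = rng.binomial(1000, 0.5, size=50)   # 1000 tosses per experiment, 50 experiments

plt.hist(heads, bins=range(heads.min(), heads.max() + 5, 5))
plt.xlabel("Number of Heads")
plt.ylabel("count")
plt.show()
```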
4 Examine the difference between n = 100 and n = 1000:
As per the first section of these notes, we know what the variance of a Binomially Distributed Random Variable is: $npq$.
So, if we look at our two different situations, we see that our variances are:
$$\mathrm{Var}(S_n) = npq = 100 \times \tfrac{1}{2} \times \tfrac{1}{2} = 25$$
in our first context of n = 100, and we have:
$$\mathrm{Var}(S_n) = npq = 1000 \times \tfrac{1}{2} \times \tfrac{1}{2} = 250$$
in our second context of n = 1000.
Careful! What we care about is our standard error, SE!
We actually have that our standard errors are:
$$\sqrt{\mathrm{Var}(S_n)} = \sqrt{npq} = \sqrt{100 \times \tfrac{1}{2} \times \tfrac{1}{2}} = \sqrt{25} = 5$$
in our first context of n = 100, and we have:
$$\sqrt{\mathrm{Var}(S_n)} = \sqrt{npq} = \sqrt{1000 \times \tfrac{1}{2} \times \tfrac{1}{2}} = \sqrt{250} \approx 15.8$$
in our second context of n = 1000.
Take a second to examine this further... this is actually striking!
We know that most (95%) of our data will lie within 2 standard deviations (standard errors) of the mean in the context of a Normal Distribution. And here, that means in our first context that most will lie between 40 and 60 heads, while it will be between roughly 470 and 530 in the second.
However!...
An interval of length 20 is actually 20% of the possible values of the first context (since we
could get anywhere between 0 and 100 heads in 100 coin tosses) and an interval of length
60 is only 6% of the possible values in the second context (since we could get anywhere
between 0 and 1000 heads if we flip a coin 1000 times).
That means our distribution is MUCH tighter about the mean when we made 1000
tosses (as n got larger) than when we only made 100 tosses.
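To double-check these numbers, a tiny sketch in plain Python:

```python
import math

for n in (100, 1000):
    p = q = 0.5
    var = n * p * q
    se = math.sqrt(var)
    width = 4 * se   # mean +/- 2 SE spans 4 SE in total
    print(f"n={n}: Var={var:.0f}, SE={se:.2f}, "
          f"2-SE interval is {100 * width / n:.0f}% of the range 0..{n}")
```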
5 Visualize 1000 tosses:
Let’s see what it looks like when we do the 1000 toss experiment many times. Below is
for 50, 500, 1000, and 5000 experiments of 1000 tosses:
[Figure: density histograms of the number of heads in 1000-toss experiments repeated 50 (fifty), 500 (five_hundred), 1000 (thousand), and 5000 (five_thousand) times; x-axis: Number of Heads, y-axis: density]
The important thing to look at is the five thousand experiment plot in the lower-right
corner. If we compare this to the same plot in the previous 100-toss experiment, this should
look more similar to a Normal Distribution.
Let’s see as n gets even larger...
6 As n gets larger and larger:
We see what happens when n = 10,000 below:
[Figure: histogram; x-axis: Number of Heads (roughly 498,000 to 502,000), y-axis: count]
And again for n = 100,000:
[Figure: histogram; x-axis: Number of Heads (roughly 497,000 to 502,000), y-axis: count]
And this looks pretty Normal to me!
Note: In fact... we can check that this is extremely close to a Normal Curve.
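One way to check this (a sketch assuming scipy, which the notes do not mention) is to compare the exact Binomial pmf to the Normal density with matching mean and standard deviation:

```python
import numpy as np
from scipy import stats

n, p = 100_000, 0.5
mean = n * p
sd = np.sqrt(n * p * (1 - p))

# compare the exact Binomial pmf to the Normal density, point by point
k = np.arange(int(mean - 4 * sd), int(mean + 4 * sd) + 1)
binom_pmf = stats.binom.pmf(k, n, p)
normal_pdf = stats.norm.pdf(k, loc=mean, scale=sd)

print("largest pointwise difference:", np.abs(binom_pmf - normal_pdf).max())
```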
7 Is n = 1000 a good Approximation?
If we perform the n = 1,000 experiment many, many times, we can actually get a
good idea of how well it is approximated by a Normal Distribution. We plot the 1000-toss
experiment done 100,000 times below:
[Figure: density histogram of the number of heads in 100,000 repetitions of the 1000-toss experiment; x-axis: heads (450 to 550), y-axis: density]
What does this show us?
Well... if we have a random variable that follows a Binomial Distribution where $n$ is at least 1,000... we find that it is almost a Normal Distribution!
This is a very important discovery of ours!
Careful! Recall that a Normal Distribution is defined by two things: its
mean and variance.
If that’s all we need, the mean and variance... well, we’re golden! We have both of those things!
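As a closing sketch (numpy, matplotlib, and scipy assumed), we can overlay the Normal Distribution built from just those two things, the mean $np$ and variance $npq$, on the simulated 1000-toss data:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)

n, p = 1000, 0.5
heads = rng.binomial(n, p, size=100_000)   # the 1000-toss experiment, 100,000 times

mean = n * p                     # the mean...
sd = np.sqrt(n * p * (1 - p))    # ...and the standard deviation: all we need

x = np.linspace(mean - 4 * sd, mean + 4 * sd, 400)
plt.hist(heads, bins=60, density=True)
plt.plot(x, stats.norm.pdf(x, loc=mean, scale=sd))   # the matching Normal curve
plt.xlabel("heads")
plt.ylabel("density")
plt.show()
```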