The Law of Averages and the Central Limit Theorem
Patrick Breheny
September 24
STA 580: Biostatistics I
Kerrich’s experiment
A South African mathematician named John Kerrich was
visiting Copenhagen in 1940 when Germany invaded Denmark
Kerrich spent the next five years in an internment camp
To pass the time, he carried out a series of experiments in
probability theory
One of them involved flipping a coin 10,000 times
The law of averages
What does the law of averages say will happen with Kerrich’s
coin-tossing experiment?
We all know that a coin lands heads with probability 50%
After many tosses, the law of averages says that the number
of heads should be about the same as the number of tails . . .
. . . or does it?
Kerrich’s results
Number of tosses   Number of heads   Heads − 0.5·Tosses
10                 4                 −1
100                44                −6
500                255               5
1,000              502               2
2,000              1,013             13
3,000              1,510             10
4,000              2,029             29
5,000              2,533             33
6,000              3,009             9
7,000              3,516             16
8,000              4,034             34
9,000              4,538             38
10,000             5,067             67
Kerrich's results plotted

[Figure: number of heads minus half the number of tosses (−50 to 50) plotted against the number of tosses (10 to 10,000)]
Where’s the law of averages?
Instead of the number of heads getting closer to the number
of tails, they seem to be getting farther apart
This is not a fluke of this particular experiment; as the number
of tosses goes up, the absolute size of the difference between
the number of heads and number of tails is also likely to go up
However, compared with the number of tosses, the difference
is becoming quite small
Chance error
Consider the following equation, where h equals the number
of heads and n equals the number of tosses:
h = (1/2) · n + chance error
As n goes up, chance error will tend to go up as well
However, it does not go up as fast as n does; i.e., if we
flipped our coin another 10,000 times, the chance error would
likely be bigger, but not twice as big
Chance error (cont’d)
Now consider dividing the previous equation by n:
h/n = 1/2 + (chance error)/n
The final term will go to zero as n gets larger and larger
This is what the law of averages says: as the number of tosses
goes up, the difference between the number of heads and
number of tails gets bigger, but the difference between the
percentage of heads and 50% gets smaller
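This behavior is easy to reproduce with a short simulation (a sketch in Python; the seed and toss counts are arbitrary choices, not Kerrich's data):

```python
import random

random.seed(1)  # arbitrary seed, for reproducibility

def flip_summary(n_tosses):
    """Toss a fair coin n_tosses times; return (heads - n/2, fraction of heads)."""
    heads = sum(random.randint(0, 1) for _ in range(n_tosses))
    return heads - n_tosses / 2, heads / n_tosses

diff_small, frac_small = flip_summary(100)
diff_large, frac_large = flip_summary(10_000)

# The absolute chance error tends to grow with n, but the
# fraction of heads settles in near 1/2
print(diff_small, frac_small)
print(diff_large, frac_large)
```

A typical run shows a larger absolute difference at 10,000 tosses than at 100, while the fraction of heads moves closer to 0.5.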
Kerrich's percentage of heads

[Figure: percentage of heads (40 to 60) plotted against the number of tosses (10 to 10,000)]
Repeating the experiment 50 times

[Figure: 50 simulated paths of the number of heads minus half the number of tosses (−100 to 100) against the number of tosses (10 to 10,000)]
Repeating the experiment 50 times (cont'd)

[Figure: 50 simulated paths of the percentage of heads (0 to 100) against the number of tosses (10 to 10,000)]
Convergence
From the coin-tossing experiment, we saw that as the sample
size increases, the average tends to settle in on an answer
Statisticians say that the average number of heads converges
to one half
The connection between tossing coins and sampling from a
population may not be obvious, but imagine drawing a large
random sample from the population and counting the number
of males vs. females
This is essentially the same random process
Population parameters
We said that researchers are usually interested in a numerical
quantity (parameter) of a population that they would like to
generalize their findings to, but that they can only sample a
fraction of that population
As that fraction gets larger and larger, the numerical quantity
that they are interested in will converge to the population
parameter
There are two main points here:
The population parameter is the value I would get if I could
take an infinitely large sample from the population
The accuracy of my sample statistic will tend to increase (i.e.,
it will tend to get closer to the population parameter) as my
sample size gets bigger
A note about wording
It is important to keep in mind that the population mean is a
different concept than a sample mean
The population mean is an unknown quantity that we would
like to measure
The sample mean is a statistic that describes one specific list
of numbers
So be careful to distinguish between the mean of a
population/distribution and the mean of a sample
The same goes for other statistics, like the population
standard deviation vs. the sample standard deviation
The mean and standard deviation of the normal distribution
Recall that when you standardize a variable, the resulting list
of numbers has mean 0 and standard deviation 1
Since histograms of these standardized variables seem to
match up quite well with the normal distribution, one would
think that the normal distribution also has mean 0 and
standard deviation 1
Indeed, one can show this to be true
The mean of the binomial distribution
When our random variable is an event that either occurs (1)
or doesn’t occur (0), the population mean is simply the
probability of the event
This matches up with our earlier definition of probability as
the long-run fraction of time that an event occurs
What about a binomial distribution?
If the average number of times an event occurs with each trial
is p, then the average number of times the event occurs in n
trials must be p + p + · · · + p = np
Therefore, the mean of the binomial distribution is np
The standard deviation of a binomial distribution
Because we know the probability of each value occurring, we
can derive the standard deviation of the binomial distribution
as well (although the actual calculations are a little too
lengthy for this class)
This standard deviation turns out to be √(np(1 − p))
This formula makes sense: as p gets close to 0 or 1, there is
less unpredictability in the outcome of the random process,
and the spread of the binomial distribution will be smaller
Conversely, when p ≈ 0.5, the unpredictability is at its
maximum, and so is the spread of the binomial distribution
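Both formulas are easy to check by simulation; a sketch, with n = 100 and p = 0.3 chosen arbitrarily for illustration:

```python
import math
import random

random.seed(2)  # arbitrary seed, for reproducibility

n, p = 100, 0.3  # illustrative values, not from the lecture

mean_formula = n * p                      # np = 30
sd_formula = math.sqrt(n * p * (1 - p))   # sqrt(np(1-p)), about 4.58

# Compare against 20,000 simulated binomial counts
draws = [sum(random.random() < p for _ in range(n)) for _ in range(20_000)]
mean_sim = sum(draws) / len(draws)
sd_sim = math.sqrt(sum((x - mean_sim) ** 2 for x in draws) / len(draws))

print(mean_formula, round(mean_sim, 1))
print(round(sd_formula, 2), round(sd_sim, 2))
```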
Sampling distributions
We now turn our attention from distributions of data to the
sampling distributions that we talked about earlier
Imagine we were to sample 10 people, measure their heights,
and take the average
The distribution we are talking about now is the distribution
of the average height of a sample of 10 people
The expected value and the standard error
When trying to summarize a histogram, we said that the two
main qualities we wanted to describe were the center
(measured by the average) and the spread (measured by the
standard deviation)
The same is true for sampling distributions
We will be interested in characterizing their center and spread,
only now we will call its center the expected value and its
spread the standard error
The expected value of the mean
To study expected values, let's continue with our hypothetical
experiment where we go out and measure the average height
of 10 people
In an actual experiment, we would only do this once, and our
population would be unknown
To illustrate the concepts of sampling distributions, let’s
instead repeat the experiment a million times, and sample
from a known population: the NHANES sample of adult
women
Histogram of our simulation

[Figure: histogram of the one million simulated sample means (60 to 68 inches)]
Expected value of the mean
So where is the center of this histogram?
The mean of the 1 million sample averages was 63.49 inches
Its center is exactly at the population mean: 63.49 inches
This is what statisticians mean when they say that the sample
average is an unbiased estimator: that its expected value is
equal to the population value
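Since the NHANES data themselves aren't reproduced here, the experiment can be sketched with an assumed stand-in population: normal with the quoted mean 63.49 and SD 2.75, and 100,000 repetitions rather than a million to keep the run quick:

```python
import random
import statistics

random.seed(3)  # arbitrary seed, for reproducibility

# Stand-in population (an assumption, not the actual NHANES sample)
POP_MEAN, POP_SD = 63.49, 2.75

# Repeat the "measure 10 people, take the average" experiment
sample_means = [
    statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(10))
    for _ in range(100_000)
]

# The average of the sample means sits right at the population mean:
# that is what "unbiased" means
grand_mean = statistics.fmean(sample_means)
print(round(grand_mean, 2))
```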
The standard error of the mean
How about the spread of the histogram?
Would we expect its spread to be the same as the spread of
the population?
No; as we saw with the coin flipping, the spread of the sample
average definitely went down as our sample size increased
As Sherlock Holmes says in The Sign of the Four: “While the
individual man is an insoluble puzzle, in the aggregate he
becomes a mathematical certainty. You can, for example,
never foretell what any one man will do, but you can say with
precision what an average number will be up to. Individuals
vary, but percentages remain constant. So says the
statistician.”
The standard error of the mean (cont’d)
Indeed, the standard error of the mean (the standard deviation
of the sample means) was 0.869 inches
This is much lower than the population standard deviation:
2.75 inches
Therefore, while an individual person would be expected to be
63.5 inches give or take 2.75 inches, the average of a group of
ten people would be expected to be 63.5 give or take only
0.87 inches
The square root law
The formula relating the standard error for a sample mean
and the standard deviation of the population is called a square
root law
It says that

    SE = SD/√n

For this example,

    SE = 2.749/√10 = 0.869
Note the connection with the law of averages: as n gets large,
SE goes to 0 and the average converges to the expected value
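The square root law can be checked against simulation (again using an assumed normal stand-in population with the quoted mean and SD):

```python
import math
import random
import statistics

random.seed(4)  # arbitrary seed, for reproducibility

POP_MEAN, POP_SD = 63.49, 2.75   # assumed stand-in population
n = 10

# Standard error predicted by the square root law: 2.75/sqrt(10) ~ 0.87
se_formula = POP_SD / math.sqrt(n)

# Standard error observed across 50,000 repeated samples of size n
sample_means = [
    statistics.fmean(random.gauss(POP_MEAN, POP_SD) for _ in range(n))
    for _ in range(50_000)
]
se_simulated = statistics.pstdev(sample_means)

print(round(se_formula, 2), round(se_simulated, 2))
```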
The expected value of the standard deviation
So we saw that the expected value of the mean was exactly
equal to the population mean
Is the same true for the sample standard deviation?
What about the root-mean-square?
We can repeat the experiment, recording the sample standard
deviations and the root-mean-squares of our 10-person
samples
Neither estimator is unbiased
The results:
Expected value of the sample standard deviation: 2.67 inches
Expected value of the root-mean-square: 2.53 inches
Neither one is unbiased!
However, the standard deviation is less biased than the
root-mean-square, and that is why people use it
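A sketch of this comparison, under two assumptions: the sample standard deviation is taken to be the usual divisor-(n − 1) version and the root-mean-square the divisor-n version, and the population is again an assumed normal stand-in:

```python
import math
import random

random.seed(5)  # arbitrary seed, for reproducibility

POP_SD = 2.75   # assumed stand-in population SD
n = 10

sds, rmss = [], []
for _ in range(50_000):
    xs = [random.gauss(63.49, POP_SD) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    sds.append(math.sqrt(ss / (n - 1)))   # sample SD (divisor n-1)
    rmss.append(math.sqrt(ss / n))        # root-mean-square (divisor n)

ev_sd = sum(sds) / len(sds)
ev_rms = sum(rmss) / len(rmss)

# Both fall below the true SD of 2.75, but the sample SD comes closer
print(round(ev_sd, 2), round(ev_rms, 2))
```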
Suppose we have an infinitely large urn with two types of balls
in it: half are numbered 0 and the other half numbered 1
What does this distribution look like?

[Figure: probability histogram — bars of height 0.5 at 0 and 1 (number on ball)]
What is this distribution?
What is its mean?

    np = 1(0.5) = 0.5

What is its standard deviation?

    √(np(1 − p)) = √(1(0.5)(0.5)) = 0.5
Now consider the sum of five balls from the urn:

[Figure: probability histogram of the sum of five balls (0 through 5)]
What is this distribution?
Instead of the sum, let's consider the average of those five
balls

We know the distribution of the sum, so we know the
distribution of its average:

[Figure: probability histogram of the average of five balls (0 to 1 in steps of 0.2)]
What is the expected value of the average?

    np/n = 5(0.5)/5 = 0.5

What is the standard error of the average?

    √(np(1 − p))/n = √(5(0.5)(0.5))/5 = 0.22
The sampling distribution of the average of 10 balls:

[Figure: probability histogram of the average of 10 balls (0 to 1 in steps of 0.1)]

What is its expected value?
What is its standard error?
The sampling distribution of the average of 100 balls:

[Figure: probability histogram of the average of 100 balls (0 to 1); the shape is very close to normal]

What is its expected value?
What is its standard error?
To recap: For the population as a whole, the variable had
mean 0.5, standard deviation 0.5, and a flat shape
The expected value, standard error, and shape of the sampling
distribution of the average of the variable were:
n      Expected value   Standard error   Shape
5      0.5              0.22             Kind of normal
10     0.5              0.16             More normal
100    0.5              0.05             Pretty darn normal
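These rows can be reproduced by sampling from the 0/1 urn directly; a sketch (20,000 repetitions per n, an arbitrary choice):

```python
import math
import random
import statistics

random.seed(6)  # arbitrary seed, for reproducibility

# For each n, draw 20,000 sample averages from the 0/1 urn and record
# their mean (expected value) and SD (standard error)
results = {}
for n in (5, 10, 100):
    means = [
        statistics.fmean(random.randint(0, 1) for _ in range(n))
        for _ in range(20_000)
    ]
    results[n] = (statistics.fmean(means), statistics.pstdev(means))
    # theoretical SE: 0.5/sqrt(n)
    print(n, round(results[n][0], 2), round(results[n][1], 2),
          round(0.5 / math.sqrt(n), 2))
```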
The central limit theorem
There are three very important phenomena going on here:
#1 The expected value of the sampling distribution is always equal
to the population average
#2 The standard error of the sampling distribution is always equal
to the population standard deviation divided by the square root
of n
#3 As n gets larger, the sampling distribution looks more and
more like the normal distribution
These three properties of the sampling distribution of the
sample average hold for any distribution
The central limit theorem (cont’d)
This result is called the central limit theorem, and it is one of
the most important, remarkable, and powerful results in all of
statistics
In the real world, we rarely know the distribution of our data
But the central limit theorem says: we don’t have to
The central limit theorem (cont’d)
Furthermore, as we have seen, knowing the mean and
standard deviation of a distribution that is approximately
normal allows us to calculate anything we wish to know with
tremendous accuracy – and the sampling distribution of the
mean is always approximately normal
The only caveats:
Observations must be independently drawn from the
population
The central limit theorem applies to the sampling distribution
of the mean – not necessarily to the sampling distribution of
other statistics
How large does n have to be before the distribution becomes
close enough in shape to the normal distribution?
Example #1
[Figure: two panels — the population density ("Population", x from 0 to 1) and the density of sample means for n = 10 (0.2 to 0.8)]
Example #2
[Figure: two panels — the population density ("Population", x from −6 to 6) and the density of sample means for n = 10 (−3 to 3)]
Rules of thumb
In the previous two examples, the sampling distribution was
very close to the normal distribution with samples of size 10
A widely used “rule of thumb” is to require n to be about 20
However, this depends entirely on the original distribution:
If the original distribution was close to normal, n = 2 might be
enough
If the original distribution is highly skewed or strange in some
other way, n = 50 might not be enough
A troublesome distribution
For example, imagine an urn containing the numbers 1, 2, and 9:
[Figure: sampling distributions of the mean for n = 20 (sample means 2 to 7) and n = 50 (sample means 2 to 6)]
A troublesome distribution (cont’d)
[Figure: sampling distribution of the mean for n = 100 (sample means 2.5 to 5.5)]
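A sketch of re-running this urn experiment; the expected value and standard error behave exactly as the theory predicts, even while the shape is slow to settle down:

```python
import random
import statistics

random.seed(7)  # arbitrary seed, for reproducibility

URN = [1, 2, 9]   # population mean 4, population SD about 3.56

results = {}
for n in (20, 50, 100):
    means = [
        statistics.fmean(random.choice(URN) for _ in range(n))
        for _ in range(20_000)
    ]
    results[n] = (statistics.fmean(means), statistics.pstdev(means))
    print(n, round(results[n][0], 2), round(results[n][1], 2))
```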
An example from the real world
Weight tends to be skewed to the right (far more people are
overweight than underweight)
As we did before with height, let’s perform an experiment in
which the NHANES sample of adult men is the population,
and we draw samples from it (i.e., we are re-sampling the
NHANES sample)
Results
[Figure: sampling distribution of the mean for n = 20 (sample means 160 to 260)]
Sampling distribution of serum cholesterol
According to the National Center for Health Statistics, the
distribution of serum cholesterol levels for 20- to 74-year-old
males living in the United States has mean 211 mg/dl and
standard deviation 46 mg/dl
We are planning to collect a sample of 25 individuals and
measure their cholesterol levels
What is the probability that our sample average will be above
230?
Solution
The first thing we would do is determine the expected value
and standard error of the sampling distribution
The expected value will be identical to the population mean,
211
The standard error will be smaller by a factor of √n:

    SE = SD/√n = 46/√25 = 9.2
Solution (cont’d)
Next, we would determine how many standard deviations away
from the mean 230 is:

    (230 − 211)/9.2 = 2.07
What is the probability that a normally distributed random
variable is more than 2.07 standard deviations above the
mean?
1 − 0.981 = 0.019, or about 1.9%
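These tail probabilities can be checked in a few lines, using the standard library's complementary error function for the normal upper tail:

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

MEAN, SD = 211, 46   # serum cholesterol, mg/dl

# Sample of 25: SE = 46/sqrt(25) = 9.2
p25 = normal_upper_tail((230 - MEAN) / (SD / math.sqrt(25)))
print(round(p25, 3))   # 0.019

# For comparison, a sample of 10: SE = 46/sqrt(10), about 14.5
p10 = normal_upper_tail((230 - MEAN) / (SD / math.sqrt(10)))
print(round(p10, 3))   # 0.096
```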
A different question
95% of our sample averages will fall between what two
numbers?
The first step: what two values of the normal distribution
contain 95% of the data?
The 2.5th percentile of the normal distribution is -1.96
Thus, a normally distributed random variable will lie within
1.96 standard deviations of its mean 95% of the time
Solution
Which numbers are 1.96 standard deviations away from the
expected value of the sampling distribution?
211 − 1.96(9.2) = 193.0
211 + 1.96(9.2) = 229.0
Therefore, 95% of our sample averages will fall between 193
mg/dl and 229 mg/dl
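A short function makes this interval easy to recompute for any sample size (`middle_95` is an illustrative name, not from the lecture):

```python
import math

MEAN, SD = 211, 46   # cholesterol mean and SD, mg/dl
Z95 = 1.96           # standard normal value leaving 2.5% in each tail

def middle_95(n):
    """Interval containing 95% of sample averages for samples of size n."""
    se = SD / math.sqrt(n)
    return round(MEAN - Z95 * se, 1), round(MEAN + Z95 * se, 1)

for n in (10, 25, 50):
    print(n, middle_95(n))
```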
A different sample size
What if we had only collected samples of size 10?
Now, the standard error is

    SE = 46/√10 = 14.5
Now what is the probability that our sample average will be
above 230?
Solution
Now 230 is only

    (230 − 211)/14.5 = 1.31

standard deviations away from the expected value
The probability of being more than 1.31 standard deviations
above the mean is 9.6%
This is almost 5 times higher than the 1.9% we calculated
earlier for the larger sample size
Solution #2
What about the values that would contain 95% of our sample
averages?
The values 1.96 standard errors away from the expected value
are now
211 − 1.96(14.5) = 182.5
211 + 1.96(14.5) = 239.5
Note how much wider this interval is than the interval
(193,229) for the larger sample size
Another example
What if we’d increased the sample size to 50?
Now the standard error is 6.5, and the values
211 − 1.96(6.5) = 198.2
211 + 1.96(6.5) = 223.8
contain 95% of the sample averages
Summary
n     SE     Interval          Width of interval
10    14.5   (182.5, 239.5)    57.0
25    9.2    (193.0, 229.0)    36.0
50    6.5    (198.2, 223.8)    25.6
The width of the interval is going down by what factor?
What sample size do we need?
Finally, we ask a slightly harder question: how large would the
sample size need to be in order to ensure a 95% probability
that the sample average will be within 5 mg/dl of the
population mean?
As we saw earlier, 95% of observations fall within 1.96
standard deviations of the mean
Thus, we need the standard error to satisfy

    1.96(SE) = 5
    SE = 5/1.96
Solution
The standard error is equal to the standard deviation over the
square root of n, so

    SD/√n = 5/1.96
    √n = SD · (1.96/5)
    n = 325.1
In the real world, we of course cannot sample 325.1 people, so
we would sample 326 to be safe
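The sample-size calculation can be wrapped in a small function (`required_n` is an illustrative name, not from the lecture); it also handles the 90%-within-10-mg/dl variant:

```python
import math

SD = 46   # population SD of cholesterol, mg/dl

def required_n(margin, z):
    """Smallest n with z * SD/sqrt(n) <= margin, plus the exact value."""
    exact = (z * SD / margin) ** 2
    return math.ceil(exact), exact

# Within 5 mg/dl with 95% probability (z = 1.96)
n95, exact95 = required_n(5, 1.96)
print(n95)   # 326

# Within 10 mg/dl with 90% probability (z = 1.645)
n90, exact90 = required_n(10, 1.645)
print(n90)   # 58
```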
One last question
How large would the sample size need to be in order to ensure
a 90% probability that the sample average will be within 10
mg/dl of the population mean?
There is a 90% probability that a normally distributed random
variable will fall within 1.645 standard deviations of the mean
Thus, we want 1.645(SE) = 10, so

    10/1.645 = 46/√n
    n = 57.3
Thus, we would sample 58 people