Normal Distributions Unit Plan
A random variable is called discrete if we can represent the
probabilities of its instances with a table. The distributions
we studied in the last section, Geometric and Poisson,
were still discrete, even though they were open ended,
because we could use tables to find their probabilities.
A random variable is called continuous if we cannot use
tables to represent its probabilities. In Chapter 5 and
beyond, the text discusses this kind of random variable.
The most famous and widespread kind of continuous r.v.’s
are the Normal Distributions.
A random variable x is called normal if and only if:
1.) Its instances can be any real number.
[This means we can’t represent a normal distribution with
a table, even an open-ended one! Therefore, normal
random variables are continuous.]
2.) Instead, the probabilities in the normal distribution are
represented by areas under a curve whose total area is 1.
[Remember the frequency polygons in Chapter 2? This is
similar.]
The normal distribution has a very special property worth
noting: mean = median = mode (this reflects the fact that
the distribution is symmetric about a single central peak).
This shape is generated with a special equation. When
graphed, it looks like a bell. Hence another famous name
for the normal distribution: The “Bell Curve.”
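For reference, the equation in question is usually written as
the density below, where $\mu$ is the mean and $\sigma$ is the
standard deviation of the particular normal distribution (both
come up again in Step 1). The table-based method in these
notes never requires evaluating it by hand.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
```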
Finding Probabilities on a Normal Distribution: A two-step
process
Step 1: Converting to the Standard Normal
Just like the other random variables we have studied, the
normal distribution is not a single distribution, but rather a
family of similar distributions (see p. 217). Just like Poisson
Distributions, Normals are parameterized by the mean $\mu$.
Normals are parameterized by the standard deviation $\sigma$
as well. Both values are given to us ahead of time.
Probabilities on the normal distribution can only be found
directly through integral calculus, but fortunately there are
many resources that give us a table of approximate values
[I will be printing out a table for you to use. There is one in
the book, but mine is slightly more user-friendly.]
But since Normals are a family of distributions, how can a
single table give us these values? Fortunately, we know
that the z-scores of any normal distribution are also
normally distributed with a mean of 0 and a standard
deviation of 1. This specific normal distribution is called the
Standard Normal Distribution and this is the distribution
for which the values are given in the table.
Recall the formula for the z-score:
$$z_x = \frac{x - \mu}{\sigma}$$
Step 2: Using the table
The table in the book assumes we are looking up a “less
than” probability, so if we want a “greater than” or a
“between”, we will have to use a conversion. We’ve seen the
“greater than” conversion before, but the “between”
conversion is new. Here it is:
$$P(a < x < b) = P(x < b) - P(x < a)$$
Note: Since the normal distribution is continuous, the
probability of landing on any single exact value is 0, so
$P(x < a)$ and $P(x \le a)$ are equal. Therefore, the
distinction between “less than or equal to” and “strictly
less than” that was so important in Chapter 4 now matters
not at all (whew!)
To use the table, first note that the negative z-scores are
on one side and the positive z-scores are on the other.
Now, to find the probability, the whole number and the
tenths digit are located on the left side of the table. The
hundredths digit is located on the top of the table. To look
up the “less than” probability associated with a particular
z-score, go to the row with the proper whole number and
tenths digit, and then over to the column with the proper
hundredths digit.
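The row-and-column lookup just described can be sketched in
Python. The helper name `table_position` is made up for
illustration, and the sketch assumes the z-score is rounded to
two decimal places, as the printed table requires.

```python
def table_position(z):
    """Return the (row label, column label) where a z-score sits in the table."""
    hundredths = round(abs(z) * 100)   # z to 2 places, counted in hundredths
    row = (hundredths // 10) / 10      # whole number and tenths digit
    column = (hundredths % 10) / 100   # hundredths digit
    sign = "-" if z < 0 else ""        # negative z-scores are on the "negative" side
    return f"{sign}{row:.1f}", f"{column:.2f}"


for z in (-0.80, 0.75, 1.64):
    row, column = table_position(z)
    print(f"z = {z:+.2f}: row {row}, column {column}")
```

For a z-score of -0.80, this points at the row labeled -0.8 and
the first (0.00) column.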
Note: Traditionally, z-scores are only calculated to two
decimal places. If you need more accuracy, there are many
computer programs that do the direct integral of the
normal curve and can give you an arbitrary level of
precision.
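As one example of such a program, Python’s standard library
can evaluate the area under the standard normal curve without
a table. The sketch below uses `math.erf`; the function name
`standard_normal_cdf` is only an illustrative choice.

```python
import math


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z, i.e. P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Print six decimal places; a printed table usually stops at four.
for z in (-1.75, -0.80, -0.50, 0.75):
    print(f"P(z < {z:+.2f}) = {standard_normal_cdf(z):.6f}")
```

Rounded to four decimal places, these values should agree with
the corresponding table entries.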
p.229 Example 1: “Less thans”
$$P(x < 2) = P\left(z < \frac{2 - 2.4}{0.5}\right) = P(z < -0.80)$$
Now, use the table. Make sure you are on the “negative”
side. Go down to the row labeled -0.8. Since you are
looking up -0.80, you will want the probability in the first
column: 0.2119. What this means is the probability of a
random variable having a z-score of less than -0.80 in any
normal distribution is 0.2119. Therefore:
$$P(x < 2) = 0.2119$$
Now do the “Try it yourself” on the bottom of p.229, and
p.232 #1,2
p. 230 Example 2-2a: “Greater thans”
$$\begin{aligned}
P(x > 39) &= 1 - P(x < 39) \\
&= 1 - P\left(z < \frac{39 - 45}{12}\right) \\
&= 1 - P(z < -0.50) \\
&= 1 - 0.3085 \\
&= 0.6915
\end{aligned}$$
Now do p.232 #3,4
p.230 Example 2-1: “Betweens”
$$\begin{aligned}
P(24 < x < 54) &= P(x < 54) - P(x < 24) \\
&= P\left(z < \frac{54 - 45}{12}\right) - P\left(z < \frac{24 - 45}{12}\right) \\
&= P(z < 0.75) - P(z < -1.75) \\
&= 0.7734 - 0.0401 \\
&= 0.7333
\end{aligned}$$
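The three worked examples above can be double-checked with
Python’s `statistics.NormalDist`, which accepts the mean and
standard deviation directly, so no conversion to z-scores is
needed. This is only a sketch: the mean 2.4 and standard
deviation 0.5 for Example 1 are read off its z-score step, and
the variable names are illustrative.

```python
from statistics import NormalDist

# Example 1: mean 2.4 and standard deviation 0.5.
less_than = NormalDist(mu=2.4, sigma=0.5).cdf(2)

# Examples 2-2a and 2-1: mean 45 and standard deviation 12.
dist = NormalDist(mu=45, sigma=12)
greater_than = 1 - dist.cdf(39)        # P(x > 39) = 1 - P(x < 39)
between = dist.cdf(54) - dist.cdf(24)  # P(24 < x < 54) = P(x < 54) - P(x < 24)

print(f"P(x < 2)       = {less_than:.4f}")
print(f"P(x > 39)      = {greater_than:.4f}")
print(f"P(24 < x < 54) = {between:.4f}")
```

The printed values should agree with the table answers 0.2119,
0.6915 and 0.7333, because each z-score in these examples
works out to exactly two decimal places.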
Now do p.230 “Try it yourself” #2 and p.232 #5,6
HW: p.232-5 #7-30 (or #1-30 if accel.)
Finding Bounds on a Normal Distribution
In some problems, we are given the probability, and we
must find the bound (or bounds, in the case of a
“between”) that generated the probability. To do these
problems, we basically reverse the procedure in the
previous section, with some complications.
p.240 Example 4: Greater than
Since they’re only taking the top five percent, we must find
the bound such that only 5% of the curve is greater than
that bound. In other words:
$$P(x > a) = 0.05$$
Since this is a “greater than”, and the table assumes we’re
looking up “less thans”, we need to do the conversion:
$$P(x < a) = 0.95$$
Which makes sense: If 5% of the curve is greater than our
bound a, then 95% of the curve will be less than the
bound.
Go to the table and look up the probability that is closest to
0.95, and see what z-score it corresponds to. You will find
that the z-score of 1.64 yields P=0.9495 and the z-score of
1.65 yields P=0.9505. So, they are both 0.0005 away from
the value we want. In this case, we say these “tie”, and we
will use their mean, z=1.645.
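The reverse lookup can also be sketched with
`statistics.NormalDist`, whose `inv_cdf` method goes from a
“less than” probability back to a z-score. The variable names
here are illustrative.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

# The two neighbouring table entries on either side of 0.95.
for z in (1.64, 1.65):
    print(f"P(z < {z}) = {standard.cdf(z):.4f}")

# Going backwards: which z-score has 95% of the curve below it?
z_exact = standard.inv_cdf(0.95)
print(f"z for P = 0.95: {z_exact:.4f}")
```

The result should land between 1.64 and 1.65, very close to
the 1.645 obtained by averaging the two tied entries.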
Now, for any point on the x-axis, including our bound, we
know that
$$z = \frac{x - \mu}{\sigma}$$
so for our bound a we can re-write that equation as
$$z = \frac{a - \mu}{\sigma}$$
Since we already have a z, a $\mu$ and a $\sigma$, we can find
the bound. You might find it useful to re-write this equation
in terms of the bound, so that you don’t have to do the
algebra each time:
$$a = \mu + z\sigma$$
or, more generally:
$$\text{any bound} = \mu + z_{\text{bound}}\,\sigma$$
Remind you of Chebyshev’s Theorem?? It should!
So, now to solve Example 4:
$$a = 75 + 1.645(6.5)$$
$$a \approx 85.69$$
So to pass the Civil Service Test, you need to score at least
an 85.69.
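The same calculation can be sketched in Python. The helper
name `bound_from_z` is made up for illustration; the cross-check
uses `NormalDist.inv_cdf` on the original scale and skips the
table entirely.

```python
from statistics import NormalDist


def bound_from_z(mu, sigma, z):
    """Convert a z-score back to the original scale: bound = mu + z * sigma."""
    return mu + z * sigma


# Example 4 as worked above: mean 75, standard deviation 6.5, z = 1.645.
table_bound = bound_from_z(75, 6.5, 1.645)

# Cross-check: the score with 95% of the curve below it.
direct_bound = NormalDist(mu=75, sigma=6.5).inv_cdf(0.95)

print(f"table method:  {table_bound:.2f}")
print(f"direct method: {direct_bound:.2f}")
```

The two methods should agree to within about a hundredth of a
point, since 1.645 is itself a rounded z-score.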
Now try p.240 “Try it yourself” #4, p.241 Ex. 5 and “Try it
yourself” #5. These are “less thans” so they should be a bit
easier. For HW: p.244 #43-47
Example: p.244 #48: “Betweens”:
Let’s look at the middle question in this problem, “the
middle 40% receive a C.” What are the two grades
between which a student would receive a C? In other
words,
$$P(a < x < b) = 0.40$$
We know that since 40% of the curve is in between a and
b, 60% is outside of these bounds. If we’re able to assume
that a and b are symmetric about the mean (S.A.M),
which we can here, since it says the “middle” 40% get a C,
then by the symmetry of the normal curve, we know that
30% of the curve is above b, and 30% is below a.
Let’s examine the second part of that sentence: “30% of
the curve is below a.” It turns out, this will give us all we
need to solve the entire problem! We can write this as:
$$P(x < a) = 0.30$$
Now, by using the same procedure as in the previous
example, we can find the corresponding z-score to be
-0.52 (it is negative because a lies below the mean).
Applying the equation
$\text{any bound} = \mu + z_{\text{bound}}\,\sigma$,
we can now find:
$$a = 72 + (-0.52)(9)$$
$$a = 67.32$$
Ok, but how does that help us find b? Well, by symmetry,
we know that the z-score of b is simply the opposite of the
z-score of a. In other words, $z_b = 0.52$. Applying the
equation $\text{any bound} = \mu + z_{\text{bound}}\,\sigma$
again, we can now find:
$$b = 72 + 0.52(9)$$
$$b = 76.68$$
In other words, to get a “C” on this test, you have to score
between a 67.32 and a 76.68.
There is a formula shortcut we can use to get from a
“between” probability to a “less than” probability:
$$P(x < a) = \frac{1 - P(a < x < b)}{2}$$
Given, of course, that a and b are symmetric about the
mean.
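Putting the shortcut together with the bound equation, the
whole “between” procedure can be sketched in a few lines of
Python. The function name `middle_bounds` is made up for
illustration, and the sketch assumes the bounds are symmetric
about the mean.

```python
from statistics import NormalDist


def middle_bounds(mu, sigma, middle_prob):
    """Bounds a < b, symmetric about mu, with P(a < x < b) = middle_prob."""
    tail = (1 - middle_prob) / 2                # shortcut: P(x < a)
    z_a = NormalDist().inv_cdf(tail)            # z-score of the lower bound (negative)
    return mu + z_a * sigma, mu - z_a * sigma   # z_b is the opposite of z_a


# The "middle 40% receive a C" question: mean 72, standard deviation 9.
a, b = middle_bounds(mu=72, sigma=9, middle_prob=0.40)
print(f"C range: {a:.2f} to {b:.2f}")
```

Because this uses an unrounded z-score (about -0.524 rather
than the table’s -0.52), the bounds should differ slightly from
the 67.32 and 76.68 found with the table.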
HW: Normal distributions worksheets