Chapter 7: The Normal Distribution
7.1 Introduction
• Many continuous random variables have a mound- or bell-shaped distribution.
• The bell-shaped distribution of these random variables can be approximated by the
normal distribution.
• Check out Figure 7.1 on p. 257. We did histograms early on in the semester. Look at
what happens as the sample size increases when going from (a) to (b) to (c) to
(d).
• As the sample size increases toward ∞, the histogram bins can be made narrower and narrower.
• In Figure 7.1 (d) we end up with a smooth curve.
• In Figure 7.2 on p. 257 we have another look at symmetrical and skewed distributions.
• Just as our histograms in Chapter 3 can have symmetry and skewness, so can probability distributions for continuous random variables.
• Be able to recognize skewness based on shape and also by the relationship of the mean,
median, and mode.
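The mean/median/mode relationship can be checked on a small data set. This is only a sketch: the data values and the helper name `skew_direction` are made up for illustration, and comparing the mean with the median is a rule of thumb, not a formal test.

```python
from statistics import mean, median, mode

# Hypothetical data: a few large values stretch out the right tail.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

def skew_direction(values):
    """Guess the direction of skew from the mean/median ordering."""
    m, med = mean(values), median(values)
    if m > med:
        return "right-skewed"   # mean pulled toward the long right tail
    if m < med:
        return "left-skewed"    # mean pulled toward the long left tail
    return "symmetric"

print("mean:", mean(data), "median:", median(data), "mode:", mode(data))
print(skew_direction(data))
```

For right-skewed data like this we expect mode < median < mean; for symmetric data the three coincide.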
7.2 Properties of the Normal Distribution
• The mathematical equation for the normal distribution can be quite formidable. It is
given by
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
• Luckily, we will rely on a table (Table E in the front of the text, to be exact) to
calculate probabilities. Using the formula directly would require calculus and
numerical analysis.
• The area under the curve represents probabilities.
• Consider the various shapes of the normal distribution in Figure 7-4 p. 259. These are
very important. You should be able to reproduce these pictures.
• Recall that µ is a measure of central tendency. The normal distribution will be centered
over this parameter.
• Also recall that both σ² and σ are measures of variability (a.k.a. spread or dispersion).
• Notice how the shape flattens out as σ increases (Figure 7-4 (a) and (c)).
• Properties of the Normal distribution are summarized in the green box on p. 259.
These are extremely important. You need to know these for the exams.
• The normal distribution is symmetrical about its mean µ.
• 50% of the observations will fall below the mean and 50% will fall above the mean.
• This sounds like the definition of the median. In fact, because of the normal distribution’s symmetry, the mean, median, and mode are equal.
• The Normal distribution ranges from −∞ to +∞, although there is very little area
under the curve more than 3 standard deviations from the mean (that is, for z values
greater than 3.00 or smaller than −3.00).
• Recall the Empirical rule and notice the last point in the green table on p. 259.
Whereas the Empirical rule only requires mound or bell-shaped data, when we add
extra information about the specifics of the distribution we can improve upon our
estimates of where proportions of data lie. This will become evident in the next
section.
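For anyone who wants to see the numbers without Table E, here is a minimal sketch using Python's standard library. The function names `normal_pdf` and `area_within` are our own; `statistics.NormalDist` does the numerical work that the table does for us in class.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x, written out from the formula for f(x)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / math.sqrt(2 * math.pi * sigma ** 2)

def area_within(k, mu=0.0, sigma=1.0):
    """Area under the normal curve within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# The hand-written formula agrees with the library's density.
print(normal_pdf(1.3, 2.0, 0.5), NormalDist(2.0, 0.5).pdf(1.3))

# Areas within 1, 2, and 3 standard deviations (compare with the Empirical rule).
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The areas come out the same for any µ and σ, which is why a single table is enough.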
7.3 The Standard Normal Distribution
• Homework for this section: 1 - 49 odd. Turn-in 34, 38, and 50 - by the way, these
would be fantastic test questions.
• Mantra of the day: Exploit the symmetry of the normal distribution.
• There are an infinite number of normal distributions. Think about it... the mean µ
can be any number on the real line and the variance σ² need only be positive.
• A special name is given to a Normal distribution that has a mean of zero and a variance
(and standard deviation) of 1.
• This specific Normal distribution is called the Standard normal.
• IMPORTANT!!!! Any normal distribution can be transformed to a standard normal
distribution using
$$z = \frac{X - \mu}{\sigma}$$
• One of the most important skills for you to master is learning how to calculate probabilities of Normally distributed random variables using Table E in the text.
• There is a procedure table found on the bottom of p. 261. We worked similar examples
in class for each of the 7 different situations.
• The following table references the work we did in class with related text examples.

Situation   Example in Text   In-Class Example                              Solution
1           7-1 to 7-3        Area betw/ z = 0 and z = .82                  .2939
                              Area betw/ z = −.64 and z = 0                 .2389
2           7-4 and 7-5       Area to the right of z = 1.96                 .0250
                              Area to the left of z = −1.645                .0495
3           7-6 and 7-7       Area betw/ z = 1.00 and z = 2.00              .1359
                              Area betw/ z = −1.55 and z = −.55             .2306
4           7-8               Area betw/ z = −.75 and z = 1.25              .6678
5           7-9               Area to the left of z = 1.39                  .9177
6           7-10              Area to the right of z = −1.05                .8531
7           7-11              Area to the right of z = 2.05
                              and to the left of z = −1.94                  .0464
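The in-class areas can be double-checked in software. This is a rough sketch, not a replacement for learning Table E: the helpers `left`, `right`, and `between` are names we made up, and the computed areas can differ from the table answers in the last digit or so because the table rounds z to two decimals (for example, −1.645 is looked up as −1.65).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def left(z):
    """Area to the left of z."""
    return Z.cdf(z)

def right(z):
    """Area to the right of z."""
    return 1 - Z.cdf(z)

def between(a, b):
    """Area between a and b (a < b)."""
    return Z.cdf(b) - Z.cdf(a)

# (description, computed area, answer found in class with Table E)
checks = [
    ("betw/ 0 and .82",           between(0, 0.82),            0.2939),
    ("betw/ -.64 and 0",          between(-0.64, 0),           0.2389),
    ("right of 1.96",             right(1.96),                 0.0250),
    ("left of -1.645",            left(-1.645),                0.0495),
    ("betw/ 1.00 and 2.00",       between(1.00, 2.00),         0.1359),
    ("betw/ -1.55 and -.55",      between(-1.55, -0.55),       0.2306),
    ("betw/ -.75 and 1.25",       between(-0.75, 1.25),        0.6678),
    ("left of 1.39",              left(1.39),                  0.9177),
    ("right of -1.05",            right(-1.05),                0.8531),
    ("right of 2.05, left -1.94", right(2.05) + left(-1.94),   0.0464),
]

for label, computed, table_answer in checks:
    print(f"{label:28s} computed {computed:.4f}   table {table_answer:.4f}")
```

Notice how every situation reduces to `cdf` values plus symmetry, which is the same idea as the mantra of the day.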
• PUNCHLINE! Area under the probability distribution of continuous random variables
represents probabilities!
• The area to the right of z = 1.50 is just P (Z ≥ 1.50).
• The area between z = −1.20 and z = 1.38 is P (−1.20 ≤ Z ≤ 1.38).
• WORKING BACKWARDS: Using Table E from the inside-out. See Example 7-13.
This is also very important for you to be able to do.
• Exercise 7-46 was worked in class. It is similar to Example 7-13. The solutions were
(a) z = .10, (b) z = .58, and (c) z = 1.85.
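Working backwards is what software calls the inverse CDF. As a hedged sketch (the helper names are ours, and the areas fed in below are simply answers from the in-class table, used in reverse), `NormalDist.inv_cdf` takes an area to the left and returns the z value:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def z_for_left_area(p):
    """z value with area p to its left."""
    return Z.inv_cdf(p)

def z_for_right_area(p):
    """z value with area p to its right."""
    return Z.inv_cdf(1 - p)

def z_for_area_from_zero(p):
    """Positive z value with area p between 0 and z."""
    return Z.inv_cdf(0.5 + p)

print(round(z_for_area_from_zero(0.2939), 2))  # area betw/ 0 and z is .2939
print(round(z_for_right_area(0.0250), 2))      # area to the right of z is .0250
print(round(z_for_left_area(0.9177), 2))       # area to the left of z is .9177
```

Each call converts the given area into an area to the left first, which is exactly the picture-drawing step we do by hand before reading the table inside-out.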
7.4 Applications of the Normal Distribution
Introduction
• HW: 55 - 91 odd - turn-in 84, 89, 90, and 91 - these are wonderful test questions.
• Solutions to the Section 7.3 turn-in problems: 34) .3842; 38) .9177; 50) the 90th percentile is z = 1.28, the 80th is z = .84, the 50th is z = 0, and the 5th is z = −1.65.
• Mantra of the day: Exploit the standard normal distribution using z = (x − µ)/σ.
• Suppose X is a random variable that has a normal distribution with mean µ and
variance σ², then z = (x − µ)/σ has a normal distribution with µ = 0 and σ² = 1 (and of
course σ = 1.)
• We will use the Normal distribution and the transformation z = (x − µ)/σ to solve two
types of problems.
Case 1: Solving problems in terms of probability for a given data point. The data point comes
from a normal distribution with mean µ and variance σ².
• Examples in the text that relate to this case are 7-14 and 7-15.
• We will work Exercise 7.62 on p. 280. Remember to draw pictures!!
• Solution to part (a)
$$P(Y \ge 82) = P\left(Z \ge \frac{82 - 62}{12}\right) = P(Z \ge 1.67) = .0475.$$
• Solution to part (b)
$$P(Y \le 50) = P\left(Z \le \frac{50 - 62}{12}\right) = P(Z \le -1.00) = .1587.$$
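The Case 1 calculation can be sketched in a few lines. The mean 62 and standard deviation 12 are read off the worked solution; the variable names are our own. Expect small differences from the table answers because the table rounds z to two decimals (1.67 instead of 1.666...).

```python
from statistics import NormalDist

mu, sigma = 62, 12           # taken from the worked solution
Y = NormalDist(mu, sigma)    # distribution of the data
Z = NormalDist()             # standard normal

# Part (a): P(Y >= 82), computed directly and through the z transformation.
p_a_direct = 1 - Y.cdf(82)
z_a = (82 - mu) / sigma
p_a_via_z = 1 - Z.cdf(z_a)

# Part (b): P(Y <= 50).
z_b = (50 - mu) / sigma
p_b = Z.cdf(z_b)

print(round(z_a, 2), round(p_a_direct, 4), round(p_a_via_z, 4))
print(round(z_b, 2), round(p_b, 4))
```

Both routes for part (a) give the same probability, which is the whole point of standardizing.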
Omit Example 7-16
Case 2: Finding a data value given a specific probability. We will use the standard normal
table from the inside-out and exploit the relationship z = (x − µ)/σ, which yields x = µ + zσ.
• See examples 7-17 and 7-18 in the text.
• We will work exercise 7-70.
• Speed-reading scores are normally distributed with µ = 80 and σ = 8. If the top 15%
are selected, what is the cut-off score for reading speed?
• First find the z-score such that 15% of the area under the curve is to the right of it.
That is, for what z is P(Z ≥ z) = .1500? Using the table inside-out yields
z = 1.04.
• Now using the relationship x = µ + zσ we have that x = 80 + (1.04)(8). So the cut-off
score is 88.32.
• Exercise 7-72.
• Test scores are normally distributed with a mean µ = 100 and σ = 15. We want the
middle 50%. Draw a picture and find the relevant z scores. That is, between what two
z values will the middle 50% of the data lie?
• Because of symmetry, we find that the z values are −.67 and +.67.
• Now using the relationship x = µ + zσ we have that x = 100 + (−.67)(15) and
x = 100 + (+.67)(15). These two equations yield a lower bound of 89.95 and an upper
bound of 110.05.
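Case 2 is the same x = µ + zσ relationship with the z value coming from the inverse CDF. A minimal sketch follows; the function names `cutoff_for_top` and `middle_interval` are ours, and the results differ slightly from the hand answers because the table rounds z to two decimals (1.04 and .67).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def cutoff_for_top(p, mu, sigma):
    """Data value x with proportion p of the distribution above it."""
    z = Z.inv_cdf(1 - p)          # area to the left of the cut-off is 1 - p
    return mu + z * sigma

def middle_interval(p, mu, sigma):
    """Lower and upper data values that enclose the middle proportion p."""
    z = Z.inv_cdf(0.5 + p / 2)    # by symmetry the bounds are -z and +z
    return mu - z * sigma, mu + z * sigma

# Speed-reading scores: top 15%, mu = 80, sigma = 8.
print(round(cutoff_for_top(0.15, 80, 8), 2))

# Test scores: middle 50%, mu = 100, sigma = 15.
low, high = middle_interval(0.50, 100, 15)
print(round(low, 2), round(high, 2))
```

The interval comes out symmetric about the mean, just as the picture suggests.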
7.5 Central Limit Theorem
Homework: 93 to 117 odd. Numbers 112 and 116 would make great test problems.
Introduction:
• Suppose we collect 100 samples of size n.
• For each of the 100 samples calculate x̄, i.e. now we have 100 x̄'s given by x̄₁, x̄₂, . . . , x̄₁₀₀.
• Treat these 100 x̄’s as data points.
• The distribution of the x̄’s is called the sampling distribution of the means.
Note:
• x̄ is a point estimate for the true mean; (the true mean from where the original data
were sampled is µ).
• Sampling error is the difference between our estimate (x̄) and the truth (µ).
Properties of the Distribution of Sample Means (i.e., the x̄'s)
1. If our data come from a distribution with mean µ, then x̄ has a mean of µ, too.
2. If our data come from a distribution with standard deviation σ, then the standard
deviation of x̄ is given by σ/√n. Hence the standard deviation of x̄ depends on the
sample size!
Example - Suppose we sample the weights of 15 farm-raised catfish that come from a distribution with µ = 2.5 lbs and σ = .75 lbs. We can then conclude that x̄ has a distribution
with mean µx̄ = 2.5 and standard deviation given by σx̄ = .75/√15.
Special Note: σx̄ = σ/√n is also called the standard error of the mean. Be sure you
know this terminology!
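A quick simulation makes the standard error concrete. This is a sketch under assumptions: we pretend the catfish weights are normally distributed (the notes only give µ and σ), and the number of repeated samples (5000) and the seed are arbitrary choices of ours.

```python
import math
import random
from statistics import fmean, stdev

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

mu, sigma, n = 2.5, 0.75, 15       # catfish example from the notes
se = standard_error(sigma, n)

# Assumption: weights are drawn from a normal population.
rng = random.Random(0)             # fixed seed so the run is repeatable
sample_means = [
    fmean([rng.gauss(mu, sigma) for _ in range(n)])   # one x-bar per sample
    for _ in range(5000)
]

print("theoretical standard error:", round(se, 4))
print("mean of the x-bars:        ", round(fmean(sample_means), 4))
print("std. dev. of the x-bars:   ", round(stdev(sample_means), 4))
```

The x̄'s should center near µ, and their spread should land near σ/√n rather than near σ.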
The Central Limit Theorem (CLT)
This is one of the most important theorems in statistics. From this theorem we know that the
distribution of x̄ is approximately normal when the sample size n is large. If the data are
drawn from a distribution with mean µ and standard deviation σ then, as before, µx̄ = µ and
σx̄ = σ/√n.
Notes:
• If the data used to calculate x̄ come from a normal distribution then x̄ is exactly
normally distributed.
• If the data used to calculate x̄ do not come from a normal distribution, then as a rule
of thumb n ≥ 30 is required for x̄ to be approximately normally distributed.
• If x̄ is approximately normal with µx̄ = µ and σx̄ = σ/√n then
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
is a standard normal random variable.
Ignore the finite population correction factor (p. 292).
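To see the CLT at work on data that are clearly not normal, here is a small simulation sketch. Everything specific in it is our own choice for illustration: an exponential population with mean 1 and standard deviation 1, samples of size n = 30, 10000 repetitions, and a fixed seed.

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(1)            # fixed seed so the run is repeatable
mu, sigma, n = 1.0, 1.0, 30       # exponential(rate 1): mean 1, std. dev. 1
reps = 10000

def z_of_sample_mean():
    """Draw one sample of size n and standardize its mean."""
    xbar = fmean([rng.expovariate(1.0) for _ in range(n)])
    return (xbar - mu) / (sigma / math.sqrt(n))

zs = [z_of_sample_mean() for _ in range(reps)]

# Compare the simulated P(z <= 1.00) with the standard normal area.
empirical = sum(z <= 1.0 for z in zs) / reps
theoretical = NormalDist().cdf(1.0)
print(round(empirical, 3), round(theoretical, 3))
```

Even though the population is strongly skewed, the standardized x̄'s should behave roughly like a standard normal at n = 30.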
The following problems were worked in class.
Exercise 7-102.
We are given that µ = 21.0, σ = 2, and n = 15.
$$P(\bar{x} \ge 21.3) = P\left(z \ge \frac{21.3 - 21.0}{2/\sqrt{15}}\right) = P(z \ge .58) = .2810.$$
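The same calculation as a sketch in code, taking n = 15 from the denominator 2/√15 in the worked line. The function name `prob_mean_at_least` is ours, and the result differs slightly from the table answer because the table rounds z to .58.

```python
import math
from statistics import NormalDist

def prob_mean_at_least(x, mu, sigma, n):
    """Return (z, P(x-bar >= x)) using z = (x - mu) / (sigma / sqrt(n))."""
    z = (x - mu) / (sigma / math.sqrt(n))
    return z, 1 - NormalDist().cdf(z)

z, p = prob_mean_at_least(21.3, 21.0, 2, 15)
print(round(z, 2), round(p, 4))
```

Note that the standard error σ/√n, not σ itself, goes in the denominator whenever the question is about x̄.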
Exercise 7-106: Solution .9513
Exercise 7-110: Solution .1443