Section 2.2
Normal Distributions
One particularly important class of density curves is the Normal curves, which describe Normal
distributions.


All Normal curves are symmetric, single-peaked, and bell-shaped.
A specific Normal curve is described by giving its mean µ and standard deviation σ.
Two Normal curves, showing the mean µ and standard deviation σ.
We can locate σ by eye on a Normal curve. Imagine that you are skiing down a mountain that has the
shape of a Normal curve. At first, you descend at an ever-steeper angle as you go out from the peak:
Fortunately, before you find yourself going straight down, the slope begins to grow flatter rather than
steeper as you go out and down:
The points at which this change of curvature takes place are located at a distance σ on either side of the
mean µ.
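The change-of-curvature claim can be checked numerically. The sketch below is only an illustration: it assumes Python's standard-library statistics.NormalDist, borrows the N(64.5, 2.5) heights distribution that appears later in this section, and uses a hypothetical helper named curvature with an arbitrary step size h. It estimates the second derivative of the density just inside and just outside µ + σ.

```python
from statistics import NormalDist

def curvature(dist, x, h=1e-3):
    """Central-difference estimate of the density's second derivative at x."""
    return (dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)) / h**2

heights = NormalDist(mu=64.5, sigma=2.5)

# Negative curvature: the slope is still getting steeper (curve bends downward).
inside = curvature(heights, heights.mean + 0.9 * heights.stdev)
# Positive curvature: the slope has started to flatten (curve bends upward).
outside = curvature(heights, heights.mean + 1.1 * heights.stdev)

print(inside < 0 < outside)  # the sign flips somewhere between 0.9 and 1.1 sigma
```

By symmetry, the same sign change happens on the other side of the mean, near µ - σ.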
DEFINITION: Normal distribution and Normal curve
A Normal distribution is described by a Normal density curve. Any particular Normal distribution is
completely specified by two numbers: its mean µ and standard deviation σ.
- The mean of a Normal distribution is the center of the symmetric Normal curve.
- The standard deviation is the distance from the center to the change-of-curvature points
on either side.
- We abbreviate the Normal distribution with mean µ and standard deviation σ as N(µ,σ).
Normal distributions are good descriptions for some distributions of real data.
Normal distributions are good approximations of the results of many kinds of chance outcomes, like the
number of heads in many tosses of a fair coin.
Many statistical inference procedures are based on Normal distributions.
The 68-95-99.7 Rule
Although there are many Normal curves, they all have properties in common.
DEFINITION: The 68-95-99.7 Rule (“The Empirical Rule”)
In the Normal distribution with mean µ and standard deviation σ:
- Approximately 68% of the observations fall within σ of µ.
- Approximately 95% of the observations fall within 2σ of µ.
- Approximately 99.7% of the observations fall within 3σ of µ.
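The three percentages are rounded values of areas under the standard Normal curve. As a quick check, and assuming Python's standard-library statistics.NormalDist is available (Python 3.8 or later), the areas within 1, 2, and 3 standard deviations can be computed directly. The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The computed areas round to 68%, 95%, and 99.7%, which is why the rule says "approximately."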
Example
ITBS Vocabulary Scores (Using the 68-95-99.7 rule)
The distribution of Iowa Test of Basic Skills (ITBS) vocabulary scores for seventh-grade students in Gary,
Indiana, is close to Normal. Suppose that the distribution is exactly Normal with mean μ = 6.84 and
standard deviation σ = 1.55. (These are the mean and standard deviation of the 947 actual scores.)
(a) Sketch the Normal density curve for this distribution.
(b) What percent of ITBS vocabulary scores are less than 3.74?
(c) What percent of the scores are between 5.29 and 9.94?
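One way to approach parts (b) and (c) is to express each boundary as a number of standard deviations from the mean and then read the answer from the 68-95-99.7 rule. The sketch below follows that reasoning; the helper name sds_from_mean is hypothetical, and the percentages are the rule's rounded values, not exact areas.

```python
mu, sigma = 6.84, 1.55  # ITBS mean and standard deviation from the example

def sds_from_mean(x):
    """Number of standard deviations x lies from the mean (rounded to hide float noise)."""
    return round((x - mu) / sigma, 6)

# Part (b): 3.74 sits two standard deviations below the mean.
# 95% lies within 2 sigma, so 5% lies outside, split evenly between the two tails.
z_b = sds_from_mean(3.74)
pct_below = (100 - 95) / 2

# Part (c): 5.29 is one sigma below the mean and 9.94 is two sigma above it.
# Half of the 68% lies between mu - sigma and mu; half of the 95% lies between mu and mu + 2 sigma.
z_lo, z_hi = sds_from_mean(5.29), sds_from_mean(9.94)
pct_between = 68 / 2 + 95 / 2

print(z_b, pct_below)
print(z_lo, z_hi, pct_between)
```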
✓CHECK YOUR UNDERSTANDING
The distribution of heights of young women aged 18 to 24 is approximately N(64.5, 2.5).
1. Sketch a Normal density curve for the distribution of young women’s heights. Label the points one,
two, and three standard deviations from the mean.
2. What percent of young women have heights greater than 67 inches? Show your work.
3. What percent of young women have heights between 62 and 72 inches? Show your work.
The Standard Normal Distribution
All Normal distributions are the same if we measure in units of size σ from the mean µ as center.
DEFINITION: Standard Normal distribution
The standard Normal distribution is the Normal distribution with mean 0 and standard deviation 1.
If a variable x has any Normal distribution N(µ,σ) with mean µ and standard deviation σ, then the
standardized variable
z = (x - µ) / σ
has the standard Normal distribution, N(0,1).
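Standardizing means subtracting the mean and dividing by the standard deviation. A minimal sketch, with a hypothetical function name standardize, applied to the heights distribution N(64.5, 2.5) from this section:

```python
def standardize(x, mu, sigma):
    """Return the z-score of x: its distance from mu in units of sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# A height of 67 inches is one standard deviation above the mean of 64.5.
z = standardize(67, mu=64.5, sigma=2.5)
print(z)  # 1.0
```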
Because all Normal distributions are the same when we standardize, we can find areas under any Normal
curve from a single table.
DEFINITION: The Standard Normal Table
Table A is a table of areas under the standard Normal curve. The table entry for each value z is the area
under the curve to the left of z.
Example
Standard Normal Distribution (Finding area to the right)
What if we wanted to find the proportion of observations from the standard Normal distribution that are
greater than -1.78? To find the area to the right of z = -1.78, locate -1.7 in the left-hand column of Table
A, then locate the remaining digit 8 as .08 in the top row. The corresponding entry is 0.0375.
This is the area to the left of z = -1.78. To find the area to the right of z = -1.78, we use the fact that the
total area under the standard Normal density curve is 1. So the desired proportion is 1 - 0.0375 = 0.9625.
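The same lookup can be reproduced in software. This sketch assumes Python's standard-library statistics.NormalDist, whose cdf method returns the area to the left of a value, just as Table A does. It also shows a second route to the answer using the symmetry of the curve.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal

left = z.cdf(-1.78)        # area to the left of z = -1.78 (the Table A entry)
right = 1 - left           # total area is 1, so subtract to get the right side
by_symmetry = z.cdf(1.78)  # same area: right of -1.78 mirrors left of +1.78

print(round(left, 4), round(right, 4), round(by_symmetry, 4))
```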
Example
Catching some “z”s (Finding areas under the Standard Normal curve)
PROBLEM: Find the proportion of observations from the standard Normal distribution that are between
-1.25 and 0.81.
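An area between two z-values is the area to the left of the larger value minus the area to the left of the smaller one. A sketch of that subtraction, assuming statistics.NormalDist and a hypothetical helper named area_between:

```python
from statistics import NormalDist

def area_between(a, b):
    """Area under the standard Normal curve between z = a and z = b (a < b)."""
    z = NormalDist()
    return z.cdf(b) - z.cdf(a)

area = area_between(-1.25, 0.81)
print(round(area, 4))
```

With Table A the same subtraction uses the two table entries for z = 0.81 and z = -1.25.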
✓CHECK YOUR UNDERSTANDING
Use Table A to find the proportion of observations from a standard Normal distribution that fall in each of
the following regions. In each case, sketch a standard Normal curve and shade the area representing the
region.
1. z < 1.39
2. z > -2.15
3. -0.56 < z < 1.81
Use Table A to find the value z from the standard Normal distribution that satisfies each of the following
conditions. In each case, sketch a standard Normal curve with your value of z marked on the axis.
4. The 20th percentile
5. 45% of all observations are greater than z
Normal Distribution Calculations
How to Solve Problems Involving Normal Distributions
State: Express the problem in terms of the observed variable x.
Plan: Draw a picture of the distribution and shade the area of interest under the curve.
Do: Perform calculations.
- Standardize x to restate the problem in terms of a standard Normal variable z.
- Use Table A and the fact that the total area under the curve is 1 to find the required
area under the standard Normal curve.
Conclude: Write your conclusion in the context of the problem.
Example
Tiger on the Range (Normal calculations)
On the driving range, Tiger Woods practices his swing with a particular club by hitting many, many balls.
When Tiger hits his driver, the distance the ball travels follows a Normal distribution with mean 304 and
standard deviation 8 yards. What percent of Tiger’s drives travel at least 290 yards?
STATE:
PLAN:
DO:
CONCLUDE:
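The DO step for this example can be sketched in code: standardize the boundary value, find the area to its left, and subtract from 1 because the question asks for "at least." This assumes statistics.NormalDist in place of Table A; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 304, 8  # driver distance in yards, from the example

# Step 1: standardize the boundary value x = 290.
z = (290 - mu) / sigma  # -1.75

# Step 2: area to the left of z, then subtract from 1 for "at least 290".
left = NormalDist().cdf(z)
at_least_290 = 1 - left

print(f"z = {z}, proportion at least 290 yards = {at_least_290:.4f}")
```

The result, about 0.96, still needs to be stated in context for the CONCLUDE step: roughly 96% of the drives travel at least 290 yards.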
Example
Tiger on the Range (Continued) (More complicated calculations)
What percent of Tiger’s drives travel between 305 and 325 yards?
STATE:
PLAN:
DO:
CONCLUDE:
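For an interval, both endpoints are standardized and the two left-hand areas are subtracted. The sketch below wraps that in a hypothetical helper, normal_area, again assuming statistics.NormalDist. Because it does not round the z-scores to two decimal places, its answer can differ slightly in the third decimal from a Table A calculation.

```python
from statistics import NormalDist

def normal_area(lo, hi, mu, sigma):
    """Proportion of an N(mu, sigma) distribution that falls between lo and hi."""
    std = NormalDist()
    z_lo = (lo - mu) / sigma  # standardize the lower endpoint
    z_hi = (hi - mu) / sigma  # standardize the upper endpoint
    return std.cdf(z_hi) - std.cdf(z_lo)

p = normal_area(305, 325, mu=304, sigma=8)
print(f"{p:.4f}")
```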
Example
Cholesterol in Young Boys (Using Table A in reverse)
High levels of cholesterol in the blood increase the risk of heart disease. For 14-year-old boys, the
distribution of blood cholesterol is approximately Normal with mean of 170 milligrams of cholesterol per
deciliter of blood (mg/dl) and standard deviation of 30 mg/dl. What is the first quartile of the distribution
of blood cholesterol?
STATE:
PLAN:
DO:
CONCLUDE:
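Working in reverse means starting from an area, finding the z-value with that area to its left, and then "unstandardizing" with x = µ + zσ. A sketch for the first quartile (area 0.25 to the left), assuming statistics.NormalDist and its inv_cdf method in place of a reverse Table A lookup:

```python
from statistics import NormalDist

mu, sigma = 170, 30  # cholesterol in mg/dl, from the example

# z-value with 25% of the area to its left (Table A gives about -0.67).
z = NormalDist().inv_cdf(0.25)

# Unstandardize: convert the z-score back to the cholesterol scale.
q1 = mu + z * sigma

print(f"z = {z:.2f}, first quartile = {q1:.1f} mg/dl")
```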
From z-scores to areas, and vice versa
✓CHECK YOUR UNDERSTANDING
Follow the method shown in the examples to answer each of the following questions. Use your calculator
to check your answers.
1. Cholesterol levels above 240 mg/dl may require medical attention. What percent of 14-year-old boys
have more than 240 mg/dl of cholesterol?
2. People with cholesterol levels between 200 and 240 mg/dl are at considerable risk for heart disease.
What percent of 14-year-old boys have blood cholesterol between 200 and 240 mg/dl?
3. What distance would a ball have to travel to be at the 80th percentile of Tiger Woods’s drive lengths?
Assessing Normality
The Normal distributions provide good models for some distributions of real data. Many statistical
inference procedures are based on the assumption that the population is approximately Normally
distributed. Consequently, we need a strategy for assessing Normality.
- Plot the data.
Make a dotplot, stemplot, or histogram and see if the graph is approximately symmetric and bell-shaped.
- Check whether the data follow the 68-95-99.7 rule.
Count how many observations fall within one, two, and three standard deviations of the mean and
check to see if these percents are close to the 68%, 95%, and 99.7% targets for a Normal distribution.
Example
Unemployment in the States (Are the data close to Normal?)
Let’s start by examining data on unemployment rates in the 50 states in November 2009. Here are the
data arranged from lowest (North Dakota’s 4.1%) to highest (Michigan’s 14.7%).
4.1 4.5 5.0 6.3 6.3 6.4 6.4 6.6 6.7
6.7 6.7 6.9 7.0 7.0 7.2 7.4 7.4 7.4
7.8 8.0 8.0 8.2 8.2 8.4 8.5 8.5 8.6
8.7 8.8 8.9 9.1 9.2 9.5 9.6 9.6 9.7
10.2 10.3 10.5 10.6 10.6 10.8 10.9
11.1 11.5 12.3 12.3 12.3 12.7 14.7
PROBLEM: Check to see if the distribution is approximately Normal.
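The 68-95-99.7 check described earlier can be applied to these 50 rates as follows. This is a sketch, not the only way to do it: it uses the sample standard deviation from Python's statistics.stdev, and the helper name fraction_within is hypothetical.

```python
from statistics import mean, stdev

rates = [
    4.1, 4.5, 5.0, 6.3, 6.3, 6.4, 6.4, 6.6, 6.7,
    6.7, 6.7, 6.9, 7.0, 7.0, 7.2, 7.4, 7.4, 7.4,
    7.8, 8.0, 8.0, 8.2, 8.2, 8.4, 8.5, 8.5, 8.6,
    8.7, 8.8, 8.9, 9.1, 9.2, 9.5, 9.6, 9.6, 9.7,
    10.2, 10.3, 10.5, 10.6, 10.6, 10.8, 10.9,
    11.1, 11.5, 12.3, 12.3, 12.3, 12.7, 14.7,
]

def fraction_within(data, k):
    """Fraction of observations within k sample standard deviations of the mean."""
    center, spread = mean(data), stdev(data)
    inside = [x for x in data if abs(x - center) <= k * spread]
    return len(inside) / len(data)

fractions = {k: fraction_within(rates, k) for k in (1, 2, 3)}

print(f"mean = {mean(rates):.3f}, standard deviation = {stdev(rates):.3f}")
for k, frac in fractions.items():
    print(f"within {k} standard deviation(s): {frac:.0%}")
```

The three fractions are then compared with the 68%, 95%, and 99.7% targets, alongside a graph of the data.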
Normal Probability Plots
Most software packages can construct Normal probability plots. These plots are constructed by plotting
each observation in a data set against its corresponding percentile’s z-score.
Interpreting Normal Probability Plots
If the points on a Normal probability plot lie close to a straight line, the plot indicates that the data are
approximately Normal. Systematic deviations from a straight line indicate a non-Normal distribution. Outliers
appear as points that are far away from the overall pattern of the plot.
Example
Guinea Pig Survival (Assessing Normality)
The exercise from Chapter 1 introduced data on the survival times in days of 72 guinea pigs after they
were injected with infectious bacteria in a medical experiment.
43 45 53 56 56 57 58 66 67 73 74 79 80 80 81 81 81 82 83 83 84 88
89 91 91 92 92 97 99 99 100 100 101 102 102 102 103 104 107 108 109
113 114 118 121 123 126 128 137 138 139 144 145 147 156 162 174 178 179
184 191 198 211 214 243 249 329 380 403 511 522 598
PROBLEM: Determine whether these data are approximately Normally distributed.
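The coordinates of a Normal probability plot for these data can be computed as sketched below. Software packages differ in how they assign a percentile to each observation; placing the i-th of n sorted values at percentile (i - 0.5) / n is an assumption made here for illustration, and normal_plot_points is a hypothetical helper name. The sketch also compares the mean with the median as a quick indication of skewness.

```python
from statistics import NormalDist, mean, median

survival = [
    43, 45, 53, 56, 56, 57, 58, 66, 67, 73, 74, 79, 80, 80, 81, 81, 81, 82, 83, 83, 84, 88,
    89, 91, 91, 92, 92, 97, 99, 99, 100, 100, 101, 102, 102, 102, 103, 104, 107, 108, 109,
    113, 114, 118, 121, 123, 126, 128, 137, 138, 139, 144, 145, 147, 156, 162, 174, 178, 179,
    184, 191, 198, 211, 214, 243, 249, 329, 380, 403, 511, 522, 598,
]

def normal_plot_points(data):
    """Pair each sorted observation with the z-score of its percentile.

    Assumption: the i-th of n sorted values sits at percentile (i - 0.5) / n.
    """
    std = NormalDist()
    ordered = sorted(data)
    n = len(ordered)
    return [(std.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(ordered, start=1)]

points = normal_plot_points(survival)

# Each pair is (z-score, observation); plotting them gives the Normal probability plot.
for z, x in points[:3] + points[-3:]:
    print(f"z = {z:6.2f}, survival = {x} days")

print(f"mean = {mean(survival):.1f}, median = {median(survival)}")
```

The points can be passed to any plotting tool. A mean well above the median, together with points that bend away from a straight line at the high end, would suggest a right-skewed distribution.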