Friday, September 29: Chapter 6: The Normal Model

Chapter 6: The Standard Deviation as a Ruler
Suppose that a professional soccer team has the money to sign one additional player and they are
considering adding either a goalie or a forward. The goalie has a 90% save percentage and the forward
averages 1.2 goals a game.
In this league, the average goalie saves 86% of shots with a standard deviation of 5% while the average
forward scores 0.9 goals per game with a standard deviation of 0.2. Who is the better player at his
position?
The goalie is 4% higher than the average and the forward is 0.3 goals higher than average. But, since
we are comparing different units, we cannot just say the goalie is better since 4 > 0.3. To make
comparisons possible, we consider where each player falls in their respective distributions.
The goalie is $\frac{90 - 86}{5} = 0.8$ standard deviations above the goalie mean.
The forward is $\frac{1.2 - 0.9}{0.2} = 1.5$ standard deviations above the forward mean.
When we are using the standard deviation as a ruler to measure how far an observation is above or
below the mean, we are using a ______________________________, or z-score.
$$z = \frac{y - \bar{y}}{s}$$
Using standardized scores has many advantages. Since z-scores have no units, we can compare values
that are measured on different scales, with different units, or for different populations.
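As a minimal sketch of the goalie and forward comparison above, the Python snippet below computes each player's z-score. The function name `z_score` is an illustrative choice, not something from the original notes.

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies above (+) or below (-) `mean`."""
    return (value - mean) / sd

# League figures quoted in the soccer example.
goalie_z = z_score(90, 86, 5)       # save percentage
forward_z = z_score(1.2, 0.9, 0.2)  # goals per game

print(f"goalie:  {goalie_z:.1f} standard deviations above the mean")
print(f"forward: {forward_z:.1f} standard deviations above the mean")
```

Because the z-scores have no units, the two results can be compared directly: the forward sits farther above the typical forward than the goalie sits above the typical goalie.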
Suppose that the distribution of male heights has a mean of 69 inches with a standard deviation of 3
inches and the distribution of female heights has a mean of 64 inches with a standard deviation of 2.5
inches.
Who is taller, relatively speaking, a 72 inch male or a 67 inch female?
An exclusive club only allows members who are especially tall. The height requirement for women is
72 inches. What would the equivalent requirement be for men?
HW #1 Read 102-105, Problems 6.5-6.8
Chapter 6: Shifting and Rescaling Data
Suppose that I took a random sample of 9 students and recorded their scores on a recent quiz. Their
scores are listed below:
30, 35, 36, 40, 40, 43, 45, 45, 50
Sketch a dotplot for this distribution and list all the summary statistics for this data set (mean, standard
deviation, median, quartiles, IQR, range).
[Axis for dotplot: 0 to 100, labeled every 20]
Now, suppose that I was feeling especially generous and added 5 points to each score. Sketch the new
distribution (same scale!) and recalculate the summary statistics. Which ones changed? Which did not?
Did the shape change?
[Axis for dotplot: 0 to 100, labeled every 20]
Now, go back to the original data and assume that the quiz was out of 50 points. To convert these scores
to percents, we could multiply each of them by 2 (since 2 x 50 = 100). Sketch the new distribution
(same scale!) and recalculate the summary statistics. Which ones changed? Which did not? Did the
shape change?
[Axis for dotplot: 0 to 100, labeled every 20]
A Manhattan taxi company studied the lengths of taxi rides and computed the following statistics
(in miles):
mean = 4.8, standard deviation = 4.2, median = 3.6, Q1 = 1.8, Q3 = 5.9, IQR = 4.1
If they converted the mileage measurements to km, what will the summary statistics be in km? (1 mile =
1.62 km)
If the cost of a taxi ride is $2.50 plus $3.20 per mile, what are the summary statistics for the distribution
of costs?
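One way to organize both taxi conversions is as a linear change of the form new = shift + scale × old. The sketch below is only an illustration under that assumption: the helper `transform_summary` and its parameter names are hypothetical, the scale is assumed to be positive, and the 1.62 conversion factor is the one stated in the problem.

```python
def transform_summary(stats, scale, shift=0.0):
    """Apply new = shift + scale * old to a dict of summary statistics.

    Measures of position (mean, median, quartiles) are shifted and scaled;
    measures of spread (standard deviation, IQR) are only scaled.
    Assumes scale > 0.
    """
    position = {"mean", "median", "Q1", "Q3"}
    return {
        name: (shift + scale * value) if name in position else scale * value
        for name, value in stats.items()
    }

# Summary statistics in miles, as given in the problem.
miles = {"mean": 4.8, "sd": 4.2, "median": 3.6, "Q1": 1.8, "Q3": 5.9, "IQR": 4.1}

km = transform_summary(miles, scale=1.62)                # factor as stated in the problem
cost = transform_summary(miles, scale=3.20, shift=2.50)  # $2.50 plus $3.20 per mile

for label, result in (("km", km), ("cost", cost)):
    print(label, {name: round(value, 2) for name, value in result.items()})
```

Adding the $2.50 moves the mean, median, and quartiles but leaves the standard deviation and IQR alone, since a constant added to every ride does not change how spread out the rides are.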
Note: When we convert measurements to z-scores, we are simply shifting and rescaling the data.
Subtracting the mean shifts the center (mean) of the distribution to 0. Dividing by the standard deviation
rescales the data by making the spread (standard deviation) equal to 1. Neither of these operations
changes the shape of the distribution.
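The note above can be checked numerically with the quiz scores from this section. The sketch below uses Python's standard `statistics` module; the helper `summarize` is an illustrative name, and quartiles are left out because textbooks and software do not all use the same quartile convention.

```python
from statistics import mean, median, stdev

scores = [30, 35, 36, 40, 40, 43, 45, 45, 50]

def summarize(data):
    """Return the mean, median, sample standard deviation, and range of `data`."""
    return {
        "mean": mean(data),
        "median": median(data),
        "sd": stdev(data),
        "range": max(data) - min(data),
    }

original = summarize(scores)
shifted = summarize([x + 5 for x in scores])   # add 5 points to every score
rescaled = summarize([x * 2 for x in scores])  # convert to percents

# Standardize: subtract the mean, then divide by the standard deviation.
z_scores = [(x - original["mean"]) / original["sd"] for x in scores]
standardized = summarize(z_scores)

for label, s in [("original", original), ("plus 5", shifted),
                 ("times 2", rescaled), ("z-scores", standardized)]:
    print(label, {k: round(v, 3) for k, v in s.items()})
```

Shifting changes the mean and median but not the standard deviation or range, rescaling multiplies all four, and the standardized scores end up with mean 0 and standard deviation 1 (up to rounding).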
HW #2 Read 105-107 Problems 6.1-6.4
Chapter 6: The Normal Model
When distributions are unimodal and symmetric, they often can be modeled by a normal curve.
Technically, the normal curve is a mathematical function that extends from negative infinity to positive
infinity. Thus, a normal curve will never model a data set perfectly and data sets should be described as
“approximately normal”, never “normal.”
When we use a mathematical function to describe a distribution of real data, we call the function a
_______. The normal model is centered at the mean μ with a standard deviation of σ: N(μ, σ)
When graphing a normal distribution, the inflection points are located at 1 standard deviation above and
below the mean. An inflection point is where a graph changes from concave up (happy face) to concave
down (frowny face), or vice versa.
We use Greek letters to denote that these values are not calculated from the data in a sample. They are
either calculated from the entire population or they are assumed values to help solve problems.
Numbers like these are called ____________________ (for example, μ and σ). In contrast,
summaries of data from samples are called __________________ and are represented with non-Greek
letters (such as ȳ and s).
When we standardize data that is approximately normal in shape, we calculate:
$$z = \frac{y - \mu}{\sigma}$$
If the distribution of y was approximately normal, then the distribution of z is also approximately normal
with μ = 0 and σ = 1. That is, z ~ N(0, 1). This distribution is called the
__________________________________.
Note: z-scores only follow a standard normal model when the original data was approximately normal!
The 68-95-99.7 Rule
When a distribution is approximately normal, then approximately 68% of the observations will be within
1 standard deviation of the mean, 95% of the observations will be within 2 standard deviations of the
mean, and 99.7% of the observations will be within 3 standard deviations of the mean.
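The three percentages in the rule are rounded values from the normal model itself. As a rough check, the sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `proportion_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.4f}")
```

The model gives roughly 68.3%, 95.4%, and 99.7%, which is why the rule is stated with the word "approximately."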
Suppose that the birth weights of newborn babies are approximately normal with a mean of 7.6 lb and a
standard deviation of 1.3 pounds.
o Sketch the normal curve with the correct scale.
o What percentage of babies will weigh less than 8.9 lbs?
o What percentage will weigh more than 8.9 lbs?
o What percentage will weigh more than 5 lbs?
o Between which two weights will the middle 95% of babies fall?
HW #3 Read 107-111 Problems 6.9-6.13, 6.17
Chapter 6: Finding Normal Percentiles
Using the baby weight example from yesterday, can we find what proportion of babies weigh less than 8
lb?
The z-score for a weight of 8 lb is:
This z-score is between 0 and 1, so we know that at least 50% of babies weigh less than 8 lb. We also know that at
most 84% weigh less than 8 lb (50% below the mean plus half of the middle 68%). Thus, between 50% and 84% of
babies weigh less than 8 lb.
To be more precise, we can use the standard normal table, which gives normal percentiles for z-scores
between -3.50 and 3.50.
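Software can stand in for the printed table. As a sketch, assuming the baby weight model N(7.6, 1.3) from the earlier example, Python's standard `statistics.NormalDist` gives the proportion below 8 lb directly; the variable names are illustrative.

```python
from statistics import NormalDist

weights = NormalDist(mu=7.6, sigma=1.3)  # birth weights in pounds

z = (8 - weights.mean) / weights.stdev   # z-score for a weight of 8 lb
below_8 = weights.cdf(8)                 # proportion weighing less than 8 lb

print(f"z = {z:.2f}, proportion below 8 lb = {below_8:.3f}")
```

The result is about 0.62, which falls inside the 50% to 84% range found with the 68-95-99.7 rule. A table lookup at the rounded z-score may differ slightly in the last decimal place.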

What proportion of babies weigh less than 7 lb?

What proportion weigh more than 7 lb?

What proportion weigh between 7 and 8 pounds?

What is the 90th percentile for weights? That is, 90% of babies will weigh below what value?

What is the first quartile for weights? That is, 25% of babies will weigh below what value?
The circumferences of oranges from a certain tree have an approximately normal distribution with a
mean of 20 cm and a standard deviation of 4 cm.
 Draw and label the normal model

What proportion of oranges have circumferences less than 18 cm?

What proportion have circumferences greater than 25 cm?

What proportion have circumferences between 17 cm and 19 cm?

What is the interquartile range for this distribution?
HW #4 Read 111-114 Problems 6.23, 25, 26
Chapter 6: Using the TI-83 for normal calculations
The number of M&M’s in a 1-pound bag has an approximately normal distribution with a mean of 550
and a standard deviation of 12.

Draw and label the normal model

What proportion of bags will have fewer than 525 M&M’s?

What proportion will have fewer than 600 M&M’s?

What proportion will have more than 565 M&M’s?

What proportion will have between 530 and 570 M&M’s?

What is the 30th percentile for this distribution?

What is the interquartile range for this distribution?

If the company wants to guarantee that no more than 5% of bags have fewer than 540 M&M’s,
what mean should they try to achieve?

If the mean must stay at 550, but they still want to have no more than 5% of bags have fewer
than 540 M&M’s, what standard deviation should they try to achieve?
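The TI-83 commands normalcdf(lower, upper, μ, σ) and invNorm(area, μ, σ) have close analogs in Python's standard library. The sketch below is one possible translation, not the calculator's own implementation: the function names `normalcdf` and `inv_norm` are hypothetical wrappers around `statistics.NormalDist`, and `-math.inf` plays the role of the very large negative number (such as -1E99) usually typed on the calculator for "no lower bound."

```python
import math
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def inv_norm(area, mu=0.0, sigma=1.0):
    """Value with the given area (proportion) to its left under N(mu, sigma)."""
    return NormalDist(mu, sigma).inv_cdf(area)

mu, sigma = 550, 12  # M&M's per 1-pound bag

fewer_than_525 = normalcdf(-math.inf, 525, mu, sigma)
between_530_and_570 = normalcdf(530, 570, mu, sigma)
percentile_30 = inv_norm(0.30, mu, sigma)

# "No more than 5% below 540" means 540 sits at the 5th percentile,
# so its z-score is inv_norm(0.05), roughly -1.645.
z_05 = inv_norm(0.05)
required_mean = 540 - z_05 * sigma  # keep sigma = 12, solve 540 = mean + z * sigma
required_sd = (540 - mu) / z_05     # keep mean = 550, solve 540 = mean + z * sd

print(f"P(fewer than 525)  = {fewer_than_525:.4f}")
print(f"P(530 to 570)      = {between_530_and_570:.4f}")
print(f"30th percentile    = {percentile_30:.1f}")
print(f"required mean      = {required_mean:.1f}")
print(f"required SD        = {required_sd:.2f}")
```

The last two calculations both start from z = (y − μ)/σ and solve for the one unknown, μ in the first case and σ in the second.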
HW #5 Read 114-122 Problems 6.27-29, 31, 32, 44
Chapter 6: Review
HW #6 Problems 126-127: 33, 35, 37, 41-43
Chapter 6 Test