Standard Normal Calculations
What you’ll learn
 Properties of the standard normal distribution
 How to transform scores into normal distribution
scores
 Determine the proportion of observations
above, below and between two stated numbers
in a normal distribution.
 Calculate the point for a variable with a normal
distribution for which a stated proportion of
values lie either above or below.
 Comparing individuals from different
distributions
Standard Normal Distribution
 The Standard Normal Distribution
(also known as the “z-distribution”)
N( 0, 1)
[Figure: standard normal density curve, y = normalDensity(x), for x from -3 to 3]
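A few properties of N(0, 1) can be checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the slides.

```python
from statistics import NormalDist

# The standard normal distribution N(0, 1): mean 0, standard deviation 1.
z_dist = NormalDist(mu=0.0, sigma=1.0)

# Height of the density curve at the mean (the peak of the bell).
peak = z_dist.pdf(0.0)

# The curve is symmetric about 0: equal heights at -1.5 and +1.5.
symmetric = abs(z_dist.pdf(-1.5) - z_dist.pdf(1.5)) < 1e-12

# Almost all of the area lies between -3 and 3.
within_3 = z_dist.cdf(3.0) - z_dist.cdf(-3.0)

print(f"peak height  = {peak:.4f}")
print(f"symmetric    = {symmetric}")
print(f"area in ±3   = {within_3:.4f}")
```

The peak is a little under 0.4, which is why the plot's vertical axis only needs to run to 0.5.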
Standardizing Scores
 We find that all normal distributions are the
same if we measure in units of σ.
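 We standardize a value X by subtracting the mean and dividing by the
standard deviation: $z = \frac{X - \mu}{\sigma}$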
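One way to see that normal distributions agree once measured in units of σ is to compare the area below the point 1.5σ above the mean in two different normal distributions. This is a minimal sketch using `statistics.NormalDist`; the helper `z_score` and the two sets of parameters are assumptions chosen only for illustration.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mu) / sigma

# Two illustrative normal distributions (parameters chosen only for the demo).
distributions = [NormalDist(mu=100, sigma=15), NormalDist(mu=170, sigma=30)]

k = 1.5  # the point 1.5 standard deviations above each mean
areas = []
for dist in distributions:
    x = dist.mean + k * dist.stdev
    print(f"x = {x}, z = {z_score(x, dist.mean, dist.stdev)}")
    areas.append(dist.cdf(x))  # proportion of observations below x

# Area below z = 1.5 on the standard normal curve.
standard_area = NormalDist().cdf(k)
print(areas, standard_area)
```

Both areas match the standard normal area below z = 1.5, so a single table for N(0, 1) is enough.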
Using the Standard Normal
Distribution
 The level of cholesterol in the blood is important
because high cholesterol levels may increase the
risk of heart disease. We know that the
distribution of blood cholesterol levels in a large
population of people of the same age and sex is
roughly normal. For 14-year-old boys, the mean
is μ=170 mg/dl and the standard deviation,
σ = 30 mg/dl. Levels above 240 mg/dl may require
medical attention.
Steps to solving a “normal” distribution
problem.
 Step 1:
– Write the question as a probability statement.
 Step 2:
– Calculate a z-score
– Draw a picture and shade the region
 Step 3:
– Find the appropriate region using a standard normal
table
 Step 4:
– Write the answer in the context of the problem
Question with Area below
What percent of 14-year-old boys have less than
160 mg/dl of cholesterol?
 Step 1 (probability statement)
– P(X< 160)
 Step 2: (z-score)
$z = \frac{X - \mu}{\sigma} = \frac{160 - 170}{30} \approx -0.33$
[Figure: standard normal density curve, y = normalDensity(x), marked at z = -0.33]
 Since we want the percent of boys whose cholesterol is less than
160, we will find the percent of boys whose cholesterol is 0.33σ or
more below the mean.
Step 3: (Area from Table A)
We can now use Table A to find the percent of observations below -0.33. (Remember that Table A always gives the area under the
curve below a given value.)
   Z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-0.5  .3085  .3050  .3015  .2981  .2946  .2912  .2877  .2843  .2810  .2776
-0.4  .3446  .3409  .3372  .3336  .3300  .3264  .3228  .3192  .3156  .3121
-0.3  .3821  .3783  .3745  .3707  .3669  .3632  .3594  .3557  .3520  .3483
-0.2  .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
-0.1  .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
-0.0  .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
 0.0  .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
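A Table A lookup combines a row label (the z-score to one decimal) with a column label (the second decimal). The sketch below mimics that lookup by computing the area directly; `table_a_area` is a hypothetical helper, not something defined in the slides, and it assumes `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def table_a_area(row: str, column: str) -> float:
    """Area below z, where z is built from a row label such as "-0.3"
    and a column label such as ".03" (giving z = -0.33)."""
    # In negative rows (including "-0.0") the column digit moves z further below 0.
    sign = -1.0 if row.startswith("-") else 1.0
    z = float(row) + sign * float(column)
    return round(STANDARD_NORMAL.cdf(z), 4)

print(table_a_area("-0.3", ".03"))  # row -0.3, column .03
print(table_a_area("-0.0", ".01"))  # row -0.0, column .01
print(table_a_area("0.0", ".05"))   # row 0.0, column .05
```

The three printed areas correspond to the table entries .3707, .4960 and .5199.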
 Step 3 (cont.)
 The area under the curve (the proportion of
observations) below -0.33σ is .3707
 Step 4: (Context)
The percent of 14-year-old boys whose
cholesterol level is less than 160 mg/dl is
approximately 37.07%
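The four steps for the "area below" question can be sketched in Python. This assumes `statistics.NormalDist` as a stand-in for Table A; rounding z to two decimals imitates the table lookup, and the unrounded value is shown for comparison.

```python
from statistics import NormalDist

mu, sigma = 170, 30   # cholesterol of 14-year-old boys, in mg/dl
x = 160               # Step 1: P(X < 160)

# Step 2: z-score, rounded to two decimals as in a table lookup.
z = round((x - mu) / sigma, 2)

# Step 3: area below z on the standard normal curve.
area_below = NormalDist().cdf(z)

# Same probability without rounding z first (slightly different).
exact = NormalDist(mu, sigma).cdf(x)

# Step 4: answer in context.
print(f"z = {z}")
print(f"About {area_below:.2%} of boys are below {x} mg/dl (table-style).")
print(f"Without rounding z: {exact:.2%}")
```

The small gap between the two results comes only from rounding z to -0.33 before the lookup.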
Question with Area above
What percent of 14-year-old boys have more than
240 mg/dl of cholesterol?
Step 1 (probability statement)
– P(X > 240)
Step 2: (z-score)
$z = \frac{X - \mu}{\sigma} = \frac{240 - 170}{30} \approx 2.33$
[Figure: standard normal density curve, y = normalDensity(x), marked at z = 2.33]
Since we want the percent of boys whose cholesterol is greater than 240,
we will find the percent of boys whose cholesterol is 2.33σ or more above the
mean.
Step 3: (Area from Table A)
We can now use Table A to find the percent of observations below
2.33. (below because that’s what our table gives us)
   Z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
 2.0  .9772  .9778  .9783  .9788  .9793  .9798  .9803  .9808  .9812  .9817
 2.1  .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
 2.2  .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
 2.3  .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
 2.4  .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
 2.5  .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
 2.6  .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
 2.7  .9965  .9966  .9967  .9968  .9969  .9970  .9971  .9972  .9973  .9974
 Step 3: Area (continued)
– The value from the table is .9901. We need to
remember that the table gives us area below a
value. Since the total area under the curve is 1,
to find the area above we can subtract the area
from the table from 1. So 1- .9901 = .0099
 Step 4: (context)
– The percent of 14-year-old boys whose
cholesterol level is more than 240 mg/dl is
approximately 0.99%.
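The "area above" calculation is the complement of the table value. A minimal sketch, again assuming `statistics.NormalDist` in place of Table A:

```python
from statistics import NormalDist

mu, sigma = 170, 30   # mg/dl
x = 240               # Step 1: P(X > 240)

# Step 2: z-score rounded to two decimals, as read off the table.
z = round((x - mu) / sigma, 2)

# Step 3: the table gives the area below z, so subtract it from 1.
area_below = NormalDist().cdf(z)
area_above = 1 - area_below

# Step 4: answer in context.
print(f"z = {z}, area below = {area_below:.4f}")
print(f"About {area_above:.2%} of boys are above {x} mg/dl.")
```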
Question between two values
What percent of 14-year-old boys have cholesterol
levels between 170 mg/dl and 240 mg/dl?
Step 1 (probability statement)
– P(170 < X < 240)
Step 2: (z-scores; we need to find z-scores for both endpoints)
$z = \frac{X - \mu}{\sigma} = \frac{170 - 170}{30} = 0$
$z = \frac{X - \mu}{\sigma} = \frac{240 - 170}{30} \approx 2.33$
[Figure: standard normal density curve, y = normalDensity(x), marked at z = 0 and z = 2.33]
Since we want the percent of boys whose cholesterol is between 170
mg/dl and 240 mg/dl, we will find the percent of boys whose cholesterol is
between 0σ and 2.33σ.
Step 3: (Area from Table A)
We can now use Table A to find the percent of observations below
2.33 and the area below z= 0.00 (below because that’s what our
table gives us)
   Z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-0.2  .4207  .4168  .4129  .4090  .4052  .4013  .3974  .3936  .3897  .3859
-0.1  .4602  .4562  .4522  .4483  .4443  .4404  .4364  .4325  .4286  .4247
-0.0  .5000  .4960  .4920  .4880  .4840  .4801  .4761  .4721  .4681  .4641
 0.0  .5000  .5040  .5080  .5120  .5160  .5199  .5239  .5279  .5319  .5359
 0.1  .5398  .5438  .5478  .5517  .5557  .5596  .5636  .5675  .5714  .5753
 ...
 2.1  .9821  .9826  .9830  .9834  .9838  .9842  .9846  .9850  .9854  .9857
 2.2  .9861  .9864  .9868  .9871  .9875  .9878  .9881  .9884  .9887  .9890
 2.3  .9893  .9896  .9898  .9901  .9904  .9906  .9909  .9911  .9913  .9916
 2.4  .9918  .9920  .9922  .9925  .9927  .9929  .9931  .9932  .9934  .9936
 2.5  .9938  .9940  .9941  .9943  .9945  .9946  .9948  .9949  .9951  .9952
 2.6  .9953  .9955  .9956  .9957  .9959  .9960  .9961  .9962  .9963  .9964
 Step 3: Area (continued)
– The values from the table are .9901 for the z-score of
2.33 and .5000 for the z-score of 0. We need to
remember that the table gives us area below a value.
We can take the area from 2.33 (.9901) and subtract
the area from 0 (.5000) to get the area between.
.9901 - .5000 = .4901
[Figure: standard normal density curve, y = normalDensity(x), marked at z = 0 and z = 2.33]
So:
Step 4: Context
The percent of 14-year-old boys whose cholesterol is between 170 mg/dl and 240 mg/dl
is approximately 49.01%
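The "between two values" calculation subtracts one table area from another. In the sketch below, `area_between` is a hypothetical helper written for illustration, with `statistics.NormalDist` assumed as a replacement for Table A.

```python
from statistics import NormalDist

def area_between(low: float, high: float, mu: float, sigma: float) -> float:
    """Proportion of a normal population between low and high,
    using z-scores rounded to two decimals as in a table lookup."""
    z_low = round((low - mu) / sigma, 2)
    z_high = round((high - mu) / sigma, 2)
    standard = NormalDist()  # N(0, 1)
    # Area below the upper endpoint minus area below the lower endpoint.
    return standard.cdf(z_high) - standard.cdf(z_low)

# P(170 < X < 240) for cholesterol with mean 170 mg/dl and sd 30 mg/dl.
proportion = area_between(170, 240, mu=170, sigma=30)
print(f"About {proportion:.2%} of boys are between 170 and 240 mg/dl.")
```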
Finding the value of the variable when we
know the percent above or below
 What cholesterol level do the top 10% of 14-year-old boys have?
Step 1: Write a probability statement
P(X > x) = .10
This statement says: we want to find the
value that separates the top 10% from
the bottom 90% of the curve.
[Figure: standard normal density curve, y = normalDensity(x), for x from -3 to 3]
Since our table gives area below the
curve, we will find a z-score that
corresponds to 90% area.
Step 2: Find the z-score from the table. Remember that the area is
located on the “inside” of the table. Since the z-score that we are looking
for is above the mean, we know the z-score will be positive. We’ll look for
a value close to .9000.
Standard Normal Probability Distribution
   Z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
 0.8  .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
 0.9  .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
 1.0  .8413  .8438  .8461  .8485  .8508  .8531  .8554  .8577  .8599  .8621
 1.1  .8643  .8665  .8686  .8708  .8729  .8749  .8770  .8790  .8810  .8830
 1.2  .8849  .8869  .8888  .8907  .8925  .8944  .8962  .8980  .8997  .9015
 1.3  .9032  .9049  .9066  .9082  .9099  .9115  .9131  .9147  .9162  .9177
 1.4  .9192  .9207  .9222  .9236  .9251  .9265  .9279  .9292  .9306  .9319
 1.5  .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
 1.6  .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
 1.7  .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
The closest value is .8997, so we will use a z-score of 1.28
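Reading the table "inside out" amounts to scanning the two-decimal z-scores for the area closest to the target. A small sketch of that search, assuming `statistics.NormalDist` supplies the table areas; the grid from 0.00 to 3.49 is an assumption about the range a typical table covers.

```python
from statistics import NormalDist

standard = NormalDist()  # N(0, 1)
target = 0.90            # area below the z-score we are looking for

# Two-decimal z-scores, like the row/column grid of a printed table.
grid = [i / 100 for i in range(0, 350)]

# Pick the z whose area below is closest to the target.
closest_z = min(grid, key=lambda z: abs(standard.cdf(z) - target))
table_area = round(standard.cdf(closest_z), 4)

print(f"closest z = {closest_z}, area below = {table_area}")
```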
Step 3: Using the z-score found, use the formula to standardize values,
substituting the three known values.
$z = \frac{X - \mu}{\sigma}$
$1.28 = \frac{X - 170}{30}$
Now using algebra, solve the equation for X
$(30)1.28 = \frac{X - 170}{30}(30)$
$(30)1.28 + 170 = X$
$208.40 = X$
Step 4:
Write a statement back in context
A 14-year-old boy's cholesterol level must be at
least 208.40 mg/dl to be in the top 10% of cholesterol
levels.
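Solving the standardizing formula for X gives X = μ + zσ. The sketch below applies that with the table z-score of 1.28 and, for comparison, with the unrounded z-score from `statistics.NormalDist.inv_cdf`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 170, 30   # mg/dl

# Un-standardize the table z-score: X = mu + z * sigma.
z_table = 1.28
x_table = mu + z_table * sigma

# Same cutoff using the exact inverse of the normal CDF at 90% area below.
x_exact = NormalDist(mu, sigma).inv_cdf(0.90)

print(f"table-based cutoff: {x_table:.2f} mg/dl")
print(f"exact cutoff:       {x_exact:.2f} mg/dl")
```

The two cutoffs differ by a few hundredths of a mg/dl because the table only offers z-scores to two decimals.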
Comparing Individuals
 One of the best reasons to standardize values
(find their corresponding z-scores) is to be able to
compare individuals from different distributions.
 Consider again the three baseball players that we
looked at earlier in the year
Ty Cobb    Ted Williams    George Brett
.420       .406            .390
How can we compare the batting averages of these
players when they played in different eras under
different conditions? Was Ty Cobb actually the
best hitter of these three? Let’s find out.
Comparing Individuals (Cont.)
 We know that batting averages are quite symmetric and
reasonably normal with the following characteristics for
each era:
Decade    Mean    Std Dev
1910s     .266    .0371
1940s     .267    .0326
1970s     .261    .0317
Now, using that information, find the
corresponding z-score for each player.
 Ty Cobb
$z = \frac{X - \mu}{\sigma} = \frac{.420 - .266}{.0371} \approx 4.15$
 Ted Williams
$z = \frac{X - \mu}{\sigma} = \frac{.406 - .267}{.0326} \approx 4.26$
 George Brett
$z = \frac{X - \mu}{\sigma} = \frac{.390 - .261}{.0317} \approx 4.07$
Now that we have standardized each score onto the standard normal curve,
we can compare the scores of these three individuals. Since, in this case, a
larger value indicates a better batting average, it appears that Ted Williams
is the best batter of these three: 4.26 > 4.15 > 4.07.
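The comparison can be reproduced by standardizing each batting average against its own era. A short sketch; the dictionary layout and names are illustrative, and the numbers are the ones given in the slides.

```python
# (batting average, era mean, era standard deviation) for each player
players = {
    "Ty Cobb": (0.420, 0.266, 0.0371),       # 1910s
    "Ted Williams": (0.406, 0.267, 0.0326),  # 1940s
    "George Brett": (0.390, 0.261, 0.0317),  # 1970s
}

# z = (X - mean) / sd, rounded to two decimals as in the slides.
z_scores = {
    name: round((avg - mean) / sd, 2)
    for name, (avg, mean, sd) in players.items()
}

# The largest z-score is the most exceptional average relative to its era.
best = max(z_scores, key=z_scores.get)

for name, z in z_scores.items():
    print(f"{name}: z = {z}")
print(f"Highest z-score: {best}")
```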
Additional Resources
 Practice of Statistics, Pg 83-97