Download Lecture 3 - Wharton Statistics

5/26/2010
Statistics 111 - Lecture 3
Continuous Random Variables
“The probable is what usually happens.” (Aristotle)
Moore, McCabe and Craig: Sections 4.3, 4.5
Continuous Random Variables
• Continuous random variables have an uncountable number of possible values
• We can’t list the entire probability distribution, so
we use a density curve instead of a histogram
• E.g., the Normal density curve:
Continuous Random Variables
Call center agents’
service times
Examples of continuous variables: service times, weight, height, grades
Calculating Continuous Probabilities
• Discrete case: add up bars from probability histogram
• Continuous case: we have to use integration to
calculate the area under the density curve f(x):
$P(a < X < b) = \int_a^b f(x)\,dx$
• Although it seems more complicated, it is often easier to
integrate than add up discrete “bars”
• Note: P(X=c)=0. The probability of X equaling a
particular value is zero. Therefore, P(X<c)=P(X<=c).
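As a rough numerical sketch of "area under the density curve", the snippet below approximates the integral with the trapezoid rule. The standard Normal density is used only as an illustration, and the helper names `density` and `area_under_curve` and the step count are assumptions, not part of the lecture.

```python
import math

def density(x):
    """Standard Normal density curve (mean 0, standard deviation 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def area_under_curve(f, a, b, steps=10_000):
    """Approximate the integral of f from a to b with the trapezoid rule."""
    if a == b:
        return 0.0  # an interval of zero width has zero area
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

p_interval = area_under_curve(density, -1.0, 1.0)  # P(-1 < X < 1)
p_point = area_under_curve(density, 1.0, 1.0)      # P(X = 1)
print(round(p_interval, 4), p_point)
```

The zero-width case mirrors the note above: a single point contributes no area, so including or excluding an endpoint does not change the probability.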
Example: Normal Distribution
We will use the normal distribution throughout
this course for two reasons:
1. It is usually a good approximation to real data
2. We have tables of calculated areas under the
normal curve, so we avoid doing integration!
The Normal Distribution
• The Normal distribution has the shape of a “bell
curve” with parameters μ (mean) and σ² (variance) that determine
the center and spread:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Normal Distribution: A family of density curves
[Figure, top panel: means are the same (μ = 15) while standard deviations are different (σ = 2, 4, and 6); horizontal axis from 0 to 30]
[Figure, bottom panel: means are different (μ = 10, 15, and 20) while standard deviations are the same (σ = 3); horizontal axis from 0 to 30]
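The two panels can be mimicked numerically. This is a minimal sketch that evaluates the Normal density at its peak for the parameter values quoted in the captions; the function name `normal_density` is an assumption made for illustration.

```python
import math

def normal_density(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Same mean, different spreads: the peak at x = mu gets lower as sigma grows.
peaks = {sigma: normal_density(15, 15, sigma) for sigma in (2, 4, 6)}

# Different means, same spread: the curve shifts but keeps the same peak height.
shifted = {mu: normal_density(mu, mu, 3) for mu in (10, 15, 20)}

for sigma, height in peaks.items():
    print(f"sigma = {sigma}: peak height = {height:.4f}")
```

A larger standard deviation spreads the same total area of 1 over a wider range, which is why the curve becomes flatter.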
The 68-95-99.7% Rule for Normal Distributions

• About 68% of all observations are within 1 standard deviation (σ) of the mean (μ).
• About 95% of all observations are within 2σ of the mean μ.
• Almost all (99.7%) observations are within 3σ of the mean.
[Figure: Normal curve with its inflection point marked one standard deviation from the mean; mean μ = 64.5, standard deviation σ = 2.5, N(μ, σ) = N(64.5, 2.5)]
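The rule can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist` with the N(64.5, 2.5) example from the slide; the helper name `share_within` is an assumption.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)

def share_within(dist, k):
    """Probability of falling within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

for k in (1, 2, 3):
    print(f"within {k} sd: {share_within(heights, k):.4f}")
```

The same three proportions come out for any choice of mean and standard deviation, which is what makes the rule usable for every Normal distribution.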
The standard Normal distribution
Because all Normal distributions share the same
properties, we can standardize our data to transform
any Normal curve N(μ, σ) into the standard Normal
curve N(0,1).
[Figure: N(64.5, 2.5) curve over height x => N(0,1) curve over standardized height z (no units)]
The standard Normal distribution
• We convert a non-standard normal distribution
into a standard normal distribution using a linear
transformation
• If X has a N(μ, σ) distribution, then we can
convert to Z, which follows a N(0,1) distribution:
z = (x - μ)/σ
• First, subtract the mean μ from X
• Then, divide by the standard deviation σ of X
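The two steps translate directly into code. This is a small sketch; the function names `z_score` and `from_z` and the height of 69.5 inches are hypothetical, chosen only to illustrate the transformation.

```python
def z_score(x, mu, sigma):
    """Standardize x: subtract the mean, then divide by the standard deviation."""
    return (x - mu) / sigma

def from_z(z, mu, sigma):
    """Undo the standardization to get back to the original units."""
    return z * sigma + mu

z = z_score(69.5, mu=64.5, sigma=2.5)   # hypothetical height of 69.5 inches
print(z, from_z(z, mu=64.5, sigma=2.5))
```

Because the transformation is linear, it can always be reversed, which is how a z value read from a table is turned back into a value of X.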
Example: Normal Distribution
Women’s heights follow the N(64.5”, 2.5”) distribution.
What percent of women are shorter than 67 inches?
1. X is a random variable giving a randomly chosen woman’s height
2. mean μ = 64.5”, standard deviation σ = 2.5”
3. Calculate the z-score for x = 67:
z = (x - μ)/σ = (67 - 64.5)/2.5 = 2.5/2.5 = 1, i.e. 1 standard deviation above the mean
4. Following the 68-95-99.7 rule, about 68% of women are within
1 standard deviation of the mean, so the percent of women
shorter than 67” should be, approximately,
0.5 + 0.68/2 = 0.84, or 84%.
Example: Normal Distribution
For more general probability calculations, we have to do
integration
For the standard normal distribution, we have tables of
probabilities already made for us!
Example: Normal Distribution
Percent of women shorter than 67”: P(X<67) = P(Z<1) = 0.8413
Conclusion: 84.13% of women are shorter than 67”.
Standard Normal Table
The table gives the area under the Standard Normal curve to the left of any z
value.
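The table value can be cross-checked by computing the left-tail area directly. The sketch below expresses the standard Normal cumulative area through `math.erf`; the function name `standard_normal_cdf` is an assumption, and the code stands in for the printed table rather than reproducing it.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard Normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

z = (67 - 64.5) / 2.5           # z-score for a height of 67 inches
p_shorter = standard_normal_cdf(z)
print(f"P(X < 67) = P(Z < {z:.2f}) = {p_shorter:.4f}")
```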
Example: Normal Distribution
What is the probability that a woman’s height is between
68 and 70 inches? P(68 < X < 70) = P(X<70) - P(X<68)
We calculate the z-scores for 68 and 70.
For x = 68", z = (68 - 64.5)/2.5 = 1.4
For x = 70", z = (70 - 64.5)/2.5 = 2.2
P(68 < X < 70) = P(X<70) - P(X<68) =
P(Z<2.2) - P(Z<1.4) = 0.9861 - 0.9192 = 0.0669
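As a check on the arithmetic, the sketch below computes the same probability two ways with `statistics.NormalDist`: through the z-scores and directly on the height scale. The variable names are assumptions.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)
standard = NormalDist()  # N(0, 1)

z_low = (68 - 64.5) / 2.5
z_high = (70 - 64.5) / 2.5

# Route 1: standardize first, then subtract left-tail areas.
p_via_z = standard.cdf(z_high) - standard.cdf(z_low)
# Route 2: subtract left-tail areas on the original scale.
p_direct = heights.cdf(70) - heights.cdf(68)
print(f"{p_via_z:.4f} {p_direct:.4f}")
```

Both routes agree, since standardizing changes the scale of the axis but not the area between the two cutoffs.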
Tips on using Table A
• The Normal distribution is
symmetric, meaning:
a. P(Z<-z)=P(Z>z)
b. P(Z<-z)=1-P(Z<z)
• P(Z=z)=0
• P(a<Z<b)=P(Z<b)-P(Z<a)
[Figure: for z = -2.33, the area to the left is 0.0099 and the area to the right is 0.9901; area right of z = area left of -z]
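The symmetry tips can be confirmed numerically. This is a brief sketch using `statistics.NormalDist`, with z = 2.33 taken from the figure; the variable names are assumptions.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal, N(0, 1)
z = 2.33

left_of_minus_z = Z.cdf(-z)          # P(Z < -z)
right_of_z = 1 - Z.cdf(z)            # P(Z > z) = 1 - P(Z < z)
print(f"{left_of_minus_z:.4f} {right_of_z:.4f} {Z.cdf(z):.4f}")
```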
Example: SAT SCORES
• NCAA Division 1 SAT Requirements: athletes are required to
score at least 820 on combined math and verbal SAT
• In 2000, SAT scores were normally distributed with mean μ
of 1019 and SD σ of 209
• What percentage of students have scores greater than 820?
• P(X > 820)
=P(Z > (820-1019)/209)
= P(Z>-0.95)
=1- P(Z < -0.95)
= 1-0.17 =0.83
• 83% of students meet NCAA requirements
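The calculation above can be reproduced without a table. The sketch below assumes the N(1019, 209) model stated on the slide and uses `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=1019, sigma=209)

z = (820 - 1019) / 209        # z-score of the NCAA cutoff
p_above = 1 - sat.cdf(820)    # P(X > 820) = 1 - P(X < 820)
print(f"z = {z:.2f}, P(X > 820) = {p_above:.2f}")
```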
Example: SAT SCORES
• Now, just look at X = Verbal SAT score, which is
normally distributed with mean μ of 505 and SD σ
of 110
• What Verbal SAT score will place a student in the
top 10% of the population?
Example: SAT SCORES
• From the table, P(Z > 1.28) = 0.10
• We know that
1.28 = z = (x - μ)/σ = (x - 505)/110
therefore
x = 1.28 × 110 + 505 ≈ 646
• So, a student needs a Verbal SAT score of at least
646 in order to be in the top 10% of all students
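Working backwards from a probability to a score is an inverse lookup. The sketch below uses `NormalDist.inv_cdf` from Python's standard library in place of reading the table in reverse; the variable names are assumptions.

```python
from statistics import NormalDist

verbal = NormalDist(mu=505, sigma=110)

# Top 10% means 90% of the area lies to the left of the cutoff.
z_cutoff = NormalDist().inv_cdf(0.90)   # z with 10% of the area to its right
x_cutoff = verbal.inv_cdf(0.90)         # same cutoff on the Verbal SAT scale
print(f"z = {z_cutoff:.2f}, cutoff score = {x_cutoff:.0f}")
```

The cutoff on the score scale is just the z cutoff converted back to original units, x = z × σ + μ.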