Continuous Probability Distributions
• Continuous random variable
– Values from interval of numbers
– Absence of gaps
• Continuous probability distribution
– Distribution of continuous random variable
• Most important continuous probability
distribution
– The normal distribution
The Uniform Distribution
• “Rectangular shaped”
• Every value between
a and b is equally
likely
• The mean and median
are in the middle
• Prob(X<=v) is the area
on the left of v
[Figure: uniform density f(X) between a and b; v, the mean and the median are marked on the X axis]
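The "area on the left of v" idea can be sketched in a few lines of Python. The function name `uniform_cdf` and the interval [2, 10] are illustrative assumptions, not part of the handout.

```python
def uniform_cdf(v: float, a: float, b: float) -> float:
    """P(X <= v) for X uniform on [a, b]: the rectangle's area to the left of v."""
    if b <= a:
        raise ValueError("need a < b")
    if v <= a:
        return 0.0
    if v >= b:
        return 1.0
    # Height of the rectangle is 1 / (b - a); width to the left of v is (v - a).
    return (v - a) / (b - a)

# Hypothetical interval, chosen only for illustration.
a, b = 2.0, 10.0
midpoint = (a + b) / 2  # the mean and the median both sit here

print(uniform_cdf(4.0, a, b))       # 0.25
print(uniform_cdf(midpoint, a, b))  # 0.5
```

Half of the area lies to the left of the midpoint, which is why the mean and median coincide there.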
The Normal Distribution
• “Bell shaped”
• Symmetrical
• Mean, median and
mode are equal
• Interquartile range
equals 1.33 σ
• 68-95-99.7 % rule
• Random variable
has infinite range
[Figure: bell-shaped density f(X); the mean and median are marked at the same point on the X axis]
The Mathematical Model
$f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2\sigma^2}(X - \mu)^2}$
f(X): density of random variable X
π ≈ 3.14159; e ≈ 2.71828
μ: population mean
σ: population standard deviation
X: value of random variable (-∞ < X < ∞)
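As a rough check of the density formula, the sketch below evaluates it directly and compares the result with Python's built-in `statistics.NormalDist`. The function name `normal_pdf` is hypothetical, and μ = 5, σ = 10 are borrowed from the worked examples later in the handout.

```python
import math
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / math.sqrt(2 * math.pi * sigma ** 2)

mu, sigma = 5.0, 10.0  # illustrative parameters
for x in (-5.0, 5.0, 6.2):
    by_formula = normal_pdf(x, mu, sigma)
    by_library = NormalDist(mu, sigma).pdf(x)
    print(f"x={x:5.1f}  formula={by_formula:.6f}  library={by_library:.6f}")
```

The density peaks at X = μ and takes the same value at equal distances on either side, which is the symmetry listed on the previous slide.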
Many Normal Distributions
There are an infinite number of normal distributions
By varying the parameters σ and μ, we
obtain different normal distributions
Finding Probabilities
Probability is
the area under
the curve!
P(c ≤ X ≤ d) = ?
[Figure: normal density f(X) with c and d marked on the X axis]
Which Table to Use?
An infinite number of normal distributions
means an infinite number of tables to look up!
Solution: The Cumulative Standardized
Normal Distribution
Cumulative Standardized Normal
Distribution Table (Portion)
$\mu_Z = 0$, $\sigma_Z = 1$
Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255
Table entries are probabilities.
[Figure: standardized normal curve; the area to the left of Z = 0.12 is .5478 (shaded area exaggerated)]
Only One Table is Needed
Standardizing Example
$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = 0.12$
[Figure: normal distribution (μ = 5, σ = 10) with X = 6.2 marked, beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with Z = 0.12 marked; shaded area exaggerated]
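The standardizing step and the table lookup can be reproduced as a small sketch. `standardize` is a hypothetical helper name; `NormalDist().cdf` plays the role of the cumulative standardized normal table.

```python
from statistics import NormalDist

def standardize(x: float, mu: float, sigma: float) -> float:
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

z = standardize(6.2, mu=5.0, sigma=10.0)
table_value = NormalDist().cdf(z)            # cumulative standardized normal
direct_value = NormalDist(5.0, 10.0).cdf(6.2)  # same area, without standardizing

print(round(z, 2), round(table_value, 4))  # 0.12 0.5478
print(round(direct_value, 4))              # 0.5478
```

Both routes give the same area, which is why a single table for Z is enough.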
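Example:
$P(2.9 \le X \le 7.1) = .1664$
$Z = \frac{X - \mu}{\sigma} = \frac{2.9 - 5}{10} = -.21$
$Z = \frac{X - \mu}{\sigma} = \frac{7.1 - 5}{10} = .21$
[Figure: normal distribution (μ = 5, σ = 10) with 2.9 and 7.1 marked, beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with -0.21 and 0.21 marked; an area of .0832 lies on each side of the mean; shaded area exaggerated]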
Example:
$P(2.9 \le X \le 7.1) = .1664$ (continued)
Cumulative Standardized Normal
Distribution Table (Portion)
$\mu_Z = 0$, $\sigma_Z = 1$
Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255
[Figure: standardized normal curve; the area to the left of Z = 0.21 is .5832 (shaded area exaggerated)]
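Example:
$P(2.9 \le X \le 7.1) = .1664$ (continued)
Cumulative Standardized Normal
Distribution Table (Portion)
$\mu_Z = 0$, $\sigma_Z = 1$
Z      .00    .01    .02
-0.3  .3821  .3783  .3745
-0.2  .4207  .4168  .4129
-0.1  .4602  .4562  .4522
 0.0  .5000  .4960  .4920
[Figure: standardized normal curve; the area to the left of Z = -0.21 is .4168 (shaded area exaggerated)]
$P(2.9 \le X \le 7.1) = .5832 - .4168 = .1664$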
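Example:
$P(X \ge 8) = .3821$
$Z = \frac{X - \mu}{\sigma} = \frac{8 - 5}{10} = .30$
[Figure: normal distribution (μ = 5, σ = 10) with X = 8 marked, beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with Z = 0.30 marked; the area to the right is .3821; shaded area exaggerated]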
Example:
$P(X \ge 8) = .3821$ (continued)
Cumulative Standardized Normal
Distribution Table (Portion)
$\mu_Z = 0$, $\sigma_Z = 1$
Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255
[Figure: standardized normal curve; the area to the left of Z = 0.30 is .6179 (shaded area exaggerated)]
$P(X \ge 8) = 1 - .6179 = .3821$
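The two worked examples (an interval and an upper tail) can be checked with a short sketch that uses `statistics.NormalDist` in place of the printed table. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
mu, sigma = 5.0, 10.0

# Interval: P(2.9 <= X <= 7.1) = Phi(0.21) - Phi(-0.21)
z_low = (2.9 - mu) / sigma
z_high = (7.1 - mu) / sigma
p_low = std_normal.cdf(z_low)
p_high = std_normal.cdf(z_high)
p_interval = p_high - p_low

# Upper tail: P(X >= 8) = 1 - Phi(0.30)
z_tail = (8 - mu) / sigma
p_upper = 1 - std_normal.cdf(z_tail)

print(round(p_low, 4), round(p_high, 4))  # 0.4168 0.5832
print(round(p_upper, 4))                  # 0.3821
print(f"{p_interval:.5f}")
```

The unrounded interval probability comes out a hair below the table answer of .1664, because the table method subtracts two entries that were each rounded to four decimals first.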
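Finding Z Values
for Known Probabilities
What is Z Given
Probability = 0.1217 ?
Cumulative Standardized
Normal Distribution Table
(Portion)
$\mu_Z = 0$, $\sigma_Z = 1$
Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255
[Figure: standardized normal curve; the area between 0 and Z is .1217, so the cumulative area to the left of Z is .5 + .1217 = .6217, which the table gives at Z = .31; shaded area exaggerated]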
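Recovering X Values
for Known Probabilities
$X = \mu + Z\sigma = 5 + (.30)(10) = 8$
[Figure: normal distribution (μ = 5, σ = 10) with the unknown X marked "?", beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with Z = 0.30 marked; areas .1179 and .3821 are labeled]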
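Reading the table "backwards" corresponds to an inverse cumulative lookup. A minimal sketch, assuming `statistics.NormalDist.inv_cdf` as a stand-in for the reverse table search:

```python
from statistics import NormalDist

std_normal = NormalDist()

# Area between 0 and Z is .1217, so the cumulative area is .5 + .1217 = .6217.
z_from_area = std_normal.inv_cdf(0.5 + 0.1217)
print(round(z_from_area, 2))  # 0.31

# Recovering X: find Z for cumulative area .6179, then X = mu + Z * sigma.
mu, sigma = 5.0, 10.0
z = std_normal.inv_cdf(0.6179)
x = mu + z * sigma
print(round(z, 2), round(x, 1))  # 0.3 8.0
```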
Finding Probabilities for X Values
Using Excel
Excel function:
=NORMDIST(x,mean,standard_deviation,TRUE)
=NORMSDIST(z)
Example
Prob.(weight <= 165 lbs) when mean=180, std_dev=20:
=NORMDIST(165,180,20,true)
Answer: 0.2266
Prob.(weight >= 185 lbs) ?
Prob.(weight >= 165 and weight <= 185 lbs) ?
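For readers without Excel, the same three probabilities can be sketched in Python. This treats `NormalDist.cdf` as the counterpart of `NORMDIST(..., TRUE)` and reads the third question as the interval between 165 and 185 lbs; both are assumptions of this sketch.

```python
from statistics import NormalDist

weight = NormalDist(mu=180, sigma=20)

p_le_165 = weight.cdf(165)                    # counterpart of =NORMDIST(165,180,20,TRUE)
p_ge_185 = 1 - weight.cdf(185)                # upper tail: complement of the cumulative value
p_between = weight.cdf(185) - weight.cdf(165)  # interval: difference of two cumulative values

print(f"P(weight <= 165)        = {p_le_165:.4f}")
print(f"P(weight >= 185)        = {p_ge_185:.4f}")
print(f"P(165 <= weight <= 185) = {p_between:.4f}")
```

The cumulative function only ever returns the area to the left, so upper tails and intervals are built from complements and differences.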
Finding X Values for Known Probabilities
Using Excel
Excel function:
=NORMINV(probability,mean,standard_deviation)
=NORMSINV(probability)
Example
Prob.(weight <= X) = 0.2 (mean=180, std_dev=20)
=NORMINV(0.2,180,20)
Answer: X=163
Prob.(weight >= X) = 0.4, so Prob.(weight <= X) = 0.6
=NORMINV(0.6,180,20)
Answer: X=185
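A Python counterpart of the two inverse lookups, assuming `NormalDist.inv_cdf` behaves like `NORMINV` (it takes a cumulative, left-tail probability):

```python
from statistics import NormalDist

weight = NormalDist(mu=180, sigma=20)

# P(weight <= X) = 0.2 is already a left-tail probability.
x_lower = weight.inv_cdf(0.2)

# P(weight >= X) = 0.4 must first be converted to the left tail: 1 - 0.4 = 0.6.
x_upper = weight.inv_cdf(1 - 0.4)

print(round(x_lower), round(x_upper))  # 163 185
```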
Generating Random Variables
Using Excel
• Excel can be used to generate Discrete and
Continuous Random Variables
• Complex Probabilistic Models can be
constructed and simulation can give insight and
suggest managerial decisions
• Tutorial
Assessing Normality
• Not all continuous random variables are
normally distributed
• It is important to evaluate how well the data set
is approximated by a
normal distribution
Assessing Normality
(continued)
• Construct charts
– For large data sets, does the histogram appear bell-shaped?
• Compute descriptive summary measures
– Do the mean, median and mode have similar
values?
– Is the interquartile range approximately 1.33 σ?
– Does the data obey the 68-95-99.7 percent rule?
– Is the range approximately 6 σ?
Assessing Normality
(continued)
• Observe the distribution of the data set
– Do approximately 2/3 of the observations lie
between mean ± 1 standard deviation?
– Do approximately 4/5 of the observations lie
between mean ± 1.28 standard deviations?
– Do approximately 19/20 of the observations lie
between mean ± 2 standard deviations?
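The checklist above can be turned into a rough numerical summary. The sketch below is one possible way to do it; the function name `normality_summary` and the simulated data set (mean 180, standard deviation 20) are assumptions made for illustration only.

```python
import random
from statistics import mean, median, pstdev, quantiles

def normality_summary(data):
    """Compare a data set against the rule-of-thumb benchmarks for normality."""
    m, s = mean(data), pstdev(data)
    q1, _, q3 = quantiles(data, n=4)  # quartile cut points

    def share_within(k):
        # Fraction of observations within k standard deviations of the mean.
        return sum(abs(x - m) <= k * s for x in data) / len(data)

    return {
        "mean": m,
        "median": median(data),
        "iqr_over_sd": (q3 - q1) / s,                  # benchmark: about 1.33
        "range_over_sd": (max(data) - min(data)) / s,  # benchmark: about 6
        "within_1_sd": share_within(1),                # benchmark: about 2/3
        "within_1.28_sd": share_within(1.28),          # benchmark: about 4/5
        "within_2_sd": share_within(2),                # benchmark: about 19/20
    }

# Simulated data, used here only so the sketch has something to summarize.
rng = random.Random(0)
sample = [rng.gauss(180, 20) for _ in range(5000)]
for name, value in normality_summary(sample).items():
    print(f"{name}: {value:.3f}")
```

These are descriptive checks, not a formal test; a data set can pass all of them and still depart from normality in ways a histogram would reveal.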
Why Study
Sampling Distributions
• Sample statistics are used to estimate
population parameters
– e.g.: $\bar{X} = 50$ estimates the population mean μ
• Problems: different samples provide different
estimates
– Large samples give better estimates; large samples
cost more
– How good is the estimate?
• Approach to solution: theoretical basis is
sampling distribution
Sampling Distribution
• Theoretical probability distribution of a
sample statistic
• Sample statistic is a random variable
– Sample mean, sample proportion
• Results from taking all possible samples of
the same size
Example
• Population: 100 subjects, numbered from 1 to
100
• Take sample of 10 and compute average
• Take another sample, etc.
• Excel workbook
Developing Sampling Distributions
• Assume there is a population …
• Population size N=4
• Random variable, X,
is age of individuals
• Values of X: 18, 20,
22, 24 measured in
years
[Figure: the four individuals, labeled A, B, C and D]
Developing Sampling Distributions
(continued)
Summary Measures for the Population Distribution
$\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} = 2.236$
[Figure: uniform distribution; bar chart of P(X) for A (18), B (20), C (22) and D (24)]
Developing Sampling
Distributions
(continued)
All Possible Samples of Size n=2
16 Samples Taken with Replacement
1st   2nd Observation
Obs   18     20     22     24
18    18,18  18,20  18,22  18,24
20    20,18  20,20  20,22  20,24
22    22,18  22,20  22,22  22,24
24    24,18  24,20  24,22  24,24
16 Sample Means
1st   2nd Observation
Obs   18  20  22  24
18    18  19  20  21
20    19  20  21  22
22    20  21  22  23
24    21  22  23  24
Developing Sampling
Distributions
(continued)
Sampling Distribution of All Sample Means
16 Sample Means
1st   2nd Observation
Obs   18  20  22  24
18    18  19  20  21
20    19  20  21  22
22    20  21  22  23
24    21  22  23  24
[Figure: sample means distribution; bar chart of $P(\bar{X})$ for $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]
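Developing Sampling
Distributions
(continued)
Summary Measures of Sampling Distribution
$\mu_{\bar{X}} = \frac{\sum_{i=1}^{N} \bar{X}_i}{N} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$
$\sigma_{\bar{X}} = \sqrt{\frac{\sum_{i=1}^{N} (\bar{X}_i - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$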
Comparing the Population with its
Sampling Distribution
Population: N=4, $\mu = 21$, $\sigma = 2.236$
[Figure: bar chart of P(X) for A (18), B (20), C (22) and D (24)]
Sample Means Distribution: n=2, $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$
[Figure: bar chart of $P(\bar{X})$ for $\bar{X}$ = 18, 19, 20, 21, 22, 23, 24]
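The whole N=4, n=2 example is small enough to enumerate by brute force. The sketch below lists all 16 samples taken with replacement and recomputes the summary measures; the variable names are illustrative.

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

population = [18, 20, 22, 24]  # ages of individuals A, B, C, D
n = 2                          # sample size

# All ordered samples of size n, drawn with replacement: 4 ** 2 = 16 of them.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu, sigma = mean(population), pstdev(population)
mu_xbar, sigma_xbar = mean(sample_means), pstdev(sample_means)

print(f"samples: {len(samples)}")
print(f"population:   mu = {mu:.0f}, sigma = {sigma:.3f}")
print(f"sample means: mu = {mu_xbar:.0f}, sigma = {sigma_xbar:.2f}")
print(f"sigma / sqrt(n) = {sigma / sqrt(n):.2f}")
```

The mean of the sample means matches the population mean, and their standard deviation matches σ/√n.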
Properties of Summary Measures
• $\mu_{\bar{X}} = \mu$
– i.e. $\bar{X}$ is unbiased
• Standard error (standard deviation) of the
sampling distribution, $\sigma_{\bar{X}}$, is less than the
standard error of other unbiased estimators
• For sampling with replacement:
– As n increases, $\sigma_{\bar{X}}$ decreases
– $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
Unbiasedness
[Figure: two sampling distributions $P(\bar{X})$, one labeled "Unbiased" and one labeled "Biased", with μ marked on the axis]
Effect of Large Sample
[Figure: sampling distributions $P(\bar{X})$ for a larger sample size and a smaller sample size, with μ marked on the axis]
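When the Population is Normal
Central Tendency: $\mu_{\bar{X}} = \mu$
Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ (Sampling with Replacement)
Population Distribution: $\sigma = 10$, $\mu = 50$
Sampling Distributions:
n = 4: $\sigma_{\bar{X}} = 5$
n = 16: $\sigma_{\bar{X}} = 2.5$
$\mu_{\bar{X}} = 50$
[Figure: population distribution and the two sampling distributions of $\bar{X}$]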
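When the Population
is Not Normal
Central Tendency: $\mu_{\bar{X}} = \mu$
Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$ (Sampling with Replacement)
Population Distribution: $\sigma = 10$, $\mu = 50$
Sampling Distributions:
n = 4: $\sigma_{\bar{X}} = 5$
n = 30: $\sigma_{\bar{X}} = 1.8$
$\mu_{\bar{X}} = 50$
[Figure: population distribution and the two sampling distributions of $\bar{X}$]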
Central Limit Theorem
As sample size gets large enough, the sampling distribution becomes almost normal regardless of shape of population.
[Figure: sampling distribution of $\bar{X}$]
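A small simulation can make the theorem visible. The sketch below draws sample means from a strongly right-skewed population and watches the skewness shrink as n grows. The exponential population, the helper names, and the use of skewness as the "how normal is it" measure are all assumptions of this sketch, not part of the handout.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Population skewness; it is 0 for a symmetric distribution such as the normal."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_sample_means(n, draws=4000, seed=1):
    """Means of `draws` samples of size n from an exponential population with mean 1."""
    rng = random.Random(seed)
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(draws)]

for n in (2, 30):
    means = simulate_sample_means(n)
    print(f"n={n:2d}  mean={mean(means):.3f}  "
          f"std error={pstdev(means):.3f}  skewness={skewness(means):.2f}")
```

With n = 2 the sample means are still clearly skewed; with n = 30 the skewness is much closer to 0, and the standard error is close to σ/√n.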
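How Large is Large Enough?
• For most distributions, n>30
• For fairly symmetric distributions, n>15
• For normal distribution, the sampling distribution
of the mean is always normally distributed
Example:
$\mu = 8$, $\sigma = 2$, $n = 25$
$P(7.8 \le \bar{X} \le 8.2) = ?$
$\sigma_{\bar{X}} = \frac{2}{\sqrt{25}} = .4$
$P(7.8 \le \bar{X} \le 8.2) = P\left(\frac{7.8 - 8}{2/\sqrt{25}} \le \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \le \frac{8.2 - 8}{2/\sqrt{25}}\right) = P(-.5 \le Z \le .5) = .3830$
[Figure: sampling distribution ($\mu_{\bar{X}} = 8$, $\sigma_{\bar{X}} = .4$) with 7.8 and 8.2 marked, beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with -0.5 and 0.5 marked; an area of .1915 lies on each side of the mean]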
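The example can be re-derived in a few lines, treating the sampling distribution of the mean as a normal distribution with standard deviation σ/√n. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8.0, 2.0, 25

standard_error = sigma / sqrt(n)            # sigma of the sample mean
xbar_dist = NormalDist(mu, standard_error)  # sampling distribution of the mean

z_low = (7.8 - mu) / standard_error
z_high = (8.2 - mu) / standard_error
prob = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)

print(f"standard error = {standard_error:.1f}")
print(f"Z from {z_low:.1f} to {z_high:.1f}")
print(f"P(7.8 <= sample mean <= 8.2) = {prob:.4f}")
```

The unrounded probability sits just below the table answer of .3830, which doubles the rounded table area .1915.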
Population Proportions
• Categorical variable
– e.g.: Gender, voted for Bush, college degree
• Proportion of population having a
characteristic: p
• Sample proportion provides an estimate
– $p_S = \frac{X}{n} = \frac{\text{number of successes}}{\text{sample size}}$
• If two outcomes, X has a binomial distribution
– Possess or do not possess characteristic
Sampling Distribution
of Sample Proportion
• Approximated by
normal distribution
– $np \ge 5$
– $n(1 - p) \ge 5$
– Mean:
• $\mu_{p_S} = p$
– Standard error:
• $\sigma_{p_S} = \sqrt{\frac{p(1 - p)}{n}}$
p = population proportion
[Figure: sampling distribution; bar chart of $P(p_S)$ against $p_S$ from 0 to 1]
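Standardizing Sampling Distribution of
Proportion
$Z = \frac{p_S - \mu_{p_S}}{\sigma_{p_S}} = \frac{p_S - p}{\sqrt{\frac{p(1 - p)}{n}}}$
[Figure: sampling distribution of $p_S$ (mean $\mu_{p_S}$, standard error $\sigma_{p_S}$) beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$)]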
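Example:
$n = 200$, $p = .4$
$P(p_S \le .43) = ?$
$P(p_S \le .43) = P\left(\frac{p_S - \mu_{p_S}}{\sigma_{p_S}} \le \frac{.43 - .4}{\sqrt{\frac{.4(1 - .4)}{200}}}\right) = P(Z \le .87) = .8078$
[Figure: sampling distribution of $p_S$ with $\mu_{p_S}$ and .43 marked, beside the standardized normal distribution ($\mu_Z = 0$, $\sigma_Z = 1$) with 0 and .87 marked]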
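A sketch of the same calculation in Python, assuming the normal approximation applies (the code checks the two np and n(1 - p) conditions first). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.4

# Conditions for the normal approximation to the sample proportion.
approximation_ok = n * p >= 5 and n * (1 - p) >= 5

standard_error = sqrt(p * (1 - p) / n)
z = (0.43 - p) / standard_error

prob_exact_z = NormalDist().cdf(z)             # Z kept at full precision
prob_table_z = NormalDist().cdf(round(z, 2))   # Z rounded to .87, as in a table lookup

print(approximation_ok)
print(f"standard error = {standard_error:.4f}, Z = {z:.3f}")
print(f"P(pS <= .43) = {prob_exact_z:.4f} (unrounded Z), {prob_table_z:.4f} (Z = .87)")
```

Rounding Z to two decimals before the lookup reproduces the .8078 on the slide; keeping Z unrounded gives a value about .001 lower.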