Lecture 5
Selected material from:
Ch. 7 Random variables and probability distributions
Ch. 8 Sampling variability and sampling distributions
Random variables
A random variable is a numerical variable whose value depends on the outcome of a chance experiment. Discrete random variables have values that are isolated points on the number line.
[Figure: possible values of a discrete random variable]
Continuous random variables have possible values that include an entire interval on the number line.
[Figure: possible values of a continuous random variable]
© 2008 Brooks/Cole, a division of Thomson Learning, Inc.
Discrete probability distributions
The probability distribution of a discrete random variable x gives the probability associated with each possible x value. The probabilities p_i for a discrete variable x with k possible values (x_i, i = 1,…,k) must satisfy
0 ≤ p_i ≤ 1 for each i
p_1 + p_2 + ... + p_k = 1
The probability P(x in A) of any event A is found by summing the p_i for the outcomes x_i making up A.
Hot tub example
A store sells electric and gas hot tubs; 40% of all customers buy electric and 60% buy gas. Assume customers arrive independently to the store.
What is the probability that out of the next four customers, the first three buy gas and the last one buys an electric hot tub? Find the distribution of the number of customers out of 4 that buy electric hot tubs.
Hot tub example
A company sells electric and gas hot tubs; 40% of all customers buy electric and 60% buy gas.
What is the probability that out of the next four customers, the first three buy gas and the last one buys an electric hot tub?
Solution
G = customer buys gas
E = customer buys electric
Need P(GGGE).
Assume customers are independent.
P(GGGE) = P(G)P(G)P(G)P(E) = (.6)(.6)(.6)(.4) = .0864.
There is an 8.64% chance of observing this sequence of purchases.
Hot tub example
Find the distribution of the number of customers out of 4 that buy electric hot tubs; 40% buy electric, 60% gas.
Solution
• X = number of customers out of 4 that buy electric hot tubs.
• X = 0, 1, 2, 3, or 4.
• P(X = 0) = P(GGGG) = P(G)P(G)P(G)P(G) = (.6)(.6)(.6)(.6) = .1296,
using independence and that 60% buy gas hot tubs.
• P(X = 1) = P(EGGG) + P(GEGG) + P(GGEG) + P(GGGE) =
4(.4)(.6)(.6)(.6) noting that the order does not matter, = 4(.0864)
using the previous calculation, = .3456.
• P(X = 2) = 6(.4)(.4)(.6)(.6) noting that there are 6 ways to choose 2
objects out of 4 [double-check that you can write out these
sequences yourself], = .3456 again.
Hot tub example
Find the distribution of the number of customers out of 4 that buy electric hot tubs; 40% buy electric, 60% gas.
Solution continued
• P(X = 3) = 4(.4)(.4)(.4)(.6) = .1536.
• P(X = 4) = (.4)(.4)(.4)(.4) = .0256.
• Summarizing gives the following table:
X       0      1      2      3      4
P(X)  .1296  .3456  .3456  .1536  .0256
Double check that the P(X) sum to 1!
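The enumeration above can be checked mechanically. The following is a minimal Python sketch, not part of the original lecture; the function name `electric_count_distribution` and the constants are illustrative choices. It lists all 16 sequences of E and G, multiplies the probabilities along each sequence (using independence), and adds them up by the number of E's.

```python
from itertools import product

P_ELECTRIC = 0.4  # from the example: 40% of customers buy electric
P_GAS = 0.6       # from the example: 60% of customers buy gas


def electric_count_distribution(n_customers=4):
    """Enumerate every E/G sequence and total the probabilities by number of E's."""
    dist = {k: 0.0 for k in range(n_customers + 1)}
    for seq in product("EG", repeat=n_customers):
        prob = 1.0
        for buyer in seq:
            # independence: multiply the per-customer probabilities
            prob *= P_ELECTRIC if buyer == "E" else P_GAS
        dist[seq.count("E")] += prob
    return dist


dist = electric_count_distribution()
for k, p in dist.items():
    print(k, round(p, 4))
print("total:", round(sum(dist.values()), 4))
```

The printed values should agree with the table, and the total should come out as 1 up to floating-point rounding.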
Hot tub example
The distribution of the number of customers out of 4 that buy electric hot tubs is given by the following table. Find the probability that out of 4 customers at least 2 buy electric hot tubs.
X       0      1      2      3      4
P(X)  .1296  .3456  .3456  .1536  .0256
Solution
P(X ≥ 2) = P(X = 2) + P(X = 3) + P(X = 4)
= .3456 + .1536 + .0256
= .5248
There is a 52.5% chance that at least two out of four customers buy
electric hot tubs.
Continuous Probability Distributions
If one constructs histograms of the actual amount of water in collections of 1 liter bottles of water from a factory, they might observe something like the following.
[Figure] Sample 1: Water contents from 100 bottles measured to the nearest .1 milliliter (mL).
[Figure] Sample 2: Water contents from 100 bottles measured to the nearest .01 mL.
[Figure] Limiting curve as the accuracy of the water content measurement increases.
Probability distributions for continuous random variables
A probability distribution for a continuous random variable x is specified by a function f(x) called the density function, which satisfies
1. f(x) ≥ 0
2. The total area under the density curve is equal to 1:
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1. \]
Probabilities: The probability that x falls in any particular interval is the area under the density curve that lies above the interval.
[Figure: P(x < a), the area under the density curve to the left of a]
Notice that for a continuous random variable x, P(x = a) = 0 for any specific value a because the “area above a point” under the curve is a line segment and hence has no area. This means P(x < a) = P(x ≤ a).
More examples of probabilities
[Figures: P(a < x < b), the area under the density curve between a and b]
P(a < x < b) = P(a ≤ x < b) = P(a < x ≤ b) = P(a ≤ x ≤ b)
Method of probability calculation
The probability that a continuous random variable x lies between a lower limit a and an upper limit b is
P(a < x < b) = (area to the left of b) - (area to the left of a)
= P(x < b) - P(x < a)
[Figure: the area between a and b equals the area to the left of b minus the area to the left of a]
Example: You’re late
Define a continuous random variable x by x = the number of minutes by which the Professor starts the class late. Suppose that x has a probability distribution with density function
.25 2  x  6
f(x)  
 0 otherwise
The graph looks like
[Figure: f(x) plotted against x = number of minutes, with x running from 1 to 7; the density has height .25]
So this professor is always late.
Example: You’re late
Find the probability that the professor is between 3 and 4.5 minutes late.
[Figure: the density f(x) of height .25, with the region between 3 and 4.5 minutes shaded]
The probability is represented by the orange shaded area in the graph. Since that shaded area is a rectangle, area = (base)(height)=(1.5)(.25) = .375.
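The base-times-height calculation can be written as a small helper. This is a sketch under the assumption that the density is constant on an interval from a to b, as in the lateness example; the function name `uniform_interval_probability` is hypothetical and not from the lecture.

```python
def uniform_interval_probability(lo, hi, a=2.0, b=6.0):
    """P(lo < x < hi) when x has constant density 1/(b - a) between a and b."""
    height = 1.0 / (b - a)                 # .25 in the lateness example
    left, right = max(lo, a), min(hi, b)   # clip to where the density is nonzero
    base = max(0.0, right - left)          # empty overlap gives probability 0
    return base * height


print(uniform_interval_probability(3, 4.5))  # base 1.5 times height .25
print(uniform_interval_probability(0, 10))   # whole density, total area
print(uniform_interval_probability(0, 1))    # the professor is never this early
```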
Mean, standard deviation
The mean value of a random variable x, denoted by μx or μ, describes where the probability distribution of x is centered.
[Figure label: Larger mean]
The standard deviation of a random variable x, denoted by σx or σ, describes variability in the probability distribution.
[Figure label: Smaller standard deviation]
Discrete random variables
The mean of a discrete random variable X is calculated as:
\[ \mu_X = \sum_{\text{all values of } x} x\,p(x). \]
The variance is:
\[ \sigma_X^2 = \sum_{\text{all values of } x} (x - \mu_X)^2\,p(x) \]
and the standard deviation (sd) is \( \sigma_X = \sqrt{\sigma_X^2} \).
Mean, variance of a linear function
If x is a random variable with mean \( \mu_x \) and variance \( \sigma_x^2 \), and a and b are numerical constants, the random variable y defined by y = a + bx is called a linear function of the random variable x.
The mean of y = a + bx is \( \mu_y = \mu_{a+bx} = a + b\mu_x \).
The variance of y is \( \sigma_y^2 = \sigma_{a+bx}^2 = b^2\sigma_x^2 \).
From which it follows that the standard deviation of y is \( \sigma_y = \sigma_{a+bx} = |b|\,\sigma_x \), where |b| is the absolute value of b.
Example: Car sales
A car dealership would like to evaluate its average daily costs (and standard deviation). The daily cost depends directly on how many salesmen are working on a given day. Specifically the cost of doing business on a day involves a fixed cost of $255 plus an additional cost of $110 for each salesman working.
30% of the time just one salesman is working,
40% of the time two salesmen are working,
20% of the time three salesmen are working
10% of the time four salesmen are working.
How do we solve this problem?
Example: Car sales
Specifically the cost of doing business on a day involves a fixed cost of $255 plus an additional cost of $110 for each salesman working.
30% of the time just one salesman is working,
40% of the time two salesmen are working,
20% of the time three salesmen are working
10% of the time four salesmen are working.
There are two variables here, the cost and the number of salesmen working, and they are related to each other. This is a linear function problem!
Example: Car sales
30% of the time just one salesman is working,
40% of the time two salesmen are working,
20% of the time three salesmen are working
10% of the time four salesmen are working.
This is a given distribution, so it must be the
x variable. Define x to be the number of
salesmen working.
x      1    2    3    4
p(x)  0.3  0.4  0.2  0.1
Specifically the cost of doing business on a day involves a fixed cost of $255 plus an additional cost of $110 for each salesman working.
y must be the cost since it depends on x, specifically: y = 255 + 110 x
Example: Car sales
First, calculate the mean of x using the formula for discrete random variables.
x     p(x)   x p(x)
1     0.3    0.3
2     0.4    0.8
3     0.2    0.6
4     0.1    0.4
Sum          2.1
\[ \mu_x = 2.1 \]
On average 2.1 salesmen are working on a given day.
According to the linear function formula, the mean of y = 255 + 110x is
\[ \mu_y = \mu_{255+110x} = 255 + 110\,\mu_x = 255 + 110(2.1) = 486 \]
The average daily cost for the dealership is $486.
Example: Car sales
Calculate the variance of x using the formula for discrete random variables.
x     p(x)   (x - μ)² p(x)
1     0.3    0.3630
2     0.4    0.0040
3     0.2    0.1620
4     0.1    0.3610
Sum          0.8900
\[ \sigma_x^2 = 0.89, \qquad \sigma_x = \sqrt{0.89} = 0.9434 \]
Use the linear function formulas to obtain the standard deviation of y = 255 + 110x:
\[ \sigma_{255+110x}^2 = (110)^2\,\sigma_x^2 = (110)^2(0.89) = 10769 \]
\[ \sigma_{255+110x} = 110\,\sigma_x = 110(0.9434) = 103.77 \]
The typical daily cost for the dealership is $486 +/‐ $103.77.
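The car sales calculation can be reproduced in a few lines. This is an illustrative Python sketch, not the lecture's own code; the names `salesmen_pmf`, `mean` and `variance` are assumptions made for the example. It applies the discrete mean and variance formulas to x, then the linear function rules to y = 255 + 110x, and finally cross-checks the mean by working with the distribution of y directly.

```python
from math import sqrt

# Distribution of x = number of salesmen working (from the example)
salesmen_pmf = {1: 0.3, 2: 0.4, 3: 0.2, 4: 0.1}
FIXED_COST = 255          # a in y = a + b x
COST_PER_SALESMAN = 110   # b in y = a + b x


def mean(pmf):
    """Sum of x * p(x) over all values of x."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Sum of (x - mu)^2 * p(x) over all values of x."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


mu_x = mean(salesmen_pmf)
var_x = variance(salesmen_pmf)

# Linear function rules: mean a + b*mu_x, standard deviation |b|*sigma_x
mu_y = FIXED_COST + COST_PER_SALESMAN * mu_x
sd_y = abs(COST_PER_SALESMAN) * sqrt(var_x)

# Cross-check: build the distribution of the cost y itself
cost_pmf = {FIXED_COST + COST_PER_SALESMAN * x: p for x, p in salesmen_pmf.items()}

print(round(mu_x, 4), round(var_x, 4))
print(round(mu_y, 2), round(sd_y, 2))
print(round(mean(cost_pmf), 2), round(sqrt(variance(cost_pmf)), 2))
```

Both routes should give the same mean and standard deviation for the daily cost, which is the point of the linear function rules.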
Binomial distribution
Properties of a Binomial experiment
1. It consists of a fixed number n of observations called trials.
2. Each trial can result in one of only two mutually exclusive outcomes labeled success (S) and failure (F).
3. Outcomes of different trials are independent.
4. The probability that a trial results in S is the same for each trial.
Example: Flip a coin 10 times and count the number of times heads appears; this count is a binomial random variable X ~ Bin(10, 1/2).
A binomial random variable X is defined as the number of successes out of n trials: X takes on the values 0, 1, 2, …, n.
Binomial distribution
Let n = number of independent trials in a binomial experiment
π = constant probability that any particular trial results in a success.
Then
\[ P(x) = P(x \text{ successes among } n \text{ trials}) = \frac{n!}{x!\,(n-x)!}\,\pi^{x}(1-\pi)^{n-x} \]
Example: X ~ Bin(10, 1/2), so X can only be 0, 1, 2, …, up to 10 heads.
The expected number of heads out of 10 tosses is 10(1/2) = 5.
• n! = n × (n-1) × ... × 2 × 1. Use 0! = 1.
• E(x) = nπ
• Var(x) = nπ(1 - π)
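As a check on the binomial formula, here is a short Python sketch (the function names are illustrative, not from the lecture). It uses `math.comb` for n!/(x!(n-x)!), reproduces the hot tub table from earlier in the lecture as Bin(4, 0.4), and confirms E(x) = nπ and Var(x) = nπ(1 - π) for the coin example by summing over the distribution.

```python
from math import comb


def binomial_pmf(x, n, pi):
    """P(x successes among n independent trials, each with success probability pi)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


def binomial_mean_and_variance(n, pi):
    """Mean and variance computed directly from the probability distribution."""
    pmf = [binomial_pmf(x, n, pi) for x in range(n + 1)]
    mean = sum(x * p for x, p in enumerate(pmf))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))
    return mean, var


# Hot tub example: number of electric buyers among 4 customers
hot_tub = [binomial_pmf(x, 4, 0.4) for x in range(5)]
print([round(p, 4) for p in hot_tub])

# Coin example: X ~ Bin(10, 1/2)
coin_mean, coin_var = binomial_mean_and_variance(10, 0.5)
print(coin_mean, coin_var)
```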
Geometric distribution
Suppose an experiment consists of a sequence of trials with the same conditions as for a binomial experiment:
The trials are independent.
Each trial can result in one of two possible outcomes, success and failure.
The probability of success is the same for all trials.
x = number of trials until the first success is observed (including the success trial).
The probability distribution of x is called the geometric probability distribution:
\[ p(x) = (1-\pi)^{x-1}\,\pi, \qquad x = 1, 2, 3, \ldots \]
Example: What is the chance that a coin has to be flipped three times before a head appears, that is, the first head appears on the third flip?
X ~ Geo(1/2), P(X = 3) = (1/2)^2 (1/2) = 1/8 = .125.
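The geometric probabilities are easy to compute directly. A minimal Python sketch follows; the function name `geometric_pmf` is an illustrative choice. It evaluates the coin example and checks numerically that the probabilities over x = 1, 2, 3, … add up to (essentially) 1.

```python
def geometric_pmf(x, pi):
    """P(the first success occurs on trial x), for x = 1, 2, 3, ..."""
    if x < 1:
        raise ValueError("x counts trials, so it must be at least 1")
    # x - 1 failures followed by one success
    return (1 - pi) ** (x - 1) * pi


p_third_flip = geometric_pmf(3, 0.5)
partial_sum = sum(geometric_pmf(x, 0.5) for x in range(1, 51))

print(p_third_flip)
print(partial_sum)
```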
Normal distribution
Two numbers determine a normal distribution
1. Mean µ
2. Standard deviation σ
Probability density function
\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \qquad -\infty < x < \infty \]
Cumulative distribution function
\[ F(x) = \int_{-\infty}^{x} f(u)\,du \]
Warning! Some texts write N(µ, σ²), some N(µ, σ).
[Figure: Normal Distributions, σ = 1, curves for several values of µ]
[Figure: Normal Distributions, µ = 0, curves for several values of σ]
Standard normal distribution
A normal distribution with mean 0 and standard deviation 1 is called the standard (or standardized) normal distribution. Symmetric around 0.
Half of the area (0.50) lies below 0 and half above 0.
Normal tables
based on a standard Normal distribution
Bottom half of the Normal table (negative z values)
Example: P(z < -1.96) = .0250
z*      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
-3.8  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
-3.7  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
-3.6  0.0002  0.0002  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001  0.0001
-3.5  0.0002  0.0002  0.0002  0.0002  0.0002  0.0002  0.0002  0.0002  0.0002  0.0002
-3.4  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0002
-3.3  0.0005  0.0005  0.0005  0.0004  0.0004  0.0004  0.0004  0.0004  0.0004  0.0003
-3.2  0.0007  0.0007  0.0006  0.0006  0.0006  0.0006  0.0006  0.0005  0.0005  0.0005
-3.1  0.0010  0.0010  0.0009  0.0009  0.0008  0.0008  0.0008  0.0008  0.0007  0.0007
-3.0  0.0013  0.0013  0.0013  0.0012  0.0012  0.0011  0.0011  0.0011  0.0010  0.0010
-2.9  0.0019  0.0019  0.0018  0.0017  0.0016  0.0016  0.0015  0.0015  0.0014  0.0014
-2.8  0.0026  0.0025  0.0024  0.0023  0.0023  0.0022  0.0021  0.0021  0.0020  0.0019
-2.7  0.0035  0.0034  0.0033  0.0032  0.0031  0.0030  0.0029  0.0028  0.0027  0.0026
-2.6  0.0047  0.0046  0.0044  0.0043  0.0041  0.0040  0.0039  0.0038  0.0037  0.0036
-2.5  0.0062  0.0061  0.0059  0.0057  0.0055  0.0054  0.0052  0.0051  0.0049  0.0048
-2.4  0.0082  0.0080  0.0078  0.0075  0.0073  0.0071  0.0069  0.0068  0.0066  0.0064
-2.3  0.0107  0.0104  0.0102  0.0099  0.0096  0.0094  0.0091  0.0089  0.0087  0.0084
-2.2  0.0139  0.0136  0.0132  0.0129  0.0125  0.0122  0.0119  0.0116  0.0113  0.0110
-2.1  0.0179  0.0174  0.0170  0.0166  0.0162  0.0158  0.0154  0.0150  0.0146  0.0143
-2.0  0.0228  0.0222  0.0217  0.0212  0.0207  0.0202  0.0197  0.0192  0.0188  0.0183
-1.9  0.0287  0.0281  0.0274  0.0268  0.0262  0.0256  0.0250  0.0244  0.0239  0.0233
-1.8  0.0359  0.0351  0.0344  0.0336  0.0329  0.0322  0.0314  0.0307  0.0301  0.0294
-1.7  0.0446  0.0436  0.0427  0.0418  0.0409  0.0401  0.0392  0.0384  0.0375  0.0367
-1.6  0.0548  0.0537  0.0526  0.0516  0.0505  0.0495  0.0485  0.0475  0.0465  0.0455
-1.5  0.0668  0.0655  0.0643  0.0630  0.0618  0.0606  0.0594  0.0582  0.0571  0.0559
-1.4  0.0808  0.0793  0.0778  0.0764  0.0749  0.0735  0.0721  0.0708  0.0694  0.0681
-1.3  0.0968  0.0951  0.0934  0.0918  0.0901  0.0885  0.0869  0.0853  0.0838  0.0823
-1.2  0.1151  0.1131  0.1112  0.1093  0.1075  0.1056  0.1038  0.1020  0.1003  0.0985
-1.1  0.1357  0.1335  0.1314  0.1292  0.1271  0.1251  0.1230  0.1210  0.1190  0.1170
-1.0  0.1587  0.1562  0.1539  0.1515  0.1492  0.1469  0.1446  0.1423  0.1401  0.1379
-0.9  0.1841  0.1814  0.1788  0.1762  0.1736  0.1711  0.1685  0.1660  0.1635  0.1611
-0.8  0.2119  0.2090  0.2061  0.2033  0.2005  0.1977  0.1949  0.1922  0.1894  0.1867
-0.7  0.2420  0.2389  0.2358  0.2327  0.2296  0.2266  0.2236  0.2206  0.2177  0.2148
-0.6  0.2743  0.2709  0.2676  0.2643  0.2611  0.2578  0.2546  0.2514  0.2483  0.2451
-0.5  0.3085  0.3050  0.3015  0.2981  0.2946  0.2912  0.2877  0.2843  0.2810  0.2776
-0.4  0.3446  0.3409  0.3372  0.3336  0.3300  0.3264  0.3228  0.3192  0.3156  0.3121
-0.3  0.3821  0.3783  0.3745  0.3707  0.3669  0.3632  0.3594  0.3557  0.3520  0.3483
-0.2  0.4207  0.4168  0.4129  0.4090  0.4052  0.4013  0.3974  0.3936  0.3897  0.3859
-0.1  0.4602  0.4562  0.4522  0.4483  0.4443  0.4404  0.4364  0.4325  0.4286  0.4247
-0.0  0.5000  0.4960  0.4920  0.4880  0.4840  0.4801  0.4761  0.4721  0.4681  0.4641
Normal tables
Top half of the Normal table (positive z values)
z*      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0   0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1   0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2   0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3   0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4   0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5   0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6   0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7   0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8   0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9   0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0   0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1   0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2   0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3   0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4   0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
3.5   0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998  0.9998
3.6   0.9998  0.9998  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.7   0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
3.8   0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999  0.9999
Calculations using the standard normal distribution
P(z < 1.83) = 0.9664
P(z > 1.83) = 1 – P(z < 1.83)
= 1 – 0.9664
= 0.0336
Symmetry: P(z < ‐1.83) = P(z > 1.83) = 0.0336
Intervals: P(‐1.83 < z < 1.83) = P(z < 1.83) ‐ P(z < ‐1.83)
= 0.9664 – 0.0336 = 0.9328
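The same areas can be computed without a table. The sketch below is an illustration, not part of the lecture; it uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8), whose `cdf` method returns the area to the left of a value. The results should match the table entries up to rounding in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal distribution

below = Z.cdf(1.83)                   # P(z < 1.83)
above = 1 - below                     # P(z > 1.83)
lower_tail = Z.cdf(-1.83)             # P(z < -1.83), equal to `above` by symmetry
between = Z.cdf(1.83) - Z.cdf(-1.83)  # P(-1.83 < z < 1.83)

print(round(below, 4), round(above, 4), round(lower_tail, 4))
print(between)
```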
Finding quantiles
Using the standard Normal tables, find the z value that satisfies the following:
The point z with 98% of the observations falling below it.
The closest entry in the table to 0.9800 is 0.9798, corresponding to a z value of 2.05.
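A table lookup gives the quantile only to two decimals. As a hedged illustration (not from the lecture), the `inv_cdf` method of `statistics.NormalDist` returns the z value for a given area to its left, so the table answer can be compared with a more precise value.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

z_98 = Z.inv_cdf(0.98)        # z with 98% of the area below it
area_at_table_z = Z.cdf(2.05)  # area below the table answer 2.05

print(round(z_98, 4))
print(round(area_at_table_z, 4))
```

The precise quantile is slightly larger than 2.05 because the table entry .9798 is slightly below .98.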
Normal tables
We need an area of .98. In the table of positive z values (the top half of the Normal table shown earlier), .9798 is the closest probability to .98.
Row # + Column #
= 2.0 + .05
= 2.05 is the answer
Standardization
μ = true population mean
σ = true population standard deviation
The formula
\[ z = \frac{x - \mu}{\sigma} \]
gives the number of standard deviations that x is from the mean.
• Analogous to the z-score (Ch. 4), which used the sample mean and standard deviation.
• If x is normally distributed, z follows the standard normal distribution N(0,1).
Finding normal probabilities
To calculate probabilities for any Normal distribution, standardize the relevant values and then use the table of z curve areas. More specifically, if x is a variable whose behavior is described by a Normal distribution with mean μ and standard deviation σ, then
\[ P(x < b) = P\left(\frac{x - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P(z < b^{*}) \]
where z ~ N(0,1) and \( b^{*} = \frac{b - \mu}{\sigma} \) is the number you should look up in the standard Normal table.
Example : Picante sauce
A company produces “20 ounce” jars of picante sauce. Suppose the contents of the “20 ounce” jars follow a N(20.2, 0.125²) distribution. What proportion of the jars are under-filled, defined as having less than 20 ounces of sauce?
Solution
\[ z = \frac{x - \mu}{\sigma} = \frac{20 - 20.2}{0.125} = -1.60 \]
Looking up -1.60 in the standard normal table yields P(Z ≤ -1.60) = 0.0548.
5.48% of the jars contain less than 20 ounces of sauce.
Example: Picante sauce
99% of the jars contain more than what amount of sauce? Recall, μ = 20.2 ounces and σ = 0.125 ounces.
Solution: If 99% of the jars contain more than x, then 1% contain less, so we need the z value with area 0.0100 to its left. When we try to look up 0.0100 in the body of the table we do not find this value. We do find the following:
z       0.00    0.01    0.02    0.03    0.04    0.05
-2.4  0.0082  0.0080  0.0078  0.0075  0.0073  0.0071
-2.3  0.0107  0.0104  0.0102  0.0099  0.0096  0.0094
-2.2  0.0139  0.0136  0.0132  0.0129  0.0125  0.0122
The entry closest to 0.0100 is 0.0099, which corresponds to the z value -2.33.
Since the relationship between the real scale (x) and the z scale is z = (x - μ)/σ, we solve for x, getting x = μ + zσ:
x = 20.2 + (-2.33)(0.125) = 19.90875 ≈ 19.91.
99% of all “20 oz” jars contain more than 19.91 ounces (oz).
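Both picante sauce questions can be answered without standardizing by hand. The following Python sketch is illustrative only; it uses `statistics.NormalDist` with the example's mean and standard deviation, and the variable names are assumptions. `cdf` gives the proportion of under-filled jars and `inv_cdf` gives the fill level exceeded by 99% of jars.

```python
from statistics import NormalDist

# Fill amounts from the example: mean 20.2 oz, standard deviation 0.125 oz
jars = NormalDist(mu=20.2, sigma=0.125)

z_20 = (20 - jars.mean) / jars.stdev  # standardized value of 20 oz
under_filled = jars.cdf(20)           # proportion with less than 20 oz
cutoff_99 = jars.inv_cdf(0.01)        # 1% below this amount, so 99% above

print(round(z_20, 2), round(under_filled, 4))
print(round(cutoff_99, 2))
```

The cutoff differs from the hand calculation only beyond the second decimal, since the table forces z to be rounded to -2.33.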
Properties of the sampling distribution of the sample mean
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \]
where X_1, X_2, ..., X_n are independent with mean μ and variance σ².
Rule 1: \( \mu_{\bar{X}} = \mu \)
Rule 2: \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \)
Rule 3: When the X_i are normally distributed for i = 1, ..., n, then \( \bar{X} \) is also normal.
Rule 4 (Central Limit Theorem): When the X_i are not normally distributed but n is large (≥ 30), then \( \bar{X} \) is also approximately normal.
[Figure: Symmetric distributions. Panels: Population, n = 4, n = 9, n = 16]
[Figure: Skewed distributions. Panels: Population, n = 4, n = 10, n = 30]
Central Limit Theorem
If n is large (> 30) or the population distribution is normal, the standardized variable
\[ z = \frac{\bar{x} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \]
has approximately a standard normal distribution, N(0,1).
Example: Cereal
A food company sells “18 ounce” boxes of cereal. Let x denote the actual amount of cereal in a box of cereal. Suppose that x is normally distributed with µ = 18.03 ounces and σ = 0.05.
a) What proportion of the boxes will contain less than 18 ounces?
Solution
\[ P(x < 18) = P\left(z < \frac{18 - 18.03}{0.05}\right) = P(z < -0.60) = 0.2743 \]
Note! You did not need the central limit theorem here.
Example: Cereal
b) A case consists of 24 boxes of cereal. What is the probability that the mean amount of cereal across the 24 boxes is less than 18 ounces?
Now you need the central limit theorem!
\[ P(\bar{x} < 18) = P\left(z < \frac{18 - 18.03}{0.05/\sqrt{24}}\right) = P(z < -2.94) = 0.0016 \]
This chance is much less than for a single box (p=0.27).
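The contrast between a single box and the mean of 24 boxes can be sketched in Python. This is an illustration under the example's assumptions (normal box contents, independent boxes), not code from the lecture; the only change between the two calculations is that the standard deviation is divided by the square root of n.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA, N_BOXES = 18.03, 0.05, 24  # values from the cereal example

single_box = NormalDist(mu=MU, sigma=SIGMA)
case_mean = NormalDist(mu=MU, sigma=SIGMA / sqrt(N_BOXES))  # sampling distribution of the mean

p_single = single_box.cdf(18)  # P(x < 18) for one box
p_mean = case_mean.cdf(18)     # P(sample mean < 18) for 24 boxes
z_mean = (18 - MU) / (SIGMA / sqrt(N_BOXES))

print(round(p_single, 4))
print(round(z_mean, 2), round(p_mean, 4))
```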
Properties of the sampling distribution of p
Let p be the proportion of successes in a random sample of size n from a population whose proportion of successes is π.
Denote the mean of p by μ_p and the standard deviation by σ_p.
Then the following rules hold.
Rule 1: \( \mu_p = \pi \)
Rule 2: \( \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} \)
Rule 3: When n is large and π is not too near 0 or 1, the sampling distribution of p is approximately Normal.
Informal Rule
If both nπ ≥ 10 and n(1-π) ≥ 10, then it is safe to use a Normal approximation.
Example: Defectives
If the true proportion of defectives produced by a certain manufacturing process is 0.08 and a sample of 400 is chosen, what is the probability that the proportion of defectives in the sample is greater than 0.10?
Solution: Since nπ = 400(0.08) = 32 > 10 and n(1-π) = 400(0.92) = 368 > 10, it is reasonable to use the normal approximation.
\[ \mu_p = \pi = 0.08 \]
\[ \sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}} = \sqrt{\frac{0.08(1-0.08)}{400}} = 0.013565 \]
\[ z = \frac{p - \mu_p}{\sigma_p} = \frac{0.10 - 0.08}{0.013565} = 1.47 \]
\[ P(p > 0.1) = P(z > 1.47) = 1 - P(z < 1.47) = 1 - 0.9292 \ \text{(from the standard normal table)} = 0.0708 \]
There is a 7.1% chance that the proportion of defectives exceeds 0.10.
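The defectives calculation follows the same pattern in code. The Python sketch below is illustrative, with variable names chosen for the example; it checks the informal rule, computes the standard deviation of the sample proportion, and evaluates the upper tail with `statistics.NormalDist`. Because z is not rounded to 1.47 here, the result comes out slightly below the table-based .0708.

```python
from math import sqrt
from statistics import NormalDist

n, pi = 400, 0.08  # sample size and true proportion of defectives
threshold = 0.10   # sample proportion of interest

# Informal rule for using the Normal approximation
approximation_ok = n * pi >= 10 and n * (1 - pi) >= 10

sigma_p = sqrt(pi * (1 - pi) / n)      # standard deviation of the sample proportion
z = (threshold - pi) / sigma_p
p_exceeds = 1 - NormalDist().cdf(z)    # P(p > 0.10) under the approximation

print(approximation_ok)
print(round(sigma_p, 6), round(z, 2), round(p_exceeds, 3))
```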
Tutorial 5a: Consulting times
The amount of time spent by a statistical consultant with a client is a random variable having a Normal distribution with mean 60 minutes and standard deviation 10 minutes. Use the Normal tables to help answer the questions below.
1.) What is the probability that more than 45 minutes is spent with the consultant?
2.) What amount of time is exceeded by only 10% of all clients at a meeting?
3.) If the consultant assesses a fixed charge of 10 euros (overhead) and then 50 euros per hour, what is the average cost of a meeting?
Tutorial 5b: Playing catch
Sophie is a dog that loves to play catch. Unfortunately she is not very good and only catches 10% of all the balls thrown to her.
1.) Let x be the number of times you have to throw a ball to Sophie before she catches it. What is the distribution of x? Calculate the probability that it takes two throws before Sophie catches the ball, i.e., she misses the first but catches the second ball.
2.) Suppose you will throw the ball five times to Sophie and let x be the number of times out of five that she catches the ball. What is the distribution of x? What is the chance she catches none of the five balls? Does this result make intuitive sense? Why?