Class 04. Wunderdog and the
Normal Distribution
EMBS Section 6.2
Class 03 Assignment
• Answers are posted on the course website
• My office hours are IN THE CLASSROOM.
– 3 to 4:30 on class days
– Or email me for an appointment
[email protected]
• TA Office hours
– Sundays and Tuesday Nights
– MCoB 266
– 7 to 8:30 pm
What we learned last class
• Hypothesis Testing
– H0: She is guessing
– Randomized double blind experiment
– Test statistic: number correct
– Specify α=0.05 (level of significance)
– Observe 7 correct
– P(X≥7│H0) = 1-BINOMDIST(6,10,.5,true) = 0.17
– Since this p-value > α, the result is NOT statistically significant.
Case: Wunderdog Sports Picks
Wunderdog is just like LTT?
• H0: He is guessing (p=.5, independent events)
• Ha: He is skillful (p>.5)
• Test statistic: Number correct = 87.
• P(X≥87│H0) = 1 – BINOMDIST(86,149,.5,true) = 0.024
• Conclusion: Statistically significant at the α=0.05 level.
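The same upper-tail calculation can be sketched outside Excel. This is a minimal Python illustration, not part of the original slides; the helper name `binom_cdf` is an assumption introduced here to play the role of BINOMDIST(k, n, p, true).

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); stands in for BINOMDIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Under H0 (guessing, p = 0.5) with n = 149 picks:
# P(X >= 87) = 1 - P(X <= 86)
p_value = 1 - binom_cdf(86, 149, 0.5)
alpha = 0.05

print(f"p-value = {p_value:.4f}")
print("statistically significant" if p_value < alpha else "not statistically significant")
```

The subtraction uses 86, not 87, because P(X≥87) must include the outcome 87 itself.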
Wunderdog
• X is number correct
• X is binomial, n=149, p=0.5, if H0 is true.
• Mean = E(X) = n*p = 74.5
• Variance = Var(X) = n*p*(1-p) = 37.25
• Standard deviation = 37.25^.5 = 6.1
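As a quick check of these formulas, the sketch below computes the mean, variance, and standard deviation from n and p, and then recomputes the mean directly from the binomial probabilities. It is an illustration in Python, not something from the slides, and the variable names are assumptions.

```python
from math import comb, sqrt

n, p = 149, 0.5

# Shortcut formulas for a binomial random variable
mean = n * p                   # E(X) = n*p
variance = n * p * (1 - p)     # Var(X) = n*p*(1-p)
sd = sqrt(variance)            # standard deviation

# Cross-check the mean the long way: sum of x * P(X = x) over all outcomes
mean_from_pmf = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))

print(mean, variance, round(sd, 1))
print(round(mean_from_pmf, 6))
```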
Binomial pmf with n=149, p=0.5
Each possible outcome x has a mass of probability calculated as BINOM.DIST(x,149,.5,false)
[Figure: bar chart of P(x) against x = number of successes]
As n gets big, the binomial “looks like”
the normal (bell-shaped curve)
• So if n is big, we sometimes use the normal
distribution to approximate the binomial.
– X is actually binomial.
– It would be better to use BINOMDIST
– But the probabilities we calculate come out pretty
much the same if we use the appropriate normal
distribution.
Binomial distributions for n=149
All three are “bell-shaped curves”
[Figure: three panels (P=0.5, P=0.2, P=0.8), each plotting P(x) against x = number correct]
[Figure, repeated on two slides: the binomial P(x) against x = number correct, and the curve f(x) against x]
The Normal Distribution
• X is continuous
• Applies to LOTS of random variables
• Parameters are mean μ and the standard
deviation σ.
– Mean or E(X) = μ
– Variance = σ²
– Standard deviation = σ
– Symmetric: mean = median = mode (all = μ)
EMBS Fig 6.4, p 249
To calculate probabilities
• P(X=x) = 0
• P(X≤x) = NORMDIST(x,μ,σ,true)
• P(X<x) = NORMDIST(x,μ,σ,true)
[Figure: binomial P(x) against x = number correct, with Mean = E(X) = 74.5 and standard deviation = 6.1]
[Figure: normal f(x) against x, with μ=74.5, σ=6.1]
Just like the binomial, the normal is a FAMILY of distributions. The member of the Normal family we want to use is the one with the mean and standard deviation that match our binomial.
[Figure: binomial P(x) against x = number correct]
X is discrete
P(x≥87) = 1-BINOMDIST(86,149,.5,true) = 0.024
P(x=87) = 0.008
[Figure: normal f(x) against x]
X is continuous
P(x≥87) = P(x>87) = 1-NORMDIST(87,74.5,6.1,true) = 0.020
P(x=87) = 0
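To see how close the two answers are, the sketch below computes the exact binomial tail and the matching normal tail side by side. It is a Python illustration using the standard library's `statistics.NormalDist`, not the author's spreadsheet; the variable names are assumptions, and σ is left unrounded here where the slides round it to 6.1.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 149, 0.5
mu = n * p                       # 74.5
sigma = sqrt(n * p * (1 - p))    # about 6.1

def binom_pmf(x: int) -> float:
    """P(X = x) for the binomial; the role of BINOM.DIST(x, n, p, FALSE)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Discrete: add up the probability masses at 87, 88, ..., 149
exact_tail = sum(binom_pmf(x) for x in range(87, n + 1))

# Continuous: area under the matching normal curve to the right of 87
normal_tail = 1 - NormalDist(mu, sigma).cdf(87)

print(f"binomial P(X >= 87) = {exact_tail:.4f}")
print(f"normal   P(X >  87) = {normal_tail:.4f}")
print(f"binomial P(X = 87)  = {binom_pmf(87):.4f}")
```

The two tails differ slightly because the normal curve spreads probability over every real number, while the binomial places it only on whole numbers.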
To calculate probabilities
• P(X=x) = 0
• P(X≤x) = NORMDIST(x,μ,σ,true)
• P(X<x) = NORMDIST(x,μ,σ,true)
• P(X>x) = 1 – NORMDIST(x,μ,σ,true)
• P(x1<X<x2) = NORMDIST(x2, μ,σ,true)-NORMDIST(x1,μ,σ,true)
Weights of CEOs are normally distributed with µ = 155 and σ = 25. What
percentage of CEOs do we expect to weigh between 160 and 200?
=NORMDIST(200,155,25,true)-NORMDIST(160,155,25,true)
= 0.964 – 0.579
= 0.385
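The CEO calculation can be reproduced with a short Python sketch. This is an illustration only; `NormalDist(...).cdf(x)` from the standard library is used here as a stand-in for NORMDIST(x,μ,σ,true), and the variable names are assumptions.

```python
from statistics import NormalDist

# Weights of CEOs: normal with mean 155 and standard deviation 25
weights = NormalDist(mu=155, sigma=25)

# P(160 < X < 200) = P(X <= 200) - P(X <= 160)
p_below_200 = weights.cdf(200)
p_below_160 = weights.cdf(160)
p_between = p_below_200 - p_below_160

print(f"{p_below_200:.3f} - {p_below_160:.3f} = {p_between:.3f}")
```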
To go backwards from a p to an x
• To find the x value such that P(X<x) = p, use
=NORMINV(p,μ,σ)
EMBS problem 21, page 260
A person must score in the top 2% of the population on an IQ test to qualify for
membership in MENSA (U.S. Airways Attache, September 2000). If the population
of IQ scores is normal with mean of 100 and standard deviation of 15, what score
qualifies one for MENSA?
We want the score, x, such that the probability(X<x) is 0.98.
=NORMINV(.98,100,15)
= 130.8
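Going backwards from a probability to a score can also be sketched in Python. In this illustration, `NormalDist.inv_cdf` from the standard library plays the role of NORMINV; the variable names are assumptions.

```python
from statistics import NormalDist

# IQ scores: normal with mean 100 and standard deviation 15
iq = NormalDist(mu=100, sigma=15)

# Top 2% means 98% of the population scores below the cutoff
cutoff = iq.inv_cdf(0.98)

# Going forwards again should recover the probability we started from
check = iq.cdf(cutoff)

print(f"cutoff = {cutoff:.1f}, P(X < cutoff) = {check:.2f}")
```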
Fun facts about the normal
distribution
• Let X be normal with mean μ and standard deviation σ.
– X ~ N(μ,σ)
• If Y = a + b*X (with b ≠ 0)
• Then Y will be normal with mean a+b*μ and standard deviation |b|*σ
– Y ~ N(a+b*μ,|b|*σ)
• So if weight in pounds is normal, weight in kilograms will also be
normal.
• If Temperature in degrees F is Normal, temperature in degrees C
will also be normal.
• If I add 10 points to all exams, I add ten points to the mean but do
not change the standard deviation.
• If I multiply all scores by 1.5, I multiply the mean and the standard
deviation by 1.5.
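The exam examples can be tried out with a small Python sketch. The exam distribution below (mean 70, standard deviation 10) is a hypothetical example invented for illustration; it does not appear in the slides. The standard library's `NormalDist` supports adding and multiplying by constants, which corresponds to Y = a + b*X.

```python
from statistics import NormalDist

# Hypothetical exam scores (assumed values, for illustration only)
exam = NormalDist(mu=70, sigma=10)

# Add 10 points to every exam: the mean shifts, the spread does not
curved = exam + 10

# Multiply every score by 1.5: both the mean and the spread scale
scaled = exam * 1.5

# A negative multiplier flips the sign of the mean, but a standard
# deviation can never be negative, so it scales by the absolute value
flipped = exam * -1

print(curved.mean, curved.stdev)
print(scaled.mean, scaled.stdev)
print(flipped.mean, flipped.stdev)
```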
More Fun Facts
• There are a multitude of normal
distributions…one for each possible pair of μ
and σ values.
• But…they all follow the same “curve” and
have identical properties so that, in that
sense, there is only ONE normal distribution.
EMBS Fig 6.4, p 249
Before there was NORMDIST
• We asked everyone to convert their probability question about x into a probability question about z, because then we needed only ONE table of normal probabilities: the one that applied to z.
$z = \frac{x - \mu}{\sigma}$
z tells us where x is on its normal curve.
z is all we need to answer a probability question.
z is how far x is above/below the mean in units of standard deviation.
A changing world…
• We can use =NORMDIST(x,μ,σ,true) to answer
our probability questions.
• We used to have to use =NORMSDIST([x- μ]/σ)
NORMSDIST is the standard normal distribution. It uses z as the input, so we needed to calculate z in order to answer probability questions.
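The old route through z and the direct route give the same answer, which the sketch below checks for the Wunderdog numbers (x = 87, μ = 74.5, σ = 6.1). It is a Python illustration; `NormalDist()` with no arguments is the standard normal and stands in for NORMSDIST, and the variable names are assumptions.

```python
from statistics import NormalDist

mu, sigma, x = 74.5, 6.1, 87

# Direct route: ask the normal distribution with this mu and sigma
direct = NormalDist(mu, sigma).cdf(x)

# Old route: convert x to z, then ask the standard normal (mean 0, sd 1)
z = (x - mu) / sigma
via_z = NormalDist().cdf(z)

print(f"z = {z:.3f}")
print(f"direct = {direct:.4f}, via z = {via_z:.4f}")
```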
There is math and calculus behind all
this…
$$\text{NORMDIST}(x_1,\mu,\sigma,\text{true}) = \int_{-\infty}^{x_1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx$$
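To make the integral concrete, the sketch below approximates the area under the normal density with many thin rectangles (the midpoint rule) and compares it with the library's cumulative probability. It is an illustration only and not how Excel computes NORMDIST. The function names, the number of steps, and the choice of μ − 10σ as a stand-in for −∞ are all assumptions made here.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Height of the normal curve at x (the integrand in the formula)."""
    return exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def normal_cdf_numeric(x1: float, mu: float, sigma: float, steps: int = 20_000) -> float:
    """Approximate P(X <= x1) by summing thin rectangles under the curve.

    Assumes x1 is above mu - 10*sigma, which is used in place of minus
    infinity because the area further left is negligible.
    """
    lower = mu - 10 * sigma
    width = (x1 - lower) / steps
    heights = (normal_pdf(lower + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return sum(heights) * width

numeric = normal_cdf_numeric(87, 74.5, 6.1)
library = NormalDist(74.5, 6.1).cdf(87)
print(f"numeric = {numeric:.6f}, library = {library:.6f}")
```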