Download New Coke - STOR at UNC

Stor 155, Section 2, Last Time
• Prediction in Regression
  – Given new point X0, predict Y0
  – Confidence interval for mean
  – Prediction Interval for value
• Review…
Stat 31 Final Exam:
Date & Time: Tuesday, May 8, 8:00-11:00
Last Office Hours:
• Thursday, May 3, 12:00 - 5:00
• Monday, May 7, 10:00 - 5:00
• & by email appointment (earlier)
Bring with you, to exam:
• Single (8.5" x 11") sheet of formulas
• Front & Back OK
Review Slippery Issues
Major Confusion:
Population Quantities
Vs.
Sample Quantities
Response to a Request
You said at the end of today's class that you would be
willing to take class time to "reteach" concepts that
might still be unknown to us.
Well, in my case, it seems that probability and probability
distribution is a hard concept for me to grasp.
On the first midterm, I missed … and on the second
midterm, I missed …
I seem to be able to grasp the other concepts involving
binomial distribution, normal distribution, t-distribution, etc. fairly well, but probability is really
killing me on the exams.
If you could reteach these or brush up on them I would
greatly appreciate it.
Levels of Probability
• Simple Events
  – Big Rules of Prob (Not, And, Or)
  – Bayes Rule
• Distributions (in general)
  – Defined by Tables
    • Summary of discrete probs
    • Get probs by summing
  – Uniform
    • Get probs by finding areas
Levels of Probability
• Distributions (in general)
• Named (& Useful) Distributions
  – Binomial
    • Discrete distribution of Counts
    • Compute with BINOMDIST & Normal Approx.
  – Normal
    • Continuous distribution of Averages
    • Compute with NORMDIST & NORMINV
  – T
    • Similar to Normal, for estimated s.d.
    • Compute with TDIST & TINV
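For readers working outside Excel, here is a minimal Python sketch of the binomial calculation and its normal approximation. It is an illustration only: the helper names `binom_cdf` and `normal_cdf`, the example numbers (n = 100, p = 0.5, k = 55), and the use of a continuity correction are all assumptions, not taken from the notes.

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """P{X <= k} for X ~ Binomial(n, p); plays the role of a cumulative BINOMDIST."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mean, sd):
    """P{Z <= x} for Z ~ Normal(mean, sd); plays the role of a cumulative NORMDIST."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

# Hypothetical example values, chosen only for illustration
n, p, k = 100, 0.5, 55

exact = binom_cdf(k, n, p)
# Normal approximation: mean n*p, s.d. sqrt(n*p*(1-p)).
# The +0.5 is a continuity correction (an assumption of this sketch).
approx = normal_cdf(k + 0.5, n * p, sqrt(n * p * (1 - p)))

print(exact, approx)
```

With a sample size this large the two answers should agree closely, which is the point of the normal approximation.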
Detailed Look
Simple Events:
• Big Rules of Probability:
  – Not Rule ( 1 – P{opposite})
  – Or Rule (glasses – football)
  – And rule (multiply conditional prob’s)
  – Use in combination for real power
• Bayes Rule
  – Turn around conditional probabilities
  – Write hard ones in terms of easy ones
  – Recall surprising disease testing result
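As a reminder of how Bayes Rule turns a conditional probability around, here is a small Python sketch of a disease testing calculation. The numbers (1% of people have the disease, the test catches 95% of true cases, and 5% of healthy people test positive) are hypothetical and are not the ones used in class; the variable names are likewise only for illustration.

```python
from fractions import Fraction

# Hypothetical inputs (not the numbers from class)
p_disease = Fraction(1, 100)            # P{D}
p_pos_given_disease = Fraction(95, 100) # P{+ | D}
p_pos_given_healthy = Fraction(5, 100)  # P{+ | not D}

# Not Rule: P{not D} = 1 - P{D}
p_healthy = 1 - p_disease

# And Rule: P{+ & D} = P{+ | D} P{D}, and similarly for healthy people
p_pos_and_disease = p_pos_given_disease * p_disease
p_pos_and_healthy = p_pos_given_healthy * p_healthy

# Or Rule (disjoint): P{+} = P{+ & D} + P{+ & not D}
p_pos = p_pos_and_disease + p_pos_and_healthy

# Bayes Rule: the hard one, P{D | +}, written in terms of the easy ones
p_disease_given_pos = p_pos_and_disease / p_pos

print(p_disease_given_pos, float(p_disease_given_pos))
```

Under these assumed inputs, most people who test positive do not have the disease, because healthy people are so much more numerous.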
Detailed Look
• Distributions (in general)
  – Defined by Tables
    • Summary of discrete probs
    • Get probs by summing
    • Easy to forget after so much other stuff…
Studied in Notes: 2/20, 2/22, 3/1
Some highlights…
Highlights of Dist’ns in Tables
• Distributions (in general)
  – Defined by Tables
    • Summary of discrete probs
    • Get probs by summing
    • Easy to forget after so much other stuff…
Studied in Notes: 2/20, 2/22, 3/1
Some highlights…
Random Variables
Die rolling example, for X = “net winnings”:
Win $9 if 5 or 6, Pay $4 if 1, 2 or 4
Probability Structure of X is summarized by:
P{X = 9} = 1/3   P{X = -4} = 1/2   P{X = 0} = 1/6
Convenient form: a table
Winning   9     -4    0
Prob.     1/3   1/2   1/6
Summary of Prob. Structure
In general: for discrete X, summarize “distribution” (i.e. full prob. structure) by a table:
Values   x1   x2   …   xk
Prob.    p1   p2   …   pk
Where:
i. All pi are between 0 and 1
ii. $\sum_{i=1}^{k} p_i = 1$ (so get a prob. funct’n as above)
Summary of Prob. Structure
Summarize distribution, for discrete X, by a table:
Values   x1   x2   …   xk
Prob.    p1   p2   …   pk
Power of this idea:
• Get probs by summing table values
• Special case of disjoint OR rule
Summary of Prob. Structure
E.g. Die Rolling game above:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
P{X = 9} = 1/3
P{X < 2} = P{X = 0} + P{X = -4} = 1/6 + 1/2 = 2/3
P{X = 5} = 0 (not in table!)
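The "get probs by summing table values" idea can be sketched in a few lines of Python. This is only an illustration; the names `dist` and `prob` are made up for this sketch, and exact fractions are used so the answers match the hand calculation.

```python
from fractions import Fraction

# Distribution table for the die rolling game: value -> probability
dist = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

def prob(dist, event):
    """Sum the table probabilities of every value for which event(x) is true."""
    return sum((p for x, p in dist.items() if event(x)), Fraction(0))

p_less_than_2 = prob(dist, lambda x: x < 2)  # P{X < 2}
p_equals_5 = prob(dist, lambda x: x == 5)    # P{X = 5}: 5 is not in the table

print(p_less_than_2)  # 2/3
print(p_equals_5)     # 0
```

A value that does not appear in the table contributes nothing to the sum, which is why P{X = 5} comes out as 0.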
Summary of Prob. Structure
E.g. Die Rolling game above:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
$P\{X = 9 \mid X \ge 0\} = \frac{P\{X = 9 \ \&\ X \ge 0\}}{P\{X \ge 0\}} = \frac{P\{X = 9\}}{P\{X \ge 0\}} = \frac{1/3}{1/6 + 1/3} = \frac{1/3}{1/2} = \frac{2}{3}$
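The conditional probability above can be checked with a short Python sketch. It assumes the conditioning event is X ≥ 0 (consistent with the 1/6 + 1/3 in the denominator); the helper names `prob` and `cond_prob` are hypothetical.

```python
from fractions import Fraction

# Distribution table for the die rolling game: value -> probability
dist = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

def prob(event):
    """P{event}: sum the table probabilities where event(x) is true."""
    return sum((p for x, p in dist.items() if event(x)), Fraction(0))

def cond_prob(a, b):
    """P{a | b} = P{a & b} / P{b}."""
    return prob(lambda x: a(x) and b(x)) / prob(b)

# P{X = 9 | X >= 0}
answer = cond_prob(lambda x: x == 9, lambda x: x >= 0)
print(answer)  # 2/3
```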
Mean of Discrete Distributions
Frequentist approach to mean:
$\mu_X = \sum_{i=1}^{k} p_i x_i$
a weighted average of values
where weights are probabilities
Mean of Discrete Distributions
E.g. Above Die Rolling Game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
Mean of distribution =
= (1/3)(9) + (1/6)(0) + (1/2)(-4) = 3 - 2 = 1
Interpretation: on average (over large number
of plays) winnings per play = $1
Conclusion: should be very happy to play
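To see both the weighted-average formula and the "over a large number of plays" interpretation, here is a small Python sketch. The simulation (100,000 plays with a fixed random seed) and the names `play`, `mean` and `average` are assumptions added for illustration.

```python
import random
from fractions import Fraction

# Distribution table for the die rolling game: value -> probability
dist = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

# Mean = weighted average of the values, with the probabilities as weights
mean = sum(p * x for x, p in dist.items())
print(mean)  # 1

def play(rng):
    """One play of the game: roll a fair die and return the net winnings."""
    face = rng.randint(1, 6)
    if face in (5, 6):
        return 9
    if face in (1, 2, 4):
        return -4
    return 0  # remaining face: no money changes hands

# Average winnings over many simulated plays should land close to the mean
rng = random.Random(0)
n_plays = 100_000
average = sum(play(rng) for _ in range(n_plays)) / n_plays
print(average)
```

The simulated average will not equal 1 exactly, but it should be close, and closer still as the number of plays grows.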
Variance of Random Variables
So define:
Variance of a distribution
As:
$\sigma_X^2 = \sum_{j=1}^{k} p_j (x_j - \mu_X)^2$
where X is the random variable
Variance of Random Variables
E. g. above game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
$\sigma_X^2 = \frac{1}{2}(-4 - 1)^2 + \frac{1}{6}(0 - 1)^2 + \frac{1}{3}(9 - 1)^2$
=(1/2)*5^2+(1/6)*1^2+(1/3)*8^2
Note: one acceptable Excel form,
e.g. for exam (but there are many)
Standard Deviation
Recall standard deviation is square root of
variance (same units as data)
E. g. above game:
Winning   9     -4    0
Prob.     1/3   1/2   1/6
Standard Deviation
=sqrt((1/2)*5^2+(1/6)*1^2+(1/3)*8^2)
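The Excel expressions for the variance and standard deviation of the game can be mirrored in Python as a check. This is a minimal sketch; the variable names are chosen for this example only.

```python
from fractions import Fraction
from math import sqrt

# Distribution table for the die rolling game: value -> probability
dist = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

# Mean: weighted average of the values
mean = sum(p * x for x, p in dist.items())

# Variance: weighted average of squared distances from the mean
variance = sum(p * (x - mean) ** 2 for x, p in dist.items())

# Standard deviation: square root of the variance (same units as the winnings)
std_dev = sqrt(variance)

print(variance)           # 34
print(round(std_dev, 2))  # 5.83
```

So the game pays $1 per play on average, but with a standard deviation of nearly $6, individual plays vary a lot around that average.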
And Now for Something
Completely Different
Thought Provoking Movie…
http://www.aclu.org/pizza/
Review Slippery Issues
Major Confusion:
Population Quantities
Vs.
Sample Quantities
Recall Pepsi Challenge
In class taste test:
• Removed bias with randomization
• Double blind approach
• Asked which was:
  – Better
  – Sweeter
  – which
Recall Pepsi Challenge
Results summarized in
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stat155CokePepsiResults2007.xls
Recall Eyeball impressions:
a. Perhaps no consensus preference between Pepsi and Coke?
  – Is 54% "significantly different" from 50%?
Result of "marketing research"???
Recall Pepsi Challenge
b. Perhaps no consensus as to which is sweeter?
• Very different from the past, when Pepsi was noticeably sweeter
• This may have driven old Pepsi challenge phenomenon
• Coke figured this out, and matched Pepsi in sweetness
Recall Pepsi Challenge
c. Most people believe they know
  – Serious cola drinkers, because now flavor driven
  – In past, was sweetness driven, and there were many advertising-caused misperceptions!
d. People tend to get it right or not??? (less clear)
  – Overall 71% right. Seems like it, but again is that significantly different from 50%?
Recall Pepsi Challenge
e. Those who think they know tend to be right???
  – People who thought they knew: right 71% of the time
f. Those who don't think they know seem to be right as well. Wonder why?
  – People who didn't: also right 70% of time? Why? "Natural sampling variation"???
  – Any difference between people who thought they knew, and those who did not think so?
Recall Pepsi Challenge
g. Coin toss was fair (or is 57% heads significantly different from 50%?)
How accurate are those ideas?
• Will build tools to assess this
• Called “hypo tests” and “P-values”
• Revisit this now
Pepsi – Coke Taste Test
Data and Analysis:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stat155CokePepsiResults2007.xls
Hypothesis Tests:
• Proportions based (i.e. think about p)
• Interesting Hypos:
  $H_0: p = 0.5$
  $H_A: p \ne 0.5$
• Recall Sampling Distribution:
  $\hat{p} - p \sim N\left(0, \sqrt{\frac{p(1-p)}{n}}\right)$
Pepsi – Coke Taste Test
Data and Analysis:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stat155CokePepsiResults2007.xls
P-value: P{what saw or m.c. | p = 0.5}
Under assumption p = 0.5,
$\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.5(1-0.5)}{n}} = \frac{0.5}{\sqrt{n}} = \frac{1}{2\sqrt{n}}$
So compute P-value as:
[Figure: normal curve with Area marked at obs’d $\hat{p} - 0.5$]
Pepsi – Coke Taste Test
Data and Analysis:
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stat155CokePepsiResults2007.xls
Compute P-value as:
[Figure: normal curve with Area marked at obs’d $\hat{p} - 0.5$]
=2*(1-NORMDIST(ABS(phat-0.5),0,1/(2*SQRT(n)),TRUE))
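The same normal-approximation P-value can be sketched in Python. This assumes a two-sided test, to match the alternative p ≠ 0.5; the function name `two_sided_p_value` and the sample size n = 85 in the example are hypothetical (the actual class counts are in the spreadsheet).

```python
from math import erfc, sqrt

def two_sided_p_value(phat, n):
    """Two-sided P-value for H0: p = 0.5, using the normal approximation.

    Under H0 the s.d. of phat is sqrt(0.5 * 0.5 / n) = 1 / (2 * sqrt(n)).
    """
    sd = 1 / (2 * sqrt(n))
    z = abs(phat - 0.5) / sd
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)): the area in both tails
    return erfc(z / sqrt(2))

# Hypothetical example: 54% preferring one cola, out of an assumed 85 tasters
p_value = two_sided_p_value(0.54, 85)
print(p_value)
```

A P-value this far from 0 means a 54% split is well within what a true 50-50 preference could produce in a sample of that size.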
Pepsi – Coke Taste Test
Conclusions (P-values):
http://stat-or.unc.edu/webspace/postscript/marron/Teaching/stor155-2007/Stat155CokePepsiResults2007.xls
• No consensus, Pepsi vs. Coke (0.46)
• No consensus, Sweeter (0.81)
• Most think know (e-5, very strong)
• Get It Right (0.0006, very strong)
• Fair Coin Toss (0.21, seems OK)
• Thought Right, Were Right (0.003, yes)
• Thought Not, Were Right (0.09, perhaps too modest?)
Pepsi – Coke Taste Test
Some interesting history of this test:
• First Attempts
  – Pepsi was preferred
  – Pepsi was sweeter
  – Many got it wrong (even if thought knew)
  – Reason for “Pepsi challenge”?
• New Coke Came Out
  – Response to Pepsi Challenge?
Pepsi – Coke Taste Test
Some interesting history of this test:
• New Coke Came Out
  – People thought they hated it…
  – Anger over changing the flavor…
  – So Coke Classic came out
• Fun for me: New Coke vs. Coke Classic
Pepsi – Coke Taste Test
Some interesting history of this test:
• Taste test: New Coke vs. Coke Classic
  – New Coke preferred to Coke Classic!
  – New Coke was sweeter
  – Most got it wrong (even if thought knew)
• Changes Over Time
  – Appears Coke Classic slowly morphed into New Coke…