Statistics – OR 155
Section 1
J. S. Marron, Professor
Department of Statistics
and Operations Research
Class Information
• Go to Blackboard
• Choose this course
• In Control Panel
• Go to “Course Information”
• Read document in “Course Info”
Class Information
Important Change:
IA’s Office Hour:
Mon. 10:30-11:30 
Wed. 10:30-11:30
(now on Blackboard)
Audio Recording Info
• Audio Recordings + Displayed Slides
• Stored at: http://coursecast.unc.edu
• Choose “Students”
• Choose “155STOR, Section 1: J S Marron”
• Choose “Test Recording”
In case needed:
• User ID: 155STOR
• Password: 155STOR
Reading In Textbook
Approximate Reading for Today’s Material:
Pages 237-250
Approximate Reading for Next Class:
Pages 251-266
Random Sampling
How Accurate?
• Can (& will) calculate using “probability”
• Justifies term “scientific sampling”
• 2nd improvement over quota sampling
Random Sampling
What is random?
Simple Random Sampling:
Each member of population is
equally likely to be in sample
Key Idea: Different from “just choose some”
Random Sampling HW
Interesting Question:
What is the % of Male Students at UNC?
(Your chance of a date,
or take 100% minus this to get your chance)
HW for Class Problem # 1:
C1: Class Handout
http://stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/HWAsst/Stor155HWC1.pdf
Random Sampling HW
Serious Problem that has arisen:
See Blackboard
(note format of Question and Answer)
(please use this, and send me an email)
Random Sampling HW
Hint on “Equally Likely”:
Suppose 2 page phone book & only 4 students:
Al, Bob, Carl, Doug
(layout implied by the problem below: Doug alone on one page, the other 3 together on the other)
Sampling Scheme 1:
• Randomly choose page
• Randomly choose person on that page
Problem: Doug is chosen half the time,
Not “equally likely”
Sampling Scheme 2:
• Randomly choose page
• Randomly choose i = 1,…,max page size
• If no person i on chosen page, repeat both draws
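A minimal sketch comparing the two schemes with exact fractions. The page layout (Al, Bob & Carl on one page, Doug alone on the other) is an assumption inferred from “Doug is chosen half the time”; the names `PHONE_BOOK`, `scheme1_probs` and `scheme2_probs` are made up for illustration.

```python
from fractions import Fraction

# Assumed (hypothetical) layout: three students on one page, Doug alone on the other.
PHONE_BOOK = [["Al", "Bob", "Carl"], ["Doug"]]

def scheme1_probs(book):
    """Scheme 1: choose a page uniformly, then a person uniformly on that page."""
    probs = {}
    for page in book:
        for person in page:
            probs[person] = Fraction(1, len(book)) * Fraction(1, len(page))
    return probs

def scheme2_probs(book):
    """Scheme 2: choose a page and a slot i uniformly; repeat both draws if slot i is empty."""
    max_size = max(len(page) for page in book)
    # On a single draw, every (page, slot) pair is equally likely.
    per_draw = Fraction(1, len(book) * max_size)
    people = [person for page in book for person in page]
    accept = per_draw * len(people)  # chance a single draw lands on a real person
    # Repeating until a person is found rescales each chance by 1/accept.
    return {person: per_draw / accept for person in people}

s1 = scheme1_probs(PHONE_BOOK)
s2 = scheme2_probs(PHONE_BOOK)
for name in s1:
    print(name, "scheme 1:", s1[name], "scheme 2:", s2[name])
```

Under this assumed layout, Scheme 1 gives Doug 1/2 and the others 1/6 each, while Scheme 2 gives everyone 1/4.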
More On Surveys
More Common Sense:
How you ask the question
makes a big difference
HW:
3.75, 3.74
(Note: work & turn in, in order assigned)
More about Sampling
The “simple random sample” (recall “each
equally likely”) can be expensive
(e.g. nationwide political poll, collected by
personal interview)
So there are many cheaper variations:
– Stratified Sampling
– Multi Stage Sampling
– See text, Section 3.2
– And there are many others as well
Get up to Speed on EXCEL
HW C2: Class Problem 2
(on Blackboard – HW Assignments)
Recall: only turn in one printed page (per
problem part)
(recall HW rules in Course Info)
Note: you can also write on that sheet
(e.g. your name & highlight answer)
Get up to Speed on EXCEL
Critical Step: Load Analysis ToolPak:
• Circular Office Button (upper left)
• Excel Options (low button)
• Add-Ins (menu on left)
• Manage Excel Add-Ins > Go
• Check: Analysis ToolPak > OK
• Makes “Analysis Box” appear on right,
under “Data” tab
Random Sampling
EXCEL generation of random samples:
http://www.stat-or.unc.edu/webspace/courses/marron/UNCstor155-2009/ClassNotes/Stor155Eg1.xls
Goal 1:
Generate Random Numbers
EXCEL approaches:
• RAND function (under Formulas > Function Library > Math & Trig)
• Data > Analysis > Random Number Generation
EXCEL Random Sampling
Goal 2:
Randomly Reorder List
EXCEL approach:
• Highlight block with list & random num’s
• Sort whole thing on numbers
Goal 3:
Random Sample from List
• Choose 1st subset from random re-order
• Works, since each is equally likely in each spot
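The same “attach random numbers, sort, take the first few” idea, sketched in Python rather than EXCEL. This is only an illustration: the list of names, the seed, and the function names are all made up here.

```python
import random

def random_reorder(items, rng):
    """Pair each item with a Uniform[0,1) number, then sort the pairs on the numbers."""
    keyed = [(rng.random(), item) for item in items]
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]

def random_sample(items, k, rng):
    """Goal 3: take the first k entries of a random re-ordering."""
    return random_reorder(items, rng)[:k]

rng = random.Random(155)  # fixed seed, so re-running gives the same order
names = ["Ann", "Ben", "Cat", "Dan", "Eve", "Flo", "Gus", "Hal"]  # hypothetical list
reordered = random_reorder(names, rng)
sample = random_sample(names, 3, rng)
print(reordered)
print(sample)
```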
EXCEL Details
RAND:
• Can also get from menu under fx
• Can find on “All” menu
• Note no (explicit) inputs
• Just put in desired cell
• Drag downwards for several random #s
• Caution: these change on each re-comp.
• Thus not recommended for this
EXCEL Details
Data > Analysis > Random Number Generation:
• Set: # Variables: 1
  Distribution: Uniform (over [0,1])
• Generates Fixed List
(doesn’t change with re-computation)
(note entries are “just numbers”)
• Thus stable for later interpretation
• Recommended for random sample choice
EXCEL Details
Sorting Lists:
• Highlight Block with Both:
  – Names to sort
  – Random numbers
• Data > Sort & Filter > Sort
• Choose Column to sort on (B or C)
• Result is random re-ordering of List
Random Sampling HW
HW:
C3: For the letters A – L, use EXCEL to:
(a) Put in a random order.
(b) Choose a random sample of 6.
(Hints: for (a), want each equally likely,
for (b), reorder, and choose a subset)
Chapter 4: Probability
Goal: quantify (get numerical) uncertainty
• Key to answering questions above
(e.g. what is “natural variation”
in a random sample?)
(e.g. which effects are “significant”)
Idea: Represent “how likely” something is
by a number
Simple Probability
E.g. (will use for a while, since simplicity
gives easy insights)
Roll a die (6 sided cube, faces 1,2,…,6)
• 1 of 6 faces is a “4”
• So say “chances of a 4” are:
“1 out of 6” = 1/6.
• What does that number mean?
• How do we find such for harder
problems?
Simple Probability
A way to make this precise:
“Frequentist Approach”
In many replications (repeat of die roll),
expect about 1/6 of total will be 4s
Terminology (attach buzzwords to ideas):
Think about “outcomes” (e.g. #s on die) from an
“experiment” (e.g. roll die, observe #)
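To see the frequentist idea in action, here is a small simulation sketch. The seed and numbers of rolls are arbitrary choices; the point is only that the fraction of 4s should settle near 1/6 as the number of replications grows.

```python
import random

def fraction_of_fours(num_rolls, rng):
    """Roll a fair 6-sided die num_rolls times; return the fraction of rolls showing 4."""
    rolls = [rng.randint(1, 6) for _ in range(num_rolls)]
    return rolls.count(4) / num_rolls

rng = random.Random(4)  # arbitrary fixed seed
for n in (60, 6_000, 200_000):
    print(n, "rolls: fraction of 4s =", fraction_of_fours(n, rng))
print("compare with 1/6 =", 1 / 6)
```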
Simple Probability
Quantify “how likely” outcomes are by
assigning “probabilities”
I.e. a number between 0 and 1, to each
outcome, reflecting “how likely”:
Intuition:
• 0 means “can’t happen”
• ½ means “happens half the time”
• 1 means “must happen”
Simple Probability
HW:
C4: Match one of the probabilities:
0, 0.01, 0.3, 0.6, 0.99, 1
with each statement about an event:
a. Impossible, can’t occur.
b. Certain, will happen on every trial.
c. Very unlikely, but will occur once in a
long while.
d. Event will occur more often than not.
Simple Probability
Main Rule:
Sum of all probabilities (i.e. over all
outcomes) is 1:
E.g. for die rolling:
P{1} = P{2} = P{3} = P{4} = P{5} = P{6} = 1/6
$\sum_{i=1}^{6} P\{i\} = 1$
(Greek letter “sigma”, $\Sigma$, for sum)
Simple Probability
HW:
4.23
4.24a
Probability
General Rules for assigning probabilities:
i. Frequentist View
(what happens in many repetitions?)
ii. Equally Likely: for n outcomes
P{one outcome} = 1/n (e.g. die rolling)
iii. Based on Observed Frequencies
e.g. life tables summarize when people die
Gives “prob of dying” at a given age, and
“life expectancy”
iv. Personal Choice:
– Reflecting “your assessment”
– E.g. Oddsmakers
– Careful: requires some care
(key is prob’s need to sum to 1)
HW:
4.25
Probability - Events
More Terminology (to carry this further):
• An event is a set of outcomes
Die Rolling: “an even #”, is the event {2, 4, 6}
Notes:
– If betting on even don’t care about #, only
even or odd
– Thus events are our foundation
– Each outcome is an event: the set containing
just that outcome
– So event is the more general concept
Probability on Events
Sample Space is the set of all outcomes
= “event with everything that can happen”
Extend Probability to Events by:
P{event} = sum of probs of outcomes in event
$P\{\text{event}\} = \sum_{\text{outcomes } O \text{ in event}} P\{O\}$
Probability
Technical Summary:
• A probability model is a sample space
• I.e. set of outcomes, plus a probability, P
• P assigns numbers to events,
• Events are sets of outcomes
Probability Function
The probability, P, is a “function”,
defined on a set of events
Recall function in math:
$f(x) = 3x^2 + 2$ (plug in x, get out f(x))
Probability:
P{event} = “how likely” (plug in event, get out “how likely”)
Probability Function
E.g. Die Rolling
• Sample Space = {1, 2, 3, 4, 5, 6}
• “an even #” is the event {2, 4, 6} (a “set”)
• P{“even”} = P{2, 4, 6} = $\sum_{o \in \{2,4,6\}} P\{o\}$
= P{2} + P{4} + P{6} =
= 1/6 + 1/6 + 1/6 = 3/6 = ½
• Fits, since expect “even # half the time”
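The die-rolling probability model can be written down directly as a sketch: a sample space, a probability for each outcome, and P{event} computed as a sum over the outcomes in the event. The names `SAMPLE_SPACE`, `P_OUTCOME` and `prob` are made up for illustration.

```python
from fractions import Fraction

SAMPLE_SPACE = [1, 2, 3, 4, 5, 6]
# "Equally likely" rule: each outcome gets probability 1/n.
P_OUTCOME = {o: Fraction(1, len(SAMPLE_SPACE)) for o in SAMPLE_SPACE}

def prob(event):
    """P{event} = sum of the probabilities of the outcomes in the event."""
    return sum((P_OUTCOME[o] for o in event), Fraction(0))

even = {2, 4, 6}
print(prob(even))               # 1/2
print(prob(set(SAMPLE_SPACE)))  # 1, the Main Rule: all probabilities sum to 1
```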
Probability HW
HW:
4.21
4.31
And now for something
completely different
Is this class too “monotone”?
• Easier to understand?
• Calm environment enhances learning?
• Or does it induce somnolence?
What is “somnolence”?
Google definition:
Sleepiness, a condition of
semiconsciousness approaching coma.
And now for something
completely different
Recall last class’s Student Questionnaire…
I asked you for:
• Name
• Major
• Contact Info
• Background…
And now for something
completely different
One (previous class) response:
And now for something
completely different
OK, will try to send your mind in a different
direction
Hopefully, a mental break …
(not on the Homework Assignment!)
And now for something
completely different
• Did you hear about the constipated
mathematician?
• He worked it out with a pencil!
• Apologies for the juvenile nature…
• But there is an important point:
The pencil is a powerful
mathematical tool
And now for something
completely different
The pencil is a powerful
mathematical tool
• An old student:
  – I was once “good in math”
  – But suddenly lost that
  – Reason: tried to do too much in head
  – Because: never learned power of the pencil
And now for something
completely different
The pencil is a powerful
mathematical tool
• For us: now is time to start using pencil
• I do PowerPoint in class
• You use pencil on HW (and exams)
• Change in mindset, from Excel…
Probability
Now stretch ideas with more interesting e.g.
E.g. Political Polls, Simple Random Sampling
2 views:
1. Each individual equally likely to be in sample
2. Each possible sample is equally likely
Allows for simple Probability Modelling
Simple Random Sampling
• Sample Space is set of all possible
samples
• An Event is a set of some samples
E.g. For population A, B, C, D
– Each is a voter
– Only 4, so easy to work out
S. R. S. Example
For population A, B, C, D,
Draw a S. R. S. of size 2
Sample Space =
{(A,B), (A,C), (A,D), (B,C), (B,D), (C,D)}
outcomes, i.e. possible samples of size 2
S. R. S. Example
Now assign P, using “equally likely” rule:
P{A,B} = P{A,C} = … = P{C,D} =
= 1/(#samples) = 1/6
An interesting event is:
“C in sample” = {(A,C),(B,C),(C,D)}
(set of samples with C in them)
S. R. S. Example
$P\{\text{C in sample}\} = \sum_{\text{samples with C}} P\{\text{sample}\} = \sum_{\text{samples with C}} \frac{1}{6}$
$= (\#\text{ samples with C}) \cdot \frac{1}{6} = 3 \cdot \frac{1}{6} = \frac{1}{2}$
i.e. happens “half the time”.
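The S. R. S. example is small enough to check by listing every sample. A minimal sketch (the helper name `srs_event_prob` is made up for illustration): enumerate all samples of size 2, count those in the event, and divide, since each sample is equally likely.

```python
from fractions import Fraction
from itertools import combinations

def srs_event_prob(population, size, event):
    """P{event} under an S.R.S.: (# samples in event) / (total # samples)."""
    all_samples = list(combinations(population, size))
    in_event = [s for s in all_samples if event(s)]
    return Fraction(len(in_event), len(all_samples))

samples = list(combinations("ABCD", 2))
print(samples)  # the 6 possible samples of size 2

p_c = srs_event_prob("ABCD", 2, lambda s: "C" in s)
print(p_c)  # 1/2
```

The same helper works for any small population and sample size, as long as the full list of samples is short enough to enumerate.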
S. R. S. Probability HW
HW C5:
Abby, Bob, Mei-Ling, Sally and Roberto work for a
firm. Two will be chosen at random to attend
an overseas meeting. The choice will be made
by drawing names from a hat (this is an S. R.
S. of 2).
a. Write down all possible choices of 2 of the 5
names. This is the sample space.
b. Random choice makes all choices equally
likely. What is the probability of each choice?
(1/10)
c. What is the prob. that Sally is chosen? (4/10)
d. What is the prob. that neither Bob, nor Roberto
is chosen? (3/10)
Political Polls Example
What is your chance of being in a poll of
1000, from S.R.S. out of 200,000,000?
(crude estimate of # of U. S. voters)
Recall each sample is equally likely so:
$P\{\text{sample with you}\} = \frac{\#\text{ samples with you}}{\text{total } \#\text{ samples}}$
Problem:
total # samples is really big ≈ 2.66 × 10^5733
(5,734 digits, too big for easy handling….)
Political Polls Example
More careful calculation:
$P\{\text{sample with you}\} = \frac{\#\text{ samples with you}}{\text{total } \#\text{ samples}} = \frac{1 \cdot \binom{199,999,999}{999}}{\binom{200,000,000}{1,000}}$
$= \frac{\dfrac{199,999,999!}{199,999,000!\,999!}}{\dfrac{200,000,000!}{199,999,000!\,1000!}} = \frac{1000}{200,000,000} = \frac{1}{200,000}$
Makes sense, since you are “equally likely to
be in samples”
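Although the number of samples is too big for easy handling by hand, exact integer arithmetic can check the cancellation. A sketch using Python’s `math.comb` (the variable names are made up for illustration):

```python
import math
from fractions import Fraction

POPULATION = 200_000_000   # crude estimate of # of U.S. voters, from the slide
SAMPLE_SIZE = 1_000

# Total # of samples, and # of samples that contain one particular person
# (that person plus 999 others chosen from the remaining 199,999,999).
total_samples = math.comb(POPULATION, SAMPLE_SIZE)
samples_with_you = math.comb(POPULATION - 1, SAMPLE_SIZE - 1)

# Number of decimal digits in the total, via log10 (avoids printing the huge integer).
print(math.floor(math.log10(total_samples)) + 1)

p_you = Fraction(samples_with_you, total_samples)
print(p_you)  # 1/200000
```

The exact ratio reduces to 1000/200,000,000, matching the factorial cancellation above.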