Download 1 - Department of Statistics and Probability

What have we learned?
Shifting data by adding or subtracting the same
amount from each value affects measures of
center and position but not measures of
spread.
Rescaling data by multiplying or dividing every
value by a constant changes all the summary
statistics—center, position, and spread.
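A small numeric check of the shifting and rescaling statements can be sketched as follows. The data values, the shift of 10, and the scale factor of 2 are made up for illustration; only Python's standard statistics module is assumed.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [4, 8, 15, 16, 23, 42]

shifted = [x + 10 for x in data]   # add the same amount to every value
rescaled = [x * 2 for x in data]   # multiply every value by a constant

for name, values in [("original", data), ("shifted", shifted), ("rescaled", rescaled)]:
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    print(f"{name:9s} mean={mean:.2f}  sd={sd:.2f}")
```

Shifting moves the mean by 10 and leaves the SD alone; rescaling doubles both.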
What have we learned? (cont.)
We’ve learned the power of standardizing data.
Standardizing uses the SD as a ruler to
measure distance from the mean (z-scores).
With z-scores, we can compare values from
different distributions or values based on
different units.
z-scores can identify unusual or surprising
values among data.
What have we learned? (cont.)
We’ve learned that the 68-95-99.7 Rule can be a
useful rule of thumb for understanding
distributions:
For data that are unimodal and symmetric,
about 68% fall within 1 SD of the mean, 95%
fall within 2 SDs of the mean, and 99.7% fall
within 3 SDs of the mean.
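The three percentages in the rule can be checked against the Normal model itself. This is a minimal sketch using NormalDist from Python's standard library; the helper name share_within is an assumption made for this illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a Normal model lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.4f}")
```

The exact model values are slightly different from 68, 95 and 99.7 percent, which is why the rule is only a rule of thumb.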
What have we learned? (cont.)
We see the importance of Thinking about
whether a method will work:
Normality Assumption: We sometimes work
with Normal tables (Table Z). These tables are
based on the Normal model.
Data can’t be exactly Normal, so we check the
Nearly Normal Condition by making a
histogram (is it unimodal, symmetric and free
of outliers?) or a normal probability plot (is it
straight enough?).
Ex. 6.10
Cars currently sold in the US have an
average of 135 horsepower, with a standard
deviation of 40 horsepower. What is the z-score
for a car with 195 horsepower?
z = (195-135)/40 = 1.5
Ex. 6.12
People with z-scores greater than 2.5 on an
IQ test are sometimes classified as
geniuses. If IQ test scores have a mean of
100 and a std. dev. of 16 points, what IQ
score do you need to be considered a
genius?
2.5 = (x-100)/16
x = 100 + 2.5*16 = 140
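Ex. 6.10 and Ex. 6.12 use the same formula in opposite directions. A minimal sketch, with function names invented for this illustration:

```python
def z_score(value, mean, sd):
    """Distance of value from the mean, measured in standard deviations."""
    return (value - mean) / sd

def value_from_z(z, mean, sd):
    """Invert the z-score formula to recover the original value."""
    return mean + z * sd

print(z_score(195, 135, 40))        # Ex. 6.10 -> 1.5
print(value_from_z(2.5, 100, 16))   # Ex. 6.12 -> 140.0
```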
[Table: frequency table for quiz 1 grades]
[Table: descriptive statistics for grades by section]
[Figure: box plots for grades by section]
Suppose I pick one student who scored 10 points from each
section. Does that mean these students are equally
successful?
Section      Mean     Std     z-score for a score of 10
Section 10   13.33    3.241   (10-13.33)/3.241 = -1.027
Section 11   13.300   3.064   (10-13.3)/3.064 = -1.077
Section 12   12.567   3.07    (10-12.567)/3.07 = -0.836
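The comparison across sections can be reproduced with a short sketch. The section means and standard deviations are the ones on the slide; the dictionary layout and variable names are assumptions for this illustration.

```python
# (mean, standard deviation) of quiz grades for each section, from the slide.
sections = {
    "Section 10": (13.33, 3.241),
    "Section 11": (13.300, 3.064),
    "Section 12": (12.567, 3.07),
}
score = 10

# z-score of the same raw score within each section's own distribution.
z_by_section = {name: (score - mean) / sd for name, (mean, sd) in sections.items()}

for name, z in z_by_section.items():
    print(f"{name}: z = {z:.3f}")

# The least negative z-score marks the student who did best relative to the section.
print(max(z_by_section, key=z_by_section.get))
```

The same raw score of 10 sits at a different distance below the mean in each section, so the three students are not equivalent relative to their classmates.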
Ex. 6.42
In a standard Normal model, what value(s) of z
cut(s) off the region described?
A) The lowest 12%
z = -1.175
B) The highest 30%
z = 0.52
C) The highest 7%
z = 1.48
D) The middle 50%
z between -0.67 and 0.67
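The cutoffs in Ex. 6.42 can also be found without Table Z. This sketch assumes Python's standard NormalDist and its inv_cdf method; the variable names are made up here. Table lookups may differ from these values in the second decimal place.

```python
from statistics import NormalDist

z = NormalDist()  # standard Normal model: mean 0, SD 1

lowest_12 = z.inv_cdf(0.12)                      # 12% of the area lies below this z
highest_30 = z.inv_cdf(1 - 0.30)                 # 30% of the area lies above this z
highest_7 = z.inv_cdf(1 - 0.07)                  # 7% of the area lies above this z
middle_50 = (z.inv_cdf(0.25), z.inv_cdf(0.75))   # 25% in each tail

print(f"A) {lowest_12:.3f}")
print(f"B) {highest_30:.3f}")
print(f"C) {highest_7:.3f}")
print(f"D) ({middle_50[0]:.3f}, {middle_50[1]:.3f})")
```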
Ex. 6.43
Based on the Normal model N(100, 16) describing IQ scores, what
percent of people's IQs would you expect to be
A) Over 80?
z = (80-100)/16 = -1.25
1 - 0.1056 = 0.8944 ⇒ 89.4%
B) Under 90?
z = (90-100)/16 = -0.625
Averaging the table values for z = -0.62 and z = -0.63: (0.2676+0.2643)/2 ≈ 0.266
⇒ 26.6%
C) Between 112 and 132?
z1 = (112-100)/16 = 0.75
z2 = (132-100)/16 = 2.00
Table value for 2.00 minus table value for 0.75: 0.9772 - 0.7734 = 0.2038 ⇒ 20.4%
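The three table lookups in Ex. 6.43 can be cross-checked with a short sketch. It assumes Python's standard NormalDist with mean 100 and SD 16, as in the exercise; the variable names are chosen for this illustration.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)   # the N(100, 16) model from the exercise

over_80 = 1 - iq.cdf(80)                      # area to the right of 80
under_90 = iq.cdf(90)                         # area to the left of 90
between_112_132 = iq.cdf(132) - iq.cdf(112)   # area between the two scores

print(f"A) {over_80:.1%}")
print(f"B) {under_90:.1%}")
print(f"C) {between_112_132:.1%}")
```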
Ex. 6.27
Environmental Protection Agency (EPA) fuel economy
estimates for automobile models tested recently predicted a
mean of 24.8 mpg and a standard deviation of 6.2 mpg for
highway driving. Assume that the distribution is mound-shaped
(i.e., the Normal model applies).
A) Draw the model for auto fuel economy. Clearly label it, showing
what the 68-95-99.7 Rule predicts about miles per gallon.
B) In what interval would you expect the central 68% of autos to
be found?
C) About what percent of autos should get more than 31 mpg?
D) About what percent of autos should get between 31 and 37
mpg?
E) Describe the gas mileage of the worst 2.5% of all cars.
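The slide gives no worked answers for Ex. 6.27, so the following is only a sketch of how the 68-95-99.7 Rule could be applied to the stated mean and standard deviation. The variable names are assumptions, and 37 mpg is treated as approximately mean + 2 SD (37.2 mpg).

```python
mean_mpg, sd_mpg = 24.8, 6.2

# Boundaries 1, 2 and 3 SDs on each side of the mean (for labelling the sketch in part A).
bands = {k: (mean_mpg - k * sd_mpg, mean_mpg + k * sd_mpg) for k in (1, 2, 3)}

central_68 = bands[1]                # B) interval holding the central 68%
above_31 = (100 - 68) / 2            # C) 31 mpg is 1 SD above the mean
between_31_and_37 = (95 - 68) / 2    # D) between 1 SD and about 2 SDs above the mean
worst_cutoff = bands[2][0]           # E) lowest 2.5% fall below mean - 2 SD

print(f"B) {central_68[0]:.1f} to {central_68[1]:.1f} mpg")
print(f"C) about {above_31:.0f}%")
print(f"D) about {between_31_and_37:.1f}%")
print(f"E) below about {worst_cutoff:.1f} mpg")
```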
Chapter 14
From Randomness to
Probability
Thinking Challenge
What’s the probability of
getting a head on the
toss of a single fair
coin? Use a scale from
0 (no way) to 1 (sure
thing).
So toss a coin twice.
Do it! Did you get one
head & one tail? What’s
it all mean?
Many Repetitions!*
[Figure: proportion of heads (Total Heads / Number of Tosses, from 0.00 to 1.00) plotted against Number of Tosses (0 to 125)]
Dealing with Random Phenomena
A random phenomenon is a situation in which we know
what outcomes could happen, but we don’t know which
particular outcome did or will happen.
In general, each occasion upon which we observe a
random phenomenon is called a trial.
At each trial, we note the value of the random
phenomenon, and call it an outcome.
The most basic outcome of a trial is a sample point.
The collection of all possible outcomes is called the
sample space.
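Visualizing Sample Space
1. Listing, for tossing a coin once and noting the up face:
S = {Head, Tail}, where each element is a sample point
2. A pictorial method for presenting the sample space
3. Venn Diagram
[Figure: Venn diagram with sample points H and T inside the sample space S]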
Example
Tossing two coins and recording up faces:
Is the sample space as given below?
S={HH, HT, TT}
Tree Diagram
1st coin H: 2nd coin H or T
1st coin T: 2nd coin H or T
Sample Space Examples
Experiment                      Sample Space
Toss a Coin, Note Face          {Head, Tail}
Toss 2 Coins, Note Faces        {HH, HT, TH, TT}
Select 1 Card, Note Kind        {2♥, 2♠, ..., A♦} (52)
Select 1 Card, Note Color       {Red, Black}
Play a Football Game            {Win, Lose, Tie}
Inspect a Part, Note Quality    {Defective, Good}
Observe Gender                  {Male, Female}
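A sample space for repeated tosses can be listed mechanically, much like following every branch of a tree diagram. This sketch assumes Python's standard itertools.product; the variable name is made up here.

```python
from itertools import product

# Every ordered pair of faces for two coins: one branch per face, per coin.
two_coin_space = ["".join(faces) for faces in product("HT", repeat=2)]

print(two_coin_space)        # ['HH', 'HT', 'TH', 'TT']
print(len(two_coin_space))   # 4
```

HT and TH are listed separately because the first and second coin are distinguishable.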
Events
1. Specific collection of sample points
2. Simple Event
• Contains only one sample point
3. Compound Event
• Contains two or more sample points
Venn Diagram
Trial: Toss 2 Coins. Note Faces.
Sample Space: S = {HH, HT, TH, TT}
Each element of S is an outcome (sample point).
Compound Event: At least one Tail = {HT, TH, TT}
Venn Diagram
Trial: Toss 2 Coins. Note Faces.
Sample Space: S = {HH, HT, TH, TT}
Simple Event: Tail for both tosses = {TT}
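Events can be represented as subsets of the sample space. A minimal sketch using Python sets; the helper name kind is an assumption for this illustration.

```python
sample_space = {"HH", "HT", "TH", "TT"}

# An event is a collection of sample points drawn from the sample space.
at_least_one_tail = {outcome for outcome in sample_space if "T" in outcome}
tail_on_both = {outcome for outcome in sample_space if outcome == "TT"}

def kind(event):
    """Simple events hold one sample point; compound events hold two or more."""
    return "simple" if len(event) == 1 else "compound"

print(sorted(at_least_one_tail), kind(at_least_one_tail))
print(sorted(tail_on_both), kind(tail_on_both))
```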
Thinking challenge
A fair coin is tossed until the first head appears or until
four tails occur in a row. Which of the following is the
sample space for this experiment?
a. S={T, TH, TTH, TTTH, TTTT}
b. S={T, HT, TTH, TTTH, TTTT}
c. S={H, TH, TTH, TTTH, TTTT}
d. S={H, HT, HHT, HHHT, HHHH}
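One way to check the options is to generate the outcomes directly from the stopping rule. This is a sketch with a made-up function name; it stops at the first head or after four tails, as the question describes.

```python
def stop_at_first_head(max_tails=4):
    """List every toss sequence that ends at the first head or after max_tails tails."""
    outcomes = []
    for tails_before_head in range(max_tails):
        # Some number of tails (possibly none), then the head that ends the experiment.
        outcomes.append("T" * tails_before_head + "H")
    # The only sequence without a head: max_tails tails in a row.
    outcomes.append("T" * max_tails)
    return outcomes

print(stop_at_first_head())   # ['H', 'TH', 'TTH', 'TTTH', 'TTTT']
```

The generated list matches option c.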
The Law of Large Numbers
First a definition . . .
When thinking about what happens with
combinations of outcomes, things are simplified if
the individual trials are independent.
Roughly speaking, this means that the
outcome of one trial doesn’t influence or
change the outcome of another.
For example, coin flips are independent.
The Law of Large Numbers (cont.)
The Law of Large Numbers (LLN) says that the
long-run relative frequency of repeated
independent events gets closer and closer to a
single value.
We call the single value the probability of the
event.
Because this definition is based on repeatedly
observing the event’s outcome, this definition of
probability is often called empirical probability.
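The long-run behavior described by the LLN can be watched in a simulation. This is a sketch only: the seed, the number of tosses, and the function name are arbitrary choices, and a pseudo-random generator stands in for a real coin.

```python
import random

def running_proportion_of_heads(n_tosses, seed=0):
    """Simulate fair coin tosses and record the proportion of heads after each toss."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5      # True counts as 1 head
        proportions.append(heads / toss)
    return proportions

proportions = running_proportion_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7} tosses: {proportions[n - 1]:.4f}")
```

Early proportions can wander far from 0.5; only the later ones settle near it.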
The Nonexistent Law of Averages
Many people believe, for example, that an outcome of a
random event that hasn’t occurred in many trials is “due”
to occur.
A common name for this belief is the Law of Averages, which
doesn't exist at all.
The LLN says nothing about short-run behavior.
Relative frequencies even out only in the long run, and
this long run is really long (infinitely long, in fact).
If the probability of an outcome doesn’t change and the
events are independent, the probability of any outcome in
another trial is always what it was, no matter what has
happened in other trials.
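The claim that a head is never "due" can be probed with a simulation. This sketch looks only at tosses that follow three tails in a row and measures how often the next toss is a head; the seed, the streak length, and the function name are arbitrary choices for illustration.

```python
import random

def heads_rate_after_tail_streak(n_tosses=200_000, streak=3, seed=1):
    """Proportion of heads among tosses that immediately follow `streak` tails."""
    rng = random.Random(seed)
    tosses = [rng.random() < 0.5 for _ in range(n_tosses)]   # True = head
    followups = [
        tosses[i]
        for i in range(streak, n_tosses)
        if not any(tosses[i - streak:i])   # the previous `streak` tosses were all tails
    ]
    return sum(followups) / len(followups)

print(f"{heads_rate_after_tail_streak():.3f}")
```

For independent tosses the result should stay close to 0.5 rather than rising after a run of tails.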
Modeling Probability
When probability was first studied, a group of French
mathematicians looked at games of chance in which all
the possible outcomes were equally likely.
It’s equally likely to get any one of six outcomes from
the roll of a fair die.
It’s equally likely to get heads or tails from the toss of a
fair coin.
However, keep in mind that events are not always equally
likely.
A skilled basketball player has a better than 50-50
chance of making a free throw.
Modeling Probability (cont.)
When outcomes are equally likely, the probability
of each is just 1/(# of possible outcomes).
So the probability of getting a 3 when we roll a
fair die is 1/6.
The probability of picking the ace of spades from
the top of a well-shuffled deck is 1/52.
It’s almost as simple to find probabilities for
events that are made of several equally likely
outcomes.
Modeling Probability (cont.)
The probability of an event is the number of
outcomes in the event divided by the total
number of possible outcomes.
P(A) = (# of outcomes in A) / (# of possible outcomes)
Modeling Probability (cont.)
The probability of drawing a face card (J, Q, K) from
a deck is
P(face card) = # face cards / # cards
= 12/52
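The counting formula can be applied by building the deck and counting. A minimal sketch assuming a standard 52-card deck; the rank and suit labels and the helper name probability are choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["hearts", "spades", "diamonds", "clubs"]
deck = list(product(ranks, suits))   # 52 equally likely outcomes

def probability(event, sample_space):
    """# of outcomes in the event divided by # of possible outcomes."""
    return Fraction(len(event), len(sample_space))

face_cards = [card for card in deck if card[0] in {"J", "Q", "K"}]
ace_of_spades = [card for card in deck if card == ("A", "spades")]

print(probability(face_cards, deck))      # 3/13, i.e. 12/52 in lowest terms
print(probability(ace_of_spades, deck))   # 1/52
```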
Formal Probability
1. Two requirements for a probability:
A probability is a number between 0 and 1.
For any event A, 0 ≤ P(A) ≤ 1.
Formal Probability (cont.)
2. Probability Assignment Rule:
The probability of the set of all possible
outcomes of a trial must be 1.
P(S) = 1 (S represents the set of all possible
outcomes.)
Formal Probability (cont.)
3. Complement Rule:
The set of outcomes that are not in the event
A is called the complement of A, denoted A^C.
The probability of an event occurring is 1
minus the probability that it doesn’t occur:
P(A) = 1 – P(A^C)
Formal Probability (cont.)
Events that have no outcomes in common (and,
thus, cannot occur together) are called disjoint
(or mutually exclusive).
Formal Probability (cont.)
4. Addition Rule:
For two disjoint events A and B, the
probability that one or the other occurs is the
sum of the probabilities of the two events.
P(A or B) = P(A) + P(B), provided that A and
B are disjoint.
Example
d) For the probabilities given in part c,
P(A)=0.3, P(B)=0.2
P(A)=0.25, P(B)=0.3
i) Under the assumption that events A and B are
disjoint, find P(A or B).
ii) Find P(A^C).
Ex. 14.19 from the textbook
A consumer organization estimates that over a 1-year
period 17% of cars will need to be repaired
once, 7% will need repairs twice, and 4% will
require three or more repairs. What is the
probability that a car chosen at random will need
a) no repairs?
P(NR)=1-[(0.17)+(0.07)+(0.04)]=0.72
b) no more than one repair?
P(NR)+P(1R)=0.72+0.17=0.89
c) some repairs?
P(1R)+P(2R)+P(3mR)=0.28
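The three answers in Ex. 14.19 combine the Complement Rule and the Addition Rule for disjoint events. A sketch with variable names chosen for this illustration:

```python
# Disjoint repair categories and their estimated probabilities, from the exercise.
p_once, p_twice, p_three_or_more = 0.17, 0.07, 0.04

p_some = p_once + p_twice + p_three_or_more   # c) Addition Rule over disjoint events
p_none = 1 - p_some                           # a) Complement Rule
p_at_most_one = p_none + p_once               # b) Addition Rule again

print(f"a) {p_none:.2f}")
print(f"b) {p_at_most_one:.2f}")
print(f"c) {p_some:.2f}")
```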
Formal Probability
5. Multiplication Rule:
For two independent events A and B, the
probability that both A and B occur is the
product of the probabilities of the two events.
P(A and B) = P(A) × P(B), provided that A
and B are independent.
Formal Probability (cont.)
5. Multiplication Rule (cont.):
Two independent events A and B are not
disjoint, provided the two events have
probabilities greater than zero: in that case
P(A and B) = P(A) × P(B) > 0, so A and B can
occur together.
Formal Probability (cont.)
5. Multiplication Rule (cont.):
Many Statistics methods require an
Independence Assumption, but assuming
independence doesn’t make it true.
Always Think about whether that assumption
is reasonable before using the Multiplication
Rule.
Q2
A certain bowler can bowl a strike 70% of the time. What
is the probability that she
A) goes three consecutive frames without a strike?
0.3*0.3*0.3 = 0.027
B) makes her first strike in the third frame?
0.3*0.3*0.7 = 0.063
C) has at least one strike in the first three frames?
3*(0.3*0.3*0.7) + 3*(0.3*0.7*0.7) + (0.7*0.7*0.7) = 0.973
or 1 - P(no strikes) = 1 - (1-0.7)^3 = 0.973
D) bowls a perfect game (12 consecutive strikes)?
(0.7)^12 = 0.0138
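The bowling answers all follow from the Multiplication Rule, under the assumption that frames are independent. A minimal sketch with made-up variable names:

```python
p_strike = 0.7
p_miss = 1 - p_strike

no_strike_in_three = p_miss ** 3                 # A) miss, miss, miss
first_strike_in_third = p_miss ** 2 * p_strike   # B) miss, miss, strike
at_least_one_in_three = 1 - p_miss ** 3          # C) complement of "no strikes"
perfect_game = p_strike ** 12                    # D) twelve strikes in a row

print(f"A) {no_strike_in_three:.3f}")
print(f"B) {first_strike_in_third:.3f}")
print(f"C) {at_least_one_in_three:.3f}")
print(f"D) {perfect_game:.4f}")
```

Part C shows why the complement is usually the shorter route for "at least one" questions.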
Just checking
Opinion polling organizations contact their respondents by telephone.
Random telephone numbers are generated, and interviewers try to
contact those households. According to the Pew Research Center for
the People and the Press, by 2003 this contact rate had risen to 76%.
We can reasonably assume each household’s response to be
independent of the others.
a) What is the probability that the interviewer successfully contacts the
next household on the list? 0.76
b) What is the probability that the interviewer successfully contacts both
of the next two households on her list? 0.76*0.76
c) What is the probability that the interviewer’s first successful contact is
the third household on the list? (1-0.76)^2 * 0.76
d) What is the probability that the interviewer makes at least one
successful contact among the next five households on the list?
1-(1-0.76)^5
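The expressions in the polling answers are left unevaluated; a short sketch can evaluate them. It relies on the independence assumption stated in the problem, and the function names are invented for this illustration.

```python
p_contact = 0.76

def first_success_on(k, p):
    """Probability that the first success happens on trial k (k-1 failures, then a success)."""
    return (1 - p) ** (k - 1) * p

def at_least_one_in(n, p):
    """Probability of at least one success in n independent trials."""
    return 1 - (1 - p) ** n

print(f"a) {p_contact:.4f}")
print(f"b) {p_contact * p_contact:.4f}")
print(f"c) {first_success_on(3, p_contact):.4f}")
print(f"d) {at_least_one_in(5, p_contact):.4f}")
```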
Example
Solution
a. S={Brown, Yellow, Red, Blue, Orange, Green}
b. P={0.13, 0.14, 0.13, 0.24, 0.20, 0.16}
c. Let event A = selecting a brown candy
P(A) = P(Brown) = 0.13
d. Let event B = selecting a red, green or yellow candy
P(B) = 0.13+0.16+0.14 = 0.43
e. Let event C = selecting a candy other than blue
P(C) = 0.13+0.14+0.13+0.20+0.16 = 0.76
or P(C) = 1-0.24 = 0.76
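The candy solution can be checked by storing the probability assignment and summing over events. This is a sketch; the dictionary layout and the helper name prob are assumptions, while the probabilities are the ones listed in part b.

```python
# Probability assigned to each color, in the order given in parts a and b.
candy = {
    "brown": 0.13, "yellow": 0.14, "red": 0.13,
    "blue": 0.24, "orange": 0.20, "green": 0.16,
}

def prob(colors):
    """Addition Rule: colors are disjoint outcomes, so their probabilities add."""
    return sum(candy[color] for color in colors)

total = prob(candy)                                  # Probability Assignment Rule: should be 1
p_b = prob(["red", "green", "yellow"])               # part d
p_c_by_addition = prob(c for c in candy if c != "blue")   # part e, first method
p_c_by_complement = 1 - candy["blue"]                # part e, second method

print(f"{total:.2f} {p_b:.2f} {p_c_by_addition:.2f} {p_c_by_complement:.2f}")
```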
Formal Probability - Notation
Notation alert:
In the textbook, the notations P(A or B) and P(A
and B) are used.
In other situations, you might see the following:
P(A ∪ B) instead of P(A or B)
P(A ∩ B) instead of P(A and B)
What have we learned?
Probability is based on long-run relative
frequencies.
The Law of Large Numbers speaks only of long-run behavior.
Watch out for misinterpreting the LLN.
What have we learned? (cont.)
There are some basic rules for combining
probabilities of outcomes to find probabilities of
more complex events. We have the:
Probability Assignment Rule
Complement Rule
Addition Rule for disjoint events
Multiplication Rule for independent events