Student Notes - Prep Session Topic: Probability, Part 1
Content
The AP Statistics topic outline contains a long list of items in the category titled Anticipating Patterns. These
items are copied below. In this session we will work only on sections A and B.
III. Anticipating Patterns: Exploring random phenomena using probability and simulation
Probability is the tool used for anticipating what the distribution of data should look like under a given model.
A. Probability
1. Interpreting probability, including long-run relative frequency interpretation
2. “Law of Large Numbers” concept
3. Addition rule, multiplication rule, conditional probability, and independence
4. Discrete random variables and their probability distributions, including binomial and geometric
5. Simulation of random behavior and probability distributions
6. Mean (expected value) and standard deviation of a random variable, and linear transformation of a
random variable
B. Combining independent random variables
1. Notion of independence versus dependence
2. Mean and standard deviation for sums and differences of independent random variables
C. The normal distribution
1. Properties of the normal distribution
2. Using tables of the normal distribution
3. The normal distribution as a model for measurements
D. Sampling distributions
1. Sampling distribution of a sample proportion
2. Sampling distribution of a sample mean
3. Central Limit Theorem
4. Sampling distribution of a difference between two independent sample proportions
5. Sampling distribution of a difference between two independent sample means
6. Simulation of sampling distributions
7. t-distribution
8. Chi-square distribution
Gloria Barrett, Virginia Advanced Study Strategies
December, 2012
Formulas
You will want to be familiar with the probability formulas that are provided on the exam. The formulas related
to Sections A and B that are on the formula sheet are:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
$E(X) = \mu_X = \sum x_i p_i$
$\mathrm{Var}(X) = \sigma_X^2 = \sum (x_i - \mu_X)^2 p_i$
If X has a binomial distribution with parameters n and p, then:
$P(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$
$\mu_X = np$
$\sigma_X = \sqrt{np(1 - p)}$
Additional formulas (not included in test booklet):
For any random variables X and Y, where a and b are constants:
$E(aX + b) = aE(X) + b$
$E(X + Y) = E(X) + E(Y)$
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
For independent random variables X and Y:
$\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$
Here, the random variable written as X1 + X2 denotes the result of doing the same underlying action twice,
noting the value obtained on the first action is independent of the one obtained on the second action, and
adding the two values together.
The random variable written as 2X denotes the result of doing the underlying action once, noting the value of X,
and doubling it.
Thus X1 + X2 and 2X are very different random variables even though X, X1, and X2 have the same distribution.
Using the formulas above, you can see that X1 + X2 and 2X have the same expected value but a different
variance and standard deviation.
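The difference between 2X and X1 + X2 can be checked numerically. The sketch below is only an illustration: it assumes the underlying action is one roll of a fair six-sided die (an example not taken from these notes) and computes the exact mean and variance of both random variables.

```python
from fractions import Fraction
from itertools import product

# Assumed example distribution: one roll of a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(dist):
    """Expected value: sum of x_i * p_i."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance: sum of (x_i - mu)^2 * p_i."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# 2X: roll once and double the value.
doubled = {2 * x: p for x, p in die.items()}

# X1 + X2: roll twice independently and add the two values.
summed = {}
for (x1, p1), (x2, p2) in product(die.items(), repeat=2):
    summed[x1 + x2] = summed.get(x1 + x2, 0) + p1 * p2

print("2X:      mean", mean(doubled), "variance", variance(doubled))
print("X1 + X2: mean", mean(summed), "variance", variance(summed))
```

Both means equal 7, but Var(2X) = 4 Var(X) while Var(X1 + X2) = 2 Var(X), which matches the formulas listed above.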
Vocabulary Reminder
The expected value of a random variable is the same as the mean of that random variable.
Calculator Use
To save time on the exam, you will want to use your calculator for probability computations. Specifically, you
will want to know how to:
• Enter a probability distribution in lists (values in one list and probabilities in a second list) and use 1-Var
Stats to compute the mean (expected value) and variance.
• Use binomPdf to compute the probability of a specific binomial outcome and binomCdf to compute the
probability of an interval of binomial outcomes.
Note: When you use your calculator for computations on Free Response questions, it will be very important to
provide proper communication and support for your work. You must be able to explain your thinking using
standard notation. AP Exam readers are told that calculator syntax alone will not be sufficient support for
numerical answers.
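For checking calculator results away from the exam, the two binomial computations can be reproduced from the formula on the formula sheet. This is a minimal sketch, not calculator syntax: the function names `binom_pdf` and `binom_cdf` are illustrative choices, and the example values (10 flips of a fair coin) are assumed for demonstration.

```python
from math import comb

def binom_pdf(n, p, k):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): add up the probabilities of the outcomes 0 through k."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

# Assumed example: number of heads in 10 flips of a fair coin.
n, p = 10, 0.5
print("P(X = 5)       =", binom_pdf(n, p, 5))
print("P(X <= 3)      =", binom_cdf(n, p, 3))
# An interval of outcomes is a difference of two cumulative probabilities.
print("P(4 <= X <= 6) =", binom_cdf(n, p, 6) - binom_cdf(n, p, 3))
```

On a Free Response question the written support would still be the formula with its values substituted, for example P(X = 5) = C(10, 5)(0.5)^5(0.5)^5.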
Other useful tools:
Tree diagrams provide easy-to-interpret models for situations in which a scenario may be decomposed into
multiple stages, one following another. It is important to understand that the probability on each branch in a
tree diagram is a conditional probability, conditioned upon being at the particular node from which the branch
originates.
Two-way (contingency) tables are particularly useful in interpreting conditional probabilities and examining
independence. In such tables the entry in each cell is the joint probability for the row and column events that
define the cell. The total probability for all cells must be 1, and the sum for each row (column) gives the
“marginal” probability for that row (column). In particular, having a single row or column add to 1 indicates a
need to rethink the formulation of the table!
Example:
                      Right handed   Left handed   Total
Right eye dominant    0.45           0.15          0.60
Left eye dominant     0.35           0.05          0.40
Total                 0.80           0.20          1.00
P(Right handed) = 0.80, P(Right eye dominant) = 0.60, and P(Right handed AND Right eye dominant) = 0.45.
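The handedness table can also be used to compute a conditional probability and to examine independence. The sketch below stores the four joint probabilities from the example as exact fractions; the dictionary layout and the helper name `marginal` are illustrative choices, not part of the original notes.

```python
from fractions import Fraction

# Joint probabilities from the example table, keyed by (eye, hand).
joint = {
    ("right eye", "right hand"): Fraction(45, 100),
    ("right eye", "left hand"): Fraction(15, 100),
    ("left eye", "right hand"): Fraction(35, 100),
    ("left eye", "left hand"): Fraction(5, 100),
}

def marginal(position, label):
    """Sum the joint probabilities across one row (position 0) or column (position 1)."""
    return sum(p for key, p in joint.items() if key[position] == label)

p_right_eye = marginal(0, "right eye")
p_right_hand = marginal(1, "right hand")
p_both = joint[("right eye", "right hand")]

# Conditional probability: P(right hand | right eye) = P(both) / P(right eye).
p_hand_given_eye = p_both / p_right_eye

# Independence would require P(both) = P(right eye) * P(right hand).
independent = p_both == p_right_eye * p_right_hand

print("P(right hand)             =", float(p_right_hand))
print("P(right hand | right eye) =", float(p_hand_given_eye))
print("Independent?", independent)
```

Here P(right hand | right eye) = 0.75 differs from P(right hand) = 0.80, so handedness and eye dominance are not independent in this table.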
Venn diagrams are sometimes helpful, especially when you are given the probabilities of separate events, for example
P(A) and P(B), as well as the joint probability P(A and B).
Multiple Choice Questions from 1997, 2002, and 2007 Exams
1. A fair coin is flipped 10 times and the number of heads is counted. This procedure of 10 coin flips is repeated
100 times and the results are placed in a frequency table. Which of the frequency tables below is most likely to
contain the results from these 100 trials?
Number of Heads   (A) Frequency   (B) Frequency   (C) Frequency   (D) Frequency   (E) Frequency
0                 19              9               0               7               0
1                 12              9               0               10              0
2                 9               9               6               6               0
3                 6               9               9               11              2
4                 2               9               22              8               24
5                 1               10              24              10              51
6                 3               9               18              9               22
7                 5               9               12              12              1
8                 8               9               7               7               0
9                 14              9               2               11              0
10                21              9               0               9               0
2. The number of sweatshirts a vendor sells daily has the following probability distribution.
If each sweatshirt sells for $25, what is the expected daily total dollar amount taken in by the vendor from the sale
of sweatshirts?
A) $5.00
B) $7.60
C) $35.50
D) $38.00
E) $75.00
3. In a carnival game, a person can win a prize by guessing which one of 5 identical boxes contains the prize.
After each guess, if the prize has been won, a new prize is randomly placed in one of the 5 boxes. If the prize has
not been won, then the prize is again randomly placed in one of the 5 boxes. If a person makes 4 guesses, what is
the probability that the person wins a prize exactly 2 times?
A)
B)
C)
D)
E)
4. Lynn is planning to fly from New York to Los Angeles and will take the Airtight Airlines flight that leaves at
8 a.m. The Web site she used to make her reservation states that the probability that the flight will arrive in Los
Angeles on time is 0.70. Of the following, which is the most reasonable explanation for how that probability
could have been estimated?
A) By using an extended weather forecast for the date of her flight, which showed a 30% chance of bad
weather.
B) By making assumptions about how airplanes work, and factoring all of those assumptions into an
equation to arrive at the probability.
C) From the fact that, of all airline flights arriving in California, 70% arrive on time.
D) From the fact that, of all airline flights arriving in the United States, 70% arrive on time.
E) From the fact that, on all previous days this particular flight has been scheduled, it had arrived on
time 70% of those days.
5. One hundred people were interviewed and classified according to their attitude toward small cars and their
personality type. The results are shown in the table below.
                                   Personality Type
Attitude Toward Small Cars    Type A    Type B    Total
Positive                      25        12        37
Neutral                       11        9         20
Negative                      24        19        43
Total                         60        40        100
Which of the following is true?
A) Of the three attitude groups, the group with the negative attitude has the highest proportion of Type A
personality types.
B) Of the three attitude groups, the group with the neutral attitude has the highest proportion of Type B
personality types.
C) For each personality type, more than half of the 100 respondents has a neutral attitude toward small
cars.
D) The proportion that has a positive attitude toward small cars is higher among people with a type B
personality type than among people with a type A personality type.
E) More than half of the 100 respondents have a type A personality type and a positive attitude toward
small cars.
AP Exam Free Response Questions for Practice and Discussion
2003, Form B, #2
A simple random sample of adults living in the suburb of a large city was selected. The age and annual income
of each adult in the sample were recorded. The resulting data are summarized in the table below.
                                    Annual Income
Age Category    $25,000-$35,000    $35,001-$50,000    Over $50,000    Total
21-30           8                  15                 27              50
31-45           22                 32                 35              89
46-60           12                 14                 27              53
Over 60         5                  3                  7               15
Total           47                 64                 96              207
(a) What is the probability that a person chosen at random from those in this sample will be in the 31-45 age
category?
(b) What is the probability that a person chosen at random from those in this sample whose incomes are over
$50,000 will be in the 31-45 age category? Show your work.
(c) Based on your answers to parts (a) and (b), is annual income independent of age category for those in this
sample? Explain.
2005 Form B, #2
For an upcoming concert, each customer may purchase up to 3 child tickets and 3 adult tickets. Let C be the
number of child tickets purchased by a single customer. The probability distribution of the number of child
tickets purchased by a single customer is given in the table below.
c       0      1      2      3
P(c)    0.4    0.3    0.2    0.1
(a) Compute the mean and standard deviation of C.
(b) Suppose the mean and the standard deviation of adult tickets purchased by a single customer are 2 and 1.2,
respectively. Assume that numbers of child tickets and adult tickets purchased are independent random
variables. Compute the mean and the standard deviation of the total number of adult and child tickets
purchased by a single customer.
(c) Suppose each child ticket costs $15 and each adult ticket costs $25. Compute the mean and standard
deviation of the total amount spent per purchase.
2004, #4
Two antibiotics are available as treatment for a common ear infection in children.
• Antibiotic A is known to effectively cure the infection 60 percent of the time. Treatment with antibiotic
A costs $50.
• Antibiotic B is known to effectively cure the infection 90 percent of the time. Treatment with antibiotic
B costs $80.
The antibiotics work independently of one another. Both antibiotics can be safely administered to children. A
health insurance company intends to recommend one of the following two plans of treatment for children with
this ear infection.
• Plan I: Treat with antibiotic A first. If it is not effective, then treat with antibiotic B.
• Plan II: Treat with antibiotic B first. If it is not effective, then treat with antibiotic A.
(a) If a doctor treats a child with an ear infection using plan I, what is the probability that the child will be cured?
If a doctor treats a child with an ear infection using plan II, what is the probability that the child will be cured?
(b) Compute the expected cost per child when plan I is used for treatment. Compute the expected cost per child
when plan II is used for treatment.
(c) Based on the results in parts (a) and (b), which plan would you recommend? Explain your recommendation.
2004, #3
At an archaeological site that was an ancient swamp, the bones from 20 brontosaur skeletons have been
unearthed. The bones do not show any sign of disease or malformation. It is thought that these animals
wandered into a deep area of the swamp and became trapped in the swamp bottom. The 20 left femur bones
(thigh bones) were located and 4 of these left femurs are to be randomly selected without replacement for DNA
testing to determine gender.
(a) Let X be the number out of the 4 selected left femurs that are from males. Based on how these bones were
sampled, explain why the probability distribution of X is not binomial.
(b) Suppose that the group of 20 brontosaurs whose remains were found in the swamp had been made up of 10
males and 10 females. What is the probability that all 4 in the sample to be tested are male?
(c) The DNA testing revealed that all 4 femurs tested were from males. Based on this result and your answer
from part (b), do you think that males and females were equally represented in the group of 20 brontosaurs
stuck in the swamp? Explain.
(d) Is it reasonable to generalize your conclusion in part (c) pertaining to the group of 20 brontosaurs to the
population of all brontosaurs? Explain why or why not.
Solution – 2003 Form B Question 2
(a) $P(\text{age 31-45}) = \dfrac{89}{207} \approx 0.42995$
(b) $P(\text{age 31-45} \mid \text{income over \$50,000}) = \dfrac{35}{96} \approx 0.36458$
(c) If annual income and age were independent, the probabilities in part (a) and (b) would be equal. Since
these probabilities are not equal, annual income and age are not independent for the people in this sample.
Solution - 2005 Form B Question 2
(a) Mean of C: $\mu_C = 0(0.4) + 1(0.3) + 2(0.2) + 3(0.1) = 1$
Standard deviation of C: $\sigma_C = \sqrt{(0-1)^2(0.4) + (1-1)^2(0.3) + (2-1)^2(0.2) + (3-1)^2(0.1)} = \sqrt{1} = 1$
Note: Supporting work must be shown to get credit for these answers.
(b) Let A be the number of adult tickets purchased by a single customer, and let T = C + A denote the total
number of tickets purchased by a single customer.
The mean of T is $\mu_T = \mu_C + \mu_A = 1 + 2 = 3$.
The standard deviation of T is $\sigma_T = \sqrt{\sigma_C^2 + \sigma_A^2} = \sqrt{1^2 + 1.2^2} = \sqrt{2.44} \approx 1.562$.
(c) Let M = 15C + 25A denote the total amount of money spent per purchase.
The mean of M is $\mu_M = 15\mu_C + 25\mu_A = 15(1) + 25(2) = \$65$.
The standard deviation of M is $\sigma_M = \sqrt{15^2\sigma_C^2 + 25^2\sigma_A^2} = \sqrt{225(1^2) + 625(1.2^2)} = \sqrt{1125} \approx \$33.54$.
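The arithmetic in this solution can be verified with a short script. This is a checking aid only, with variable names chosen for illustration; it applies the same formulas for the mean, the variance, and linear combinations of independent random variables.

```python
from math import sqrt

# Distribution of C (child tickets) from the problem statement.
child_dist = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}

# Part (a): mean and standard deviation of C.
mu_c = sum(c * p for c, p in child_dist.items())
var_c = sum((c - mu_c) ** 2 * p for c, p in child_dist.items())
sd_c = sqrt(var_c)

# Given in part (b): mean and standard deviation of A (adult tickets).
mu_a, sd_a = 2, 1.2

# Part (b): T = C + A. Means add; variances add because C and A are independent.
mu_t = mu_c + mu_a
sd_t = sqrt(var_c + sd_a**2)

# Part (c): M = 15C + 25A. Each variance is scaled by the square of its constant.
mu_m = 15 * mu_c + 25 * mu_a
sd_m = sqrt(15**2 * var_c + 25**2 * sd_a**2)

print("C:", round(mu_c, 3), round(sd_c, 3))
print("T:", round(mu_t, 3), round(sd_t, 3))
print("M:", round(mu_m, 2), round(sd_m, 2))
```

Note that the standard deviations themselves are never added; the variances are added first and the square root is taken at the end.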
Solution – 2004 Question 4
(a) Let A be the event “antibiotic A works” and let B be the event “antibiotic B works.”
The probability that a child will be cured with plan I is:
$P(\text{Cure with Plan I}) = P(A) + P(\text{not } A)\,P(B) = 0.6 + (0.4 \times 0.9) = 0.96$
The probability that a child will be cured with plan II is:
$P(\text{Cure with Plan II}) = P(B) + P(\text{not } B)\,P(A) = 0.9 + (0.1 \times 0.6) = 0.96$
Note: Tree diagrams are very helpful for this part.
(b) Treatment with antibiotic A costs $50, and treatment with antibiotic B costs $80.
The expected cost per child when Plan I is used for treatment is:
$E(\text{Cost for Plan I}) = \$50 \times 0.6 + \$130 \times 0.4 = \$30 + \$52 = \$82$
OR
$E(\text{Cost for Plan I}) = \$50 + 0.4 \times \$80 = \$82$
The expected cost per child when Plan II is used for treatment is:
$E(\text{Cost for Plan II}) = \$80 \times 0.9 + \$130 \times 0.1 = \$72 + \$13 = \$85$
OR
$E(\text{Cost for Plan II}) = \$80 + 0.1 \times \$50 = \$85$
(c) Since the probability that a child will be cured is the same under either plan, some other criterion must be
used to make a recommendation. From a financial point of view, Plan I should be recommended because its
expected cost per child ($82) is less than that of Plan II ($85).
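The tree-diagram reasoning behind parts (a) and (b) can be written out as a small enumeration of branches. This is a sketch for checking the numbers, with the helper name `plan_summary` chosen for illustration; it assumes, as the problem states, that the two antibiotics work independently and that the second drug is given only when the first fails.

```python
def plan_summary(first, second):
    """Return (cure probability, expected cost) for a two-stage plan.

    Each argument is a (cure probability, cost in dollars) pair.
    """
    p1, c1 = first
    p2, c2 = second
    # Branches of the tree: (probability, total cost, cured?)
    branches = [
        (p1, c1, True),                          # first drug works
        ((1 - p1) * p2, c1 + c2, True),          # first fails, second works
        ((1 - p1) * (1 - p2), c1 + c2, False),   # both fail
    ]
    p_cure = sum(p for p, _, cured in branches if cured)
    expected_cost = sum(p * cost for p, cost, _ in branches)
    return p_cure, expected_cost

antibiotic_a = (0.6, 50)
antibiotic_b = (0.9, 80)

plan_1 = plan_summary(antibiotic_a, antibiotic_b)
plan_2 = plan_summary(antibiotic_b, antibiotic_a)
print("Plan I:  P(cure) = %.2f, expected cost = $%.2f" % plan_1)
print("Plan II: P(cure) = %.2f, expected cost = $%.2f" % plan_2)
```

The cure probabilities agree because both plans fail only when both drugs fail; the costs differ because Plan I pays for the more expensive drug only 40 percent of the time.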
Solution – 2004 Question 3
(a) X is not binomial since the trials are not independent due to sampling without replacement. The conditional
probability of selecting a male changes at each trial depending on the previous outcome.
(b) $P(X = 4) = \left(\dfrac{10}{20}\right)\left(\dfrac{9}{19}\right)\left(\dfrac{8}{18}\right)\left(\dfrac{7}{17}\right) \approx 0.043$
(c) No. If males and females were equally represented, the probability of observing four males is small (0.043).
(d) No, we cannot generalize to the population of all brontosaurs because it is not reasonable to regard this
sample as a random sample from the population of all brontosaurs; there is reason to suspect that this sampling
method might cause bias.
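The probability in part (b) can be checked two ways: by multiplying the conditional probabilities along the tree, and by counting combinations. The sketch below also computes what a binomial model would give, to show the effect of sampling without replacement described in part (a); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

males, females, sample_size = 10, 10, 4
total = males + females

# Sequential approach: the probability of a male changes after each selection.
p_sequential = Fraction(1)
for i in range(sample_size):
    p_sequential *= Fraction(males - i, total - i)

# Counting approach: ways to choose 4 males out of 10, over ways to choose any 4 of 20.
p_counting = Fraction(comb(males, sample_size), comb(total, sample_size))

# For comparison only: a binomial model (independent trials with p = 0.5) gives a different value.
p_binomial = Fraction(1, 2) ** sample_size

print("Without replacement:", float(p_sequential))
print("Counting check:     ", float(p_counting))
print("Binomial model:     ", float(p_binomial))
```

The two without-replacement calculations agree at about 0.043, while the binomial model gives 0.0625, so treating the selections as independent would noticeably overstate the probability.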