AP Statistics
AP Review Random Variables
Vocabulary
Random variable
Probability distribution
Discrete random variable
Mean (expected value) of a discrete random variable
Variance of a discrete random variable
Standard deviation of a discrete random variable
Continuous random variable
Linear transformation
Effect on a random variable of multiplying/dividing by a constant
Effect on a random variable of adding/subtracting by a constant
Mean of the sum of random variables
Independent random variables
Variance of the sum of independent random variables
Mean of the difference of random variables
Variance of the difference of independent random variables
Binomial setting
Binomial random variable
Binomial coefficient
Binomial probability
Mean of a binomial random variable
Standard deviation of a binomial random variable
Normal approximation for binomial distributions
Geometric setting
Geometric random variable
Geometric probability
Mean of a geometric random variable
Summary
• A random variable takes numerical values determined by the outcome of a chance process. The
probability distribution of a random variable X tells us what the possible values of X are and how
probabilities are assigned to those values. There are two types of random variables: discrete and
continuous.
• A discrete random variable has a fixed set of possible values with gaps between them. The probability
distribution assigns each of these values a probability between 0 and 1 such that the sum of all the
probabilities is exactly 1. The probability of any event is the sum of the probabilities of all the values that
make up the event.
• A continuous random variable takes all values in some interval of numbers. A density curve describes the
probability distribution of a continuous random variable. The probability of any event is the area under
the curve above the values that make up the event.
• The mean \(\mu_X\) of a random variable X is the balance point of the probability distribution histogram or density
curve. Because the mean is the long-run average value of the variable after many repetitions of the
chance process, it is also known as the expected value of the random variable, \(E(X)\).

If X is a discrete random variable, the mean is the average of the values of X, each weighted by its
probability: \(\mu_X = E(X) = \sum x_i p_i\).
The variance \(\sigma_X^2\) of a random variable is the “average” squared deviation of the values of the variable from
their mean. The standard deviation \(\sigma_X\) is the square root of the variance. The standard deviation
measures the typical distance of the values in the distribution from the mean.

For a discrete random variable X, the variance is \(\sigma_X^2 = \sum (x_i - \mu_X)^2 p_i\) and the standard deviation is
\(\sigma_X = \sqrt{\sum (x_i - \mu_X)^2 p_i}\).
Adding a positive constant a to (subtracting a from) a random variable increases (decreases) the mean of
the random variable by a but does not affect the standard deviation or the shape of its probability
distribution.
Multiplying (dividing) a random variable by a positive constant b multiplies (divides) the mean of the
random variable by b and the standard deviation by b but does not change the shape of the probability
distribution.
A linear transformation of a random variable involves adding or subtracting a constant a, multiplying or
dividing by a constant b, or both. We can write a linear transformation of the random variable X in the
form Y = a + bX. The shape, center, and spread of the probability distribution of Y are as follows:
Shape: same as the probability distribution of X if b > 0.
Center: \(\mu_Y = a + b\mu_X\)
Spread: \(\sigma_Y = |b|\sigma_X\)

If X and Y are any two random variables,
The mean of the sum of two random variables is the sum of their means: \(\mu_{X+Y} = \mu_X + \mu_Y\).
The mean of the difference of two random variables is the difference of their means: \(\mu_{X-Y} = \mu_X - \mu_Y\).

If X and Y are two independent random variables, then knowing the value of one variable tells you nothing
about the value of the other. In that case, variances add:
The variance of the sum of two independent random variables is the sum of their variances: \(\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2\).
The variance of the difference of two independent random variables is also the sum of their variances:
\(\sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2\).



The sum or difference of independent Normal random variables follows a Normal distribution.
A binomial setting consists of n independent trials of the same chance process, each resulting in a success
or a failure, with probability of success p on each trial. (BINS) The count X of successes is a binomial
random variable. Its probability distribution is a binomial distribution.
The binomial coefficient \(\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}\) counts the number of ways k successes can be arranged among n
trials. The factorial of n is \(n! = n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1\) for positive whole numbers n, and 0! = 1.
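As a quick check of the counting formula, here is a minimal Python sketch that computes a binomial coefficient from factorials and compares it with the standard library's `math.comb`. The values n = 10 and k = 3 are illustrative only and do not come from the review sheet.

```python
from math import comb, factorial


def binomial_coefficient(n: int, k: int) -> int:
    """Number of ways to arrange k successes among n trials: n! / (k! (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


# Illustrative values (an assumption, not from the review sheet).
n, k = 10, 3
print(binomial_coefficient(n, k))  # 120
print(comb(n, k))                  # same count from the standard library
```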

If X has the binomial distribution with parameters n and p, the possible values of X are the whole numbers
0, 1, 2, …, n. The binomial probability of observing k successes in n trials is \(P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\).

The mean and standard deviation of a binomial random variable X are \(\mu_X = np\) and \(\sigma_X = \sqrt{np(1-p)}\).


The binomial distribution with n trials and probability p of success gives a good approximation to the distribution of the count
of successes in an SRS of size n from a large population containing proportion p of successes. This is true as
long as the sample size n is no more than 10% of the population size N.
The Normal approximation to the binomial distribution says that if X is a count of successes having the
binomial distribution with parameters n and p, then when n is large, X is approximately Normally
distributed with mean np and standard deviation \(\sqrt{np(1-p)}\). We will use this approximation when
\(np \ge 10\) and \(n(1-p) \ge 10\).


A geometric setting consists of repeated trials of the same chance process in which the probability p of
success is the same on each trial, and the goal is to count the number of trials it takes to get one success.
If Y = the number of trials required to obtain the first success, then Y is a geometric random variable. Its
probability distribution is called a geometric distribution.
If Y has the geometric distribution with probability of success p, the possible values of Y are the positive
integers 1, 2, 3, …. The geometric probability that Y takes any value k is \(P(Y = k) = (1-p)^{k-1} p\).

The mean (expected value) of a geometric random variable Y is \(\mu_Y = \dfrac{1}{p}\).
Problems
1. Consider two 4-sided dice, each having sides labeled 1, 2, 3, 4. Let X = the sum of the numbers that appear
after a roll of the dice.
A. Is X a discrete or a continuous random variable? Sketch the probability distribution of X. Describe
what you see.
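Sum of the two dice for each of the 16 possible outcomes:

         Die 2
Die 1    1   2   3   4
  1      2   3   4   5
  2      3   4   5   6
  3      4   5   6   7
  4      5   6   7   8

[Figure: histogram of the probability distribution of X, with values 1 to 8 on the horizontal axis and probabilities from 0 to 0.3 on the vertical axis.]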
X is a discrete random variable. The distribution is symmetric with a single peak at 5: we are most likely to roll a sum of 5 and least likely to roll a sum of 2 or 8.
B. If someone rolled the dice 10 times and got a sum less than 3 each time, would you be surprised? Why
or why not?
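Yes, we should be surprised. A sum of 3 or less has probability 3/16, so in 10 rolls we would expect to see one only about once or twice. A sum less than 3 (that is, a sum of 2) is rarer still, with probability 1/16.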
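The distribution in Problem 1 can be checked by listing all 16 outcomes. The sketch below assumes both dice are fair, so every (die 1, die 2) pair is equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 16 equally likely (die 1, die 2) outcomes for two fair 4-sided dice.
sides = range(1, 5)
counts = Counter(a + b for a, b in product(sides, repeat=2))

# Probability distribution of X = sum of the two dice.
dist = {total: Fraction(n, 16) for total, n in sorted(counts.items())}
for total, prob in dist.items():
    print(total, prob)

# Probability of a sum of 3 or less, and the expected number of such sums in 10 rolls.
p_three_or_less = dist[2] + dist[3]
print(p_three_or_less, float(p_three_or_less) * 10)  # 3/16 1.875
```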
2. Suppose the random variable Y = number of goals in a randomly selected high school hockey game has the
following probability distribution:
Goals:        0      1      2      3      4
Probability:  0.155  0.195  0.243  0.233  0.174
Sketch the probability distribution. Then calculate the mean and standard deviation of Y and interpret
them in the context of the situation.
\(E(Y) = 0(0.155) + 1(0.195) + 2(0.243) + 3(0.233) + 4(0.174) = 2.076\)
In the long run, we’d expect to see an average of 2.076 goals per game over many, many games.
\(\mathrm{Var}(Y) = (0 - 2.076)^2(0.155) + (1 - 2.076)^2(0.195) + (2 - 2.076)^2(0.243) + (3 - 2.076)^2(0.233) + (4 - 2.076)^2(0.174) = 1.7382\)
Standard deviation \(= \sqrt{1.7382} = 1.3184\).
The number of goals in a game typically varies from the mean of 2.076 by about 1.3184 goals.
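The hand calculation for Problem 2 can be reproduced with a short Python sketch. The helper name `summarize` is hypothetical; it applies the weighted-average definitions of the mean and variance to the goal distribution given in the problem.

```python
from math import sqrt


def summarize(values, probs):
    """Return (mean, variance, standard deviation) of a discrete random variable."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, variance, sqrt(variance)


goals = [0, 1, 2, 3, 4]
probs = [0.155, 0.195, 0.243, 0.233, 0.174]

mean, variance, sd = summarize(goals, probs)
print(round(mean, 3), round(variance, 4), round(sd, 4))  # 2.076 1.7382 1.3184
```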
3. The weights of toddler boys follow an approximately Normal distribution with mean 34 pounds and
standard deviation 3.5 pounds. Suppose you randomly choose one toddler boy and record his weight.
What is the probability that the randomly selected boy weighs less than 31 pounds?
\(z = \dfrac{31 - 34}{3.5} = -0.8571\)
\(P(z < -0.8571) = 0.1956\)
You can also calculate this probability using the normalcdf function on your calculator (low = -999, high = 31,
mean = 34, standard deviation = 3.5).
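For readers without a graphing calculator, the same area can be found with Python's standard-library `statistics.NormalDist`. This is only a sketch of an equivalent calculation; the result should land within rounding of the value in the key.

```python
from statistics import NormalDist

# Toddler weights: approximately Normal with mean 34 lb and SD 3.5 lb.
weights = NormalDist(mu=34, sigma=3.5)

z = (31 - weights.mean) / weights.stdev   # standardized value of 31 lb
p_less_than_31 = weights.cdf(31)          # area to the left of 31 lb

print(round(z, 4), round(p_less_than_31, 4))
```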
4. A carnival game involves tossing a ball into numbered baskets with the goal of having your ball land in a
high-numbered basket. The probability distribution of X = value of the basket on a randomly selected toss is shown below.
Value:        0    1    2    3
Probability:  0.3  0.4  0.2  0.1
The expected value of X is 1.1 and its standard deviation is 0.943.
Suppose it costs $2 to play and you earn $1.50 for each point earned on your toss. That is, if you land in a
basket labeled “2,” you earn $3.00.
Define Y to be the amount of profit you make on a randomly selected toss. Describe the shape, center,
and spread of the probability distribution of Y in the context of the situation.
The shape will be slightly skewed to the right, the same as the distribution of X. \(E(Y) = 1.5\,E(X) - 2 = 1.5(1.1) - 2 = -0.35\). In the long run, we would
expect to lose $0.35 each time we play the game, on average.
\(\mathrm{StdDev}(Y) = 1.5(0.943) = 1.4145\). Our profit on a toss will typically vary by about $1.41 around a
loss of $0.35.
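One way to check the linear-transformation rules in Problem 4 is to build the distribution of the profit Y = -2 + 1.5X directly and compare its mean and standard deviation with the shortcut formulas. The sketch below does that; the helper `mean_sd` is a hypothetical name.

```python
from math import sqrt

values = [0, 1, 2, 3]
probs = [0.3, 0.4, 0.2, 0.1]


def mean_sd(xs, ps):
    """Mean and standard deviation of a discrete random variable."""
    mean = sum(x * p for x, p in zip(xs, ps))
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))
    return mean, sqrt(var)


# Profit in dollars: Y = a + bX with a = -2 (cost to play) and b = 1.5 (payout per point).
a, b = -2.0, 1.5
profits = [a + b * x for x in values]

mu_x, sigma_x = mean_sd(values, probs)
mu_y, sigma_y = mean_sd(profits, probs)

print(round(mu_x, 2), round(sigma_x, 3))  # 1.1 0.943
print(round(mu_y, 2), round(sigma_y, 3))  # -0.35 1.415
# Shortcut formulas give the same center and spread.
print(round(a + b * mu_x, 2), round(abs(b) * sigma_x, 3))  # -0.35 1.415
```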
5. Students in Mr. Costello’s class are expected to check their homework in groups of 4 at the beginning of
class each day. Students must check it as quickly as possible, one at a time. The means and standard
deviations of the time it takes to check homework for the 4 students in one group are noted. Assume their
times are independent.
Student  Mean     Standard deviation
Alan     1.4 min  0.1 min
Barb     1.2 min  0.4 min
Corey    0.9 min  0.8 min
Doug     1.0 min  0.7 min
A. If each student checks one after the other, what are the mean and standard deviation of the total time
necessary for these four students to check their homework on a randomly chosen day?
Mean \(= 1.4 + 1.2 + 0.9 + 1.0 = 4.5\) min.
StdDev \(= \sqrt{0.1^2 + 0.4^2 + 0.8^2 + 0.7^2} = \sqrt{1.3} = 1.14\) min.
B. Suppose Alan and Doug like to race to see who can check their homework faster. What are the mean
and standard deviation for the difference between their times (Doug – Alan)? Interpret these values in
the context of the situation.
Mean \(= 1.0 - 1.4 = -0.4\) min. On average, Doug is faster by 0.4 min.
StdDev \(= \sqrt{0.7^2 + 0.1^2} = 0.7071\) min. The difference between Doug’s and Alan’s times will typically vary by
about 0.7071 min around the mean difference of -0.4 min.
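Both parts of Problem 5 follow the same pattern: means add (or subtract), while variances of independent variables always add. A minimal Python sketch using the table values, with illustrative variable names:

```python
from math import sqrt

# (mean, standard deviation) in minutes for each student, from the table in the problem.
times = {
    "Alan": (1.4, 0.1),
    "Barb": (1.2, 0.4),
    "Corey": (0.9, 0.8),
    "Doug": (1.0, 0.7),
}

# Part A: total time for all four students (times assumed independent).
total_mean = sum(m for m, _ in times.values())
total_sd = sqrt(sum(s ** 2 for _, s in times.values()))

# Part B: difference Doug - Alan. Means subtract, but variances still add.
diff_mean = times["Doug"][0] - times["Alan"][0]
diff_sd = sqrt(times["Doug"][1] ** 2 + times["Alan"][1] ** 2)

print(round(total_mean, 2), round(total_sd, 2))  # 4.5 1.14
print(round(diff_mean, 2), round(diff_sd, 4))    # -0.4 0.7071
```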
6. Mr. Molesky and Mr. Liberty are avid video game golfers. Both like to compare times to complete a
particular course on their favorite game. Mr. Molesky’s times are Normally distributed with a mean of 110
minutes and standard deviation of 10 minutes. Mr. Liberty’s times are Normally distributed with mean
100 minutes and standard deviation 8 minutes.
A. Find the mean and standard deviation of the difference of their times (Molesky – Liberty). Assume
their times are independent.
MeanM  L  110  100  10
StdDev  M  L   102  82  164  12.81
B. Find the probability that Mr. Molesky will finish his game before Mr. Liberty on any given day.
\(P(M - L < 0) = P\left(z < \dfrac{0 - 10}{12.81}\right) = P(z < -0.78) = 0.2177\)
There is about a 21.77% chance Mr. Molesky will finish before Mr. Liberty on any given day.
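Problem 6 can also be checked numerically. The sketch below builds the Normal distribution of the difference M - L from the stated means and standard deviations, assuming independence as the problem does. Because it does not round z to two decimal places, its probability may differ slightly from the table-based answer.

```python
from math import sqrt
from statistics import NormalDist

molesky = NormalDist(mu=110, sigma=10)
liberty = NormalDist(mu=100, sigma=8)

# Difference of independent Normal variables: means subtract, variances add.
diff_mean = molesky.mean - liberty.mean
diff_sd = sqrt(molesky.variance + liberty.variance)
difference = NormalDist(diff_mean, diff_sd)

# Mr. Molesky finishes first when M - L < 0.
p_molesky_first = difference.cdf(0)

print(diff_mean, round(diff_sd, 2))  # 10.0 12.81
print(round(p_molesky_first, 3))
```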
7. Recall that there are 4 suits – spades, hearts, clubs, and diamonds – in a standard deck of playing cards.
Suppose you play a game in which you draw a card, record the suit, replace it, shuffle, and repeat until you
have observed 10 cards. Define X = number of hearts observed.
A. Show that X is a binomial random variable
B: A card is either a heart or it isn’t
I: Each draw is independent since cards are replaced and the deck is shuffled
N: There are 10 observations in each game
S: The P(heart) = 0.25 in each draw.
B. Find the probability of observing fewer than 4 hearts in this game.
\(P(X < 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.7759\)
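The sum in Problem 7B is what a calculator's binomial cdf function computes. As a sketch, the same total can be built from the binomial probability formula in Python; `binom_pmf` is a hypothetical helper name.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.25  # 10 draws with replacement, P(heart) = 0.25

# Fewer than 4 hearts means 0, 1, 2, or 3 hearts.
p_fewer_than_4 = sum(binom_pmf(k, n, p) for k in range(4))
print(round(p_fewer_than_4, 4))  # 0.7759
```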
8. Suppose 72% of students in the U.S. would give their teachers a positive rating if asked to score their
effectiveness. A survey is conducted in which 500 students are randomly selected and asked to rate their
teachers. Let X = the number of students in the sample who would give their teachers a positive rating.
A. Show that X is approximately a binomial random variable.
B: Students either give a positive or negative rating
I: Since there are more than 10(500) students in the population, we can assume independence
N: 500 students were selected
S: P(positive rating) = 0.72 for each student.
B. Use a Normal approximation to find the probability that 400 or more students would give their teacher
a positive rating in this sample.
Mean \(= np = 500(0.72) = 360\)
StdDev \(= \sqrt{np(1-p)} = \sqrt{500(0.72)(0.28)} = 10.04\)
\(P(X \ge 400) = P\left(z \ge \dfrac{400 - 360}{10.04}\right) = P(z \ge 3.98) = 0.000034\)
9. Suppose 20% of Super Crunch cereal boxes contain a secret decoder ring. Let X = the number of boxes of
Super Crunch that must be opened until a ring is found.
A. Show that X is a geometric random variable.
There are two outcomes (ring or no ring). Each box is independent. The probability of a ring in any
given box is 0.2. We are interested in how long it will take to find a ring.
B. Find the probability that you will have to open 7 boxes to find a ring.
\(P(X = 7) = (0.8)^6(0.2) = 0.0524\)
C. Find the probability that it will take fewer than 4 boxes to find a ring.
\(P(X < 4) = P(X = 1) + P(X = 2) + P(X = 3) = 0.2 + (0.8)(0.2) + (0.8)^2(0.2) = 0.488\)
D. How many boxes would you expect to have to open to find a ring?
\(E(X) = \dfrac{1}{0.2} = 5\) boxes
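The three geometric calculations in Problem 9 can be reproduced with a few lines of Python. This is a sketch under the problem's assumption that each box independently contains a ring with probability 0.2; `geom_pmf` is a hypothetical helper name.

```python
p = 0.2  # probability that any given box contains a ring


def geom_pmf(k: int, p: float) -> float:
    """P(X = k): k - 1 failures followed by the first success on trial k."""
    return (1 - p) ** (k - 1) * p


p_exactly_7 = geom_pmf(7, p)                             # part B
p_fewer_than_4 = sum(geom_pmf(k, p) for k in range(1, 4))  # part C: X = 1, 2, or 3
expected_boxes = 1 / p                                   # part D

print(round(p_exactly_7, 4), round(p_fewer_than_4, 3), expected_boxes)  # 0.0524 0.488 5.0
```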