YMS Chapter 7 Random Variables
Q1. A random variable is a variable whose value is a ________ of a random
phenomenon.
A1. Numerical outcome
Q2. A random variable with a countable number of possible values is a _____ random
variable.
A2. Discrete.
Q3. What is a probability distribution of a discrete random variable?
A3. A list of the values the variable can take on, and the probability for each value.
Q4. For the probability distribution of a discrete random variable, every probability is
between ___ and ___, and the sum of all the probabilities is equal to ___.
A4. 0 and 1, 1
Q5. In a probability histogram, what quantity do the horizontal and vertical axes
represent, respectively?
A5. The horizontal axis represents the possible values the random variable can take on,
and the vertical axis represents the probability of that value.
Q6. A continuous random variable can take on how many values for a certain interval in
its domain?
A6. An infinite number
Q7. A continuous random variable’s probability distribution is described by a graph
called the ___.
A7. Density curve, or probability density curve. (This is the graph of the probability
density function, or pdf.)
Q8. Events, for continuous random variables, are described by the random variable’s
taking on a value within a certain interval. The probability of that event is represented by
what aspect of the density curve?
A8. The area under the curve, between the two points that bound the interval, or the area
under the curve, over the values (on the x-axis) that make up the event.
Q9. Suppose you have a continuous random variable X. What is the probability that
X=10?
A9. Zero. Continuous probability distributions assign probability 0 to every individual
outcome.
Q10. In a continuous probability distribution, what is the relationship between the
probability that X<10 and the probability that X<=10?
A10. The two are equal, because the probability that X=10 is 0.
Q11. True or false: the normal distribution is an example of a continuous probability
distribution.
A11. True.
Q12. The mean of a discrete random variable is the sum of the products of all the possible
values and the __________.
A12. Probabilities of those values.
Q13. Suppose there are two possible outcomes for a certain random variable, 0 and 100.
The probability of getting 0 is .99 and the probability of getting 100 is .01. What is the
mean of the random variable?
A13. 1.
Q14. The mean of a random variable is often called the e_____ v_____ of the variable.
A14. Expected value
Q15. Someone is offered a gambling game where there is a .25 chance of her losing
$100, and a .75 chance of her winning $60. If she plays many times, what would her
average winnings be?
A15. $20. (Because this number is positive, it is unlike any gambling games anyone is
likely to be offered.)
Q16. Someone is invited to send in a contest entry in which the chances are 1 in 50
million of winning a million dollars, and one in a million of winning a thousand dollars.
How do the expected earnings (or the mean earnings) from this contest compare with the
price of a first-class stamp?
A16. The expected earnings are (1/50 million)*($1 million) + (1/1 million)*($1000) + (the
rest of the probability)*($0). This comes out to one fiftieth of a dollar plus one thousandth
of a dollar, or 2.1 cents. At the time of this writing, a first-class stamp costs 37 cents. So
the cost is 34.9 cents more than the expected earnings.
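The expected-value arithmetic in A13, A15, and A16 can be checked with a short Python sketch. This is an illustration rather than part of the original study guide: the helper name expected_value is made up here, and exact fractions are used so that rounding does not blur the results.

```python
from fractions import Fraction

def expected_value(distribution):
    """Mean of a discrete random variable: sum of value * probability.

    `distribution` is a list of (value, probability) pairs.
    """
    total = sum(p for _, p in distribution)
    if total != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in distribution)

# A13: outcomes 0 and 100 with probabilities .99 and .01
mean_a13 = expected_value([(0, Fraction(99, 100)), (100, Fraction(1, 100))])

# A15: lose $100 with probability .25, win $60 with probability .75
mean_a15 = expected_value([(-100, Fraction(1, 4)), (60, Fraction(3, 4))])

# A16: contest prizes in dollars; the rest of the probability pays $0
p_million = Fraction(1, 50_000_000)
p_thousand = Fraction(1, 1_000_000)
contest = [
    (1_000_000, p_million),
    (1_000, p_thousand),
    (0, 1 - p_million - p_thousand),
]
contest_cents = expected_value(contest) * 100

print(mean_a13, mean_a15, float(contest_cents))
```

The three results should agree with the answers above: 1, $20, and 2.1 cents.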
Q17. The mean of symmetric continuous probability distributions lies at the ____ of the
curves.
A17. Center
Q18. The variance of a discrete random variable is the sum of the products of the squared
deviation of each possible value from the mean of the distribution and the _____ for that
value.
A18. Probability
Q19. Suppose there is a distribution with possible values 0, 1, and 2, each with
probability 1/3. What is the variance, i.e. the sigma-squared, of this distribution? (This is
also known as the variance of the population.)
A19. 1*1/3 +0*1/3 +1*1/3, or 2/3.
Q20. Think back to the definition of the variance of a sample. Suppose you had a sample
consisting of 0, 1, and 2, with mean 1. Is the variance of this sample the same as the
variance of the population?
A20. No. The variance of the sample is the sum of the squared deviations over n-1. So the
variance of the sample would be (1+0+1)/(3-1) or 1, rather than 2/3.
Q21. Please take a few seconds to enter 0, 1, and 2 in a list on your calculator. (On the
TI-83 or 84, stat>edit.) Then please compute 1-variable stats on these (stat>calc>1-Var Stats,
listname). Look at the sx and sigma x results. What are they, and why do they make sense
vis-a-vis the different definitions of population and sample variance and standard deviation?
A21. They are 1 and .81650, respectively, and they do make sense because they are the
square roots of 1 and 2/3, respectively.
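For readers without a TI calculator, the same comparison can be sketched with Python's standard statistics module, which offers separate population and sample versions of each function. This is an aside for checking A19 through A21, not the method the study guide itself uses.

```python
import statistics

data = [0, 1, 2]

# Population formulas divide the summed squared deviations by n.
population_variance = statistics.pvariance(data)
sigma_x = statistics.pstdev(data)

# Sample formulas divide by n - 1.
sample_variance = statistics.variance(data)
s_x = statistics.stdev(data)

print(population_variance, sample_variance)
print(round(sigma_x, 5), round(s_x, 5))
```

Here sigma_x plays the role of the calculator's sigma x and s_x the role of sx.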
Q22. What is the law of large numbers, in your own words?
A22. One way of putting it is that as the sample size approaches infinity, the sample
mean approaches the population mean. Another is that you can make the sample mean
get as close as you want to the population mean by getting a large enough sample.
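The law of large numbers can be watched in action with a small simulation. The sketch below assumes a fair six-sided die (population mean 3.5) as the random phenomenon; the function name and the seed are arbitrary choices made for this illustration.

```python
import random

def mean_of_die_rolls(n, rng):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

rng = random.Random(2024)  # arbitrary seed so that reruns match
population_mean = 3.5

# Print the sample mean and its distance from 3.5 for growing n.
for n in (10, 1_000, 100_000):
    sample_mean = mean_of_die_rolls(n, rng)
    print(n, sample_mean, abs(sample_mean - population_mean))

large_sample_mean = mean_of_die_rolls(200_000, rng)
```

The printed distances typically shrink as n grows, although any single run can wobble, which is why the law is a statement about the long run.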
Q23. True or false: If by chance, you flip a coin and get 10 heads in a row, the law of
large numbers tells us that if we flip many more times, we will get just a tiny bit under
50% heads in the remaining tosses, to compensate for the first 10 heads and make the
long-range probability equal 50%.
A23. False. The definition of independent trials implies that the coin “doesn’t remember”
the first 10 flips and the subsequent results are not influenced by the initial ones.
Q24. True or false. When you are picking a “large number” so as to make the sample
mean get within a certain distance of the population mean, you need a larger number the
greater the variability of the random outcomes.
A24. True. In other words, with very small variability, the sample mean gives a close
approximation to the population mean with much smaller sample size than with very
great variability.
Q25. True or false: the mean of a linear function of a random variable is that same linear
function of the mean of the random variable. In other words, the mean of a + bX is
a + b*(the mean of X).
A25. True.
Q26. The mean of the sum of two random variables equals what?
A26. The sum of the means of the two variables.
Q27. If the mean amount that Linda makes at her lemonade stand per day is $10 and the
mean amount her brother Tom makes is $9, what’s the average of their total daily
receipts?
A27. $19.
Q28. Suppose someone tells you that the standard deviation of single scores on the SAT
is 100 points. Suppose that there are two people who take the SAT independently of one
another. How would you find the sd of the sum of their scores?
A28. You’d first square the standard deviations to get the variances of their individual
scores: 10000 apiece. Then you add those variances, to get the total variance, 20000. Then you
take the square root of that to get the sd of the sum, 100*the square root of 2, or 141.42.
Q29. If the standard deviation of the SAT math and critical reading are both about 100, is
the standard deviation of the sum of these two scores for an individual more or less than
141.42? Why?
A29. More, because the two scores are positively correlated with each other. The variance
of the sum is the sum of the variances plus twice the correlation between the variables times
the product of their standard deviations, so a positive correlation pushes the variance
above 20000.
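The rule in A28 and A29 can be written as one small Python function. This is a sketch: the name sd_of_sum is invented for the example, and the correlation of 0.7 is a purely hypothetical value chosen to show the direction of the effect, not a measured SAT figure.

```python
import math

def sd_of_sum(sd_x, sd_y, correlation=0.0):
    """Standard deviation of X + Y.

    Var(X + Y) = Var(X) + Var(Y) + 2 * correlation * sd_x * sd_y
    """
    variance = sd_x ** 2 + sd_y ** 2 + 2 * correlation * sd_x * sd_y
    return math.sqrt(variance)

# A28: two independent test takers, so the correlation is 0
independent_sd = sd_of_sum(100, 100)

# A29: one person's two section scores; 0.7 is a hypothetical correlation
correlated_sd = sd_of_sum(100, 100, correlation=0.7)

print(round(independent_sd, 2), round(correlated_sd, 2))
```

Any positive correlation gives a result above 141.42, and a negative correlation would give a result below it.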
Q30. Can you give an intuitive explanation for why the variance of the sum of two
random variables is increased, the more highly they are correlated with each other?
A30. The variance is increased the more extreme values you have. If the two variables
are independent, then a high value of one variable tends to be balanced by a lower value
of the other. For example, if you get a 6 on rolling one die, you’ll on the average get a
sum of 9.5 once you add in the value of the other die – not a sum of 12. The same thing
goes in the other direction: low values of one variable are on the average, balanced by
higher values of the other. But when the values are highly correlated, then high values
of one variable predict high values of the other, and you get sums that are higher, and low
values of one variable predict low values of the other, and you get sums that are lower.
So with sums that are higher and lower, the variance of the sum is greater with correlated
variables.
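The dice example can be made concrete by listing outcomes. The Python sketch below compares two extremes: two independent dice, and a perfectly correlated pair modeled (as an assumption for illustration) by counting the same die twice. The helper name variance is local to this example.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def variance(values):
    """Population variance of a list of equally likely values."""
    values = list(values)
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n

# Independent dice: all 36 (first, second) pairs are equally likely.
independent_sums = [a + b for a, b in product(faces, repeat=2)]

# Perfectly correlated "dice": the second value always equals the first.
correlated_sums = [a + a for a in faces]

# Average sum when the first die shows 6 and the second is independent.
mean_sum_given_six = Fraction(sum(6 + b for b in faces), 6)

print(variance(independent_sums), variance(correlated_sums), mean_sum_given_six)
```

The variance of the sum doubles (from 35/6 to 35/3) when the two values move together, and the average sum after a 6 on an independent first die is 19/2, the 9.5 mentioned in A30.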
Q31. A linear combination of two independent normally distributed random variables is
distributed how?
A31. Normally.
YMS Chapter 8 The Binomial and Geometric Distributions
Q1. Suppose someone looks at the numbers of 1’s, 2’s, 3’s, 4’s, 5’s, and 6’s that result
from 600 die rolls. Is this situation an example of the “binomial setting”?
A1. Almost, but not quite. For there to be a binomial setting, you have to have each
observation fall into only two categories, rather than the 6 categories described here.
However, if you defined a 1 as a “success” and anything else as a “failure,” then you
would have a binomial setting, and you could then do the same thing separately with 2, 3,
4, 5, and 6.
Q2. What are the four requirements for the binomial setting?
A2. 1. Two categories 2. Fixed n of observations 3. Independence 4. p of success same
for all observations.
Q3. The distribution of the number of successes out of n trials (with probability of
success p on each trial) is the ______ _______.
A3. binomial distribution
Q4. If someone has 51 socks in a drawer, with 1/3 red and 2/3 black, and the person grabs
a handful of 5 of them, and counts the number of black, will the results of such a trial
follow the binomial distribution? Why or why not?
A4. Not quite, because grabbing a handful of 5 is equivalent to sampling without
replacement. The probability of a black’s being included in the handful is altered some
depending on what other socks are also in the handful. If you picked a sock one at a time,
replaced the sock and mixed them thoroughly, and then picked again, the binomial
distribution would apply.
Q5. Suppose you roll a die 1000 times and count “1” as success and “anything else” as
failure. Is this an example of the binomial setting, and does the count have the binomial
distribution? Why or why not?
A5. Yes and yes, because the conditions of 2 categories, fixed n of observations,
independence, and p of success constant all hold.
Q6. In Chapter 1, the word distribution was defined as what values the variable takes and
how often it takes these values. Let’s say you roll a die 1000 times and count the number
of 1’s. The count of successes comes out to 165. Someone asks you, “What does this
have to do with a distribution? We just got one number from this experiment. How’s
anybody going to plot a histogram or any other representation of a distribution with this?”
What would be your answer?
A6. The number of 1’s that you would get in such an experiment is a random variable. If
you did the experiment many times, you would get a distribution that could be plotted
with a histogram, and which would take the approximate shape of the theoretically
derived binomial distribution for this situation.
Q7. When we say a certain random variable has a B(100, .7) distribution, what do we
mean?
A7. That there is a binomial distribution with 100 observations and probability of success
on each observation .7.
Q8. If there is a discrete random variable (such as a binomial), and you want to find the
probability of any given value of X, what function do you use – the cumulative
distribution function or the probability distribution function? (cdf or pdf?)
A8. pdf
Q9. Suppose you want to know the probability that a binomial random variable B(100,
.7) takes on a value less than or equal to 60. One way would be to use the binomial pdf
for values 0, 1, .. 60 and then add them all up. A much less laborious way would be to do
what?
A9. Use the binomial cdf function. On the TI-83 or TI-84 the command would be
binomcdf(100, .7, 60).
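To see what the calculator is doing, here is a minimal Python sketch of the two functions. The names binom_pmf and binom_cdf are chosen for this example and only loosely mirror the calculator's binompdf and binomcdf; the cdf really is just the pdf values for 0 through k added up.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for a B(n, p) random variable."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(X <= k): add the probabilities of 0, 1, ..., k successes."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# Same arguments as binomcdf(100, .7, 60)
print(binom_cdf(100, 0.7, 60))
```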
Q10. Suppose you roll a die six times, and you want the probability of getting exactly 3
1’s. What would be the appropriate expression of the binomial formula that would give
the answer to this?
A10. (6 choose 3) * (1/6)^3 * (5/6)^3.
Q11. Can you please explain why the binomial probability formula is as it is, using this
example of rolling a die six times? Please give an explanation for why each of the three
factors is what it is.
A11. You have 6 independent rolls. The probability of getting 1 on the first three and
something other than 1 on the next three is (1/6)^3 * (5/6)^3 because of the multiplication
rule for independent events. The number of different ways you can get three ones is the
number of ways you can select 3 rolls to be ones out of the 6 different rolls: i.e. rolls 1, 2,
and 3, rolls 1, 2, and 4, etc. This is the number of combinations of six things taken 3 at a
time, or 6 choose 3. Because each of these ways of getting 3 1’s is mutually exclusive of
the others, you can use the addition rule to add the probability of each of the 6 choose 3
ways of getting the outcome of 3 1’s, and since these probabilities are all the same, a
shorter way is to multiply the probability of any one of them times 6 choose 3.
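The formula can be checked against brute force. The Python sketch below evaluates (6 choose 3) * (1/6)^3 * (5/6)^3 exactly, then lists every possible sequence of six rolls and counts those with exactly three 1's. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Binomial formula: (6 choose 3) * (1/6)^3 * (5/6)^3
formula = comb(6, 3) * Fraction(1, 6) ** 3 * Fraction(5, 6) ** 3

# Brute force: all 6^6 equally likely sequences of six rolls,
# keeping those that contain exactly three 1's.
outcomes = list(product(range(1, 7), repeat=6))
favorable = sum(1 for rolls in outcomes if rolls.count(1) == 3)
brute_force = Fraction(favorable, len(outcomes))

print(formula, brute_force, float(formula))
```

The count of favorable sequences is 20 arrangements times 5^3 choices for the three non-1 rolls, which is where the three factors of the formula come from.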
Q12. Can you please explain why, to obtain the binomial coefficient, you use the number
of combinations rather than the number of permutations, in calculating n choose k?
A12. In our die roll example, using the number of permutations would for example count
the event of getting 1’s on roll number 1, 3, and 5 as a different event from getting 1’s on
rolls number 5, 3, and 1. Since these events are the same, order does not make a
difference in enumerating subsets, and therefore you want combinations rather than
permutations.
Q13. In chapter 7 we learned that means and variances are additive when you want to
know the mean and variance of sums of independent random variables. How are these
facts crucial in figuring out the formulas for the mean and variance of a binomial random
variable?
A13. In the case of both the mean and the variance, we consider a certain random
variable to be the outcome of any one trial, giving a success numerical value 1 and failure
value 0. We calculate the mean and variance of this variable from the defining formulas.
Then we define another random variable to be the sum of n of these random variables,
which is the number of successes in n trials. We use the additive properties of the mean
and the variance that we learned in chapter 7 to move us from an expression for the mean
and variance of any one trial, to the mean and variance of the sum of n trials.
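Written out, the argument in A13 looks like the following. This is one standard way to present the derivation, with X_i denoting the outcome of trial i (a symbol introduced here for convenience) and q = 1 - p.

```latex
% One trial: X_i = 1 for a success (probability p), 0 for a failure (probability q)
\begin{align*}
\mu_{X_i}      &= 0 \cdot q + 1 \cdot p = p \\
\sigma^2_{X_i} &= (0 - p)^2 q + (1 - p)^2 p = p^2 q + q^2 p = pq(p + q) = pq
\end{align*}

% Number of successes in n independent trials: X = X_1 + X_2 + \dots + X_n
\begin{align*}
\mu_X      &= \mu_{X_1} + \dots + \mu_{X_n} = np \\
\sigma^2_X &= \sigma^2_{X_1} + \dots + \sigma^2_{X_n} = npq
\end{align*}
```

Adding the means works for any random variables; adding the variances relies on the trials being independent.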
Q14. What are the formulas for the mean, variance, and standard deviation of a binomial
random variable in terms of n and p, and, if you want, q (or 1-p).
A14. The mean mu=np. The variance sigma squared = npq. The standard deviation is
the square root of npq or (np(1-p))^.5.
Q15. When n is “large,” the binomial distribution with n trials and success probability p
can be approximated by what?
A15. The normal distribution with mean np and standard deviation (npq)^.5.
Q16. As a rule of thumb, the normal distribution may be used as an approximation to the
binomial when both np and nq (expected successes, expected failures) equal or exceed
what number?
A16. 10
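The formulas in A14 and the rule of thumb in A16 fit in a few lines of Python. This is a sketch with an invented function name, using the B(100, .7) distribution from earlier in the chapter and a second, hypothetical B(20, .1) case in which the rule of thumb fails.

```python
import math

def binomial_summary(n, p):
    """Return (mean, variance, sd, normal_ok) for a B(n, p) variable.

    normal_ok applies the rule of thumb: np >= 10 and nq >= 10.
    """
    q = 1 - p
    mean = n * p
    variance = n * p * q
    sd = math.sqrt(variance)
    normal_ok = n * p >= 10 and n * q >= 10
    return mean, variance, sd, normal_ok

mean, variance, sd, normal_ok = binomial_summary(100, 0.7)
print(mean, variance, round(sd, 4), normal_ok)

# Hypothetical small case: only 2 expected successes, so the rule fails.
print(binomial_summary(20, 0.1)[3])
```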
Q17. Please describe how to have your calculator simulate a binomial experiment. What
are the keys that you press?
A17. For the TI-83 or 84, you press math>prb>randBin( and then enter 1, p, and n in
parentheses. You press the sto> key to store the results, and 2nd L1 to get the results
stored in L1; then you hit the colon, which is done by alpha and then the decimal point
key; then you do the sum function, which is under list>math>. Then in parentheses you
enter L1. So the command is randBin(1, p, n)->L1:sum(L1). What this does is to
generate n numbers that are either 0 or 1, with a p probability of being 1, and adds them
to get the number of successes in the n trials, and displays that number. Each time you
press the enter key, this will be repeated. So you take your number 2 pencil and write
down each of these numbers, and then you count how many of them had the result you
were looking for, and you express that result as a fraction of the number of trials you did.
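The same simulation can be sketched in Python, which also removes the pencil-and-paper tallying. The function rand_bin below is a rough stand-in written for this example, not the calculator's own routine; the values p = .7 and n = 100, the seed, and the event "60 or fewer successes" are assumptions picked to match the B(100, .7) example earlier.

```python
import random

def rand_bin(num_trials, p, num_simulations, rng):
    """Return num_simulations counts, each the number of successes
    in num_trials independent trials with success probability p."""
    return [
        sum(1 for _ in range(num_trials) if rng.random() < p)
        for _ in range(num_simulations)
    ]

rng = random.Random(7)  # arbitrary seed so that reruns match
p, n = 0.7, 100

# Mirrors randBin(1, p, n)->L1:sum(L1): n values of 0 or 1, then their sum.
L1 = rand_bin(1, p, n, rng)
successes = sum(L1)

# Repeat the experiment many times instead of pressing enter by hand,
# then express the result of interest as a fraction of the repetitions.
repeats = 2_000
counts = [sum(rand_bin(1, p, n, rng)) for _ in range(repeats)]
estimate = sum(1 for c in counts if c <= 60) / repeats

print(successes, estimate)
```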
Q18. For a binomial setting, the number of trials is fixed, and the random variable is the
number of successes in those trials. For a geometric setting, the random variable is the
number of ____ necessary to achieve the first _____.
A18. trials, success
Q19. True or false: in the geometric setting, as in the binomial setting, you have 1. two
categories, 2. with the same probability for each observation, and 3. independent
observations.
A19. True.
Q20. In a geometric setting, with probability of success p, what is the probability that the
first success will occur on the nth trial?
A20. P(X=n) = (1-p)^(n-1) * p
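As a quick illustration, the formula says: n - 1 failures in a row, then one success. The Python sketch below uses the die example with p = 1/6 and exact fractions; the function name geometric_pmf is an assumption of this example.

```python
from fractions import Fraction

def geometric_pmf(p, n):
    """P(X = n): n - 1 failures followed by one success."""
    if n < 1:
        raise ValueError("n counts trials, so it must be at least 1")
    return (1 - p) ** (n - 1) * p

p = Fraction(1, 6)  # chance of rolling a 1 on a fair die
first_five = [geometric_pmf(p, n) for n in range(1, 6)]

for n, prob in enumerate(first_five, start=1):
    print(n, prob)
```

Each probability in the list is 5/6 of the one before it.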
Q21. True or false: the probabilities that the first success occurs on the first, second,
third, etc. trial in a geometric setting, when arranged in order, form a geometric sequence
where p is the first term and each successive term is (1-p) (or q) times the previous one?
A21. True
Q22. True or false: if you apply the formula a/(1-r) for the sum of the terms of an infinite
geometric series, where a is the first term and r is the ratio of each term to the previous
one, for the geometric setting p is the first term and (1-p) is the ratio, so the sum becomes
p/(1-(1-p)) or 1. Thus even though there are infinitely many possibilities for the outcome
of the experiment in the geometric setting, the probabilities of each outcome sum to 1.
A22. True
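In symbols, with q = 1 - p and 0 < p <= 1 (so that the ratio q is less than 1 and the infinite sum converges), the calculation in Q22 reads:

```latex
\sum_{n=1}^{\infty} P(X = n)
  = \sum_{n=1}^{\infty} p\,q^{\,n-1}
  = \frac{p}{1 - q}
  = \frac{p}{1 - (1 - p)}
  = \frac{p}{p}
  = 1
```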
Q23. If your chances of rolling a 1 on a die roll are one in 6, what is the expected or
average number of times that you would have to roll the die to get a 1 (counting the roll
on which the 1 appears)?
A23. 6 times.
Q24. If your chance of getting a success on any trial in the geometric setting is p, what is
the average or expected number of trials that you would have to conduct to get the first
success?
A24. 1/p trials.
Q25. What is the variance of the geometric random variable?
A25. q/p^2, or (1-p)/p^2
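The mean 1/p and variance q/p^2 can be checked numerically by adding up many terms of the defining sums. The Python sketch below does this for the die example (p = 1/6), where the formulas give 6 and 30; cutting the sum off at 5,000 terms is an assumption that works because the leftover probability is vanishingly small.

```python
def geometric_moments(p, terms=5_000):
    """Approximate the mean and variance of a geometric variable
    by truncating the defining sums after `terms` terms."""
    q = 1 - p
    mean = sum(n * q ** (n - 1) * p for n in range(1, terms + 1))
    second_moment = sum(n * n * q ** (n - 1) * p for n in range(1, terms + 1))
    return mean, second_moment - mean ** 2

mean, variance = geometric_moments(1 / 6)
print(round(mean, 6), round(variance, 6))
```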
Q26. In the geometric setting, if q=1-p, what is the probability that it takes more than n
trials to see the first success?
A26. P(X>n) = q^n.
Q27. On page 470 of YMS there is a derivation of the formula for the probability that it
takes more than n trials to see the first success. Can you think of a really simple way to
arrive at the same formula?
A27. If, and only if, the first n trials are failures, it will take more than n trials to get the
first success. The probability of the first n trials being failures, using the multiplication
rule for independent events, is q^n.
Q28. For a geometric distribution, would you say that it is approximately true that 34% of
the observations would fall between the mean and 1 standard deviation above the mean,
and 34% would fall between the mean and 1 standard deviation below the mean? Why or
why not?
A28. No, because the geometric distribution is always strongly skewed to the right, and
its shape doesn’t resemble the normal distribution (for which the above statement is true).
Q29. Suppose that some experts estimate that the probability of a major nuclear war in
any given year is 1%. You think that you will live another 65 years. You are wondering
what the chance is that you will and will not see a nuclear war. Please fit some of the
concepts of this chapter to this situation, and calculate the probability.
A29. This is the geometric setting, where “success” is defined as nuclear war, and
“failure” is defined as no nuclear war! What you are being asked is the probability that it
will take more than 65 trials to see a “success.” You use the formula P(X>n)=q^n to get
P(X>65)=.99^65. The probability of no nuclear war comes out to .52. So would that
be comforting, or what?
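The calculation in A29 takes one line of Python, and the answer can be cross-checked by adding up the probabilities of a first "success" in each of years 1 through 65. This is a sketch; the function and variable names are made up for the example, and the 1% figure is the one quoted in Q29.

```python
def prob_more_than(n, p):
    """P(X > n) for a geometric variable: the first n trials all fail."""
    return (1 - p) ** n

p_war_each_year = 0.01  # the 1% estimate quoted in Q29
no_war_in_65_years = prob_more_than(65, p_war_each_year)
war_within_65_years = 1 - no_war_in_65_years

# Cross-check: sum P(X = k) for k = 1..65 and compare with 1 - q^65.
summed = sum(
    (1 - p_war_each_year) ** (k - 1) * p_war_each_year for k in range(1, 66)
)

print(round(no_war_in_65_years, 2), round(war_within_65_years, 2))
```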