YMS Chapter 9 Sampling Distributions
Q1. A parameter (which begins with p) is a number describing a _____; a statistic (which
begins with s) is a number describing a ______.
A1. population, sample
Q2. What symbols are used in our book’s notation to represent a population mean,
sample mean, sample proportion, and population proportion, respectively?
A2. μ (mu), x-bar, p-hat, and p. (Note that the letter P will also be used later to indicate
the “P-value” for a statistical test. Our text will use a capital P for that, or the phrase P-value. But many other writings will use small p for P-value.)
Q3. Suppose you were to take a large number of samples (all the same size) from a
population, compute the mean of each, and plot a histogram of the sample means that you
obtain. This histogram would approximate the shape of the ______ ________ of the
mean.
A3. Sampling distribution.
Q4. The sampling distribution for a proportion or mean changes as the number in the
sample increases: the mean of that sampling distribution (increases, stays the same,
decreases) and the variance of the sampling distribution (increases, stays the same,
decreases).
A4. stays the same, decreases
Q5. If the mean of a sampling distribution is the true value of the parameter being
estimated, we refer to the statistic used to estimate the parameter as being _____.
A5. unbiased
Q6. True or false: if a statistic is unbiased, the value of the statistic computed from the
sample equals the population parameter.
A6. False. Samples vary. It’s only the mean of the statistic over all possible samples that equals the
population parameter for an unbiased statistic.
Q7. True or false: the variability (and thus the accuracy) of a statistic is very sensitive to
the size of the population from which the samples are drawn.
A7. False. The sample size is much more important than the population size.
Q8. An organization wants to sample with equal accuracy from each state of the USA.
Would it make more sense to sample 2000 from each state, or 1% of each state?
A8. 2000 from each state, because the absolute sample size determines the accuracy; you
don’t need a larger sample just because the population is larger.
Q9. To review from Chapter 7, on the binomial distribution: what are the mean and
standard deviation of a binomially distributed variable X, where p is the (population)
probability of success, q is the probability of failure, and n is the size of the sample?
A9. The mean is np, and the standard deviation is sqrt(npq).
Q10. What are the mean and sd of sample proportion, which is X/n where X is binomially
distributed (but X/n is not binomially distributed)?
A10. The mean is p, and the standard deviation is sqrt(pq/n).
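These formulas are easy to check by simulation. A minimal Python sketch (the values p = 0.3 and n = 50 are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    p, n = 0.3, 50                           # arbitrary population proportion and sample size
    q = 1 - p

    X = rng.binomial(n, p, size=100_000)     # many binomial counts X
    print(X.mean(), n * p)                   # empirical mean vs. theoretical np
    print(X.std(), np.sqrt(n * p * q))       # empirical sd vs. theoretical sqrt(npq)

    phat = X / n                             # the corresponding sample proportions
    print(phat.mean(), p)                    # mean of p-hat is p
    print(phat.std(), np.sqrt(p * q / n))    # sd of p-hat is sqrt(pq/n)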
Q11. If you want a standard deviation for a sample proportion that is half as big as some
other one, you have to get a sample that is how many times bigger?
A11. 4 times bigger.
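To see the arithmetic behind that: replacing n with 4n in the formula gives sqrt(pq/(4n)) = sqrt(pq/n)/2, so quadrupling the sample size cuts the standard deviation in half.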
Q12. If the sample is a substantial fraction of the population, then the assumption of
independence that leads to the binomial distribution is violated. How many times bigger
should the population be than the sample, so that we don’t worry about this?
A12. At least 10 times bigger.
Q13. True or false: The standard deviation of the sampling distribution of a proportion is
only approximately sqrt(pq/n); this approximation is most accurate when np>=10 and
nq>=10.
A13. False. The sd of the sampling distribution of a proportion is always exactly
sqrt(pq/n). But that distribution is approximately NORMAL when np and nq are >=10.
Q14. If you know the population proportion, how do you use the normal approximation
to figure out the probability that the proportion obtained from a random sample of size n
will be between two given values?
A14. You use p and sqrt(pq/n) as the mean and sd, and with these compute a z score for
the upper and lower bounds of the interval you’re interested in. Then you use the normal
table, or a calculator, to get the proportion of the normal curve between those two z
scores. This is the probability that the sample proportion will fall between those values.
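As a sketch of that calculation in Python (the values p = 0.45, n = 400, and the interval from 0.40 to 0.50 are invented for illustration):

    import math
    from scipy.stats import norm

    p, n = 0.45, 400                   # assumed population proportion and sample size
    sd = math.sqrt(p * (1 - p) / n)    # sd of the sampling distribution: sqrt(pq/n)

    lo, hi = 0.40, 0.50                # interval of interest for the sample proportion
    z_lo = (lo - p) / sd               # z score for the lower bound
    z_hi = (hi - p) / sd               # z score for the upper bound
    print(norm.cdf(z_hi) - norm.cdf(z_lo))   # area between the two z scores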
Q15. How do the sampling distributions of means compare with the distributions of
individual observations? They are less _____ and more _____.
A15. Variable, normal
Q16. Suppose you have a population with mean mu and sd sigma. What are the mean
and sd of the sampling distribution for means with sample size n?
A16. The mean of the sampling distribution is mu and the sd is sigma/sqrt(n).
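Again this can be checked by simulation; a minimal sketch with arbitrary values mu = 10, sigma = 2, and n = 25:

    import numpy as np

    rng = np.random.default_rng(1)
    mu, sigma, n = 10, 2, 25
    xbars = rng.normal(mu, sigma, size=(100_000, n)).mean(axis=1)  # 100,000 sample means
    print(xbars.mean(), mu)                   # mean of the sampling distribution is mu
    print(xbars.std(), sigma / np.sqrt(n))    # its sd is sigma/sqrt(n)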
Q17. Under what conditions will the sampling distribution of the mean have an exact
normal distribution, no matter what the sample size is?
A17. When the population is normally distributed.
Q18. What does the central limit theorem tell us?
A18. That as the sample size gets larger, the sampling distribution of the mean
approaches the normal, regardless of the distribution of the population from which the
observations are drawn.
Q19. True or false: suppose that income in a large country is not normally distributed, but
is very skewed. The central limit theorem tells us that if we were to collect several very
large samples and compute the mean income for each sample, those means would be
approximately normally distributed, even though the incomes in the population are not
normally distributed.
A19. True.
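A sketch of this in Python, using a strongly right-skewed exponential distribution as a stand-in for the income distribution (all numbers invented):

    import numpy as np
    from scipy.stats import skew

    rng = np.random.default_rng(2)
    population_draws = rng.exponential(scale=40_000, size=1_000_000)
    print(skew(population_draws))        # large positive skew: incomes are not normal

    means = rng.exponential(scale=40_000, size=(10_000, 500)).mean(axis=1)
    print(skew(means))                   # near 0: means of samples of 500 are roughly normal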
Q20. Why do you think the central limit theorem is so “central” to statistics?
A20. Because it enables us to use normal probability calculations to answer questions
about sample means even when population distributions are not normal. Those questions
include the big idea of confidence intervals: how likely is the right answer to be between
these two bounds. Thus the central limit theorem helps us say, “There’s x probability that
the true mean of the population is between a and b.”
YMS Chapter 10 Introduction to Inference
Section 1
Q1. Statistical inference consists in drawing conclusions about a ____ from data in a
____.
A1. population, sample
Q2. If the standard deviation of a population is sigma, what is the sd of the sampling
distribution for the sample mean (this is often called the standard error of the mean) with
sample size n?
A2. sigma/sqrt(n)
Q3. Suppose we know that the sd of the sample mean (a.k.a. standard error of the mean)
is 4.5. This implies that if we were to draw many samples from the population, about
95% of these sample means would fall within what interval?
A3. The population mean plus or minus 9.
Q4. True or false: we should imagine the sample mean as being at the center of a bell-shaped curve, with 2 standard deviations of the sample mean (a.k.a. standard errors) on
either side of this point encompassing 95% of the other sample means. (Assume the
sample is an SRS and the sample means are normally distributed.)
A4. False. We should imagine the population mean as being at the center of that bell
curve. We visualize the sample mean as falling within 2 standard errors of the population
mean 95% of the time.
Q5. True or false: The reasoning we use in making confidence intervals around a sample
mean is as follows: if the sample mean is normally distributed, then 95% of the time, x-bar will be within 2 sample standard deviations (standard errors) of the population mean,
mu. Whenever x-bar is within 2 standard errors of mu, mu is within 2 standard errors of
x-bar. So if we make an interval + or – 2 standard errors around x-bar, that interval will
encompass mu for 95% of the sample means we obtain.
A5. True.
Q6. A confidence interval has two parts: 1) the interval itself (usually expressed as an
estimate plus or minus a margin of error) and 2) ______________.
A6. The confidence level, e.g. .95.
Q7. Someone says, “I read that the 95% confidence interval for a certain group’s score on
a certain test was 115 to 128. That means that 95% of all the members of the group score
in that range.” Is this an accurate interpretation? If not, please give a better one.
A7. Not accurate. The confidence interval stated means that we are 95% confident that
the population mean lies within the stated interval. And 95% confident means that 95%
of the intervals obtained the way we got this one would encompass the population mean.
Q8. In order to construct a confidence interval for a mean, what two conditions need to
be met?
A8. That the data come from an SRS of the population, and that the sampling distribution
of x-bar is approximately normal.
Q9. A first person says, “I want a 90% confidence interval. So I’ll look in the normal
table for the z-score with 95% of the area to the left of it.” A second person says, “You
mean 90% of the area, don’t you?” What is the correct way to look in the table?
A9. The first person got it right. The region around the population mean that subsumes
90% of the sample means is that with 5% above that region and 5% below that region. So
you want z for .95 or the negative of the z for .05.
Q10. What are the “tail areas” you look for, for confidence intervals of .90, .95, and .99,
respectively?
A10. .05, .025, and .005.
Q11. If C is the confidence level, what is the expression for the area to the right of the
interval subsuming fraction C of the distribution for sample means?
A11. (1-C)/2.
Q12. What does the symbol z* stand for?
A12. The z-score with (1-C)/2 of the area lying to the right of it. Or: the number of
standard deviations above and below the mean that bound the C level confidence interval.
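As a quick check of those z* values with a calculator-style tool, a minimal scipy sketch:

    from scipy.stats import norm

    for C in (0.90, 0.95, 0.99):
        z_star = norm.ppf(1 - (1 - C) / 2)   # z with (1-C)/2 of the area to its right
        print(C, round(z_star, 3))           # 1.645, 1.96, 2.576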
Q13. True or false: The values mu- z* sigma/sqrt(n) and mu + z* sigma/sqrt(n)
represent the upper and lower bounds for the confidence interval for the mean.
A13. False. The confidence interval is centered around x-bar, not around mu, because we
don’t know mu. (If we did, we wouldn’t need to make a confidence interval.) The values
listed above are the bounds within which there is a probability C that any observed
sample mean will fall. Whenever the sample mean falls between these values, then we
can say that the population mean falls between x-bar –z* sigma/sqrt(n) and x-bar + z*
sigma/sqrt(n).
Q14. If my wife’s age falls in the interval of my age plus or minus 5 years, then my age
must fall within the interval of my wife’s age plus or minus 5 years. Is this true, and is
this sort of reasoning central to the reasoning about confidence intervals?
A14. Yes and yes.
Q15. True or false: The way in which the statement in the previous question has its
analogy in the reasoning about confidence intervals is: any time the sample mean falls
within the interval of mu plus or minus the margin of error, then the population mean
must fall within the interval of x-bar plus or minus the same margin of error.
A15. True.
Q15. True or false: the values x-bar –z* sigma/sqrt(n) and x-bar + z* sigma/sqrt(n) form
the upper and lower bounds for the confidence interval for the mean (assuming the
assumptions are met).
A15. True.
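Putting the pieces together, a minimal sketch of computing such an interval (x-bar = 52.3, sigma = 6, n = 36, and C = 0.95 are invented values):

    import math
    from scipy.stats import norm

    xbar, sigma, n, C = 52.3, 6.0, 36, 0.95
    z_star = norm.ppf(1 - (1 - C) / 2)
    margin = z_star * sigma / math.sqrt(n)        # margin of error
    print(xbar - margin, xbar + margin)           # the 95% confidence interval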
Q16. Example 10.5 on page 546 is worthy of careful study. What are the 4 steps that were
exemplified in using confidence intervals?
A16. 1. Identify the population of interest and the parameter to be estimated. 2. Choose
the appropriate procedure, and verify that the conditions for using it are met. 3. Carry out
the procedure. CI = estimate + or – margin of error. 4. Interpret the results in the context
of the problem.
Q17. Please tell whether the margin of error (which is half the width of the confidence
interval), or the width of the confidence interval itself, gets bigger or smaller under each
of the following circumstances: a. the population standard deviation gets smaller, b. the
level of confidence C gets bigger (e.g. a move from a 90% confidence interval to a 99%
confidence interval) c. the sample size gets bigger, and d. the population size gets bigger?
A17. a. smaller, b. bigger, c. smaller, d. no effect
Q18. Is it preferable in research for a 95% confidence interval to have its upper and lower
bounds closer together, or farther apart?
A18. Closer together, because this represents a more accurate estimate of whatever
you’re trying to estimate.
Q19. Suppose you are a researcher planning a study, and you are deciding how many
subjects to enroll. You want a certain margin of error m. You know what level of
confidence you want, and you know (or estimate) the sigma for the population. How do
you figure out the sample size?
A19. Set m>= z* sigma/sqrt(n) and solve that inequality for n. As usual, you use as z*
the z score that has (1-C)/2 area to the right of that score.
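Solving that inequality gives n >= (z* sigma / m)^2, rounded up to the next whole number. A sketch with invented values m = 0.5, sigma = 3, C = 0.95:

    import math
    from scipy.stats import norm

    m, sigma, C = 0.5, 3.0, 0.95
    z_star = norm.ppf(1 - (1 - C) / 2)
    n = math.ceil((z_star * sigma / m) ** 2)   # smallest n giving margin of error <= m
    print(n)                                   # 139 for these invented values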
Q20. Some of the problems in the use of confidence intervals can be surmounted by
getting a large enough sample size – with this, the distribution of sample means can be
considered normal even if the population isn’t normal. Also, with a large enough sample
size, the sample standard deviation is close to the population standard deviation. What’s
the main problem that can’t be overcome with a large sample size?
A20. That the data should be an SRS of a population, and that the margin of error in a
confidence interval covers only random sampling errors. If you are dealing with a
convenience sample (as in the vast majority of research studies) then you can go ahead
and compute a confidence interval, but the results are generalizable only to the population
of which this particular sample is representative, and sometimes it’s impossible to say
what that population is.
Chapter 10, Section 2
Q21. True or false: The basic reasoning for significance testing is: an outcome that would
happen rarely if a claim were true is good evidence that the claim is not true.
A21. True. (Note: there are problems with this basic presumption.)
Q22. In doing statistical tests, the first step is to identify what you want to make
conclusions about. Are you always wanting to make conclusions about sample statistics,
or population parameters? Or is it sometimes one or sometimes another?
A22. Always the population parameter.
Q23. What does a “null hypothesis” typically state?
A23. That there is no difference between two parameters, or no effect, or no change (or
that a parameter is equal to a certain value).
Q24. A significance test works by assessing how likely the ______ _____ would be if
the ____ _____ were true.
A24. observed outcome, null hypothesis
Q25. True or false: the p-value is the probability of getting exactly the results we
observed, presuming the null hypothesis to be true.
A25. False. The probability of getting exactly the results we obtained is almost always
very small (or even theoretically 0 when dealing with continuous distributions). The p-value is the probability of getting results as extreme as, or more extreme than, the actually
observed results; “extreme” means “far from what we would expect if the null hypothesis
were true.”
Q26. We are more likely to reject the null hypothesis of “no difference” or “no effect,”
and infer that there is a difference or an effect, when the P-value is large, or small?
A26. Small
Q27. The null hypothesis has to do with a population parameter; in analyzing your
sample data you calculate a ______ that estimates that population parameter.
A27. Statistic (the phrase “sample statistic” is correct but redundant).
Q28. When a drug company researcher is hoping to find evidence that a drug is better
than placebo, is the researcher wishing to reject, or fail to reject, the null hypothesis?
A28. Reject. The null hypothesis would be that drug and placebo are equal in effect.
Q29. Suppose someone is testing a drug versus placebo. If the researcher is interested
only in the alternative that the drug is better than placebo, then the alternative hypothesis
is _____-sided, but if the researcher counts both harmful effects (drug worse than placebo) and
beneficial effects (drug better than placebo) as grounds for rejecting the null hypothesis, then the
alternative hypothesis is ____-sided.
A29. one, two
Q30. What is the meaning of the significance level, or alpha?
A30. It’s a threshold level for the p-value that we consider decisive, with which the
obtained p-value is compared.
Q31. Do we reject the null hypothesis when the p value is less than alpha, or greater than
alpha?
A31. Less than alpha.
Q32. Are we more likely to reject the null hypothesis with a larger alpha, or a smaller
alpha, all other things equal?
A32. With a larger alpha. The larger alpha is, the more likely the p-value we obtain is less
than that cutoff. Thus smaller p-values, and larger alphas, dispose to rejection of the null.
Q33. If a test is statistically significant at the .05 level, what does that mean?
A33. That the p-value obtained is less than or equal to .05.
Q34. Someone finishes writing up a statistical test by saying, “In conclusion, p=.021.”
What step of the “inference toolbox” are they leaving out, that should come after what
they said?
A34. Interpreting the results in the context of the problem. So they should say something
like, “Therefore we reject the hypothesis that drug and placebo are equal; our study gives
evidence that our drug is more effective than placebo.”
Q35. When we are testing the hypothesis that a population mean is equal to a certain
hypothesized value, in the unlikely situation where we know the population standard
deviation, what is our test statistic?
A35. The standardized sample mean, which is the sample mean minus the hypothesized
population mean, divided by the standard deviation of the sample means. This is also
known as the one-sample z statistic.
Q36. What distribution does the one-sample z statistic, a.k.a. the standardized sample
mean, have when the null hypothesis is true?
A36. The standard normal distribution.
Q37. True or false: for a one-sided test (or a one-sided alternative hypothesis), results
extreme in one direction are counted as evidence against the null hypothesis; for a two-sided test (or a two-sided alternative hypothesis), results extreme in either direction are
counted as evidence against the null hypothesis.
A37. True.
Q38. Please explain why the two-sided p-value is double that of the one-sided p-value.
A38. The p-value is the probability of getting results as deviant as, or more deviant than,
the results obtained. For a two-sided test, we add the probabilities of getting results deviant
in each of the two directions to get the total p-value; for symmetrical distributions, those two
probabilities are equal, thus amounting to twice the value for any single direction.
Q39. How do you compute the one sample z statistic?
A39. Sample mean minus hypothesized population mean, over the sd of the sampling
distribution of the mean (that sd is sigma/sqrt(n) ).
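A sketch of the whole computation, including the one- and two-sided p-values (x-bar = 105, hypothesized mu = 100, sigma = 15, and n = 30 are invented values):

    import math
    from scipy.stats import norm

    xbar, mu0, sigma, n = 105, 100, 15, 30
    z = (xbar - mu0) / (sigma / math.sqrt(n))   # one-sample z statistic
    p_one = 1 - norm.cdf(z)                     # one-sided p-value (alternative: mu > mu0)
    p_two = 2 * p_one                           # two-sided p-value: double the one-sided
    print(z, p_one, p_two)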
Q40. True or false: What is meant by doing “tests with fixed significance level” for a
one-sample z test is that you become aware of what the cutoff (or critical) values are for z
for the alpha you’ve picked. If the z your data yield is more extreme than the z for the
alpha you’ve picked, the test is significant at the specified level of alpha. This method is
most useful for those who don’t have access to calculators or computers that will give a
p-value directly.
A40. True.
Q41. True or false: If you obtained a 95% confidence interval for a mean that ranged
from 10 to 30, then a null hypothesis that the mean was equal to any value outside that
range would be rejected and a null hypothesis of a mean within that range would not be
rejected, at the .05 level, using a two-sided test.
A41. True.
Chapter 10, Sections 3 and 4
Q42. Suppose you thought your research would overturn a conclusion that many people
had held for a long time. Would you tend to choose a smaller alpha (thus necessitating a
larger sample size to reject the null hypothesis) or a larger alpha?
A42. A smaller alpha. You know that people will be skeptical of your conclusion, so
you'd be better off getting a larger sample size that enables you to say, "This result is
significant at the .001 level" rather than just the .05 level.
Q43. Suppose you had limited subjects with whom to work, and you were looking for
evidence of toxicity from a chemical. The consequences of declaring that the chemical is
safe when it isn't are very bad. The consequences of declaring the chemical dangerous
when it isn't are primarily that more studies would be done than you have the resources to
do. Given these consequences, would you tend to set alpha higher, or lower?
A43. Higher. You want to give yourself as good a chance as possible to reject the null,
under these conditions.
Q44. True or false: If you report the p-value itself, rather than saying that p<.05, you in a
sense let the readers of your journal set their own alpha, i.e. make their own decision as to
whether they want to reject the null hypothesis given the p-value you report.
A44. True.
Q45. True or false: P-values slightly over .05 should not be considered statistically
significant.
A45. False. Where we draw the cutoff depends on the stakes of the various outcomes of
the decision we are trying to make.
Q46. Suppose we test a drug with a very large number of subjects. We find that on a 60-point rating scale, the drug group has a mean depression score rating that is 2 points
lower than the placebo group. The p-value is .03. Someone is likely to say that the
difference is ____ significant but not ____ significant.
A46. Statistically, practically
Q47. A researcher designs a study, gathers data, punches the data into the computer, runs
a significance test, and interprets the result based on the significance test. What important
step is being left out? Please give one reason why this step is important.
A47. Inspecting the data with graphs. This is important because 1) outliers can have big
effects on significance tests, 2) you want to see how closely the data follow a normal
distribution, especially with small samples; and 3) you want to see if an effect is visible
in the plots; if not, perhaps the effect is not practically significant.
Q48. Suppose you are trying to find the effect of lighting levels on worker productivity.
But your design has confounded lighting level with the effect of workers being observed
and monitored; the effect of monitoring gives a false impression that lighting makes a
difference. This situation would be referred to as the ____ effect.
A48. Hawthorne
Q49. If there are bad design problems, is it likely, or unlikely, that sophisticated
inferential statistical analysis techniques can get around these problems to produce valid
inferences?
A49. Unlikely.
Q50. Should an ethical statistician attempt to produce inferential statistics on a data set
that did not involve random sampling or random assignment?
A50. Definitely yes. Much data which was not produced by random methods can
nonetheless be treated as if it were, to produce very useful results.
Q51. Someone is looking for drugs or dietary factors that could prevent Parkinson's
Disease. The researcher gives a questionnaire to a random sample of people with the
disease and of people without it. The researcher looks at the mean intake of each of 100
substances for people with and without Parkinson's, and tests each one to see if there is a
significant difference in intake between the two groups. The researcher comes out with
about 5 substances that are significantly different between the two groups, at the .05
level. The researcher recommends that people adjust their intakes accordingly. Do you
have a problem with this? If so, what?
A51. Yes -- this is the multiple comparison problem. Out of 100 hypothesis tests you
would expect about 5 to come out significant by chance alone. It could be that the effects
are real, but it is very possible that they have happened by chance alone.
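The multiple comparison problem is easy to see by simulation: run 100 tests on data where the null is true for every substance and count the "significant" results. A minimal sketch (a two-sample t test is used purely for illustration):

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(3)
    hits = 0
    for _ in range(100):                  # 100 substances, none with a real difference
        cases = rng.normal(0, 1, 100)     # intake in the Parkinson's group
        controls = rng.normal(0, 1, 100)  # intake in the control group
        if ttest_ind(cases, controls).pvalue < 0.05:
            hits += 1
    print(hits)                           # typically around 5, by chance alone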
Q52. Our textbook agrees with your instructor that _______ (confidence intervals, tests of
significance) are underutilized, whereas _______ (same options) are perhaps overutilized.
A52. confidence intervals, significance tests
Q53. Our book points out that "many tests run at once will probably produce some
significant results by chance alone." What does your logical reasoning tell you about
many tests run sequentially, each testing one association, in a separate study? (The
answer to this isn't specifically discussed in our book, but tell what you think.)
A53. Logically, the same problem holds. And if people choose to publish the significant
findings and relegate the non-significant ones to the "file drawer," there is a distortion of
the truth just as there can be from the multiple comparison problem.
Q54. What is acceptance sampling?
A54. Sampling a subset of a product and using statistical methods to decide whether to
accept or throw out the whole batch on the basis of the sample.
Q55. When we are thinking of statistical inference as making a decision, as in acceptance
sampling, what is different about the status of the null hypothesis and the alternative
hypothesis from null hypothesis significance testing?
A55. There's no special status allocated to the null hypothesis; there is a null hypothesis
and an alternative hypothesis that we are trying to decide between.
Q56. Suppose the null hypothesis is that a drug has no effect. If this is true and our
analysis yields the decision that the drug is effective, that is a _____ error; if the drug has
an effect but our analysis concludes that there is no effect, that is a ______ error.
A56. Type 1, Type 2
Q57. If we set alpha at .01, what is the probability of a type 1 error given that the null
hypothesis is true?
A57. The probability is .01.
Q58. Somebody does a statistical test and rejects the null with p less than the preset alpha
of .05. Someone else says, "There is a 5% chance that your result is a type 1 error." Is
this correct?
A58. Despite the sentence in our text on page 596 that reads, "The significance level of
any fixed level test is the probability of a Type 1 error," the answer to this is no. The 5%
probability is a conditional probability: the probability of rejecting the null GIVEN that
the null is true. The person who made the statement above is mistakenly thinking that
"Given that the null is rejected, there is a 5% chance that the null is rejected falsely."
To understand better why such reasoning is wrong: suppose we are testing whether
running really fast raises the heart rate of previously resting college students. We set an
alpha of .05 when comparing the resting heart rates and the immediate post-running heart
rates. When we find a huge effect of our intervention, what is the chance that it's a type 1
error? Approximately zero, because it's not an error at all.
On the other hand, if we are testing the effects of a drug with absolutely no effect,
versus placebo, then if we rejected the null, the chance that we had made a type 1 error
would be 100%; before doing the test, our probability of making a type 1 error would be
5%.
Q59. Suppose someone says, "I'm testing whether a certain mean equals 0 -- that's my
null hypothesis. I'm setting alpha at .05. I have a sample size of 50. The population
standard deviation, we're assuming, is known to be 1. Can you please calculate the
probability of a type 2 error for me? (Or the 'power' of my test, which is 1 minus the
probability of a type 2 error, i.e. the probability of rejecting a false null?)"
Can you calculate this, or is there something missing, and if something's missing,
what is it?
A59. The probability of rejecting a false null depends upon "how false" the null is. If the
true mean of the population the researcher is drawing from is really 100, the probability
of rejecting the false null is very close to 1. If the population mean is really 0.000001, the
probability of rejecting the false null is very close to 0. So what's missing is a
hypothesized mean that represents "how false" the null is.
Q60. Suppose you are given a hypothesized true mean, a null hypothesis mean, an alpha,
a sample size, and a population sd, and the right to assume a normal distribution of the
mean you obtain. How do you calculate the probability of a type 2 error?
A60. You figure out the region where the null hypothesis is not rejected, which is, for a
.05 alpha (two-sided), 1.96 standard errors (sigma/sqrt(n)) on either side of the null hypothesized mean.
Then you use the normal cumulative distribution function for the distribution N(mu,
sigma/sqrt(n)), where mu is the hypothesized true mean, to find the probability that
the sample mean will fall within the interval you've calculated. That probability is the
probability of a type 2 error.
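A sketch of that calculation with invented numbers (null mean 0, hypothesized true mean 0.5, sigma = 1, n = 50, two-sided alpha = .05):

    import math
    from scipy.stats import norm

    mu0, mu_true, sigma, n = 0.0, 0.5, 1.0, 50
    se = sigma / math.sqrt(n)                       # standard error of the mean
    lo, hi = mu0 - 1.96 * se, mu0 + 1.96 * se       # non-rejection region for x-bar
    beta = norm.cdf(hi, mu_true, se) - norm.cdf(lo, mu_true, se)  # P(type 2 error)
    print(beta, 1 - beta)                           # beta and the power, about .06 and .94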
Q61. The probability that a fixed level alpha significance test will reject the null when a
particular alternative value of the parameter is true is called the _____ of the test against
that alternative.
A61. Power
Q62. The power of a test is what function of the probability of a type 2 error?
A62. 1 minus prob of type 2 error.
Q63. True or false: the p-value tells what would happen if we tested many samples, when
the null hypothesis is true; the power tells what would happen if we tested many samples,
when a particular alternative hypothesis is true.
A63. True.
Q64. When the alternative hypothesis parameter is very close to that of the null hypothesis,
the power tends to be (lower than, higher than, the same as) the power when the
alternative hypothesis parameter is very far from that of the null hypothesis.
A64. Lower than.
Q65. What happens to power as you increase sample size, all other things equal?
A65. It increases.
Q67. What happens to power as you increase the diversity of the population you are
looking at, i.e. increase the population standard deviation of whatever parameter you are
estimating?
A67. Power goes down.
Q68. What happens to power as you increase the precision and accuracy of your
measuring devices?
A68. Power goes up, because the more accurately the phenomenon is measured, the less
the standard deviation of the measurements becomes.
Q69. If too many tests of a hypothesis have insufficient power, then what happens: (a
false null tends to become believed to be true, or a true null tends to become believed to
be false)?
A69. The false null tends to become accepted as true. This is because study after study
fails to reject the null.