7.2 material - Due next class – Completing this assignment will help you understand the topic covered next class
Introduction to Chapter 9 – Let’s get ready for Testing Hypothesis – due class after exam 2
Name_______________
Complete pages 1 through 4, except for the simulations. Then read pages 5 and 6 several times to understand the problem.
1) X~N(mu = 0, sigma = 1)
We select a random sample of size 10 from this population.
a) Describe the distribution of sample means for samples of size 10: shape, mean and standard error. Round the standard error to
five decimal places.
b) Sketch the distribution of sample means for samples of size 10, labeling on the axis the values of the mean and 1, 2, 3
standard errors on either side of the mean. For convenience, use two decimal places for the standard error calculated in (a).
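For anyone who wants to check parts (a) and (b) outside the calculator, here is a minimal Python sketch. It is not part of the original handout, and the names `standard_error` and `ticks` are illustrative. It assumes the chapter 7 formula, standard error = sigma / sqrt(n).

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean for samples of size n."""
    return sigma / math.sqrt(n)

mu, sigma, n = 0, 1, 10

se = standard_error(sigma, n)
se_5dp = round(se, 5)   # standard error rounded to five decimal places, as in (a)
se_2dp = round(se, 2)   # two decimal places, for labeling the sketch in (b)

# Axis labels: the mean and 1, 2, 3 standard errors on either side of it
ticks = [round(mu + k * se_2dp, 2) for k in range(-3, 4)]

print("standard error:", se_5dp)
print("axis labels:", ticks)
```

The same function can be reused with sigma = 4.2 and n = 45 for problem 2.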
c) Suppose we select a sample of size 10 and we observe an x-bar of 0.3. Locate x-bar = 0.3 on the graph from part (b) and find
the z-score of 0.3. (Use the z-score formula from chapter 7. Use the standard error with 5 decimal places as in (a).)
Z =
Recall: values that are more than two standard errors from the mean are unusual. Select one of the following statements:
[ ] x-bar = 0.3 is a usual score; it can easily occur by chance when mu = 0
[ ] x-bar = 0.3 is an unusual score; it would rarely occur just by chance if mu = 0
d) Suppose we select a sample of size 10 and we observe an x-bar of 1.02. Locate x-bar = 1.02 on the graph from part (b) and
find the z-score of 1.02. (Use the z-score formula from chapter 7. Use the standard error with 5 decimal places as in (a).)
Z =
Recall: values that are more than two standard errors from the mean are unusual. Select one of the following statements:
[ ] x-bar = 1.02 is a usual score; it can easily occur by chance when mu = 0
[ ] x-bar = 1.02 is an unusual score; it would rarely occur just by chance if mu = 0
e) If we select a sample of size 10 from the population, what is the probability that x-bar is 0.3 or more?
Use the calculator and indicate calculator input and output.
P(x-bar > 0.3) = normalcdf(
If the probability is lower than or equal to 0.05, a score is unusual. Is 0.3 unusual?
f) If we select a sample of size 10 from the population, what is the probability that x-bar is 1.02 or more?
Use the calculator and indicate calculator input and output.
P(x-bar > 1.02) = normalcdf(
If the probability is lower than or equal to 0.05, a score is unusual. Is 1.02 unusual?
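The z-scores and upper-tail probabilities in parts (c) through (f) can also be checked in software. The sketch below is an illustration only, not the handout's method. It uses Python's standard-library `statistics.NormalDist` in place of the calculator's normalcdf, and the helper names `z_score` and `upper_tail` are made up for this example.

```python
import math
from statistics import NormalDist

mu, sigma, n = 0, 1, 10
se = round(sigma / math.sqrt(n), 5)   # standard error with 5 decimal places, as in (a)

# Distribution of sample means when mu = 0: Normal with mean mu and standard deviation se
xbar_dist = NormalDist(mu, se)

def z_score(xbar):
    """Number of standard errors between xbar and the mean."""
    return (xbar - mu) / se

def upper_tail(xbar):
    """P(x-bar > xbar): the area to the right of xbar under the curve."""
    return 1 - xbar_dist.cdf(xbar)

for xbar in (0.3, 1.02):
    print(f"x-bar = {xbar}: z = {z_score(xbar):.2f}, "
          f"P(x-bar > {xbar}) = {upper_tail(xbar):.4f}")
```

Taking 1 minus the cumulative probability plays the same role as entering a very large upper bound in normalcdf.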
1
g) The probability obtained in part (e) says that the likelihood of observing a sample mean x-bar of 0.3 or more in a sample of
size 10 is ______________. Suppose you are not sure whether the population mean is mu = 0 or mu > 0. What will this
probability suggest? Circle one:
(1) The population mean is mu = 0
OR
(2) The population mean is mu > 0
h) The probability obtained in part (f) says that the likelihood of observing a sample mean x-bar of 1.02 or more in a sample of
size 10 is ______________. Suppose you are not sure whether the population mean is mu = 0 or mu > 0. What will this
probability suggest? Circle one:
(1) The population mean is mu = 0
OR
(2) The population mean is mu > 0
i) (We’ll do in class) - SIMULATION of problem 1 – follow directions:
Let’s simulate selecting a sample of size 10 from a normal population with mean 0 and standard deviation of 1 and finding the mean
of the 10 numbers. We can accomplish this simulation ALL IN ONE STEP by doing:
Mean(RandNorm(0, 1, 10)) and pressing ENTER
Note: Mean is in 2nd STAT, arrow to MATH
RandNorm is in MATH arrow to PRB
Every time you press ENTER, the mean of the 10 selected numbers is given. Record this mean on the table below (rounded to three
decimal places). Do this process 10 times (by pressing ENTER 10 times) and record the ten sample means on the top row of the table.
| Write x-bars here |   |   |   |   |   |   |   |   |   |   |
| Is it ≥ 0.3?      |   |   |   |   |   |   |   |   |   |   |
| Is it ≥ 1.02?     |   |   |   |   |   |   |   |   |   |   |
Record your experimental probabilities:
P(x-bar > 0.3): Exp. Prob = (number of times you got 0.3 or more) / 10 = ______
P(x-bar > 1.02): Exp. Prob = (number of times you got 1.02 or more) / 10 = ______
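The calculator simulation can be mirrored in Python. This is a rough sketch, not part of the handout: `random.gauss` stands in for RandNorm, the seed is fixed only so the run is repeatable, and the helper name `sample_mean` is hypothetical.

```python
import random
from statistics import mean

random.seed(1)  # assumption: fixed seed so the run can be repeated; remove it for fresh samples

def sample_mean(mu, sigma, n):
    """Draw n values from a Normal(mu, sigma) population and return their mean."""
    return mean([random.gauss(mu, sigma) for _ in range(n)])

# Ten sample means, each from a sample of size 10, rounded to three decimal places
xbars = [round(sample_mean(0, 1, 10), 3) for _ in range(10)]

# Experimental probabilities: fraction of the ten sample means at or above each cutoff
exp_prob_03 = sum(x >= 0.3 for x in xbars) / 10
exp_prob_102 = sum(x >= 1.02 for x in xbars) / 10

print("x-bars:", xbars)
print("Exp. Prob of 0.3 or more: ", exp_prob_03)
print("Exp. Prob of 1.02 or more:", exp_prob_102)
```

With only ten repetitions the experimental probabilities are coarse (multiples of 0.1), so they will usually differ somewhat from the normalcdf answers.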
2
2) X ~ N (mu = 18, sigma = 4.2)
We select a random sample of 45 numbers from this population.
a) Describe the distribution of sample means for samples of size 45: shape, mean and standard error. Round the standard error to
five decimal places.
b) Sketch the distribution of sample means for samples of size 45 labeling on the axis the values of the mean and 1, 2, 3
standard errors on either side of the mean. For convenience, use two decimal places.
c) Suppose we select a sample of size 45 and we observe an x-bar of 17.2. Locate x-bar = 17.2 on the graph from part (b) and
find the z-score of 17.2. (Use the z-score formula from chapter 7. Use the standard error with 5 decimal places as in (a).)
Z =
Recall: values that are more than two standard errors from the mean are unusual. Select one of the following statements:
[ ] x-bar = 17.2 is a usual score; it can easily occur by chance when mu = 18
[ ] x-bar = 17.2 is an unusual score; it would rarely occur just by chance if mu = 18
d) Suppose we select a sample of size 45 and we observe an x-bar of 14.68. Locate x-bar = 14.68 on the graph from part (b) and
find the z-score of 14.68. (Use the z-score formula from chapter 7. Use the standard error with 5 decimal places as in (a).)
Z =
Recall: values that are more than two standard errors from the mean are unusual. Select one of the following statements:
[ ] x-bar = 14.68 is a usual score; it can easily occur by chance when mu = 18
[ ] x-bar = 14.68 is an unusual score; it would rarely occur just by chance if mu = 18
e) If we select a sample of size 45 from the population, what is the probability that x-bar is 17.2 or less?
Use the calculator and indicate calculator input and output.
P(x-bar < 17.2) = normalcdf(
If the probability is lower than or equal to 0.05, a score is unusual. Is 17.2 unusual?
f) If we select a sample of size 45 from the population, what is the probability that x-bar is 14.68 or less?
Use the calculator and indicate calculator input and output.
P(x-bar < 14.68) = normalcdf(
If the probability is lower than or equal to 0.05, a score is unusual. Is 14.68 unusual?
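Problem 2 uses two criteria for "unusual": more than two standard errors from the mean, and a tail probability of 0.05 or less. The sketch below places both criteria side by side for the two observed sample means. It is an illustration under the handout's stated values (mu = 18, sigma = 4.2, n = 45); the function `check` and its dictionary keys are invented for this example.

```python
import math
from statistics import NormalDist

mu, sigma, n = 18, 4.2, 45
se = round(sigma / math.sqrt(n), 5)   # standard error with 5 decimal places
xbar_dist = NormalDist(mu, se)        # distribution of sample means when mu = 18

def check(xbar):
    """Apply both 'unusual' criteria to an observed sample mean (left tail)."""
    z = (xbar - mu) / se
    p_lower = xbar_dist.cdf(xbar)     # P(x-bar < xbar)
    return {
        "z": round(z, 2),
        "p_lower": p_lower,
        "beyond_2_se": abs(z) > 2,          # two-standard-error rule
        "p_at_most_0.05": p_lower <= 0.05,  # probability rule
    }

for xbar in (17.2, 14.68):
    print(xbar, check(xbar))
```

For these two values the two rules lead to the same verdict, although they are not identical criteria in general.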
3
g) Based on the likelihood (probability) of the event (answer to part (e)________): If you selected a sample of size 45 and you
are not sure whether the population mean is mu = 18 or mu < 18, what would you conclude if the sample mean x-bar is 17.2?
(1) The population mean may be mu = 18
OR
(2) The population mean is mu < 18
h) Based on the likelihood (probability) of the event (answer to part (f)________): If you selected a sample of size 45 and you
are not sure whether the population mean is mu = 18 or mu < 18, what would you conclude if the sample mean x-bar is
14.68?
(1) The population mean may be mu = 18
OR
(2) The population mean is mu < 18
i) (We’ll do in class) - SIMULATION of problem 2
Let’s simulate selecting a sample of size 45 from a normal population with mean 18 and standard deviation of 4.2 and finding the
mean of the 45 numbers.
Mean(RandNorm(18,4.2,45))
ENTER
Do this process 10 times (by pressing ENTER 10 times). Round the sample means to 3 decimal places and record on the top row of
the following table
| Write x-bars here |   |   |   |   |   |   |   |   |   |   |
| Is it ≤ 17.2?     |   |   |   |   |   |   |   |   |   |   |
| Is it ≤ 14.68?    |   |   |   |   |   |   |   |   |   |   |
We’ll collect class results here and find the experimental probabilities:
P(x-bar < 17.2): Exp. Prob = (number of times you got 17.2 or less) / 10 = ______
P(x-bar < 14.68): Exp. Prob = (number of times you got 14.68 or less) / 10 = ______
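Pooling the whole class's results gives many more than 10 sample means, so the experimental probability should land closer to the theoretical one. The sketch below imitates that pooling in Python. The class size of 30 is purely hypothetical, the seed is fixed only for repeatability, and `random.gauss` stands in for RandNorm.

```python
import math
import random
from statistics import NormalDist, mean

random.seed(2)          # assumption: fixed seed so the run can be repeated
CLASS_SIZE = 30         # hypothetical number of students, each recording 10 sample means
mu, sigma, n = 18, 4.2, 45

# Every student's ten sample means, pooled into one list
pooled = [
    mean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(CLASS_SIZE * 10)
]

# Experimental probabilities from the pooled class results
exp_prob_172 = sum(x <= 17.2 for x in pooled) / len(pooled)
exp_prob_1468 = sum(x <= 14.68 for x in pooled) / len(pooled)

# Theoretical probability for comparison
theory_172 = NormalDist(mu, sigma / math.sqrt(n)).cdf(17.2)

print(f"P(x-bar < 17.2):  experimental {exp_prob_172:.3f}, theoretical {theory_172:.3f}")
print(f"P(x-bar < 14.68): experimental {exp_prob_1468:.3f}")
```

A sample mean as low as 14.68 is so rare when mu = 18 that even a pooled class is unlikely to see one.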
4
HERE WE HAVE THE REAL STORY for all the numbers we have been using on problem 1
Diet colas use artificial sweeteners to avoid sugar. These sweeteners gradually lose their sweetness over time.
Manufacturers therefore test new colas for loss of sweetness before marketing them. Trained tasters sip the cola
along with drinks of standard sweetness and score the cola on a “sweetness score” of 1 to 10. The cola is then
stored for a month at high temperature to imitate the effect of four months storage at room temperature. Each
taster scores the cola again after storage. This is a matched pairs experiment. Our data are the differences (score
before storage minus score after storage) in each taster’s scores. The bigger these differences, the bigger the loss
of sweetness.
Suppose we know that for any cola, the sweetness loss scores vary from taster to taster according to a Normal
distribution with standard deviation sigma = 1. The mean mu for all tasters measures loss of sweetness and is
different for different colas.
Here are the sweetness losses for a new cola, as measured by 10 trained tasters:
2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3
Most are positive. That is, most tasters found a loss of sweetness (sweetness before – sweetness after > 0), and
two tasters thought the cola gained sweetness (sweetness before – sweetness after < 0).
The average sweetness loss is given by the sample mean x-bar = 1.02. Are these data good evidence that the
cola lost sweetness in storage?
SOLUTION
A hypothesis test is to be performed to decide whether the cola has lost sweetness.
We have two conflicting hypotheses:
(1) The cola did not lose sweetness: sweetness before – sweetness after = 0, that is, mu = 0
(2) The cola lost sweetness: sweetness before – sweetness after > 0, that is, mu > 0
We assume that the cola did not lose sweetness (that is, mu = 0) and analyze the observed x-bar = 1.02 as a
member of the x-bar distribution when mu = 0
From page 1, the z-score of 3.23 and the probability of 0.0006 tell us that this x-bar does not “fit” very well
in this distribution.
An x-bar = 1.02 is way out on the Normal curve; so far out that an observed value this large would rarely occur
just by chance if the true mu were 0. This observed value is good evidence that the true mu is greater than
0, that is, that the cola lost sweetness. If such an x-bar is observed, the manufacturer must reformulate
the cola.
The basic LOGIC in Hypothesis testing is simple: an outcome that would rarely happen if
a hypothesis were true is good evidence that the hypothesis is not true.
If, under a certain assumption, the probability of an observed event is very low, there is
good evidence that the assumption is not true
THIS REASONING APPLIED TO OUR PROBLEM:
Assumption: the cola has not lost sweetness (mu = 0)
Observed: in a sample of 10 tasters, x-bar = 1.02
Assuming mu = 0, we find that P(x-bar > 1.02) = 0.0006 (very low)
Conclusion: There is evidence that the assumption mu = 0 is not true. The sample results suggest that mu > 0,
which indicates that the cola lost sweetness.
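The reasoning above can be traced numerically from the ten taster scores. The following is a minimal sketch, not the handout's own procedure: it assumes sigma = 1 is known, as stated in the problem, and uses the standard library's `NormalDist` for the tail area. The variable names are illustrative.

```python
import math
from statistics import NormalDist, mean

# Sweetness losses (before minus after) reported by the 10 trained tasters
losses = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]

mu_assumed = 0   # assumption being tested: the cola did not lose sweetness
sigma = 1        # population standard deviation, given in the problem
n = len(losses)

xbar = mean(losses)                       # observed sample mean
se = sigma / math.sqrt(n)                 # standard error of x-bar
z = (xbar - mu_assumed) / se              # how many standard errors above mu = 0
p_upper = 1 - NormalDist().cdf(z)         # P(x-bar > observed) if mu really were 0

print(f"x-bar = {xbar:.2f}, z = {z:.2f}, P(x-bar > {xbar:.2f}) = {p_upper:.4f}")
```

A probability this small is what leads to the conclusion that mu = 0 is hard to believe.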
5
HERE WE HAVE THE REAL STORY for all the numbers we have been using on problem 2
The Food and Nutrition Board of the National Academy of Sciences states that the recommended daily
allowance (RDA) of iron for adult females under the age of 51 is 18 mg. Assume that the population standard
deviation is sigma = 4.2 mg. A researcher thinks that adult females under the age of 51 get on average less
than the RDA of 18 mg. of iron.
In order to test his claim, he selects a random sample of 45 women under 51-years-old and records their iron
intake (in mg.) during a 24-hour period obtaining a sample mean of 14.68 mg. Is this suggesting that women
under 51 years of age are getting less than the recommended amount of 18mg daily?
REASONING:
We have two conflicting hypotheses:
(1) Women under 51 get their RDA of iron of 18 mg: mu = 18
(2) Women under 51 get on average less than 18 mg of iron daily: mu < 18
We assume that they actually get the recommended mu = 18 and analyze the observed x-bar = 14.68 as a
member of the x-bar distribution when mu = 18
From page 3, the z-score of -5.30 and the probability of 0.00000006 tell us that this x-bar does not “fit”
very well in this distribution.
An x-bar = 14.68 is so far out on the Normal curve that it would rarely occur just by chance if the true mu were
18. This observed value is good evidence that the true mu is in fact lower than 18. Such an x-bar gives
evidence that adult females under the age of 51 are, on average, getting less than the RDA of 18 mg of iron.
The basic LOGIC in Hypothesis testing is simple: an outcome that would rarely happen if
a hypothesis were true is good evidence that the hypothesis is not true.
If, under a certain assumption, the probability of an observed event is very low, there is
good evidence that the assumption is not true
THIS REASONING APPLIED TO OUR PROBLEM:
Assumption: the mean daily intake of iron for women under 51 years of age is 18 mg (mu = 18)
Observed: in a sample of 45 women under 51 years of age, the mean iron intake x-bar is 14.68 mg
Assuming mu = 18, we find that P(x-bar < 14.68) = 0.00000006 (very low)
Conclusion: There is good evidence that the assumption mu = 18 is not true. The sample results suggest that
women under 51 years of age are, on average, getting less than the daily recommended 18 mg of iron.
6
Problem 1 again
Diet colas use artificial sweeteners to avoid sugar. These sweeteners gradually lose their sweetness over time. Manufacturers therefore
test new colas for loss of sweetness before marketing them. Trained tasters sip the cola along with drinks of standard sweetness and
score the cola on a “sweetness score” of 1 to 10. The cola is then stored for a month at high temperature to imitate the effect of four
months storage at room temperature. Each taster scores the cola again after storage. This is a matched pairs experiment. Our data are
the differences (score before storage minus score after storage) in each taster’s scores. The bigger these differences, the bigger the loss
of sweetness.
Suppose we know that for any cola, the sweetness loss scores vary from taster to taster according to a Normal
distribution with standard deviation sigma = 1. The mean mu for all tasters measures loss of sweetness and is
different for different colas.
Here are the sweetness losses for a new cola, as measured by 10 trained tasters:
2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3
Most are positive. That is, most tasters found a loss of sweetness (sweetness before – sweetness after > 0), and two tasters thought the
cola gained sweetness (sweetness before – sweetness after < 0).
Is this good evidence that the cola lost sweetness in storage?
a. What are the null and alternative hypotheses?
b. Will you use a right-tailed test, a left-tailed test, or a two tailed test?
c. Sketch the normal curve for the sampling distribution of x-bar when the null hypothesis is true.
Shade the area that represents the p-value for the observed outcome.
d. Find the value of the test statistic.
e. Calculate the p-value.
f. Is the result statistically significant at the alpha = 0.05 level?
g. Is the result statistically significant at the alpha = 0.01 level?
h. Do you think that there is convincing evidence that the cola lost sweetness in storage?
7
The Food and Nutrition Board of the National Academy of Sciences states that the recommended daily
allowance (RDA) of iron for adult females under the age of 51 is 18 mg. Assume that the population standard
deviation is sigma = 4.2 mg. A researcher thinks that adult females under the age of 51 get on average less
than the RDA of 18 mg. of iron.
In order to test his claim, he selects a random sample of 45 women under 51-years-old and records their iron
intake (in mg.) during a 24-hour period obtaining a sample mean of 14.68 mg. Is this suggesting that women
under 51 years of age are getting less than the recommended amount of 18mg daily?
i. What are the null and alternative hypotheses?
j. Will you use a right-tailed test, a left-tailed test, or a two tailed test?
k. Find the value of the test statistic.
l. Sketch the normal curve for the sampling distribution of x-bar when the null hypothesis is true.
Shade the area that represents the p-value for the observed outcome.
m. Calculate the p-value.
n. Is the result statistically significant at the alpha = 0.05 level?
o. Is the result statistically significant at the alpha = 0.01 level?
8
p. Do you think that there is convincing evidence that women under 51 get, on average, less than 18 mg of iron daily?
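The steps in these questions (choose the tail, compute the test statistic, find the p-value, compare it with alpha) can be collected into one small function. The sketch below is an assumption-laden illustration, not the handout's prescribed method: it treats sigma as known, uses the z statistic, and the function name `z_test` and its `tail` argument are hypothetical.

```python
import math
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, tail):
    """One-sample z test with known sigma.

    tail is "left", "right", or "two", matching the direction of the
    alternative hypothesis. Returns the test statistic z and the p-value.
    """
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()              # standard Normal: mean 0, sd 1
    if tail == "left":
        p_value = std_normal.cdf(z)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value

# Iron intake problem: null mu = 18, alternative mu < 18 (left-tailed)
z_iron, p_iron = z_test(xbar=14.68, mu0=18, sigma=4.2, n=45, tail="left")
print(f"z = {z_iron:.2f}, p-value = {p_iron:.8f}")

for alpha in (0.05, 0.01):
    verdict = "significant" if p_iron <= alpha else "not significant"
    print(f"alpha = {alpha}: {verdict}")
```

Calling `z_test(1.02, 0, 1, 10, "right")` applies the same steps to the cola problem.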
9