Download File

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Degrees of freedom (statistics) wikipedia , lookup

Confidence interval wikipedia , lookup

Taylor's law wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

Misuse of statistics wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
Comparing Two Means
Lesson 3 and Lesson 4: Section 10.2
Homework Even Answers – Lesson 1 & 2

Lesson 1: p.621 #1, 3, 5, 7-10, 11, 13
8.) Random. Normal Condition not met.
10.) Normal condition is not met.

Lesson 2: p. 623 #15, 17, 20, 21, 23, 25, 28
20.) Here are Min Jae’s mistakes:
(1) Random: This is a randomized experiment.
(2) Proportions should be: pˆ1  30  0.60 and pˆ 2  22  0.44
50
(3) Test statistic should be:
z
50
(0.60  0.44)  0
 1.60
(0.52)(0.48) (0.52)(0.48)

50
50
(4) The new P-value is P(Z>1.60) = 0.0548
Homework Even Answers – Lesson 1 & 2

Lesson 2: p. 623 #15, 17, 20, 21, 23, 25, 28
28. (a) If not, it would break the independence rule.
(b) State: H0: p1 – p2 = 0 versus Ha: p1 – p2 ≠ 0 where p1 = the actual
proportion of people who speed in the presence of radar and p2 = the
actual proportion of people who speed when there is no radar obvious.
Plan: Two sample z-test for p1 – p2.
Random: Both samples were selected randomly.
Normal: 1051, 2234, 5690, 7241 are all at least 10
Independent: There are more than 32,80 cars when
there is radar and 129,310 cars when there is no radar
Do: z = 12.47, P-value ≈ 0.
Conclude: Since P-value < 0.05, we reject H0. We have enough
evidence to conclude that there is a difference in the proportions of
speeders when ther is obvious radar present and when there isn’t.
objectives
Lesson 3:
 Describe the characteristics of the sampling distribution of
x1  x2
x1  x2

Calculate the probabilities using the sampling distribution of

Determine whether the conditions for performing inference are met.
Lesson 4:
 Using two-sample t procedures to compare two means based on summary
statistics.

Use two-sample t procedures to compare two means from raw data.

Perform a significance test to compare two means.
the sampling distribution of x1  x2
 Shape: If the population is Normal, then x1  x2


will be Normal. In other cases, it will be Normal if the
sample size is large enough (n1 ≥ 30 and n2 ≥ 30)
Center:  x1  x2  1   2
Spread:
 12  2 2
 x1  x2 

n1
n2


As long as the sample is ≤ 10% of the population
This is on your formula sheet under Two-Sample!
example 1: potato chips





A potato chip manufacturer buys potatoes from two different suppliers,
Riderwood Farms and Camberley, Inc.
The weights of potatoes from Riderwood Farms are approximately
Normally distributed with a mean of 175 grams and a standard deviation
of 25 grams.
The weights of potatoes from Camberley are approximately Normally
distributed with a mean of 180 grams and a standard deviation of 30
grams.
When shipments arrive at the factory, inspectors randomly select a sample
of 20 potatoes from each shipment and weigh them. They are surprised
when the average weight of the potatoes in the sample from Riderwood
Farms xr is higher than the average weight of the potatoes in the sample
from Camberley xc .
(a) Describe the shape, center, and spread of the sampling distribution of
xc  xr

The shape of the sampling distribution of xc  xr is approximately Normal
because both populations are approximately Normal. The mean of the
sampling distribution is 180 – 175 = 5 grams, and it standard deviation is
 x x
c
r
252 302


 8.73grams
20 20
example 1: potato chips

(b) Find the probability that the mean weight of the
Riderwood sample is larger than the mean weight of the
Camberley sample. Should the inspectors have been
surprised?
If the mean of the Riderwood sample is larger, then xc  xr
must be negative.
05
z

P( xc  xr  0 ) = P(
)=
8.73
P( z<-0.57) = (Using Table A) = 0.28
(Draw a picture!)


The inspectors should not be surprised, because the
Riderwood sample will have a higher mean more than onefourth of the time.
Some things to know about two sample
means:


We rarely know the population standard deviation, so we
use our sample’s standard deviation sx  which means
we use a t statistic
If you do not know the population standard deviation,
then you will use sx in the standard error formula: s 2 s
1
n1


2
2
n2
The t statistic (tells the number of standard deviations
from the mean) is calculated as such: t  ( x1  x2 )  ( 1   2 )
2
2
s1 s2

n1 n2
conditions for intervals and tests


Confidence intervals and significance test for 2 sample
means have the same conditions (and the same
standard error.)
Conditions:



Random: both samples are collected by a random sample or
from a randomized experiment.
Normal: If the populations are Normal, then this condition is
satisfied. If the populations are not Normal, then the sample
size needs to be large (n1 and n2 are≥ 30. ) If the sample size
is smaller, make a graph to check for any major
skewness/outliers.
Independent: Both the samples or groups themselves and the
individual observations in each sample or group are
independent. When sampling without replacement, check the
10% rule on both populations.
Two Sample t Interval for a Difference
Between Two Means

When the conditions are met, an approximate level C
confidence interval for x1  x2 :
2
2
s1 s2
( x1  x2 )  t *

n1 n2

Where t* is the critical value for the confidence level C for
the t distribution (in Table B) with degrees of freedom from
either Option 1 (technology) or Option 2 (the smaller of
n1 – 1 and n2 – 1)
example 2: plastic grocery bags
Do plastic bags from Target or plastic bags from Bashas hold
more weight? A group of AP Statistics student decided to
investigate by filling a random sample of 5 bags from each
store with common grocery items until the bags ripped.
Then they weighed the contents of items in each bag to
determine its capacity. Here are their results, in grams:
Target: 12,572 13,999 11,215 15,447 10,896
Bashas: 9,552
10,896 6,983
8,767
9,972
 Problem: (a) Construct and interpret a 99% confidence
interval for the difference in mean capacity of plastic grocery
bags from Target and Bashas.
 (b) Does your interval provide convincing evidence that
there is a difference in the mean capacity between the two
stores?

example 2: plastic grocery bags

(a) STATE: We want to estimate µT – µB at the 99%
confidence level where µT = the true mean capacity of plastic
bags from Target (in grams) and µB = the true mean capacity
of plastic bags from Bashas (in grams).

PLAN: We should use a two-sample t interval for µT – µB if
the conditions are satisfied.



Random: The data came from separate random samples.
Normal: Small sample size, so graph the data. (no obvious
skewness of outliers, it is safe to use t procedures)
Independent: The sample were taken independently and there
were at least 10(5) = 50 plastic bags at each store.
example 2: plastic grocery bags

(a) DO: For these data, xT  12,825, sT  1912.5, xB  9234, sB  1474.2
Using the conservative degrees of freedom 5-1 = 4, the
critical value for 99% confidence is 4.604 (using Table B).
1474.2 2 1912.52
(12,826  9234)  4.604

5
5
 = 3592 ± 4972 = (-1380, 8564)

CONCLUDE: We are 99% confident that the interval
from -1380 to 8564 captures the true difference in the mean
capacity of plastic grocery bags from Target and from Bashas.

FYI: If we used technology to find the df, it would be = 7.5 and
give us a CI = (-101, 7285). Notice how much narrower this
interval is!
example 2: plastic grocery bags

(b) Since 0 is included in the interval, it is plausible that
there is no difference between the two means. Thus, we do
not have convincing evidence that there is a difference.
(However, if you looked at the graph, I think if we made our
sample size bigger, we would have convincing evidence!)
in your calculator: two sample z interval for
difference between two means

(for another example of a STATE, PLAN, DO, CONCLUDE
with a two sample t interval for a difference between two
means, see pg. 635 in text and CYU on pg. 638)

To check answer in your calculator, press STAT, scroll to
TESTS, and choose option 0: 2-SampTInt
Choose Stats as the input method (you could put data as
long as you put your data into L1 and L2)
Enter the confidence level
For Pooled, choose “no”
Highlight Calculate and press Enter (this is how you can use
technology to find degrees of freedom)




Two Sample t test for a Difference Between
Two Means

When the conditions are met, to test the hypothesis
H0: µ1 – µ2 = 0,

Calculate the t statistic:
t
( x1  x2 )  (hypothesizedvalue)
2
2
s1 s2

n1 n2

Then find the P-value by calculating the probability of getting a t
statistic this large or larger in this direction specified by the
alternative hypothesis Ha using Table B or the calculator.
example 3: the stronger picker upper
In commercials for Bounty paper towels, the manufacturer claims
that they are the “quicker picker upper”. But are they also the
strong picker upper? Two AP Statistics students, Joe and
Courtney, decided to find out. They selected a random sample of
30 Bounty paper towels and random sample of 30 generic paper
towels and measured their strength when wet. To do this, they
uniformly soaked each paper towel with 4 ounces of water, held
two opposite edges of the paper towel, and counted how many
quarters each paper towel could hold until ripping, alternating
brands. Here are the results:
Bounty: 106, 111, 106, 120, 103, 112, 115, 125, 116, 120, 126, 125,
116, 117, 114, 118, 126, 120, 115, 116, 121, 113, 111, 128, 124, 125,
127, 123, 115, 114
Generic: 77, 103, 89, 79, 88, 86, 100, 90, 81, 84, 84, 96, 87, 79,
90, 86, 88, 81, 91, 94, 90, 89, 85, 83, 89, 84, 90, 100,
94, 87

example 3: the stronger picker upper

(a) Display these distributions using parallel box plots (box
plots on the same graph) and briefly compare these
distributions. Based on only the box plots, discuss whether
or not you think the mean for Bounty is significantly higher
than the mean for generic.

The 5-number summary:
Bounty: (103, 114, 116.5, 124, 128)
 Generic: (77, 84, 88, 90, 103)
Both box plots are roughly symmetric, but the generic brand has two high
outliers. The center of the Bounty distribution is much higher than the
center of the generic distribution. The range is roughly the same, but the
IQR of Bounty distribution is larger.
Since the centers are so far apart and there is almost no overlap in the two,
the Bounty mean is almost certain to be significantly higher than the
generic mean.

example 3: the stronger picker upper

(b) Use a significance test to determine whether there is convincing
evidence that wet Bounty paper towels can hold more weight, on
average, than we generic paper towels can.

STATE: We will test H0: µB – µG = 0 versus Ha: µB – µG > 0 at

the 0.05 significance level, where µB = the mean number of
quarters a wet Bounty paper towel can hold and and µG = the
mean number of quarters a wet generic paper towel can hold.
PLAN: We should use a two-sample t test for µB – µG if the
conditions are satisfied.



Random: The students used a random sample of paper towels from each
brand.
Normal: Even there are two outliers in the generic distribution, both
distributions are reasonably symmetric and the sample sizes are both at
least 30.
Independent: The sample were taken independently and there were at least
10(30)=300 paper towels of each brand.
example 3: the stronger picker upper

(a) DO: For these data,
Test statistic:

t
xB  117.6, sB  6.64, xG  88.1, sG  6.30
(117.6  88.1)  0
2
6.64 6.30

30
30
2
 17.64
= Using conservative degrees of freedom 30 -1 = 29 (if you used
technology, df = 57.8) to find the P-value, the P(t>17.64) ≈ 0.

CONCLUDE: Since the P-value is less than 0.05, we reject H0.
We have convincing evidence that the wet Bounty paper towels
can hold more weight, on average, than we generic paper towels.

Conclude in terms of problem: Since the P-value is approximately
0, there is really no chance to get a difference in means of at least
29.5 quarters by random chance if the two brands of paper towels
can hold the same mean amount of weight when wet.
in your calculator: two sample t test for
difference between two means.

(for another example – see pg. 639 in text and CYU on pg.
644)

To check answer in your calculator,
ENTER group 1 in L1 and group 2 in L2
Press STAT, scroll to TESTS, and choose option 4: 2SampTTest
Choose data or stats
Choose your alternate and pooled should say “no”
Highlight Calculate and press Enter (or you can press Draw
to see t distribution)





homework



Assigned reading: p. 627-643
HW problems:
 p. 652 #35, 37, 39, 40, 57
Check answers to odd problems.