Math 3307
Lecture Notes
Perkowsky text
Monday format
May’13
Jan. 2015
Chapters 8 – 10
Homework Assignments
10 points each problem
Homework 6   Chapter 8    2, 4, 6, 8, 10, 14, 16    70 points
Homework 7   Chapter 9    2, 4, 6, 12               40 points
Homework 8   Chapter 10   2, 4, 6, 10, 12, 16       60 points
Homework style sheet and rules:
Work on one side only; pdf it and upload it before the deadline on the calendar.
Work that is poorly scanned or illegible will be given a zero.
This includes sideways or upside down scans!
Do NOT crowd the work; leave at least 3” between problems.
Label the answers carefully so the grader can grade efficiently.
Chapter 8 – Distributions from Random Samples
8.1 Random Sampling
Let’s go with the book’s comment about defining “random” by what it’s NOT:
systematic, logical, having a clear pattern or order.
In statistics, random has to do with the process of picking a sample – each element
in the population has an equal chance of being chosen.
Let’s look at Classroom Exploration 8.1 page 235
Let’s read it – will there be repetitions in the scenario?
Plan A: 24 cards, one name per card
how many possible samples? Equally likely?
Plan B: roll a die – the number on top is the row number
how many possible samples? Equally likely?
Question 3 and Question 4
Picking Amy?
Let’s now read page 237 at the top: an excerpt…
Note that in this part of the class we are doing inferential statistics – we want to
infer some conclusion about the population from our work…and we want to
quantify how reliable this conclusion is.
Now let’s read the Focus on Understanding project that starts on page 237…and
check out the results from doing it on page 240.
What do you notice about the dot plots?
What can you conclude about small samples vs bigger samples?
Note that we look at a range of values for the mean – why do we do this? What are
we trying to ensure by doing this? Focus on the discussion on page 241 in the
middle of the page for a discussion about these ideas.
8.2 The Distribution of Sample Means
The mean of a random sample is an estimator of the true population mean. It can
be a good estimate or a poor estimate. We want to ensure that it’s a good one!
How can we do this?
A. We want the mean to be unbiased.
We can check this by finding the expected value of the SAMPLE means.
If the expected value of the sample means is the true mean, then the sample mean is an unbiased estimator.
Operationally, the more perfectly random your samples, the closer to unbiased
your sample means are.
B. We want a large sample size, not a small one.
Operationally, n = 30 is the usual rule-of-thumb minimum sample size, but more is better if
you can afford it!
When we have these, then the distribution of sample means is approximately normally distributed
about the true mean, $\mu$.
This is so important! And took so long to discover!
Page 249
The Central Limit Theorem:
Regardless of the distribution of the population being sampled, the distribution of
sample means taken from random samples of size n is approximately normally
distributed when n is large.
See the caution on page 240 at the bottom of the last paragraph.
The mean of the sample means is the true population mean and the standard
deviation is the population standard deviation divided by the square root of n.
$\mu_{\bar{x}} = \mu$
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Let’s discuss that standard deviation:
Suppose n is small
Suppose n is large
Now compare the two dot diagrams on page 240 again.
So now, suppose we have 50 samples (random!) and we calculate the mean of each.
We then have a list of sample means as our data.
We find the mean of these sample means and the standard deviation of these sample
means. What do we know about the original population?
We know the means are the same and we can multiply our standard deviation to get
the original population standard deviation. Do you see how?
What DON’T we know?
The shape of the original distribution!
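The ideas above can be tried out with a small simulation. This is only a sketch: the skewed population (exponential with mean 10), the sample size of 30, and the seed are illustrative assumptions, not values from the textbook.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN = 10.0   # assumed population: exponential, so mean = sd = 10
N = 30            # size of each random sample
NUM_SAMPLES = 50  # how many random samples we draw

# Draw 50 random samples and record the mean of each one.
sample_means = [
    statistics.mean(random.expovariate(1 / POP_MEAN) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

# The list of sample means is now our data.
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

# The sd of the sample means is sigma / sqrt(n), so multiplying
# by sqrt(n) gives an estimate of the population sd.
estimated_pop_sd = sd_of_means * N ** 0.5

print("mean of sample means:", round(mean_of_means, 2))
print("estimated population sd:", round(estimated_pop_sd, 2))
```

Both printed values should land near 10, yet nothing in the list of sample means reveals that the population itself was strongly skewed.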
ACTIVITIES 8 #1
Let’s look at the example on page 250:
Back to Nicky’s free throws!
Recall her distribution (page 250 – mean is .96). Now we’ll look at a simulation of
size 50.
Let’s go through the calculations to find the mean and standard deviation for the
distribution of the sample means.
How do you find the mean and the standard deviation? What are the formulas?
WHERE are the formulas in the textbook?
Now let’s walk through Lauren’s simulation of doing 50 free throws and calculate
the probability that Lauren’s sample mean will be within .1 of the actual mean.
See page 251
Suppose we do this 4 times and take the AVERAGE mean from those 4
attempts…will this be more accurate than doing it just once? Why or why not?
What we are doing here with that “0.1” is finding an error bound or margin of
error. The probability that our estimate is within the given error bound is what we
calculated in this example. The probability is called the “confidence level” of our
estimate. For a fixed error bound, the confidence level of an estimate goes up as n increases.
Let’s review our procedure from a Big Picture viewpoint.
We got our sample and calculated the mean
We then went to z-scores* to find the probability “between”
We used Table 1 or our calculators to get the probability
We described our confidence level in our estimate
*and we used the distribution of the SAMPLE MEANS not the original distribution
in our calculations!
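The Big Picture procedure can be sketched in a few lines of Python. The function name `prob_within` and the population standard deviation of 0.5 are made up for illustration; the textbook's free-throw numbers are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist


def prob_within(error_bound, sigma, n):
    """P(|sample mean - true mean| <= error_bound) for samples of size n."""
    sd_of_means = sigma / sqrt(n)   # sd of the SAMPLE MEANS, not the population
    z = error_bound / sd_of_means   # z-score of the upper boundary
    std_normal = NormalDist()       # standard normal: mean 0, sd 1
    return std_normal.cdf(z) - std_normal.cdf(-z)  # area "between"


# sigma = 0.5 is a hypothetical value used only for this illustration.
for n in (10, 50, 200):
    print(n, round(prob_within(0.1, 0.5, n), 4))
```

With the error bound held at 0.1, the printed probability (the confidence level) grows as n grows.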
ACTIVITIES 8 #2
SD – Problem 1
A company that specializes in data analysis tests all its applicants for employment
by having them solve three short problems that are indicative of the type of work
they will be required to perform. An applicant is given a score from 0 to 10 for
each problem. From the performances of previous applicants, the sampling
distribution of mean scores has been found to be as shown in the table below.
Sketch this distribution on the right:
Mean   Prob
0      .001
1      .005
2      .010
3      .045
4      .060
5      .100
6      .150
7      .350
8      .200
9      .070
10     .009
Use your calculator to find the mean (6.570) and standard deviation (1.63)
Page numbers for formulas:
Check the Empirical Rule on your distribution. What is the z-score for 8?
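If you want to check your calculator work, here is a minimal sketch that applies the usual formulas for the mean and standard deviation of a discrete distribution to the table above. The variable names are arbitrary.

```python
from math import sqrt

# Sampling distribution from the table: mean score -> probability
dist = {0: .001, 1: .005, 2: .010, 3: .045, 4: .060, 5: .100,
        6: .150, 7: .350, 8: .200, 9: .070, 10: .009}

# Mean: sum of (value * probability)
mean = sum(x * p for x, p in dist.items())

# Standard deviation: square root of sum of (squared deviation * probability)
sd = sqrt(sum((x - mean) ** 2 * p for x, p in dist.items()))

# z-score for a mean score of 8
z_for_8 = (8 - mean) / sd

print(round(mean, 3), round(sd, 2), round(z_for_8, 2))
```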
SD – Problem 2
The mean number of patients admitted per day to a medium-sized regional hospital is 35
with a standard deviation of 10. If, on a given day, there are 60 beds available for
new patients, do you think the hospital will have to divert emergency vehicles to
another hospital?
SD – Problem 3
The sampling distribution of X, the number of people who arrive at a cashier’s
counter in a bank per minute is given below:
X    P(X)
0    .36
1    .38
2    .18
3    .06
4    .02
Verify the Empirical Rule.
ACTIVITIES 8 #3
8.3 The Distribution of Sample Proportions
Proportions have a place in statistics. We often use a sample proportion from a
random sample to estimate the true proportion of a population that has a specific
property.
Let’s look at Classroom Exploration 8.3 on page 253…
Let’s look at “drawing more blocks” page 254…
And look at the proportion on the bottom of page 254 to see how this differs a bit
from a sample mean.
Class discussion: What are the differences?
“hat” or caret notation is discussed on page 255 at the top…we have special
notation to use when we are talking about a sample proportion: $\hat{p}$
The expected value of “p-hat”: page 256
The standard deviation of “p-hat”: page 258
The distribution – no surprises here!
ACTIVITIES 8 #4
SP – Problem 1
Suppose a warship takes 6 shots at a target, and it takes at least 4 hits to sink the
target. If the warship has a record of hitting with 20% of its shots, in the long run,
what is the probability of sinking the target?
Is this binomial?
Sketch the distribution … make a table first.
Answer the question.
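One way to build the table and answer the question is sketched below. It assumes the shots are independent with the same 20% hit chance each time, which is exactly the binomial assumption the problem asks you to think about.

```python
from math import comb

n, p = 6, 0.20  # 6 shots, 20% chance of a hit on each (assumed independent)


def binom_pmf(k):
    """Probability of exactly k hits in n shots."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


# The table: number of hits -> probability
table = {k: binom_pmf(k) for k in range(n + 1)}
for k, prob in table.items():
    print(k, round(prob, 5))

# Sinking the target takes at least 4 hits.
p_sink = sum(table[k] for k in range(4, n + 1))
print("P(at least 4 hits) =", round(p_sink, 5))
```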
SP – Problem 2
Let’s consider the 107th Congress: There are 100 senators (2 per state). At that
time, there were 87 males and 13 females.
What is the population proportion of each type of senator? (M/F, NOT R/D)
Suppose we take 5 random samples of size 10.
S1   MFMMFMMMMM
S2   MFMMMMMMMM
S3   MMMMMMFMMM
S4   MMMMMMMMMM
S5   MMMMMMMMFM
Calculate the sample proportions.
Now suppose we go on and do 95 more samples resulting in the following table:
Sketch the frequency table:
Prop F   Freq
0.0      26
0.1      41
0.2      24
0.3      7
0.4      1
0.5      1
Check that the mean is 0.119 and the standard deviation is 0.100
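The check can be sketched as follows. The comparison with the theoretical values (p = 0.13 and the standard deviation formula for p-hat with n = 10) is added for illustration and is not part of the original problem.

```python
from math import sqrt

# Frequency table: proportion of females in the sample -> number of samples
freq = {0.0: 26, 0.1: 41, 0.2: 24, 0.3: 7, 0.4: 1, 0.5: 1}

total = sum(freq.values())  # should be 100 samples
mean = sum(x * f for x, f in freq.items()) / total
sd = sqrt(sum(f * (x - mean) ** 2 for x, f in freq.items()) / total)

# Theory for comparison: true proportion of female senators and sample size
p, n = 0.13, 10
theory_sd = sqrt(p * (1 - p) / n)

print("simulated mean:", round(mean, 3), " true p:", p)
print("simulated sd:  ", round(sd, 3), " theory sd:", round(theory_sd, 3))
```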
What would the frequency table look like if we did this 10,000 times?
SD – Problem 3
Here is the population of all 5 US Presidents who had professions in the military
along with their ages at inauguration:
Eisenhower   (62)
Grant        (46)
Harrison     (68)
Taylor       (64)
Washington   (57)
Assume that samples of size 2 are randomly selected WITH REPLACEMENT.
How many samples are possible?
What is the mean age of each sample?
Make a frequency table for these means…is this a sampling distribution?
What is the distribution for these means?
What is the mean of the table? How does this compare with the actual mean of the
presidents?
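Because the population is so small, every possible sample can be listed by brute force. The sketch below treats (Grant, Taylor) and (Taylor, Grant) as different samples, which is one reasonable reading of "with replacement"; that ordering choice is an assumption.

```python
from collections import Counter
from itertools import product
from statistics import mean

ages = {"Eisenhower": 62, "Grant": 46, "Harrison": 68,
        "Taylor": 64, "Washington": 57}

# Every ordered sample of size 2, drawn WITH replacement
samples = list(product(ages.values(), repeat=2))
print("number of samples:", len(samples))

# Mean age of each sample, then a frequency table of those means
sample_means = [mean(s) for s in samples]
freq_table = Counter(sample_means)
for value in sorted(freq_table):
    print(value, freq_table[value])

# Compare the mean of the table with the actual population mean
mean_of_means = mean(sample_means)
population_mean = mean(ages.values())
print(mean_of_means, population_mean)
```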
Chapter 9 – Estimating with Confidence
9.1 Confidence Intervals for Proportions
Now for a bit more reality. What if we really DON’T know the population mean
and standard deviation? What if we CAN’T check our sample means or sample
proportions against a true mean? This is usually the case, too.
We know that, within some boundaries, $\bar{x}$ and $\hat{p}$ are estimators. Let’s put those
boundaries on and quantify our certainty or confidence about them. We know that a
sample statistic might not be the true value, of course.
So we report the sample statistic with a “margin of error” and a percent.
For example: our sample mean might be 87…so we’d report:
$87 \pm 5$ with a confidence level of 95%
This means that we’re 95% sure that the true mean lies in the interval (82, 92).
Note that if you do 100 samples, you’d expect the true mean NOT to be in the interval about 5
times out of the 100 samples! This is what 95% confidence means…5% bad news!
We do need a large enough, random sample of course! ($n \ge 30$ or $n \cdot \min(p, q) \ge 5$)
You need to be able to approximate your distribution of sample statistics as normal.
Typically you start with an industry standard confidence level (90%, 95%, and 99%
are the usual). And we’re going to work BACKWARDS. We set a confidence
level or use the industry standard…we find the associated z-score…backsolve for
the necessary statistic ($\bar{x}$ or $\hat{p}$), and get our error bound.
Let’s step through the example on page 269:
You’d do your random draws and establish the distribution of the statistic. This
time, n = 40 experiments with a population proportion. You’d find the sample
proportion mean.
Let’s reproduce their distribution. Their sample mean is 0.375 and they want a 95%
confidence interval. The sample standard deviation is 0.07655.
Let’s get the z-scores for “95% of the data is in between these error bounds”
Sketch the normal curve and place the “CI” on the x-axis.
Half of 5% is 2.5% which is .025. Look in the chart for area .975 (WHY NOT 95%?)…or use
2nd…DISTR….3:invNORM(.975)…enter
You’ll get a z-score of 1.96
Now we’ll find that 95% of the area, the probability, is between −1.96 and 1.96
Let’s find our error bounds from this:
z-score:
$z = \dfrac{\hat{p} - p}{\sigma_{\hat{p}}}$
let’s discuss what EACH item is
We KNOW z-score: 1.96
$\pm 1.96 = \dfrac{0.375 - p_{\text{boundary value}}}{0.07655}$
Now let’s look at their equation…do you see where it comes from?
We’ll find that the boundary values are 0.375 ± 1.96(0.07655), that is, 0.225 and 0.525, so our error bound is 0.150. We’ll
report that the mean of our sampling distribution is $0.375 \pm 0.150$ with 95%
confidence.
Let’s go through the Focus on Understanding at the bottom of page 270…
Note, too, that our confidence interval goes from 0.225 to 0.525, a span of 30 percentage points…this is because we demanded 95% confidence from a sample of only 40. If we’d drop down to 60%
confidence, our range would shorten dramatically…
Sketch this CI as though it were perfect.
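The numbers in this example can be reproduced with a short sketch. It uses Python's `statistics.NormalDist` in place of the chart or the calculator's invNorm; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

p_hat, n, confidence = 0.375, 40, 0.95

# Standard deviation of the sample proportion
sd_p_hat = sqrt(p_hat * (1 - p_hat) / n)

# Critical z-score: area 0.975 to the left, same idea as invNorm(.975)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Error bound and the two boundary values of the interval
error_bound = z * sd_p_hat
lower, upper = p_hat - error_bound, p_hat + error_bound

print("sd of p-hat:", round(sd_p_hat, 5))
print("z:", round(z, 2))
print("error bound:", round(error_bound, 3))
print("interval:", round(lower, 3), "to", round(upper, 3))
```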
In general with confidence intervals:
High confidence: wide range – long interval
Lower confidence: shorter range – shorter interval
See this on the sketch above.
Page 272
Still working with sample proportions, let’s find a formula for the error bounds with
95% confidence…that’s going to make our computations quite a bit shorter.
We’ll start with their formula from the middle of page 270
$\hat{p}_2 = \mu_{\hat{p}} + z_2\,\sigma_{\hat{p}}$
Now let’s do some substituting
$\hat{p} = p + 1.96\sqrt{\dfrac{pq}{n}}$
Now our error is the difference between our statistic and our mean:
we want the error to be positive so we can add/subtract it easily.
$E = |\hat{p} - p|$
Substitute in the formula above…the p’s cancel and we get
$E = 1.96\sqrt{\dfrac{pq}{n}}$
This is somewhat unrealistic! We’ll actually use the $\hat{p}$ and $\hat{q}$ from our
sample rather than the true values that we don’t know!
This formula makes it simple to see how to change out to confidence levels other
than 95%. We just change the “1.96” to a z-score that reflects less confidence.
Let’s look at shortening our interval by dropping back to 90% confidence.
Sketch the standard normal curve. Mark off a symmetric area of 90%...how much
area is in each tail?
Translate that to an upper z-score:
2nd…DISTR….3:invNORM(____)…enter
Substitute in the error bound formula:
Let’s find the size of the confidence interval from above when the sampling mean
and sd are 0.375 and 0.07655 with n = 40.
These specific z-scores for the confidence levels are called critical values.
Let’s make a table with 99%, 95%, and 90% critical values. How will we find the
one for 99%?
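A quick way to build that table is sketched below. The helper name `critical_value` is made up; the only library call is `NormalDist().inv_cdf`, which plays the role of invNorm.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Upper z-score that leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_value(level):.3f}")
```

For 99%, each tail holds 0.005, so the lookup area is 0.995.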
Let’s wrap up this section by checking out the mayfly experiment on pages 278 and
279. First read the example, then let’s go around the room answering the questions.
ACTIVITIES 9 #2
Confidence Interval for proportions Problems
Problem 1
Suppose a pollster interviews 1000 voters and finds that 540 favor building a new
elementary school in Houston’s east side.
Find an estimate for the proportion in favor of building the school.
Find a 95% confidence interval
Hints: what is the appropriate z-score and how did you find it?
Use the formula: $\hat{p} \pm z\sqrt{\dfrac{pq}{n}}$
why?
What is the Error Bound and how did you find it? State it in terms of 54%
$E = z\sqrt{\dfrac{pq}{n}}$
This leads directly to estimating the sample size for a proportion:
$n = \left(\dfrac{z}{2E}\right)^2$
where z is the appropriate z-score for your desired percent
confidence and E is your error bound from above.
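As a sketch of where this comes from, start with the error bound formula and solve for n. The last step assumes the worst case p = q = 0.5, which makes pq as large as it can be; the notes do not state that assumption explicitly.

```latex
\begin{align*}
E &= z\sqrt{\frac{pq}{n}} \\
E^2 &= \frac{z^2\,pq}{n} \\
n &= \frac{z^2\,pq}{E^2} \\
n &\le \frac{z^2\,(0.5)(0.5)}{E^2} = \left(\frac{z}{2E}\right)^2
\end{align*}
```

Since a sample size must be a whole number, round the result up.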
ACTIVITIES 9 #3
Problem 2
Alcohol abuse has been described by college presidents as the number one problem
on campus, and it is a major cause of death in young adults. How common is it?
In 2000, an article by Henry Wechsler (and colleagues) in the Journal of American
College Health reported the following data.
“Binge drinking” is defined as having five or more drinks in a row for men and four
or more for women. “Frequent” is defined as having three or more binge drinking
times in the past two weeks. The survey included 13,819 students, and 3,140
students were considered frequent binge drinkers.
Find a 90% confidence interval for the proportion of frequent binge drinkers in the
student population of the USA.
Problem 3
The most recent Walmart retail worker survey found that 65 out of 100 employees
agreed that work stress had a negative impact on their personal lives. What is the
90% confidence interval for this information? What is the 98% CI for this information?
How do they compare?
Problem 4
One research firm in 2000 interviewed n = 1156 drivers aged 18 – 20 years old and
found that 83% enjoyed driving.
Construct a 95% confidence interval for the proportion of 18 – 20 year olds who
enjoy driving. Identify the error bound.
Do an 80% CI. Compare to the first one…what do you notice?
9.2 Confidence Intervals for Means
This is a way to estimate a population mean…it’s an interval with a margin for
error attached. You can have, for example, 92% confidence that the true mean is in
the interval.
Finding a 95% confidence interval for the true mean step by step:
Note that 95% of the area under the standard normal curve lies between z = −1.96 and z = +1.96…let’s look at why!
Next solve this inequality for $\mu$ (note: it’s the z-score formula)
$-1.96 \le \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}} \le 1.96$
The hard part will ALWAYS be the standard deviation. Using the adjusted sd will
help.
Suppose we have information from the State of Texas Education Coordinating
Board that they have selected 100 tests from fifth graders for analysis (randomly, of
course). The sample mean for the tests is 74.4 and the sample standard deviation is
12.4.
How will we find the true mean for these kids?
How will we describe our ERROR BOUND
and what will we report to the parents and tax payers?
What if we wanted an 85% confidence interval?
A 99% confidence interval?
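Here is a minimal sketch of the computation for the Texas test data, using the sample standard deviation in the adjusted sd as the notes suggest. The function name `mean_ci` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n = 74.4, 12.4, 100
adjusted_sd = s / sqrt(n)  # sd of the sample means


def mean_ci(confidence):
    """Return (lower, upper, error_bound) for the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    error_bound = z * adjusted_sd
    return x_bar - error_bound, x_bar + error_bound, error_bound


for level in (0.85, 0.95, 0.99):
    lower, upper, e = mean_ci(level)
    print(f"{level:.0%}: {x_bar} +/- {e:.2f}  ({lower:.2f}, {upper:.2f})")
```

The interval widens as the confidence level rises, which matches the high confidence, long interval pattern from Section 9.1.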
ACTIVITIES 9 #4
So what does it mean when the insert in the medicine box says there’s an allergic
reaction in .02% of takers plus/minus .01%?
Again, a normal distribution with, say 95% of the area marked off symmetrically.
How much area in the tails…what is the appropriate z-score?
SKETCH:
Suppose you want to estimate the mean 4th grade STAAR score for the more than
53,000 students in the fourth grade at a Texas school system statewide. At
considerable effort and expense, you give the appropriate STAAR test to an SRS of
300 Texas fourth grade students and the mean score is 78 with a standard deviation
of 15. What can you say about your level of confidence in this statistic?
First, it’s normal with a true mean and an adjusted standard deviation!
What are those numbers?
Now, the Empirical Rule says that 95% of the data is within 2 standard deviations
of the mean…let’s look at those numbers – frame the mean with them
What do we have here? What is our level of confidence here? How can we report
this?
Problem 1
A study based on a sample of size 35 reported a mean of 93 with a margin of error
of 11 with 95% confidence.
Give the confidence interval.
Discuss where the true mean might be?
What measurement is at the midpoint of the confidence interval?
Problem 2
A survey of 1532 recent UH grads found that 175 had loans in excess of $35,000
for their education. Give a 95% confidence interval for the proportion of all student
loan borrowers who have loans this size at UH.
Is this a proportion question or a mean question?
Work the problem!
Problem 3
Is a large sample confidence interval valid if the population from which the sample
is taken has a distribution VERY different from a normal one?
Problem 4
A fact long known but little understood is that twins, in their early years, tend to
have lower IQ’s and pick up language more slowly than singletons. Recently,
psychologists have found that this may be caused by benign parental neglect – it’s
too much for most parents dividing time between two babies. A random sample of
46 sets of 2 year old twins is taken and at the end of one week, the attention time
given to each pair is recorded. The mean attention time is 22 hours and the sample
standard deviation is 16 hours.
Using this data find a 90% CI for the mean attention time given to each pair.
Technology makes it simpler:
See pages 290 – 291 for getting CI’s from TI’s
9.3 Sample Size
Which is better and why?
A simple random sample of size 1000
A simple random sample of size 45
How do we find the ideal sample size? We use the error bound formula from
confidence intervals. Let’s do this:
Solve for n
Proportions
$E = z\sqrt{\dfrac{pq}{n}}$
Means
$E = z\left(\dfrac{s}{\sqrt{n}}\right)$
Problem 1
The variance on the weight of a Hershey’s kiss is 1.5129. How many kisses are
needed to estimate the mean weight of each candy to within 0.1 grams with 85%
confidence?
Proportion or Mean? How do you know?
85% confidence n = _______
What about 90% confidence?
n = _______
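One way to check your blanks is sketched below. It solves the means formula for n and rounds up, treating the given variance as the known value of s squared; that choice and the function name are assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

variance = 1.5129
s = sqrt(variance)  # standard deviation, about 1.23 grams
E = 0.1             # desired error bound in grams


def sample_size_for_mean(confidence):
    """Smallest whole n with z * s / sqrt(n) <= E."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * s / E) ** 2)  # always round UP


n_85 = sample_size_for_mean(0.85)
n_90 = sample_size_for_mean(0.90)
print("85% confidence:", n_85)
print("90% confidence:", n_90)
```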
Problem 2
Suppose you want to estimate the number of people who watched a new television
show. How large a sample would you need if you want to be 95% confident that
your estimate is within 3% of the true value?
Proportion or mean? Why?
Big hint: see the box page 295, in the middle
Chapter 10 – Testing Hypotheses
10.1 What is a Hypothesis Test?
One real reason for using statistics is to make a decision. And hypothesis testing
provides a format for doing just that when the topic is the numerical value of a
statistical parameter.
The STEPS are:
Formulate the null hypothesis
(H nought -- Ho)
This is generally a claim like “67% of all dentists use Crest at home”
$\mu = .67$
And the alternative hypothesis
(Ha)
This would be from the competitors. Both must be in mathematical form and
the null hypothesis comes first:
For example
$\mu \ne .67$
$\mu < .67$
Not equal, greater than, or less than all work
Select the appropriate test statistic
Use an expert here – there are many.
We have chosen mean or average so we’d use z-score. But a binomial
hypothesis would use proportions. There are LOTS of distributions we
haven’t studied in this brief course.
TS for means
$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{s/\sqrt{n}}$
TS for proportions
$z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Determine the decision rule (IN ADVANCE!)
Industry standards loom large here. 70% works for most political situations
with 99% for medical situations. Deciding your level of confidence totally
determines the outcome!
Collect the data/Evaluate the test statistic
We’ve done some work on this.
This is where Sampling Means and proportions show up!
Make a decision
Reject the hypothesis; cannot reject the hypothesis
ACTIVITIES 10 #1
Now a four part chart sums up what’s happening
Across the top: the hypothesis (Hnought) is true or false (in reality)
Down the side: Reject; do not reject
Note that we are trying to minimize errors: a Type 1 error (reject Ho when it is
really true) and a Type 2 error (don’t reject it when it is actually false). These are
linked! For a fixed sample size, reducing one generally increases the other! The other two outcomes
are correct decisions!
Let’s look at a courtroom for a non-numerical example
Ho: the defendant is innocent
Ha: the defendant is guilty
The defendant is convicted or not
Set up the 2x2 table. Which error does our system want to minimize?
So let’s look at a numerical hypothesis test about means!
Example
Building specifications in Houston require that residential sewer pipe have a
minimum mean breaking strength of 2400 pounds per foot (ppf). A contractor has
been having problems with pipe bought from a particular manufacturer and this
contractor thinks the pipe does not meet the minimum standard. In an attempt to
substantiate this feeling, the contractor hires a testing lab to test a random sample of
55 sections of pipe and finds the following:
$\bar{x} = 2340$ ppf
$s = 200$ ppf
Is there sufficient evidence to conclude that the contractor is correct?
Use an alpha of 10%, i.e., a confidence level of 90%.
Note that n = 55 is enough to use the idea that the distribution of SAMPLE means is
normally distributed with an adjusted standard deviation.
Ho: the mean = 2400 ppf
Ha: nope, it’s less than 2400 ppf
Picture:
So our critical z-score is z = −1.28 This defines our rejection region! If our test
statistic is below this critical value we’ll reject Ho and go with Ha.
Calculate our test statistic. NOTE the adjusted sd:
$z = \dfrac{2340 - 2400}{200/\sqrt{55}} = -2.22$
WOW decision time!
What do we decide?
What do we do next?
Give up in despair? Mediation? Court?
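The whole sewer pipe test can be sketched in code as follows. It is a left-tailed z-test with the decision rule fixed in advance; the variable names are arbitrary.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n = 2340, 200, 55  # sample results from the testing lab
mu_0 = 2400                  # value claimed in Ho
alpha = 0.10                 # chosen IN ADVANCE

# Test statistic, using the adjusted sd  s / sqrt(n)
z = (x_bar - mu_0) / (s / sqrt(n))

# Left-tailed test: the rejection region is everything below this z-score
z_critical = NormalDist().inv_cdf(alpha)

reject_null = z < z_critical
print("test statistic:", round(z, 2))
print("critical value:", round(z_critical, 2))
print("reject Ho?", reject_null)
```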
Note different pictures for different Ha’s
Not equal: alpha of, for example, 15%; two-tailed
Greater than: alpha of 20%; one-tailed
Less than: alpha of 5%; one-tailed
Again, there are MANY test statistics and MANY distributions, but the gist of the
process is here.
ACTIVITIES 10 #2
Another example:
A research psychologist will administer a test designed to measure self-confidence
to a random sample of 50 professional athletes. The psychologist thinks that
professional athletes are more self-confident than the population at large. Since the
national average on the test is known to be 72, the psychologist does his testing.
He finds a sample mean of 74.1 and a standard deviation of 13.3. Is he right with
an alpha of .05?
What is the picture?
Ho: mean = 72
Ha: mean > 72
Test statistic:
$z = \dfrac{74.1 - 72}{13.3/\sqrt{50}} = 1.12$
Not very unusual, but is it enough?
Well, our alpha is .05. We find that .05 of the area corresponds to a z score cut off
of 1.65
Check your chart.
Nope. The TS needed to be HIGHER than 1.65 for him to claim he’s right. We do
not reject Ho.
This doesn’t mean he’s totally wrong and too stupid to live. It does mean that the
results of this test with this random sample don’t support his belief. He won’t get a
published article out of this research.
Let’s review the types of alternate hypotheses and the rejection regions
all in one place
Alpha: 10%
Less than: z < −1.28 (left-tailed test)
Greater than: z > 1.28 (right-tailed test)
Not equal: z > 1.65 OR z < −1.65 (two-sided test)
Why the change in z?
Pictures:
Alpha: 5%
Less than: z < −1.65
Greater than: z > 1.65
Not equal: z > 1.96 OR z < −1.96
Pictures:
Alpha: 1%
Less than: z < −2.33
Greater than: z > 2.33
Not equal: z > 2.58 OR z < −2.58
Pictures:
ACTIVITIES 10 #3
Problem 1
A consumer advocate group thinks that a cereal manufacturer is wrong about the
bran content of their cereal. The manufacturer claims that the cereal has 1.2 oz of
bran per serving; the advocacy group thinks it’s less than the claimed amount.
The group selects 60 boxes randomly from grocery stores all over the country and
has an analysis done by an outside lab.
The mean is 1.170 and the standard deviation is .111
The group wants an alpha of .05
Whose claim is supported by the evidence?
Problem 2
Ford thinks the new Focus exceeds the EPA mileage
recommendation of 43 mpg. They have an independent firm do the testing. 40 cars
are selected and tested; the sample mean is 43.6 and the sample standard deviation
is 1.3.
Are they right?
Problem 3
A new pain reliever is being tested on hospital patients. The pain reliever currently
in use is effective in 4.0 minutes. The new drug is randomly administered to 50
patients and the relief time is recorded. The sample mean is 3.8 with a sample
standard deviation of 1.4 minutes. Is the new drug more effective? Test with an
alpha of 10%, then with 1%
Problem 4
Brides magazine sponsored a survey of 3600 subscribers and found that 62% of
them spent more than $750 on their gown. Use an alpha of .05 to test the claim that
less than 65% of brides spend more than $750 on their gown.
Comment on using subscribers vs the general population of prospective brides.
10.2 Tests about Proportions
The reputation and the sales of a manufacturing company can be damaged by
shipments that contain large percentages of damaged goods. For example, an
artisan who creates earrings on Etsy for sale to other businesses wants to keep
damaged goods under 3%. A random sample of 300 earrings is selected from a
very large shipment from her factory in Indonesia; each is tested and 27 are found
to be defective. Does this provide sufficient evidence at the .10 level of
significance that this shipment should not be sent?
So this is about proportions. What changes?
Only the test statistic! Why are we still using z-score?
Everything else is the SAME, including the rejection regions!
TS:
$z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Let’s list the steps together!
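As a sketch of how the steps play out for the earring shipment, the code below assumes Ho: p = 0.03 and Ha: p > 0.03 (a right-tailed test). That choice of hypotheses is an assumption made for illustration; settle it in class before trusting the output.

```python
from math import sqrt
from statistics import NormalDist

defective, n = 27, 300
p_0 = 0.03    # assumed value of p under Ho
alpha = 0.10  # level of significance

p_hat = defective / n  # sample proportion

# Test statistic for proportions: p and q come from Ho
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Right-tailed test: reject Ho if z is above this critical value
z_critical = NormalDist().inv_cdf(1 - alpha)

reject_null = z > z_critical
print("p-hat:", p_hat)
print("test statistic:", round(z, 2))
print("critical value:", round(z_critical, 2))
print("reject Ho?", reject_null)
```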
ACTIVITIES 10 #4
Example 1
There is a LOT of debate about mandatory retirement (with an eye toward draining
Social Security, for one reason). A survey was conducted to estimate the fraction of
workers who were forcibly retired at 65 and would have preferred to stay on the
job. In a random sample of 450 Americans in this situation, it was found that 278
would have preferred to keep working. Is there sufficient evidence that most
Americans would prefer to keep on working at an alpha of .05?
Problem 2
Of 880 randomly selected drivers 56% admitted to running red lights. This
information was used to write that “the majority of Americans run red lights”.
Is this accurate at an alpha of .05?
10.3 The P-Value for a Test
Another style of hypothesis test gives the reader a chance to make a decision about
whether or not the results of the test statistic are significant. This method uses the
first steps of the traditional test and then, instead of rejecting or not rejecting Ho, a
level of significance, called the p-value is reported.
Let’s look at one of these using data we’ve already processed. Remember the
researcher who felt that athletes were more self-confident? He got a z-score of
1.12 and not the required 1.65 for an alpha of 0.05. What he could have done was
figure out the tail probability for his results and publish that. What is the p-value for a
one-sided test that results in a z-score of 1.12?
Look in the chart. Do you see .8686? That is the area to the LEFT of 1.12, so the area in the
right tail is 1 − .8686 = .1314. He could report about 13% as the p-value and let
the reader decide.
Now this was a one-tailed (right sided) test.
What about p-value with a “not equals” two-tailed test? Before, with alpha, we
split the significance and balanced it between the two tails…for example, an alpha
of .05 would have .025 area under the left tail and .025 under the right tail. With a
p-value, we need to DOUBLE the area found by looking up the percentile or area
under one side.
Let’s look at an example:
Suppose we have a test and it is a proportion. Our alternate hypothesis is
$p \ne 0.25$ and when we run the sample statistics we come up with a z-score of
2.34.
We look in the chart to find that the tail area beyond 2.34 is 0.0096. Noting
that this is a two-tailed test, we DOUBLE the area to .0192. Note that this is still
quite unlikely to occur by chance. This is an excellent p-value to report.
You need to be very careful to check to see if you have a one-tailed or two-tailed
test before you report a p-value.
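A p-value is the area in the tail (or tails) beyond the test statistic, and the sketch below computes it for the two z-scores used in this section. The helper name `p_value` is made up, and the one-tailed branch assumes the test statistic falls on the side named in Ha.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, sd 1


def p_value(z, tails):
    """Tail area beyond z; doubled for a two-tailed test."""
    one_tail = 1 - std_normal.cdf(abs(z))
    return one_tail if tails == 1 else 2 * one_tail


# Right-tailed test with z = 1.12 (the self-confidence example)
print(round(p_value(1.12, tails=1), 4))

# Two-tailed test with z = 2.34 (the proportion example)
print(round(p_value(2.34, tails=2), 4))
```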
Discussion with problem
Suppose we have a simple random sample of 144 body temperatures with a sample
mean of 98.15 degrees and a sample standard deviation of 0.65 degrees. Does this
provide evidence that perhaps the body temperature of a healthy adult is not 98.6?
Report the p value.
Ho:
Ha:
One tailed or two?
Test statistic – which formula?
P-value and decision?
10.4 Tests about Means
Problem 1
A single M&M candy is supposed to weigh 0.9085 grams. The Mars Company
wants to hit that standard with high reliability. “Every candy every time” is their
goal. Over years of testing the standard deviation is known to be 0.03691
A simple random sample of 100 candies is taken by the quality control staff.
The mean weight was 0.91470 and the sample standard deviation was 0.03680.
The line staff feels they are hitting the standard; the quality control folks are not so
sure. Whose feelings are supported by the data? First compute a p-value, then test
at the 0.05 level of significance.
Problem 2
One quick way to see if a data sample is random is to look at the last digit in the
sample. These should have a uniform distribution with a mean of 4.5 and a
standard deviation of 2.87. This is for data that is NOT rounded – that kind of data
is bimodal (0 and 5!).
Picture
Now using data reporting the last digits in lengths for the 73 home runs hit by Barry
Bonds in 2001, we get a sample mean of 1.753 and a standard deviation of 2.650
Does it appear that the lengths were accurately measured?
Problem 3
The Flesch-Kincaid Grade Level formula gives the grade reading level K of a
passage of text. For this formula W is the number of words in the passage, S is the
number of sentences, and L is the number of syllables. The formula is
$K = 0.39\,\dfrac{W}{S} + 11.8\,\dfrac{L}{W} - 15.59$
First figure out the grade level for the paragraph above from “The” through “is”.
Now, the publisher of Harry Potter series analyzed 36 pages from each book in the
series and calculated the mean K to be 4.075 with a sample standard deviation of
1.28.
The teachers at Devon Williams Middle School won’t use a book unless the reading
level for a typical page can be shown to be 4th grade or higher. Does the evidence
support this claim at the 0.05 level or higher?
Problem 4
The NC tobacco company claims their low tar cigarette contains at most 40 grams
of tar. A consumer advocacy group thinks it’s higher and tests 49 randomly
selected cigarettes:
Frequency table
Tar     Freq
47.3    3
39.3    2
40.3    4
38.3    11
46.3    9
43.3    7
39.2    4
Whose claim is supported by the evidence?
ACTIVITIES 10 #5
THE END