Review 9.1-9.3

Sections 9.1 – 9.3
If we want to compute a confidence interval (CI)
for a population mean, which formula would we
use?
It depends!
Think!!
If we know the population standard deviation,
σ, we would use the z-interval: x̄ ± z*·σ/√n
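A minimal sketch of the σ-known case, assuming the usual z-interval x̄ ± z*·σ/√n. The function name `z_interval` and the numbers are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for a mean when sigma is known."""
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, known sigma 4, n = 64.
low, high = z_interval(x_bar=50.0, sigma=4.0, n=64, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```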
Suppose σ is unknown
Then we would use the t-interval: x̄ ± t*·s/√n,
with n − 1 degrees of freedom
What changes would decrease the width of a
confidence interval (CI) for a population mean?
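Suppose σ is unknown, so the margin of error is t*·s/√n
Increasing the sample size n or lowering the
confidence level (which lowers t*) decreases the width.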
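A rough illustration of how the width of a t-interval, 2·t*·s/√n, responds to sample size and confidence level. To stay within the standard library, t* is found numerically here (Simpson's rule plus bisection); that is a choice made for this sketch, not the method from class, where a table or calculator would be used. All names and numbers are hypothetical.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_star(confidence, df):
    """Critical value t* with central area `confidence`, by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def interval_width(s, n, confidence):
    """Width of the t-interval with n - 1 degrees of freedom."""
    return 2 * t_star(confidence, n - 1) * s / sqrt(n)

s = 10.0  # hypothetical sample standard deviation
w_base = interval_width(s, n=15, confidence=0.95)
w_larger_n = interval_width(s, n=60, confidence=0.95)
w_lower_conf = interval_width(s, n=15, confidence=0.90)
print(f"n=15, 95%: {w_base:.2f}")
print(f"n=60, 95%: {w_larger_n:.2f}")
print(f"n=15, 90%: {w_lower_conf:.2f}")
```

Both changes shrink the width relative to the n = 15, 95% baseline.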
If we want to compute the sample size needed
for a given confidence level and given margin of
error, how would we proceed?
Think!!!
2 possible cases
Is the population standard deviation, σ:
(1) known
(2) or unknown?
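A sketch of case (1), assuming the margin of error is E = z*·σ/√n, so that n = (z*·σ/E)², rounded up. For case (2), one common approach (an assumption here, not stated on the slides) is to plug in a planning estimate of σ, such as s from a pilot sample. The function name and numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error z* * sigma / sqrt(n) is at most `margin`."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round up: rounding down would give a margin larger than requested.
    return ceil((z_star * sigma / margin) ** 2)

# Hypothetical planning values: sigma = 15, desired margin of error = 2.
n_needed = sample_size_for_mean(sigma=15.0, margin=2.0, confidence=0.95)
print(n_needed)
```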
What does P-value mean?
The P-value is the probability of getting a result as
extreme as or more extreme than the result (test
statistic) we got from our sample, given that the null
hypothesis is true.
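A small sketch of that definition for a two-sided test. To keep it short it assumes σ is known, so the test statistic is a z-score; with σ unknown the same idea applies with the t-distribution. The function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(x_bar, mu_0, sigma, n):
    """P(|Z| >= |z observed|) when the null hypothesis mu = mu_0 is true."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    # Area in both tails beyond the observed test statistic.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical sample: mean 52 from n = 36, testing mu_0 = 50 with sigma = 6.
p_value = two_sided_p_value(x_bar=52.0, mu_0=50.0, sigma=6.0, n=36)
print(f"P-value: {p_value:.4f}")
```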
What is α?
α is the level of significance.
How does α relate to P-value?
α is the maximum P-value for which the null
hypothesis will be rejected.
Example conclusion: Reject the null hypothesis because the P-value of
0.## is less than the significance level of α = 0.05.
How does α compare for a two-sided test versus
a one-sided test?
It’s the same for both.
For example, for a 95% confidence level, α is
0.05 for a two-sided test and a one-sided test.
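Although α is the same, the rejection region is placed differently: a two-sided test splits α between the two tails, while a one-sided test puts all of it in one tail. A quick sketch using the standard normal model (an assumption made for simplicity; the variable names are hypothetical):

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# One-sided: all of alpha sits in a single tail.
z_one_sided = std_normal.inv_cdf(1 - alpha)
# Two-sided: alpha / 2 sits in each tail, so the cutoff is farther out.
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-sided cutoff: {z_one_sided:.3f}")
print(f"two-sided cutoff: {z_two_sided:.3f}")
```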
Remember Errors?
What is:
(a) a Type I error?
(b) a Type II error?
Errors
Type I error is rejecting a true null hypothesis.
Type II error is failing to reject a false null
hypothesis.
P(Type I error) = ?
P(Type I error) = α, the level of significance
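One way to see that P(Type I error) = α is to simulate many samples for which the null hypothesis really is true and count how often it gets rejected. This is a sketch under stated assumptions: a normal population, known σ, a two-sided z-test, and made-up parameter values.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

alpha = 0.05
mu_0, sigma, n, trials = 100.0, 15.0, 25, 10_000  # hypothetical values
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

rejections = 0
for _ in range(trials):
    # The null hypothesis is true here: the data really come from mean mu_0.
    sample = [random.gauss(mu_0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu_0) / (sigma / sqrt(n))
    if abs(z) > z_crit:
        rejections += 1  # a Type I error

type_i_rate = rejections / trials
print(f"Proportion of true nulls rejected: {type_i_rate:.3f}")
```

The proportion should land close to α = 0.05, varying a little from seed to seed.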
Why do we transform data?
To make skewed data more nearly normal (more symmetric).
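A brief sketch of the effect, assuming a log transformation (one common choice for right-skewed, positive data) and a made-up data set. Skewness is measured here with the moment-based coefficient, which is near 0 for symmetric data.

```python
from math import log
from statistics import mean

def skewness(values):
    """Moment-based skewness: third central moment over (second moment)^1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data (not from the textbook).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 13, 40]
logged = [log(v) for v in raw]

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after log: {skewness(logged):.2f}")
```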
15/40 Guideline?
The 15/40 guideline is a set of rules that helps us
know when it is appropriate to use a t-interval
or t-test for the population mean.
15/40 Guideline
Page 608
If our sample size is 40 or more, do we need to
plot the sample data?
Yes!! Why?
Need to check for outliers.
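Besides plotting, a numeric check can flag candidates. The sketch below assumes the common 1.5 × IQR rule, which may not be the exact criterion used in the textbook, and the data values are made up.

```python
from statistics import quantiles

def iqr_outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical bag weights in ounces, with one suspicious value.
weights = [14.8, 15.0, 15.1, 15.2, 15.3, 15.4, 19.0]
outliers = iqr_outliers(weights)
print(outliers)
```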
Page 610, E41
Pretend that each data set described is a
random sample and that you want to do a
significance test or construct a confidence
interval for the unknown mean. Use the sample
size and the shape of the distribution to decide
which of these descriptions (I–IV) best fits each
data set.
I. There are no outliers, and there is no
evidence of skewness. Methods based on
the normal distribution are suitable.
II. The distribution is not symmetric, but the
sample is large enough that it is reasonable
to rely on the robustness of the t-procedure
and construct a confidence interval, without
transforming the data to a new scale.
III. The shape suggests transforming. With a
larger sample, this might not be necessary, but
for a skewed sample of this size transforming is
worth trying.
IV. It would be a good idea to analyze this data
set twice, once with the outliers and once
without.
A. Weights, in ounces, of bags of potato chips
n = 15; fairly symmetric with outlier
IV. It would be a good idea to analyze this data
set twice, once with the outliers and once
without.
B. Per capita gross domestic product (GNP) for
various countries
n = 34, strongly skewed right
III. The shape suggests transforming. With a
larger sample, this might not be necessary, but
for a skewed sample of this size transforming is
worth trying.
C. Batting averages of American League players
n > 40, fairly symmetric, no outliers
I. There are no outliers, and there is no evidence
of skewness. Methods based on the normal
distribution are suitable.
D. Self-reported grade-point averages of 67
students
n = 67, no outliers
II. The distribution is not symmetric, but the sample is large
enough that it is reasonable to rely on the robustness of the
t-procedure and construct a confidence interval, without
transforming the data to a new scale
Page 610, E42
A. Mean number of people per room for various
countries
n = 34, strongly skewed right, outliers
III. The shape suggests transforming. With a larger
sample, this might not be necessary, but for a
skewed sample of this size transforming is worth
trying. Note: after transforming, the outliers may
become “part of the herd.”
B. Record low temperatures of national capitals
n = 7
III. The shape suggests transforming. With a larger
sample, this might not be necessary, but for a
skewed sample of this size transforming is worth
trying.
C. Student errors in estimating the midpoint of a
segment
n = 15, fairly symmetric, no outliers
I. There are no outliers, and there is no evidence
of skewness. Methods based on the normal
distribution are suitable.
D. Ages of employees
n = 50, no outliers
II. The distribution is not symmetric, but the sample
is large enough that it is reasonable to rely on the
robustness of the t-procedure and construct a
confidence interval, without transforming the data
to a new scale
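The answers to E41 and E42 follow a pattern that can be sketched as a small decision function. This is one reading of the answers above, not the textbook's official statement of the 15/40 guideline. The function name is hypothetical, and where a slide gives no shape (E41 D, E42 B, E42 D), skewness is inferred from the description chosen as the answer.

```python
def classify(n, skewed, has_outliers):
    """Pick description I-IV from sample size, skewness, and outliers."""
    if skewed and n < 40:
        return "III"  # too small to lean on robustness: try transforming
    if has_outliers:
        return "IV"   # analyze twice, with and without the outliers
    if skewed:
        return "II"   # n is 40 or more: the t-procedure is robust to skewness
    return "I"        # symmetric, no outliers: normal-based methods are fine

# (label, n, skewed, has_outliers), taken from the answers above.
data_sets = [
    ("E41 A", 15, False, True),
    ("E41 B", 34, True, False),
    ("E41 D", 67, True, False),
    ("E42 A", 34, True, True),
    ("E42 B", 7, True, False),
    ("E42 C", 15, False, False),
    ("E42 D", 50, True, False),
]
for label, n, skewed, has_outliers in data_sets:
    print(label, classify(n, skewed, has_outliers))
```

A real data set still needs a plot: a function like this can only be as good as the judgment about skewness and outliers that goes into it.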
Questions?
Monday, 1 April:
-- Homework Quiz 9.1 – 9.3
-- Fathom Lab 9.3a
Tuesday:
-- Test 9.1 – 9.3
-- both sides of 1 note card
Enjoy your break!!