Download review - Penn State Department of Statistics

A review of key statistical concepts
An overview of the review
• Populations and parameters
• Samples and statistics
• Confidence intervals
• Hypothesis testing
Populations and Parameters
… and Samples and Statistics
Populations and Parameters
• A population is any large collection of
objects or individuals, such as Americans,
students, or trees about which information is
desired.
• A parameter is any summary number, like
an average or percentage, that describes the
entire population.
Parameters
• Examples:
– population mean µ = average temperature
– population proportion p = proportion approving
of president’s job performance
• 99.999999999999….% of the time, we
don’t (...or can’t) know the real value of a
population parameter.
• Best we can do is estimate the parameter!
Samples and Statistics
• A sample is a representative group drawn
from the population.
• A statistic is any summary number, like an
average or percentage, that describes the
sample.
Statistics
• Examples
– sample mean (“x-bar”)
– sample proportion (“p-hat”)
• Because samples are manageable in size, we
can determine the value of statistics.
• We use the known statistic to learn about
the unknown parameter.
Example: Smoking at PSU?
• Population of 42,000 PSU students: What proportion smoke regularly?
• Sample of 987 PSU students: 43% reported smoking regularly.
Example: Grade inflation?
• Population of 5 million college students: Is the average GPA 2.7?
• Sample of 100 college students: How likely is it that 100 students would
have an average GPA as large as 2.9 if the population average was 2.7?
Example: A linear relationship?
[Figure: regression plot of birth weight (grams, 2500 to 3500) against gestation (weeks, 34 to 42), showing the population line and the sample estimate.]
Weight = -2037.00 + 130.817 Gestation
S = 167.327, R-Sq = 77.5 %, R-Sq(adj) = 76.8 %
Population line: $Y = \beta_0 + \beta_1 x$
Sample estimate: $\hat{y} = b_0 + b_1 x$
Two ways to learn
about a population parameter
• Confidence intervals estimate parameters.
– We can be 95% confident that the proportion of
Penn State students who have a tattoo is between
5.1% and 15.3%.
• Hypothesis tests test the value of parameters.
– There is enough statistical evidence to conclude
that the mean normal body temperature of adults
is lower than 98.6 degrees F.
Confidence intervals
A review of concepts
The situation
• Want to estimate the actual population mean µ.
• But can only get “x-bar,” the sample mean.
• Use “x-bar” to find a range of values,
L < µ < U, that we can be really confident
contains µ.
• The range of values is called a “confidence
interval.”
Confidence intervals
for proportions in newspapers
• “Sample estimate”: 69% of 1,027 U.S. adults
think using a hand-held cell phone while driving a
car should be illegal.
• The “margin of error” is 3%.
• The “confidence interval” is 69% ± 3%.
• We can be really confident that between 66% and
72% of all U.S. adults think using a hand-held cell
phone while driving a car should be illegal.
Source: ABC News Poll, May 16-20, 2001
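The 3% margin of error can be roughly reproduced from the sample size. The sketch below is an illustration only: it assumes a 95% confidence level and the usual normal approximation for a proportion (multiplier z = 1.96), neither of which is stated on the slide, so the poll's own calculation may differ.

```python
import math

# Figures quoted on the slide: 69% of 1,027 U.S. adults.
p_hat = 0.69
n = 1027

# Assumption (not stated on the slide): 95% confidence,
# normal approximation, so the multiplier is z = 1.96.
z = 1.96

standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z * standard_error

lower = p_hat - margin_of_error
upper = p_hat + margin_of_error

print(f"margin of error: {margin_of_error:.3f}")
print(f"interval: {lower:.3f} to {upper:.3f}")
```

Rounded to whole percentages, this gives a margin of about 3% and an interval of about 66% to 72%, in line with the figures above.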
General form of
most confidence intervals
• Sample estimate ± margin of error
• Lower limit L = estimate - margin of error
• Upper limit U = estimate + margin of error
• Then, we’re confident that the value of the
population parameter is somewhere
between L and U.
(1-α)100% t-interval
for population mean µ
Formula in words:
Sample mean ± (t-multiplier × standard error)
Formula in notation:
$\bar{x} \pm t_{\alpha/2,\, n-1} \left( \frac{s}{\sqrt{n}} \right)$
Determining the t-multiplier
[Figure: density of the t(14) distribution, with area 1 − α in the center and area α/2 in each tail.]
Typical confidence coefficients
Conf. coefficient (1 − α)   Conf. level (1 − α)100%   α/2
0.90                        90%                       0.05
0.95                        95%                       0.025
0.99                        99%                       0.005
t-interval for mean in Minitab
One-Sample T: FVC
Variable   N   Mean     StDev    SE Mean   95.0% CI
FVC        9   3.5556   0.1667   0.0556    (3.4274, 3.6837)
We can be 95% confident that the mean forced vital
capacity of all female college students is between 3.43
and 3.68 liters.
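The Minitab interval can be checked by hand from the summary statistics. This is a minimal sketch, not Minitab's own procedure: the t-multiplier 2.306 (95% confidence, 9 − 1 = 8 degrees of freedom) is taken from a standard t-table and is an assumption added here, and because the inputs are the rounded values printed above, the last digit may differ slightly from Minitab's.

```python
import math

# Summary statistics as printed in the Minitab output.
n = 9
sample_mean = 3.5556
sample_sd = 0.1667

# Assumption: t-multiplier for 95% confidence with n - 1 = 8
# degrees of freedom, read from a standard t-table.
t_multiplier = 2.306

standard_error = sample_sd / math.sqrt(n)
margin = t_multiplier * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

print(f"SE mean: {standard_error:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```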
Length of confidence interval
• Want confidence interval to be as narrow as
possible.
• Length = Upper Limit - Lower Limit
How is the length of the CI affected?
$\bar{x} \pm t_{\alpha/2,\, n-1} \left( \frac{s}{\sqrt{n}} \right)$
• As sample mean increases…
• As the standard deviation decreases…
• As we decrease the confidence level…
• As we increase sample size…
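One way to reason about these four questions is to write the length out explicitly, using the t-interval formula and the definition Length = Upper Limit - Lower Limit. The derivation below is a worked sketch that uses only those two facts.

```latex
\begin{aligned}
\text{Length} &= U - L \\
&= \left( \bar{x} + t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \right)
 - \left( \bar{x} - t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}} \right) \\
&= 2\, t_{\alpha/2,\, n-1} \frac{s}{\sqrt{n}}
\end{aligned}
```

The sample mean cancels, so it moves the interval without changing its length. A smaller standard deviation s shortens the interval. A lower confidence level gives a smaller t-multiplier, which also shortens it. A larger sample size n shortens it in two ways: the denominator grows and the t-multiplier shrinks.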
Hypothesis testing
A review of concepts
General idea of
hypothesis testing
• Make an initial assumption.
• Collect evidence (data).
• Based on the available evidence (data),
decide whether to reject or not reject the
initial assumption.
Example: Normal body temperature
• Population of many, many adults: Is average adult body temperature
98.6 degrees? Or is it lower?
• Sample of 130 adults: Average body temperature of 130 sampled adults is
98.25 degrees.
Making the decision
• It is either likely or unlikely that we would
collect the evidence we did given the initial
assumption.
• If it is likely, then we “do not reject” our
initial assumption. There is not enough
evidence to do otherwise.
Making the decision (cont’d)
• If it is unlikely, then:
– either our initial assumption is correct and we
experienced a very unusual event
– or our initial assumption is incorrect
• In statistics, if it is unlikely, we “reject” our
initial assumption.
Again, idea of hypothesis testing:
criminal trial analogy
• First, state 2 hypotheses, the null hypothesis
(“H0”) and the alternative hypothesis (“HA”)
– H0: Defendant is not guilty (innocent).
– HA: Defendant is guilty.
Criminal trial analogy
(continued)
• Then, collect evidence, such as finger
prints, blood spots, hair samples, carpet
fibers, shoe prints, ransom notes,
handwriting samples, etc.
• In statistics, the data are the evidence.
Criminal trial analogy
(continued)
• Then, make initial assumption.
– Our criminal justice system is based on
“defendant is innocent until proven guilty.”
– So, assume defendant is innocent.
• In statistics, we always assume the null
hypothesis is true.
Criminal trial analogy
(continued)
• Then, make a decision based on the
available evidence.
– If there is sufficient evidence (“beyond a
reasonable doubt”), reject the null hypothesis.
(Behave as if defendant is guilty.)
– If there is insufficient evidence, do not reject
the null hypothesis. (Behave as if defendant is
innocent.)
Very important point
• If we reject the null hypothesis, we do not prove
the alternative hypothesis is true.
• If we do not reject the null hypothesis, we do not
prove the null hypothesis is true.
• We merely state there is enough evidence to
behave one way or the other.
• Always true in statistics! Whatever the decision,
there is always a chance we made an error.
Errors in criminal trials
                     Jury decision
Truth           Not guilty    Guilty
Not guilty      OK            ERROR
Guilty          ERROR         OK
Errors in hypothesis testing
                              Decision
Truth                    Do not reject null    Reject null
Null hypothesis          OK                    TYPE I ERROR
Alternative hypothesis   TYPE II ERROR         OK
Definitions: Types of errors
• Type I error: The null hypothesis is
rejected when it is true.
• Type II error: The null hypothesis is not
rejected when it is false.
• There is always a chance of making one of
these errors. But, a good scientific study
will minimize the chance of doing so!
Making the decision
• “It is either likely or unlikely that we would
collect the evidence we did given the initial
assumption.”
• Two ways to determine likely or unlikely:
– Critical value approach (many textbooks)
– P-value approach (science, journals, software)
Possible hypotheses about mean µ
Type           Null         Alternative
Right-tailed   H0: µ = 3    HA: µ > 3
Left-tailed    H0: µ = 3    HA: µ < 3
Two-tailed     H0: µ = 3    HA: µ ≠ 3
Critical value approach
• Using sample data and assuming null hypothesis is
true, calculate the value of the test statistic.
• Set the significance level, α, the probability of
making a Type I error, to be small (0.05 or 0.01).
• Compare the value of the test statistic to the
known distribution of the test statistic.
• If the test statistic is more extreme than expected,
allowing for an α chance of error, reject the null
hypothesis. Otherwise, don’t reject the null.
Right-tailed critical value
[Figure: density of the t(14) distribution; the area to the left of the critical value 1.7613 is 0.95 and the area to its right is 0.05.]
Reject null hypothesis if test statistic is greater than 1.7613.
Left-tailed critical value
[Figure: density of the t(14) distribution; the area to the left of the critical value -1.7613 is 0.05 and the area to its right is 0.95.]
Reject null hypothesis if test statistic is less than -1.7613.
Two-tailed critical value
[Figure: density of the t(14) distribution; the area between the critical values -2.1448 and 2.1448 is 0.95, with area 0.025 in each tail.]
Reject null hypothesis if test statistic is less than -2.1448 or
greater than 2.1448.
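The three t(14) critical values above can be reproduced numerically at α = 0.05. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule and inverts the resulting CDF by bisection. The helper names `t_pdf`, `t_cdf` and `t_quantile` are hypothetical, introduced here for illustration; statistical software normally provides these functions directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(p, df, lo=-50.0, hi=50.0):
    """Value t with P(T <= t) = p, found by bisection on the CDF."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = 0.05
df = 14
right_tailed = t_quantile(1 - alpha, df)    # reject if statistic is greater
left_tailed = t_quantile(alpha, df)         # reject if statistic is less
two_tailed = t_quantile(1 - alpha / 2, df)  # reject if |statistic| is greater

print(f"right-tailed: {right_tailed:.4f}")
print(f"left-tailed:  {left_tailed:.4f}")
print(f"two-tailed:   +/-{two_tailed:.4f}")
```

The results should agree with the values shown on the slides (1.7613, -1.7613 and ±2.1448) to about four decimal places.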
P-value approach
• Using sample data and assuming null hypothesis is
true, calculate the value of the test statistic.
• Using known distribution of the test statistic,
calculate the P-value = “If the null hypothesis is
true, what is the probability that we’d observe a
test statistic at least as extreme as the one we did?”
• Set the significance level, α, the probability of
making a Type I error, to be small (0.05 or 0.01).
• If the probability is small, i.e., smaller than α,
reject the null hypothesis. Otherwise, don’t reject
the null.
Right-tailed P-value
[Figure: density of the t(14) distribution; the area to the right of t* = 2.5 is 0.0127 and the area to its left is 0.9873.]
If it’s unlikely to observe such a large test statistic, i.e., if the P-value (0.0127) is smaller than α, reject the null hypothesis.
Left-tailed P-value
[Figure: density of the t(14) distribution; the area to the left of t* = -2.5 is 0.0127 and the area to its right is 0.9873.]
If it’s unlikely to observe such a small test statistic, i.e., if the P-value (0.0127) is smaller than α, reject the null hypothesis.
Two-tailed P-value
[Figure: density of the t(14) distribution; the area between t* = -2.5 and t* = 2.5 is 0.9746, with area 0.0127 in each tail.]
If it’s unlikely to observe such an extreme test statistic, i.e., if the
P-value (0.0254) is smaller than α, reject the null hypothesis.
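The three P-values for t* = 2.5 (or -2.5) on 14 degrees of freedom can be checked with a short numerical sketch. As an assumption for illustration, the t CDF is computed here by Simpson's rule using only the standard library; `t_pdf` and `t_cdf` are hypothetical helper names, and a statistics package would normally supply the CDF.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

df = 14
t_star = 2.5

right_p = 1 - t_cdf(t_star, df)            # right-tailed: P(T >= 2.5)
left_p = t_cdf(-t_star, df)                # left-tailed:  P(T <= -2.5)
two_p = 2 * (1 - t_cdf(abs(t_star), df))   # two-tailed:   both tails

alpha = 0.05
for label, p in [("right", right_p), ("left", left_p), ("two", two_p)]:
    decision = "reject" if p < alpha else "do not reject"
    print(f"{label}-tailed P-value: {p:.4f} -> {decision} the null")
```

The one-tailed P-values should come out near 0.0127 and the two-tailed P-value near 0.0254, matching the figures. The two-tailed value is simply double the one-tailed value because the t distribution is symmetric.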
Example: Right-tailed test
Brinell hardness measurement of
ductile iron subcritically annealed:
170 167 174 179 179
156 163 156 187 156
183 179 174 179 170
156 187 179 183 174
187 167 159 170 179
H0: µ = 170
HA: µ > 170
One-Sample T: Brinell
Test of mu = 170 vs mu > 170
Variable   N    Mean     StDev   SE Mean   T      P
Brinell    25   172.52   10.31   2.06      1.22   0.117
Example: Right-tailed critical value
[Figure: density of the t(24) distribution; the area to the left of the critical value 1.7109 is 0.95 and the area to its right is 0.05.]
Example: Right-tailed P-value
[Figure: density of the t(24) distribution; the area to the right of t* = 1.22 is 0.117 and the area to its left is 0.883.]
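The Minitab summary for the Brinell data can be recomputed from the 25 measurements. The sketch below uses only the Python standard library; the critical value 1.7109 is the t(24) value from the slide (α = 0.05), and the variable names are chosen here for illustration.

```python
import math
import statistics

# The 25 Brinell hardness measurements from the slide.
brinell = [
    170, 167, 174, 179, 179,
    156, 163, 156, 187, 156,
    183, 179, 174, 179, 170,
    156, 187, 179, 183, 174,
    187, 167, 159, 170, 179,
]

mu_0 = 170               # value of the mean under the null hypothesis
critical_value = 1.7109  # t(24) right-tailed critical value from the slide

n = len(brinell)
sample_mean = statistics.mean(brinell)
sample_sd = statistics.stdev(brinell)  # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)
t_statistic = (sample_mean - mu_0) / standard_error

# Right-tailed test: reject only if the statistic exceeds the critical value.
reject_null = t_statistic > critical_value

print(f"mean={sample_mean:.2f} sd={sample_sd:.2f} se={standard_error:.2f}")
print(f"t={t_statistic:.2f} reject null: {reject_null}")
```

The test statistic of about 1.22 is below 1.7109, so the null hypothesis is not rejected. The P-value approach reaches the same decision, since 0.117 is larger than 0.05.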
Example: Left-tailed test
Height of sunflower seedlings.
11.5 15.2 16.5 15.1 12.9 9.3 11.8 19.0 13.5 17.1 11.1
12.2 15.7 12.8 14.4 13.3 15.0 10.3 16.1 12.4 16.7 12.4
13.3 14.1 19.2 10.9 8.5 15.8 10.5 13.5 13.0 14.3 13.5
H0: µ = 15.7
HA: µ < 15.7
Test of mu = 15.7 vs mu < 15.7
Variable    N    Mean     StDev   SE Mean   T       P
Sunflower   33   13.664   2.544   0.443     -4.60   0.000
Example: Left-tailed critical value
[Figure: density of the t(32) distribution; the area to the left of the critical value -1.6939 is 0.05 and the area to its right is 0.95.]
Example: Left-tailed P-value
[Figure: density of the t(32) distribution; the area to the left of t* = -4.60 is less than 0.001 and the area to its right is greater than 0.999.]
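For the sunflower example, the test statistic can be worked out from the Minitab summary values alone. The arithmetic below is a sketch using the rounded figures printed in the output (mean 13.664, standard deviation 2.544, n = 33).

```latex
t^{*} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
      = \frac{13.664 - 15.7}{2.544 / \sqrt{33}}
      \approx \frac{-2.036}{0.443}
      \approx -4.60
```

Because -4.60 is less than the t(32) critical value of -1.6939, the null hypothesis is rejected at α = 0.05. The P-value approach agrees: the P-value is below 0.001, which is smaller than α.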
Example: Two-tailed test
Thickness of spearmint gum.
7.65 7.55 7.60 7.40 7.65
7.40 7.70 7.50 7.55 7.50
H0: µ = 7.5
HA: µ ≠ 7.5
Test of mu = 7.5 vs mu not = 7.5
Variable   N    Mean     StDev    SE Mean   T      P
Gum        10   7.5500   0.1027   0.0325    1.54   0.158
Example: Two-tailed critical value
[Figure: density of the t(9) distribution; the area between the critical values -2.2622 and 2.2622 is 0.95, with area 0.025 in each tail.]
Example: Two-tailed P-value
[Figure: density of the t(9) distribution; the area between t* = -1.54 and t* = 1.54 is 0.842, with area 0.079 in each tail.]
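The two-tailed decision for the gum data can be wrapped in a small helper. This is an illustrative sketch only: `two_tailed_t_test` is a hypothetical function name, and the critical value 2.2622 is the t(9) value from the slide (α = 0.05) rather than something the code computes.

```python
import math
import statistics

def two_tailed_t_test(data, mu_0, critical_value):
    """Return (t statistic, reject?) for H0: mu = mu_0 vs HA: mu != mu_0."""
    standard_error = statistics.stdev(data) / math.sqrt(len(data))
    t_statistic = (statistics.mean(data) - mu_0) / standard_error
    # Two-tailed: reject if the statistic is extreme in either direction.
    return t_statistic, abs(t_statistic) > critical_value

# The 10 gum thickness measurements from the slide.
gum = [7.65, 7.55, 7.60, 7.40, 7.65, 7.40, 7.70, 7.50, 7.55, 7.50]

t_statistic, reject_null = two_tailed_t_test(gum, mu_0=7.5, critical_value=2.2622)
print(f"t={t_statistic:.2f} reject null: {reject_null}")
```

The statistic of about 1.54 lies between -2.2622 and 2.2622, so the null hypothesis is not rejected. This matches the P-value approach, where the two tail areas of 0.079 each sum to 0.158, which is larger than 0.05.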