T-tests and ANOVA using JMP
Kristopher Patton
April 7, 2015
*http://gipedu.org/virginia-polytechnicinstitute-state-university-virginia-tech/
Laboratory for Interdisciplinary
Statistical Analysis
LISA helps VT researchers benefit
from the use of Statistics
Designing Experiments • Analyzing Data • Interpreting Results
Grant Proposals • Using Software (R, SAS, JMP, Minitab...)
Collaboration
Walk-In Consulting
From our website request a meeting for personalized
statistical advice
OSB 103: Mon. – Fri. from 1:00 to 3:00
GLC Room A: Tues., Thurs., Fri. from 10:00 to 12:00
Hutcheson 403-J: Wed. from 10:00 to 12:00
Great advice right now:
Meet with LISA before collecting your data
Short Courses
Designed to help graduate students
apply statistics in their research
All services are FREE for VT researchers. We assist with research—not class projects or homework.
www.lisa.stat.vt.edu
Hypothesis Test
A hypothesis test is a detailed protocol for
decision-making concerning a population by
examining a sample from that population.
Hypothesis Tests vs. Criminal Trials
Burden of Proof—Obligation to shift the
conclusion using evidence
Hypothesis Test
Assume the initial
hypothesis is true until
the data suggests
otherwise
Trial
Innocent until proven
guilty
Steps in a Hypothesis Test
1. Test
2. Assumptions
3. Hypotheses
4. Mechanics
5. Conclusion
One Sample t-Test
• Used to test whether the population mean is different from
a specified value.
Medical Example
• In a glaucoma study, the following intraocular pressure
(mm Hg) values were recorded from a sample of 21
elderly subjects. Based on this data, can we conclude that
the mean intraocular pressure of the population from
which the sample was drawn differs from 14 mm Hg?*
Intraocular Pressure (mm Hg)
14.5  12.9  14    16.1  12    17.5  14.1
12.9  17.9  12    16.4  24.2  12.2  14.4
17    10    18.5  20.8  16.2  14.9  19.6
ȳ = 15.6238, s = 3.383
*Daniel, W. W. Biostatistics: A Foundation for Analysis in the Health Sciences.
5th ed. New York: John Wiley & Sons, 1991.
Assumptions
• The data are randomly sampled from the
population.
• The data are approximately normally
distributed.
• Our data are representative of the variable of interest,
which is also referred to as the response variable.
Hypotheses
• The “null hypothesis” is a statement describing a claim
about a population constant.
- The null hypothesis is denoted as 𝑯𝟎 .
• The “alternative hypothesis” is a statement describing the
researcher’s suspicions about the claim. Also called
“research hypothesis”.
- The alternative hypothesis is denoted as 𝑯𝒂 .
Medical Example hypotheses:
𝐻0 : 𝜇 = 14 𝑣𝑠 𝐻𝑎 : 𝜇 ≠ 14
Hypotheses
• For hypothesis testing there are three versions for testing
that are determined by the context of the research
question.
• Left Tailed Hypothesis Test (less than)
• Right Tailed Hypothesis Test (greater than)
• Two Tailed or Two Sided Hypothesis Test (not equal to)
Mechanics
• Rejection Rule: Reject the null hypothesis (𝑯𝟎 ) if
the p-value ≤ 𝜶
• Test Statistic: Compute the test statistic, which is
a standardization of the sample mean, and is
needed for the p-value computation.
• P-value: The chance of observing your sample
results or more extreme results assuming that the
null hypothesis is true. If this chance is “small” then
you may decide the claim in the null hypothesis is
false.
Test Statistic for Medical Example
•In many cases, including Example 1, the population standard
deviation 𝝈 is unknown because it is a parameter from the
population that must be estimated.
•The best estimate for 𝝈 is 𝒔.
• Our standardized value becomes the test statistic for a one
sample t-test:
$$t_{obs} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$$
μ0: hypothesized mean
ȳ: sample mean
s: sample standard deviation
n: sample size
t_obs: observed t test statistic
This t observed (t_obs) test statistic follows a
t distribution with n − 1 degrees of
freedom.
Test Statistic for Medical Example
• In the example it was given that ȳ = 15.6238 and s = 3.383.
$$t_{obs} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{15.6238 - 14}{3.383/\sqrt{21}} = 2.20$$
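The arithmetic for the medical example can be checked with a short Python sketch. It uses only the standard library and the 21 intraocular pressure values listed on the Medical Example slide; the variable names are chosen here for illustration and are not part of the JMP workflow.

```python
import math
import statistics

# Intraocular pressure values (mm Hg) from the Medical Example slide.
pressure = [14.5, 12.9, 14, 16.1, 12, 17.5, 14.1,
            12.9, 17.9, 12, 16.4, 24.2, 12.2, 14.4,
            17, 10, 18.5, 20.8, 16.2, 14.9, 19.6]

mu_0 = 14.0                          # hypothesized mean under H0
n = len(pressure)                    # sample size
y_bar = statistics.mean(pressure)    # sample mean
s = statistics.stdev(pressure)       # sample standard deviation (n - 1 divisor)

# Standardize the sample mean: t_obs = (y_bar - mu_0) / (s / sqrt(n)).
t_obs = (y_bar - mu_0) / (s / math.sqrt(n))
df = n - 1                           # degrees of freedom

print(f"n = {n}, y_bar = {y_bar:.4f}, s = {s:.3f}")
print(f"t_obs = {t_obs:.2f} on {df} degrees of freedom")
```

Rounded to the precision used on the slides, this should agree with ȳ = 15.6238, s = 3.383 and t_obs = 2.20.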
P-value
• The p-value is determined based on the sign of the
alternative hypothesis.
1. 𝑯𝒂 : 𝝁 ≠ 𝝁𝟎 . If this is the case, then the p-value is
the area in both tails of the t distribution.
[Figure: t density curve with 1/2 p-value shaded in each tail, below -t_obs and above t_obs]
P-value
• The p-value is determined based on the sign of the
alternative hypothesis.
2. 𝑯𝒂 : 𝝁 < 𝝁𝟎 . If this is the case, then the p-value is
the area to the left of the observed test statistic.
[Figure: t density curve with the p-value shaded to the left of t_obs]
P-value
• The p-value is determined based on the sign of the
alternative hypothesis.
3. 𝑯𝒂 : 𝝁 > 𝝁𝟎 . If this is the case, then the p-value is
the area to the right of the observed test statistic.
[Figure: t density curve with the p-value shaded to the right of t_obs]
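The three cases above differ only in which tail area of the t distribution is reported. The sketch below illustrates this in Python using only the standard library: it integrates the Student t density numerically with Simpson's rule, which is a stand-in chosen here for self-containedness (statistical software such as JMP uses its own routines). The function names `t_pdf`, `t_cdf` and `p_value`, and the labels "two-sided", "less" and "greater", are assumptions made for this example.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|] plus symmetry about zero.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                 # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def p_value(t_obs, df, alternative):
    """Tail area matching the sign in the alternative hypothesis."""
    if alternative == "two-sided":       # Ha: mu != mu_0, both tails
        return 2 * (1 - t_cdf(abs(t_obs), df))
    if alternative == "less":            # Ha: mu < mu_0, left tail
        return t_cdf(t_obs, df)
    if alternative == "greater":         # Ha: mu > mu_0, right tail
        return 1 - t_cdf(t_obs, df)
    raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")

# Medical example: t_obs = 2.20 with 21 - 1 = 20 degrees of freedom.
for alt in ("two-sided", "less", "greater"):
    print(alt, round(p_value(2.20, 20, alt), 5))
```

For the medical example the two-sided result should be close to the 0.03972 reported in the slides, with roughly half of that area in each tail.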
Medical Example
• H0: μ = 14 vs. Ha: μ ≠ 14
• P-value = 0.01986 + 0.01986 = 0.03972
[Figure: t density curve with an area of 0.01986 shaded in each tail, below -2.2 and above 2.2]
Conclusion
• Conclusions should always include:
• Decision: reject or fail to reject
(not accept 𝐻0 ).
• Context: what your decision means in context of the
problem.
• Medical Example: With a p-value=0.0398, which is less
than 0.05, we reject 𝐻0 . There is sufficient sample
evidence to conclude that the true mean intraocular
pressure differs from 14 mm Hg.
Summary of One Sample t-test
                        2-Tailed Test   Right-Tailed   Left-Tailed
Null hypothesis         H0: μ = μ0      H0: μ ≤ μ0     H0: μ ≥ μ0
Alternative hypothesis  Ha: μ ≠ μ0      Ha: μ > μ0     Ha: μ < μ0
• Test Statistic:
$$t_{obs} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$$
• Degrees of Freedom: 𝒏 − 𝟏
• Assumption: The population from which the sample is
drawn is normal or approximately normal.
Importing Data into JMP
*http://nuke.progettiesistemi.com/SimpleBusiness/tabid/97/Default.aspx
Egyptian Skulls Data Set
• Four measurements of male Egyptian skulls from 5
different time periods. Thirty skulls are measured from
each time period.
• Variables
• MB: Maximal Breadth of Skull
• BH: Basibregmatic Height of Skull
• BL: Basialveolar Length of Skull
• NH: Nasal Height of Skull
• Year: Approximate Year of Skull Formation
• (negative = B.C., positive = A.D.)
*Thomson, A. and Randall-Maciver, R. (1905) Ancient Races of the
Thebaid, Oxford: Oxford University Press.
*http://members.ozemail.com.au/~rdunlop/CoplandMain/MathsLG/CollandEntDataLG.htm
Hypothesis Test for a Single Mean in JMP
• JMP Demonstration
• Open data set.
• AnalyzeDistribution
• Complete the dialog box as shown
and select OK.
• Select the red arrow next to
“Pressure” and select Test Mean.
• Complete Dialog box as shown
and select OK.
• Select the red arrow next to
“Pressure” and select Confidence
Interval->0.95.
Two Sample T-Test
• The major goal is to determine whether a difference exists
between two populations.
• Examples:
• Compare blood pressure for males and females.
• Compare the proportion of smokers and nonsmokers
with lung cancer.
• Compare weight before and after treatment.
• Is the mean cholesterol of people taking drug A lower than the
mean cholesterol of people taking drug B?
Hypotheses for 2 Samples
• The population means of the two groups are not equal.
H0: μ1 = μ2
Ha: μ1 ≠ μ2
• The population mean of group 1 is greater than the
population mean of group 2.
H0: μ1 = μ2
Ha: μ1 > μ2
• The population mean of group 1 is less than the population
mean of group 2.
H0: μ1 = μ2
Ha: μ1 < μ2
Two Sample Assumptions
• The two samples are random and independent.
• The populations from which the samples are drawn are
approximately normal.
• The populations have the same standard deviation.
Test Statistic for TWO Samples
$$t_{obs} = \frac{\bar{y}_1 - \bar{y}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
• Upon calculation of the test-statistic, we can then
calculate the p-value and draw our conclusion.
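As a rough illustration of the pooled two-sample test statistic, the Python sketch below computes s_p, t_obs and the degrees of freedom n1 + n2 − 2 from two lists of observations. The function name `pooled_t` and the two small samples are made up for this example; they are not taken from the VA lung cancer data.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Return (t_obs, df, s_p) for the pooled two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)   # s_1^2, with n1 - 1 divisor
    var2 = statistics.variance(sample2)   # s_2^2, with n2 - 1 divisor
    # Pooled standard deviation: weighted average of the two variances.
    s_p = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    t_obs = diff / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t_obs, n1 + n2 - 2, s_p

# Hypothetical scores for two treatment groups (illustrative values only).
group_1 = [60, 70, 50, 80, 40, 60]
group_2 = [70, 50, 60, 30, 40, 50]
t_obs, df, s_p = pooled_t(group_1, group_2)
print(f"t_obs = {t_obs:.3f}, df = {df}, s_p = {s_p:.3f}")
```

The p-value then follows from the t distribution with `df` degrees of freedom, using the tail or tails that match the alternative hypothesis.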
Summary: Two Sample t-Test
             2-Tailed Test      Right-Tailed       Left-Tailed
Null         H0: μ1 − μ2 = 0    H0: μ1 − μ2 ≤ 0    H0: μ1 − μ2 ≥ 0
Alternative  Ha: μ1 − μ2 ≠ 0    Ha: μ1 − μ2 > 0    Ha: μ1 − μ2 < 0
• Test Statistic:
$$t_{obs} = \frac{\bar{y}_1 - \bar{y}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
• Degrees of Freedom: n1 + n2 − 2
Assumption: The populations from which
both samples are drawn are normal or
approximately normal.
VA Lung Cancer Data Set
• Veteran's Administration lung cancer trial.
• Variables
• stime: Survival or follow-up time in days.
• status: Dead or Censored.
• treat: Treatment type of either Standard or Test.
• age: Patient’s age in years.
• Karn: Karnofsky score of patient's performance on a scale of
0 (dead) to 100 (perfectly normal).
• diag.time: Time since diagnosis in months at entry to the trial.
• cell: One of four cell types.
• prior: Did the patient receive prior therapy?
*Kalbfleisch, J.D. and Prentice R.L. (1980) The Statistical Analysis of
Failure Time Data. Wiley.
*http://lungcancernewst
oday.com/2015/03/05/f
da-grants-licensingapplication-to-opdivofor-the-treatmentadvanced-squamousnsclc/
JMP
• JMP Demonstration:
Analyze -> Fit Y By X
Y, Response: Karnofsky Score (Karn)
X, Factor: Treatment (treat)
Select: Means/ANOVA/Pooled t
Paired t-Test
• The objective of paired comparisons is to minimize
sources of variation that are not of interest in the study by
pairing observations with similar characteristics.
• Example:
A researcher would like to determine if background noise
causes people to take longer to complete math problems.
The researcher gives 20 subjects two math tests, one in
complete silence and one with background noise, and
records the time each subject takes to complete each test.
Hypotheses for Paired t-Test
• The population mean difference is not equal to zero.
H0: μdifference = 0
Ha: μdifference ≠ 0
• The population mean difference is greater than zero.
H0: μdifference = 0
Ha: μdifference > 0
• The population mean difference is less than zero.
H0: μdifference = 0
Ha: μdifference < 0
Assumptions for Paired t-Test
• The sample is random.
• The data are matched pairs.
• The differences have a normal distribution.
Test Statistic for Paired t-Test
• Test Statistic:
$$t_{obs} = \frac{\bar{y}_d}{s_d/\sqrt{n}}$$
where ȳ_d is the mean of the differences, s_d is the
standard deviation of the differences, and n is the number of pairs.
• Upon calculation of the test-statistic, we can then
calculate the p-value and draw our conclusion.
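A minimal Python sketch of the paired test statistic follows. It reduces each pair to a single difference and then applies the one-sample formula to those differences. The function name `paired_t` and the timing values are hypothetical, loosely modeled on the background-noise example (the slides do not give the actual measurements).

```python
import math
import statistics

def paired_t(first, second):
    """Return (t_obs, df) for paired samples, with differences second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)                      # number of pairs
    y_d = statistics.mean(diffs)        # mean of the differences
    s_d = statistics.stdev(diffs)       # standard deviation of the differences
    return y_d / (s_d / math.sqrt(n)), n - 1

# Made-up completion times (minutes) for five subjects, illustration only.
silence = [12.1, 10.4, 15.0, 11.2, 13.5]
noise = [13.0, 11.1, 15.2, 12.9, 14.1]
t_obs, df = paired_t(silence, noise)
print(f"t_obs = {t_obs:.3f} on {df} degrees of freedom")
```

With the differences defined as noise minus silence, the research question "does noise make people slower?" corresponds to the right-tailed alternative μd > 0.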
Summary of Paired t-Test
             2-Tailed      Right Tailed   Left Tailed
Null         H0: μd = 0    H0: μd ≤ 0     H0: μd ≥ 0
Alternative  Ha: μd ≠ 0    Ha: μd > 0     Ha: μd < 0
• Test Statistic:
$$t_{obs} = \frac{\bar{y}_d}{s_d/\sqrt{n}}$$
• Degrees of Freedom: n − 1
Assumption: The population of differences
is normal or approximately normal.
Egyptian Skulls Data Set
• Four measurements of male Egyptian skulls from 5
different time periods. Thirty skulls are measured from
each time period.
• Variables
• MB: Maximal Breadth of Skull
• BH: Basibregmatic Height of Skull
• BL: Basialveolar Length of Skull
• NH: Nasal Height of Skull
• Year: Approximate Year of Skull Formation
• (negative = B.C., positive = A.D.)
*Thomson, A. and Randall-Maciver, R. (1905) Ancient Races of the
Thebaid, Oxford: Oxford University Press.
*http://members.ozemail.com.au/~rdunlop/CoplandMain/MathsLG/CollandEntDataLG.htm
Paired T-Test Example
• JMP Analysis:
• Create a new column of Diff = MB – BH
• Analyze -> Distribution
• Y, Columns: Diff
• Test Mean
• Specify Hypothesized Mean: 0
One-Way ANOVA
• ANOVA is used to determine whether three or more
populations have different means.
[Figure: responses plotted by Medical Treatment group (A, B, C)]
ANOVA Strategy
• The first step is to use the ANOVA F test to determine
whether there are any significant differences among the
population means.
• If the ANOVA F test shows that the population means are
not all the same, then follow up tests can be performed to
see which pairs of population means differ.
One-Way ANOVA Model
$$y_{ij} = \mu_i + \varepsilon_{ij}$$
Where
y_ij is the response of the jth trial on the ith factor level
μ_i is the mean of the ith group
$$\varepsilon_{ij} \sim N(0, \sigma^2), \quad i = 1, \ldots, r, \quad j = 1, \ldots, n_i$$
In other words, for each group the observed value
is the group mean plus some random variation.
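One way to get a feel for the model is to simulate from it. The Python sketch below draws each observation as a group mean plus normal noise with a common standard deviation. The function name `simulate_one_way`, the group means, the value of sigma and the seed are all assumptions made up for this illustration.

```python
import random
import statistics

def simulate_one_way(group_means, n_per_group, sigma, seed=0):
    """Draw y_ij = mu_i + eps_ij with eps_ij ~ N(0, sigma^2) for each group i."""
    rng = random.Random(seed)
    return [[mu + rng.gauss(0.0, sigma) for _ in range(n_per_group)]
            for mu in group_means]

# Hypothetical group means and common standard deviation.
true_means = [10.0, 12.0, 15.0]
groups = simulate_one_way(true_means, n_per_group=5000, sigma=2.0)

# With many observations per group, each sample mean lands near its mu_i.
sample_means = [statistics.mean(g) for g in groups]
print([round(m, 2) for m in sample_means])
```

Every group shares the same sigma, which mirrors the equal standard deviation assumption of one-way ANOVA.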
One-Way ANOVA Hypothesis
• Test whether there is a difference in the population
means.
H0: μ1 = μ2 = ⋯ = μr
Ha: The μi are not all equal.
ANOVA Assumptions
• The samples are random and independent of
each other.
• The populations are normally distributed.
• The populations all have the same standard
deviations.
• The ANOVA F test is robust to moderate departures from the
assumptions of normality and equal standard deviations.
Step 3: ANOVA F Test
[Figure: two panels of responses plotted by Medical Treatment group (A, B, C)]
Compare the variation within the samples to the
variation between the samples.
ANOVA Test Statistic
$$F = \frac{\text{Variation between Groups}}{\text{Variation within Groups}} = \frac{MSG}{MSE}$$
Variation within groups small compared with variation between groups → Large F
Variation within groups large compared with variation between groups → Small F
MSG
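• The mean square for groups, MSG, measures the
variability of the sample averages.
• SSG stands for sums of squares groups.
• r = “# of groups”
$$MSG = \frac{SSG}{r - 1} = \frac{n_1(\bar{y}_{1\cdot} - \bar{y}_{\cdot\cdot})^2 + n_2(\bar{y}_{2\cdot} - \bar{y}_{\cdot\cdot})^2 + \cdots + n_r(\bar{y}_{r\cdot} - \bar{y}_{\cdot\cdot})^2}{r - 1}$$
where ȳ_i· is the sample mean of group i and ȳ·· is the overall mean.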
MSE
• Mean square error, MSE, measures the variability
within the groups.
• SSE stands for sums of squares error.
• n = “total # of observations”
$$MSE = \frac{SSE}{n - r} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_r - 1)s_r^2}{n - r}$$
Where
$$s_i = \sqrt{\frac{\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_{i\cdot})^2}{n_i - 1}}$$
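The MSG and MSE formulas combine into the F statistic, F = MSG / MSE. The Python sketch below is a minimal standard-library illustration; the function name `one_way_anova` and the three small groups are invented for the example and are not the skull measurements.

```python
import statistics

def one_way_anova(groups):
    """Return (MSG, MSE, F) for a list of samples, one list per group."""
    r = len(groups)                               # number of groups
    n = sum(len(g) for g in groups)               # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n  # overall mean of all values
    # SSG: group sizes times squared deviations of group means from the grand mean.
    ssg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # SSE: (n_i - 1) * s_i^2 summed over the groups.
    sse = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    msg = ssg / (r - 1)
    mse = sse / (n - r)
    return msg, mse, msg / mse

# Hypothetical responses for three treatment groups (illustrative only).
treatment_a = [1, 2, 3]
treatment_b = [2, 3, 4]
treatment_c = [5, 6, 7]
msg, mse, f_stat = one_way_anova([treatment_a, treatment_b, treatment_c])
print(f"MSG = {msg:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
```

The F statistic is compared with an F distribution on r − 1 and n − r degrees of freedom to obtain the p-value; that step is left to the software here.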
ANOVA in JMP
• JMP demonstration
• Analyze -> Fit Y By X
• Y, Response: MB
• X, Factor: Year (change to nominal)
Normal Quantile Plot -> Plot Actual by Quantile
Means/ANOVA
Follow-Up Test
• If the F-test results in a significant p-value, we
can then use Tukey’s HSD Test to determine
which pairs of group means differ significantly.
Tukey Tests
• Tukey’s test simultaneously tests
H0: μi = μi'
Ha: μi ≠ μi'
for all pairs of factor levels.
• JMP demonstration:
• Oneway ANOVA -> Compare Means -> All Pairs, Tukey HSD
Two-Way ANOVA
• We are interested in the effect of two categorical factors
on the response.
• We are interested in whether either of the two factors
has an effect on the response and whether there is an
interaction effect.
• An interaction effect means that the effect on the response of one
factor depends on the level of the other factor.
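A small numeric sketch may help make the idea of an interaction concrete. In the Python below, the cell means for two drugs at two dosage levels are invented for illustration. The effect of dosage is computed separately for each drug, and the interaction shows up as the difference between those two effects.

```python
# Hypothetical mean improvement for each drug/dosage cell (values are made up).
cell_means = {
    ("Drug A", "Low"): 10.0, ("Drug A", "High"): 20.0,
    ("Drug B", "Low"): 12.0, ("Drug B", "High"): 14.0,
}

def dosage_effect(means, drug):
    """Change in mean improvement from Low to High dosage for one drug."""
    return means[(drug, "High")] - means[(drug, "Low")]

def interaction_contrast(means):
    """Difference between the dosage effects of Drug A and Drug B.

    Zero means the dosage effect is the same for both drugs (no interaction).
    """
    return dosage_effect(means, "Drug A") - dosage_effect(means, "Drug B")

print("Dosage effect, Drug A:", dosage_effect(cell_means, "Drug A"))
print("Dosage effect, Drug B:", dosage_effect(cell_means, "Drug B"))
print("Interaction contrast:", interaction_contrast(cell_means))
```

In these made-up numbers, raising the dosage helps much more under Drug A than under Drug B, so the effect of dosage depends on the drug. Whether such a difference is statistically significant is what the interaction test in a two-way ANOVA assesses.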
Interaction
[Figure: two panels, "Interaction" and "No Interaction", each plotting Improvement against Dosage (Low, High) with one line for Drug A and one for Drug B]
Two-Way ANOVA Model
$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$
Where
y_ijk is the response of the kth trial on the ith factor A level and the jth factor B level
μ is the overall mean
α_i is the main effect of the ith level of factor A
β_j is the main effect of the jth level of factor B
(αβ)_ij is the interaction effect of the ith level of factor A and the jth level of factor B
$$\varepsilon_{ijk} \sim N(0, \sigma^2), \quad i = 1, \ldots, a, \quad j = 1, \ldots, b, \quad k = 1, \ldots, n_{ij}$$
VA Lung Cancer Data Set
• Veteran's Administration lung cancer trial.
• Variables
• stime: Survival or follow-up time in days.
• status: Dead or Censored.
• treat: Treatment type of either Standard or Test.
• age: Patient’s age in years.
• Karn: Karnofsky score of patient's performance on a scale of
0 (dead) to 100 (perfectly normal).
• diag.time: Time since diagnosis in months at entry to the trial.
• cell: One of four cell types.
• prior: Did the patient receive prior therapy?
*Kalbfleisch, J.D. and Prentice R.L. (1980) The Statistical Analysis of
Failure Time Data. Wiley.
*http://lungcancernewst
oday.com/2015/03/05/f
da-grants-licensingapplication-to-opdivofor-the-treatmentadvanced-squamousnsclc/
Two-Way ANOVA in JMP
• JMP demonstration
• Analyze -> Fit Model
• Y: Karn
• Highlight treat and status and click Macros -> Factorial to Degree
• Run Model
Acknowledgements
• Tonya Pruitt, LISA Administrative Specialist, VT
Department of Statistics
• Dr. Chris Franck, Assistant Research Professor, VT
Department of Statistics
• Dr. Anne Ryan Driscoll, Assistant Research Professor, VT
Department of Statistics