The future is a vain hope, the past is a distracting
thought. Uphold our loving kindness at this
instant, and be committed to our duties and
responsibilities right now.
Applied Statistics Using SAS
and SPSS
Topic: Hypothesis Testing
By Prof Kelly Fan, Cal State Univ, East Bay
Hypothesis Testing
• A statistical hypothesis is an assertion or
conjecture concerning one or more populations.
• Agenda:
1. Types of tests
2. Types of errors
3. P-value
4. Summary of tests
5. Assumption checking
Types of Tests
Types of Errors
              we accept H0                  we reject H0
H0 true       Good! (Correct!)              Type I Error, or “α Error”
H0 false      Type II Error, or “β Error”   Good! (Correct!)
α = Probability of Type I error
  = P(rej. H0 | H0 true)
β = Probability of Type II error
  = P(acc. H0 | H0 false)
We often preset α, called the significance level.
The value of β depends on the specifics of the
H1 (and most often in the real world, we don’t
know these specifics).
EXAMPLE: H0: μ ≤ 100
         H1: μ > 100
Suppose the Critical Value = 141:
[Figure: sampling distribution of X̄ centered at μ = 100, with α and the critical value C = 141 marked]
What is β?
β = P(X̄ < 141 | H0 false)
These are values corresp. to a value of 25 for the Std. Dev. of X̄:
μ = 150:  β = P(X̄ < 141 | μ = 150) = .3594
μ = 160:  β = P(X̄ < 141 | μ = 160) = .2236
μ = 170:  β = P(X̄ < 141 | μ = 170) = .1230
μ = 180:  β = P(X̄ < 141 | μ = 180) = .0594
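The β values in this example can be checked with a short calculation. The sketch below assumes the test statistic is normally distributed with standard deviation 25, as the slide states; the function name `type_ii_error` is made up for illustration.

```python
from statistics import NormalDist

CRITICAL_VALUE = 141  # C from the example
STD_DEV = 25          # standard deviation of the test statistic, as stated in the example

def type_ii_error(mu_alt, c=CRITICAL_VALUE, sd=STD_DEV):
    """P(statistic < c) when the true mean is mu_alt, i.e. P(acc. H0 | mu = mu_alt)."""
    return NormalDist(mu=mu_alt, sigma=sd).cdf(c)

# One beta per alternative mean considered in the example
betas = {mu: type_ii_error(mu) for mu in (150, 160, 170, 180)}

for mu, beta in betas.items():
    print(f"mu = {mu}: beta = {beta:.4f}")
```

The farther the true mean lies above 100, the smaller β becomes, which is why β cannot be stated without specifying H1.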
Note: Had α been preset at .025
(instead of .05), C would have
been 149 (and β would be
larger); had α been preset at
.10, C would have been 132 and
β would be smaller.
α and β “trade off”.
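The trade-off in the note can be reproduced numerically. This is a sketch under the same assumptions as the example (normal test statistic, standard deviation 25, H0 boundary at 100); the alternative μ = 150 used to evaluate β is picked only for illustration.

```python
from statistics import NormalDist

MU_0 = 100     # boundary value of H0 in the example
STD_DEV = 25   # standard deviation of the test statistic
MU_ALT = 150   # illustrative alternative used to evaluate beta (an assumption)

def critical_value(alpha):
    """Upper-tail cutoff C such that P(statistic > C | mu = MU_0) = alpha."""
    return MU_0 + NormalDist().inv_cdf(1 - alpha) * STD_DEV

def beta_at(alpha, mu_alt=MU_ALT):
    """P(statistic < C | mu = mu_alt) for the cutoff implied by alpha."""
    return NormalDist(mu_alt, STD_DEV).cdf(critical_value(alpha))

results = {a: (critical_value(a), beta_at(a)) for a in (0.025, 0.05, 0.10)}

for alpha, (c, beta) in results.items():
    print(f"alpha = {alpha}: C = {c:.1f}, beta at mu = {MU_ALT}: {beta:.4f}")
```

Lowering α pushes C to the right, which enlarges the region where H0 is accepted and so raises β.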
P-Value
Definition: the probability, computed assuming H0 is true,
of observing data at least as extreme as the data actually observed
Idea: the largest “risk” we pay to reject H0
Alternate name: the observed type I error rate /
the observed significance level
• When will we reject H0?
• What is the formula to calculate the largest risk?
Steps of Hypothesis Tests
1. Set up H0 and Ha properly
2. Preset the α level (the significance level)
3. Select an appropriate test
4. Calculate its p-value
5. Reject H0 if p-value ≤ the significance
level; otherwise fail to reject H0
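Steps 4 and 5 can be sketched for the earlier upper-tailed example (H0 boundary 100, standard deviation 25). The observed value 150 is hypothetical and chosen only to illustrate the decision rule; the function names are also made up.

```python
from statistics import NormalDist

def upper_tail_p_value(observed, mu_0, sd):
    """P(statistic >= observed) when the true mean equals the H0 boundary mu_0."""
    z = (observed - mu_0) / sd
    return 1 - NormalDist().cdf(z)

def decide(p_value, alpha):
    """Step 5: reject H0 when the p-value is at most the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Hypothetical observed value of the test statistic
p = upper_tail_p_value(observed=150, mu_0=100, sd=25)
decision = decide(p, alpha=0.05)
print(f"p-value = {p:.4f}: {decision}")
```

For a lower-tailed or two-sided Ha the tail area would be computed differently; only the upper-tailed case is shown here.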
Set Up Hypothesis Properly
Conjecture: The fraction of defective
product in a certain process is at most
10%.
Which error is more serious: incorrectly
claiming this conjecture is true, or incorrectly
claiming it is false?
• The “=” sign must be in H0
One Population
Two Populations
Assumption Checking
1. Tests/graphs for normality
2. Tests for equal variances
Example: Mortar Strength
The tension bond strength of cement mortar is
an important characteristic of the product. An
engineer is interested in comparing the strength
of a modified formulation in which polymer latex
emulsions have been added during mixing to the
strength of the unmodified mortar. The
experimenter has collected 10 observations on
strength for the modified formulation and
another 10 observations for the unmodified
formulation.
Example: Mortar Strength
Modified    Unmodified
16.85       17.50
16.40       17.63
17.21       18.25
16.35       18.00
16.52       17.86
17.04       17.75
16.96       18.22
17.15       17.90
16.59       17.96
16.57       18.15
SAS/SPSS Data Input
SPSS: each variable occupies one column in the
worksheet
SAS: each variable is given one name
Normality Tests
SAS: PROC UNIVARIATE DATA=** NORMAL PLOT;
Tests for Normality
Test                  --Statistic---      -----p Value------
Shapiro-Wilk          W      0.918255     Pr < W       0.0917
Kolmogorov-Smirnov    D      0.134926     Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.081542     Pr > W-Sq    0.1936
Anderson-Darling      A-Sq   0.537514     Pr > A-Sq    0.1503
Normality Tests
SPSS: Analyze >> Descriptive Statistics >> Explore
>> Plots, Normality plots with tests
Tests of Normality
            Kolmogorov-Smirnov(a)        Shapiro-Wilk
            Statistic   df   Sig.        Statistic   df   Sig.
strength    .135        20   .200*       .918        20   .092
*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction
Two-sample t Tests and Equal-variance Tests
SAS: PROC TTEST DATA=** ;
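As a cross-check on what a two-sample procedure reports, the pooled (equal-variance) t statistic for the mortar data can be computed by hand. This is a sketch in Python rather than SAS output; the p-value is omitted because the standard library has no t distribution, and the variance ratio is shown only as the raw statistic behind an equal-variance F test.

```python
from math import sqrt
from statistics import mean, variance

modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

def pooled_t(sample_1, sample_2):
    """Two-sample t statistic assuming equal population variances."""
    n1, n2 = len(sample_1), len(sample_2)
    s1_sq, s2_sq = variance(sample_1), variance(sample_2)  # sample variances
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample_1) - mean(sample_2)) / std_error, df

t_stat, df = pooled_t(modified, unmodified)
var_ratio = variance(modified) / variance(unmodified)  # ratio of sample variances

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
print(f"variance ratio = {var_ratio:.2f}")
```

The t statistic is strongly negative, reflecting that every modified observation is below every unmodified one.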
SPSS: equal-variance tests: Homework for ST3900 students
SPSS: two-sample t tests as below
Research Question
A researcher claims that a new series of
math courses for elementary school is
more effective than the current one. Two
(1st grade) classes of students are
selected to perform an experiment to verify
this claim. How would you conduct the
experiment to avoid confounding variables
as much as possible?
Paired Samples
If the same set of sources is used to
obtain data representing two populations,
the two samples are called paired. The
data might be paired:
• As a result of “before” and “after” studies
• From matching two subjects to form “matched
pairs”
Tests for Paired Samples
Calculate the pair differences
Proceed as in one sample case
Notes:
SAS: all variables must be included in data
SPSS: create/calculate all variables we need
INFERENCES ABOUT MEAN WHEN
“BEYOND THE SCOPE”
When the population is nonnormal and n is small, how to do
inferences about μ:
1). Non-parametric tests
2). (Optional) Use Bootstrap methods to simulate the
sampling distribution of the t test statistic and then use the
simulated distribution to find an (approximate) C.I. and p-value
Non-parametric Tests
Independent samples: Wilcoxon Rank
Sum Test (also called Mann-Whitney test)
• Assumption: two distributions of the same
shape
Paired samples/One sample: Wilcoxon
Signed-Rank Test
• Assumption: a symmetric distribution (of the
differences for paired samples)
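To show what the rank sum test works with, the sketch below computes the rank sum W and the Mann-Whitney U statistic for the mortar data from the earlier example. Only the statistics are computed; obtaining a p-value needs tables or software. The helper names are made up for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum(sample_a, sample_b):
    """Rank sum W of sample_a in the combined data, and the matching U statistic."""
    n_a = len(sample_a)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    w_a = sum(ranks[:n_a])
    u_a = w_a - n_a * (n_a + 1) / 2
    return w_a, u_a

modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

w, u = rank_sum(modified, unmodified)
print(f"W = {w}, U = {u}")
```

Here the modified sample takes the ten smallest ranks, so W is at its minimum possible value and U is 0, the most extreme outcome for samples of this size.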
Introduction to Bootstrap Methods
How to simulate the sampling distribution of a given
statistic, say t, based on a given sample of size n:
1. Pretend the original sample is the entire population
2. Select a random sample of size n from the original
sample (now the population) with replacement ; this is
called a bootstrap sample
3. Calculate the t value of the bootstrap sample, t*
4. Repeat steps 2, 3 many times, 1000 or more, say B
times. Use the obtained t* values to obtain an
approximation to the sampling distribution
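The four steps can be sketched as follows, using the modified-mortar sample from the earlier example as the original sample. The number of resamples, the seed, and the crude quantile indexing are illustrative choices, not part of the method as stated.

```python
import random
from statistics import mean, stdev

def bootstrap_t(sample, num_resamples=1000, seed=0):
    """Simulate the sampling distribution of t by resampling with replacement."""
    rng = random.Random(seed)       # fixed seed so the sketch is reproducible
    n = len(sample)
    center = mean(sample)           # step 1: the sample mean plays the role of mu
    t_stars = []
    for _ in range(num_resamples):  # step 4: repeat B times
        resample = rng.choices(sample, k=n)  # step 2: size n, with replacement
        s = stdev(resample)
        if s == 0:
            continue                # degenerate resample (all values equal): skip
        t_stars.append((mean(resample) - center) / (s / n ** 0.5))  # step 3: t*
    return sorted(t_stars)

modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
t_stars = bootstrap_t(modified)

# Crude 2.5% and 97.5% points of the simulated t* distribution
lower = t_stars[int(0.025 * len(t_stars))]
upper = t_stars[int(0.975 * len(t_stars)) - 1]
print(f"{len(t_stars)} t* values, middle 95% from {lower:.2f} to {upper:.2f}")
```

The simulated quantiles would take the place of tabulated t critical values when forming an approximate confidence interval or p-value.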
Review: Confidence Interval
One Population
Two Populations