Null and Alternative
Hypotheses
Lecture # 3
Significance Testing
Is there a significant difference between a measured and a standard amount (that cannot be accounted for by random error alone)?
Also known as hypothesis testing: H0 (null hypothesis) = no difference. Decision: accept or reject.
The lower the probability that the observed difference
occurs by chance, the less likely it is that the null
hypothesis is true.
H0 -> Null Hypotheses
Ha -> Alternative Hypotheses
Hypotheses always pertain to population
parameters or characteristics rather than to
sample characteristics. It is the population,
not the sample, that we want to make an
inference about from limited data.
JRA 01/13
Steps in Conducting a Hypothesis Test
Step 1. Set up H0 and Ha
Step 2. Identify the nature of the sampling distribution curve and specify the appropriate test statistic
Step 3. Determine whether the hypothesis test is one-tailed or two-tailed
Step 4. Taking into account the specified significance level, determine the critical value (two critical values for a two-tailed test) for the test statistic from the appropriate statistical table
Step 5. State the decision rule for rejecting H0
Step 6. Compute the value for the test statistic from the sample data
Step 7. Using the decision rule specified in step 5, either reject H0 or do not reject H0
Decision based on significance testing
The null hypothesis is rejected if the probability of such a difference occurring by chance is less than 1 in 20 (5% or 0.05).
In such a case, the difference is said to be significant at the 0.05 (or 5%) level.
Using this level, there is a 1 in 20 chance that we will reject the null hypothesis when it is, in fact, true.
We can use 0.01 or 0.001 (1% or 0.1%)
However, if the null hypothesis is retained, it has not been proved that it is true, only that it has not been demonstrated to be false.
We use the t-test in significance testing.
If |calculated t| is greater than the critical value, we reject the null hypothesis.
α, 1-α, β and 1-β
The significance level (α) of a statistical hypothesis test is a fixed probability of wrongly rejecting the null hypothesis H0, if it is in fact true.
It is the probability of a Type I error.
The confidence level is 1-α.
Usually, the significance level is chosen to be 0.05 (or 5%)
A type II error occurs when H0 is not rejected when it is, in fact, false.
A type II error is frequently due to sample sizes being too small.
The probability of a type II error is symbolized by β.
The power of the test is 1-β, which is the probability of avoiding a Type II error.
α is user-defined.
β is difficult to constrain because its value depends on the unknown value of the population parameter
An inverse relationship exists between α and β: reduce α and β increases.

Significance Testing
Is there a significant difference between a measured (X) and a standard amount (µ) that cannot be accounted for by random error alone?
Also known as hypothesis testing: H0 (null hypothesis) = no difference. Decision: accept or reject.
$t = \dfrac{(\bar{X} - \mu)\sqrt{n}}{s}$
Where $\bar{X}$ is the sample mean, s = standard deviation and n = sample size
If |t| (i.e. the calculated value of t without regard to sign) exceeds a certain critical value, then the null hypothesis is rejected.
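The test statistic above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the four measurements and the standard value of 3.00 are hypothetical, and the critical value 3.182 is the usual two-tailed table entry for 3 degrees of freedom at the 0.05 level.

```python
import math
import statistics

def one_sample_t(values, mu0):
    """Return t = (mean - mu0) * sqrt(n) / s for a list of measurements."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return (mean - mu0) * math.sqrt(n) / s

# Hypothetical measurements (not from the lecture), compared with a standard of 3.00
measurements = [3.02, 3.05, 3.01, 3.04]
t_calc = one_sample_t(measurements, mu0=3.00)

# Two-tailed critical value for n - 1 = 3 degrees of freedom at the 0.05 level,
# read from a standard Student's t table
t_crit = 3.182

# Decision rule: reject H0 when |t| exceeds the critical value
reject_h0 = abs(t_calc) > t_crit
print(f"t = {t_calc:.3f}, critical t = {t_crit}, reject H0: {reject_h0}")
```

The critical value is passed in by hand to mirror the table lookup described in the steps; a statistics library could supply it instead.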
Values of Student’s t
Comparing an experimental
mean to a known value
Significance Tests
“gives us tools to accept conclusions that have a high probability of being correct and to reject conclusions that do not”
At P=0.05, there is a 5% risk that a null hypothesis will be rejected even though it is true (Type I error)
It is also possible to retain a null hypothesis even when it is false (Type II error)
Summary of Errors Involved in Hypothesis Testing (truth table)

Inference based on     Real state of affairs
sample data            H0 is True                   H0 is False
H0 is True             Correct decision             Type II error
                       Confidence level = 1-α       P(Type II error) = β
H0 is False            Type I error                 Correct decision
                       Significance level = α*      Power = 1-β

*Term α represents the maximum probability of committing a Type I error
To calculate the probability of a
Type II error, we postulate H1
where H1 is an alternative hypothesis
Consider the example that a product
contains 3% of Phosphorus by weight.
Four (4) measurements are taken, the
mean and SD are calculated and a
significance test is conducted at P=0.05
It is suspected that the [P] has increased.
H0: µ = 3.0% (one tailed t-test, “increase”)
[Figure: Sampling Distribution if H0 is true and Sampling Distribution if H1 is true, n=4]
Probability of a type I error is 0.05: if the sample mean lies above the critical value Xc, the null hypothesis is rejected.
A type II error occurs when H0 is retained even though H1 is true, because the sample mean lies below the critical value Xc.
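A rough sketch of how the probability of a Type II error could be computed for this kind of one-tailed test is given below. The lecture does not state the population standard deviation or the value postulated under H1, so `sigma = 0.036` and `mu1 = 3.05` are assumptions chosen only for illustration, and a normal sampling distribution (known sigma) is assumed in place of the t distribution.

```python
import math
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha=0.05):
    """One-tailed ("increase") test: return (critical mean Xc, beta, power)."""
    se = sigma / math.sqrt(n)                  # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha)   # upper-tail cutoff for alpha
    x_crit = mu0 + z_crit * se                 # reject H0 if the sample mean exceeds Xc
    beta = NormalDist(mu1, se).cdf(x_crit)     # P(sample mean < Xc | H1 is true)
    return x_crit, beta, 1 - beta

# mu0 comes from the example (3.0 %); mu1 and sigma are assumed values
xc4, beta4, power4 = type_ii_error(mu0=3.0, mu1=3.05, sigma=0.036, n=4)
xc9, beta9, power9 = type_ii_error(mu0=3.0, mu1=3.05, sigma=0.036, n=9)

print(f"n=4: Xc = {xc4:.4f}, beta = {beta4:.3f}, power = {power4:.3f}")
print(f"n=9: Xc = {xc9:.4f}, beta = {beta9:.3f}, power = {power9:.3f}")
```

With alpha held at 0.05, the larger sample narrows both sampling distributions, so beta falls and the power rises.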
Increase of sample size to reduce both Type I and Type II errors.
[Figure: sampling distributions for n=9]
$SE = s / \sqrt{n}$
The probability that a false hypothesis is rejected is called the POWER of the test. (1 - prob of a Type II error)

We can assign a confidence level to our measurements…
If we can accept a 5% error level, we can say that these values are reported with a 95% confidence limit.
For a small number of measurements, we must consult a t value table
•  choose confidence level %
•  determine number of degrees of freedom (n-1)
•  plug t value into the following equation:
$95\%\ CL = \bar{X} \pm \dfrac{ts}{\sqrt{n}}$
where t is 4.303 for 2 d.f. (n=3) and s was 0.28% Cl-
Result: 66.69% ± 0.70% Cl-, or 65.99% Cl- to 67.39% Cl-
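The chloride result can be checked numerically. The sketch below assumes the reported mean of 66.69 % is the sample mean, and takes s, n and t directly from the slide; the function name is illustrative.

```python
import math

def confidence_interval(mean, s, n, t_value):
    """Return (lower, upper, half_width) for mean +/- t*s/sqrt(n)."""
    half_width = t_value * s / math.sqrt(n)
    return mean - half_width, mean + half_width, half_width

# Values from the slide: t = 4.303 (2 d.f., 95 %), s = 0.28, n = 3
low, high, half_width = confidence_interval(mean=66.69, s=0.28, n=3, t_value=4.303)
print(f"95% CL = 66.69 +/- {half_width:.2f}  ({low:.2f} to {high:.2f})")
```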
Comparison of the means from 2 “samples” X1 and X2
Comparison of one analytical technique to another (new method to a standard method)
Null hypothesis is that there is no difference (both methods give the same results)
$\bar{X}_1 - \bar{X}_2 = 0$

Comparison of two means with a t-test
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{s_{pooled}} \sqrt{\dfrac{n_1 n_2}{n_1 + n_2}}$
where
$s_{pooled} = \sqrt{\dfrac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$
t has n1 + n2 - 2 degrees of freedom
Assumes that the samples are drawn from populations with equal standard deviations
Comparison of two means with a
t-test (no assumptions about equal variance)
Comparison of means from two sets of data

Set 1             Set 2
2.31017           2.30143
2.30986           2.29890
2.31010           2.29816
2.31001           2.30182
2.31024           2.29869
2.31010           2.29940
2.31028           2.29849
                  2.29889
Mean = 2.31011    Mean = 2.29947
n = 7             n = 8
s = 0.00014       s = 0.00138

13 degrees of freedom (n1 + n2 - 2)
H0 = no significant difference between means

Comparison of 2 Means with t-test
$s_{pooled} = \sqrt{\dfrac{0.00014^2 (7 - 1) + 0.00138^2 (8 - 1)}{7 + 8 - 2}} = 0.00102$
$t = \dfrac{2.31011 - 2.29947}{0.00102} \sqrt{\dfrac{7(8)}{7 + 8}} = 20.2$
For 13 degrees of freedom, tcritical lies between 2.228 (10 d.f.) and 2.131 (15 d.f.) at 95% CL
The calculated t value is greater than tcritical; therefore, reject H0: the difference is significant.
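The pooled calculation can be reproduced from the raw values of the two sets. The following is a minimal sketch using only the Python standard library; the function name is illustrative, and the critical value 2.228 is the more conservative of the two table entries quoted above.

```python
import math
import statistics

set1 = [2.31017, 2.30986, 2.31010, 2.31001, 2.31024, 2.31010, 2.31028]
set2 = [2.30143, 2.29890, 2.29816, 2.30182, 2.29869, 2.29940, 2.29849, 2.29889]

def pooled_t(a, b):
    """Two-sample t statistic assuming equal population standard deviations."""
    n1, n2 = len(a), len(b)
    s1, s2 = statistics.stdev(a), statistics.stdev(b)
    s_pooled = math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))
    t = (statistics.mean(a) - statistics.mean(b)) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return t, s_pooled, n1 + n2 - 2

t_calc, s_pooled, df = pooled_t(set1, set2)

# Conservative table value (10 d.f., 95 % CL), since 13 d.f. is not tabulated on the slide
t_crit = 2.228
print(f"s_pooled = {s_pooled:.5f}, t = {t_calc:.1f}, d.f. = {df}, reject H0: {abs(t_calc) > t_crit}")
```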
Exercise # 1 - Refractive Index Data Analysis
(Eleven different fragments were measured for K1 & Q2 and Q1. K1 and Q2 samples were removed from the same source of fragments. Q1 samples were removed from a different source of glass.)

Sample    RI          Sample    RI
K1&Q2     1.51880     Q1        1.51828
K1&Q2     1.51881     Q1        1.51844
K1&Q2     1.51886     Q1        1.51842
K1&Q2     1.51881     Q1        1.51838
K1&Q2     1.51888     Q1        1.51841
K1&Q2     1.51870     Q1        1.51848
K1&Q2     1.51874     Q1        1.51834
K1&Q2     1.51881     Q1        1.51842
K1&Q2     1.51872     Q1        1.51841
K1&Q2     1.51881     Q1        1.51838
K1&Q2     1.51880     Q1        1.51834
Mean      1.51879     Mean      1.51839
SD        0.00005     SD        0.00006

[Figure: back-to-back histograms of RI (1.5180 to 1.5195) against Count, one panel for Q1 and K1, one panel for Q2 and K1]
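One way to work this exercise is a two-sample t-test that does not assume equal variances (Welch's form, the same kind of test as the spreadsheet's "Two-Sample Assuming Unequal Variances"). The sketch below applies it to the eleven RI values listed for each sample; the function name is illustrative, and the critical value 2.086 is the standard two-tailed table entry for 20 degrees of freedom at the 0.05 level.

```python
import math
import statistics

k1_q2 = [1.51880, 1.51881, 1.51886, 1.51881, 1.51888, 1.51870,
         1.51874, 1.51881, 1.51872, 1.51881, 1.51880]
q1 = [1.51828, 1.51844, 1.51842, 1.51838, 1.51841, 1.51848,
      1.51834, 1.51842, 1.51841, 1.51838, 1.51834]

def welch_t(a, b):
    """Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(a), len(b)
    v1 = statistics.variance(a) / n1   # squared standard error of mean 1
    v2 = statistics.variance(b) / n2   # squared standard error of mean 2
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t_calc, df = welch_t(k1_q2, q1)

# Two-tailed critical t for 20 d.f. at the 0.05 level, from a standard t table
t_crit = 2.086
print(f"t = {t_calc:.1f}, d.f. = {df:.1f}, significant difference: {abs(t_calc) > t_crit}")
```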
t-Test: Two-Sample Assuming Unequal Variances

                                Q1              K1
Mean                            1.51901         1.51911
Variance                        1.01E-08        6.73E-09
Observations                    5               5
Hypothesized Mean Difference    0
df                              8
t Stat                          -1.792570632
P(T<=t) one-tail                0.055401831
t Critical one-tail             1.85954832
P(T<=t) two-tail                0.110803662
t Critical two-tail             2.306005626

there is NOT a significant difference between the 2 values

t-Test: Two-Sample Assuming Unequal Variances

                                Q2              K1
Mean                            1.51835         1.51911
Variance                        4.665E-08       6.73E-09
Observations                    5               5
Hypothesized Mean Difference    0
df                              5
t Stat                          -7.394163923
P(T<=t) one-tail                0.000355858
t Critical one-tail             2.015049176
P(T<=t) two-tail                0.000711716
t Critical two-tail             2.570577635

there is a significant difference between the 2 values

Exercise # 2 - Elemental Analysis
K1 and Q1 analysis summary:
EXTERNAL CALIBRATION FOR ICP-MS GLASS SAMPLE: K1 (concentrations in ppm)

Element   K1 and Q1   K1 and Q1   K1 and Q1   Average   SD      %SD
Ti        65.87       67.22       67.21       66.77     0.78    1.2
Mn        19.63       20.06       19.53       19.74     0.28    1.4
Ga        0.515       0.540       0.485       0.513     0.027   5.3
Rb        1.656       1.669       1.710       1.678     0.028   1.7
Sr        32.23       31.91       32.50       32.21     0.30    0.9
Zr        36.24       37.64       36.99       36.96     0.70    1.9
Ba        17.49       18.03       17.59       17.70     0.29    1.6
La        2.782       2.806       2.722       2.770     0.043   1.6
Ce        4.512       4.628       4.510       4.550     0.067   1.5
Sm        0.363       0.394       0.383       0.380     0.016   4.2
Hf        0.936       0.931       0.930       0.932     0.003   0.4
Pb        0.889       1.006       0.979       0.958     0.062   6.4
Exercise # 2 - Elemental Analysis
Q2 analysis summary:
EXTERNAL CALIBRATION FOR ICP-MS GLASS SAMPLE: Q2 (concentrations in ppm)

Element   Q2       Q2       Q2       Average   SD      %SD
Ti        122.6    137.1    126.5    128.7     7.5     5.8
Mn        2.888    6.339    3.760    4.329     1.794   42
Ga        2.054    2.131    2.095    2.093     0.039   1.9
Rb        0.264    0.276    0.257    0.266     0.010   3.7
Sr        24.10    23.79    24.25    24.05     0.23    1.0
Zr        81.14    79.74    79.83    80.24     0.78    1.0
Ba        28.47    28.38    29.96    28.94     0.89    3.1
La        19.25    18.53    20.67    19.48     1.09    5.6
Ce        183.7    184.0    196.4    188.0     7.2     3.8
Sm        0.630    0.719    0.641    0.664     0.048   7.3
Hf        1.931    1.805    1.965    1.900     0.085   4.5
Pb        11.50    9.932    11.94    11.13     1.06    9.5

Exercise # 2 - Elemental Analysis
K1 and Q1 analysis summary and Q2 analysis summary (concentrations in ppm)

Element   K1/Q1 Average   SD      %SD     Q2 average   SD      %SD
Ti        66.77           0.78    1.2     128.7        7.5     5.8
Mn        19.74           0.28    1.4     4.329        1.794   42
Ga        0.513           0.027   5.3     2.093        0.039   1.9
Rb        1.678           0.028   1.7     0.266        0.010   3.7
Sr        32.21           0.30    0.9     24.05        0.23    1.0
Zr        36.96           0.70    1.9     80.24        0.78    1.0
Ba        17.70           0.29    1.6     28.94        0.89    3.1
La        2.770           0.043   1.6     19.48        1.09    5.6
Ce        4.550           0.067   1.5     188.0        7.2     3.8
Sm        0.380           0.016   4.2     0.664        0.048   7.3
Hf        0.932           0.003   0.4     1.900        0.085   4.5
Pb        0.958           0.062   6.4     11.13        1.06    9.5
Exercise # 2 - Elemental Analysis
Analysis summary for comparison of samples using a two-sample t-test:

Sample comparison: K1/Q1 with Q2
Element   Calculated t
Ti        14.2
Mn        14.7
Ga        57.8
Rb        81.4
Sr        37.7
Zr        71.3
Ba        20.8
La        26.6
Ce        44
Sm        9.6
Hf        19.8
Pb        16.6

The critical t value for all entries is 2.8. If the calculated t is larger than this critical t, the samples are distinguishable and marked in red.
If any of the elements are considered significantly different (absent an explanation), then the glass samples are considered different.
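The per-element t values can be approximated from the averages and standard deviations in the summary tables. The sketch below assumes a pooled (equal-variance) two-sample t-test with three replicates per sample, which is what the three measurement columns and the critical value of 2.8 (4 degrees of freedom) suggest. Because the summary statistics are rounded, the results match the tabulated t values only approximately.

```python
import math

elements = ["Ti", "Mn", "Ga", "Rb", "Sr", "Zr", "Ba", "La", "Ce", "Sm", "Hf", "Pb"]
k1q1_mean = [66.77, 19.74, 0.513, 1.678, 32.21, 36.96, 17.70, 2.770, 4.550, 0.380, 0.932, 0.958]
k1q1_sd = [0.78, 0.28, 0.027, 0.028, 0.30, 0.70, 0.29, 0.043, 0.067, 0.016, 0.003, 0.062]
q2_mean = [128.7, 4.329, 2.093, 0.266, 24.05, 80.24, 28.94, 19.48, 188.0, 0.664, 1.900, 11.13]
q2_sd = [7.5, 1.794, 0.039, 0.010, 0.23, 0.78, 0.89, 1.09, 7.2, 0.048, 0.085, 1.06]

N = 3         # replicates per sample (assumed from the three measurement columns)
T_CRIT = 2.8  # critical t quoted on the slide

def pooled_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Absolute pooled two-sample t statistic from means and standard deviations."""
    s_pooled = math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))
    return abs(m1 - m2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))

t_values = {
    el: pooled_t_from_summary(m1, s1, N, m2, s2, N)
    for el, m1, s1, m2, s2 in zip(elements, k1q1_mean, k1q1_sd, q2_mean, q2_sd)
}

# The glass samples are considered different if any element exceeds the critical t
samples_differ = any(t > T_CRIT for t in t_values.values())
for el, t in t_values.items():
    print(f"{el}: t = {t:.1f}{'  (distinguishable)' if t > T_CRIT else ''}")
print("Samples considered different:", samples_differ)
```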
Significance tests
Comparison of an experimental mean with a known value
Comparison of two experimental means
“Paired” comparisons
One-sided or two sided tests
Comparison of standard deviations
Determination of outliers
Analysis of Variance (ANOVA)
Comparison of several means

Paired t-test
Comparing results from two different methods
Need to separate difference due to methods (d, if any exist) from differences due to chance
Paired t-test:
$t = \bar{d}\sqrt{n} / s_d$
Where $\bar{d}$ is the mean of the differences
$s_d$ is the standard deviation of the differences
Degrees of freedom is n-1
One sided and two sided tests
So far we’ve tested for differences in means in either direction (2 sided)
There are occasions when only one side of the test is affected (e.g. an increase in a rate of reaction): 1 sided
For a one-sided test the tail probability is halved, so the critical value of t is read from the P=0.10 column of a two-sided table instead of the P=0.05 column
Prior knowledge is needed to decide on 1 or 2 sided tests

Paired t-test
t = 0.159√10/0.570
t = 0.88 and critical t9 = 2.26 (P=0.05)
We DO NOT reject the H0, no difference
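The paired calculation can be sketched as follows. The mean difference (0.159), the standard deviation of the differences (0.570) and the critical value (2.26) come from the slide; n = 10 pairs is an assumption, chosen because it is what 9 degrees of freedom implies and because it reproduces t = 0.88. The helper that works from raw paired data is illustrative only.

```python
import math
import statistics

def paired_t_from_summary(d_mean, s_d, n):
    """t = d_mean * sqrt(n) / s_d, with n - 1 degrees of freedom."""
    return d_mean * math.sqrt(n) / s_d, n - 1

def paired_t(method_a, method_b):
    """Same statistic computed from paired results of two methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    return paired_t_from_summary(statistics.mean(diffs), statistics.stdev(diffs), len(diffs))

# Summary values from the slide; n = 10 is assumed (9 degrees of freedom)
t_calc, df = paired_t_from_summary(d_mean=0.159, s_d=0.570, n=10)
t_crit = 2.26  # two-sided critical t for 9 d.f. at P = 0.05, as quoted on the slide

print(f"t = {t_calc:.2f}, d.f. = {df}, reject H0: {abs(t_calc) > t_crit}")
```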
F test for the comparison of s
Comparing the random errors of two sets of data
Is method A more precise than method B?
Do methods A and B differ in precision?
$F = s_1^2 / s_2^2$
Where 1 and 2 are allocated so that F is greater than or equal to 1
Use 2 degree of freedom values from F table

Test for outliers
Rejection of Measurements…
In the case where you suspect something went wrong, you use the Q-test (Dixon’s Test). (Minimum of 3 measurements)
Q = |suspect value - nearest value| / total range
Calculate Qexp = d/w, where d is the gap between the suspect value and its nearest neighbour and w is the total range.
If Qexp ≥ Qcritical (from Q table), then reject.
[Figure: measurements X1 to X6 on a number line, showing the gap d and the range w]

Significance Testing - Part 2
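Both calculations are short enough to sketch directly. The data and standard deviations below are hypothetical, the function names are illustrative, and the critical value 0.64 is the commonly tabulated Q value for five measurements at 90 % confidence; the appropriate table and confidence level are left to the analyst.

```python
def dixon_q(values):
    """Return (suspect value, Qexp) where Qexp = gap / range for the more extreme end."""
    if len(values) < 3:
        raise ValueError("The Q-test needs at least 3 measurements")
    data = sorted(values)
    w = data[-1] - data[0]            # total range
    if w == 0:
        raise ValueError("All values are identical; no outlier to test")
    gap_low = data[1] - data[0]       # gap if the lowest value is the suspect
    gap_high = data[-1] - data[-2]    # gap if the highest value is the suspect
    if gap_high >= gap_low:
        return data[-1], gap_high / w
    return data[0], gap_low / w

def f_ratio(s1, s2):
    """F = larger variance / smaller variance, so that F >= 1."""
    v1, v2 = s1**2, s2**2
    return max(v1, v2) / min(v1, v2)

# Hypothetical data, for illustration only
measurements = [12.47, 12.48, 12.53, 12.56, 12.67]
suspect, q_exp = dixon_q(measurements)
q_crit = 0.64  # tabulated Q for n = 5 at 90 % confidence
print(f"suspect = {suspect}, Qexp = {q_exp:.2f}, reject: {q_exp >= q_crit}")

# Hypothetical standard deviations of two methods
f_value = f_ratio(0.3, 0.5)
print(f"F = {f_value:.2f}")
```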
Analysis of Variance (ANOVA)
Analysis of Variance (ANOVA) - used to separate and estimate the different causes of variation in a data set.
e.g.
1)  comparing the mean concentration of metals in solution stored under different conditions.
2)  comparing the concentration of the metals in solution analyzed by different methods.
3)  comparing results obtained by different analysts.

Sources of variation
Due to random error in measurement: causes a different result each time a measurement is repeated.
Controlled or fixed-effect factor:
1.  the storage conditions
2.  methods of analysis
3.  people conducting the analysis
Used to separate any variation which is caused by changing the controlled factor from the variation due to random error.
(test whether altering the controlled factor leads to a significant difference between the mean values obtained during the analysis)
Example
Measurement of concentration of NaCl in a large barrel
Samples are removed from several different locations in the barrel
Several replicate analyses are performed on each of these samples
1.  Random error in concentration
2.  Variation in concentration from samples from different parts of the barrel

Fluorescence from solutions stored under different conditions
Is the difference between the sample means too great to be explained by the random error?
Steps in ANOVA
Null hypothesis - all the samples are drawn from a population with mean µ and variance σ²
Estimate the within-sample variation (note: within-sample variation does not depend on the means of the samples)
Determine the within-sample estimate of the variance
Determine the between-sample variation (F-test)
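The steps can be sketched as a one-way ANOVA. The fluorescence readings below are hypothetical (the lecture's data table is not reproduced here), the function name is illustrative, and the critical value 4.07 is the standard table entry for F with 3 and 8 degrees of freedom at the 0.05 level.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of replicates."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Within-sample variation: scatter of replicates about their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    # Between-sample variation: scatter of the group means about the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)

    ms_within = ss_within / (n_total - k)
    ms_between = ss_between / (k - 1)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical readings for four storage conditions, three replicates each
conditions = [
    [102, 100, 101],
    [101, 101, 104],
    [97, 95, 99],
    [90, 92, 94],
]
f_value, df_between, df_within = one_way_anova(conditions)

f_crit = 4.07  # F(3, 8) at the 0.05 level, from a standard F table
print(f"F = {f_value:.2f} with ({df_between}, {df_within}) d.f., "
      f"means differ significantly: {f_value > f_crit}")
```

If F exceeds the critical value, the difference between the sample means is too great to be explained by random error alone.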