Download One-sample tests

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
BIOINF 2118
Statistical Testing Part 2
Page 1 of 4
TOPICS
Variance known
variance unknown
z- test
t-test
One-sample tests
(or paired samples)
two-sample tests
z- test
same variance
t-test
Testing
H0:”variances are equal”
different variances
Welsh test
F-test
One-sample tests
One-sample tests are tests of LOCATION.
“Is this sample from a distribution that is “centered” at zero?”
“Center” = mean?
Or “center” = median? Here, it’s “mean”.
A test can be parametric or nonparametric. Today, it’s parametric.
Parametric tests are derived from some assumption, together with a criterion for the “best” test.
(A) Example: the ONE-SAMPLE z-test.
(A.1): PLANNING A STUDY
Suppose
with
known, and
H0:
unknown. The “null hypothesis” is
= 0.
If the alternative hypothesis is
HA:
>0
then “best test” against this “one-sided hypothesis” is the upper tail:
RR = “rejection region” =
.
This has a Type I error of
.
.
Notice that
X
is sufficient. You don’t need the entire dataset.
X
is a “sufficient statistic”.
BIOINF 2118
Statistical Testing Part 2
Page 2 of 4
For a specific alternative,
HA:
=
> 0,
The probability of rejecting is
where
is the Type II error.
Determining the sample size needed
Suppose you’ve picked the acceptable Type I and II errors, α, β and the alternative
can get the sample size needed. From the end of the last equation,
So n has to quadruple if you double σ, double
, or halve
Specific example of one-sample test:
See R code handout, “normal and t test one-sample.Rmd”.
.
. Then you
BIOINF 2118
Statistical Testing Part 2
Page 3 of 4
(A.2) Data analysis, z-test
Suppose you observe
X
= 6.0. This is bigger than the critical value. So “Reject H0”.
The P-value is the probability of observing a result as extreme or more extreme, given H0.
P-value =
See R code handout.
(B) Example: the ONE-SAMPLE t-test.
If the variance is NOT known, you can estimate it.
But the uncertainly needs to be taken into consideration.
Remember that a t variate is a ratio between a standard normal Z and the square-root of a chisquare
divided by the degrees of freedom
. ( If the Z and
independent.) The “divided by
” ensures
that the denominator converges to the true standard deviation and therefore the t converges to the
normal.
.
In our case,
Notice that the
, and
.
cancels out:
.
This will be the test statistic. See R code handout for detail.
(B.1): PLANNING A STUDY
RR = “rejection region” =
.
This has a Type I error of
; see discussion of the normal test above.
BIOINF 2118
Statistical Testing Part 2
Page 4 of 4
.
The power is a little more complicated. It involves a “non-central t distribution”, which is not just a tdistribution shifted over. The “noncentrality parameter” is
. So the power will
depend on the alternative mean, but also what is now an UNknown variance.
(B.2) Data analysis
See handout. Note the difference in P-values if you can’t assume that you know the variance.
Two-sample tests
H0 is that the two samples have the same mean.
When the samples are MATCHED, just subtract the matched pairs to get a single sample. Proceed
as above. Remember, the variance of the difference is the SUM of the variances. So for the normal
test, where you see sigma, replace it with sigma times the square root of 2.
When UNmatched, the samples may have different sizes, say m and n.
C. Example: the TWO-SAMPLE z-test.
If the variance is known, then
will be standard normal under the null hypothesis.
D. Example: the TWO-SAMPLE t-test.
If the variance is UNknown, but the SAME, then you POOL the estimates on the two samples
where
,
.
E. Example: the TWO-SAMPLE t-test, variances not known to be the same.
We’ll visit this in “N13-testing, part 3 - F test.doc” and “F_distribution.R”.