Introduction to sample size and power calculations
Afshin Ostovar
Bushehr University of Medical Sciences
Questions
1. How many samples do we need?
2. How much chance do we have to reject the null hypothesis when the alternative is in fact true? (What is the probability of detecting a real effect?)
3. Can we quantify how much power we have for given sample sizes?
Why don’t we take the sample size as large as possible?
1. Economy:
   o Why should we include more than we have to?
   o Every trial costs!
2. Ethics:
   o We should never burden more test persons or animals than necessary
3. Statistics:
   o We can prove almost any effect if the sample size is sufficiently large
   o Area of tension: statistical significance vs. clinical relevance
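The statistics point above can be illustrated with a small sketch. It assumes a two-sample z-test with a known common standard deviation, and the difference of 0.02 standard deviations is a made-up value chosen to stand for a clinically negligible effect; neither assumption comes from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(diff, sd, n_per_group):
    """Two-sided p-value of a two-sample z-test when the observed mean
    difference equals `diff` and both groups share a known `sd`."""
    se = sd * sqrt(2.0 / n_per_group)      # standard error of the difference
    z = abs(diff) / se                     # observed test statistic
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical, clinically negligible difference of 0.02 standard deviations.
for n in (100, 10_000, 100_000):
    print(n, two_sided_p_value(diff=0.02, sd=1.0, n_per_group=n))
```

With the same tiny difference, the p-value falls as the group size grows, so a very large study can declare an irrelevant effect statistically significant.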
What do we want to do?
We have to find the correct sample size to detect the desired effect.
Not too small - Not too large
What do we need on the way?
o How does a test work?
o What does ”power of a test” mean?
o What determines the sample size?
o How do we handle this in practical tasks?
A short introduction to hypothesis testing
Strategy:
1. Formulate a hypothesis
   Null hypothesis H0: expected heights equal
   vs.
   Alternative H1: expected heights different
2. Find an appropriate test statistic
3. Compute the observed test statistic
4. Reject the null hypothesis H0 if the test statistic is too large.
But what does ”too large” mean?
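One common way to make ”too large” concrete is to compare the statistic with a critical value derived from the significance level. The sketch below assumes a two-sided z-test, so the critical value is a standard normal quantile; the function names are illustrative and not taken from the slides.

```python
from statistics import NormalDist


def critical_value(alpha=0.05):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def reject_h0(z_observed, alpha=0.05):
    """Reject H0 when the observed statistic exceeds the critical value."""
    return abs(z_observed) > critical_value(alpha)


print(critical_value(0.05))   # roughly 1.96
print(reject_h0(2.5))         # beyond the critical value
print(reject_h0(1.0))         # inside the acceptance region
```

For a t-test the same idea applies, with the quantile taken from the t distribution with the appropriate degrees of freedom instead.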
Possible results of a single test

                 Test decision
Reality          Accept H0        Reject H0
H0 true          Right            Type I error
H0 false         Type II error    Right

Wrong decisions:
o Rejection even if H0 is true (type I error)
o No rejection even if H0 is false (type II error)
What do we want?
o Reduction of the wrong decisions.
o ⇒ Minimal probability for both types of errors.
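Both error probabilities cannot be minimised at once for a fixed sample size: lowering the type I error rate raises the type II error rate. A minimal sketch of that trade-off, assuming a two-sided two-sample z-test with known standard deviation and made-up inputs (a true difference of 0.5 standard deviations, 30 subjects per group):

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, delta, sd, n_per_group):
    """Probability of not rejecting H0 when the true mean difference is
    `delta` (two-sided two-sample z-test, known common `sd`)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = delta / (sd * sqrt(2.0 / n_per_group))   # standardised true effect
    power = std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
    return 1.0 - power


# Hypothetical setting: delta = 0.5 sd, 30 subjects per group.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, type_ii_error(alpha, delta=0.5, sd=1.0, n_per_group=30))
```

The printed type II error probabilities grow as alpha shrinks, which is why the sample size has to be chosen to bring both down together.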
Two opaque jars, each holding 100 beads, some blue and some white
Principles of Sample Size
1. Variability
2. Effect size
3. Significance level
4. Power of the test
5. Type of the test
Variability
o e.g. standard deviation of both samples for a t-test
o taken from experience, former studies, or pilot studies
Effect Size
o What effect (mean difference, risk, coefficient size) is clinically relevant?
o Which effect must be detectable to make the study meaningful?
o This is a choice by the researcher (not statistically determined!)
Significance level
o usually set to α = 0.05
o Adjustments must be taken into account (e.g. multiple testing)
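The slides do not say which adjustment is meant. As one widely used example, the Bonferroni correction divides the significance level by the number of tests; the sketch below also shows the familywise error rate of unadjusted tests under the simplifying assumption that the tests are independent.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


def familywise_error(alpha, n_tests):
    """Chance of at least one type I error across `n_tests` independent
    tests that are each run at level `alpha` (independence is assumed)."""
    return 1.0 - (1.0 - alpha) ** n_tests


print(familywise_error(0.05, 5))                        # unadjusted
print(bonferroni_alpha(0.05, 5))                        # adjusted per-test level
print(familywise_error(bonferroni_alpha(0.05, 5), 5))   # stays below 0.05
```

A smaller per-test level feeds directly into the sample size calculation, so the adjustment should be decided before the calculation is done.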
Power of the test
o often used 1 − β = 0.8
o This is a choice by the researcher (not statistically determined!)
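Power for a given sample size can also be estimated by simulation: generate many studies in which the effect is real and count how often H0 is rejected. This is a rough sketch under assumptions not stated in the slides (normally distributed outcomes, known standard deviation, two-sided z-test, a true difference of 0.5 standard deviations); the seed and the number of simulated studies are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulated_power(n_per_group, delta, sd=1.0, alpha=0.05, n_sim=2000, seed=1):
    """Share of simulated two-group studies in which H0 is rejected
    although the true mean difference is `delta`."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = sd * sqrt(2.0 / n_per_group)
    rejections = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        treated = [rng.gauss(delta, sd) for _ in range(n_per_group)]
        z = (fmean(treated) - fmean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sim


for n in (20, 40, 63, 100):
    print(n, simulated_power(n, delta=0.5))
```

Under these assumptions the estimated power rises with the group size and reaches roughly the conventional 0.8 at about 63 subjects per group.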
Type of the Test
o Different tests for the same problem often have different power
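The five ingredients above come together in a sample size formula. As a sketch, the usual normal-approximation formula for comparing two independent means with a two-sided test is n = 2 (z_{1−α/2} + z_{1−β})² σ² / Δ² per group; the slides do not give a formula, so this choice and the example inputs are assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(delta, sd, alpha=0.05, power=0.80):
    """Subjects per group needed to detect a mean difference `delta`
    between two independent groups (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # significance level
    z_beta = NormalDist().inv_cdf(power)                # power of the test
    return ceil(2.0 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Hypothetical inputs: detect a difference of half a standard deviation.
print(n_per_group_two_means(delta=0.5, sd=1.0))               # 63
print(n_per_group_two_means(delta=0.5, sd=1.0, power=0.90))   # 85
```

More variability, a smaller effect, a stricter significance level, or a higher power all increase the result. A calculation based on the t distribution gives slightly larger numbers for small samples.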
What to put in a grant application
1. Bare essentials:
   Proportions
   o P1
   o The effect size (from which P2 and the standard error can be calculated)
   Means
   o Population mean for the controls
   o The effect size
   o Standard deviation for either the cases or controls
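For proportions, the two essentials listed above are enough to run a calculation: P2 follows from P1 plus the effect size, and the variance terms follow from the proportions themselves. The sketch uses one common normal-approximation formula with unpooled variances; other formulas (pooled variance, continuity correction) give somewhat different numbers, and the inputs are made up.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_proportions(p1, effect, alpha=0.05, power=0.80):
    """Subjects per group needed to detect a change from `p1` to
    `p1 + effect` (two-sided test, unpooled normal approximation)."""
    p2 = p1 + effect                                   # derived from P1 and the effect size
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p1 * (1.0 - p1) + p2 * (1.0 - p2)   # binomial variances per subject
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2)


# Hypothetical inputs: 20% in the controls, a 10 percentage point increase.
print(n_per_group_two_proportions(p1=0.20, effect=0.10))   # 291
```

Unlike the means case, no separate standard deviation has to be supplied, which is why the list for proportions is shorter.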
What to put in a grant application
Some points
o The variable you choose for the calculation should be the most important one to the interpretation of the study.
o Sometimes it is wise to do calculations for a number of outcome variables to see if they all give about the same sample size or if there are some variables which are more likely to yield precise results than others.
Some points, cont’d
o Altman Nomogram
o Sample Size Calculation software
   o SPSS
   o STATA
   o PASS
   o Gpower
   o PS
   o Epiinfo
   o …
What this workshop covers
o Means
   1. Estimation
   2. Comparing with a constant
   3. Comparing two independent groups
   4. Comparing ≥ two independent groups
   5. Comparing before and after situations
o Proportions
   1. Estimation
   2. Comparing with a constant
   3. Comparing two independent groups
o Correlation & Regression
What this workshop does not cover
o Complex formulas
o Special cases:
   o Survival analysis
   o Survey sampling methods
   o Crossover trials
   o Non-inferiority trials
   o Missing data
   o Subgroup analysis
   o Non-parametric tests
   o …