Download Power and Sample Size

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
Transcript
Power and Sample Size
Boulder 2004
Benjamin Neale
Shaun Purcell
To Be Accomplished
•
Introduce Concept of Power via
Correlation Coefficient (ρ) Example
Identify Relevant Factors Contributing to
Power
Practical:
•
•
•
•
Power Analysis for Univariate Twin Model
How to use Mx for Power
Simple example
Investigate the linear relationship (r)
between two random variables X and Y:
r=0 vs. r0 (correlation coefficient).
• draw a sample, measure X,Y
• calculate the measure of association r
(Pearson product moment corr. coeff.)
• test whether r  0.
How to Test r  0
•
•
•
•
•
assumed the data are normally
distributed
defined a null-hypothesis (r = 0)
chosen a level (usually .05)
utilized the (null) distribution of the
test statistic associated with r=0
t=r  [(N-2)/(1-r2)]
How to Test r  0
•
•
•
Sample N=40
r=.303, t=1.867, df=38, p=.06 α=.05
As p > α, we fail to reject r = 0
have we drawn the correct conclusion?
a= type I error rate
probability of deciding r  0
(while in truth r=0)
a is often chosen to equal
.05...why?
DOGMA
N=40, r=0, nrep=1000 – central t(38),
a=0.05 (critical value 2.04)
800
600
400
NREP=1000
4.6% sign.
200
0
-5
-4
-3
-2
-1
0
1
2
3
4
5
4
5
0.4
0.3
central t(38)
0.2
2.5%
2.5%
0.1
0
-5
-4
-3
-2
-1
0
1
2
3
Observed non-null distribution (r=.2)
and null distribution
100
80
60
abs(t)>2.04 in
23%
rho=.20
N=40
Nrep=1000
40
20
0
-5
-4
-3
-2
-1
0
1
2
3
4
5
1
2
3
4
5
0.5
0.4
null distribution t(38)
0.3
0.2
0.1
0
-5
-4
-3
-2
-1
0
In 23% of tests of r=0, |t|>2.024
(a=0.05), and thus draw the correct
conclusion that of rejecting r = 0.
The probability of rejecting the nullhypothesis (r=0) correctly is 1-b, or
the power, when a true effect exists
Hypothesis Testing
• Correlation Coefficient hypotheses:
– ho (null hypothesis) is ρ=0
– ha (alternative hypothesis) is ρ ≠ 0
• Two-sided test, where ρ > 0 or ρ < 0 are one-sided
• Null hypothesis usually assumes no effect
• Alternative hypothesis is the idea being
tested
Summary of Possible Results
accept H-0
reject H-0
H-0 true
1-a
a
H-0 false
b
1-b
a=type 1 error rate
b=type 2 error rate
1-b=statistical power
Rejection of H0
Non-rejection of H0
H0 true
Type I error
at rate a
Nonsignificant result
(1- a)
HA true
Significant result
(1-b)
Type II error
at rate b
Power
• The probability of rejection of a
false null-hypothesis depends
on:
–the significance criterion (a)
–the sample size (N)
–the effect size (NCP)
“The probability of detecting a given effect size
in a population from a sample of size N,
using significance criterion a”
Standard Case
Sampling
P(T) distribution if
H0 were true alpha 0.05
Sampling
distribution if HA
were true
POWER = 1 - b
b
a
Effect Size (NCP)
T
Impact of Less Cons. alpha
Sampling
P(T) distribution if
H0 were true alpha 0.1
Sampling
distribution if HA
were true
POWER = 1 - b 
b
a
T
Impact of More Cons. alpha
Sampling
P(T) distribution if
H0 were true alpha 0.01
Sampling
distribution if HA
were true
POWER = 1 - b
b
a
T
Increased Sample Size
Sampling
P(T) distribution if
H0 were true alpha 0.05
Sampling
distribution if HA
were true
POWER = 1 - b
b
a
T
Increase in Effect Size
Sampling
P(T) distribution if
H0 were true alpha 0.05
Sampling
distribution if HA
were true
POWER = 1 - b
b
a
Effect Size (NCP)↑
T
Effects on Power Recap
• Larger Effect Size
• Larger Sample Size
• Alpha Level shifts <Beware the False
Positive!!!>
• Type of Data:
– Binary, Ordinal, Continuous
When To Do Power Calcs?
•
•
•
•
Generally study planning stages of study
Occasionally with negative result
No need if significance is achieved
Computed to determine chances of
success
Power Calculations Empirical
•
•
•
•
•
Attempt to Grasp the NCP from Null
Simulate Data under theorized model
Calculate Statistics and Perform Test
Given α, how many tests p < α
Power = (#hits)/(#tests)
Practical: Empirical Power 1
• We will Simulate Data under a model
online
• We will run an ACE model, and test for C
• We will then submit our data and Shaun
will collate it for us
• While he’s collating, we’ll talk about
theoretical power calculations
Practical: Empirical Power 2
• First get F:\ben\2004\ace.mx and put it
into your directory
• We will paste our simulated data into this
script, so open it now in preparation, and
note both places where we must paste in
the data
• Note that you will have to fit the ACE
model and then fit the AE submodel
Practical: Empirical Power 3
• Simulation Conditions
– 30% A2
20% C2
50% E2
– Input:
– A 0.5477 C of 0.4472 E of 0.7071
– 350 MZ 350 DZ
– Simulate and Space Delimited at
– http://statgen.iop.kcl.ac.uk/workshop/unisim.html or
click here in slide show mode
– Click submit after filling in the fields and you will get a
page of data
Practical: Empirical Power 4
• With the data page, use control-a to select the
data, control-c to copy, and in Mx control-v to
paste in both the MZ and DZ groups.
• Run the ace.mx script with the data pasted in
and modify it to run the AE model.
• Report the A, C, and E estimates of the first
model, and the A and E estimates of the second
model as well as both the
-2log-likelihoods on the webpage
http://statgen.iop.kcl.ac.uk/workshop/ or click
here in slide show mode
Practical: Empirical Power 5
• Once all of you have submitted your
results we will take a look at the theoretical
power calculation, using Mx.
• Once we have finished with the theory
Shaun will show us the empirical
distribution that we generated today
Theoretical Power Calculations
• Based on Stats, rather than Simulations
• Can be calculated by hand sometimes, but
Mx does it for us
• Note that sample size and alpha-level are
the only things we can change, but can
assume different effect sizes
• Mx gives us the relative power levels at
the alpha specified for different sample
sizes
Theoretical Power Calculations
• We will use the power.mx script to look at
the sample size necessary for different
power levels
• In Mx, power calculations can be
computed in 2 ways:
– Using Covariance Matrices (We Do This One)
– Requiring an initial dataset to generate a
likelihood so that we can use a chi-square test
Power.mx 1
! Simulate the data
!
30% additive genetic
!
20% common environment
!
50% nonshared environment
#NGroups 3
G1: model parameters
Calculation
Begin Matrices;
X lower 1 1 fixed
Y lower 1 1 fixed
Z lower 1 1 fixed
End Matrices;
Matrix X 0.5477
Matrix Y 0.4472
Matrix Z 0.7071
Begin Algebra;
A = X*X' ;
C = Y*Y' ;
E = Z*Z' ;
End Algebra;
End
Power.mx 2
G2: MZ twin pairs
Calculation
Matrices = Group 1
Covariances A+C+E|
A+C
Options MX%E=mzsim.cov
End
A+C _
|
G3: DZ twin pairs
Calculation
Matrices = Group 1
H Full 1 1
Covariances A+C+E|
H@A+C _
H@A+C |
A+C+E /
Matrix H 0.5
Options MX%E=dzsim.cov
End
A+C+E /
Power.mx 3
! Second part of script
! Fit the wrong model to the simulated data
! to calculate power
#NGroups 3
G1 : model parameters
Calculation
Begin Matrices;
X lower 1 1 free
Y lower 1 1 fixed
Z lower 1 1 free
End Matrices;
Begin Algebra;
A = X*X' ;
C = Y*Y' ;
E = Z*Z' ;
End Algebra;
End
Power.mx 4
G2 : MZ twins
Data NInput_vars=2 NObservations=350
CMatrix Full File=mzsim.cov
Matrices= Group 1
Covariances A+C+E
|
A+C _
A+C
|
Option RSiduals
End
G3 : DZ twins
Data NInput_vars=2 NObservations=350
CMatrix Full File=dzsim.cov
Matrices= Group 1
H Full 1 1
Covariances A+C+E
|
H@A+C _
H@A+C
|
Matix H 0.5
Option RSiduals
! Power for alpha = 0.05 and 1 df
Option Power= 0.05,1
End
A+C+E /
A+C+E /