Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell To Be Accomplished • Introduce Concept of Power via Correlation Coefficient (ρ) Example Identify Relevant Factors Contributing to Power Practical: • • • • Power Analysis for Univariate Twin Model How to use Mx for Power Simple example Investigate the linear relationship (r) between two random variables X and Y: r=0 vs. r0 (correlation coefficient). • draw a sample, measure X,Y • calculate the measure of association r (Pearson product moment corr. coeff.) • test whether r 0. How to Test r 0 • • • • • assumed the data are normally distributed defined a null-hypothesis (r = 0) chosen a level (usually .05) utilized the (null) distribution of the test statistic associated with r=0 t=r [(N-2)/(1-r2)] How to Test r 0 • • • Sample N=40 r=.303, t=1.867, df=38, p=.06 α=.05 As p > α, we fail to reject r = 0 have we drawn the correct conclusion? a= type I error rate probability of deciding r 0 (while in truth r=0) a is often chosen to equal .05...why? DOGMA N=40, r=0, nrep=1000 – central t(38), a=0.05 (critical value 2.04) 800 600 400 NREP=1000 4.6% sign. 200 0 -5 -4 -3 -2 -1 0 1 2 3 4 5 4 5 0.4 0.3 central t(38) 0.2 2.5% 2.5% 0.1 0 -5 -4 -3 -2 -1 0 1 2 3 Observed non-null distribution (r=.2) and null distribution 100 80 60 abs(t)>2.04 in 23% rho=.20 N=40 Nrep=1000 40 20 0 -5 -4 -3 -2 -1 0 1 2 3 4 5 1 2 3 4 5 0.5 0.4 null distribution t(38) 0.3 0.2 0.1 0 -5 -4 -3 -2 -1 0 In 23% of tests of r=0, |t|>2.024 (a=0.05), and thus draw the correct conclusion that of rejecting r = 0. The probability of rejecting the nullhypothesis (r=0) correctly is 1-b, or the power, when a true effect exists Hypothesis Testing • Correlation Coefficient hypotheses: – ho (null hypothesis) is ρ=0 – ha (alternative hypothesis) is ρ ≠ 0 • Two-sided test, where ρ > 0 or ρ < 0 are one-sided • Null hypothesis usually assumes no effect • Alternative hypothesis is the idea being tested Summary of Possible Results accept H-0 reject H-0 H-0 true 1-a a H-0 false b 1-b a=type 1 error rate b=type 2 error rate 1-b=statistical power Rejection of H0 Non-rejection of H0 H0 true Type I error at rate a Nonsignificant result (1- a) HA true Significant result (1-b) Type II error at rate b Power • The probability of rejection of a false null-hypothesis depends on: –the significance criterion (a) –the sample size (N) –the effect size (NCP) “The probability of detecting a given effect size in a population from a sample of size N, using significance criterion a” Standard Case Sampling P(T) distribution if H0 were true alpha 0.05 Sampling distribution if HA were true POWER = 1 - b b a Effect Size (NCP) T Impact of Less Cons. alpha Sampling P(T) distribution if H0 were true alpha 0.1 Sampling distribution if HA were true POWER = 1 - b b a T Impact of More Cons. alpha Sampling P(T) distribution if H0 were true alpha 0.01 Sampling distribution if HA were true POWER = 1 - b b a T Increased Sample Size Sampling P(T) distribution if H0 were true alpha 0.05 Sampling distribution if HA were true POWER = 1 - b b a T Increase in Effect Size Sampling P(T) distribution if H0 were true alpha 0.05 Sampling distribution if HA were true POWER = 1 - b b a Effect Size (NCP)↑ T Effects on Power Recap • Larger Effect Size • Larger Sample Size • Alpha Level shifts <Beware the False Positive!!!> • Type of Data: – Binary, Ordinal, Continuous When To Do Power Calcs? • • • • Generally study planning stages of study Occasionally with negative result No need if significance is achieved Computed to determine chances of success Power Calculations Empirical • • • • • Attempt to Grasp the NCP from Null Simulate Data under theorized model Calculate Statistics and Perform Test Given α, how many tests p < α Power = (#hits)/(#tests) Practical: Empirical Power 1 • We will Simulate Data under a model online • We will run an ACE model, and test for C • We will then submit our data and Shaun will collate it for us • While he’s collating, we’ll talk about theoretical power calculations Practical: Empirical Power 2 • First get F:\ben\2004\ace.mx and put it into your directory • We will paste our simulated data into this script, so open it now in preparation, and note both places where we must paste in the data • Note that you will have to fit the ACE model and then fit the AE submodel Practical: Empirical Power 3 • Simulation Conditions – 30% A2 20% C2 50% E2 – Input: – A 0.5477 C of 0.4472 E of 0.7071 – 350 MZ 350 DZ – Simulate and Space Delimited at – http://statgen.iop.kcl.ac.uk/workshop/unisim.html or click here in slide show mode – Click submit after filling in the fields and you will get a page of data Practical: Empirical Power 4 • With the data page, use control-a to select the data, control-c to copy, and in Mx control-v to paste in both the MZ and DZ groups. • Run the ace.mx script with the data pasted in and modify it to run the AE model. • Report the A, C, and E estimates of the first model, and the A and E estimates of the second model as well as both the -2log-likelihoods on the webpage http://statgen.iop.kcl.ac.uk/workshop/ or click here in slide show mode Practical: Empirical Power 5 • Once all of you have submitted your results we will take a look at the theoretical power calculation, using Mx. • Once we have finished with the theory Shaun will show us the empirical distribution that we generated today Theoretical Power Calculations • Based on Stats, rather than Simulations • Can be calculated by hand sometimes, but Mx does it for us • Note that sample size and alpha-level are the only things we can change, but can assume different effect sizes • Mx gives us the relative power levels at the alpha specified for different sample sizes Theoretical Power Calculations • We will use the power.mx script to look at the sample size necessary for different power levels • In Mx, power calculations can be computed in 2 ways: – Using Covariance Matrices (We Do This One) – Requiring an initial dataset to generate a likelihood so that we can use a chi-square test Power.mx 1 ! Simulate the data ! 30% additive genetic ! 20% common environment ! 50% nonshared environment #NGroups 3 G1: model parameters Calculation Begin Matrices; X lower 1 1 fixed Y lower 1 1 fixed Z lower 1 1 fixed End Matrices; Matrix X 0.5477 Matrix Y 0.4472 Matrix Z 0.7071 Begin Algebra; A = X*X' ; C = Y*Y' ; E = Z*Z' ; End Algebra; End Power.mx 2 G2: MZ twin pairs Calculation Matrices = Group 1 Covariances A+C+E| A+C Options MX%E=mzsim.cov End A+C _ | G3: DZ twin pairs Calculation Matrices = Group 1 H Full 1 1 Covariances A+C+E| H@A+C _ H@A+C | A+C+E / Matrix H 0.5 Options MX%E=dzsim.cov End A+C+E / Power.mx 3 ! Second part of script ! Fit the wrong model to the simulated data ! to calculate power #NGroups 3 G1 : model parameters Calculation Begin Matrices; X lower 1 1 free Y lower 1 1 fixed Z lower 1 1 free End Matrices; Begin Algebra; A = X*X' ; C = Y*Y' ; E = Z*Z' ; End Algebra; End Power.mx 4 G2 : MZ twins Data NInput_vars=2 NObservations=350 CMatrix Full File=mzsim.cov Matrices= Group 1 Covariances A+C+E | A+C _ A+C | Option RSiduals End G3 : DZ twins Data NInput_vars=2 NObservations=350 CMatrix Full File=dzsim.cov Matrices= Group 1 H Full 1 1 Covariances A+C+E | H@A+C _ H@A+C | Matix H 0.5 Option RSiduals ! Power for alpha = 0.05 and 1 df Option Power= 0.05,1 End A+C+E / A+C+E /