Missing Data Pt. 2
Guide to Handling Missing Information
• Contacting researchers
• Algebraic recalculations, conversions and approximations
• Imputation method (substituting missing data)

Imputation Method
- Used when recalculations are not possible, e.g. no standard deviation reported for a study
- Uses available data from other studies or from another meta-analysis

Imputation Method
a. Within-study imputation
b. Multiple imputations

Within-study imputation
Method 1. (Means)

$$\widetilde{SD}_j = \bar{X}_j \left( \frac{\sum_i^K SD_i}{\sum_i^K \bar{X}_i} \right)$$

$\widetilde{SD}_j$ = imputed standard deviation (SD) for study j, whose SD is missing
$\bar{X}_j$ = mean from the study with the missing SD
$\sum_i^K SD_i$ = summation of all known SDs from different studies
$\sum_i^K \bar{X}_i$ = summation of means from different studies other than j

Assumptions (Method 1)
• Assumes the SD-to-mean ratio is on the same scale for all studies
- Experimental scales can differ tremendously between taxonomic groups or experimental designs

Method 2. (Sample size)

$$\tilde{s}_j = \alpha + \beta(n_j)$$

- Regression technique
- For studies that report sample size but lack the information needed to calculate the pooled SD (required for Hedges' d)
$\alpha$ = intercept
$\beta$ = slope of the linear regression of n vs s
$n_j$ = observed sample size of the study with missing data

Assumptions (Method 2)
• Assumes n (observed sample size of the study with missing data) is a good predictor of s

Method 3. (No. of studies)

$$\tilde{s}_j = \frac{\sum_i^K s_i \sqrt{n_i}}{K \sqrt{n_j}}$$

K = number of studies with complete information on s and n (sample size of individual study)

Method 4. Follmann et al. (1992); Furukawa et al. (2006)

$$\tilde{s}_j = \sqrt{\frac{\sum_i^K (n_i - 1)\,\sigma_i^2}{\sum_i^K (n_i - 1)}}$$

$\sigma^2$ = variance
n = sample size of individual study

Assumptions
• Some degree of homogeneity among the observed SD and $\bar{X}$ across studies
• Assumes information is missing at random and not due to reporting biases (non-random)
- Imputations retain their original units.
- Large variations among estimates will bias imputations.
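The four within-study imputation formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and argument layout are hypothetical (they do not come from the slides), Method 2 is fitted with plain ordinary least squares, and all "known" lists are assumed to hold values from studies with complete information.

```python
import math


def impute_sd_from_means(mean_j, known_sds, known_means):
    """Method 1: scale the study mean by the overall SD-to-mean ratio."""
    return mean_j * sum(known_sds) / sum(known_means)


def impute_s_from_sample_size(n_j, known_ns, known_ss):
    """Method 2: regress s on n (ordinary least squares), predict at n_j."""
    k = len(known_ns)
    n_bar = sum(known_ns) / k
    s_bar = sum(known_ss) / k
    # Slope (beta) and intercept (alpha) of the fitted line s = alpha + beta * n
    beta = sum((n - n_bar) * (s - s_bar) for n, s in zip(known_ns, known_ss)) \
        / sum((n - n_bar) ** 2 for n in known_ns)
    alpha = s_bar - beta * n_bar
    return alpha + beta * n_j


def impute_s_from_study_count(n_j, known_ns, known_ss):
    """Method 3: sum of s_i * sqrt(n_i), divided by K * sqrt(n_j)."""
    k = len(known_ss)  # K = number of studies with complete s and n
    total = sum(s * math.sqrt(n) for s, n in zip(known_ss, known_ns))
    return total / (k * math.sqrt(n_j))


def impute_s_pooled(known_ns, known_ss):
    """Method 4: square root of the (n_i - 1)-weighted mean variance."""
    numerator = sum((n - 1) * s ** 2 for n, s in zip(known_ns, known_ss))
    denominator = sum(n - 1 for n in known_ns)
    return math.sqrt(numerator / denominator)
```

Methods 1 to 3 use information from the study with the missing value (its mean or its sample size), whereas Method 4 returns the same pooled value for every study with a missing SD.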
Multiple imputations
• Use a random sampling approach
• Average repeated sampling for missing data
• Overall imputed synthesis

Advantage of multiple imputations
• Variability is explicitly modeled, so imputed values are not treated as true observations
• e.g. the single imputation $\tilde{s}_j = \alpha + \beta(n_j)$ does not account for the error associated with $\alpha$ or $\beta$

Methods: Multiple imputations
• Various methods: use maximum likelihood or Bayesian models
• Requires specialized software
• e.g. Hot deck: to calculate pooled s when several SD values are missing
- A random sample of s is drawn with replacement from the possible s
- The process is repeated, again with replacement from the possible s
- Repeat until we get "m" complete data sets

Methods: Hot deck
For each of the m data sets, calculate the effect size $\delta_l$ and its variance $\sigma^2(\delta_l)$.

Pooled effect size:

$$\bar{\delta} = \frac{\sum_{l=1}^{m} \delta_l}{m}$$

Variance:

$$\sigma^2(\bar{\delta}) = \frac{\sum_{l=1}^{m} \sigma^2(\delta_l)}{m} + \left(1 + \frac{1}{m}\right) \frac{\sum_{l=1}^{m} (\delta_l - \bar{\delta})^2}{m - 1}$$

Rubin and Schenker (1991)
If 30% of data missing -> m = 3
If 50% of data missing -> m = 5

Non-parametric analyses and bootstrapping
• Alternative to Hedges' d
• Uses a weighting scheme
• Does not require SD
• e.g. log response ratio

$$\ln R = \ln\left(\frac{\bar{X}_T}{\bar{X}_C}\right)$$

T = treatment
C = control

If sample size is available but no SD, weight each study by

$$\frac{n_T\, n_C}{n_T + n_C}$$

which is the inverse of a simplified estimate of the variance $\sigma^2(\ln R)$.

Effects of Imputation
• No standardized method for imputation -> bias (e.g. Rubin and Schenker 1991)
• Appropriateness of imputed data can be evaluated using a sensitivity analysis
• Benefits despite potential bias
- Improved variance estimate (i.e. smaller CI) over exclusion
- May potentially improve representation of null studies
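The hot deck procedure, the pooling formulas and the log response ratio described above can be sketched as follows. This is a minimal illustration, not the specialized software the slides refer to. All function names are hypothetical, a missing SD is assumed to be marked with `None`, and the step that turns each completed data set into an effect size and variance is left to the caller.

```python
import math
import random


def hot_deck_datasets(sds, m, rng):
    """Return m completed copies of sds.

    Each missing SD (None) is filled with a value drawn at random,
    with replacement, from the observed SDs.
    """
    donors = [s for s in sds if s is not None]
    return [
        [s if s is not None else rng.choice(donors) for s in sds]
        for _ in range(m)
    ]


def pool_imputed_effects(effects, variances):
    """Pool m effect sizes and their variances across imputed data sets."""
    m = len(effects)
    pooled = sum(effects) / m
    within = sum(variances) / m  # mean within-imputation variance
    between = sum((d - pooled) ** 2 for d in effects) / (m - 1)
    return pooled, within + (1 + 1 / m) * between


def log_response_ratio(mean_t, mean_c):
    """lnR = ln(treatment mean / control mean); needs no SD."""
    return math.log(mean_t / mean_c)


def sample_size_weight(n_t, n_c):
    """Weight for lnR when SDs are missing: n_T * n_C / (n_T + n_C)."""
    return n_t * n_c / (n_t + n_c)
```

A caller would typically create the data sets with something like `hot_deck_datasets(sds, m=5, rng=random.Random(1))`, compute an effect size and variance from each completed data set, and pass the two resulting lists to `pool_imputed_effects`. Passing an explicit `random.Random` instance keeps the draws reproducible.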