Sample size for microarray experiments
R users: ssize package for sample size
Judith Boer
June 6, 2006

Introduction ssize package
• Simple R tool for sample size determination based on a pilot study
• Developed by Gregory Warnes (Pfizer)
• Available from BioC
• Input is a list of standard deviations from representative control arrays

ssize package (BioC, Gregory Warnes, Pfizer R&D)
• Estimate the standard deviation for each gene based on representative control arrays
• Set parameters for
  • minimum effect size (fold change): 2
  • maximum family-wise type I error rate: 0.05
  • desired power: 0.80
• Calculate the per-test error rate based on Bonferroni correction
• Compute the sample size for each gene separately according to the standard formula for the two-sample t-test with pooled variance
• Summarize the necessary sample size across all genes using a cumulative plot

Assumption and output of ssize
• Microarray data have been normalized and transformed so that the data for each gene are sufficiently close to a normal distribution that a standard 2-sample pooled-variance t-test will reliably detect differentially expressed genes
  • however, an alternative test method can be implemented
• Output of the ssize package is three plots:
  • power plot
  • sample size plot
  • fold change plot

Performance of ssize in simulation study
• Manuscript submitted to Biometrics: Warnes & Liu
• Dependence of the genes: little or no effect
• Proportion of genes with true differential expression: no effect
• Unequal variance between control and test groups: has effect
  • solution: use unequal variance t-test sample size formula
• Multiple comparison method: use of Bonferroni correction considerably underestimates power, hence overestimates sample size
  • use of FDR planned by Warnes

Demonstration on our data
• Demonstrated on Agilent two-color data from a platform comparison study by Peter-Bram ’t Hoen and the LGTC
• 10 arrays with direct comparisons of 5 WT and 5 transgenic mice, using dye swap replicates
• Loess normalization in limma
• Exported the MA list object
• Extracted the normalized log ratios for the KO vs WT with same dye orientation (first 5 arrays)
• Calculated the standard deviation of the log ratios (exp.sd)
• Used exp.sd for the ssize package

Histograms of the standard deviations

Power plot: power to detect 2-fold change

Sample size plot: sample size to detect 2-fold change

Fold change plot: fold change to achieve 80% power

Conclusions ssize package
• Not useful for absolute estimation of sample size due to Bonferroni correction
  • Their own example comparing Bonferroni and FDR showed a reduction of the sample size needed for 90% power from 8 to below 3 arrays (curve very steep)
• Useful for relative sample size estimation
  • compare different microarray platforms
  • compare different biological sources (organism, tissue, treatment, in vitro)
• Simple tool that anyone can use in R
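
The per-gene calculation outlined in the slides (a Bonferroni-adjusted per-test error rate, a two-sample sample size formula applied to each gene, then a cumulative summary across genes) can be sketched roughly as follows. This is an illustration in Python using a normal approximation, not the ssize package's own R code, which the slides describe as based on the t-test formula; the approximation tends to give slightly smaller sample sizes than a t-based calculation. The function names sample_sizes and fraction_powered and the example standard deviations are hypothetical, and the standard deviations are assumed to be on the log2 scale.

```python
from math import ceil, log2
from statistics import NormalDist


def sample_sizes(sds, fold_change=2.0, fwer=0.05, power=0.80):
    """Approximate arrays per group needed for each gene.

    sds: per-gene standard deviations on the log2 scale (assumption).
    Uses the normal approximation to the two-sample, equal-variance
    sample size formula, with a Bonferroni-adjusted two-sided alpha.
    """
    alpha = fwer / len(sds)                  # Bonferroni per-test error rate
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    delta = log2(fold_change)                # effect size on the log2 scale
    sizes = []
    for sd in sds:
        n = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
        sizes.append(max(2, ceil(n)))        # a t-test needs at least 2 per group
    return sizes


def fraction_powered(sizes, n):
    """Proportion of genes whose required sample size is at most n."""
    return sum(1 for s in sizes if s <= n) / len(sizes)


# Hypothetical standard deviations for five genes (not data from the study).
example_sds = [0.15, 0.22, 0.30, 0.45, 0.80]
needed = sample_sizes(example_sds)
for n in range(2, max(needed) + 1):
    # One point per n on the cumulative curve that a sample size plot would show.
    print(n, fraction_powered(needed, n))
```

Because the per-test alpha shrinks with the number of genes tested, running the same sketch on a full array's worth of standard deviations pushes the required sample sizes up, which is the conservative behaviour the conclusions attribute to the Bonferroni correction.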