Two-way ANOVA - GeneSifter.Net
Using 2-way ANOVA to dissect the immune response to hookworm infection in mouse lung
Eric Olson
[email protected]

• General microarray data analysis workflow
  – From raw data to biological significance
  – Comparison statistics
  – Two-way ANOVA
• GeneSifter Overview
• The Gene Expression Omnibus (GEO)
• Microarray analysis of gene expression following hookworm infection
  – Data overview
  – Dissection of the immune response using 2-way ANOVA

The Microarray Data Analysis Process
• Experimental Design: number of groups, factors, replicates
• Data management: data, sample annotation, gene annotation, databases
• Differential Expression: comparison statistics, correction for multiple testing, clustering
• Biological significance: individual genes, biological themes
• Platform Selection: one-color, two-color, platform comparisons
• System access: ease of use, accessibility
• Making data public and using public data: MIAME, journals, GEO, meta-analysis

Experiment Design
• Type of experiment
  – Two groups
    • Normal vs. cancer
    • Control vs. treated
  – Three or more groups, single factor
    • Time series
    • Dose response
    • Multiple treatment
  – Four or more groups, multiple factors
    • Time series with control and treated cells
  The type of experiment and the number of groups and factors determine the statistical methods needed to detect differential expression.
• Replicates
  – The more the better, but at least 3
  – Biological better than technical
  Rigorous statistical inferences cannot be made with a sample size of one. The more replicates, the stronger the inference.

Pavlidis P, Li Q, Noble WS. The effect of replication on gene expression microarray experiments. Bioinformatics. 2003 Sep 1;19(13):1620-7.
Kerr K. Experimental Design and Other Issues in Microarray Studies. http://ra.microslu.washington.edu/learning/documents/KerrNAS.pdf

Differential Expression
The fundamental goal of microarray experiments is to identify genes that are differentially expressed in the conditions being studied. Comparison statistics help identify differentially expressed genes, and cluster analysis can identify patterns of gene expression and segregate subsets of genes based on those patterns.
• Statistical Significance
  – Fold change
    Fold change does not address the reproducibility of the observed difference and cannot be used to determine statistical significance.
  – Comparison statistics
    • 2 groups: t-test, Welch's t-test, Wilcoxon Rank Sum
    • 3 or more groups, single factor: One-way ANOVA, Kruskal-Wallis
    • 4 or more groups, multiple factors: Two-way ANOVA
    Comparison tests require replicates and use the variability within the replicates to assign a confidence level as to whether the gene is differentially expressed.

Supporting material
Draghici S. (2002) Statistical intelligence: effective analysis of high-density microarray data. Drug Discov Today, 7(11 Suppl): S55-63.
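The contrast between fold change and a comparison statistic can be made concrete with a small sketch. The signal values below are hypothetical (they are not data from the talk), and the function names are illustrative only. Both genes have the same fold change, but the statistic that uses within-group variability separates them clearly.

```python
import math
import statistics


def fold_change(exp, con):
    """Ratio of group means (experimental over control)."""
    return statistics.mean(exp) / statistics.mean(con)


def t_statistic(exp, con):
    """Difference between group means divided by its standard error.

    The standard error is sqrt(s1^2/n1 + s2^2/n2), where s^2 is the
    sample variance of each group.
    """
    se = math.sqrt(statistics.variance(exp) / len(exp)
                   + statistics.variance(con) / len(con))
    return (statistics.mean(exp) - statistics.mean(con)) / se


# Hypothetical signals, 4 replicates per group (assumed for illustration).
con = [1.0, 1.2, 0.8, 1.0]           # control, mean 1.0
gene_noisy = [0.5, 1.0, 4.5, 10.0]   # mean 4.0, replicates disagree
gene_tight = [3.8, 4.1, 3.9, 4.2]    # mean 4.0, replicates agree

fc_noisy = fold_change(gene_noisy, con)
fc_tight = fold_change(gene_tight, con)
t_noisy = t_statistic(gene_noisy, con)
t_tight = t_statistic(gene_tight, con)

print(f"noisy gene: fold change {fc_noisy:.2f}, t = {t_noisy:.2f}")
print(f"tight gene: fold change {fc_tight:.2f}, t = {t_tight:.2f}")
```

A p-value would then be read from the t distribution with the appropriate degrees of freedom; that step is left out here to keep the sketch dependency-free.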
t-test for comparison of two groups
Calculate the t statistic:

$$t = \frac{\text{difference between groups}}{\text{difference within groups}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$

where $\bar{x}_i$ = mean of group $i$, $s_i^2$ = variance of group $i$, and $n_i$ = size of sample $i$.
Determine the confidence level for t (the probability that t could occur by chance), with df = n1 + n2 - 2.
The larger the difference between the groups and the lower the variance, the bigger t will be and the lower p will be.

Differential Expression: fold change vs. p value
2 groups, 4 replicates each. Mean, standard deviation, fold change and p-value calculated.
[Figure: bar charts of mean signal, Exp vs. Con. Gene 1: Fold Change = 5.3, p = 0.19. Gene 2: Fold Change = 5.3, p = 0.03.]

Analysis of Variance (ANOVA)
• Like the t-test, identifies genes with large differences between groups and small differences within groups
• For use with 3 or more groups
• One-way and two-way
• One-way examines effects of one factor on gene expression
• Two-way can examine effects of two factors on gene expression as well as the interaction of the two factors

Pavlidis P. Using ANOVA for gene selection from microarray studies of the nervous system. Methods. 2003 Dec;31(4):282-9.
Glantz S. Primer of Biostatistics. 5th Edition. McGraw-Hill.
Glantz S, Slinker B. Primer of Regression and Analysis of Variance. McGraw-Hill.

Two-way ANOVA Example
Triple treatment in Huntington's Disease model (R6/2 mice, GSE857, Affymetrix U74Av2)
Design (replicates per group):
            Treatment -   Treatment +
  WT             3             3
  R6/2           3             3
[Figure: gene expression patterns illustrating a disease effect, a treatment effect, a disease and treatment effect (no interaction), and an interaction.]

Two-way ANOVA compared to t-test
Triple treatment in Huntington's Disease model (R6/2 mice, GSE857, Affymetrix U74Av2)
Disease differences:
  t-test    274
  Two-way   791

Pavlidis P, Noble WS. Analysis of strain and regional variation in gene expression in mouse brain. Genome Biol. 2001;2(10):RESEARCH0042.
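For a single gene, the three questions a two-way ANOVA asks (effect of factor A, effect of factor B, and their interaction) can be sketched with the textbook sums-of-squares decomposition for a balanced design. This is a minimal illustration under stated assumptions, not GeneSifter's implementation: the function name `two_way_anova_f` and the example values are hypothetical, every cell must hold the same number of replicates, and only F statistics and degrees of freedom are returned.

```python
from statistics import mean


def two_way_anova_f(cells):
    """F statistics for a balanced two-way design.

    cells[i][j] is the list of replicate values for level i of factor A
    and level j of factor B. Every cell needs the same number (>= 2) of
    replicates. Returns {effect: (F, df_effect, df_within)}.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if n < 2 or any(len(row) != b or any(len(c) != n for c in row) for row in cells):
        raise ValueError("design must be balanced with at least 2 replicates per cell")

    grand = mean([v for row in cells for c in row for v in c])
    a_means = [mean([v for c in row for v in c]) for row in cells]
    b_means = [mean([v for row in cells for v in row[j]]) for j in range(b)]
    cell_means = [[mean(c) for c in row] for row in cells]

    # Sums of squares for the two main effects, the interaction, and the
    # residual (within-cell) variation.
    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_within = sum((v - cell_means[i][j]) ** 2
                    for i in range(a) for j in range(b) for v in cells[i][j])

    df_a, df_b, df_ab = a - 1, b - 1, (a - 1) * (b - 1)
    df_within = a * b * (n - 1)
    ms_within = ss_within / df_within
    return {
        "A": (ss_a / df_a / ms_within, df_a, df_within),
        "B": (ss_b / df_b / ms_within, df_b, df_within),
        "interaction": (ss_ab / df_ab / ms_within, df_ab, df_within),
    }


# Hypothetical 2 x 2 example: rows = factor A, columns = factor B,
# 2 replicates per cell, purely additive effects (no interaction).
additive = [[[1.0, 3.0], [3.0, 5.0]],
            [[5.0, 7.0], [7.0, 9.0]]]
result = two_way_anova_f(additive)
for effect, (f_value, df1, df2) in result.items():
    print(f"{effect}: F({df1}, {df2}) = {f_value:.2f}")
```

Each F value would be converted to a p-value with the F distribution on the listed degrees of freedom, typically through a statistics library. Running the same function once per gene gives three p-value lists (factor A, factor B, interaction) that can be filtered separately.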
Analysis Workflow Examples
• 2 groups (apoE -/- aorta vs. wt aorta): t-test → BH (FDR) → Up regulated / Down regulated → Gene Lists
• 5 groups, single factor (Drosophila Innate Immune Response Time Series): One-way ANOVA → BH (FDR) → Clustering → Gene Lists
• 12 groups, two factors (Immune response to hookworms in mouse lung): Two-way ANOVA → BH (FDR) → Clustering → Gene Lists
Gene lists lead to individual genes of interest and biological themes (pathways, molecular functions, etc.).

GeneSifter – Microarray Data Analysis
• Accessibility
  – Web-based
  – Secure
• Data management
  – Data annotation (MIAME)
  – Multiple upload tools: CodeLink, Affymetrix, Illumina, Agilent, Custom
• Differential Expression: powerful, accessible tools for determining statistical significance
  – R based statistics (Bioconductor)
  – Comparison tests: t-test, Welch's t-test, Wilcoxon Rank Sum test, one-way ANOVA, two-way ANOVA
  – Correction for multiple testing: Bonferroni, Holm, Westfall and Young maxT, Benjamini and Hochberg
  – Unsupervised clustering: PAM, CLARA, hierarchical clustering
  – Silhouettes

GeneSifter – Microarray Data Analysis
Integrated tools for determining Biological Significance
• One Click Gene Summary™
• Ontology Report
• Pathway Report
• Search by ontology terms
• Search by KEGG terms or Chromosome

The GeneSifter Data Center
• Free resource: Training, Research, Publishing
• 6 areas: Cardiovascular, Cancer, Endocrinology, Neuroscience, Immunology, Oral Biology
• Access to: Data, Analysis summary, Tutorials, WebEx
www.genesifter.net/dc

The Gene Expression Omnibus (GEO)
• Gene expression data repository (mostly microarrays)
• Over 3000 data sets
• All array platforms represented
• Searchable by Platform, Species, Experiment annotation
• Downloadable data
Using the Gene Expression Omnibus (http://www.microarraysuccess.org/newsletter)

Project Analysis: Two-way ANOVA
• Scott lab, Johns Hopkins University (Bloomberg School of Public Health)
• Affymetrix Mouse 430 2.0
• Wild type and SCID mice
• Control and 5 time points after infection
• CEL files available (loaded and MAS5 processed in GeneSifter)

Loukas A, Prociv P. Immune Responses in Hookworm Infections. Clinical Microbiology Reviews, October 2001, p. 689-703, Vol. 14, No. 4.
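The layout of this data set, two strains crossed with a control and five time points, can be sketched as a grid of cells, one per group. The sketch below assumes 3 biological replicates per group and the six time levels used in this project (control plus 2, 3, 4, 8 and 12 dpi). The sample names are hypothetical placeholders, not the identifiers used in GEO or GeneSifter.

```python
from itertools import product

STRAINS = ["WT", "SCID"]                    # factor one: 2 levels
TIMES = ["con", "2", "3", "4", "8", "12"]   # factor two: 6 levels (days post infection)
REPLICATES = 3                              # biological replicates per group (assumed)


def build_design(strains, times, replicates):
    """Map each (strain, time) cell to its list of hypothetical sample names."""
    return {
        (s, t): [f"{s}_{t}_rep{r}" for r in range(1, replicates + 1)]
        for s, t in product(strains, times)
    }


design = build_design(STRAINS, TIMES, REPLICATES)
n_groups = len(design)
n_samples = sum(len(samples) for samples in design.values())
print(n_groups, "groups,", n_samples, "samples")
```

Keeping the design as an explicit mapping makes it easy to check that every cell is filled before any statistics are run, since an empty or unbalanced cell changes which form of two-way ANOVA is appropriate.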
Project Analysis: Two-way ANOVA
• Factor One: Strain (2 levels: SCID, WT)
• Factor Two: Time after infection (6 levels: con, 2, 3, 4, 8, 12 dpi)
[Figure: gene expression patterns for WT and SCID over time, illustrating a strain effect, a time effect, and an interaction.]

Project Analysis: Two-way ANOVA
• Identify factors
• Indicate number of levels for each
• Identify levels for each factor
• Assign levels for each factor to cells
• Include fold-change cutoff if desired
• Select effect to filter on first (you can switch later)

Two-way ANOVA: Strain Effects

Biological Significance
Gene Annotation Sources
• UniGene - organizes GenBank sequences into a non-redundant set of gene-oriented clusters. Gene titles are assigned to the clusters, and these titles are commonly used by researchers to refer to that particular gene.
• LocusLink (Entrez Gene) - provides a single query interface to curated sequence and descriptive information, including function, about genes.
• Gene Ontologies - The Gene Ontology™ Consortium provides controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products, which can be used by databases such as Entrez Gene.
• KEGG - Kyoto Encyclopedia of Genes and Genomes provides information about both regulatory and metabolic pathways for genes.
• Reference Sequences - The NCBI Reference Sequence project (RefSeq) provides reference sequences for both the mRNA and protein products of included genes.
GeneSifter maintains its own copies of these databases and updates them automatically.
One-Click Gene Summary

Two-way ANOVA: Strain Effects
Ontology Report

Ontology Report: z-score
R = total number of genes meeting selection criteria
N = total number of genes measured
r = number of genes meeting selection criteria with the specified GO term
n = total number of genes measured with the specified GO term
Reference: Doniger SW, Salomonis N, Dahlquist KD, Vranizan K, Lawlor SC, Conklin BR. MAPPFinder: using Gene Ontology and GenMAPP to create a global gene-expression profile from microarray data. Genome Biology 2003, 4:R7.

Z-score Report
KEGG Report

Two-way ANOVA: Strain Effects
Strain effects - Visualization
Visualization of 517 genes (strain effect p < 0.001)

Strain effects - Partitioning
Segregation of expression patterns using k-medoids clustering
Silhouette widths are used to find the "best" number of clusters:
  k                  2      4      6
  mean sil. width    0.71   0.41   0.25
Dudoit S, Fridlyand J. A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biol. 2002 Jun 25;3(7):RESEARCH0036.

Strain: Cluster 1
Strain: Cluster 2

Two-way ANOVA: Time Effects
Time: Cluster 1
Time: Cluster 2

Two-way ANOVA: Interaction
Interaction: Cluster 3
Interaction: Cluster 2

Two-way ANOVA: Summary
Immune response to hookworms in mouse lung
• 12 groups (3 biological replicates), 2 factors (Strain and Time)
• Two-way ANOVA, ~39,000 genes: strain effects, time effects and interaction
  – Strain: 517 genes
  – Time: 1054 genes
  – Interaction: 56 genes
• Pattern selection: hierarchical clustering, PAM (Interaction)
• Z-scores
  – Biological process: Transcription (4), Circadian Rhythm (3)
  – Biological process: Immune response (8), Chitin catabolism (4)

GeneSifter Workflow Examples
• 2 groups (apoE -/- aorta vs. wt aorta): t-test → BH (FDR) → Up regulated / Down regulated → Gene Lists
• 5 groups, single factor (Drosophila Innate Immune Response Time Series): One-way ANOVA → BH (FDR) → Clustering → Gene Lists
• 12 groups, two factors (Immune response to hookworms in mouse lung): Two-way ANOVA → BH (FDR) → Clustering → Gene Lists
Gene lists lead to individual genes of interest and biological themes (pathways, molecular functions, etc.).

Resources
Monthly Webinar Series
• 8/10/06 - Microarray analysis of gene expression in Huntington's Disease peripheral blood - a platform comparison
• Archived - Using 2-way ANOVA to dissect gene expression following myocardial infarction in mice
• Archived - Using 2-way ANOVA to dissect the immune response to hookworm infection in mouse lung
• Archived - The microarray data analysis process - from raw data to biological significance
• Archived - Microarray analysis of gene expression in androgen-independent prostate cancer
• Archived - Microarray analysis of gene expression in male germ cell tumors

Thank You
www.genesifter.net
Trial account, tutorials, sample data and Data Center
Eric Olson
[email protected]
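Supplementary sketch: the BH (FDR) step that follows each comparison test in the workflow examples refers to the Benjamini and Hochberg correction for multiple testing. A minimal version of the standard step-up procedure is shown below. The function name and the p-values are hypothetical, and this is not presented as GeneSifter's own code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(pvals)
selected = [i for i, q in enumerate(adj) if q <= 0.05]
print(adj)
print("genes kept at FDR 0.05:", selected)
```

In a two-way ANOVA the correction would be applied separately to each list of p-values (each factor and the interaction), since each list answers a different question.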