LIMMA
Linear Models for Microarray Data

Difficulties with microarray data
• Variability of the expression values differs between genes
• Non-identical and dependent distributions between genes
• Multiple testing of tens of thousands of genes

Correct for multiple comparisons
• Multiple testing
  - Family-wise error rate
  - False Discovery Rate etc.
• Parallel nature of the inference allows for compensating possibilities
• Borrowing information from the ensemble of genes to assist in inference for individual genes

Empirical Bayes
• In frequentist methods, a hypothesis is typically rejected or not rejected without directly assigning it a probability
• Bayesian methods specify a prior probability, which is then updated in the light of new data
• For Bayesian techniques, the prior distribution is assigned independently of the data and fixed before any data are observed

Empirical Bayes
• Superficially similar to Bayesian methods in that a prior distribution is assigned
• However, the prior distribution is estimated from the data
• Therefore Empirical Bayes is a frequentist technique

LIMMA
• Empirical Bayes techniques have previously been applied to microarray data
• Those analyses were specific to the experiment and very difficult to implement
• LIMMA - Simple model with a simple expression for the posterior odds
• Allows linear modelling to be applied to microarray data

Estrogen Data
• 2x2 factorial experiment on MCF7 breast cancer cells using Affymetrix HGU95av2 arrays
• Factors:
  - Estrogen (presence/absence)
  - Length of exposure (10 hr/48 hr)
• The aim of the study is to identify genes that respond to estrogen treatment

Read in the Data
• Load the estrogen data
• Normalise the data
• Define the targets (factors) for the linear model

Design Matrix
   Array file     Estrogen   Time (hr)
1  low10-1.cel    absent     10
2  low10-2.cel    absent     10
3  high10-1.cel   present    10
4  high10-2.cel   present    10
5  low48-1.cel    absent     48
6  low48-2.cel    absent     48
7  high48-1.cel   present    48
8  high48-2.cel   present    48
• Eight arrays
• Four pairs of replicates
• Four parameters in the linear model

Contrast Matrix
• Estrogen effect at 10 hours
• Estrogen effect at 48 hours
• Time effect without estrogen

Differential Expression
• Extract the linear model fit for the contrasts
• Obtain a list of differentially expressed genes for each contrast
• Look for overlap among the differentially expressed genes

Linear Model Fit
• logFC - Estimate of the log2 fold change corresponding to the effect or contrast
• AveExpr - Average log2 expression for the probe over all arrays/channels
• t - Moderated t-statistic
• P.Value - Raw p-value
• adj.P.Val - Adjusted p-value
• B - Log odds that the gene is differentially expressed

Annotating Data
• Probe arrays can be annotated with external data
• Multiple sources of gene annotations

Gene Set Enrichment
• All biochemical pathways are determined by sets of genes
• Gene sets are determined by prior biological knowledge relating to co-expression, function, location or known biochemical pathways
• If a pathway is in any way related to a biological trait, then the co-functioning genes should display a higher degree of enrichment than the rest of the transcriptome
• Gene Set Enrichment (GSE) is a computational technique which determines whether an a priori defined set of genes shows statistically significant overlap with the differentially expressed genes

Estrogen receptor (ER) gene set
• If estrogen is present, ER genes will bind the estrogen and become activated
• They gain the ability to regulate gene expression, resulting in differential expression between the cells with and without estrogen
• This should lead to up-regulation of ER genes
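
The Design Matrix and Contrast Matrix slides can be made concrete with a small sketch. limma itself is an R/Bioconductor package; the Python below is only an illustration and does not use limma. It assumes a group-means parameterisation (one coefficient per estrogen/time combination, which is one common way to arrive at four parameters for this 2x2 layout; the slides do not show which parameterisation was used). The log2 expression values for the single probe are invented, and all function and contrast names are hypothetical.

```python
# Illustrative sketch only: this is not limma, and the expression values are invented.
# Targets as listed on the Design Matrix slide: (array file, estrogen, hours).
targets = [
    ("low10-1.cel", "absent", 10),
    ("low10-2.cel", "absent", 10),
    ("high10-1.cel", "present", 10),
    ("high10-2.cel", "present", 10),
    ("low48-1.cel", "absent", 48),
    ("low48-2.cel", "absent", 48),
    ("high48-1.cel", "present", 48),
    ("high48-2.cel", "present", 48),
]

# Assumed parameterisation: one coefficient per estrogen/time group.
groups = ["absent.10", "present.10", "absent.48", "present.48"]


def design_matrix(targets, groups):
    """Build one row per array with a 0/1 indicator column per group."""
    rows = []
    for _file, estrogen, hours in targets:
        label = f"{estrogen}.{hours}"
        rows.append([1 if label == g else 0 for g in groups])
    return rows


# The three contrasts named on the Contrast Matrix slide,
# written as weights on the four group coefficients.
contrasts = {
    "estrogen_10h": [-1, 1, 0, 0],
    "estrogen_48h": [0, 0, -1, 1],
    "time_without_estrogen": [-1, 0, 1, 0],
}


def fit_group_means(design, y):
    """Least squares for an indicator design: each coefficient is its group's mean."""
    coefs = []
    for j in range(len(design[0])):
        values = [y[i] for i, row in enumerate(design) if row[j] == 1]
        coefs.append(sum(values) / len(values))
    return coefs


def contrast_estimates(coefs, contrasts):
    """Apply each contrast's weights to the fitted coefficients (a log2 fold change)."""
    return {
        name: sum(w * c for w, c in zip(weights, coefs))
        for name, weights in contrasts.items()
    }


design = design_matrix(targets, groups)

# Hypothetical log2 expression values for one probe, in the same order as targets.
log2_expr = [7.0, 7.5, 9.0, 9.5, 7.25, 7.75, 10.0, 11.0]

coefs = fit_group_means(design, log2_expr)
log_fc = contrast_estimates(coefs, contrasts)
```

With this parameterisation each contrast is simply a difference of two group means, which is why the log fold changes can be read directly off the fitted coefficients.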
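
The adjusted p-value reported on the Linear Model Fit slide comes from a multiple-testing correction of the kind listed under "Correct for multiple comparisons". As a minimal sketch, the Benjamini-Hochberg false discovery rate adjustment can be written as follows. This is one common choice and is assumed here; the slides do not say which adjustment was used. The raw p-values are invented and the function name `bh_adjust` is hypothetical.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five probes.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = bh_adjust(raw_p)

# Probes declared differentially expressed at a 5% false discovery rate.
significant = [p <= 0.05 for p in adj_p]
```

In this made-up example, two probes with raw p-values just under 0.05 are no longer significant after adjustment, which is the practical effect of correcting for the number of tests.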
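
The idea of "borrowing information from the ensemble of genes" is what turns an ordinary t-statistic into the moderated t-statistic: each gene's sample variance is shrunk toward a prior variance before the statistic is formed. The sketch below shows only that shrinkage step. In limma the prior variance and prior degrees of freedom are estimated from all genes; here they are simply assumed as inputs, every number is invented, and the function name `moderated_t` is hypothetical.

```python
import math


def moderated_t(effect, s2, df, unscaled_sd, prior_s2, prior_df):
    """Moderated t-statistic for one gene.

    effect      : estimated contrast (log2 fold change)
    s2, df      : the gene's residual variance and its degrees of freedom
    unscaled_sd : standard deviation factor from the design (sqrt of v)
    prior_s2    : prior variance (assumed given; estimated from all genes in practice)
    prior_df    : prior degrees of freedom (assumed given)
    """
    # Posterior variance: weighted average of the prior and the gene's own variance.
    s2_post = (prior_df * prior_s2 + df * s2) / (prior_df + df)
    t_mod = effect / (math.sqrt(s2_post) * unscaled_sd)
    # The moderated statistic has the augmented degrees of freedom.
    return t_mod, prior_df + df


# Hypothetical gene: eight arrays, four coefficients, so 4 residual degrees of freedom.
# Its own variance happens to be very small, which would inflate the ordinary t.
effect, s2, df = 2.0, 0.01, 4
unscaled_sd = 1.0  # difference of two group means with two replicates each: sqrt(1/2 + 1/2)

ordinary_t = effect / (math.sqrt(s2) * unscaled_sd)
t_mod, df_total = moderated_t(effect, s2, df, unscaled_sd, prior_s2=0.09, prior_df=4)
```

Because the gene's unusually small variance is pulled toward the prior value, the moderated statistic is less extreme than the ordinary one, which guards against genes that look significant only because their variance was underestimated from a few replicates.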