* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download m10-expression
Transposable element wikipedia , lookup
Bisulfite sequencing wikipedia , lookup
Public health genomics wikipedia , lookup
Deoxyribozyme wikipedia , lookup
Genomic imprinting wikipedia , lookup
Short interspersed nuclear elements (SINEs) wikipedia , lookup
Human genome wikipedia , lookup
Oncogenomics wikipedia , lookup
Gene therapy of the human retina wikipedia , lookup
Minimal genome wikipedia , lookup
History of RNA biology wikipedia , lookup
Cancer epigenetics wikipedia , lookup
Polycomb Group Proteins and Cancer wikipedia , lookup
Genomic library wikipedia , lookup
Non-coding DNA wikipedia , lookup
Non-coding RNA wikipedia , lookup
Genome (book) wikipedia , lookup
Pathogenomics wikipedia , lookup
Epigenomics wikipedia , lookup
Epigenetics of diabetes Type 2 wikipedia , lookup
RNA silencing wikipedia , lookup
History of genetic engineering wikipedia , lookup
Long non-coding RNA wikipedia , lookup
Epigenetics of human development wikipedia , lookup
Genome editing wikipedia , lookup
Primary transcript wikipedia , lookup
Microevolution wikipedia , lookup
Vectors in gene therapy wikipedia , lookup
Helitron (biology) wikipedia , lookup
Designer baby wikipedia , lookup
Genome evolution wikipedia , lookup
Gene expression programming wikipedia , lookup
Nutriepigenomics wikipedia , lookup
Metagenomics wikipedia , lookup
Site-specific recombinase technology wikipedia , lookup
Mir-92 microRNA precursor family wikipedia , lookup
Therapeutic gene modulation wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
BST281: Genomic Data Manipulation, Spring 2017 Monday 10: Gene expression assays Gene expression or transcriptional activity provides a global snapshot of molecular dynamics. Proteins/metabolites hard to measure, but RNA provides a more uniform intermediate. Transcriptional measurements provide the ability to: Associate genes with biological processes / environmental conditions / stimuli / chemistry / regulation / etc. Diagnostic / prognostic biomarker for human (or other) sample outcomes. Microarrays were originally developed for sequencing. Array one or more known sequences per gene in preassigned locations. Extract RNA from condition(s) of interest, reverse transcribe with fluorescent label. Quantify the amount of transcript using fluorescence (i.e. binding) at each location. Can be either one or two channel depending on control strategy (internal vs. external) and quantification. QC issues are educational for essentially all 'omics assays, can be at probe, gene (probeset), or experiment level. Spots must be located correctly, local background removed, spatial defects handled, batches normalized... Identify probes, genes, or experiments that should just be tossed. Then normalize what's left, and optionally re-impute remaining missing values. Normalization of expression data often uses MA plots: Y axis = M = ratio (red / green for two channel, current / median for single) X axis = A = average (red + green, or current + median) You want something that's vertically centered and straight (no systematic or intensity-dependent bias). Median centering, quantile normalization, and LOWESS smoothing adjust plots. Median is global, quantile/LOWESS include local elements - ensure similar distributions among samples. (GC)RMA extends these to a hierarchical model including per-probe, -gene, -experiment, and -nucleotide. RNA-seq is a mashup of DNA sequencing methods + QC with single channel microarray analysis. Transcript levels measured by read counts rather than fluorescence intensity. Tricky part during mapping is dealing with gaps to cross exon junctions; also resolve ambiguous isoforms. De novo assembly methods also exist, which extend DNA assembly techniques to allow multiple isoforms. Single cell RNA-seq (scRNA-seq) replaces "experiments" with single cells, prone to noise / zeros / missing values. Differences are mostly in data generation to deal with extremely low input biomass. Allows analysis of heterogeneity / cell types / etc. but requires dealing with sparse, noisy, low count data. Data live in GEO, ArrayExpress, Gene Expression Atlas, GDC Data Portal, BioGPS, and others. Textbooks Expression databases: Pevsner, Chapter 2 p27-28 RNA biology: Pevsner, Chapter 10 p433-470 Expression QC: Pevsner, Chapter 11 p479-498 Literature Widespread aneuploidy revealed by DNA microarray expression profiling. Hughes, Nature Genetics 2000 A survey of best practices for RNA-seq data analysis. Conesa, Genome Biology 2016 Design and computational analysis of single-cell RNA-sequencing experiments. Bacher, Genome Biology 2016