BST281: Genomic Data Manipulation, Spring 2017
Monday 10: Gene expression assays
Gene expression or transcriptional activity provides a global snapshot of molecular dynamics.
Proteins and metabolites are hard to measure, but RNA provides a more uniform intermediate.
Transcriptional measurements provide the ability to:
Associate genes with biological processes / environmental conditions / stimuli / chemistry / regulation / etc.
Derive diagnostic / prognostic biomarkers for human (or other) sample outcomes.
Microarrays were originally developed for sequencing.
Array one or more known sequences per gene in preassigned locations.
Extract RNA from condition(s) of interest, reverse transcribe with fluorescent label.
Quantify the amount of transcript using fluorescence (i.e. binding) at each location.
Can be either one or two channel depending on control strategy (internal vs. external) and quantification.
QC issues are instructive for essentially all 'omics assays and can arise at the probe, gene (probeset), or experiment level.
Spots must be located correctly, local background removed, spatial defects handled, batches normalized...
Identify probes, genes, or experiments that should just be tossed.
Then normalize what's left, and optionally re-impute remaining missing values.
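The toss-then-impute step above can be sketched as follows. This is a minimal illustration, not the course's method: the function name `filter_and_impute`, the 50% missingness threshold, and the use of row-mean imputation are all assumptions (nearest-neighbor imputation is a common alternative in practice).

```python
def filter_and_impute(matrix, max_missing_fraction=0.5):
    """Drop rows (probes or genes) with too many missing values,
    then fill the remaining gaps with that row's mean.

    matrix: list of rows, one per probe; missing values are None.
    max_missing_fraction: hypothetical cutoff, chosen for illustration.
    """
    kept = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        missing_fraction = 1 - len(observed) / len(row)
        if not observed or missing_fraction > max_missing_fraction:
            continue  # this probe is tossed entirely
        row_mean = sum(observed) / len(observed)
        kept.append([row_mean if v is None else v for v in row])
    return kept


if __name__ == "__main__":
    toy = [
        [1.0, None, 3.0, 2.0],    # one gap: kept and imputed
        [None, None, None, 5.0],  # mostly missing: tossed
        [4.0, 4.0, 4.0, 4.0],     # complete: kept unchanged
    ]
    print(filter_and_impute(toy))
```

The same pattern applies column-wise if whole experiments, rather than probes, are the unit being discarded.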
Normalization of expression data often uses MA plots:
Y axis = M = log ratio (log2 of red / green for two channel, log2 of current / median for single)
X axis = A = average log intensity (mean of log2 red and log2 green, or of log2 current and log2 median)
You want something that's vertically centered and straight (no systematic or intensity-dependent bias).
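As a small sketch of how the two plot coordinates are usually computed for a two-channel array, using the conventional log2 definitions of M and A. The function name `ma_values` and the toy intensities are assumptions for illustration; intensities are assumed to be positive (already background-corrected).

```python
import math


def ma_values(red, green):
    """Return (M, A) for paired two-channel intensities.

    M = log2(red / green)                   log ratio, plotted on Y
    A = 0.5 * (log2(red) + log2(green))     mean log intensity, plotted on X
    """
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


if __name__ == "__main__":
    red = [8.0, 4.0, 16.0]
    green = [2.0, 4.0, 16.0]
    m, a = ma_values(red, green)
    for m_i, a_i in zip(m, a):
        print(f"A={a_i:.2f}  M={m_i:.2f}")
```

For a single-channel array the same function can be called with the current sample as `red` and the per-probe median across samples as `green`.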
Median centering, quantile normalization, and LOWESS smoothing adjust plots.
Median centering is a global adjustment; quantile normalization and LOWESS include local elements. All aim to ensure similar distributions among samples.
(GC)RMA extends these to a hierarchical model including per-probe, -gene, -experiment, and -nucleotide effects.
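Two of the adjustments named above, median centering and quantile normalization, are simple enough to sketch directly. This is an illustrative pure-Python version under stated assumptions: the function names are hypothetical, values are assumed to be on a log scale already, and ties are broken by position rather than averaged as fuller implementations typically do. LOWESS and (GC)RMA are not shown.

```python
from statistics import median


def median_center(values):
    """Global adjustment: shift one sample so that its median is zero."""
    med = median(values)
    return [v - med for v in values]


def quantile_normalize(samples):
    """Force every sample to share the same distribution.

    samples: list of equal-length lists, one list per sample.
    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are broken by original position.
    """
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[i] for s in sorted_samples) / len(samples) for i in range(n)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    print(median_center([1.0, 2.0, 6.0]))
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After quantile normalization every sample contains exactly the same set of values, only in a different order, which is what "similar distributions among samples" means in its strictest form.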
RNA-seq is a mashup of DNA sequencing methods + QC with single channel microarray analysis.
Transcript levels measured by read counts rather than fluorescence intensity.
The tricky part of mapping is handling gapped alignments that cross exon junctions; ambiguous isoforms must also be resolved.
De novo assembly methods also exist, which extend DNA assembly techniques to allow multiple isoforms.
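Because RNA-seq transcript levels are read counts, they are usually rescaled before samples are compared. The notes do not name a specific unit; the sketch below assumes two widely used ones, counts per million (CPM) and transcripts per million (TPM). The function names and toy numbers are illustrative assumptions.

```python
def counts_per_million(counts):
    """Scale raw read counts by sequencing depth (total mapped reads)."""
    total = sum(counts)
    return [c * 1e6 / total for c in counts]


def transcripts_per_million(counts, lengths_kb):
    """Correct for transcript length first, then for depth.

    lengths_kb: transcript (or gene) lengths in kilobases, same order
    as counts. Longer transcripts collect more reads at equal abundance.
    """
    rates = [c / length for c, length in zip(counts, lengths_kb)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]


if __name__ == "__main__":
    counts = [10, 30, 60]
    lengths_kb = [1.0, 3.0, 2.0]
    print(counts_per_million(counts))
    print(transcripts_per_million(counts, lengths_kb))
```

Both functions return values that sum to one million within a sample, so they describe relative rather than absolute abundance.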
Single cell RNA-seq (scRNA-seq) replaces "experiments" with single cells and is prone to noise / zeros / missing values.
Differences are mostly in data generation to deal with extremely low input biomass.
Allows analysis of heterogeneity / cell types / etc. but requires dealing with sparse, noisy, low count data.
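One routine way to cope with sparse, low-count data is to measure how sparse each cell is and drop cells with too few detected genes. The sketch below is a hedged illustration only: the function names, the dictionary layout, and the threshold are assumptions, not a prescribed pipeline.

```python
def zero_fraction(counts):
    """Fraction of genes with zero reads in one cell."""
    return sum(1 for c in counts if c == 0) / len(counts)


def filter_cells(cells, min_detected_genes):
    """Keep cells in which at least min_detected_genes have nonzero counts.

    cells: dict mapping a cell identifier to its list of per-gene counts.
    min_detected_genes: hypothetical cutoff; real thresholds depend on
    the protocol and the tissue.
    """
    return {
        name: counts
        for name, counts in cells.items()
        if sum(1 for c in counts if c > 0) >= min_detected_genes
    }


if __name__ == "__main__":
    cells = {
        "cell_a": [0, 3, 0, 1, 7],
        "cell_b": [0, 0, 0, 0, 2],
        "cell_c": [5, 1, 0, 4, 2],
    }
    for name, counts in cells.items():
        print(name, zero_fraction(counts))
    print(sorted(filter_cells(cells, min_detected_genes=3)))
```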
Data live in GEO, ArrayExpress, Gene Expression Atlas, GDC Data Portal, BioGPS, and others.
Textbooks
Expression databases: Pevsner, Chapter 2, p. 27-28
RNA biology: Pevsner, Chapter 10, p. 433-470
Expression QC: Pevsner, Chapter 11, p. 479-498
Literature
Widespread aneuploidy revealed by DNA microarray expression profiling. Hughes, Nature Genetics 2000
A survey of best practices for RNA-seq data analysis. Conesa, Genome Biology 2016
Design and computational analysis of single-cell RNA-sequencing experiments. Bacher, Genome Biology 2016