Genes can be regulated at many levels
• transcription
• post transcription (RNA stability)
• post transcription (translational control)
• post translation (not considered gene regulation)
Transcription and RNA stability together shape the “transcriptome”.
Usually, when we speak of gene regulation, we are referring to transcriptional regulation.
[Diagram: DNA → (TRANSCRIPTION) → RNA → (TRANSLATION) → PROTEIN]
Genomic analysis of gene expression
• Methods capable of giving a “snapshot” of RNA
expression of all genes
• Can be used as diagnostic profile
– Example: cancer diagnosis
• Can show how RNA levels change during
development, after exposure to stimulus, during
cell cycle, etc.
• Provides large amounts of data
• Can help us start to understand how whole
systems function
DNA microarrays
Microarrays have become incredibly popular since their
inception in 1995 (Schena et al. (1995) Science 270:467-70)
Again, there are now many variations. We’ll take a quick
look at the two basic types: Affymetrix (high density
oligonucleotide) and glass slide (cDNA, long oligo, etc).
Both are conceptually similar, with differences in
manufacture and details of design and analysis.
Benfey and Protopapas, "Genomics" © 2005 Prentice Hall Inc. / A Pearson Education Company / Upper
Saddle River, New Jersey 07458
Comparisons of microarrays
Basics of microarrays
• DNA attached to solid support
– Glass, plastic, or nylon
• RNA is labeled
– Usually indirectly
• Bound DNA is the probe
– Labeled RNA is the “target”
DNA spotting I
• DNA spotting usually uses multiple pins
• DNA in microtiter plate
• DNA usually PCR amplified
• Oligonucleotides can also be spotted
Commercial DNA spotter
Movie of microarray spotting
cDNA microarrays: labeling and hybridization
[Diagram: cell type A and cell type B → extract mRNA → make labeled cDNA → hybridize to microarray. Spot readout: more in “A”, more in “B”, or equal in A & B.]
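The three spot outcomes in the labeling and hybridization diagram (more in “A”, more in “B”, equal in A & B) are commonly read from the log ratio of the two channel intensities. The following is a minimal sketch of that idea, not a method taken from the slides: the function name `classify_spot`, the two-fold cutoff, and the example intensities are all illustrative assumptions.

```python
import math

def classify_spot(intensity_a, intensity_b, threshold=1.0):
    """Classify one spot from its two channel intensities (samples A and B).

    threshold is in log2 units; 1.0 means a two-fold difference. This cutoff
    is an illustrative assumption, not a value from the slides.
    """
    log_ratio = math.log2(intensity_a / intensity_b)
    if log_ratio >= threshold:
        return "more in A"
    if log_ratio <= -threshold:
        return "more in B"
    return "equal in A & B"

# Hypothetical background-corrected intensities: (channel A, channel B)
spots = {
    "gene1": (4000.0, 1000.0),
    "gene2": (500.0, 2000.0),
    "gene3": (1500.0, 1400.0),
}
calls = {gene: classify_spot(a, b) for gene, (a, b) in spots.items()}
```

Working in log2 units makes a two-fold increase (+1) and a two-fold decrease (-1) symmetric around zero, which is why ratios are usually log-transformed before any cutoff is applied.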
cDNA microarrays: key points
• hybridize two samples/chip (i.e., direct comparison of samples)
• non-standardized production can affect reproducibility (i.e., depends a lot on who made them)
• longer sequences can have cross-hybridization with other genes
• don’t necessarily need to know all genes in genome: can use unsequenced ESTs, for instance
• moderate cost (often only $75-$100/array)
How microarrays are made: Affymetrix GeneChips
• Oligonucleotides synthesized on silicon chip
– One base at a time
• Uses process of photolithography
– Developed for printing computer circuits
Affymetrix GeneChips
• Oligonucleotides
– Usually 20–25 bases in length
– 10–20 different oligonucleotides for each gene
• Oligonucleotides for each gene selected by computer program to be the following:
– Unique in genome
– Nonoverlapping
– Composition based on design rules (empirically derived)
Affymetrix: A review
[Figure labels: probe pair; (12-20/gene); mismatch probe cells]
Affymetrix: key points
• can hybridize only one sample/chip (i.e., no direct comparisons of 2 samples)
• standardized production tends to give good reproducibility
• limited amount of probe sequence can be problematic (miss alternative splices, bias toward one end of transcript, dependent on good genome annotation), but can also be helpful in limiting cross-hybridization
• high cost (>$300/array, and remember only 1 sample/array)
Comparison of microarray hybridization
• Spotted microarrays
– Competitive hybridization
– Two labeled cDNAs hybridized to same slide
• Affymetrix GeneChips
– One labeled RNA population per chip
– Comparison made between hybridization intensities of same oligonucleotides on different chips
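Because each GeneChip carries one labeled RNA population, comparing two samples means comparing intensities of the same oligonucleotides across chips, and the chips first have to be put on a common scale. The sketch below assumes the simplest possible approach, global scaling to a common mean; the function names, probe names, and intensities are hypothetical, and real pipelines use more elaborate normalization.

```python
import math
from statistics import mean

def scale_to_reference(chip, reference):
    """Rescale a chip so its mean intensity matches the reference chip's mean."""
    factor = mean(reference.values()) / mean(chip.values())
    return {probe: value * factor for probe, value in chip.items()}

def log2_ratios(chip_a, chip_b):
    """Per-probe log2(A/B) for probes present on both chips."""
    return {p: math.log2(chip_a[p] / chip_b[p]) for p in chip_a if p in chip_b}

# Hypothetical intensities for the same three probes on two chips.
chip_a = {"probe1": 100.0, "probe2": 200.0, "probe3": 300.0}
chip_b = {"probe1": 60.0, "probe2": 90.0, "probe3": 150.0}  # dimmer scan overall

chip_b_scaled = scale_to_reference(chip_b, chip_a)
ratios = log2_ratios(chip_a, chip_b_scaled)
```

Without the scaling step every probe on the dimmer chip would look lower, which reflects the scan rather than the biology.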
Determination of Lymphoma subtypes (Alizadeh et al. (2000) Nature 403:503)
Uses of microarrays
• Gene discovery
- tissue profiles
- time course data
- altered genetic backgrounds
for developmental biology research, this is usually the best use of microarrays
• Comparing tissues/genotypes
there are still some inherent difficulties here
• Classification
there’s a lot of promise in medicine (especially cancer research) for this
Other types and uses of microarrays
• CGH (comparative genomic hybridization)
look at cytogenetic abnormalities
[Figure labels: controls; X; diploid; haploid; diploid]
Other types and uses of microarrays: ChIP-chip
ChIP-chip II
Other types and uses of microarrays
• Protein-binding microarrays (PBM)
determine transcription factor binding sites
• protein arrays, tissue arrays
(not the kind of arrays we’re discussing)
Experimental Design for Microarrays
• technical vs biological replicates
• amplification of RNA
• dye swaps
• reference samples
• closed loop designs
Microarray Experiments
[Figure labels: experimental design; genes × conditions data matrix (condition 1, condition 2, condition 3); statistical processing and analysis]
MIAME (Minimum Information About a Microarray Experiment)
• EXPERIMENT DESIGN
type, factors, number of arrays, reference sample, qc, database accession (ArrayExpress, GEO)
• SAMPLES USED, PREPARATION AND LABELING
• HYBRIDIZATION PROCEDURES AND PARAMETERS
• MEASUREMENT DATA AND SPECIFICATIONS
quantitations, hardware & software used for scanning and analysis, raw measurements, data selection and transformation procedures, final expression data
• ARRAY DESIGN
platform type, features and locations, manufacturing protocols or commercial p/n
Real-time PCR
• Sensitive means of measuring RNA abundance
• Not genomewide: used to verify microarray results
• TaqMan method uses a fluorescently tagged probe
• Fluorescent tag released by the 5′ nuclease activity of Taq polymerase
Real-time PCR readout
• The readout of a real-time PCR reaction is a set of curves
• The curves indicate the PCR cycle at which fluorescence is detected
• With ideal amplification, each cycle doubles the amount of product from the previous cycle
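If product doubles every cycle, a sample whose curve crosses the detection threshold n cycles earlier than another started with roughly 2^n times more template. The sketch below works through that arithmetic under the assumption of perfect doubling; the function name `fold_difference`, the `efficiency` parameter, and the cycle numbers are illustrative, and real experiments also normalize to a reference gene.

```python
def fold_difference(ct_sample, ct_reference, efficiency=2.0):
    """Relative starting template in sample vs reference.

    ct_* are the cycle numbers at which fluorescence crosses the detection
    threshold. efficiency=2.0 assumes perfect doubling per cycle, which is
    an idealization.
    """
    return efficiency ** (ct_reference - ct_sample)

# Hypothetical threshold cycles: the sample is detected 3 cycles earlier.
fold = fold_difference(ct_sample=22.0, ct_reference=25.0)
```

Here three cycles of head start correspond to 2 × 2 × 2 = 8 times more starting template.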
Data Analysis—Don’t try this at home
Microarray analysis is a complex and rapidly evolving
field. Issues include normalization within and among
arrays, limited replication of experiments, and massive
multiple testing (20,000 genes vs 20,000 genes). Each
array platform has its own quirks and requirements.
Although a lot of software packages will do your
analysis for you, working with a true statistician is
highly recommended.
But it’s also important to have a grasp of the basics!
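To see why multiple testing matters: if 20,000 genes are each tested at p < 0.05 and none truly changes, about 1,000 will still pass by chance. One widely used correction is the Benjamini-Hochberg false discovery rate procedure, sketched below. The slides do not name a specific method, so this choice, the function name, and the example p-values are assumptions for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a gene-by-gene test.
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(p_values)
significant = [p <= 0.05 for p in adjusted]
```

In this toy example four genes pass an uncorrected 0.05 cutoff, but only the first two remain significant after adjustment.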
Even “Best Route” has a high false-negative rate: ~25%
Validation of data
There’s no way that all of your microarray data can be validated.
It’s strongly recommended that any key findings be verified by independent means.
Northern blots and quantitative RT-PCR are the typical ways of doing this; real-time, quantitative RT-PCR is generally the method of choice.
Analysis of microarray data
• Microarrays can measure the expression of thousands of genes simultaneously
• Vast amounts of data require computers
• Types of analysis
– Gene-by-gene (method: statistical techniques)
– Categorizing groups of genes (method: clustering algorithms)
– Deducing patterns of gene regulation (method: under development)
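As one concrete instance of the gene-by-gene statistical analysis listed among the types of analysis, each gene can be scored with a two-sample t statistic across replicate arrays. The sketch below uses Welch's t statistic, which is an assumed choice (the slides name no particular test); the gene names and log2 expression values are hypothetical, and converting the statistic to a p-value is left out.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of replicate measurements."""
    var_a, var_b = variance(group_a), variance(group_b)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical log2 expression values: (condition 1 replicates, condition 2 replicates)
expression = {
    "geneX": ([8.1, 8.3, 7.9], [10.0, 10.2, 9.8]),
    "geneY": ([6.0, 6.5, 5.5], [6.1, 5.9, 6.3]),
}
t_stats = {gene: welch_t(a, b) for gene, (a, b) in expression.items()}
# Rank genes by the size of the effect relative to its noise.
ranked = sorted(t_stats, key=lambda g: abs(t_stats[g]), reverse=True)
```

With only three replicates per condition the variance estimates are noisy, which is one reason limited replication is a recurring problem in microarray analysis.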
Data Analysis: Clustering III
Hierarchical clustering
At the beginning, each gene is a cluster. In each subsequent
step, the two closest clusters are merged until only one cluster
remains. There are a few different ways of doing this.
[Figure: expression data matrix; axes: genes, conditions]
• simple and widely used method
• in large clusters, can lose true representation of expression pattern
• cannot go back—early errors become fixed
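The merge procedure described above can be sketched in a few lines. This is a minimal illustration assuming Euclidean distance between expression profiles and average linkage, which is only one of the "few different ways" of defining the closest clusters; the function names and profiles are hypothetical.

```python
from itertools import combinations
from math import dist

def average_linkage(cluster_a, cluster_b, profiles):
    """Mean Euclidean distance over all gene pairs drawn from two clusters."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

def hierarchical_clustering(profiles):
    """Merge the two closest clusters until one remains; return the merge order."""
    clusters = [(gene,) for gene in profiles]  # each gene starts as its own cluster
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], profiles))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # a merge is never undone in later steps
        merges.append((a, b))
    return merges

# Hypothetical expression profiles across three conditions.
profiles = {
    "gene1": [1.0, 2.0, 3.0],
    "gene2": [1.1, 2.1, 2.9],
    "gene3": [3.0, 2.0, 1.0],
    "gene4": [2.9, 1.8, 1.2],
}
merges = hierarchical_clustering(profiles)
```

The recorded merge order is what a dendrogram draws. Because a merged cluster is never split again, a poor early merge carries through every later step, which is the "cannot go back" limitation noted above.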
Data Analysis: Clustering VI
Gene Ontology (www.geneontology.org)
The GO collaborators are developing three structured ontologies that describe
gene products in a species-independent manner.
Molecular function: Molecular function describes activities, such as catalytic or
binding activities, at the molecular level. Examples of broad functional terms are
catalytic activity, transporter activity, or binding; examples of narrower functional
terms are adenylate cyclase activity or Toll receptor binding.
Biological process: A biological process is accomplished by one or more ordered
assemblies of molecular functions. Examples of broad biological process terms are
cell growth and maintenance or signal transduction. Examples of more specific
terms are pyrimidine metabolism or alpha-glucoside transport. It can be difficult
to distinguish between a biological process and a molecular function, but the general
rule is that a process must have more than one distinct step.
Cellular component: A cellular component is just that, a component of a cell but
with the proviso that it is part of some larger object, which may be an anatomical
structure (e.g. rough endoplasmic reticulum or nucleus) or a gene product group
(e.g. ribosome, proteasome or a protein dimer).
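The relationship between broad and narrow terms can be pictured as a graph in which each term points to its broader parents, so a gene annotated to a narrow term is implicitly annotated to every broader term above it. The toy structure below is a sketch only: the term names come from the text above, but the direct parent links are simplified (the real ontology has intermediate terms), and the gene names and annotations are hypothetical.

```python
# Toy ontology: each term records its ontology and its broader parent terms.
# Parent links are simplified for illustration; the real GO graph is larger.
TERMS = {
    "catalytic activity": {"namespace": "molecular_function", "parents": []},
    "adenylate cyclase activity": {"namespace": "molecular_function",
                                   "parents": ["catalytic activity"]},
    "binding": {"namespace": "molecular_function", "parents": []},
    "Toll receptor binding": {"namespace": "molecular_function",
                              "parents": ["binding"]},
    "signal transduction": {"namespace": "biological_process", "parents": []},
    "ribosome": {"namespace": "cellular_component", "parents": []},
}

def ancestors(term):
    """All broader terms reachable from `term` by following parent links."""
    found = set()
    stack = list(TERMS[term]["parents"])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(TERMS[parent]["parents"])
    return found

def genes_annotated_to(term, annotations):
    """Genes annotated to `term` directly or through a narrower term."""
    return {gene for gene, terms in annotations.items()
            if any(t == term or term in ancestors(t) for t in terms)}

# Hypothetical gene annotations.
annotations = {
    "geneA": ["adenylate cyclase activity"],
    "geneB": ["Toll receptor binding"],
    "geneC": ["ribosome"],
}
catalytic_genes = genes_annotated_to("catalytic activity", annotations)
```

Querying the broad term returns geneA even though it was annotated only with the narrower term, which is how GO terms let a cluster of genes be summarized at different levels of detail.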
Gene Ontology: example