Biological Modelling Gene Expression Data

cDNA Microarrays
Neil Lawrence
Schedule
• Today: Introduction and Background
• 18th April: Introduction and Background
• 25th April: cDNA Microarrays
• 2nd May: No Lecture
• 9th May: Affymetrix GeneChips
• 16th May: Guest Lecturer – Dr Pen Rashbass
• 23rd May: Analysis methods
Review - Gene Expression
• The amount of mRNA produced from a
particular gene.
• Different genes are expressed as a result of
– Cell differentiation
– Environment changes
DNA Microarrays
• New technology for measuring gene
expression.
• Two main types.
– cDNA microarrays (Synteni/Stanford).
– GeneChip® (Affymetrix)
Affymetrix vs cDNA
• Technology differs in:
• How DNA sequences are laid down
– spotting vs photolithography
• Length of DNA sequences that are laid
down
– Full sequences vs partial sequences.
cDNA Chips
• Today we focus on cDNA arrays.
• Advantages:
– Don’t need sequence of gene.
– Provide DNA library.
– Can make your own chips.
• Disadvantages
– Larger arrays.
– Requires image processing.
cDNA Microarrays
• Lay down a full sequence of the gene
– Typically 1000s of base pairs long.
• An arrayer lays down the spots
– A spot of the gene's DNA is placed on a glass slide
or nylon membrane.
Array Spotting
[Figure: array spotting. Labels: "DNA in solution", "Different solutions"]
How is the DNA obtained?
• DNA Libraries
Biological Sample
• The total mRNA from two different samples is
extracted.
• These two samples may be from a wild-type
and a mutant organism.
• This mRNA is reverse transcribed to produce
cDNA (complementary DNA).
• Achieved using an enzyme - reverse
transcriptase.
Sample Preparation
• These cDNA pools should represent all the transcribed
genes within each sample.
• The number of copies of each cDNA should be proportional
to the relative abundance of mRNA in each sample.
• Reverse transcription occurs in the presence of
fluorescently tagged nucleotides.
• Each cDNA pool is labelled with different fluorescent
dyes (or fluorochromes) usually Cy3 (visualised as
green) and Cy5 (visualised as red).
• Incorporation of these dyes can be different.
Sample Hybridisation
• Labelling should occur so that the amount of
each fluorochrome attached to a given cDNA
will be proportional to the relative abundance
of the gene in the sample.
• The labelled cDNA pools are mixed together and
hybridised onto the slide.
• Set conditions so that cDNA in the mixed pool
binds to its corresponding spot on the array,
but will not bind to the slide itself or to other
spots.
Scanning the Slide
• Lasers with different wavelengths are used to excite
each dye (Cy3 and Cy5).
• These lasers emit the appropriate wavelength and use a
filter system to stop ‘bleed through’ between the two
channels.
• The amount of dye present should be proportional to
the gene expression level of the cell.
• The overall intensity of signal from each dye should be
proportional to the gene expression level within each
sample.
Processed Microarray
Slide With Grid Layout
• Often each grid is from a different print-tip.
Interpretation of Signals
• As a result:
– Predominantly green signal - gene is more abundant
in the Cy3 labelled sample.
– Predominantly red signal - gene is more abundant in
the Cy5 labelled sample.
– Yellow signal - gene is equally abundant in both
samples.
• So the fluorescence intensity and colour of each spot
provide information on abundance and
relative expression levels.
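The colour rules above can be sketched as a threshold on the log-ratio of the two channels. This is a minimal illustration, not the lecture's method: the function name `spot_colour`, the two-fold `threshold` and the intensity values are all assumptions made for the example.

```python
import math

def spot_colour(cy5_red, cy3_green, threshold=1.0):
    """Classify a spot from its two channel intensities.

    threshold is a hypothetical cut-off on |log2(R/G)|;
    1.0 corresponds to a two-fold difference between the samples.
    """
    if cy5_red <= 0 or cy3_green <= 0:
        raise ValueError("intensities must be positive")
    log_ratio = math.log2(cy5_red / cy3_green)
    if log_ratio > threshold:
        return "red"      # more abundant in the Cy5-labelled sample
    if log_ratio < -threshold:
        return "green"    # more abundant in the Cy3-labelled sample
    return "yellow"       # roughly equal abundance in both samples

# Made-up (red, green) intensities for three spots.
spots = {
    "gene_a": (4000.0, 500.0),
    "gene_b": (300.0, 2400.0),
    "gene_c": (1000.0, 1100.0),
}
for name, (red, green) in spots.items():
    print(name, spot_colour(red, green))
```

In practice the ratio is only meaningful after background correction and normalisation, since the two dyes are not incorporated or detected equally.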
Image Processing - ScanAlyze
[Figure: ScanAlyze screenshot. Label: "dust spot"]
Image Processing
• In the ML Group we try to obtain
uncertainty estimates.
• We do this using Bayesian statistics.
• Bayesian demo
Results Loaded into ScanAlyze
Biological Noise
• Biological noise: expression levels depend on
– Intrinsic intracellular factors (the stage of the cell cycle).
– Extrinsic factors (signals from other cells).
• Expression level depends on a complex interaction of activators
and inhibitors.
• Even in tissue culture dishes (with uniform cells) variation of
RNA levels within each cell can occur.
• The overall expression level is taken from RNA collected from
a pool of cells.
• It will be a combination of all the transcripts of each cell. Our
understanding of these processes is incomplete, so the
fluctuations often appear to be random and are thus called
biological noise.
Experimental Noise
• Problems with initial spotting on the arrays, e.g. not all spots present or the
amount spotted is not the same across all spots.
• Efficiency of dye incorporation into cDNA.
• Efficiency of hybridisation
– Differences in the dye properties (how the dye affects hybridisation with the cDNA
strand).
– How effectively the cDNA samples bind to the elements (spots).
– How effectively the background level is reduced by washing.
– Whether there are any tidemarks left after washing and drying.
• Dirt/dust on the slide.
• Differences arising from the scanning process.
• Image processing noise: if the image is incorrectly processed, i.e. spot
locations are not properly specified, the intensities that are extracted will not
truly reflect the expression levels. This variation can be random or systematic.
Normalisation
• During the array preparations technical variations can
occur.
―Dye properties.
―Differences in dye incorporation.
―Differences in scanning.
• Remove these variations.
―Balance the fluorescent intensities of the dyes.
―Allows comparison of expression levels across
experiments (arrays).
Global Normalisation
• Global Normalisation methods assume
the two dyes are related by a constant
factor
• Taking logs
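The constant-factor assumption is commonly written R = kG, where R and G are the Cy5 and Cy3 intensities of a spot. Taking logs turns the factor into a shift, so every log-ratio is corrected as log2(R/G) − c with c = log2 k. The sketch below estimates c as the median log-ratio across the array. That estimator is an assumption, not something the slides state: it relies on most genes not being differentially expressed. The function name and data are illustrative.

```python
import math
import statistics

def global_normalise(red, green):
    """Return (normalised log-ratios, estimated shift c = log2 k)."""
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    # Assumed estimator: the median log-ratio stands in for log2 k.
    c = statistics.median(log_ratios)
    return [m - c for m in log_ratios], c

# Made-up intensities: red is twice as bright throughout (k = 2),
# except for the last spot, which is genuinely up-regulated.
red = [200.0, 400.0, 800.0, 1600.0, 6400.0]
green = [100.0, 200.0, 400.0, 800.0, 800.0]

normalised, c = global_normalise(red, green)
print("estimated log2 k:", c)
print("normalised log-ratios:", normalised)
```

After the shift, spots that differ only by the dye factor sit at a log-ratio of zero, and the differentially expressed spot stands out.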
Local Normalisations
• The dye factor is dependent on:
―Spot intensity (A = √(RG)).
―Location on the array.
• Local normalisation methods:
– Intensity dependent.
– Print-tip group.
Intensity dependent
• Visualise the effect: M-A plot
• Correction of the intensity-dependent variations:
print-tip effect
Picture from: http://www.stat.berkeley.edu/users/terry/zarray/Html/index.html
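An M-A plot shows the log-ratio M = log2(R/G) against the mean log-intensity A = log2 √(RG) for every spot. Intensity-dependent normalisation subtracts an estimate of the trend of M as a function of A. That trend is usually estimated with a lowess/loess smoother. The sketch below swaps in a much cruder running median over spots with similar A, so that it needs only the standard library. The function names, the `window` parameter and the data are assumptions for illustration.

```python
import math
import statistics

def ma_values(red, green):
    """M is the log-ratio, A the mean log-intensity of each spot."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a

def intensity_normalise(m, a, window=5):
    """Subtract a running median of M taken over spots with similar A.

    A simple stand-in for the lowess fit normally used here.
    """
    order = sorted(range(len(a)), key=lambda i: a[i])  # spot indices sorted by A
    half = window // 2
    corrected = [0.0] * len(m)
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        trend = statistics.median(m[j] for j in order[lo:hi])
        corrected[i] = m[i] - trend
    return corrected

# Synthetic spots whose dye bias grows with intensity.
green = [100.0 * 2 ** i for i in range(8)]
red = [g * 2 ** (0.2 * i) for i, g in enumerate(green)]

m, a = ma_values(red, green)
print([round(x, 2) for x in intensity_normalise(m, a, window=3)])
```

The window width plays the role of the smoother's span: a wider window gives a smoother trend but follows sharp intensity-dependent effects less closely.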
Print-tip Group
• Different experiments may use different
printing set-up:
– Layout of the tips in the print-head of the arrayer.
– Differences in the length or opening of the tips.
– Deformation.
• Print-tip normalisation is simply
(print-tip + A)-dependent normalisation.
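One way to read "(print-tip + A)-dependent" is that the intensity-dependent correction is fitted separately within each print-tip group. The sketch below does this with a straight-line least-squares fit of M on A per group, a deliberately simple stand-in for the per-group lowess curve that is normally used. The function names, the `tip` labels and the numbers are assumptions for illustration.

```python
from collections import defaultdict

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return mean_y, 0.0  # all A equal: fall back to a constant shift
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def print_tip_normalise(m, a, tip):
    """Correct M within each print-tip group using that group's own M-versus-A trend."""
    groups = defaultdict(list)
    for i, t in enumerate(tip):
        groups[t].append(i)
    corrected = [0.0] * len(m)
    for indices in groups.values():
        intercept, slope = fit_line([a[i] for i in indices],
                                    [m[i] for i in indices])
        for i in indices:
            corrected[i] = m[i] - (intercept + slope * a[i])
    return corrected

# Synthetic spots from two print-tips with different dye biases:
# tip 1 has a constant offset, tip 2 a bias that grows with A.
a = [6.0, 8.0, 10.0, 6.0, 8.0, 10.0]
m = [0.5, 0.5, 0.5, -0.2, 0.0, 0.2]
tip = [1, 1, 1, 2, 2, 2]
print(print_tip_normalise(m, a, tip))
```

Fitting each group separately removes both the intensity trend and any spatial offset tied to a particular tip, at the cost of having fewer spots behind each fit.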
Conclusions
• We’ve reviewed cDNA chips.
• Key issues:
– Hybridization process.
– Noise Mechanisms.
– Normalisation.