Microarray data analysis
Annamaria Carissimo
Outline




Microarray analysis Pipeline
Practicals:
ArrayExpress
Gene Ontology with the DAVID tool
Gene Set Enrichment Analysis (GSEA)
What is a DNA microarray?

A grid of DNA spots on a substrate
(chip) used to detect complementary
sequences
Monitoring the expression of several thousand
genes at the same time
Hybridization on a chip
Probe Array
Hybridized Array
Detect
Labeled cDNA/RNA
Fluorescent stain (for data acquisition)
Intensity -> how much hybridization occurred for each probe
Zoom in...
What does it look like?
Data flow
Chip scanning -> Image processing -> Intensity files
.CEL (Affymetrix)
.txt (Illumina-Agilent)
DATA ANALYSIS USING OUR PIPELINE
Microarray analysis pipeline
http://microarrayanalysis.tigem.it/index_i.html
Platforms supported




3’ Expression array
Mouse-> MOE430A, Mouse430_2, MG_U74Av2
Human-> HG-U133A, HG-U133A_2, HG-U133_Plus_2
Whole Transcript Expression and Exon array
Mouse-> Mouse Gene 1.0 ST, Mouse Exon 1.0 ST
Human-> Human Gene 1.0 ST, Human Exon 1.0 ST
Agilent GE 4x44
Human and Mouse -> two color and one color
Illumina Bead Chip
Human and Mouse -> WG-6, Ref-8 and HT-12
Affymetrix 3’ microarray





A chip consists of a number of probesets.
Probesets are intended to measure expression for a
specific mRNA
Each probeset is complementary to a target sequence
which is derived from one or more mRNA sequences
Probesets consist of 25mer probe pairs selected from
the target sequence: one Perfect Match (PM) and one
Mismatch (MM) for each chosen target position.
Each chip has a corresponding Chip Description File
(CDF) which (among other things) describes probe
locations and probeset groupings on the chip.
Target sequences and Probes
Example:
 1415771_at:



Description: Mus musculus nucleolin mRNA, complete cds
LocusLink: AF318184.1 (NT sequence is 2412 bp long)
Target Sequence is 129 bp long
11 probe pairs tiling the target sequence
gagaagtcaaccatccaaaactctgtttgtcaaaggtctgtctgaggataccactgaagagaccttaaaagaatcatttgagggctctgttcgtgcaagaatagtcactgatcgggaaactggttctt
Affymetrix probeset
Probe pair:
Perfect match: ctgtctgaggataccactgaagaga
Mismatch: ctgtctgaggattccactgaagaga
Probe pair values -> summarization -> ONE probeset value
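The probe pair above can be reproduced in a few lines. This is a minimal sketch, assuming the usual Affymetrix convention that the Mismatch probe is the Perfect Match 25mer with its middle (13th) base replaced by the complementary base. The name `mismatch_probe` is hypothetical and is not part of the pipeline.

```python
# Complement of each DNA base (lower case, as in the sequences above).
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def mismatch_probe(perfect_match: str) -> str:
    """Return the MM probe: the PM 25mer with its 13th base complemented."""
    if len(perfect_match) != 25:
        raise ValueError("expected a 25mer probe")
    middle = 12  # zero-based index of the 13th base
    return (
        perfect_match[:middle]
        + COMPLEMENT[perfect_match[middle]]
        + perfect_match[middle + 1:]
    )


pm = "ctgtctgaggataccactgaagaga"  # Perfect Match probe from the slide
mm = mismatch_probe(pm)
print(pm)
print(mm)
```

The single changed base is meant to disrupt specific hybridization, so the MM intensity acts as an estimate of non-specific binding for the same position.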
Background correction and Normalization
Compare different samples on different microarray chips
Example
Control: Sample1 - Sample2 - Sample3 (replicates)
Treatment: Sample1 - Sample2 - Sample3 (replicates)
Normalize all together
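To make "normalize all together" concrete, here is a minimal sketch of quantile normalization, the between-chip normalization used by RMA. The slide does not say which method is applied at this point, so the choice of method is an assumption. The intensities are invented, and ties are broken by position, which is a simplification.

```python
def quantile_normalize(arrays):
    """Give every chip the same intensity distribution.

    arrays: list of equal-length lists, one list of intensities per chip.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all chips must have the same number of probes")

    # Reference distribution: mean of the k-th smallest value across chips.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(col) / len(arrays) for col in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity on this chip.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


chips = [
    [5.0, 2.0, 3.0, 4.0],  # hypothetical control chip
    [4.0, 1.0, 4.5, 2.0],  # hypothetical control chip
    [3.0, 4.0, 6.0, 8.0],  # hypothetical treatment chip
]
normalized = quantile_normalize(chips)
```

Control and treatment chips go through the function in one call, so all of them end up on a common scale before any comparison is made.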
Differential expression
We want to compare two biologically different conditions
through the identification of differentially expressed genes
Example
Control: Sample1 - Sample2 - Sample3 (replicates)
Treatment: Sample1 - Sample2 - Sample3 (replicates)
T-test for each gene
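A per-gene test statistic can be sketched as follows. This assumes an unpaired design and uses Welch's form of the t statistic (unequal variances), which is one common choice and not necessarily the one used by the pipeline. Gene names and log2 expression values are invented.

```python
import math
from statistics import mean, variance


def welch_t(control, treatment):
    """Welch's t statistic for one gene (treatment minus control).

    Needs at least two replicates per group and non-zero variance.
    """
    standard_error = math.sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    return (mean(treatment) - mean(control)) / standard_error


# Hypothetical data: gene -> (control replicates, treatment replicates).
expression = {
    "geneA": ([7.0, 7.2, 6.8], [9.0, 9.1, 8.9]),
    "geneB": ([5.0, 5.5, 4.5], [5.1, 4.9, 5.3]),
}
t_stats = {gene: welch_t(c, t) for gene, (c, t) in expression.items()}
for gene, t in t_stats.items():
    print(gene, round(t, 2))
```

Turning each statistic into a p-value requires the t distribution. With SciPy available, `scipy.stats.ttest_ind(treatment, control, equal_var=False)` returns both the statistic and the p-value.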
Processing Microarray data
(from .CEL files to gene expression)



Background correction
Normalization
Expression summary
Microarray Suite 5.0 (MAS5)
(Affymetrix proprietary method)
Robust Multi-array Average (RMA)
(Irizarry et al., 2003)
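For the expression summary step, RMA fits a robust additive model (probe effect plus array effect) to the log2 PM intensities with Tukey's median polish. The sketch below covers only that summarization step and assumes background correction and normalization have already been applied. The function name and the values are illustrative.

```python
from statistics import median


def median_polish_summary(log_pm, n_iter=10):
    """Summarize one probeset with Tukey's median polish.

    log_pm: rows are probes, columns are arrays (log2 PM intensities).
    Returns one expression value per array: overall level + array effect.
    """
    resid = [list(row) for row in log_pm]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows  # probe effects
    col_eff = [0.0] * n_cols  # array (sample) effects

    for _ in range(n_iter):
        # Sweep probe (row) medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            resid[i] = [x - m for x in resid[i]]
            row_eff[i] += m
        m = median(col_eff)
        col_eff = [c - m for c in col_eff]
        overall += m

        # Sweep array (column) medians out of the residuals.
        for j in range(n_cols):
            m = median([resid[i][j] for i in range(n_rows)])
            for i in range(n_rows):
                resid[i][j] -= m
            col_eff[j] += m
        m = median(row_eff)
        row_eff = [r - m for r in row_eff]
        overall += m

    return [overall + c for c in col_eff]


# Three probes on three arrays; the value 20.0 is a deliberate outlier.
probeset = [
    [5.0, 6.0, 8.0],
    [6.0, 7.0, 9.0],
    [20.0, 5.0, 7.0],
]
summary = median_polish_summary(probeset)
```

Because medians are used instead of means, a single aberrant probe has little influence on the probeset value, which is the "robust" part of the method's name.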
 Identifying significantly differentially expressed genes in treatment versus control
Bayesian t-test (Cyber-T tool) – Multiple testing correction -> False discovery rate (FDR)
Paired or unpaired design?
 Output is a text file (can be opened in Excel) with the results of the analysis.
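The multiple testing correction can be illustrated with the Benjamini-Hochberg procedure, the most common way to control the FDR. The slides name the FDR but not the exact procedure, so this choice is an assumption, and the Bayesian t-test of Cyber-T is not reproduced here. The p-values are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p = [0.01, 0.04, 0.03, 0.20]  # hypothetical per-gene p-values
q = benjamini_hochberg(p)
# Indices of genes called significant at an FDR threshold of 5%.
significant = [i for i, value in enumerate(q) if value <= 0.05]
```

With several thousand genes tested at once, filtering on the adjusted values rather than on the raw p-values keeps the expected share of false positives among the reported genes below the chosen threshold.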
Microarray Pipeline - step 1
upload your .CEL files

On Mac:
On Windows: