Microarrays - Computational Bioscience Program

Microarrays
Tzu Lip Phang, Ph.D.
Associate Professor of Bioinformatics
Division of Pulmonary Sciences and Critical Care Medicine
University of Colorado School of Medicine
[email protected]

Lawrence Hunter, Ph.D.
Director, Computational Bioscience Program
University of Colorado School of Medicine
[email protected]
http://compbio.uchsc.edu/Hunter
Data Science
AKA BIG DATA
The Devil Is in the Details
Workshop
The Central Dogma
Genome
Transcriptome
Microarrays in the Literature
[Figure: number of papers per year; y-axis "Number of papers" (0 to 7000), x-axis "Year"]
Microarray: Primer
Basic Statistical Analysis
Power Analysis
• How many biological replicates?
• In my experience: at least 3, preferably 5, or even 7
• Bioconductor: SSPA
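The replicate question above can be roughed out with a textbook sample-size formula. The sketch below is not the method used by the Bioconductor SSPA package; it is a normal-approximation calculation for a single two-sided, two-sample comparison, and the effect sizes fed to it are hypothetical. For very small groups the approximation tends to underestimate the number needed, and it does not account for testing many genes at once.

```python
import math
from statistics import NormalDist


def replicates_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate replicates per group for a two-sided, two-sample comparison.

    effect_size is the expected difference in means divided by the
    standard deviation (a hypothetical input the analyst must supply).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Larger expected effects need fewer replicates.
for d in (1.0, 2.0, 3.0):
    print(d, replicates_per_group(d))
```

Under these assumptions, only large effects (two standard deviations or more) are detectable with a handful of replicates, which is in line with the "at least 3, preferably 5" rule of thumb.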
Basic Statistical Analysis
QC
• Includes image analysis, normalization, and data transformation
• Data normalization:
– Remove systematic errors introduced in labeling, hybridization, and scanning procedures
– Correct these errors while preserving biological variability / information
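As one illustration of removing a systematic error, the sketch below log2-transforms each array and subtracts the array's median. The slides do not say which normalization method is used, so median centering is an assumption chosen for simplicity, and the intensities are made up: the second array is the first one scanned twice as bright.

```python
import math
from statistics import median


def log2_median_center(intensities):
    """Log2-transform one array's raw intensities and subtract the array median."""
    logged = [math.log2(x) for x in intensities]
    center = median(logged)
    return [value - center for value in logged]


# Hypothetical raw intensities: array_b is array_a with a 2x scanner gain.
array_a = [100.0, 200.0, 400.0, 800.0, 1600.0]
array_b = [2 * x for x in array_a]

norm_a = log2_median_center(array_a)
norm_b = log2_median_center(array_b)
print(norm_a)
print(norm_b)
```

After centering, the two arrays agree up to floating-point error: the array-wide brightness difference is gone, while the gene-to-gene differences within each array are kept.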
Why normalization?
To normalize or not to …
Basic Statistical Analysis
Statistical Testing
• Hypothesis testing: are the means of two groups different from each other?
– Fold change
– Student's t-test
Student's t-Test
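Both quantities can be computed for a single gene in a few lines. This is a minimal sketch assuming the values are already log2 expression levels, so the log2 fold change is simply the difference in group means; the sample values are hypothetical. Only the t statistic and degrees of freedom are computed here, since turning them into a p-value needs the t distribution, which is not in the Python standard library.

```python
import math
from statistics import mean, variance


def log2_fold_change(control, treated):
    """Difference in mean log2 expression (treated minus control)."""
    return mean(treated) - mean(control)


def student_t(control, treated):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n1, n2 = len(control), len(treated)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(treated) - mean(control)) / standard_error, df


# Hypothetical log2 expression values for one gene, three replicates per group.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]

lfc = log2_fold_change(control, treated)
t_stat, df = student_t(control, treated)
print(lfc, round(t_stat, 3), df)
```

Fold change looks only at the size of the difference, while the t statistic also scales it by the within-group variability and the number of replicates.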
What Is Multiple Comparison Testing?
Alpha level = 0.05

Genes      P-value    <=    Critical level    H0
Gene 1     0.0001     <=    0.05              1
Gene 2     0.0002     <=    0.05              1
Gene 3     0.008      <=    0.05              1
Gene 4     0.009      <=    0.05              1
Gene 5     0.005      <=    0.05              1
Gene 6     0.09       <=    0.05              0
Gene 7     0.05       <=    0.05              0
Gene 8     0.09       <=    0.05              0
Gene 9     0.2        <=    0.05              0
Gene 10    0.3        <=    0.05              0
When There Is a Large Number of Tests …

Alpha level = 0.05, so up to 0.05 × 1000 = 50 wrong genes are expected by chance …

Genes        P-value    <=    Critical level    H0
Gene 1       0.0001     <=    0.05              1
Gene 2       0.0002     <=    0.05              1
Gene 3       0.008      <=    0.05              1
Gene 4       0.009      <=    0.05              1
Gene 5       0.005      <=    0.05              1
Gene 6       0.09       <=    0.05              0
…            …          …     …                 …
…            …          …     …                 …
Gene 999     0.2        <=    0.05              0
Gene 1000    0.3        <=    0.05              0
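The "50 wrong genes" figure comes from multiplying the alpha level by the number of tests. The sketch below also simulates it, under the assumption that none of the 1000 genes is truly different, in which case p-values are uniformly distributed between 0 and 1. The seed and variable names are illustrative only.

```python
import random

ALPHA = 0.05
N_GENES = 1000

# Expected number of false positives if every null hypothesis is true.
expected_false_positives = ALPHA * N_GENES

# Simulate 1000 genes with no real effect: p-values are uniform on [0, 1).
rng = random.Random(0)
null_p_values = [rng.random() for _ in range(N_GENES)]
observed_false_positives = sum(p <= ALPHA for p in null_p_values)

print(expected_false_positives, observed_false_positives)
```

The simulated count varies from run to run with a different seed, but it stays near 50: without correction, dozens of genes pass the 0.05 cutoff even when nothing is going on.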
Correction … Bonferroni

Alpha level = 0.05 / 1000 = 0.00005

Genes        P-value    <=    Critical level    H0
Gene 1       0.0001     <=    0.00005           0
Gene 2       0.0002     <=    0.00005           0
Gene 3       0.008      <=    0.00005           0
Gene 4       0.009      <=    0.00005           0
Gene 5       0.005      <=    0.00005           0
Gene 6       0.09       <=    0.00005           0
…            …          …     0.00005           …
…            …          …     0.00005           …
Gene 999     0.2        <=    0.00005           0
Gene 1000    0.3        <=    0.00005           0
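The Bonferroni arithmetic can be checked directly. The sketch below applies both the uncorrected and the Bonferroni-corrected cutoff to the six p-values that the slide shows explicitly; the elided rows are left out, and the function name is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test critical level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# The six p-values shown explicitly on the slide (Gene 1 to Gene 6).
shown_p_values = [0.0001, 0.0002, 0.008, 0.009, 0.005, 0.09]

alpha, n_tests = 0.05, 1000
threshold = bonferroni_threshold(alpha, n_tests)

uncorrected_calls = [p <= alpha for p in shown_p_values]
corrected_calls = [p <= threshold for p in shown_p_values]

print(threshold, sum(uncorrected_calls), sum(corrected_calls))
```

Five of the six genes pass at 0.05, but none passes at 0.00005, which is why Bonferroni is described as conservative: it controls false positives at the cost of missing real differences.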
Strike the balance …
Most conservative: Bonferroni
In between: False Discovery Rate
Most lenient: no correction
The False Discovery Rate (FDR) of a set of predictions is the expected proportion of false predictions in that set.
Example:
If the algorithm returns 100 genes with a false discovery rate of 0.3, then we should expect about 70 of them to be correct (100 × (1 − 0.3) = 70).
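The slides do not name a specific FDR procedure. A common choice is the Benjamini-Hochberg step-up method, sketched here as an assumption rather than as the author's method, and applied to the ten p-values from the earlier ten-gene table. The last lines repeat the arithmetic of the 100-gene example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per test: True where the test is called significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k whose sorted p-value is at most (k / m) * q.
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            cutoff_rank = rank
    significant = [False] * m
    for index in order[:cutoff_rank]:
        significant[index] = True
    return significant


# The ten p-values from the ten-gene example (Gene 1 to Gene 10).
p_values = [0.0001, 0.0002, 0.008, 0.009, 0.005, 0.09, 0.05, 0.09, 0.2, 0.3]
calls = benjamini_hochberg(p_values, q=0.05)
print([int(c) for c in calls])

# The slide's example: 100 genes returned at an FDR of 0.3.
n_returned, fdr = 100, 0.3
expected_correct = n_returned * (1 - fdr)
print(expected_correct)
```

For these ten p-values the procedure calls Gene 1 through Gene 5 significant, sitting between Bonferroni (which would call fewer) and no correction.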
Put them together
Basic Statistical Analysis
Biological Interpretation