Analysis of Microarray Data Using R
Microarray Data Analysis Using R
Studies in Tissue Databases
Mark Reimers, NCI

Outline
The GNF tissue database
Exploratory analysis: clustering
Positional co-regulation
Insight via co-regulation
Apoptotic configuration of tissues
Probe-level analysis

The GNF Expression Atlas
Su et al. (PNAS 2004) hybridized 150 samples from 61 tissues to Affymetrix U133A and custom arrays
Variation in gene expression (as proportion of transcriptome):
95% show at least one 2-fold change among 61 tissues
37% show more than 2-fold differences between lowest 10% and highest 10%

Clustering Samples
All biological replicates are nearest neighbors
Dendrogram reflects discrepancy between healthy and cancerous

Co-regulation of Nearby Genes
Some groups of genes next to one another on a chromosome show high correlation across tissues

Significance of Co-regulation
How often would such correlations happen 'by chance', e.g. by selecting genes at random?
Three random measures would have correlation greater than 0.6 with p < 10^-20!
However, 3 genes selected at random from the atlas have probability ~10^-3 of having all correlations > 0.6
156 regions of high correlation determined
In 30,000 positions, we should see about 30 by chance (30,000 x 10^-3)
Many are paralogs
Perhaps 50% false discovery rate among the rest

Prediction of Function
Zhang et al. (J. Biol, 2004, 3:21) hybridized 55 mouse tissues to spotted oligo arrays
Hypothesis: genes with similar tissue expression patterns share similar function
Able to recover prediction of GO biological process for known genes with better than 50% accuracy for many categories
Extended prediction to 1,092 uncharacterized transcripts

Investigation of a Poorly Characterized Gene: Top1MT
10-fold variation in expression (odd for a 'housekeeping gene')
>50 genes with expression highly correlated (0.75) with Top1MT across the tissue database
A large proportion are splicing factors
Top1MT has an odd splice junction in intron 1, and may depend critically on abundant splicing factors

Apoptosis Patterns
Majority of epithelial tissues show a common pattern (indisposed to apoptosis)
Blood cells show a variety of patterns

Exploration of Probe Sets
Examine correlation of probe sets across 150 samples
All but one probe verified to match latest Unigene build for gene
Probes organized by position in 3' end
Red: 1; White: < 0

Quality of Arrays
Regional bias images
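
The reasoning on the "Significance of Co-regulation" slide can be sketched in code. The slides contain no code, so what follows is only an illustration in Python (standard library only), not the author's method. It assumes the atlas is held as a dictionary mapping a gene name to its list of expression values across samples. The function names (`pearson`, `null_probability`, `expected_by_chance`) and the random-sampling scheme are hypothetical choices made for this sketch.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation; treat as 0
    return sxy / (sxx * syy) ** 0.5


def all_pairs_above(profiles, threshold=0.6):
    """True if every pair of profiles correlates above the threshold."""
    return all(pearson(a, b) > threshold for a, b in combinations(profiles, 2))


def null_probability(expression, group_size=3, threshold=0.6,
                     trials=2000, seed=0):
    """Estimate how often `group_size` genes drawn at random from the
    dataset have all pairwise correlations above `threshold`.

    `expression` is assumed to map gene name -> list of values per sample.
    """
    rng = random.Random(seed)
    genes = list(expression)
    hits = 0
    for _ in range(trials):
        chosen = rng.sample(genes, group_size)
        if all_pairs_above([expression[g] for g in chosen], threshold):
            hits += 1
    return hits / trials


def expected_by_chance(p_null, n_positions):
    """Expected number of chance hits when scanning n_positions windows."""
    return p_null * n_positions
```

With the slide's figures, `expected_by_chance(1e-3, 30000)` gives about 30 regions expected by chance, to be set against the 156 regions reported. Resolving a probability as small as 10^-3 by sampling would need far more than the default 2000 trials.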
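
The two variation figures on "The GNF Expression Atlas" slide (95% and 37%) could be computed along the following lines. This is a minimal sketch under stated assumptions. Expression values are taken to be positive and on a linear scale. The "lowest 10% versus highest 10%" comparison is interpreted here as the ratio of the mean of the top tenth of values to the mean of the bottom tenth, which is one possible reading of the slide. The function names are hypothetical.

```python
def max_fold_change(values):
    """Largest fold change between any two samples (max over min)."""
    return max(values) / min(values)


def decile_fold_change(values):
    """Ratio of the mean of the highest 10% of values to the mean of the
    lowest 10% (assumed interpretation; at least one value per tail)."""
    ordered = sorted(values)
    k = max(1, len(ordered) // 10)
    low = sum(ordered[:k]) / k
    high = sum(ordered[-k:]) / k
    return high / low


def fraction_varying(expression, fold=2.0):
    """Return two fractions of genes: those with at least one `fold`-fold
    change among samples, and those whose decile fold change exceeds `fold`.

    `expression` is assumed to map gene name -> list of positive values.
    """
    n = len(expression)
    any_fold = sum(1 for v in expression.values()
                   if max_fold_change(v) >= fold)
    decile = sum(1 for v in expression.values()
                 if decile_fold_change(v) > fold)
    return any_fold / n, decile / n
```

The decile-based measure is less sensitive to a single outlying sample than the max-over-min measure, which fits the lower figure the slide reports for it.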