Computational Laboratory: aCGH Data Analysis
Feb. 4, 2011
Per Chia-Chin Wu

Today’s Topics
• Review aCGH and its data analysis
• Homework: aCGH data analysis using tools in Genboree and Ruby

Chromosomal Aberrations
REF: Albertson et al.

Array CGH
Label patient DNA with Cy3
Label control DNA with Cy5
Hybridize DNA to genomic clone microarray
Analyze the Cy3/Cy5 fluorescence ratio of patient to control (log of Cy3/Cy5)

Workflow of aCGH Analysis
Finished chips
(scanner)
Raw image data (experiment info)
(image processing software)
Probe level raw intensity data
Background adjustment, Normalization, transformation
Raw copy number (CN) data [log ratio of tumor/normal intensities]
Segmentation and boundary determination
Estimation of CN
Characterizing individual genomic profiles

Normalization
• Background Adjustment/Correction
  Reduces unevenness of a single chip
  Eliminates non-specific hybridization signal
  [Figure: chip before adjustment and after adjustment]
  Corrected Intensity (S’) = Observed Intensity (S) – Background Intensity (B)

Normalization
• Normalization
  Reduces technical variation between chips
  [Figure: intensities before and after normalization]
  S’ = (S – Mean of S) / (STD of S)
  S’ ~ N(0, 1)
• Log Transformation
  S: probe raw intensity; S’: log-transformed intensity, S’ = log2(S)
  CN = S’tumor - S’normal = log2(Stumor / Snormal)
  [Figure: S before log transformation; log(S) after log transformation]

Segmentation/Smoothing
[Figure: CN plotted by clone/chromosome]

Segmentation/Smoothing
• Goal: to partition the clones into sets with the same copy number and to characterize the genomic segments.
  Noise reduction
  Detection of Loss, Normal, Gain, Amplification
  Breakpoint analysis
• Biological model: genomic rearrangements lead to gains or losses of sizable contiguous parts of the genome.
Recurrent (over tumors) aberrations may indicate an oncogene or a tumor suppressor gene.

Segmentation Methods
• AWS - Adaptive Weights Smoothing
• CBS - Circular Binary Segmentation
• HMM - Hidden Markov Model partitioning
• Many more
All existing methods amount to unsupervised, location-specific partitioning and operate on individual chromosomes.

Workflow of aCGH Data Analysis
Finished chips
(scanner)
Raw image data (experiment info)
(image processing software)
Probe level raw intensity data
Background adjustment, Normalization, transformation
Raw copy number (CN) data [log ratio of tumor/normal intensities]
Segmentation and boundary determination
Estimation of CN
Characterizing individual genomic profiles

Homework: Analyze TCGA Data

The Cancer Genome Atlas Project (TCGA)
• Goal: find genomic alterations that cause cancer (mutations, CNA, methylation, …)
• Pilot project
  1. brain (glioblastoma multiforme): 186 pairs of tumor and normal samples
  2. lung (squamous)
  3. ovarian (serous cystadenocarcinoma)

Flowchart of Data Analysis
Raw copy number (CN) data [log ratio of tumor/normal intensities]
Segmentation and boundary determination
Estimation of CN
Characterizing individual genomic profiles
Annotation
Identify Recurrent Genes

Ruby: Mapping Probes
LFF format
Upload Data
Data Analysis: Segmentation
Data Analysis: Combine Tracks
Data Analysis: Annotation Selector
Data Analysis: Mapping Genes
Data Analysis: Recurrent Genes

Overview of Data Analysis
Raw copy number (CN) data [log ratio of tumor/normal intensities]
Data Preprocessing (Ruby) and uploading data to Genboree
Segmentation (Segmentation Tool)
Characterizing individual genomic profiles
Combining data
Annotation (Annotation Selector; Attribute Lifter)
Identify Recurrent Genes (Ruby)

You Need To Submit
1. Ruby script from step 1 that creates your LFF file
2. Ruby script from step 5 that parses your table
3. Two-column final output from step 5
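
Sketch: counting recurrent genes in Ruby
The "Identify Recurrent Genes (Ruby)" step could look roughly like the sketch below. The slides do not show the layout of the table to be parsed, so this assumes a hypothetical tab-delimited input with one "sample<TAB>gene" pair per line; the method names recurrent_gene_counts and format_two_column are likewise made up for illustration. It counts the number of distinct samples in which each gene appears and produces two-column "gene<TAB>count" lines.

```ruby
require "set"

# Count, for each gene, the number of distinct samples in which it appears.
# ASSUMPTION: every input line is "sample<TAB>gene". The table actually
# exported from Genboree may use different columns; adjust the split below.
def recurrent_gene_counts(lines)
  samples_by_gene = Hash.new { |hash, gene| hash[gene] = Set.new }

  lines.each do |line|
    line = line.chomp
    # Skip blank lines and comment/header lines starting with "#".
    next if line.strip.empty? || line.start_with?("#")

    sample, gene = line.split("\t")
    next if sample.nil? || gene.nil? || gene.empty?

    # A Set ignores repeats, so a gene hit twice in one sample counts once.
    samples_by_gene[gene] << sample
  end

  # Most recurrent genes first; ties are broken alphabetically by gene name.
  samples_by_gene
    .map { |gene, samples| [gene, samples.size] }
    .sort_by { |gene, count| [-count, gene] }
end

# Render the counts as two-column, tab-delimited text lines.
def format_two_column(counts)
  counts.map { |gene, count| "#{gene}\t#{count}" }
end

# Possible usage (my_table.txt is a hypothetical file name):
#   puts format_two_column(recurrent_gene_counts(File.foreach("my_table.txt")))
```

Counting distinct samples rather than raw rows matches the idea of aberrations that are recurrent over tumors: a gene covered by several segments in the same tumor still contributes only one sample.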