Microarray Cancer Data Visualization Analysis in Relation to Pharmacogenomics
By Ngozi Nwana

Microarray Data Acquisition
What is a microarray? Microarray data (scanned image data of expressed genes) are obtained from microscope slides that contain an ordered series of samples (DNA, RNA, protein, or tissue).
The type of microarray depends on the material placed on it: DNA gives a DNA microarray, RNA gives an RNA microarray, and so on. The most commonly used type is the DNA microarray.
DNA microarrays are ordered sets of gene-specific probes fixed to a solid support. Fluorescently labeled samples (RNA copied by reverse transcriptase into labeled cDNA, which binds to the complementary probe spots) are hybridized to the probes for use in massively parallel gene expression studies.

Background: Definition of Keywords
Genetics has been the primary discovery engine for modern biomedical science.
Genetics is the study of heredity and how traits are passed on through generations.
Genomics is the study of genes and their functions.
Every human cell (with some rare exceptions) contains 46 linear chromosomes (pieces of DNA), organized as 23 pairs.
The chromosomes contain genetic information, which is organized into thousands of different genes.
A gene is a stretch of DNA that codes for a particular protein, whether a structural protein (one that makes up part of a structure of the cell, for example the cytoskeleton) or an enzyme.

Microarray Technology and Pharmacogenomics
Microarray technology has enabled many advances in genomics. It provides a method of collecting thousands of individual qualitative (such as gene category) and/or quantitative (such as RNA level for an entire experiment) measurements simultaneously from a single sample.
The oncology field has been especially active, and to an extent successful, in using microarrays to differentiate between cancer cell types and to obtain molecular signatures of the state of activity of diseased cells in patient samples.
This approach to studying cancer provides a better understanding of the underlying mechanisms of tumorigenesis, more accurate diagnosis, more comprehensive prognosis, and more effective therapeutic interventions.

Microarray Data and Pharmacogenomics (cont'd)
Pharmacogenomics studies how a person responds to a drug by examining the inherited gene variations that dictate drug response (negative, positive, or none).
General practice: Current drug therapy is prescribed empirically to fit the needs of the "average" patient.
Effect: Empirical prescription leads to undue toxicity in cured patients and, in resistant patients, causes unnecessary toxicity while delaying alternative active therapies.
Goal: To obtain new, widely applicable, validated predictors of the likelihood of optimal drug therapy response that will enable individually tailored prescriptions.

Visualization of Microarray
Visualization of microarray data:
  Enables the simultaneous visualization of multiple attributes of expressed gene data
  Provides visual summaries of gene expression data
  Provides genome researchers with meaningful details (gene cluster summary, map position within the genome, gene/protein sequences) for effective disease recognition
Visualization attributes:
  Quantitative: RNA level, p-value, and size of expressed genes
  Qualitative: color
Size and color are two attributes that most visualization tools can use to display quantitative differences in data.
Visualization methods that can display multiple data attributes at once, combining qualitative information (gene families or biological function) with quantitative information (RNA level and p-value), are very important.
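
As a rough illustration of how color and size can encode quantitative attributes, the Python sketch below maps an RNA-level log2 ratio to a color and a p-value to a marker size. The function names, the red/green color scale, the clamping limit, and the size formula are all illustrative assumptions; none of them comes from the slides or from a particular visualization tool.

```python
import math


def ratio_to_rgb(log2_ratio, limit=3.0):
    """Map a log2 expression ratio to an (R, G, B) color.

    Assumed convention: positive ratios shade toward red, negative
    ratios toward green, and zero is black. Ratios beyond +/- limit
    are clamped so extreme values do not wash out the scale.
    """
    t = max(-limit, min(limit, log2_ratio)) / limit
    if t >= 0:
        return (round(255 * t), 0, 0)
    return (0, round(255 * -t), 0)


def p_value_to_size(p_value, min_size=2.0, scale=4.0, floor=1e-10):
    """Map a p-value to a marker size: smaller p-values give larger markers.

    The size grows with -log10(p); min_size, scale and floor are
    arbitrary choices made for this sketch.
    """
    p = min(1.0, max(p_value, floor))
    return min_size + scale * -math.log10(p)


# Hypothetical (gene, log2 ratio, p-value) records, used only to show the mapping.
records = [("geneA", -7.986, 0.001), ("geneB", 0.0, 0.5), ("geneC", 2.1, 0.01)]
for name, ratio, p in records:
    print(name, ratio_to_rgb(ratio), round(p_value_to_size(p), 2))
```

Using -log10(p) for size is one common choice because it stretches the small p-values that usually matter most; a plotting library would then take the returned color and size as marker properties.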

Source Microarray Visualization Data
BRCA1 and BRCA2 (onset)
Microarray real-time data
Control data from healthy cells
Cells from patients undergoing treatment who have received neoadjuvant chemotherapy (chemotherapy given before surgery for locally advanced or inoperable breast cancer), which aims to reduce tumor size and increase rates of breast-conserving treatment

GenePix Sample Data & Format
Rank  Name          Ch1 Net (Mean)  Ch2 Normalized Net (Mean)  Log(base2) of R/G Normalized Ratio (Mean)  Regression Correlation  Spot Flag
1     IMAGE:199180  223             1                          -7.986                                     0.799                   0
2     IMAGE:810625  119             1                          -7.08                                      0.635                   0
3     IMAGE:52228   119             1                          -7.08                                      0.611                   0
4     IMAGE:141726  631             10                         -6.027                                     0.705                   0
5     IMAGE:74537   17093           330                        -5.696                                     0.787                   0
6     PEROU:5D10    451             12                         -5.195                                     0.946                   0
7     IMAGE:436741  11601           474                        -4.613                                     0.686                   0
8     IMAGE:682522  711             30                         -4.571                                     0.842                   0
9     IMAGE:782730  2852            126                        -4.503                                     0.776                   0
10    IMAGE:41648   5618            269                        -4.384                                     0.662                   0
11    IMAGE:46620   47              3                          -4.155                                     0.782                   0
12    IMAGE:51865   4595            352                        -3.707                                     0.615                   0
13    IMAGE:587847  4140            318                        -3.701                                     0.71                    0
14    IMAGE:109440  10454           821                        -3.671                                     0.679                   0
15    IMAGE:199367  21288           1749                       -3.605                                     0.96                    0
16    IMAGE:247281  177             15                         -3.565                                     0.674                   0
17    IMAGE:276688  110             10                         -3.507                                     0.621                   0
18    IMAGE:80186   7123            636                        -3.486                                     0.913                   0
19    IMAGE:214572  3764            339                        -3.475                                     0.791                   0
20    IMAGE:810911  1497            136                        -3.457                                     0.622                   0

MATLAB Gene Spatial Image Representations
The gprread command reads the data from the GenePix file into a structure (pd). pd.ColumnNames lists the column (field) names that can be passed to maimage, which produces the following spatial images of the microarray data.
  figure
  maimage(pd,'F635 Median')
Notice the very high background levels down the right side of the array. Areas of high color intensity signify high levels of gene expression.

Visualization Results (cont'd)
Visualization results for breast cancer cells, scanned at F532
The "F532 Median" field corresponds to the foreground of the green (Cy3) channel.
  figure
  maimage(pd,'F532 Median')

Visualization results for the untreated control sample, scanned at F532

Clustering Commands
The xlsread function reads the data from the XLS file and loads it into MATLAB:
  [numericData, textData] = xlsread('cancerdata.xls');
This reads the spreadsheet into two variables: numericData stores the numeric values and textData stores the text values.
  giValues = numericData(:,2:end);
  drugMechanism = textData(2:end,1);
To perform the clustering, use the command below:
  % drug and tumorTypes are cell arrays of row and column labels;
  % their definitions are not shown on the slide.
  clustergram(giValues, 'rowlabels', drug, 'columnlabels', tumorTypes);

Cluster figure

Visualization Results
A subsection example of unsupervised hierarchical clustering

Microarray Data Visualization Results
Cross section of hierarchical clustering of expressed genes

Conclusion
Significant differences in gene expression were observed in cancer specimens before and after treatment.
Differences in the microarray spatial images between the control and diseased (cancer) cells were observed.
Whether the drug used is providing effective therapy remains for the oncologist to confirm.

Q & A
Thank You
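
Supplementary sketch for the clustering slides: the unsupervised hierarchical clustering behind a clustergram can be illustrated in a few lines of plain Python. This is a minimal sketch, not the MATLAB toolbox implementation. It assumes Euclidean distance with average linkage, and the gene names and expression values are made-up toy data, not results from the presentation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(rows, labels):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two clusters whose members have the smallest
    mean pairwise distance. Returns the merges in the order they happen
    as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = {label: [i] for i, label in enumerate(labels)}
    merges = []
    while len(clusters) > 1:
        best = None
        names = list(clusters)
        for i in range(len(names)):
            for j in range(i + 1, len(names)):
                a, b = names[i], names[j]
                dists = [euclidean(rows[p], rows[q])
                         for p in clusters[a] for q in clusters[b]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Replace the two closest clusters with their union.
        clusters[f"({a}+{b})"] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, round(d, 3)))
    return merges


# Hypothetical log2 ratios: rows are genes, columns are samples.
genes = ["geneA", "geneB", "geneC", "geneD"]
profiles = [
    [-3.0, -2.8, -3.1],
    [-2.9, -3.0, -3.0],
    [2.0, 2.2, 1.9],
    [2.1, 2.1, 2.0],
]
merges = average_linkage(profiles, genes)
for step in merges:
    print(step)
```

The merge order is what a dendrogram draws: genes with similar profiles join first at small distances, and the down-regulated and up-regulated groups join last at a much larger distance.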