Microarray Cancer Data Visualization Analysis in Relation to Pharmacogenomics
By Ngozi Nwana

Microarray Data Acquisition
What is a microarray?
• Microarray data (scanned image data of expressed genes) are obtained from microscope slides that contain an ordered series of samples (DNA, RNA, protein, or tissue).
• The type of microarray depends on the material placed on it: DNA gives a DNA microarray, RNA gives an RNA microarray, and so on. The most commonly used type is the DNA microarray.
• DNA microarrays are ordered sets of gene-specific probes fixed to a solid support. Fluorescently labeled samples (sample RNA is reverse-transcribed into labeled cDNA, which binds to the complementary probe spots) are hybridized to these probes for use in massively parallel gene expression studies.

Background: Definition of Keywords
• Genetics has been the primary discovery engine for modern biomedical science.
• Genetics is the study of heredity and how traits are passed on through generations.
• Genomics is the study of genes and their functions.
• Every human cell (with some rare exceptions) contains 46 linear chromosomes (pieces of DNA), organized as 23 pairs.
• The chromosomes contain genetic information, which is organized into thousands of different ‘genes’.
• A gene is a stretch of DNA that codes for a particular protein, whether a structural protein (a protein that makes up part of a structure of the cell, for example the cytoskeleton) or an enzyme.

Microarray Technology and Pharmacogenomics
• Microarray technology has enabled many advances in the study of genes (genomics).
• It provides a method of collecting thousands of individual qualitative (such as gene category) and/or quantitative (such as RNA level for an entire experiment) measurements or attributes simultaneously from a single sample.
• The oncology field has been especially active, and to an extent successful, in using microarrays to differentiate between cancer cell types and to obtain molecular signatures of the state of activity of diseased cells in patient samples.
• This approach to studying cancer provides a better understanding of the underlying mechanism of tumorigenesis, more accurate diagnosis, more comprehensive prognosis, and more effective therapeutic interventions.

Microarray Data and Pharmacogenomics (cont’d)
• Pharmacogenomics:
  - Studies the way a person responds to a drug by examining the inherited variations in genes that dictate drug response (negative, positive, or no response).
• General practice:
  Current drug therapy is empirically prescribed to fit the needs of the “average” patient.
• Effect:
  Empirical prescription leads to undue toxicity in cured patients; in resistant patients it causes unnecessary toxicity and delays alternative active therapies.
• Goal:
  To obtain new, widely applicable, validated predictors of the likelihood of optimal drug therapy response that will enable individually tailored prescriptions.

Visualization of Microarray Data
• Visualization of microarray data:
  - Enables the simultaneous visualization of multiple attributes of expressed gene data
  - Provides visual summaries of gene expression data
  - Provides genome researchers with meaningful details (gene cluster summary, map position within the genome, gene/protein sequences) for effective disease recognition
• Visualization attributes:
  - Quantitative: RNA level, p-value, and size of expressed genes
  - Qualitative: color
• Size and color are two visual properties that most visualization tools can use to display quantitative differences in data.
• Visualization methods that display multiple data attributes simultaneously are very important, including qualitative information about gene families or biological function alongside quantitative information such as RNA level and p-value.

Source Microarray Visualization Data
• BRCA1 & BRCA2 (onset) microarray real-time data
• Control data from healthy cells
• Cells from patients undergoing treatment who have received neoadjuvant chemotherapy (treatment given before surgery for locally advanced and inoperable breast cancer)
  - Aims at reducing tumor size and increasing rates of breast-conserving treatment

GenePix Sample Data & Format
Rank  NAME          Ch1 Net (Mean)  Ch2 Normalized Net (Mean)  Log(base2) of R/G Normalized Ratio (Mean)  Regression Correlation  Spot Flag
1     IMAGE:199180  223             1                          -7.986                                     0.799                   0
2     IMAGE:810625  119             1                          -7.08                                      0.635                   0
3     IMAGE:52228   119             1                          -7.08                                      0.611                   0
4     IMAGE:141726  631             10                         -6.027                                     0.705                   0
5     IMAGE:74537   17093           330                        -5.696                                     0.787                   0
6     PEROU:5D10    451             12                         -5.195                                     0.946                   0
7     IMAGE:436741  11601           474                        -4.613                                     0.686                   0
8     IMAGE:682522  711             30                         -4.571                                     0.842                   0
9     IMAGE:782730  2852            126                        -4.503                                     0.776                   0
10    IMAGE:41648   5618            269                        -4.384                                     0.662                   0
11    IMAGE:46620   47              3                          -4.155                                     0.782                   0
12    IMAGE:51865   4595            352                        -3.707                                     0.615                   0
13    IMAGE:587847  4140            318                        -3.701                                     0.71                    0
14    IMAGE:109440  10454           821                        -3.671                                     0.679                   0
15    IMAGE:199367  21288           1749                       -3.605                                     0.96                    0
16    IMAGE:247281  177             15                         -3.565                                     0.674                   0
17    IMAGE:276688  110             10                         -3.507                                     0.621                   0
18    IMAGE:80186   7123            636                        -3.486                                     0.913                   0
19    IMAGE:214572  3764            339                        -3.475                                     0.791                   0
20    IMAGE:810911  1497            136                        -3.457                                     0.622                   0
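As a quick consistency check on the table above, the log ratio column can be recomputed from the two channel columns. The sketch below assumes that Ch1 is the green (Cy3) channel and Ch2 the red (Cy5) channel, so that R/G = Ch2 / Ch1. The variable names are illustrative and are not part of the GenePix format. Only high-intensity rows are used, because the small integer intensities in the first rows are rounded too coarsely to reproduce the ratio.

```matlab
% Four high-intensity rows copied from the table (ranks 5, 7, 14 and 15).
names     = {'IMAGE:74537', 'IMAGE:436741', 'IMAGE:109440', 'IMAGE:199367'};
ch1Net    = [17093 11601 10454 21288];      % Ch1 Net (Mean), assumed green (Cy3)
ch2Norm   = [330 474 821 1749];             % Ch2 Normalized Net (Mean), assumed red (Cy5)
tableLog2 = [-5.696 -4.613 -3.671 -3.605];  % log ratio as listed in the table

% Assumption: R/G means red over green, i.e. Ch2 divided by Ch1.
log2Ratio = log2(ch2Norm ./ ch1Net);

% Print the recomputed value next to the tabulated one for each spot.
for k = 1:numel(names)
    fprintf('%-14s computed %7.3f   table %7.3f\n', ...
        names{k}, log2Ratio(k), tableLog2(k));
end
```

If the recomputed and tabulated values agree to within rounding, the assumed channel assignment is consistent with the data, and the rank column orders spots from the most negative log ratio upward.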
MATLAB Gene Spatial Image Representations
• The command gprread reads the data from the file into a structure.
• pd.ColumnNames lists the field names available in the structure; passing one of these names to maimage produces the following spatial images of the microarray data.
figure; maimage(pd,'F635 Median')
• Notice the very high background levels down the right side of the array. Areas of high color intensity signify high levels of gene expression.

Visualization results cont’d
Visualization results scanned at 532 nm (F532) for breast cancer cells
The "F532 Median" field corresponds to the foreground of the green (Cy3) channel.
figure; maimage(pd,'F532 Median')
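The two displays above can be combined into one short script. This is a minimal sketch that assumes the Bioinformatics Toolbox is installed and that the scan was saved as a GenePix Results file. The file name sample_array.gpr is hypothetical, since the slides do not name the file. The script also displays the 'B635 Median' and 'B532 Median' background columns, assuming the file contains them (they are standard GenePix columns), which helps to check whether a bright region comes from background rather than from expression.

```matlab
% Hypothetical file name: replace with the path to your own GenePix Results file.
gprFile = 'sample_array.gpr';

% Foreground (F) and background (B) medians for the red (635 nm) and
% green (532 nm) channels.
fieldsToShow = {'F635 Median', 'B635 Median', 'F532 Median', 'B532 Median'};

% Run only when the toolbox function and the data file are both available.
if ~isempty(which('gprread')) && exist(gprFile, 'file') == 2
    pd = gprread(gprFile);          % read the file into a structure
    disp(pd.ColumnNames);           % list the columns available for display
    for k = 1:numel(fieldsToShow)
        figure;                     % one figure window per field
        maimage(pd, fieldsToShow{k});
    end
else
    disp('gprread or the GPR file was not found; nothing to display.');
end
```

Comparing each foreground image with its background image is one way to judge whether a bright band, such as the one noted down the right side of the array, reflects the slide rather than the genes.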
Visualization results for the untreated control sample scanned at 532 nm (F532)

Clustering Commands
• The xlsread function can be used to read the data from the XLS file and load it into MATLAB:
[numericData, textData] = xlsread('cancerdata.xls');
This reads the spreadsheet into two variables: numericData (numeric values) and textData (text values).
giValues = numericData(:, 2:end);
drugMechanism = textData(2:end, 1);
• To perform the clustering, the command below is used, where tumorTypes is a cell array of column labels, one per column of giValues:
clustergram(giValues, 'RowLabels', drugMechanism, 'ColumnLabels', tumorTypes);
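To make the clustering step more concrete, the sketch below shows the core idea behind hierarchical clustering on a small made-up matrix, using only base MATLAB: compute pairwise distances between row profiles, then repeatedly merge the two closest clusters. The data, labels, and variable names are hypothetical, and Euclidean distance with average linkage is an assumption chosen for illustration. clustergram performs the clustering itself and adds the dendrograms and heat map, and its defaults and options may differ.

```matlab
% Hypothetical matrix: 4 drugs (rows) x 3 tumor types (columns).
% Rows 1-2 have similar low values; rows 3-4 have similar high values.
giValues  = [1.0 1.1 0.9; 1.1 1.0 1.0; 5.0 5.2 4.9; 5.1 5.0 5.0];
rowLabels = {'drugA', 'drugB', 'drugC', 'drugD'};
n = size(giValues, 1);

% Pairwise Euclidean distances between row profiles.
D = zeros(n);
for i = 1:n
    for j = 1:n
        D(i, j) = norm(giValues(i, :) - giValues(j, :));
    end
end

% Agglomerative clustering with average linkage: start with one cluster
% per row and repeatedly merge the two closest clusters.
clusters     = num2cell(1:n);     % each cell holds the row indices of one cluster
mergeHistory = cell(1, n - 1);    % members of the cluster formed at each step
mergeHeights = zeros(1, n - 1);   % distance at which each merge happens
for step = 1:n-1
    best = Inf; bi = 0; bj = 0;
    for i = 1:numel(clusters)-1
        for j = i+1:numel(clusters)
            block = D(clusters{i}, clusters{j});
            d = mean(block(:));   % average distance between the two clusters
            if d < best
                best = d; bi = i; bj = j;
            end
        end
    end
    fprintf('merge {%s} + {%s} at distance %.3f\n', ...
        strjoin(rowLabels(clusters{bi}), ','), ...
        strjoin(rowLabels(clusters{bj}), ','), best);
    mergeHistory{step} = sort([clusters{bi}, clusters{bj}]);
    mergeHeights(step) = best;
    clusters{bi} = [clusters{bi}, clusters{bj}];  % absorb cluster bj into bi
    clusters(bj) = [];                            % bj > bi, so bi stays valid
end
```

With these values the two low-valued drugs merge first, then the two high-valued drugs, and the two groups join last at a much larger distance. That jump in merge distance is what appears as a long branch in a dendrogram.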
Cluster figure

Visualization results
A subsection example of unsupervised hierarchical clustering

Microarray Data Visualization Results
Cross section of hierarchical clustering of expressed genes

Conclusion
• Significant differences in gene expression were observed in cancer specimens before and after treatment.
• Differences in the microarray spatial images between the control cells and the cancer cells were observed.
• Whether the drug used is providing effective therapy remains the oncologist’s call.

Q & A
Thank You