Microarray Data Visualization as a Tool for Medical Research and Diagnosis
Multimedia Learning Module
Mark Siegal
Department of Biology
[email protected], x8-7908
Microarrays are a powerful new “genomic” technology, allowing scientists to measure
and compare the activity of thousands of genes simultaneously. Microarrays are
increasingly being used for medical research and diagnosis. For example, they have
been used to identify particular genes that have altered activity in cancerous human
tissues. Scientists have also used microarrays to identify new sub-types of cancers, by
observing genes that are more or less active in different tumors. The hope is that this
kind of “genomic diagnosis” will enable individualized treatments with more successful
outcomes than one-size-fits-all or trial-and-error approaches.
The target audience for this learning module is undergraduate non-science majors. The
learning module will be used by students enrolled in Genomes and Diversity (NYU MAP
course V55.0314). It will also be more broadly disseminated for use by other students
and teachers at NYU and at other schools. The goal of the learning module is for
students to understand how scientists organize and display the enormous amount of
data produced by microarray experiments, so that diagnostic differences between
biological samples can be discovered. The specific objectives are:
1) to understand how vast amounts of quantitative data can be visualized graphically
to aid in comprehension;
2) to understand how grouping data by similarity (“clustering”) improves visualization
and comprehension, and how clustering can be used in diagnosis; and
3) to understand how the procedure for clustering works.
The students will need the following background knowledge:
1) knowledge of the central dogma of molecular biology (DNA makes RNA makes
protein);
2) understanding that different cell types can be distinguished by what proteins they
are producing (and therefore, by extension, what RNAs they are producing);
3) knowledge of how a microarray experiment is performed (isolation of mRNA,
fluorescent labeling of cDNA, hybridization, laser scanning) [note that this could
be a subject for an additional animation or simulation]; and
4) intuitive understanding of the mathematical concept of correlation.
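As a concrete anchor for the idea of correlation, the sketch below computes the Pearson correlation coefficient between the activity profiles of two genes. The gene names and values are hypothetical and chosen only for illustration; the module itself does not prescribe a particular correlation measure.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical activity of three genes measured in four samples.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # rises whenever gene_a rises
gene_c = [4.0, 3.0, 2.0, 1.0]   # falls whenever gene_a rises

print(pearson(gene_a, gene_b))  # close to +1: the two profiles move together
print(pearson(gene_a, gene_c))  # close to -1: the two profiles move in opposite directions
```

A value near +1 means two genes tend to be active in the same samples, a value near -1 means one is active where the other is not, and a value near 0 means there is no consistent relationship.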
The learning module will need to teach the following concepts:
1) how and why a microarray experiment is performed (brief review);
2) how quantitative data from a microarray experiment are organized into a
two-dimensional (genes × conditions) matrix;
3) how quantitative data in the matrix can be represented by different colors;
4) how patterns in the data become evident when groups of genes with similar
expression across conditions are clustered together (or groups of conditions with
similar expression across genes are clustered together);
5) how such patterns can be used to make inferences about differences between the
original samples;
6) how such patterns can be used to classify new samples (i.e., to diagnose);
7) how the procedure for clustering works (place together the two most similar rows
[or columns], then the next two most similar, etc.); and
8) how to look at a clustergram of real microarray data and make inferences.
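Concepts 2 and 3 above can be sketched in a few lines. The example below builds a small genes × conditions matrix and translates each number into a color name. The values, the threshold of 0.5, and the red/green/black scheme are assumptions made for illustration (positive values are taken to mean more active than a reference, negative values less active); the module may use a different coding.

```python
# Hypothetical genes x conditions matrix: one row per gene, one column per condition.
# Values are assumed to be log ratios relative to a reference sample.
genes = ["gene 1", "gene 2", "gene 3"]
conditions = ["patient 1", "patient 2", "patient 3"]
matrix = [
    [ 2.0, -1.5,  0.1],
    [-2.2,  1.8,  0.0],
    [ 1.9, -1.7, -0.2],
]

def color_for(value, threshold=0.5):
    """Map one measurement to a color name (assumed scheme, not the module's)."""
    if value > threshold:
        return "red"      # more active than the reference
    if value < -threshold:
        return "green"    # less active than the reference
    return "black"        # roughly unchanged

# Same shape as the matrix, but holding colors instead of numbers.
colored = [[color_for(v) for v in row] for row in matrix]

print(" " * 8, " ".join(f"{c:9s}" for c in conditions))
for name, row in zip(genes, colored):
    print(f"{name:8s}", " ".join(f"{c:9s}" for c in row))
```

Even in this tiny grid, the color view makes it easier to see that gene 1 and gene 3 behave alike across the three patients, which is the kind of pattern clustering is meant to bring together.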
The learning module should be a combination of animation and simulation. The module
should include abstract representations of a microarray experiment, but these
representations should be introduced with actual images to provide a clear connection
between physical objects and their representations. Although students in Genomes and
Diversity will have lectures and readings that supplement this learning module, the
module itself should be self-contained, with embedded instruction. The module should
be interactive, with questions that allow self-assessment by the student and assessment
by the instructor. Navigation should be flexible, allowing students to set the pace of their
own learning.
After using the module, students should be able to:
1) explain how microarray data can be useful in medical research and diagnosis;
2) explain how data organization and coding are crucial steps in interpreting results
from large experiments;
3) perform the clustering procedure on a simplified data set;
4) use clustering of a simplified data set to find patterns in the data and to classify
(diagnose) new samples; and
5) make inferences based on a clustergram of real microarray data.
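The clustering procedure that students perform by hand on a simplified data set can be sketched as follows. This is a minimal illustration, not the module's own implementation: it assumes similarity is measured by Pearson correlation, that groups are compared by their average pairwise correlation, and that the data are the hypothetical values shown. At each step it places together the two most similar rows (or groups of rows) until a single ordering remains.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def group_similarity(group_a, group_b, profiles):
    """Average correlation between every row in one group and every row in the other."""
    scores = [pearson(profiles[p], profiles[q]) for p in group_a for q in group_b]
    return sum(scores) / len(scores)

def cluster_order(profiles):
    """Return row names reordered so that similar rows sit next to each other."""
    groups = [[name] for name in profiles]   # start with every row on its own
    while len(groups) > 1:
        # Find the two most similar groups and place them together.
        a, b = max(combinations(groups, 2),
                   key=lambda pair: group_similarity(pair[0], pair[1], profiles))
        groups.remove(a)
        groups.remove(b)
        groups.append(a + b)
    return groups[0]

# Hypothetical simplified data set: four genes measured in four samples.
profiles = {
    "gene 1": [ 2.0,  1.8, -1.9, -2.1],
    "gene 2": [-2.0, -1.7,  2.2,  1.9],
    "gene 3": [ 1.9,  2.1, -2.0, -1.8],
    "gene 4": [-1.8, -2.1,  1.9,  2.0],
}

order = cluster_order(profiles)
print(order)
```

In the reordered list, gene 1 and gene 3 end up adjacent, as do gene 2 and gene 4, because each pair rises and falls in the same samples. The same function could be applied to columns by passing one profile per sample instead of one per gene.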
Microarrays measure the activity of every gene (thousands of data points)
[Figure panels: patient 1, patient 2, patient 3]
Clustering organizes data so that patterns can be seen
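[Figure: the same data shown before and after clustering. Rows are gene 1 through gene 8; after clustering they appear in the order gene 1, gene 4, gene 6, gene 7, gene 5, gene 2, gene 8, gene 3. Columns are numbered 1 through 7 and are likewise reordered.]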
This has real-world significance, for example in diagnosing previously
unrecognized sub-types of cancer with different rates of successful treatment
[Figure labels: GC B-like, Activated B-like]
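How a new sample might be classified against known groups can be sketched as below. This is one simple approach (a nearest-centroid rule using Pearson correlation), named here as an assumption rather than as the method used in the cancer study the figure refers to. The subtype labels and all values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def mean_profile(samples):
    """Average activity of each gene across a list of samples."""
    return [sum(values) / len(values) for values in zip(*samples)]

def classify(new_sample, groups):
    """Return the label of the group whose average profile best matches the new sample."""
    centroids = {label: mean_profile(samples) for label, samples in groups.items()}
    return max(centroids, key=lambda label: pearson(new_sample, centroids[label]))

# Hypothetical previously diagnosed samples: each list holds the activity of four genes.
known = {
    "subtype A": [[ 2.1,  1.8, -1.9, -2.0], [ 1.9,  2.2, -2.1, -1.7]],
    "subtype B": [[-2.0, -1.9,  2.0,  2.1], [-1.8, -2.2,  1.9,  1.8]],
}

new_patient = [1.7, 2.0, -1.6, -2.2]
print(classify(new_patient, known))  # subtype A
```

The new patient is assigned to subtype A because the same genes are active and inactive as in the subtype A samples. A real diagnostic classifier would involve many more genes and samples and would need validation on independent data.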