Alizadeh et al. (2000)
Stephen Ayers
“Clustering is finding a natural grouping in
a set of data, so that samples within a
cluster will be more similar to each other
than they are to samples in other clusters.”
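The definition above can be checked numerically. The following is a minimal sketch, assuming one-dimensional samples and absolute difference as the dissimilarity measure; the function name `within_between` and the sample values are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from statistics import mean


def within_between(clusters):
    """Return (mean within-cluster distance, mean between-cluster distance)."""
    # Distances between pairs of samples that share a cluster.
    within = [abs(a - b) for c in clusters for a, b in combinations(c, 2)]
    # Distances between pairs of samples drawn from two different clusters.
    between = [
        abs(a - b)
        for c1, c2 in combinations(clusters, 2)
        for a, b in product(c1, c2)
    ]
    return mean(within), mean(between)


# Hypothetical data: two well-separated groups.
grouping = [[1.0, 1.5, 2.0], [8.0, 8.5, 9.0]]
w, b = within_between(grouping)
print(w < b)  # a natural grouping has smaller within-cluster distances
```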
Finding groups of correlated genes
“signature groups”
Genes without well established
Extract features of groups
Hierarchical Clustering
• Tiers of clusters, from a bottom layer of n
clusters with 1 point each to a top level with
all n points in one cluster
• Usually represented in dendrogram
• Top-down approach
– Start with all samples and
successively split into separate clusters
• Bottom-up approach
– Less computationally intensive
– Start with n singletons and
successively merge clusters
– Place all values in separate clusters
– Merge the most similar clusters into a higher-level cluster
– Repeat until all clusters have been merged
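The bottom-up procedure can be sketched in a few lines. This is an illustrative sketch, not the method used in the paper: it assumes one-dimensional values and treats the distance between cluster means as the (dis)similarity, and the name `agglomerate` is hypothetical. It records every tier, from n singletons down to a single cluster.

```python
from statistics import mean


def agglomerate(values):
    """Bottom-up clustering of 1-D values; returns the list of tiers."""
    # Place all values in separate clusters.
    clusters = [[v] for v in values]
    tiers = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Assumed measure: distance between cluster means
                # (smaller distance = more similar).
                d = abs(mean(clusters[i]) - mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the most similar pair into a higher-level cluster.
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        tiers.append([list(c) for c in clusters])
    return tiers


# Hypothetical values: the tiers shrink from 5 clusters to 1.
tiers = agglomerate([1.0, 1.2, 5.0, 5.1, 10.0])
for tier in tiers:
    print(len(tier), tier)
```

Each printed tier corresponds to one level of the dendrogram, read from the leaves upward.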
Average-Linkage Method
1. Compute similarity matrix
2. Scan matrix to find the highest similarity
(similarity uses a form of the correlation coefficient)
3. A node is created between these values
4. Values are replaced by node
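The four steps can be sketched as follows. This is a sketch under stated assumptions: the slides say only that "a form of the correlation coefficient" is used, so the standard Pearson coefficient is assumed here, and the node profile is assumed to be the size-weighted average of the profiles it replaces. The names `pearson` and `average_linkage` and the gene profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles):
    """profiles: dict of name -> expression values. Returns a nested-tuple tree."""
    # Each active item maps to (profile, number of genes it represents).
    items = {name: (list(p), 1) for name, p in profiles.items()}
    while len(items) > 1:
        keys = list(items)
        # Steps 1-2: compute all pairwise similarities and find the highest.
        _, a, b = max(
            ((pearson(items[a][0], items[b][0]), a, b)
             for i, a in enumerate(keys) for b in keys[i + 1:]),
            key=lambda t: t[0],
        )
        (pa, na), (pb, nb) = items.pop(a), items.pop(b)
        # Steps 3-4: create a node and replace both values by it, using
        # the size-weighted average profile (an assumption of this sketch).
        node = [(na * u + nb * v) / (na + nb) for u, v in zip(pa, pb)]
        items[(a, b)] = (node, na + nb)
    return next(iter(items))


# Hypothetical profiles: g1/g2 rise together, g3/g4 fall together.
tree = average_linkage({
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [5.0, 3.0, 2.5, 0.5],
})
print(tree)
```

The nested tuples mirror the dendrogram: each pair of parentheses is one node created in step 3.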
Diffuse Large B-cell Lymphoma
• Most common subtype of non-Hodgkin’s lymphoma
• 25,000 cases/year
• 40% of patients respond well
• Possible undetected heterogeneity
• Found 2 classes using clustering (Eisen
1998): Germinal Center B-like and
Activated B-like
17,856 cDNA clones total
12,069 germinal center B-cell library
2,338 lymphomic cancer genes
3,186 genes important to lymphocyte or
cancer biology
• ¼ of genes = duplicates
Expression Analysis
• DLBCL, Follicular Lymphoma, Chronic
Lymphocytic Leukemia
• Lymphocyte subpopulations with a range of
• Normal human tonsils, lymph nodes
• Lymphoma, leukemia cell lines
Figures 1-5
• More categories likely
• Changes in treatment
• Possible drug targets