1-D DISCRETE WAVELET TRANSFORM FOR CLASSIFICATION OF CANCER SAMPLES IN DNA MICROARRAY DATA
Adarsh Jose (1), Dale Mugler, PhD (1), Zhong-Hui Duan, PhD (2,3)
1. Department of Biomedical Engineering, The University of Akron
2. Department of Computer Science, The University of Akron
3. Integrated Biosciences Program, The University of Akron
Abstract
The most important problem in applying supervised learning methods to classify cancer samples from gene expression profiles is the limited availability of samples. Selecting the relevant features is therefore imperative for optimizing the classification algorithms. A feature (gene) selection method using 1-D Discrete Wavelet Transforms is proposed for addressing ‘two-class’ problems in DNA microarray data.
Gene Expression: The process by which information encoded in DNA is converted into actual structures in cells. The subset of ‘expressed genes’ and their ‘expression levels’ form a characteristic of the state of the cell.
DNA microarrays: Allow the expression levels of thousands of genes to be measured simultaneously. The entire genome can be probed at a single point in time. The technique is based on base-pair attraction between complementary DNA and RNA strands. Microarray technology quantifies the notion of gene expression.
Classification problem of microarray data
Training sets with class labels → Feature Selection → Training Classifier → Validation using Testing Set
Problem: The number of features (genes) is very large compared to the number of samples.
Solution: Reduce the number of features by ‘selecting’ or ‘extracting’ the ‘most relevant’ ones.
What is the wavelet transform?
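As a rough illustration of the question above: a discrete wavelet transform re-expresses a signal as a coarse approximation plus detail coefficients at several scales, obtained by repeated low-pass and high-pass filtering followed by downsampling by two. The sketch below uses the Haar wavelet only because it is the simplest case. The function names `haar_step` and `haar_dwt` and the toy signal are hypothetical, and the code is not the poster's implementation (the references point to the Matlab Wavelet toolbox).

```python
import math

def haar_step(signal):
    """One Haar DWT level: returns (approximation, detail), each half as long."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums, kept once per pair (downsampling by 2).
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    # High-pass branch: scaled pairwise differences, also downsampled by 2.
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_dwt(signal, levels):
    """Multi-level Haar DWT: returns (coarsest approximation, details, finest first)."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)  # only the low-pass output is split again
        details.append(detail)
    return approx, details

# Hypothetical toy signal, not data from the poster.
toy_signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_dwt(toy_signal, levels=3)
```

With 8 samples and 3 levels, a single approximation coefficient remains; it carries the overall level of the signal, while the detail lists carry progressively finer fluctuations.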
Datasets
•  Leukemia dataset: 48 ALL & 25 AML samples
•  B-cell lymphoma dataset: 58 DLBCL & 10 FCC samples
Results & Observations
•  The algorithm was tested for classification accuracy on the oligonucleotide datasets using a KNN classifier and 3 different validation methods, for different numbers of variables (genes).
•  The ‘Haar’ and ‘Bior1.5’ wavelets gave accuracies of up to 97%.
•  The average classification error is less than 11% on both of the oligonucleotide datasets studied.
•  ‘Shuffling’ the samples within each class does NOT have any effect on the accuracy.
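The evaluation described above (a KNN classifier scored with a validation method) could look roughly like the sketch below. The poster does not name its three validation methods, so leave-one-out is used here purely as an assumption. The function names, the choice k=3, and the toy samples are hypothetical and do not reproduce the poster's data or results.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_error(samples, labels, k=3):
    """Fraction of samples misclassified when each one is held out in turn."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, samples[i], k) != labels[i]:
            errors += 1
    return errors / len(samples)

# Hypothetical toy data: 8 samples described by 2 selected genes each.
samples = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [1.2, 1.1],
           [3.0, 3.1], [3.2, 2.9], [2.9, 3.0], [3.1, 3.2]]
labels = ["ALL"] * 4 + ["AML"] * 4
error = leave_one_out_error(samples, labels, k=3)
```

Leave-one-out is a common choice when samples are scarce, because every sample serves as a test case once while all the others are used for training.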
1-D Discrete Wavelet Transform
•  Breaks the signal down into different frequency bands.
•  Implemented by passing the signal through a series of high-pass and low-pass half-band filters.
[Figure: Wavelet Decomposition Examples]
Procedure
•  The samples are grouped into the 2 classes.
•  The 1-D Discrete Wavelet Transform of each gene's expression profile is taken to Level 3.
•  The gene expression profile is reconstructed using the Level 3 approximation only.
•  Score = abs(mean(class1) - mean(class2))
•  Genes are ranked by their scores.
Conclusions
•  1-D Discrete Wavelets can capture patterns in gene expression data, which makes them a potential tool for feature selection.
•  A complete error estimation study has to be carried out with microarray data obtained from different platforms.
References
1. T.R. Golub et al. Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science, Vol. 286 (1999). www.sciencemag.org
2. Shipp MA, Ross KN, Tamayo P, Weng AP, Kutok JL, Aguiar RC, Gaasenbeek M, Angelo M, Reich M, Pinkus GS, Ray TS, Koval MA, Last KW, Norton A, Lister TA, Mesirov J, Neuberg DS, Lander ES, Aster JC, Golub TR. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nat Med 2002 Jan;8(1):68-74.
3. Matlab manual: Matlab Wavelet toolbox, Matlab Bioinformatics toolbox. Mathworks.
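The steps listed under Procedure (group the samples by class, take a level-3 1-D DWT of each gene, reconstruct from the level-3 approximation only, score by the absolute difference of the class means, rank the genes) could be sketched as follows. Several details are assumptions rather than the authors' method: the Haar wavelet is used for simplicity, the transform is applied to each gene's values concatenated across both classes, and the signal is padded by repeating its last value so that its length is a multiple of 8. All function names and the toy data are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_approximation(signal, levels=3):
    """Reconstruct `signal` from its level-`levels` Haar approximation only."""
    n = len(signal)
    # Assumption: pad with the last value up to a multiple of 2**levels.
    padded = list(signal) + [signal[-1]] * (-n % (2 ** levels))
    approx = padded
    for _ in range(levels):
        # Analysis: low-pass filter and downsample by 2.
        approx = [(approx[i] + approx[i + 1]) / SQRT2
                  for i in range(0, len(approx), 2)]
    for _ in range(levels):
        # Synthesis with every detail coefficient set to zero.
        approx = [a / SQRT2 for a in approx for _copy in range(2)]
    return approx[:n]

def gene_score(class1_values, class2_values, levels=3):
    """Score = abs(mean(class1) - mean(class2)) on the smoothed profile."""
    smooth = haar_approximation(list(class1_values) + list(class2_values), levels)
    n1 = len(class1_values)
    mean1 = sum(smooth[:n1]) / n1
    mean2 = sum(smooth[n1:]) / len(class2_values)
    return abs(mean1 - mean2)

def rank_genes(expression, n1, levels=3):
    """Gene names sorted by descending score; the first n1 values are class 1."""
    scores = {gene: gene_score(values[:n1], values[n1:], levels)
              for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: 8 samples per class, 2 genes.
toy_expression = {
    "geneA": [1.0] * 8 + [5.0] * 8,                          # separates the classes
    "geneB": [3.0, 2.9, 3.1, 3.0, 3.0, 3.1, 2.9, 3.0] * 2,   # does not
}
ranking = rank_genes(toy_expression, n1=8)
```

With the Haar wavelet, keeping only the level-3 approximation replaces each run of 8 consecutive samples by its mean, so the score reflects coarse differences between the classes rather than sample-to-sample fluctuations.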