Chapter 14
Learning Methodologies for Detection and Classification of Mutagens
Huma Lodhi
Imperial College London, UK
ABSTRACT
Predicting mutagenicity is a complex and challenging problem in chemoinformatics. The Ames test is a biological method for assessing the mutagenicity of molecules. The dynamic growth in the repositories of molecules establishes a need to develop and apply effective and efficient computational techniques to solve chemoinformatics problems such as the identification and classification of mutagens. Machine learning methods provide effective solutions to such problems. This chapter presents an overview of the learning techniques that have been developed and applied to the problem of identifying and classifying mutagens.
INTRODUCTION
Mutagenicity is an unfavorable characteristic of drugs that can cause adverse effects. In chemoinformatics, it is crucial to develop and design effective and efficient computational tools to identify toxic and mutagenic molecules. Accurate prediction of mutagenicity will not only accelerate the process of finding quality lead molecules but will also reduce potential drug attrition. In recent years, considerable effort has been devoted to developing, analyzing and applying statistical and relational learning techniques to identify undesirable biological effects such as mutagenicity.
DOI: 10.4018/978-1-61520-911-8.ch014
Mutagens produce mutations in DNA and may or may not cause cancer. However, the use of drugs that are characterized by mutagenicity but not carcinogenicity is not recommended (Debnath, Compadre, Debnath, Schusterman, & Hansch, 1991). The Ames test (Ames, Lee, & Durston, 1973) is viewed as a biological means to identify mutagenic molecules. In this test, a bacterium, generally Salmonella typhimurium, is used to categorize mutagens and non-mutagens. Novel molecules are exposed to a strain of the bacterium that lacks the ability to produce the amino acid histidine. Growth of the bacterial culture indicates mutations in the DNA, and the molecule is therefore classified as a mutagen. Figure 1 shows a mutagenic molecule. Machine learning methods and techniques provide an accurate, useful and efficient means to classify mutagens. In this chapter we present an overview of a number of techniques that have been developed and applied to the problem of predicting mutagenicity. The review presented in this chapter is not exhaustive; it outlines recent research and seminal work.
Copyright © 2011, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
BACKGROUND
In machine learning, the problem of recognizing and identifying mutagens is generally solved by viewing it as a classification problem. Methods ranging from Inductive Logic Programming (ILP) techniques to kernel based methods (KMs) have been developed and applied to mutagenicity classification. The mutagenesis dataset presented by Debnath et al. (1991) is a benchmark dataset on which the efficacy of learning methods has been evaluated. We therefore present an overview of the techniques that have been applied to this dataset.
Figure 1. An example of a mutagenic molecule
The mutagenesis dataset comprises 230 molecules trialled for mutagenicity on Salmonella typhimurium. Debnath et al. (1991) showed that a subset of 188 molecules is learnable using linear regression. This subset was later termed the “regression friendly” dataset (hereafter referred to as the mutagenesis dataset). The remaining 42 molecules are named the “regression unfriendly” subset. Of the 188 molecules, 125 have positive log mutagenicity whereas 63 have zero or negative log mutagenicity. Debnath et al. identified two chemical features, C, and two structural (indicator) variables, I, for predicting mutagenicity. The chemical features are the lowest unoccupied molecular orbital (LUMO) and the water/octanol partition coefficient (LOGP). The two indicator variables, IN1 and IN2, are binary structural variables: IN1 encodes the fused ring count and IN2 marks the examples of acenthrylenes. IN1 is assigned the value “1” if a molecule has 3 or more fused rings and “0” if it has fewer than 3 fused rings. Similarly, IN2 is set to 1 for the 5 examples of acenthrylenes and to 0 otherwise. On the basis of a linear regression based quantitative structure-activity relationship analysis, Debnath et al. suggested that the mutagenicity of molecules that are aromatic nitro compounds is characterized by hydrophobicity, nitro groups in conjunction with electron attracting elements, and 3 or more fused rings.
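The encoding described above can be illustrated with a minimal Python sketch. It builds the four-element feature vector (LUMO, LOGP, IN1, IN2) and derives a binary class label from the sign of log mutagenicity, assuming that positive log mutagenicity marks the mutagenic class. The record type, field names and numeric values are hypothetical and chosen only for illustration; they are not taken from the dataset or from Debnath et al.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    """Hypothetical record; field names are illustrative, not from the dataset files."""
    lumo: float              # chemical feature: LUMO value
    logp: float              # chemical feature: water/octanol partition coefficient
    fused_rings: int         # number of fused rings in the molecule
    is_acenthrylene: bool    # True for the acenthrylene examples
    log_mutagenicity: float  # measured log mutagenicity


def indicator_in1(mol: Molecule) -> int:
    """IN1 is 1 if the molecule has 3 or more fused rings, otherwise 0."""
    return 1 if mol.fused_rings >= 3 else 0


def indicator_in2(mol: Molecule) -> int:
    """IN2 is 1 for the acenthrylene examples, otherwise 0."""
    return 1 if mol.is_acenthrylene else 0


def feature_vector(mol: Molecule) -> list:
    """Two chemical features followed by the two indicator variables."""
    return [mol.lumo, mol.logp, indicator_in1(mol), indicator_in2(mol)]


def class_label(mol: Molecule) -> int:
    """Assumed labelling: 1 for positive log mutagenicity, 0 for zero or negative."""
    return 1 if mol.log_mutagenicity > 0 else 0


# Made-up values for illustration only; these are not entries from the dataset.
toy_molecules = [
    Molecule(lumo=-1.5, logp=2.0, fused_rings=3, is_acenthrylene=False, log_mutagenicity=1.2),
    Molecule(lumo=-1.0, logp=1.5, fused_rings=1, is_acenthrylene=False, log_mutagenicity=-0.4),
    Molecule(lumo=-1.8, logp=4.0, fused_rings=4, is_acenthrylene=True, log_mutagenicity=0.0),
]

X = [feature_vector(m) for m in toy_molecules]
y = [class_label(m) for m in toy_molecules]
```

Feature vectors and labels in this form could be passed to any standard classifier or regression routine. Under this assumed labelling, the 188-molecule subset splits into 125 positive and 63 negative examples.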
Srinivasan, Muggleton, King, and Sternberg (1996) introduced more features for the mutagenesis dataset by exploiting atom-bond connectivities and using first-order logic. The key information is given in the form of an atom and bond, AB, description. Furthermore, the atom and bond description is used to define functional groups, FG, including