The IMAGETOX 1 data set.
This data set (an Excel file) contains toxicity data for chemical compounds, together with chemical
descriptors for each compound. The target organism is a fish, the fathead minnow (Pimephales
promelas).
The toxicity data are from the US Environmental Protection Agency (1), while the chemical
descriptors were calculated by the Mario Negri Institute.
The data set is structured as follows.
The first columns (in blue) identify the compounds:
• ID: the number in the EPA list. The original EPA data set is larger; for instance, it contains
compounds without toxicity values. I removed these compounds but kept the original ID numbers
from the EPA list.
• NAME: the chemical name
• CAS: the CAS number
The following five columns (in red) are from EPA:
• SMILES: the SMILES structure (we corrected some wrong structures)
• CODE: the code assigned by EPA to each compound. It refers to chemical classes, according to a
classification defined by EPA and reported in a separate table (Table 1), attached. This may be
useful if you want to focus on certain classes.
• MW: the molecular weight according to the EPA list (for the salt or hydrate). Since it is not
correct to model the toxicity of water or counter ions, the descriptors were calculated for the free
compounds. Consequently, the molecular weight was recalculated on this basis and is given in the
descriptor section (black columns); the value here is the original EPA value.
• LogP2: the log P reported by EPA. In some cases it is measured, in others calculated (so it is
not a 'pure' parameter). It can be used as a descriptor. However, we also calculated this
descriptor and reported it in the descriptor section (black columns).
• MoA: the code for the mode of action; see Table 2 for the definitions.
The following four columns (in green) show the toxicity values:
• the dose expressed in mg/l
• the dose expressed in mmol/l (the value in mg/l divided by the MW given in the black columns)
• the dose in mmol/l expressed as -log
• the toxicity expressed as a class, according to the EC classification
All the other columns (in black) are the descriptors of the chemicals.
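The unit conversions behind the green columns can be sketched as follows. This is only an
illustration, not the procedure actually used to build the file: the function names are
hypothetical, and the logarithm is assumed to be base 10, since the description only says "-log".

```python
import math


def dose_to_mmol_per_l(dose_mg_per_l: float, mw_g_per_mol: float) -> float:
    """Convert a dose from mg/l to mmol/l by dividing by the molecular weight.

    mg/l divided by g/mol gives mmol/l. The MW should be the one of the free
    compound (black columns), not the original EPA value for the salt or hydrate.
    """
    if mw_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return dose_mg_per_l / mw_g_per_mol


def neg_log_dose(dose_mmol_per_l: float) -> float:
    """Return -log10 of a dose in mmol/l (base 10 is an assumption)."""
    if dose_mmol_per_l <= 0:
        raise ValueError("dose must be positive")
    return -math.log10(dose_mmol_per_l)


# Hypothetical compound: dose of 10 mg/l, molecular weight of 100 g/mol.
dose_mmol = dose_to_mmol_per_l(10.0, 100.0)
neg_log = neg_log_dose(dose_mmol)
```

On this scale a larger value means a more toxic compound, because a smaller dose gives a larger
negative logarithm.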
Table 1. Definition of chemical classification codes presented in the CODE field of the
FHM_MOA.DBF file.
CODE   Chemical class
1.0    Alkanes
1.1    Alkenes
2.0    Saturated Hydrocarbons
2.1    Unsaturated Hydrocarbons
3.0    Basic Ethers
3.1    Diphenyl Ethers
3.3    Cyclic Ethers
4.0    Basic Alcohols
4.1    Alkene Alcohols
4.2    Alkyne Alcohols
4.3    Diols
5.0    Aldehydes
6.0    Basic Ketones
6.1    beta-Diketones
6.2    Cyclic Ketones
7.0    Carboxylic Acids
8.0    Basic Esters
8.1    Phthalates
8.2    Amides
8.3    Acrylates
9.0    Nitriles
10.0   Primary, aliphatic amines
10.1   Secondary, aliphatic amines
10.2   Tertiary, aliphatic amines
10.3   Primary, aromatic amines
10.4   Secondary, aromatic amines
10.5   Tertiary, aromatic amines
11.1   Azine compounds
12.0   Thiols
12.1   Sulfides
12.2   Disulfides
12.3   Sulfo compounds
13.0   Benzenes
13.1   Chlorinated Benzenes
14.0   Phenols
14.1   Chlorinated Phenols
15.0   Piperazines
15.2   Pyrimidines
15.3   Pyridines
15.4   Triazines
15.5   5-Membered ring aliphatics
15.6   5-Membered ring aromatics
16.0   Multiple hetero-atom compounds
17.0   Heterocyclic sulfur compounds
18.0   Anilides and Ureas
19.0   Phosphorous compounds
20.0   Quaternary ammonium compounds
21.0   Carbamates
22.0   Other pesticides
23.0   Barbitals
23.1   DEAS-complex structures
24.0   Organometallics
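To focus on certain classes, rows can be selected by their CODE value. The sketch below is a
minimal illustration with hypothetical compound names and function names. It assumes that the
integer part of the code groups related classes (as it appears to for the 4.x alcohols and the
10.x amines), which the EPA classification does not state explicitly. Codes are kept as text so
that values such as "10.0" are not altered by numeric conversion.

```python
def class_family(code: str) -> int:
    """Return the integer part of a CODE value, e.g. "4.3" -> 4."""
    return int(code.split(".")[0])


def select_by_family(rows, family):
    """Keep the rows whose CODE starts with the given integer part."""
    return [row for row in rows if class_family(row["CODE"]) == family]


# Hypothetical rows; only the CODE values come from Table 1.
rows = [
    {"NAME": "hypothetical compound A", "CODE": "4.0"},   # Basic Alcohols
    {"NAME": "hypothetical compound B", "CODE": "4.3"},   # Diols
    {"NAME": "hypothetical compound C", "CODE": "14.1"},  # Chlorinated Phenols
]

alcohols = select_by_family(rows, 4)
```

To select a single class rather than a family, compare the CODE text directly, for example
row["CODE"] == "14.1".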
Table 2. Definition of mode of action codes used in MOA_NUM and MOA fields from
the FHM_MOA.DBF.
MOA_NUM  MOA            Definition
1.1      NARCOSIS_I_1   Narcosis I MOA with an A level of confidence
1.2      NARCOSIS_I_2   Narcosis I MOA with a B level of confidence
1.3      NARCOSIS_I_3   Narcosis I MOA with a C level of confidence
1.4      NARCOSIS_I_4   Narcosis I MOA with a D level of confidence
2.1      NARCOSIS_II_1  Narcosis II MOA with an A level of confidence
2.2      NARCOSIS_II_2  Narcosis II MOA with a B level of confidence
2.3      NARCOSIS_II_3  Narcosis II MOA with a C level of confidence
2.4      NARCOSIS_II_4  Narcosis II MOA with a D level of confidence
3.1      UNCOUPLER_1    Uncoupler of oxidative phosphorylation MOA with an A level of confidence
3.2      UNCOUPLER_2    Uncoupler of oxidative phosphorylation MOA with a B level of confidence
3.3      UNCOUPLER_3    Uncoupler of oxidative phosphorylation MOA with a C level of confidence
5.1      ACHE_1         Acetylcholinesterase Inhibition MOA with an A level of confidence
6.1      BLOCK_1        Respiratory blocker/inhibitor MOA with an A level of confidence
6.3      BLOCK_3        Respiratory blocker/inhibitor MOA with a C level of confidence
7.1      REACTIVE_1     Electrophile/proelectrophile reactivity MOA with an A level of confidence
7.2      REACTIVE_2     Electrophile/proelectrophile reactivity MOA with a B level of confidence
7.3      REACTIVE_3     Electrophile/proelectrophile reactivity MOA with a C level of confidence
7.4      REACTIVE_4     Electrophile/proelectrophile reactivity MOA with a D level of confidence
8.1      NEUROTOX_1     Central nervous system seizure/stimulant MOA with an A level of confidence
10.1     NARCO_ESTER_1  Narcosis III MOA with an A level of confidence
10.2     NARCO_ESTER_2  Narcosis III MOA with a B level of confidence
10.3     NARCO_ESTER_3  Narcosis III MOA with a C level of confidence
10.4     NARCO_ESTER_4  Narcosis III MOA with a D level of confidence
12.1     NARCO_I&II_1   Identified as both Narcosis I & II MOA with an A level of confidence
12.2     NARCO_I&II_2   Identified as both Narcosis I & II MOA with a B level of confidence
12.3     NARCO_I&II_3   Identified as both Narcosis I & II MOA with a C level of confidence
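In Table 2 the part of MOA_NUM before the point identifies the mode of action, and the digit
after the point gives the confidence level (1 = A, 2 = B, 3 = C, 4 = D). A minimal sketch of a
decoder based on that pattern follows. The names are hypothetical, and MOA_NUM is treated as
text on the assumption that it is read from the file without numeric conversion.

```python
from typing import Tuple

# Digit after the point -> confidence level, as read from Table 2.
CONFIDENCE = {"1": "A", "2": "B", "3": "C", "4": "D"}

# Part before the point -> mode of action, as read from Table 2.
MOA_FAMILY = {
    "1": "Narcosis I",
    "2": "Narcosis II",
    "3": "Uncoupler of oxidative phosphorylation",
    "5": "Acetylcholinesterase inhibition",
    "6": "Respiratory blocker/inhibitor",
    "7": "Electrophile/proelectrophile reactivity",
    "8": "Central nervous system seizure/stimulant",
    "10": "Narcosis III",
    "12": "Both Narcosis I and II",
}


def decode_moa_num(moa_num: str) -> Tuple[str, str]:
    """Split a MOA_NUM such as "7.2" into (mode of action, confidence level)."""
    family, _, level = moa_num.partition(".")
    return MOA_FAMILY[family], CONFIDENCE[level]


moa, confidence = decode_moa_num("7.2")
```

The decoder also accepts combinations that Table 2 does not list (for example "5.2"), so it
should not be used to check whether a code is valid. Filtering on the confidence level is one way
to keep only the compounds whose mode of action is well established.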
(1) Russom C.L., Bradbury S.P., Broderius S.J., Hammermeister D.E., Drummond R.A. 1997.
Predicting modes of toxic action from chemical structure: acute toxicity in the fathead minnow
(Pimephales promelas). Environ Toxicol Chem 16: 948-967.