Supplementary material
Improved model for fullerene C60 solubility in organic solvents based on quantum-chemical
and topological descriptors.
Tetyana Petrova1, Bakhtiyor F Rasulev1,*, Andrey A Toropov3, Danuta Leszczynska2 and Jerzy Leszczynski1
1 Interdisciplinary Center for Nanotoxicity, Department of Chemistry and Biochemistry, Jackson State University, 1400 J. R. Lynch Street, P. O. Box 17910, Jackson, MS 39217, USA
2 Department of Civil and Environmental Engineering, Jackson State University, 1400 J. R. Lynch Street, P. O. Box 17910, Jackson, MS 39217, USA
3 Istituto di Ricerche Farmacologiche Mario Negri, 20156, Via La Masa 19, Milano, Italy
*) Corresponding author: Tel: 601-979-4114, fax: 601-979-7823
E-mail address: [email protected]
Table S1.
List of descriptors involved in the models 1-5.
N
Solvent name
1
2
3
pentane
hexane
octane
4
5
6
isooctane
decane
dodecane
7
8
9
10
11
12
13
14
15
cis-decahydronaphthalene
trans-decahydronaphthalene
cyclopentyl bromide
cyclohexyl chloride
cyclohexyl bromide
cyclohexyl iodide
CAS No.
X1sol
J3D
109-66-0
2.414
2.914
3.914
3.770
5.467
5.753
6.151
6.582
493-01-6
4.914
5.914
4.966
6.404
6.577
3.173
493-02-7
4.966
3.154
137-43-9
3.471
3.683
3.971
4.260
4.959
3.207
3.541
3.524
3.508
3.452
3.000
3.394
3.805
3.242
3.904
4.199
2.121
3.000
2.828
3.464
2.500
2.475
2.121
2.475
3.797
2.243
1.783
2.126
1.849
2.417
2.183
3.537
3.485
3.029
Training set
-0.308
0.420
-0.302
0.414
-0.293
0.406
-0.294
0.404
-0.288
0.400
-0.284
0.397
-0.274
0.388
-0.272
0.384
-0.264
0.250
-0.281
0.290
-0.262
0.248
-0.246
0.210
-0.264
0.229
-0.235
0.261
-0.286
0.403
-0.282
0.395
-0.316
0.280
-0.337
0.229
-0.289
0.225
-0.292
0.190
-0.258
0.216
-0.297
0.244
-0.271
0.254
-0.253
0.215
-0.326
0.261
3.382
3.215
4.403
4.324
4.395
4.356
4.319
4.244
4.226
4.175
4.079
3.976
4.072
-0.300
-0.316
-0.289
-0.251
-0.286
-0.266
-0.250
-0.296
-0.279
-0.272
-0.239
-0.264
-0.288
5.247
-0.287
110-54-3
111-65-9
26635-64-3
124-18-5
112-40-3
542-18-7
108-85-0
626-62-0
5401-62-7
110-83-8
16
17
18
19
20
21
22
23
24
1,2-dibromocyclohexane
cyclohexene
methylcyclohexane
trans-1,2-dimethylcyclohexane
dichloromethane
carbon tetrachloride
dibromomethane
bromoform
iodomethane
bromochloromethane
bromoethane
iodoethane
25
26
27
28
29
30
31
32
33
34
35
36
37
1,1,2,2-tetrachloroethane
1,2-dichloroethane
1,1,1-trichloroethane
1-chloropropane
1-iodopropane
2-chloropropane
2-bromopropane
2-iodopropane
1,2-dichloropropane
1,3-dichloropropane
1,2-dibromopropane
1,3-diiodopropane
1,2,3-tribromopropane
79-34-5
38
1,2,3-trichloropropane
96-18-4
2.621
2.750
2.268
2.975
2.021
2.309
2.598
2.912
3.121
3.555
4.536
4.800
3.804
513-36-0
2.624
39
1-chloro-2-methylpropane
108-87-2
6876-23-9
75-09-2
56-23-5
74-95-3
75-25-2
74-88-4
74-97-5
74-96-4
75-03-6
107-06-2
71-55-6
540-54-5
107-08-4
75-29-6
75-26-3
75-30-9
78-87-5
142-28-9
78-75-1
627-31-6
96-11-7
HOMO, eV
HOMO-LUMO gap
0.283
0.248
0.297
0.215
0.294
0.250
0.211
0.280
0.280
0.234
0.197
0.212
0.257
0.297
TI2
FDI
H052
X3
AMW
nHAcc
2.094
2.488
3.284
2.999
0.956
0.970
0.993
0.971
0
0
0
0
0.707
0.957
1.457
1.385
4.250
4.310
4.390
4.390
0
0
0
0
4.086
4.891
1.047
1.000
1.000
0.972
0
0
0
1.957
2.457
3.466
4.450
4.480
4.940
0
0
0
1.047
0.977
0
3.466
4.940
0
0.956
0.975
0.975
0.975
0.854
1.000
0.986
0.993
1.000
1.000
4
4
4
4
4
1.644
1.894
1.894
1.894
2.54
9.940
6.590
9.060
11.67
13.44
0
0
0
0
0
0.667
0.975
0.854
0.979
0.951
0.947
0
0
0
1.5
1.894
2.54
5.140
4.680
4.680
0
0
0
1.333
0.800
1.333
1.000
1.000
1.333
1.333
1.333
1.521
1.000
1.000
1.000
1.000
1.000
1.000
0.983
1.000
1.000
0
0
0
0
0
0
3
3
0
0
0
0
0
0
0
0
0
1.333
16.99
30.76
34.77
50.54
28.39
25.88
13.62
19.50
20.98
0
0
0
0
0
0
0
0
0
1.707
0.800
1.707
1.707
1.000
1.000
1.000
1.542
2.094
1.542
2.094
1.745
1.745
1.000
1.000
0.964
0.990
0.968
0.982
0.996
1.000
1.000
1.000
1.000
1.000
1.000
0
0
2
2
6
6
6
3
0
3
0
0
0
0.5
0
0.5
0.5
0
0
0
0.816
0.707
0.816
0.707
1.394
1.394
12.37
16.67
7.140
15.45
7.140
11.180
15.45
10.27
10.27
18.35
26.900
25.53
13.40
0
0
0
0
0
0
0
0
0
0
0
0
0
1.542
0.961
1
0.816
6.610
0
40
1-iodo-2-methylpropane
513-38-2
3.331
2.977
5.178
5.214
-0.250
-0.267
3.328
3.797
2.624
2.301
2.060
4.376
-0.263
-0.283
-0.268
526-73-8
3.000
3.805
3.788
4.215
2.527
3.063
2.940
3.319
-0.253
-0.234
-0.233
-0.231
95-63-6
4.198
3.192
-0.224
108-67-8
4.182
3.148
-0.230
527-53-7
4.609
3.465
-0.221
119-64-2
694-80-4
4.966
4.432
4.305
4.932
4.605
2.816
3.683
3.971
4.382
4.942
4.671
2.546
3.277
3.318
3.244
3.689
2.501
2.458
2.443
2.391
2.362
2.377
-0.233
-0.240
-0.240
-0.239
-0.240
-0.260
-0.257
-0.251
-0.264
-0.259
-0.259
108-37-2
4.654
2.378
-0.262
507-19-7
41
42
43
2-bromo-2-methylpropane
1,2-dibromoethylene
tetrachloroethylene
540-49-8
127-18-4
513-37-1
44
45
46
47
1-chloro-2-methylpropene
benzene
1,2-dimethylbenzene
1,3-dimethylbenzene
48
1,2,3-trimethylbenzene
49
1,2,4-trimethylbenzene
50
63
1,3,5-trimethylbenzene
1,2,3,5-tetramethylbenzene
tetralin
n-propylbenzene
iso-propylbenzene
n-butylbenzene
tert-butylbenzene
fluorobenzene
chlorobenzene
bromobenzene
1,2-dichlorobenzene
1,3-dibromobenzene
1-bromo-2-chlorobenzene
1-bromo-3-chlorobenzene
64
65
66
67
68
69
70
71
72
73
1,2,4-trichlorobenzene
styrene
nitrobenzene
benzonitrile
anisole
benzaldehyde
phenyl isocyanate
3-nitrotoluene
thiophenol
benzyl bromide
74
75
trichlorotoluene
1-methylnaphthalene
51
52
53
54
55
56
57
58
59
60
61
62
76
77
78
79
80
81
dimethylnaphthalene
1-phenylnaphthalene
ethanol
1-butanol
1-pentanol
acetone
82
N,N-dimethylformamide
71-43-2
95-47-6
108-38-3
103-65-1
98-82-8
104-51-8
98-06-6
462-06-6
108-90-7
108-86-1
95-50-1
108-36-1
0.215
0.254
0.207
0.226
0.233
0.244
0.234
0.231
0.232
0.226
0.230
0.228
0.233
0.234
0.235
0.234
0.235
0.233
0.230
0.225
0.225
0.219
1.542
1.542
0.980
0.970
1
1
0.816
0.816
13.14
9.790
0
0
1.707
1.521
1.542
1.000
1.000
0.976
0
0
0
0.5
1.333
0.816
30.97
27.64
7.550
0
0
0
0.667
0.854
1.069
0.950
1.000
0.991
1.000
0.987
0
0
0
0
1.5
2.54
2.199
3.114
6.510
5.900
5.900
5.720
0
0
0
0
1.060
0.998
0
2.86
5.720
0
0.950
1.000
0
2.414
5.720
0
0.962
0.993
0
3.343
5.590
0
1.047
2.036
1.659
2.593
1.726
0.975
0.975
0.975
0.854
1.069
0.854
1.000
0.965
0.981
0.994
0.973
1.000
1.000
1.000
1.000
1.000
1.000
0
0
0
0
0
0
0
0
0
0
0
3.466
2.422
2.593
2.691
2.83
1.894
1.894
1.894
2.54
2.199
2.54
6.010
5.720
5.720
5.590
5.590
8.010
9.380
13.08
12.25
19.66
15.95
0
0
0
0
0
1
0
0
0
0
0
1.069
1.000
0
2.199
15.95
0
1.060
1.000
0
2.86
15.12
0
1.481
1.659
1.481
1.481
1.481
2.036
1.686
0.975
1.481
0.962
1.000
1.000
1.000
0.988
1.000
1.000
1.000
1.000
1.000
1.000
0
0
0
0
0
0
0
0
0
0
2.302
2.593
2.302
2.302
2.302
2.422
2.92
1.894
2.302
3.343
6.510
8.790
7.930
6.760
7.580
8.510
8.070
8.480
11.40
13.03
0
3
1
1
1
2
3
0
0
0
1.075
1.164
1.000
1.000
0
0
3.933
4.534
6.770
6.510
0
0
2.180
1.333
2.094
2.488
1.000
1.542
1.000
0.911
0.953
0.967
0.935
0.930
0
3
2
2
6
0
5.886
0
0.707
0.957
0
0.816
7.300
5.120
4.940
4.900
5.810
6.090
0
1
1
1
1
2
0.221
0.221
120-82-1
5.064
2.329
-0.270
100-42-5
3.932
4.305
3.932
3.932
3.932
4.432
4.698
3.683
4.639
5.475
2.704
2.614
2.502
2.712
2.613
2.473
2.831
2.484
2.690
2.591
-0.228
-0.291
-0.275
-0.223
-0.263
-0.249
-0.278
-0.240
-0.255
-0.270
5.377
5.788
2.311
2.439
-0.215
-0.210
7.949
1.414
2.414
2.914
1.732
2.270
1.973
3.932
5.117
5.488
3.997
4.553
-0.214
-0.266
-0.266
-0.266
-0.253
-0.246
98-95-3
100-47-0
100-66-3
100-52-7
103-71-9
99-08-1
108-98-5
100-39-0
30583-33-6
90-12-0
28804-88-8
605-02-7
64-17-5
71-36-3
71-41-0
67-64-1
68-12-2
0.215
0.186
0.172
0.212
0.213
0.183
0.207
0.162
0.215
0.200
0.220
0.172
0.170
0.165
0.339
0.336
0.337
0.224
0.255
83
84
85
86
87
88
89
90
91
tetrahydrothiophene
thiophene
2-methylthiophene
N-methyl-2-pyrrolidone
pyridine
quinoline
aniline
N-methylaniline
N,N-dimethylaniline
92
1,5,9-cyclododecatriene
110-01-0
110-02-1
554-14-3
872-50-4
110-86-1
91-22-5
62-53-3
100-61-8
121-69-7
4904-61-4
3.000
3.000
3.348
3.305
3.000
4.966
3.394
3.932
4.305
6.000
2.887
2.129
2.329
3.108
2.413
2.079
2.651
2.840
3.157
4.049
-0.213
-0.245
-0.231
-0.238
-0.251
-0.240
-0.196
-0.189
-0.184
-0.232
0.245
0.219
0.213
0.260
0.214
0.177
0.196
0.191
0.188
0.247
0.579
0.579
0.956
0.918
0.667
1.047
0.975
1.481
1.659
1.244
1.000
1.000
1.000
0.993
1.000
1.000
1.000
0.989
0.976
0.948
4
0
3
2
0
0
0
0
0
0
1.25
1.25
1.644
2.29
1.5
3.466
1.894
2.302
2.593
3
6.780
9.350
8.180
6.200
7.190
7.600
6.650
6.300
6.060
5.410
0
0
0
2
1
1
1
1
1
0
Test set
1
2
3
4
5
6
7
8
9
tetradecane
cyclohexane
1-methyl-1-cyclohexene
629-59-4
cis-1,2-dimethylcyclohexane
ethylcyclohexane
6.914
3.000
3.394
6.699
3.605
3.399
-0.281
-0.290
-0.224
0.394
0.425
0.258
5.698
0.667
0.975
1.000
0.963
0.982
0
0
0
2.957
1.500
1.894
4.510
4.680
5.060
0
0
0
2207-01-4
3.805
4.154
-0.282
0.395
0.854
0.953
0
2.540
4.680
0
1678-91-7
67-66-3
3.932
3.936
-0.283
0.400
1.481
0.966
0
2.302
4.680
0
78-77-3
2.598
3.328
2.621
3.828
2.977
1.993
3.292
4.366
4.155
5.219
-0.326
-0.276
-0.269
-0.257
-0.268
0.253
0.237
0.254
0.236
0.255
1.000
1.707
1.707
2.094
1.542
1.000
1.000
0.976
1.000
0.972
0
0
2
0
1
0.000
0.500
0.500
0.707
0.816
23.87
23.48
11.18
18.35
9.790
0
0
0
0
0
2-chloro-2-methylpropane
507-20-0
2.250
5.237
-0.282
0.285
0.800
0.966
9
0.000
6.610
0
2-iodo-2-methylpropane
558-17-8
2.750
5.175
-0.246
0.204
0.800
0.987
9
0.000
13.14
0
trichloroethylene
toluene
1,4-dimethylbenzene
1,2,3,4-tetramethylbenzene
ethylbenzene
sec-butylbenzene
iodobenzene
1,3-dichlorobenzene
1,2-dibromobenzene
2-nitrotoluene
benzyl chloride
1-chloronaphthalene
1-bromo-2-methylnaphthalene
1-propanol
1-hexanol
1-octanol
acrylonitrile
2-methoxyethyl ether
79-01-6
3.201
3.394
3.788
4.626
2.213
2.746
2.895
3.530
-0.280
-0.241
-0.230
-0.222
0.233
0.234
0.225
0.227
1.542
0.975
1.140
0.984
1.000
1.000
1.000
0.985
0
0
0
0
0.816
1.894
2.305
3.702
21.90
6.140
5.900
5.590
0
0
0
0
3.932
4.843
4.260
4.365
4.959
4.715
4.285
5.666
6.365
2.991
3.514
2.425
2.392
2.388
2.927
2.721
2.137
2.208
-0.241
-0.240
-0.243
-0.267
-0.246
-0.268
-0.253
-0.220
-0.215
0.235
0.234
0.206
0.225
0.221
0.183
0.225
0.172
0.170
1.481
1.964
0.975
1.069
0.854
1.563
1.481
1.075
1.164
0.990
0.979
1.000
1.000
1.000
1.000
1.000
1.000
1.000
0
0
0
0
0
0
0
0
0
2.302
3.099
1.894
2.199
2.540
3.034
2.302
3.933
4.534
5.900
5.590
17.00
12.25
19.66
8.070
8.440
9.030
10.53
0
0
0
0
0
3
0
0
0
1.914
3.414
4.414
1.914
4.414
4.733
5.867
6.244
2.828
5.224
-0.262
-0.262
-0.261
-0.290
-0.251
0.338
0.338
0.338
0.233
0.329
1.707
2.885
3.685
1.707
3.685
0.919
0.965
0.987
0.944
0.967
2
2
2
0
0
0.500
1.207
1.707
0.500
1.707
5.010
4.870
4.820
7.580
5.830
1
1
1
1
3
chloroform
1,2-dibromoethane
1-bromopropane
1,3-dibromopropane
1-bromo-2-methylpropane
110-82-7
591-49-1
106-93-4
106-94-5
109-64-8
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
108-88-3
106-42-3
488-23-3
100-41-4
135-98-8
591-50-4
541-73-1
583-53-9
88-72-2
100-44-7
90-13-1
2586-62-1
71-23-8
111-27-3
111-87-5
107-13-1
111-96-6
[Figure: plot of correlation coefficients for all models. Vertical axis: correlation coefficient, r2 (0.5 to 1). Horizontal axis: number of variables in the model (0 to 5).]
Figure S1. Comparative plot of correlation coefficient values for each model. ■ values for the training set; ▲ values for the test set.
Definitions of the descriptors used in the selected models.
1. X1sol is a topological descriptor, the solvation connectivity index (chi-1), which encodes the solvation properties of a compound (Todeschini and Consonni 2003). This molecular descriptor is defined in order to model solvation entropy and dispersion interactions in solution. It relates the characteristic dimension of the molecule to atomic parameters (principal quantum number, bond indices, etc.). The two-dimensional descriptor X1sol was proposed in 1991 by the group of Zefirov and Palyulin (Antipin 1991) in order to treat the enthalpies of non-specific solvation.
The descriptor is defined by the following equations. When the characteristic dimensions of the molecule are expressed through atomic parameters, the mth-order index is defined as:
$$ X_{m}^{sol} = \frac{1}{2^{m+1}} \sum_{k=1}^{K} \frac{\left( \prod_{a=1}^{n} L_a \right)_k}{\left( \prod_{a=1}^{n} \delta_a \right)_k^{1/2}} $$
where La is the principal quantum number (2 for C, N, O atoms, 3 for Si, S, Cl, etc.) of the ath atom in the kth subgraph; δa is the corresponding vertex degree; K is the total number of mth-order subgraphs and n is the number of vertices in the subgraph. The normalization factor 1/2^(m+1) is defined in such a way that the indices Xm and Xmsol coincide for compounds that contain only second-row atoms. The 1st order solvation connectivity index is
$$ X_{1}^{sol} = \frac{1}{4} \sum_{b=1}^{B} \frac{(L_i \cdot L_j)_b}{(\delta_i \cdot \delta_j)_b^{1/2}} $$
where b runs over all the B bonds; Li and Lj are the principal quantum numbers of the two vertices joined by the bond considered, and δi and δj are their vertex degrees. The positive coefficient of X1sol in the models indicates that an increase in the descriptor value results in an increase in the solubility of C60 in the solvent considered.
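As an illustration of the first-order formula, the following minimal Python sketch evaluates X1sol for a hydrogen-depleted graph given as a list of element symbols and a bond list. The function name `x1sol`, the input format and the small `PRINCIPAL_QN` lookup table are assumptions made for this example, not part of the original work; the values for Br and I (4 and 5) simply continue the rule stated for second- and third-row atoms.

```python
from math import sqrt

# Principal quantum numbers for a few elements (illustrative subset only).
PRINCIPAL_QN = {"C": 2, "N": 2, "O": 2, "F": 2,
                "Si": 3, "S": 3, "Cl": 3, "Br": 4, "I": 5}


def x1sol(elements, bonds):
    """First-order solvation connectivity index of an H-depleted graph.

    elements: element symbol of each heavy atom.
    bonds: (i, j) index pairs of bonded heavy atoms.
    """
    # Vertex degree = number of heavy-atom neighbours.
    degree = [0] * len(elements)
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    total = 0.0
    for i, j in bonds:
        li = PRINCIPAL_QN[elements[i]]
        lj = PRINCIPAL_QN[elements[j]]
        total += (li * lj) / sqrt(degree[i] * degree[j])
    return total / 4.0


pentane = x1sol(["C"] * 5, [(0, 1), (1, 2), (2, 3), (3, 4)])
dichloromethane = x1sol(["Cl", "C", "Cl"], [(0, 1), (1, 2)])
print(round(pentane, 3), round(dichloromethane, 3))  # 2.414 2.121
```

For pentane, which contains only second-row atoms, the result equals the ordinary chi-1 index, as the normalization factor is meant to ensure. Both printed values also appear among the X1sol entries of Table S1.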
2. J3D is the 3D-Balaban index, a geometrical descriptor (Todeschini and Consonni 2003). The Balaban index describes the distance connectivity of the molecule as the average distance sum connectivity. J3D is derived from the geometry distance matrix (hence, it is a 3D descriptor).
The geometry matrix G is a square symmetric matrix whose ijth entry r_ij is the Euclidean distance between the ith and the jth atoms. The geometric distance degree σi is the ith row sum in the geometry matrix G for each i, that is,
$$ \sigma_i = \sum_{j=1}^{A} r_{ij} $$
where A is the number of atoms. Now J3D can be defined as follows:
$$ J3D = \frac{B}{C+1} \sum_{b} \left( \sigma_i \cdot \sigma_j \right)_b^{-1/2} $$
where σi and σj are the geometric distance degrees of the two adjacent atoms i and j connected by the bond b, the sum runs over all the bonds b in the molecule, B is the total number of bonds in the molecule, and C is the cyclomatic number.
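The definition can be sketched in a few lines of Python, assuming the usual Balaban-type form J3D = B/(C+1) · Σ_b (σi·σj)^(-1/2), with σi the row sums of the geometry matrix. The function name `j3d` and the input format (Cartesian coordinates plus a bond list) are hypothetical. The cyclomatic number is taken as C = B - A + 1, which holds only for a single connected molecule, and whether hydrogen atoms are included depends on the descriptor software and is not stated here.

```python
from math import dist, sqrt


def j3d(coords, bonds):
    """3D-Balaban index from Cartesian coordinates and a bond list.

    coords: one (x, y, z) tuple per atom.
    bonds: (i, j) index pairs of bonded atoms.
    Assumes one connected molecule, so that C = B - A + 1.
    """
    n_atoms = len(coords)
    n_bonds = len(bonds)
    cyclomatic = n_bonds - n_atoms + 1
    # Geometric distance degree: row sums of the geometry matrix G.
    sigma = [sum(dist(coords[i], coords[j]) for j in range(n_atoms))
             for i in range(n_atoms)]
    total = sum(1.0 / sqrt(sigma[i] * sigma[j]) for i, j in bonds)
    return n_bonds / (cyclomatic + 1) * total


# Toy geometry: three collinear atoms with unit spacing (not a real molecule).
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(round(j3d(chain, [(0, 1), (1, 2)]), 4))  # 1.633
```

In the toy chain the distance degrees are 3, 2 and 3, so each bond contributes 1/sqrt(6) and the prefactor B/(C+1) is 2.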
3. HOMO, LUMO and HOMO-LUMO gap are quantum-chemical descriptors: the energies of the Highest Occupied Molecular Orbital (HOMO) and the Lowest Unoccupied Molecular Orbital (LUMO), and the energy gap between them. These orbitals are called the frontier orbitals and determine the way the molecule interacts with other species. The HOMO is the orbital that can act as an electron donor, since it is the outermost (highest-energy) orbital containing electrons. The LUMO is the orbital that can act as an electron acceptor, since it is the innermost (lowest-energy) orbital that has room to accept electrons. The HOMO descriptor therefore describes the nucleophilic properties of the solvent, and the LUMO descriptor its electrophilic properties. The HOMO-LUMO gap reflects the reactivity of the compound: the smaller the value of the descriptor, the more reactive the compound. These descriptors can be calculated by various quantum-chemical methods.
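4. TI2 is a topological descriptor, the second Mohar index. The Mohar indices are derived from the Laplacian matrix of the molecular graph, that is, the vertex degree matrix minus the adjacency matrix (Todeschini and Consonni 2003; Mohar 1989).
The descriptors TI1 and TI2 are defined on the basis of the Laplacian spectrum:
$$ TI1 = 2N \log\left(\frac{Q}{N}\right) \sum_{i=1}^{N-1} \frac{1}{\lambda_i}, \qquad TI2 = \frac{4}{N \cdot \lambda_{N-1}} $$
where λi are the eigenvalues of the Laplacian matrix in decreasing order, so that λN-1 is the smallest non-zero eigenvalue, and N and Q are the numbers of vertices and edges of the graph. The indices serve as a measure of molecular branching.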
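A minimal sketch of the second Mohar index follows, assuming the handbook form TI2 = 4/(N·λ), where N is the number of vertices of the hydrogen-depleted graph and λ is the smallest non-zero eigenvalue of its Laplacian matrix (vertex degree matrix minus adjacency matrix). To keep the example dependency-free, the eigenvalues are obtained with a plain cyclic Jacobi iteration. The helper names `symmetric_eigenvalues` and `mohar_ti2` are hypothetical, and the graph is assumed to be connected.

```python
from math import sqrt


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation that zeroes the off-diagonal element a[p][q].
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                sign = 1.0 if theta >= 0 else -1.0
                t = sign / (abs(theta) + sqrt(theta * theta + 1.0))
                c = 1.0 / sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def mohar_ti2(n_atoms, bonds):
    """Second Mohar index of a connected H-depleted molecular graph."""
    lap = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    eig = symmetric_eigenvalues(lap)  # decreasing order; the last one is ~0
    return 4.0 / (n_atoms * eig[-2])


pentane = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(round(mohar_ti2(5, pentane), 3))  # 2.094
```

The values obtained for the carbon chains of pentane and hexane (2.094 and 2.488) also appear among the TI2 entries of Table S1, which suggests that this form of the index is the one tabulated.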
5. FDI is a geometrical descriptor, the folding degree index. It is defined as the largest eigenvalue obtained by diagonalization of the distance/distance matrix, which is then normalized by dividing by the number of atoms (Todeschini and Consonni 2003; Randic et al. 1994; Randic and Krilov 1999). The values of the descriptor lie in the range 0 < FDI ≤ 1. The descriptor converges to one for linear molecules (of infinite length) and decreases as the molecule becomes more folded. FDI can therefore be used as an indicator of the degree of departure of a molecule from strict linearity.
6. H-052 is an atom-centered fragment descriptor: H (hydrogen) attached to C(sp3) with one X (heteroatom) attached to the next C (Todeschini and Consonni 2003).
7. nHAcc is the number of acceptor atoms for H-bonds (N, O, F, etc.). This descriptor belongs to the functional group descriptors.
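A count of this kind can be sketched as follows. The sketch counts only N, O and F atoms, because these are the only acceptor types named above; anything else covered by "etc." is left out, which is an assumption of this example. The function name `n_hacc` and the element-list input are hypothetical.

```python
# Acceptor elements named in the definition; the set is an assumption
# of this sketch and may be narrower than the one used by the software.
H_BOND_ACCEPTORS = {"N", "O", "F"}


def n_hacc(elements):
    """Number of H-bond acceptor atoms in a list of element symbols."""
    return sum(1 for element in elements if element in H_BOND_ACCEPTORS)


nitrobenzene = ["C"] * 6 + ["N", "O", "O"]  # heavy atoms of C6H5NO2
ethanol = ["C", "C", "O"]
print(n_hacc(nitrobenzene), n_hacc(ethanol))  # 3 1
```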
8. X3 is the connectivity index chi-3, a topological descriptor (Todeschini and Consonni 2003). It belongs to the Kier-Hall connectivity indices, which are calculated from the hydrogen-depleted molecular graph (Kier and Hall 1986):
1. Connectivity index Chi-0 through Chi-5
2. Average Connectivity index Chi-0 through Chi-5
3. Valence Connectivity index Chi-0 through Chi-5
4. Average Valence Connectivity index Chi-0 through Chi-5
Connectivity indices Chi-0 through Chi-5 are defined as follows.
Chi-0: the connectivity index Chi-0 is defined as:
$$ {}^{0}\chi = \sum_{i=1}^{n} \delta_i^{-1/2} $$
where n is the number of nodes in the hydrogen-depleted graph and δi is the vertex degree of the ith atom, defined as the number of non-hydrogen neighbours in the molecular graph.
The average connectivity index Chi-0 is:
$$ {}^{0}\chi_{A} = \frac{{}^{0}\chi}{n} $$
Chi-1: the connectivity index Chi-1 is defined as:
$$ {}^{1}\chi = \sum_{b=1}^{B} \left( \delta_i \cdot \delta_j \right)_b^{-1/2} $$
where B is the number of bonds, the sum runs over all bonds in the hydrogen-depleted molecule, and for each bond δi·δj is the product of the vertex degrees of the end atoms i and j.
The average connectivity index Chi-1 is:
$$ {}^{1}\chi_{A} = \frac{{}^{1}\chi}{B} $$
Higher indices: the connectivity indices Chi-m for 2 ≤ m ≤ 5 are defined as:
$$ {}^{m}\chi = \sum_{k=1}^{K} \left( \prod_{i} \delta_i \right)_k^{-1/2} $$
where (Π δi)k is the product of the vertex degrees of the atoms that form the kth connected subgraph with m edges, and K is the total number of such distinct connected subgraphs of the H-depleted molecular graph.
For any m, 0 ≤ m ≤ 5, if the vertex degree δi is replaced by the valence vertex degree δi^v for each atom i in the connectivity index Chi-m, the valence connectivity indices Chi-m are obtained (Kier and Hall 1981; Kier and Hall 1983). That is,
$$ {}^{m}\chi^{v} = \sum_{k=1}^{K} \left( \prod_{i} \delta_i^{v} \right)_k^{-1/2} $$
where (Π δi^v)k is the product of the valence vertex degrees of the atoms that form the kth connected subgraph with m edges, and K is the total number of such distinct connected subgraphs of the H-depleted molecular graph.
The valence connectivity indices account for the presence of heteroatoms and of double and triple bonds.
The average valence connectivity index Chi-1 is defined similarly:
$$ {}^{1}\chi^{v}_{A} = \frac{{}^{1}\chi^{v}}{B} $$
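The simple (non-valence) indices can be illustrated with a short Python sketch. It makes one explicit simplification: only path subgraphs (chains of m consecutive bonds) are enumerated, which is an assumption of this example rather than something stated above; for m = 0 and m = 1 this coincides with the general definition. Each path contributes the inverse square root of the product of the vertex degrees of its atoms. The function name `path_chi` and the bond-list input are hypothetical.

```python
from math import sqrt


def path_chi(n_atoms, bonds, order):
    """Path-type connectivity index Chi-m of an H-depleted graph.

    Only paths with `order` edges are counted (assumption of this sketch).
    """
    neighbours = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    degree = [len(nb) for nb in neighbours]

    if order == 0:
        return sum(1.0 / sqrt(d) for d in degree)

    def extend(path):
        if len(path) == order + 1:
            # Every path is generated from both ends; count it once.
            if path[0] > path[-1]:
                return 0.0
            product = 1
            for atom in path:
                product *= degree[atom]
            return 1.0 / sqrt(product)
        return sum(extend(path + [nb])
                   for nb in neighbours[path[-1]] if nb not in path)

    return sum(extend([start]) for start in range(n_atoms))


pentane = [(0, 1), (1, 2), (2, 3), (3, 4)]
cyclohexane = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
print(round(path_chi(5, pentane, 3), 3))      # 0.707
print(round(path_chi(6, cyclohexane, 3), 3))  # 1.5
```

Pentane has two three-bond paths, each with degree product 8, giving 2/sqrt(8) = 0.707. Cyclohexane has six such paths with degree product 16, giving 1.5. Both values also occur among the X3 entries of Table S1.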
9. AMW is a constitutional descriptor, the average molecular weight.
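As a sketch, the average molecular weight can be computed as the molecular weight divided by the total number of atoms, hydrogens included. This reading of "average" is an assumption of the example; it reproduces several AMW entries of Table S1 to the printed precision (for instance hexane and dichloromethane), though not necessarily every one. The function name `amw`, the formula-dictionary input and the small table of standard atomic masses are illustrative.

```python
# Standard atomic masses (g/mol) for a few elements; illustrative subset.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
               "Cl": 35.45, "Br": 79.904}


def amw(formula):
    """Average molecular weight: molecular weight / number of atoms.

    formula: mapping of element symbol to atom count, hydrogens included.
    """
    n_atoms = sum(formula.values())
    weight = sum(ATOMIC_MASS[el] * count for el, count in formula.items())
    return weight / n_atoms


print(round(amw({"C": 6, "H": 14}), 2))           # hexane: 4.31
print(round(amw({"C": 1, "H": 2, "Cl": 2}), 2))   # dichloromethane: 16.99
```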
References
Kier LB, Hall LH (1981) J. Pharm. Sci. 70:583
Kier LB, Hall LH (1983) General definition of valence delta-values for molecular connectivity. J. Pharm. Sci. 72:1170–1173
Kier LB, Hall LH (1986) Molecular Connectivity in Structure-Activity Analysis. J. Wiley & Sons, New York