УДК 544.354.081.7:004.021
Chemoinformatics methods in the thermodynamics of equilibrium. The dissociation of acetic acid
© Bondarev Sergey Nikolaevich1, Zaitseva Inna Sergeevna2, Bondarev Nikolay Vasilievich1+*
1 V. N. Karazin Kharkiv National University, Svobody Sq. 4, UA-61077 Kharkov, Ukraine. E-mail: [email protected]
2 Kharkov National Academy of Municipal Economy, Marshal Bazhanov Str. 17, UA-61002 Kharkiv, Ukraine. E-mail: [email protected]
_______________________________________________
* Supervisor; + Corresponding author
Keywords: chemoinformatics, artificial neural network, multiple regression, dataset, dissociation constant, Gibbs energy, solvation.
Abstract
Methods of chemical informatics (chemoinformatics) - multiple linear regression and neural network modeling - were used to analyze the dependence of the Gibbs energy of dissociation (the dissociation constant) of acetic acid on the properties of aqueous organic solvents. The significant factors affecting the acid dissociation equilibrium were identified. A neural network model (a three-layer perceptron) was constructed, and the promise of neural networks for predicting the dissociation constants (strength) of acetic acid in aqueous organic solvents was shown.
Introduction
In the physical chemistry of solutions, as in chemistry in general, a huge amount of experimental data has accumulated, and its in-depth analysis is impossible without the use of modern informatics - "the science of a fundamentally new human-machine technology of expanded reproduction of qualitatively new knowledge" [1]. Consequently, at the junction of chemistry and informatics, chemoinformatics has appeared and quickly grown into an independent discipline [2-5] whose methods have begun to penetrate actively into all branches of chemistry. The terminology of the discipline is still in flux: chemoinformatics, cheminformatics, chemical informatics [3]. A narrow but very common understanding of chemoinformatics is the application of informatics in bioorganic chemistry for drug development [2]. Later this definition was expanded. In particular, according to the definition given by G. Paris (Paris, 2000), chemoinformatics is a scientific discipline that covers the design, creation, organization, management, search, analysis, distribution, visualization and use of chemical information [3], and its subject includes the methods of storage, retrieval and processing of chemical information. The development of chemoinformatics has been greatly aided by the availability of a sound methodology and its software implementations, which allow the chemist to predict a variety of properties of chemical compounds and processes on the basis of experimental data [4-7]. Here, nonlinear modeling techniques (in particular, neural network prediction technology [8]) come to the foreground as promising methods for analyzing datasets and predicting the properties of complicated systems.
In contrast to statistical models, where one first makes an assumption about the nature of the relationship between the analyzed variables and then verifies the agreement of the data with the proposed model, neural network modeling (neural networks) makes no assumption about the true form of these relationships.
The novelty of this work consists in demonstrating the consistency of traditional approaches (the regression-correlation and the solvation-thermodynamic approach) with modern neural network methods for analyzing and predicting the thermodynamic properties of chemical equilibria. The solvation-thermodynamic method [9] makes it possible to reveal the contributions of the environment (reagent solvation) to changes in the thermodynamic parameters of chemical equilibrium. The distinguishing feature of the regression-correlation approach is the principle of linearity of free energies (FEL), which allows one to identify the causes (electrostatic and chemical interactions) of solvent influence on the thermodynamics of chemical reactions [10]. The scope of neural networks largely coincides with the range of problems solved by traditional statistical methods. However, in comparison with linear statistical methods (linear regression, autoregression, linear discriminant analysis), neural networks can effectively build nonlinear dependencies.
Artificial neural networks (ANN) are widely used in organic, analytical, physical and biological chemistry for processing experimental data and constructing predictive models [11-13].
Specifics of the neural network approach to data analysis and prediction
The advantage of ANN over classical methods of statistical analysis is the possibility of approximating, from experimental data, arbitrarily complex nonlinear dependencies of a previously unknown form [14]. Another essential feature of neural networks is that the relationship between input and output data is established in the process of network training [15]. Below we briefly discuss the basic concepts of the theory of neural networks as applied to the multilayer perceptron (MP) [8].
The notion of the neuron. An ANN consists of a number of "artificial neurons" [16]. A neuron has several input channels for information - the dendrites - and one output channel - the axon. The axon of a neuron is connected to the dendrites of other neurons through synapses.
Fig. 1 represents a graphical model of a neuron: the j-th neuron receives signals x(i) from other neurons through multiple input channels, and each signal is multiplied by w(j,i), the weight of the synaptic connection between the output of neuron i and the input of neuron j. Positive weights correspond to excitatory synapses, negative weights to inhibitory synapses, and if w(j,i) = 0 the connection between neurons j and i is absent. The weighted signals are then summed (the adder block SUM), and the excitation (activation) threshold b(j) of the j-th neuron is added.
Fig. 1. The scheme of an artificial neuron.
The current state of the neuron (the induced local field of neuron j) is described by the relation:

u_j = Σ_{i=1…N} w(j,i)·x(i) + b(j)    (1)

where x(i) are the input signals, i = 1, 2, …, N. The index j denotes the number of the neuron in the network under consideration; the index i denotes the number of the synaptic connection.
The signal received by a neuron is converted into an output signal

y_j = f(u_j)    (2)

by means of a nonlinear activation (transfer) function f(·).
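For illustration, one such neuron can be sketched in a few lines of Python (a minimal sketch: the inputs, weights and threshold below are made-up values, and a sigmoid is assumed for f):

```python
import math

def neuron_output(x, w, b):
    """One artificial neuron: induced local field, Eq. (1), then activation, Eq. (2)."""
    u = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b  # u_j = sum_i w(j,i)*x(i) + b(j)
    return 1.0 / (1.0 + math.exp(-u))                 # sigmoid activation f(u)

# A neuron with three inputs and illustrative weights and threshold
y = neuron_output(x=[0.5, -1.0, 0.25], w=[0.8, 0.3, -0.5], b=0.1)
```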
The input layer of neurons is used to enter the input variables, the output layer is used to
display the results. Furthermore, the network may contain many intermediate (hidden) neurons, which perform internal functions. The sequence of layers of neurons (input, hidden and
output neurons) and their connections is called the network architecture.
Each neuron of a hidden or output layer of the perceptron performs two types of calculations [8]: 1) the calculation of the output signal of the neuron as a continuous nonlinear function of its input signal and the weights of its synaptic connections; 2) the calculation of the gradient of the error surface with respect to the weights of its synaptic connections, which is necessary for the backward pass through the network.
The neuron activation function f(·) is a nonlinear function that simulates the process of excitation transfer. Its most popular form is the sigmoid. The nonlinearity of the function is a necessary condition, since otherwise the input-output mapping of the network reduces to that of an ordinary single-layer perceptron [17].
The most important property of a multilayer perceptron is the ability to learn to approximate arbitrarily complex nonlinear dependencies between input and output data [18-20]. Training algorithms find the synaptic weights and activation thresholds of the neurons that minimize the prediction error of the network [21].
The initial configuration (architecture) of the network is chosen randomly, and the training process is terminated either after a given number of epochs (in iterative learning, an epoch corresponds to one pass over the training set, followed by a check on the control set), or when the error reaches a sufficiently small value or ceases to decrease.
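This stopping rule can be sketched as follows (a hypothetical helper, not the software used in the paper; `update_weights` and `error_on` stand in for a real training pass and error evaluation):

```python
def train_with_early_stopping(update_weights, error_on, max_epochs=1000, patience=20):
    """Train until max_epochs, or stop early when the error on the control
    (validation) set has not decreased for `patience` consecutive epochs."""
    best_err, best_epoch, stalled = float("inf"), 0, 0
    for epoch in range(1, max_epochs + 1):
        update_weights()               # one pass over the training set (one epoch)
        err = error_on("control")      # check the error on the control set
        if err < best_err:
            best_err, best_epoch, stalled = err, epoch, 0
        else:
            stalled += 1
            if stalled >= patience:    # the error has ceased to decrease
                break
    return best_epoch, best_err
```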
Calculation Part
The purpose of the current work is to apply linear multifactor and neural network approaches to the analysis of the relationship between the Gibbs energies of dissociation of acetic acid and the properties of aqueous organic solvents, in order to construct predictive relationships and neural network models.
Analyzed data: the dependent variable - the aquamolal [22] standard Gibbs energies (ΔG°d,HAc) of dissociation of acetic acid [23-25]; the independent variables - physicochemical properties of aqueous organic solvents (water-methanol, water-ethanol, water-propan-2-ol): the dielectric constant, the Dimroth-Reichardt electron-acceptor parameter, the Kamlet-Taft electron-donor parameter, and the cohesive energy density [26-29].
1. A multiple linear analysis was carried out to establish the relationship between the Gibbs energy of dissociation of acetic acid (ΔG°d,HAc) and the properties of aqueous organic solvents - the normalized Dimroth-Reichardt (ETN) and Kamlet-Taft (BKT) parameters, the dielectric constant (εN) and the cohesive energy density (δ²N) - based on a first-order linear model:

ΔG°d,HAc = b0 + b1·(1/εN) + b2·ETN + b3·BKT + b4·δ²N    (3)
For the analysis, the Gibbs energies of acetic acid dissociation in water-methanol, water-ethanol and water-propan-2-ol at 298.15 K with a content of the non-aqueous component of up to 0.7 mole fraction were used (the resolvation region [30]; beyond it the composition of the solvation shells of the ions and molecules of the acid changes, and the dependence of ΔG°d,HAc on the solvent properties becomes nonlinear).
Significant descriptors and the presence of multicollinearity were detected using the correlation matrix (Table 1) and stepwise inclusion of parameters into the regression equation. The significance level (p < 0.05) was chosen as the criterion for including a descriptor in the regression equation. The coefficients of the multiple regression equation were estimated by the standard method of least squares (LSM).
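As a minimal sketch of such a least-squares fit, using only four water-propan-2-ol points taken from Table 9 (so the coefficients will not reproduce those of the full 22-point fit):

```python
import numpy as np

# Two-descriptor model of the form (3), fitted by ordinary least squares
inv_eps = np.array([1.000, 1.287, 1.650, 2.087])   # 1/eps_N
delta2  = np.array([1.000, 0.943, 0.889, 0.839])   # delta^2_N
dG      = np.array([27.15, 29.62, 32.40, 35.02])   # observed Gibbs energies, kJ/mol

A = np.column_stack([np.ones_like(inv_eps), inv_eps, delta2])  # intercept + descriptors
coef, *_ = np.linalg.lstsq(A, dG, rcond=None)                  # least-squares estimate
b0, b1, b2 = coef                                              # fitted coefficients
```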
The statistically significant factors that determine the dependence of the Gibbs energy of acetic acid dissociation on the solvent properties are the dielectric constant and the cohesive energy density (Table 1).
Table 1. Total coefficients of the pair correlation (n = 22)

             1/εN      ETN      BKT      δ²N     ΔrG°d(HAc)
1/εN        1.0000  -0.8744   0.8778  -0.6597    0.9335
ETN        -0.8744   1.0000  -0.9774   0.5789   -0.8438
BKT         0.8778  -0.9774   1.0000  -0.7082    0.9008
δ²N        -0.6597   0.5789  -0.7082   1.0000   -0.8635
ΔrG°d(HAc)  0.9335  -0.8438   0.9008  -0.8635    1.0000
The correlation coefficient between the independent variables 1/εN and δ²N is -0.6597, i.e., these descriptors are only weakly correlated, so the further regression analysis was performed with both of these descriptors retained.
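A pair-correlation matrix of this kind can be computed directly; the five points below are illustrative placeholders, not the paper's 22-point dataset:

```python
import numpy as np

# Illustrative descriptor and response vectors (made-up, near-linear trends)
inv_eps = np.array([1.00, 1.29, 1.65, 2.09, 2.57])   # 1/eps_N
delta2  = np.array([1.00, 0.94, 0.89, 0.84, 0.79])   # delta^2_N
dG      = np.array([27.2, 29.6, 32.4, 35.0, 37.0])   # Gibbs energy of dissociation

R = np.corrcoef([inv_eps, delta2, dG])   # symmetric matrix of pair correlations
```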
Table 2 shows the results of the analysis for the regression equation:

ΔG°d(HAc) = (37.3 ± 2.8) + (3.62 ± 0.51)·εN⁻¹ − (13.3 ± 2.7)·δ²N    (4)
To test the statistical quality of the estimated regression equation, the standard procedure described in detail in [31, 32] was used: checking the adequacy of the regression equation to the experimental data by Fisher's criterion; checking the statistical significance of the regression coefficients by Student's criterion; and checking the properties of the data assumed in estimating the two-parameter equation: zero expectation of the random deviation for all observations, constancy of the dispersion of the deviations, absence of autocorrelation of the residuals (Durbin-Watson test), absence of multicollinearity (the matrix of total correlation coefficients), normal distribution of the errors (a detailed analysis of the residuals), and absence of outliers (Mahalanobis distance and Cook's distance).
As follows from the multiple correlation and determination coefficients (Table 2), the constructed model with the two factors 1/εN and δ²N adequately describes the experimental data ΔrG°d(HAc): 98% (R² = 0.98) of the dispersion of the dependent variable is explained by the influence of the independent descriptors. Fisher's criterion has a high value, F(2,19) = 454.02 against Fcr(2,19) = 3.52, which confirms the statistical significance of the linear regression model.
Table 2. The results of regression for the dependent variable ΔrG°d(HAc)
R = 0.99, R² = 0.98, F(2,19) = 454.02; the standard estimation error: 0.64

            Beta    Std. error    B      Std. error   t(19)   p-level
Intercept     -         -        37.3       1.4        27.2    0.00
1/εN        0.64      0.04        3.6       0.2        14.7    0.00
δ²N        -0.44      0.04      -13.3       1.3       -10.0    0.00
Using the total pair correlation coefficients of the multiple regression (Table 1) to study the relationship of two variables can lead to incorrect conclusions. Therefore, the partial correlation coefficients were analyzed; they reflect the degree of linear relationship between two variables calculated after eliminating the influence of the other factors. The partial correlation coefficients are given in Table 3.
Table 3. The partial correlation coefficients and tolerance (dependent variable ΔrG°d(HAc))

       Beta    Partial       Half-partial   Tolerance   R-square   t(19)   p-level
               correlation   correlation
1/εN   0.64       0.96           0.48          0.56       0.44      14.7    0.00
δ²N   -0.44      -0.92          -0.33          0.56       0.44     -10.0    0.00
As can be seen from Tables 1 and 3, stepwise exclusion of the variables 1/εN and δ²N increases the correlation between ΔrG°d(HAc) and the remaining independent variable: the total and partial correlation coefficients are, respectively, 0.93 and 0.96 for 1/εN, and -0.86 and -0.92 for δ²N. This means that each predictor partly masks the true relationship of the studied variables. The tolerance for the predictors 1/εN and δ²N is 0.56 (Table 3), which indicates the absence of multicollinearity (redundancy of predictors).
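The first-order partial correlations can be reproduced from the pair correlations of Table 1 with the standard formula (a sketch; the variable names are ours):

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held fixed."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# x = dG, y = 1/eps_N, z = delta^2_N, pair correlations taken from Table 1
r = partial_corr(r_xy=0.9335, r_xz=-0.8635, r_yz=-0.6597)
# r comes out near 0.96, matching the partial correlation in Table 3
```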
Figure 2 shows the normal probability plot of the residuals (Fig. 2(a)) and the three-dimensional plot of the Gibbs energy of acetic acid dissociation ΔrG°d(HAc) versus the solvent properties εN⁻¹ and δ²N (Fig. 2(b)), both indicating the adequacy of the selected linear model: the residuals follow the normal distribution, since they lie near the normal-distribution line (a), and almost all points lie on the plane (b).
When equation (4) was used to estimate the Gibbs energy of acetic acid dissociation in water-dioxane and water-dimethylsulfoxide solvents, a satisfactory predictive ability was found for aqueous organic solvents with a low content of dioxane or dimethylsulfoxide (Table 4).
Fig. 2. The normal probability plot of the residuals (a) and the dependence of the Gibbs energy of acetic acid dissociation ΔrG°d(HAc) on the properties of water-organic solvents εN⁻¹ and δ²N (b).
Table 4. Comparison of the observed values of the Gibbs energy of acetic acid dissociation with those predicted according to equation (4)

Mole         Solvent properties                  ΔG°d,HAc   ΔG°d,HAc    Residues
fraction S   1/εN    ETN    BKT    δ²N          observed   predicted

Water-dioxane
0            1       1      0.19   1             27.15      27.62       -0.47
0.1          1.645   0.77   0.36   0.917         32.56      31.06        1.50
0.2          2.625   0.69   0.45   0.834         38.06      35.71        2.35
0.3          3.848   0.62   0.48   0.751         44.02      41.24        2.78
0.4          5.326   0.60   0.50   0.668         50.83      47.70        3.13
0.5          9.229   0.54   0.51   0.584         58.88      62.94       -4.06

Water-dimethylsulfoxide
0            1       1      0.19   1             27.15      27.62       -0.47
0.1          1.025   0.85   0.35   0.931         29.51      28.63        0.88
0.2          1.047   0.75   0.45   0.862         33.03      29.63        3.40
2. Neural network estimation. The data for the neural network: the properties of aqueous organic solvents (input variables) and the Gibbs energy of acetic acid dissociation (output variable).
Using the illustrative possibilities of interactive computer graphics [1], surfaces were constructed (Fig. 3) for the dependence of the Gibbs energy of dissociation of acetic acid on the properties of the aqueous organic solvents water-methanol, water-ethanol and water-propan-2-ol, with a step of 0.1 mole fraction of the non-aqueous component.
Fig. 3. Surfaces ΔG°d,HAc = f(1/εN, ETN), ΔG°d,HAc = f(1/εN, δ²N), ΔG°d,HAc = f(1/εN, BKT), ΔG°d,HAc = f(δ²N, BKT), ΔG°d,HAc = f(BKT, ETN), ΔG°d,HAc = f(δ²N, ETN).
The complicated form of the surfaces (Fig. 3) argues in favor of neural network simulation, because the problem is nonlinear. Certainly, the problem could also be solved by statistical methods of nonlinear analysis. However, as mentioned earlier, a complicating factor is the need to formulate a hypothesis about the explicit form of the dependence under study, which, as Fig. 3 shows, is far from obvious.
Since both the input and the output vectors of the network are known, supervised training algorithms can be used [8, 21]. In this case, owing to its ability to generalize, the network can produce new results when input data (the properties of aqueous organic solvents) that were not used in training are presented to it.
A dimensionality-reduction procedure showed that all the properties of the aqueous organic solvents significantly affect the strength of acetic acid (the Gibbs energy of dissociation).
The choice of the structure (architecture) of the neural network (how many intermediate layers and how many neurons in them should be used) is the most difficult problem.
There are various methods known for selecting the optimal network structure [8, 21].
However, in most cases their applicability to a particular problem depends strongly on the
quality and quantity of input data.
First, 1000 networks were analyzed: linear networks, radial basis function networks (1 to 8 hidden elements) and three-layer perceptrons (1 to 10 hidden elements). The most promising network type and architecture variant was selected - a three-layer perceptron with five hidden neurons, MP 4:4-5-1:1 (Table 5, Fig. 4).
In our regression problem, the linear neural network model shows very low performance (Table 5): the standard-deviation ratio for the control sample is 0.424; for the radial basis function (RBF) model this value is 0.222. The network with the architecture MP 4:4-8-1:1 has slightly better statistical characteristics (control performance 0.107) than MP 4:4-5-1:1 (control performance 0.114), but with small datasets preference is given to the less complex architecture.
Table 5. The statistical characteristics of different types of neural networks

No.  Architecture     Training     Control      Test         Training  Control  Test    Training /             Pearson's
                      performance  performance  performance  error     error    error   Elements*              correlation
1    Linear 2:2-1:1   0.417        0.424        0.465        0.139     0.082    0.198   PI                     0.8923
2    RBF 4:4-7-1:1    0.120        0.253        0.222        0.013     0.010    0.029   KM, KN, PI             0.9873
3    MP 4:4-5-1:1     0.090        0.114        0.119        0.030     0.014    0.048   BP 100, CG 20, CG 31b  0.9947
4    MP 4:4-8-1:1     0.056        0.107        0.117        0.018     0.013    0.049   BP 100, CG 20, CG 52b  0.9965

* Note. The algorithms (codes) used for network optimization: PI - pseudoinverse (linear least-squares optimization); KM - K-means (positioning of centers); KN - K-nearest neighbors (setting of deviations); BP - back propagation; CG - conjugate gradient method; b - the stop code (the network with the least error on the control selection) [33]. The code CG 31b, for example, shows that the network was optimized by the conjugate gradient method and was found at the 31st epoch with the lowest error on the validation set.
Fig. 4. The architecture of the three-layer feed-forward perceptron for predicting the dissociation constants (Gibbs energies of dissociation) of acetic acid from the properties of aqueous organic solvents.
It should be noted that networks constructed in this way are not necessarily the best of all possible ones: when the nonlinear capabilities of neural modeling are used to minimize errors, there is no guarantee that smaller errors cannot be achieved [21]. In particular, this applies to multilayer perceptrons.
Therefore, further tuning of the network was carried out using the quick propagation algorithm (100 epochs in the first stage) and the Levenberg-Marquardt algorithm (500 epochs in the second stage), which is considered one of the best algorithms for nonlinear optimization [33]. Quick propagation is a heuristic modification of the back-propagation algorithm that accelerates convergence by using a simple quadratic model of the error surface computed for each synaptic connection weight.
The Levenberg-Marquardt algorithm is applicable only to relatively small networks with a single output, which closely matches the conditions of our problem.
We also analyzed the perceptrons MP 4:4-3-1:1 and MP 4:4-4-1:1, which contain three and four neurons in the hidden layer, respectively (Table 6). Despite their relatively high ability to approximate the data, these networks have a lower ability to generalize (predict).
Table 6. The results of training the networks MP 4:4-3-1:1 and MP 4:4-4-1:1

Architecture   Training     Control      Test         Training  Control  Test    Training /
               performance  performance  performance  error     error    error   Elements
MP 4:4-3-1:1   0.054        0.073        0.074        0.018     0.024    0.017   BP 100, CG 20, CG 55b
MP 4:4-4-1:1   0.093        0.092        0.160        0.028     0.020    0.036   BP 100, CG 20, CG 29b
For training the networks, the entire set of observations was divided into three samples (by default, observations were assigned to the samples at random) to avoid overtraining the network and to guarantee the quality of generalization (prediction). The first (the training sample, 50% of the observations) was used for training the network; the second (the control sample, 25%) for cross-validation of the training algorithm during its operation; and the third (the test sample, 25%) for the final independent testing.
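Such a 50/25/25 random split can be sketched as follows (a hypothetical helper; the seed and the data are illustrative):

```python
import random

def split_dataset(observations, seed=0):
    """Randomly split observations into training (50%), control (25%)
    and test (25%) samples, as is done before network training."""
    rng = random.Random(seed)
    idx = list(range(len(observations)))
    rng.shuffle(idx)
    n_train, n_ctrl = len(idx) // 2, len(idx) // 4
    train = [observations[i] for i in idx[:n_train]]
    ctrl  = [observations[i] for i in idx[n_train:n_train + n_ctrl]]
    test  = [observations[i] for i in idx[n_train + n_ctrl:]]
    return train, ctrl, test

train, ctrl, test = split_dataset(list(range(40)))
```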
Training was carried out at a learning rate of 0.01.
As the activation function in the intermediate layers, the hyperbolic tangent (tanh) was used; its sigmoid nonlinearity is defined as follows [8]:

tanh(u) = [exp(au) − exp(−au)] / [exp(au) + exp(−au)], where a > 0.    (5)
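Equation (5) translates directly into code (a sketch with the slope parameter a exposed):

```python
import math

def tanh_activation(u, a=1.0):
    """Hyperbolic tangent activation, Eq. (5); a > 0 controls the slope."""
    return (math.exp(a * u) - math.exp(-a * u)) / (math.exp(a * u) + math.exp(-a * u))
```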
In the first epoch, the quick propagation algorithm adjusts the weights of the synaptic connections in accordance with the generalized delta rule [34], as in the back-propagation method [8]:

Δw_{j,i}(n) = η·δ_j·o_i + α·Δw_{j,i}(n−1)    (6)

where n is the number of the training example; η is the learning rate used in passing from one step of the process to the next (chosen equal to 0.01); δ_j is the local gradient of the error; α is a positive constant called the momentum constant or inertia coefficient (chosen equal to 0.3); o_i is the output value of the i-th neuron.
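A single weight update by Eq. (6), with the paper's settings η = 0.01 and α = 0.3 (the gradient, output and previous-step values below are made up):

```python
def delta_w(eta, delta_j, o_i, prev_dw, alpha):
    """Generalized delta rule with momentum, Eq. (6):
    dw_ji(n) = eta * delta_j * o_i + alpha * dw_ji(n-1)."""
    return eta * delta_j * o_i + alpha * prev_dw

# Illustrative values for one synaptic connection
dw = delta_w(eta=0.01, delta_j=0.5, o_i=1.2, prev_dw=0.002, alpha=0.3)
```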
In subsequent epochs, the algorithm uses the assumption that the error surface is quadratic to progress more rapidly toward the minimum. The weight changes are calculated by the quick propagation formula [33]:

Δw_{j,i}(n) = [y_j(n) / (y_j(n−1) − y_j(n))]·Δw_{j,i}(n−1)    (7)

where y_j(n) is the output value given by the network for the n-th training example.
The correction of the weights by the Levenberg-Marquardt method was performed according to the formula [33]:

Δw_{j,i}(n) = (Zᵀ·Z + λ·I)⁻¹·Zᵀ·ε    (8)

where ε is the error vector over all observations; Z is the Jacobian matrix containing the first partial derivatives of the network errors with respect to the weights and biases of the synaptic connections; λ is a parameter of the algorithm determined by linear (scalar) optimization along the chosen direction; I is the identity matrix. The first term in the Levenberg-Marquardt formula corresponds to the linear model, and the second term to gradient descent; the controlling parameter λ defines the relative significance of these two contributions.
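One Levenberg-Marquardt correction step, Eq. (8), can be sketched with NumPy (the Jacobian and error vector below are tiny made-up examples, only to show the shapes; a large λ shrinks the step toward pure gradient descent):

```python
import numpy as np

def lm_step(Z, eps, lam):
    """One Levenberg-Marquardt weight correction, Eq. (8):
    dw = (Z^T Z + lambda*I)^(-1) Z^T eps."""
    n = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(n), Z.T @ eps)

# Tiny illustrative Jacobian (3 observations, 2 weights) and error vector
Z = np.array([[1.0, 0.5], [0.2, 1.0], [0.3, 0.1]])
eps = np.array([0.1, -0.2, 0.05])
dw = lm_step(Z, eps, lam=0.01)
```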
Discussion of the results of multifactor analysis and neuromodelling
Physico-chemical interpretation of the parameters of the multiple linear regression. In the resulting linear two-factor model (see Eq. 4), which characterizes the dependence of the Gibbs energy of acid dissociation ΔG°d(HAc) on the reciprocal dielectric constant εN⁻¹ and on the cohesive energy density δ²N, the parameter b1 > 0 and b2 < 0. Therefore, as the dielectric constant and the cohesive energy density decrease when water is replaced by water-organic solvents, the acid strength decreases. This means that: a) a decrease in the dielectric constant of the solvent is accompanied by a decrease in the Gibbs energy of ion solvation, resulting in a shift of the equilibrium towards undissociated acid molecules; b) as the cohesive energy density of the aqueous organic solvent decreases, the undissociated form of acetic acid is stabilized more strongly (Fig. 5).
Fig. 5. The influence of the cohesive energy density and the dielectric properties of the mixed solvents water-methanol, water-ethanol and water-propan-2-ol (w-s) on the reduction of the strength of acetic acid (ΔΔG°d(HAc) = ΔG°d,w-s(HAc) − ΔG°d,w(HAc)).
A comparison of the results of the linear multifactor regression analysis (Fig. 5) with the results of the solvation-thermodynamic analysis (Fig. 6, as an example [35]) shows their consistency. The decrease in the strength of acetic acid in water-methanol solvents compared with water is caused by increased solvation of the nondissociated acid molecules and decreased solvation of the acetate ions (Fig. 6).
Fig. 6. The influence of the Gibbs energies of resolvation of the ions (ΔtrG°(H+), ΔtrG°(Ac−)) and of the molecules (ΔtrG°(HAc)) on the change of the Gibbs energy of acetic acid dissociation ΔΔG°d(HAc) when water is replaced by water-methanol solvents.
The results of optimizing the training of the multilayer perceptron MP 4:4-5-1:1 (4 input neurons, 5 hidden neurons and 1 output neuron, Fig. 4) by the quick propagation and Levenberg-Marquardt algorithms are given in Table 7. This network has better statistical indicators than those given in Table 6. Thus, the control performance of the network grew from 0.886 (1 − 0.114) to 0.960 (1 − 0.040), the error on the validation sample decreased from 0.013 to 0.011, and Pearson's correlation coefficient increased from 0.9947 to 0.9984.
Table 7. The training results of the perceptron MP 4:4-5-1:1

Architecture   Pearson's    Training     Control      Test         Training  Control  Test    Training /
               correlation  performance  performance  performance  error     error    error   Elements
MP 4:4-5-1:1   0.9984       0.059        0.040        0.084        0.017     0.011    0.017   QP 100, LM 235b
We now describe the technical details associated with training the neural network on the data used in this work.
Training, control and test performance: the quotient of the standard deviation of the prediction error and the standard deviation of the initial data on the relevant sample.
Training, control and test error: the network error on the samples used during training.
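The performance measure defined above can be computed as follows (illustrative numbers, not the paper's samples):

```python
import statistics

def performance(errors, data):
    """Performance as used in Tables 5-8: the quotient of the standard
    deviation of the prediction error and the standard deviation of the data."""
    return statistics.pstdev(errors) / statistics.pstdev(data)

# A value near 0 means the network explains almost all variation in the data
p = performance(errors=[0.1, -0.2, 0.05, 0.0], data=[27.2, 29.6, 32.4, 35.0])
```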
The error for a particular network configuration (a measure of the effectiveness of training) is determined by running all available observations through the network and comparing the actual output values (y_j) with the desired (target) values (d_j). The error signal e_j(n) of output neuron j at the n-th iteration, corresponding to the n-th training example, is calculated as follows:

e_j(n) = d_j(n) − y_j(n)    (9)
In batch training mode by the back-propagation method, the average squared error of the network is determined by the equation [8]:

E_av = (1/2N)·Σ_{n=1…N} Σ_{j∈C} e_j²(n)    (10)

where the set C includes all the neurons of the output layer of the network, and N is the total number of images (examples) in the training set. The inner summation over j runs over all neurons of the output layer, whereas the outer summation runs over all examples of the epoch. In batch mode, the weights w_{j,i} are corrected only after the whole set of examples has passed through the network. In sequential training mode by the back-propagation method, the weights are corrected after the presentation of each example.
It should be noted that the error on the validation sample does not exceed the error on the independent (test) sample, i.e., the necessary condition of optimization is satisfied [21]. As follows from Table 7, the results of the neural network on the three sets are almost identical, which indicates the acceptable quality of the neural network.
Training. In the first stage the quick propagation algorithm was used (100 epochs); in the second stage, the Levenberg-Marquardt method. The network was selected at the 235th epoch, at the minimal error on the control sample (stop code b).
The criteria for the quality of training of the neural network model are the statistical characteristics listed in Table 8.
Table 8. The statistical parameters of neural network training for modeling the dependence of the strength of acetic acid on the properties of water-organic solvents*

                                  Training sample  Control sample  Test sample  General sample
                                  ΔG°d,HAc         ΔG°d,HAc        ΔG°d,HAc     ΔG°d,HAc
Average of data                   38.29            40.86           33.30        37.71
Standard deviation of data        8.69             7.94            4.72         8.18
Average error                     0.009            0.053           -0.301       -0.056
Standard deviation of error       0.51             0.32            0.40         0.47
Average absolute error            0.36             0.27            0.44         0.36
Quotient of standard deviations   0.059            0.040           0.084        0.057
Correlation                       0.9983           0.9992          0.9987       0.9984

* Note. Average of data: the average value of the initial Gibbs energies of dissociation of acetic acid ΔG°d,HAc; in our case they lie in the range 33-40, which indicates a correct partition of the dataset into training, control and test samples. Standard deviation of data: the standard deviation of the initial ΔG°d,HAc. Average error: the average value of the prediction error (the error is the difference between the initial and calculated ΔG°d,HAc). Standard deviation of error: the standard deviation of the prediction error of ΔG°d,HAc. Average absolute error: the average absolute value of the prediction error. The quotient of standard deviations, or performance, is the ratio of the standard deviation of the prediction error to the standard deviation of the initial data. Correlation: Pearson's correlation coefficient.
The performance of the neural network is also illustrated by the plot of the observed values of the Gibbs energy of dissociation of acetic acid (the output variable) ΔG°d,HAc (Observed) against the predicted values (Predicted) (Fig. 7), and by Tables 9 and 10.
Fig. 7. The values of the Gibbs energy of dissociation of acetic acid ΔG°d(HAc) predicted by the neural network (Predicted) versus the observed values (Observed) in aqueous organic solvents.
The plotted points lie quite close to the line at 45 degrees to the axes, indicating the effective work of the neural network.
Table 9. Illustration of the performance of the neural network on the example of predicting the Gibbs energy (kJ/mol) of dissociation of acetic acid from the properties of water-propan-2-ol solvents

Mole fraction     Solvent properties               ΔG°d,HAc   ΔG°d,HAc    Residues
of propan-2-ol    1/εN    ETN    BKT    δ²N       observed   predicted
0                 1.000   1.00   0.19   1.000      27.15      27.36        0.21
0.1               1.287   0.77   0.49   0.943      29.62      29.60       -0.02
0.2               1.650   0.70   0.60   0.889      32.40      32.54        0.14
0.3               2.087   0.67   0.65   0.839      35.02      34.80       -0.22
0.4               2.568   0.66   0.69   0.787      37.04      36.59       -0.45
0.5               3.032   0.64   0.72   0.730      38.49      38.21       -0.28
0.6               3.411   0.62   0.75   0.664      40.10      39.96       -0.14
0.7               3.668   0.60   0.79   0.586      42.86      43.02        0.16
0.8               3.829   0.58   0.83   0.492      47.30      46.88       -0.42
0.9               3.973   0.56   0.89   0.379      52.23      52.59        0.36
1.0               4.117   0.54   0.88   0.242      53.02      52.93       -0.09
The data of Table 9 reflect the quality of the neural network whose training set included the Gibbs energies of dissociation of acetic acid in water-propan-2-ol solvents, whereas the results of Table 10 are predictions of acid strength made by the already trained model from the properties of water-dimethylsulfoxide solvents.
The consistency of the statistical and neural-network analyses of the thermodynamics of chemical equilibrium can be concluded from the results of Tables 4, 9 and 10.
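To make the neural-network step concrete, the following is a minimal numpy sketch of a three-layer perceptron (solvent-property inputs, one hidden tanh layer, linear output) fitted to the water-propan-2-ol data of Table 9. The hidden-layer size, learning rate, and initialization are illustrative assumptions, not the parameters of the network described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Inputs: solvent properties (1/εN, ETN, BKT, δ²N) for water-propan-2-ol
# mixtures; target: ΔGod,HAc (kJ/mol). Values from Table 9.
X = np.array([
    [1.000, 1.00, 0.19, 1.000], [1.287, 0.77, 0.49, 0.943],
    [1.650, 0.70, 0.60, 0.889], [2.087, 0.67, 0.65, 0.839],
    [2.568, 0.66, 0.69, 0.787], [3.032, 0.64, 0.72, 0.730],
    [3.411, 0.62, 0.75, 0.664], [3.668, 0.60, 0.79, 0.586],
    [3.829, 0.58, 0.83, 0.492], [3.973, 0.56, 0.89, 0.379],
    [4.117, 0.54, 0.88, 0.242]])
y = np.array([27.15, 29.62, 32.40, 35.02, 37.04, 38.49,
              40.10, 42.86, 47.30, 52.23, 53.02])

# Standardize inputs and target so a single learning rate suits all weights.
Xs = (X - X.mean(0)) / X.std(0)
ys = (y - y.mean()) / y.std()

n_hidden = 4
W1 = rng.normal(scale=0.5, size=(4, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=n_hidden);      b2 = 0.0
lr = 0.05

losses = []
for _ in range(2000):
    # Forward pass: hidden tanh layer, then linear output
    h = np.tanh(Xs @ W1 + b1)
    out = h @ W2 + b2
    err = out - ys
    losses.append((err ** 2).mean())
    # Backpropagation of the mean-squared-error loss
    g_out = 2 * err / len(ys)
    gW2 = h.T @ g_out; gb2 = g_out.sum()
    g_h = np.outer(g_out, W2) * (1 - h ** 2)
    gW1 = Xs.T @ g_h; gb1 = g_h.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print(f"MSE: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

With only 11 points such a network interpolates well within the training composition range, which is exactly the behaviour reported for the water-alcohol series.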
The use of neural networks appears promising for analyzing the relationship between the thermodynamics of chemical equilibrium and the physicochemical properties of aqueous organic solvents (without restrictions on the composition of the mixed solvent) in order to predict equilibrium constants (the strength of a weak electrolyte, or the stability of complexes) in solvents for which such data are absent or whose determination involves experimental difficulties (for example, in pure dimethylsulfoxide or dioxane).
Table 10. Illustration of the quality of the neural network: prediction of the Gibbs energy (kJ/mol) of dissociation of acetic acid from the properties of water-dimethylsulfoxide solvents

Mole fraction        Solvent properties             ΔGod,HAc   ΔGod,HAc   Residues
of DMSO          1/εN    ETN    BKT    δ²N          observed   predicted
     0.0         1.000   1.00   0.19   1.000         27.15      27.36       0.21
     0.1         1.025   0.85   0.35   0.931         29.51      30.40       0.89
     0.2         1.047   0.75   0.45   0.862         33.03      32.56      -0.47
     0.3         1.091   0.69   0.50   0.793         38.83      35.13      -3.70
It should be noted that the neural network built in the current research failed to produce satisfactory predictions of the dissociation constants in water-dimethylsulfoxide solvents with a high dimethylsulfoxide content and in water-dioxane solvents. This is quite explainable, since the network was trained only on water-alcohol solvents. One way of solving this problem may be to extend the data set with the dissociation constants and properties of mixed solvents containing both protolytic and aprotic organic components, so that the network can be trained on more representative samples and its predictive capabilities expanded.
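The degradation of the prediction on leaving the training domain is visible directly in the residuals of Table 10, which grow with the DMSO mole fraction. A few lines suffice to check this (values taken from Table 10):

```python
# Residuals (predicted - observed) of the network for water-DMSO mixtures.
# The error grows with the DMSO mole fraction because the network was
# trained on water-alcohol solvents only, i.e. this is extrapolation.
x_dmso    = [0.0, 0.1, 0.2, 0.3]
observed  = [27.15, 29.51, 33.03, 38.83]
predicted = [27.36, 30.40, 32.56, 35.13]

residues = [p - o for o, p in zip(observed, predicted)]
for x, r in zip(x_dmso, residues):
    print(f"x(DMSO) = {x:.1f}  residual = {r:+.2f} kJ/mol")
```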
Conclusion
The combination of thermodynamic and traditional statistical methods of chemical informatics reveals not only the contributions of the primary medium effects (the Gibbs energy of reagent resolvation) to the change in the strength of the electrolyte, but also provides information on the causes of the solvent-induced shift of the chemical equilibrium. A two-parameter dependence of the dissociation constants (Gibbs energies of dissociation) of acetic acid on the dielectric constant and the cohesive energy density of aqueous organic solvents (with alcohol contents up to 0.7 mole fraction of methanol, ethanol or propan-2-ol) was established. It indicates that (a) a decrease in the dielectric constant of the solvent, accompanied by a decrease in the Gibbs energy of solvation of the ions, shifts the dissociation equilibrium towards undissociated acid molecules, and (b) as the cohesive energy density decreases, aqueous organic solvents increasingly stabilize the undissociated form of acetic acid. These conclusions are consistent with the solvation-thermodynamic analysis: the decrease in the strength of acetic acid in water-methanol solvents, in comparison with water, is caused by increasing solvation of the undissociated acid molecules and decreasing solvation of the acetate ions.
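The two-parameter dependence described above can be sketched as an ordinary least-squares fit of ΔGod,HAc against 1/εN and δ²N. The example below uses the water-propan-2-ol data of Table 9 restricted to alcohol mole fractions up to 0.7; it is a numpy illustration of the form of the regression, not the original calculation.

```python
import numpy as np

# ΔGod,HAc ≈ a0 + a1*(1/εN) + a2*(δ²N) for water-propan-2-ol mixtures,
# mole fraction of alcohol 0 ... 0.7 (data from Table 9).
inv_eps = np.array([1.000, 1.287, 1.650, 2.087, 2.568, 3.032, 3.411, 3.668])
delta2  = np.array([1.000, 0.943, 0.889, 0.839, 0.787, 0.730, 0.664, 0.586])
dG      = np.array([27.15, 29.62, 32.40, 35.02, 37.04, 38.49, 40.10, 42.86])

# Design matrix with an intercept column; solve by least squares.
A = np.column_stack([np.ones_like(inv_eps), inv_eps, delta2])
coef, *_ = np.linalg.lstsq(A, dG, rcond=None)
fitted = A @ coef
r2 = 1 - ((dG - fitted) ** 2).sum() / ((dG - dG.mean()) ** 2).sum()
print("coefficients:", np.round(coef, 3))
print(f"R^2 = {r2:.4f}")
```

A high coefficient of determination over this composition range is what justifies treating the dielectric constant and the cohesive energy density as the significant factors.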
A neural-network analysis of the dependence of the Gibbs energy of dissociation of acetic acid on the physicochemical properties of aqueous organic solvents (water-methanol, water-ethanol, water-propan-2-ol) was carried out. A neural network model (a three-layer perceptron) was constructed, and the dissociation constants of acetic acid in water-dimethylsulfoxide solvents containing up to 0.3 mole fraction of the organic component were predicted. The application of neural networks was shown to be promising for analyzing the relationship between the thermodynamics of chemical equilibria and the physicochemical properties of aqueous organic solvents in order to predict equilibrium constants (the strength of a weak electrolyte, or the stability of complexes) in solvents for which these data are not available.
References
[1] Zenkin A.A. Cognitive Computer Graphics. М.: Nauka. 1991. 192 p. [in Russian].
[2] Brown F.K. Chapter 35. Chemoinformatics: What is it and How does it Impact Drug
Discovery // Annual Reports in Medicinal Chemistry. 1998. Vol. 33. P. 375-384.
[3] Leach Andrew R., Gillet Valerie J. An Introduction to Chemoinformatics. Springer.
2007. 256 p.
[4] Bunin B.A., Siesel A., Morales G.A., Bajorath J. Chemoinformatics: Theory, Practice, &
Products. Springer. 2007. 295 p.
[5] Baskin I.I., Varnek A. Chapter 1. Fragment Descriptors in SAR/QSAR/QSPR Studies,
Molecular Similarity Analysis and in Virtual Screening // In: Chemoinformatics Approaches to Virtual Screening / Varnek A., Tropsha A., Eds. RSC Publishing. 2008. P. 1-43.
[6] Konyaev D.S. Methods of data analysis and chemical informatics in the study of complexation in solution and on the surface of chemically modified silicas. Candidate's Dissertation in Chemistry. Kharkov: 1999. P. 139-141. [in Russian].
[7] Kireeva N.V. Prediction of the stability constants of complexes of lanthanide and alkaline earth metals with organic ligands and of the melting points of ionic liquids by methods of chemical informatics // Abstract of Candidate's Dissertation in Chemistry. M.: A.N. Frumkin IPhChE RAS and Strasbourg University (France). 2010. 25 p. [in Russian].
[8] Haykin S. Neural Networks: A Complete Course. 2nd Ed. M.: Publishing House "Williams". 2006. 1104 p. [in Russian].
[9] Bondarev N.V. Solvation-thermodynamic effects of the water-methanol solvent in the coordination of the cations Na+, K+, NH4+ and Ag+ with 18-crown-6. Gibbs energy of complexation and of reagent resolvation // Zh. Obshch. Khim. 2006. Vol. 76. № 1. P. 13-18.
[10] Smirnova E.V., Tsyba Yu.V., Bondarev N.V., Zaitseva I.S. Regression analysis of the impact of the properties of water-dimethylsulfoxide solvents on the strength of acetic and benzoic acids // Materiały V Międzynarodowej naukowe-praktycznej konferencji "Naukowa przestrzeń Europy - 2009". Przemyśl, Polska: Nauka i studia. 2009. Vol. 7 (Chemia i chemiczne technologie). S. 6-8.
[11] Baskin I.I., Palyulin V.A., Zefirov N.S. Application of artificial neural networks in chemical and biochemical studies // Vestn. Mosk. Univ., Ser. 2: Khim. 1999. Vol. 40. № 5. P. 323-326.
[12] Galbershtam N.M., Baskin I.I., Palyulin V.A., Zefirov N.S. Neural networks as a method for studying structure-property relationships of organic compounds // Uspekhi Khimii. 2003. Vol. 72. № 7. P. 706-727.
[13] Krasnianchyn Ya.N., Panteleimonov A.V., Kholin Yu.V. Parameterization of artificial neural networks in solving tasks of qualitative chemical analysis // Vestn. Kharkov. Nat. Univ. 2010. № 895. Khim. Issue 18 (41). P. 39-45.
[14] Baskin I.I. Modeling properties of chemical compounds using artificial neural networks and fragment descriptors // Abstract of Doctoral Dissertation in Physical and Mathematical Sciences. M.: M.V. Lomonosov Moscow State University. 2010. 49 p.
[15] Ezhov A.A., Shumsky S.A. Neurocomputing and its application in economics and business. M.: Moscow Engineering Physics Institute. 1998. 224 p. [in Russian].
[16] Uossermen F. Neural Computing. Theory and Practice. M.: Mir. 1992. 118 p. [in Russian].
[17] Rosenblatt F. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. M.: Mir. 1965. 480 p. [in Russian].
[18] Galushkin A.I. The Theory of Neural Networks. Vol. 1. Neurocomputers and their application. M.: IPRZHR. 2000. 416 p. [in Russian].
[19] Gorban A.N., Dunin-Barkovsky V.L., Kirdin A.N. et al. Neuroinformatics. Novosibirsk: Nauka, Siberian Branch of the Russian Academy of Sciences. 1998. 296 p. [in Russian].
[20] Aksenov S.V., Novoseltsev V.B. Organization and use of neural networks (methods and technologies) / Ed. V.B. Novoseltsev. Tomsk: NTL. 2006. 128 p. [in Russian].
[21] Borovikov V.P. Neural Networks. Statistica Neural Networks. Methodology and technologies of contemporary data analysis. 2nd Ed. M.: Hot Line - Telecom. 2008. 392 p. [in Russian].
[22] Tsurko E.N., Bondarev N.V. Mathematical modeling of solvent parameters' relevant contribution to the alpha-amino acid (valine, alpha-alanine) solvation in H2O-MeOH, H2O-EtOH and H2O-PrOH-2 // J. Mol. Liquids. 2007. № 131-132. P. 151-157.
[23] Nikolsky B.P. Reference Book of Chemist. Vol. 3. 2nd Ed. L.: Khimiya. 1965. 1008 p. [in Russian].
[24] Lebed V.I., Bondarev N.V. The dissociation constants and thermodynamic characteristics of dissociation of acetic and benzoic acids in water-methanol and water-dioxane mixtures // Zh. Fiz. Khim. 1982. Vol. 56. № 1. P. 30-33.
[25] Lebed V.I., Bondarev N.V., Paulenova A. The dissociation constants and thermodynamic characteristics of dissociation and solvation of acetic acid in water-propanol-2 mixtures // Zh. Fiz. Khim. 1987. Vol. 61. № 6. P. 1487-1491.
[26] Zaitseva I.S., Eltsov S.V., Kabakov E.N., Bondarev N.V. Correlation analysis of the influence of medium effects on the energetics of complexation of sodium and potassium cations with the ether 18-crown-6 in aqueous-organic solvents // Zh. Obshch. Khim. 2003. Vol. 73. № 7. P. 1079-1084.
[27] Afanas'ev V.N., Efremova L.S., Volkova T.V. Physicochemical properties of binary solvents. Aqueous systems. Ivanovo: IChS RAS. 1988. 412 p. [in Russian].
[28] Kalidas C., Hefter G., Marcus Y. Gibbs energies of transfer of cations from water to mixed aqueous organic solvents // Chem. Rev. 2000. Vol. 100. № 3. P. 819-852.
[29] Hefter G., Marcus Y., Waghorne W.E. Enthalpies of transfer of electrolytes and ions between water and mixed aqueous solvents // Chem. Rev. 2002. Vol. 102. № 8. P. 2773-2836.
[30] Gordon D. Organic chemistry of electrolyte solutions. M.: Mir. 1979. 712 p. [in Russian].
[31] Larina O.V., Bondarev N.V., Kern A.P. Medium effects and complex formation of sodium, potassium, ammonium, and silver(I) salts with 18-crown-6 ether in water-propan-2-ol solvents // Vestn. Kharkov. Nat. Univ. 2007. № 770. Khim. Issue 15 (38). P. 301-312.
[32] Tsyba Yu.V., Bondarev N.V., Zaitseva I.S. Statistical analysis of experimental data and thermodynamics of carboxylic acids dissociation // Materiály IV Mezinárodní vědecko-praktická konference "Věda: teorie a praxe - 2008". Praha, Czech Republic: Publishing House "Education and Science" s.r.o. 2008. Dil 10 (Chemie a chemická technologie). S. 12-16.
[33] StatSoft, Inc. Electronic textbook on statistics. Moscow: StatSoft. 2001. WEB: http://www.statsoft.ru/home/textbook/default.htm.
[34] Widrow B., Hoff M.E. Adaptive switching circuits // IRE WESCON Convention Record. New York. 1960. P. 96-104.
[35] Tsyba Yu.V., Bondarev N.V., Zaitseva I.S. Regression-correlation and thermodynamic analysis of the impact of the properties of the solvents water-methanol, water-ethanol and water-propan-2-ol on the strength of acetic acid // Materiali za V Mezhdunarodna nauchna praktichna konferentsiya "Динамиката на съвременната наука - 2009". Sofia, Bulgaria: "Бял ГРАД-БГ" OOD. 2009. Vol. 12 (Химия и химически технологии). P. 35-37.