Download Average of n independent experiments

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Inductive probability wikipedia , lookup

Foundations of statistics wikipedia , lookup

Law of large numbers wikipedia , lookup

Time series wikipedia , lookup

History of statistics wikipedia , lookup

Degrees of freedom (statistics) wikipedia , lookup

Misuse of statistics wikipedia , lookup

Transcript
Formulae for Statistical Treatment of Experimental Data
1.
n independent experiments
Average of n independent experiments
x=
∑x
i
N
∑x
2
−
Variance
s2 =
Standard deviation of the mean
σm = s / N
Uncertainty interval
x±t
(∑ x )2
N
N −1
=
1
( xi − x ) 2
∑
N −1
s
= x ± tσ m
n
where t is student’s factor for a given probability and N − 1 degrees of freedom.
Numerical example
Seven independent experiments led to the values given below for the O−H bond dissociation
enthalpy in phenol, X = DH o (PhO − H) . Calculate the average value and the associated
uncertainty interval.
N
1
2
3
4
5
6
7
X / kJ mol-1
366.3
372.1
375.0
374.3
369.3
372.4
369.8
X−Xmean
-5.01
0.79
3.69
2.99
-2.01
1.09
-1.51
Sum of X
(X−Xmean)2
25.14
0.62
13.58
8.91
4.06
1.18
2.29
2599.2
Xmean
371.31
2
Sum of (X−Xmean)
Sum2/ 7
Variance
Sigma m squared
Sigma m
Uncertainty (2×Sigma m)
55.79
965120.09
9.30
1.33
1.15
2.3
1
The result is 371.2±2.3 kJ mol-1. The factor 2, which multiplies the standard deviation of the
mean, is the so-called “thermochemical convention”. An alternative procedure, as pointed out
above, is using student’s t-factor for 6 degrees of freedom and for a given probability.
The variance can be easily calculated with the Descriptive Statistics package (Tools, Data
Analysis) of Microsoft Excel. Note that this package does not provide the standard deviation
of the mean directly. The value for the student’s factor may be calculated by using the
statistical function “TINV” with arguments 0.05 (1−0.95 probability) and the corresponding
number of degrees of freedom.
2.
Error propagation for a given function
Error propagation for Y = f ( x1 , x2 , x3 )
Error in Y ( σ i are the σ m for each xi )
Examples:
2
y = mx + b
σ Y = m σ x ; σ Y2 = m 2σ x2
y = a/ x
σY =
σY =
y = ln x
3.
⎛ ∂f ⎞
σ = ∑ ⎜ ⎟ σ i2
∂xi ⎠
i =1 ⎝
n
2
Y
aσ x
x
2
σx
x
2
; σ Y2 = a 4 σ x2
x
;σ =
2
Y
σ x2
x
2
Error propagation for a given function
Average of several values with different uncertainties ( σ i ):
∑x σ
x=
∑1 σ
i
2
i
i
2
i
1
σ
2
=∑
i
1
σ i2
i
Numerical example
The example is identical to the one given above. The only (important!) difference is that now
each value was itself obtained from several independent experiments (even through different
2
techniques), so that the uncertainty intervals affecting each value are provided. Each value
and associated uncertainty (standard deviation of the mean) were calculated as described in
the example above. What is the mean value and the uncertainty in this case?
N
1
2
3
4
5
6
7
σ
X/σ2
5.72
5.81
44.59
23.39
23.08
23.28
5.78
X / kJ mol-1
366.3
372.1
375.0
374.3
369.3
372.4
369.8
8
8
2.9
4
4
4
8
Sum of X/σ2
Sum of 1/σ2
131.7
0.353
Xmean
Variance
Sigma
372.7
2.83
1.68
1/σ2
0.01563
0.01563
0.11891
0.06250
0.06250
0.06250
0.01563
The result is 372.7±1.7 kJ mol-1.
4.
Linear regression
y = mx + b
∑ (x
n
m=
i =1
i
)(
− x yi − y
∑ (x
n
i
i =1
−x
)
)
2
b = y − mx
Regression coefficient:
∑ [(x
n
r=
i
i =1
∑ (x
n
i =1
i
)(
− x yi − y
−x
) ∑ (y
2
n
i =1
i
)]
−y
)
2
3
Variance of the slope:
∑(
s =
2
m
) (
)
(
)(
)
⎡n
⎤
xi − x ∑ yi − y − ⎢∑ xi − x yi − y ⎥
i =1
⎣ i =1
⎦
2
n
2
(n − 2)⎡⎢∑ xi − x ⎤⎥
⎣ i =1
⎦
n
i =1
n
2
2
(
2
or
)
r −2 − 1
n−2
sm
=
m
Variance of the intercept:
⎛ n
⎞
sb2 = s m2 ⎜ ∑ xi2 ⎟ / n
⎝ i =1 ⎠
Uncertainty of y 0 = mx0 + b , for a given x0 :
∑ (x
n
s y20 =
i =1
i
−x
) ∑ (y − y ) − ∑ [(x − x )(y − y )]
2
n
n
2
i =1
i
2
i
i =1
i
(n − 2)∑ (xi − x )
n
2
i =1
⎡
⎤
2
⎢1
x −x ⎥
⎢ + n 0
⎥
2⎥
⎢n
xi − x
∑
⎢⎣
⎥⎦
i =1
(
)
(
)
Uncertainty of x0 = ( y0 − b) / m , for a given y 0 :
∑ (x
n
s x20 =
i =1
i
−x
) ∑ (y − y ) − ∑ [(x − x )(y − y )]
2
n
i =1
n
2
i
2
i
i =1
i
(n − 2)m 2 ∑ (xi − x )
n
2
i =1
⎡
2
⎢1 1
y0 − y
⎢ + +
n
⎢ n l m2
xi − x
∑
⎢⎣
i =1
(
(
)
⎤
⎥
⎥
2⎥
⎥⎦
)
Standard deviation of the fit:
σ = ∑ (mxi + b − yi ) / (n − 2)
i
F=
σ
where σ y ≈ s y =
σ y2
2
∑∑ ( y
i
− yi )
2
i, j
j
∑n
i
−1
;
yi =
1
ni
ni
∑y
i, j
j =1
i
Better fits yield small F-values. The F value calculated for a given fit must be smaller than F
tabulated for N − 2 degrees of freedom (where N is the number of data points) and a given
probability.
4
Uncertainy intervals:
y = (m ± σ m × t ) x + b ± σ b × t
where σ m = sm and σ b = sb are the standard deviations of the slope and the intercept,
respectively, and t is student’s factor for N − 2 degrees of freedom (where N is the number of
data points) and a given probability.
Numerical example
According to the Benson scheme, the standard enthalpies of formation of any n-alkane in the
gas phase, can be calculated using the group terms [C−(H)3C] and [C−(H)2(C)2]:
∆ f H o [CH3 (CH2 ) n CH3 , g ] = 2[C - (H)3 (C)] + n[C - (H)2 (C)2 ]
In other words the enthalpies of formation of CH3(CH2)nCH3 vary linearly with n. This
correlation can be used to derive the above Benson terms. This example illustrates how.
n
0
1
2
3
4
5
6
7
8
9
10
∆fHo(CH3(CH2)nCH3,g) / kJ mol-1
-83.8
-104.7
-125.7
-146.9
-166.9
-187.6
-208.5
-228.2
-249.5
-270.8
-289.4
error
0.3
0.5
0.6
0.8
0.8
1.3
1.3
0.6
1.3
2.5
2.1
The regression analysis using the above formulae yields:
∆ f H o [CH 3 (CH 2 ) n CH 3 , g ] = 2 × (−84.3636 ± 0.3483 × t ) + (−20.61818 ± 0.058869 × t ) × n
where t is student’s factor for 9 degrees of freedom and for a given probability (e.g. t = 2.262
for 95% probability).
The calculation of σ m = sm and σ b = sb can be easily done with any spreadsheet. With
Microsoft Excel (Tools, Data Analysis, Regression). The value of σ m appears under
“Standard Error” and “X Variable”; the value of σ b appears under “Standard Error” and
5
“Intercept” (the value for the student’s factor may be calculated by using the statistical
function “TINV”, as described above). It is, however, fairly easy to make mistakes in the
selection of data. The above example can be used to test the package.
6