Basic Data Handling in Nuclear and Reactor Applications
Calle Persson
Department of Reactor Physics
Royal Institute of Technology
[email protected]
www.neutron.kth.se
2006-12-19
Table of Contents
1 The Poisson Distribution
2 The Propagation of Error Formula
3 Linear Fitting
4 The χ2-test
5 Mean Values
6 Number of Accurate Figures
7 Preparation Tasks
8 References
Appendix 1: Solutions to the Preparation Tasks
Basic Data Handling in Nuclear and Reactor Applications
Calle Persson, Department of Reactor Physics, KTH
Basic Data Handling in Nuclear and Reactor Applications
When analyzing experimental data, it is essential to treat the collected data correctly. Different results
may be obtained from the same raw data depending on how the information is handled and which methods
have been applied. Such situations can be avoided by following some general statistical rules for data
handling. It is recommended to read this document carefully before starting any data analysis. It is not
the aim of this document to give a complete derivation of all statistical relations and distributions;
rather, the document serves as a short handbook for an engineer or experimentalist.
1 The Poisson Distribution
An event that occurs with a constant probability per unit time is said to be Poisson distributed.
Radioactive decay is a typical example of a process obeying a Poisson distribution. Each decay is completely
random in time and so, therefore, is the subsequent detection of the related particle or gamma.
Consequently, the data collected from a measurement performed with radioactive samples should be
treated as Poisson distributed.
The standard deviation, σi, of a measurement performed on a process obeying a Poisson distribution is
given by
$$\sigma_i = \sqrt{y_i}, \tag{1}$$
where yi is the number of events detected during measurement i. The relative error, erel, consequently
decreases as the inverse square root of the number of counts:
$$e_{rel} = \frac{\sigma_i}{y_i} = \frac{\sqrt{y_i}}{y_i} = \frac{1}{\sqrt{y_i}}. \tag{2}$$
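As a small illustration of Eq. (1) and Eq. (2), the following Python sketch evaluates the standard deviation and the relative error for a few count values. The function name `poisson_uncertainty` and the chosen count values are assumptions made for this example only.

```python
import math

def poisson_uncertainty(counts):
    """Return (sigma, relative error) of a Poisson distributed count, Eq. (1) and Eq. (2)."""
    if counts <= 0:
        raise ValueError("counts must be positive")
    sigma = math.sqrt(counts)      # Eq. (1)
    return sigma, sigma / counts   # Eq. (2): equals 1/sqrt(counts)

# Hypothetical count values: each factor of 100 in counts reduces the relative error by 10.
for y in (100, 10_000, 1_000_000):
    sigma, e_rel = poisson_uncertainty(y)
    print(f"y = {y:>9d}  sigma = {sigma:8.1f}  e_rel = {100 * e_rel:.2f} %")
```

The loop shows why long counting times pay off slowly: to halve the relative error, four times as many counts are needed.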
2 The Propagation of Error Formula
Assume that a measurement has been performed with the results yi and σi. Now the standard deviation
σ(f(yi)) is to be calculated, where f(yi) is a quantity that is a function of the measured values. This can be
done by using the Propagation of Error Formula:
$$\sigma(f(y_i)) = \sqrt{\sum_i \left( \frac{\partial f}{\partial y_i}\,\sigma(y_i) \right)^2}. \tag{3}$$
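When the partial derivatives are tedious to write down by hand, Eq. (3) can also be evaluated numerically. The sketch below is one possible way to do this in Python, using central differences for the derivatives; the function name `propagate_error`, the step size and the rectangle example are assumptions for illustration. Like Eq. (3), it treats the inputs as uncorrelated.

```python
import math

def propagate_error(f, values, sigmas, rel_step=1e-6):
    """Standard deviation of f(*values) according to Eq. (3).

    The partial derivatives are approximated by central differences,
    which is an assumption of this sketch (analytic derivatives are exact).
    """
    variance = 0.0
    for i, (value, sigma) in enumerate(zip(values, sigmas)):
        step = rel_step * (abs(value) if value != 0 else 1.0)
        up = list(values)
        down = list(values)
        up[i] = value + step
        down[i] = value - step
        dfdy = (f(*up) - f(*down)) / (2.0 * step)   # approximates df/dy_i
        variance += (dfdy * sigma) ** 2
    return math.sqrt(variance)

# Hypothetical example: area of a rectangle with sides 2.0 +/- 0.1 and 3.0 +/- 0.2.
def area(width, height):
    return width * height

sigma_area = propagate_error(area, [2.0, 3.0], [0.1, 0.2])
print(f"area = {area(2.0, 3.0):.1f} +/- {sigma_area:.1f}")
```

For this product the analytic result is sqrt((3·0.1)² + (2·0.2)²) = 0.5, which the numerical version should reproduce closely.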
Example 1: The prompt neutron decay constant, α, has been measured in a subcritical reactor. In
subcritical systems, this is an important quantity that tells how subcritical the core is through the relation
$$\rho = \alpha\Lambda + \beta_{eff}, \tag{4}$$
where ρ is the reactivity, Λ is the neutron mean generation time and βeff is the effective delayed neutron
fraction. Experimentally, α was determined to be -420±15 s⁻¹ in an arbitrary subcritical core, and by
simulations, the neutron mean generation time and the effective delayed neutron fraction were estimated to
be 100±10 µs and 750±30 pcm, respectively. By employing the Propagation of Error Formula, Eq. (3), the
standard deviation of the reactivity can be calculated as
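$$\sigma(\rho) = \sqrt{\left(\frac{\partial\rho}{\partial\alpha}\sigma(\alpha)\right)^2 + \left(\frac{\partial\rho}{\partial\Lambda}\sigma(\Lambda)\right)^2 + \left(\frac{\partial\rho}{\partial\beta_{eff}}\sigma(\beta_{eff})\right)^2} = \sqrt{(\Lambda\sigma(\alpha))^2 + (\alpha\sigma(\Lambda))^2 + \sigma^2(\beta_{eff})} = 447\ \text{pcm}. \tag{5}$$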
The final result is therefore ρ=-3450±450 pcm. This corresponds to an effective multiplication factor, keff,
of
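$$k_{eff} = \frac{1}{1-\rho} = 0.9667 \tag{6}$$
with the standard deviation
$$\sigma(k_{eff}) = \sqrt{\left(\frac{\partial k_{eff}}{\partial\rho}\sigma(\rho)\right)^2} = \frac{\partial k_{eff}}{\partial\rho}\sigma(\rho) = \frac{\sigma(\rho)}{(1-\rho)^2} = 0.0042. \tag{7}$$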
The final result should then be given as keff=0.967±0.004, taking into account the number of relevant
digits with respect to the error.
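The numbers in Example 1 can be checked with a few lines of Python. This is only a sketch: the variable names are chosen here for illustration, and the values given in pcm are converted to absolute units using 1 pcm = 10⁻⁵.

```python
import math

# Input data of Example 1
alpha, sigma_alpha = -420.0, 15.0            # prompt neutron decay constant [1/s]
gen_time, sigma_gen_time = 100e-6, 10e-6     # neutron mean generation time [s]
beta_eff, sigma_beta_eff = 750e-5, 30e-5     # effective delayed neutron fraction (750 and 30 pcm)

# Reactivity, Eq. (4), and its standard deviation, Eq. (5)
rho = alpha * gen_time + beta_eff
sigma_rho = math.sqrt((gen_time * sigma_alpha) ** 2
                      + (alpha * sigma_gen_time) ** 2
                      + sigma_beta_eff ** 2)

# Effective multiplication factor, Eq. (6), and its standard deviation, Eq. (7)
k_eff = 1.0 / (1.0 - rho)
sigma_k_eff = sigma_rho / (1.0 - rho) ** 2

print(f"rho   = {rho * 1e5:.0f} +/- {sigma_rho * 1e5:.0f} pcm")
print(f"k_eff = {k_eff:.4f} +/- {sigma_k_eff:.4f}")
```

Looking at the three terms under the square root separately shows that the uncertainty of Λ dominates the result (420 pcm out of 447 pcm).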
3 Linear Fitting
Let us assume that a set of data points yi has been measured for different conditions xi (for instance space
or time) and that there exists a linear dependence between these data according to
$$y = ax + b. \tag{8}$$
The task is to find a and b and their standard deviations σ(a) and σ(b) in this linear relation. Assuming
that all measurements have the same error, this can be solved by minimizing the sum
$$S_{LS} = \sum_i (y_i - a x_i - b)^2 \tag{9}$$
with respect to a and b. This is generally known as the Least Squares Method. However, since we are dealing
with Poisson distributed data, we know that the standard deviation of each measurement is
$$\sigma_i(y_i) = \sqrt{y_i} \tag{10}$$
and that yi and consequently σi(yi) will be different for each measurement. Therefore, measured data with
higher accuracy should be given higher weight in the linear fitting procedure, according to the following
relation:
$$S = \sum_i \frac{(y_i - a x_i - b)^2}{\sigma_i^2}. \tag{11}$$
Minimizing S with respect to a and b gives a system of two equations that can be solved:
$$\frac{\partial S}{\partial a} = -2\sum_i \frac{(y_i - a x_i - b)\,x_i}{\sigma_i^2} = 0, \qquad \frac{\partial S}{\partial b} = -2\sum_i \frac{(y_i - a x_i - b)}{\sigma_i^2} = 0. \tag{12}$$
By introducing the notation
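$$A = \sum_i \frac{x_i}{\sigma_i^2}, \quad B = \sum_i \frac{1}{\sigma_i^2}, \quad C = \sum_i \frac{y_i}{\sigma_i^2}, \quad D = \sum_i \frac{x_i^2}{\sigma_i^2}, \quad E = \sum_i \frac{x_i y_i}{\sigma_i^2} \tag{13}$$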
the result can be written as
$$a = \frac{EB - CA}{BD - A^2} \tag{14}$$
$$b = \frac{DC - EA}{BD - A^2}. \tag{15}$$
The standard deviations of these parameters are here given without further explanation as
$$\sigma(a) = \sqrt{\frac{B}{BD - A^2}} \tag{16}$$
and
$$\sigma(b) = \sqrt{\frac{D}{BD - A^2}}. \tag{17}$$
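Eq. (13) to Eq. (17) translate almost line by line into code. The following Python sketch is one way to write them down; the function name `weighted_linear_fit` and the test data (four points lying exactly on y = 2x + 1, each with σ = 0.5) are assumptions made for illustration.

```python
import math

def weighted_linear_fit(x, y, sigma):
    """Weighted fit of y = a*x + b; returns (a, sigma_a, b, sigma_b)."""
    w = [1.0 / s ** 2 for s in sigma]                      # weights 1/sigma_i^2
    A = sum(wi * xi for wi, xi in zip(w, x))               # Eq. (13)
    B = sum(w)
    C = sum(wi * yi for wi, yi in zip(w, y))
    D = sum(wi * xi ** 2 for wi, xi in zip(w, x))
    E = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = B * D - A ** 2                                   # common denominator
    a = (E * B - C * A) / det                              # Eq. (14)
    b = (D * C - E * A) / det                              # Eq. (15)
    return a, math.sqrt(B / det), b, math.sqrt(D / det)    # Eq. (16), Eq. (17)

# Hypothetical data on the line y = 2x + 1, all with the same error.
a, a_std, b, b_std = weighted_linear_fit([0, 1, 2, 3], [1, 3, 5, 7], [0.5] * 4)
print(f"a = {a:.3f} +/- {a_std:.3f}, b = {b:.3f} +/- {b_std:.3f}")
```

Note that σ(a) and σ(b) depend only on the x values and the errors, not on how well the points follow the line; judging the quality of the fit requires a separate test.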
4 The χ2-test
When fitting functions to experimental data, it is important to know how well the assumed
model describes the outcome of the experiment. It might happen that the proposed function does not
describe reality well and another model must be used. If the previously defined quantity S is divided by
the number of degrees of freedom, ν, of the problem, the reduced χ2 (“chi-squared” or “chi-two”) is obtained:
$$\frac{\chi^2}{\nu} = \frac{S}{\nu}. \tag{18}$$
The number of degrees of freedom is given by
$$\nu = N - m, \tag{19}$$
where N is the number of measurements and m is the number of parameters of the model. In the case of a
linear fit, m=2. If the reduced χ2 is close to one, the quality of the fit is good. A more rigorous way to
evaluate whether the quality of the fit is acceptable or not is to calculate the probability
$$P(\chi^2 \ge S).$$
If this probability is larger than 5%, the fit is considered to be acceptable. The probability is calculated
by integrating the χ2-distribution
$$P(u)\,du = \frac{(u/2)^{\nu/2-1} e^{-u/2}}{2\Gamma(\nu/2)}\,du \tag{20}$$
according to
$$P(\chi^2 \ge S) = 1 - \int_0^S \frac{(u/2)^{\nu/2-1} e^{-u/2}}{2\Gamma(\nu/2)}\,du. \tag{21}$$
In Eq. (20) and Eq. (21), u=χ2 (just to avoid confusion with the exponent) and
$$\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt \tag{22}$$
is the Gamma function (see footnote 1). The χ2-test and the P(χ2≥S)-test can serve as an excellent tool when judging the
quality of a function fit. However, a visual control by plotting the data together with the fitted function is
essential for the final judgment.
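The probability in Eq. (21) can be computed without a numerical integration routine. With the substitution t = u/2, the integral becomes the regularized lower incomplete gamma function, which has a convergent series expansion. The Python sketch below uses that series instead of direct quadrature; the function name `chi2_probability`, the tolerance and the example values of S, N and m are assumptions for illustration.

```python
import math

def chi2_probability(S, nu, tol=1e-12, max_terms=10_000):
    """P(chi^2 >= S) for nu degrees of freedom, Eq. (21).

    Uses the series gamma(a, x) = x^a e^(-x) * sum_n x^n / (a (a+1) ... (a+n))
    with a = nu/2 and x = S/2 (a swapped-in technique, not direct quadrature).
    """
    if S < 0 or nu <= 0:
        raise ValueError("S must be non-negative and nu positive")
    if S == 0:
        return 1.0
    a, x = nu / 2.0, S / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    # Work with logarithms (lgamma) to avoid overflow of Gamma(nu/2) for large nu.
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a))
    return 1.0 - lower

# Hypothetical fit result: S = 3.1 from N = 7 measurements and m = 2 parameters.
S, N, m = 3.1, 7, 2
nu = N - m                                   # Eq. (19)
print(f"reduced chi2 = {S / nu:.2f}, P = {chi2_probability(S, nu):.2f}")
```

Using `math.lgamma` rather than `math.gamma` is a deliberate choice here, since the Gamma function itself grows very quickly with ν.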
Example 2: The data in Table 1 have been obtained experimentally and a linear dependence between x
and y is assumed. Eq. (14) to Eq. (18) give the values a=7.0±0.3, b=10.7±1.2 with χ2/ν≈0.6 and
P(χ2≥S)≈68%. The reduced χ2 is somewhat low, but P(χ2≥S) is well above 5% which tells us that the fit is
good.
Table 1. Example of experimental data.
x       1    2    3    4    5    6    7
y       18   26   30   38   43   53   61
σ(y)    1    4    2    3    2    2    2
[Plot: y versus x, data points together with the fitted line y=ax+b.]
Figure 1. Linear fit to experimental data.
Often, the linear fit cannot be performed directly on the measured values. For instance, if the
measured quantity is of exponential nature, the problem must first be transformed into a linear problem.
Consider the decay of a radioactive source:
$$N = \varepsilon N_0 e^{-\lambda t}, \tag{23}$$
where N is the number of detected decays, ε is the efficiency of the detector, N0 is the decay rate at time
zero and λ is the decay constant. This relation can be transformed into a linear problem by taking the natural
logarithm of both sides:
$$\ln(N) = \ln(\varepsilon N_0) - \lambda t. \tag{24}$$
This is equivalent to the previously used linear model with the transformations
$$y = \ln(N), \quad x = t, \quad a = -\lambda, \quad b = \ln(\varepsilon N_0). \tag{25}$$
Footnote 1: Numerical problems may appear when calculating this integral for large values of z.
It is important to remember that the standard deviation, σi(Ni), must be transformed as well. Using the
Propagation of Error Formula, Eq. (3), the transformation of the standard deviation is given by
$$\sigma(\ln(N)) = \sqrt{\left(\frac{\partial \ln(N)}{\partial N}\sigma(N)\right)^2} = \frac{1}{N}\sigma(N) = \frac{1}{N}\sqrt{N} = \frac{1}{\sqrt{N}}. \tag{26}$$
Example 3: Consider a measurement of a radioactive decay. The data in Table 2 have been collected.
What is the half-life of the sample?
Table 2. Example of radioactive decay data.
t [s]                    10     20    30    40    50    60    70
N [cps] (footnote 2)     1000   670   500   400   320   220   165
After performing the linearization (Eq. (24)) and transformation (Eq. (25)), the parameters a and b can
be calculated using Eq. (14) to Eq. (17). Note that Eq. (26) must be used when calculating the
components of Eq. (13). The results are a=-0.029±0.001 and b=7.15±0.03 and hence the decay constant is
given by λ=-a=0.029±0.001 s⁻¹. The reduced χ2 is found to be 1.7, which is somewhat high. On the other
hand, P(χ2≥S)≈13% is above the limit of 5% and the fit is acceptable. The half-life is found from the
relation
$$t_{1/2} = \frac{\ln 2}{\lambda} = 23.7\ \text{s} \tag{27}$$
and by using the Propagation of Error Formula, Eq. (3), the standard deviation is
$$\sigma(t_{1/2}) = \sqrt{\left(\frac{\partial t_{1/2}}{\partial\lambda}\sigma(\lambda)\right)^2} = \frac{\ln 2}{\lambda^2}\sigma(\lambda) = 0.8\ \text{s}. \tag{28}$$
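The steps of Example 3 can be followed in a short Python script. It is a sketch only: the variable names are chosen here, and the calculation is carried out with unrounded intermediate values, so the last digits may differ slightly from the rounded numbers quoted in the text.

```python
import math

t = [10, 20, 30, 40, 50, 60, 70]             # time [s], Table 2
N = [1000, 670, 500, 400, 320, 220, 165]     # count rate [cps], Table 2

# Linearization, Eq. (25), and transformed standard deviations, Eq. (26)
y = [math.log(n) for n in N]
sigma = [1.0 / math.sqrt(n) for n in N]

# Weighted sums, Eq. (13)
w = [1.0 / s ** 2 for s in sigma]
A = sum(wi * ti for wi, ti in zip(w, t))
B = sum(w)
C = sum(wi * yi for wi, yi in zip(w, y))
D = sum(wi * ti ** 2 for wi, ti in zip(w, t))
E = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
det = B * D - A ** 2

# Fit parameters and their standard deviations, Eq. (14) to Eq. (17)
a = (E * B - C * A) / det
b = (D * C - E * A) / det
a_std = math.sqrt(B / det)
b_std = math.sqrt(D / det)

# Reduced chi-square, Eq. (11), Eq. (18) and Eq. (19) with m = 2
S = sum(((yi - a * ti - b) / si) ** 2 for ti, yi, si in zip(t, y, sigma))
red_chisq = S / (len(t) - 2)

# Decay constant, half-life, Eq. (27), and its standard deviation, Eq. (28)
decay_const = -a
half_life = math.log(2) / decay_const
half_life_std = math.log(2) / decay_const ** 2 * a_std

print(f"a = {a:.4f} +/- {a_std:.4f}, b = {b:.3f} +/- {b_std:.3f}")
print(f"reduced chi2 = {red_chisq:.2f}")
print(f"half-life = {half_life:.1f} +/- {half_life_std:.1f} s")
```

Because σ(ln N) = 1/√N, the weights 1/σ² are simply the count rates themselves, so the early points with many counts dominate the fit.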
[Two plots: left panel "Exponential", N [cps] versus t [s]; right panel "y=ax+b", ln(y) versus x.]
Figure 2. Experimental data and function fitting in linear scale and logarithmic scale.
Footnote 2: Counts per second (s⁻¹).
5 Mean Values
Let yi, i=1,2,3…N, be the results of N measurements, all of them made under the same conditions and
with the same error. The arithmetic mean value, ȳ, is then given by
$$\bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i. \tag{29}$$
If the values yi are Gaussian distributed, for instance when performing successive measurements of
distances etc., the best estimate of the error of the arithmetic mean value is the standard error
$$\sigma(\bar{y}) = \frac{\sigma}{\sqrt{N}}, \tag{30}$$
where
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(y_i - \bar{y})^2}{N-1}} \tag{31}$$
is the ordinary definition of the standard deviation. On the other hand, if the data are Poisson distributed,
the best estimate of the error of the arithmetic mean value is
$$\sigma(\bar{y}) = \sqrt{\frac{\bar{y}}{N}}. \tag{32}$$
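The two error estimates of the mean can be compared side by side. The Python sketch below evaluates Eq. (29) to Eq. (32) for the three count values listed in Table 3 of the Preparation Tasks; the function name `mean_with_errors` is an assumption made for this example.

```python
import math

def mean_with_errors(values):
    """Return (mean, standard error, Poisson error of the mean)."""
    n = len(values)
    mean = sum(values) / n                                              # Eq. (29)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))     # Eq. (31)
    standard_error = std / math.sqrt(n)                                 # Eq. (30)
    poisson_error = math.sqrt(mean / n)                                 # Eq. (32)
    return mean, standard_error, poisson_error

counts = [257, 281, 272]     # counts per 10 s, Table 3
mean, standard_error, poisson_error = mean_with_errors(counts)
print(f"mean = {mean:.0f}")
print(f"standard error (Gaussian data), Eq. (30): {standard_error:.1f}")
print(f"error of the mean (Poisson data), Eq. (32): {poisson_error:.1f}")
```

With only three values the sample standard deviation of Eq. (31) is itself poorly determined, which is one reason to prefer Eq. (32) when the data are known to be Poisson distributed.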
If the measurements were performed under different conditions, with a different standard deviation, σi,
for each measurement, the weighted mean value, μ̂, should be used:
$$\hat{\mu} = \frac{\sum_{i=1}^{N} y_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2}. \tag{33}$$
Such a situation might occur, for instance, if the same observable is measured using different techniques
with different accuracy. The standard deviation of the weighted mean value is given by
$$\sigma(\hat{\mu}) = \frac{1}{\sqrt{\sum_{i=1}^{N} 1/\sigma_i^2}}. \tag{34}$$
Note that if σi is the same for all measurements, Eq. (33) transforms into Eq. (29).
Eqs. (33) and (34) are valid only for Gaussian distributed data. However, the mean value of Poisson
distributed data is Gaussian distributed.
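A minimal Python sketch of Eq. (33) and Eq. (34) is given below. The function name `weighted_mean` and the two measurements (10.0±1.0 and 12.0±2.0) are hypothetical and serve only to show how the more accurate value pulls the mean towards itself.

```python
import math

def weighted_mean(values, sigmas):
    """Return the weighted mean, Eq. (33), and its standard deviation, Eq. (34)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    return mean, 1.0 / math.sqrt(total_weight)

# Hypothetical: the same observable measured with two techniques of different accuracy.
mu, mu_std = weighted_mean([10.0, 12.0], [1.0, 2.0])
print(f"weighted mean = {mu:.2f} +/- {mu_std:.2f}")
```

The combined standard deviation is smaller than that of the best single measurement, since every additional measurement adds a positive term to the sum in Eq. (34).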
6 Number of Accurate Figures
A common mistake is to include all figures from the last calculation step in the final presentation of the
result and, thereby, give false information concerning the accuracy of the measurement and its result. If
the calculation has been done without error estimation of the included parameters, the number of figures
in the final result may not be more than given in the input data. For instance:
175/16.2 = 10.8024… ≈ 10.8
If the calculation has been performed on quantities with known errors, the result may not have more
accurate figures than described by the error. Moreover, the error itself should be left with one accurate
figure, maximum two if further analysis of the result is expected. Examples:
(175±1)/(16.2±0.1) = 10.8024…±0.0908… ≈ 10.80±0.09
(175±30)/(16.2±0.1) = 10.8024…±1.8530… ≈ 11±2
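The rounding rule can be automated. The Python sketch below rounds the error to a chosen number of accurate figures and the value to the same decimal place; the function name `round_to_error` and its `figures` parameter are assumptions for illustration, and Python's built-in `round` may treat exact halfway cases differently from rounding by hand.

```python
import math

def round_to_error(value, error, figures=1):
    """Return (value, error) as strings, rounded to the accuracy given by the error."""
    if error <= 0:
        raise ValueError("error must be positive")
    exponent = math.floor(math.log10(error))      # decade of the leading figure
    decimals = figures - 1 - exponent
    rounded_error = round(error, decimals)
    # Rounding may push the error into the next decade (e.g. 0.096 -> 0.1).
    new_exponent = math.floor(math.log10(rounded_error))
    if new_exponent != exponent:
        decimals = figures - 1 - new_exponent
        rounded_error = round(rounded_error, decimals)
    rounded_value = round(value, decimals)
    shown = max(decimals, 0)                      # number of decimals to print
    return f"{rounded_value:.{shown}f}", f"{rounded_error:.{shown}f}"

# The two quotients from the examples above
for value, error in ((10.8024, 0.0908), (10.8024, 1.8530)):
    v, e = round_to_error(value, error)
    print(f"{v} +/- {e}")
```

Calling the function with `figures=2` keeps two figures in the error, which the text allows when further analysis of the result is expected.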
7 Preparation Tasks
1. Create a function, for instance in Matlab, that takes x, y and σ as input parameters and delivers a,
σ(a), b, σ(b), χ2/ν and P(χ2≥S). Try to reproduce the calculations in Examples 2 and 3 above.
Hint: In Matlab, a function is defined by writing
function [a, a_std, b, b_std, red_chisq, P] = lsq(x,y,std)
The function is then called by writing
[a, a_std, b, b_std, red_chisq, P] = lsq(x,y,std)
from another m-file or from the command line. Note: the filename of the function file must be
the same as the function name (in this case lsq.m).
2. In reactor applications, cosine functions are frequently used to describe flux distributions etc.
How does the standard deviation, σi(y), transform when the cosine function
$$y = A_{max}\cos(B(x - x_0)) \tag{35}$$
is transformed to a linear relation? Assume that y is Poisson distributed and that the maximum amplitude,
Amax, is known. Hint:
$$\frac{\partial}{\partial x}\arccos(x) = -\frac{1}{\sqrt{1-x^2}}. \tag{36}$$
3. Consider a fission chamber that is measuring the neutron flux in a reactor. Signals are collected
during 10 s and the total number of events during this time is displayed. In order to achieve a
sufficiently low statistical error, three identical measurements are performed according to Table 3.
What is the number of counts per second and the corresponding statistical error?
Table 3. Experimental data.
Measurement number      1     2     3
y [counts per 10 s]     257   281   272
8 References
The information in this document has been collected mainly from:
1. W.R. Leo, Techniques for Nuclear and Particle Physics Experiments, Springer-Verlag, 1994.
2. G. Blom, Sannolikhetsteori och statistikteori med tillämpningar, Studentlitteratur, 1989.
Further reading:
3. W.T. Eadie, Statistical Methods in Experimental Physics, Amsterdam, 1971.
4. D.L. Smith et al., Probability, Statistics and Data Uncertainties in Nuclear Science and Technology,
OCDE/OECD, American Nuclear Society, LaGrange Park, Illinois, USA, 1991.
Appendix 1: Solutions to the Preparation Tasks
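1.
function [a, a_std, b, b_std, red_chisq, P] = lsq(x,y,std)
% LSQ Weighted linear fit y = a*x + b followed by a chi-square test.
% x, y and std are vectors of equal length; std holds sigma(y).
% Note: inside this function the input std hides Matlab's built-in std.
% Weighted sums, Eq. (13)
A=sum(x./std.^2);
B=sum(1./std.^2);
C=sum(y./std.^2);
D=sum(x.^2./std.^2);
E=sum(x.*y./std.^2);
% Fit parameters, Eq. (14) and Eq. (15)
a=(E*B-C*A)/(B*D-A^2);
b=(D*C-E*A)/(B*D-A^2);
% Standard deviations of the parameters, Eq. (16) and Eq. (17)
a_std=sqrt(B/(B*D-A^2));
b_std=sqrt(D/(B*D-A^2));
% Weighted sum of squared residuals, Eq. (11)
S=sum((y-a*x-b).^2./std.^2);
% Reduced chi-square, Eq. (18) and Eq. (19), with m=2 for a linear fit
N=length(x);
nu=N-2;
red_chisq=S/(nu);
% P(chi^2 >= S), Eq. (20) and Eq. (21)
% (newer Matlab releases recommend integral instead of quad)
F=@(u)((u./2).^((nu./2)-1).*exp(-u./2))./(2.*gamma(nu/2));
P=1-quad(F,0,S);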
2.
Transformation:
$$\arccos\left(\frac{y}{A_{max}}\right) = Bx - Bx_0 \tag{37}$$
The error is transformed according to
$$\sigma\left(\arccos\left(\frac{y}{A_{max}}\right)\right) = \sqrt{\left(\frac{\partial}{\partial y}\left(\arccos\left(\frac{y}{A_{max}}\right)\right)\sigma(y)\right)^2} = \left|\frac{\partial}{\partial y}\left(\arccos\left(\frac{y}{A_{max}}\right)\right)\right|\sigma(y) = \frac{1}{\sqrt{1-\left(\frac{y}{A_{max}}\right)^2}}\,\frac{1}{A_{max}}\,\sqrt{y}. \tag{38}$$
3.
Eq. (29) gives the mean value of the three measurements, ȳ=270. Eq. (32) then gives the
associated error, σ(ȳ)≈9. The number of counts per second is consequently 27.0±0.9. Note that
the error must be calculated before the values are transformed to counts per second.