Basic Data Handling in Nuclear and Reactor Applications

Calle Persson
Department of Reactor Physics
Royal Institute of Technology
www.neutron.kth.se
2006-12-19

Table of Contents
1 The Poisson Distribution
2 The Propagation of Error Formula
3 Linear Fitting
4 The χ²-test
5 Mean Values
6 Number of Accurate Figures
7 Preparation Tasks
8 References
Appendix 1: Solutions to the Preparation Tasks

When analyzing experimental data, it is essential to treat the collected data correctly. Different results may be obtained from the same raw data depending on how the information is handled and which methods are applied. Such situations can be avoided by following some general statistical rules for data handling.
It is recommended to read this document carefully before starting any data analysis. The aim is not to give a complete derivation of all statistical relations and distributions; the document is rather meant to serve as a short handbook for an engineer or experimentalist.

1 The Poisson Distribution

The number of events produced by a process that occurs with a constant probability per unit time is Poisson distributed. Radioactive decay is a typical example of such a process. Each decay is completely random in time, and so is the subsequent detection of the emitted particle or gamma. Consequently, data collected from a measurement performed on radioactive samples should be treated as Poisson distributed. The standard deviation, $\sigma_i$, of a measurement performed on a process obeying the Poisson distribution is given by

\[ \sigma_i = \sqrt{y_i}, \tag{1} \]

where $y_i$ is the number of events detected during measurement $i$. The relative error, $e_{rel}$, consequently decreases as the inverse square root of the number of counts:

\[ e_{rel} = \frac{\sigma_i}{y_i} = \frac{\sqrt{y_i}}{y_i} = \frac{1}{\sqrt{y_i}}. \tag{2} \]

2 The Propagation of Error Formula

Assume that a measurement has been performed with the results $y_i$ and $\sigma_i$. Now the standard deviation $\sigma(f(y_i))$ is to be calculated, where $f(y_i)$ is a quantity that is a function of the measured values. For independent measurement errors, this can be done by using the Propagation of Error Formula:

\[ \sigma(f(y_i)) = \sqrt{\sum_i \left( \frac{\partial f}{\partial y_i}\,\sigma(y_i) \right)^2}. \tag{3} \]

Example 1: The prompt neutron decay constant, $\alpha$, has been measured in a subcritical reactor. In subcritical systems, this is an important quantity which tells how subcritical the core is through the relation

\[ \rho = \alpha\Lambda + \beta_{eff}, \tag{4} \]

where $\rho$ is the reactivity, $\Lambda$ is the neutron mean generation time and $\beta_{eff}$ is the effective delayed neutron fraction. Experimentally, $\alpha$ was determined to be $-420 \pm 15$ s$^{-1}$ in an arbitrary subcritical core, and by simulations the neutron mean generation time and the effective delayed neutron fraction were estimated to be $100 \pm 10$ µs and $750 \pm 30$ pcm, respectively.
By employing the Propagation of Error Formula, Eq. (3), the standard deviation of the reactivity can be calculated as

\[ \sigma(\rho) = \sqrt{ \left( \frac{\partial\rho}{\partial\alpha}\,\sigma(\alpha) \right)^2 + \left( \frac{\partial\rho}{\partial\Lambda}\,\sigma(\Lambda) \right)^2 + \left( \frac{\partial\rho}{\partial\beta_{eff}}\,\sigma(\beta_{eff}) \right)^2 } = \sqrt{ \left( \Lambda\,\sigma(\alpha) \right)^2 + \left( \alpha\,\sigma(\Lambda) \right)^2 + \sigma^2(\beta_{eff}) } = 447\ \text{pcm}. \tag{5} \]

The final result is therefore $\rho = -3450 \pm 450$ pcm. This corresponds to an effective multiplication factor, $k_{eff}$, of

\[ k_{eff} = \frac{1}{1-\rho} = 0.9667 \tag{6} \]

with the standard deviation

\[ \sigma(k_{eff}) = \sqrt{ \left( \frac{\partial k_{eff}}{\partial\rho}\,\sigma(\rho) \right)^2 } = \frac{\partial k_{eff}}{\partial\rho}\,\sigma(\rho) = \frac{\sigma(\rho)}{(1-\rho)^2} = 0.0042. \tag{7} \]

The final result should then be given as $k_{eff} = 0.967 \pm 0.004$, taking into account the number of relevant digits with respect to the error.

3 Linear Fitting

Let us assume that a set of data points $y_i$ has been measured for different conditions $x_i$ (for instance space or time) and that there exists a linear dependence between these data according to

\[ y = ax + b. \tag{8} \]

The task is to find $a$ and $b$ and their standard deviations $\sigma(a)$ and $\sigma(b)$ in this linear relation. If all measurements are assumed to have the same error, this can be solved by minimizing the sum

\[ S_{LS} = \sum_i \left( y_i - a x_i - b \right)^2 \tag{9} \]

with respect to $a$ and $b$. This is generally known as the Least Squares Method. However, since we are dealing with Poisson distributed data, we know that the standard deviation of each measurement is

\[ \sigma_i(y_i) = \sqrt{y_i} \tag{10} \]

and that $y_i$, and consequently $\sigma_i(y_i)$, will be different for each measurement. Therefore, measured data with higher accuracy should be given higher weight in the linear fitting procedure, according to the following relation:

\[ S = \sum_i \frac{\left( y_i - a x_i - b \right)^2}{\sigma_i^2}. \tag{11} \]

Minimizing $S$ with respect to $a$ and $b$ gives a system of two equations that can be solved:

\[ \begin{aligned} \frac{\partial S}{\partial a} &= -2\sum_i \frac{\left( y_i - a x_i - b \right) x_i}{\sigma_i^2} = 0, \\ \frac{\partial S}{\partial b} &= -2\sum_i \frac{y_i - a x_i - b}{\sigma_i^2} = 0. \end{aligned} \tag{12} \]

By introducing the notation

\[ A = \sum_i \frac{x_i}{\sigma_i^2}, \quad B = \sum_i \frac{1}{\sigma_i^2}, \quad C = \sum_i \frac{y_i}{\sigma_i^2}, \quad D = \sum_i \frac{x_i^2}{\sigma_i^2}, \quad E = \sum_i \frac{x_i y_i}{\sigma_i^2} \tag{13} \]

the result can be written as

\[ a = \frac{EB - CA}{BD - A^2} \tag{14} \]

\[ b = \frac{DC - EA}{BD - A^2}. \tag{15} \]

The standard deviations of these parameters are here given without further explanation as

\[ \sigma(a) = \sqrt{\frac{B}{BD - A^2}} \tag{16} \]

and

\[ \sigma(b) = \sqrt{\frac{D}{BD - A^2}}. \tag{17} \]

4 The χ²-test

When fitting functions to experimental data, it is important to know how well the assumed model describes the outcome of the experiment. It might happen that the proposed function does not describe reality well and another model must be used. If the previously defined quantity $S$ is divided by the number of degrees of freedom, $\nu$, of the problem, the reduced $\chi^2$ ("chi-squared" or "chi-two") is obtained:

\[ \frac{\chi^2}{\nu} = \frac{S}{\nu}. \tag{18} \]

The number of degrees of freedom is given by

\[ \nu = N - m \tag{19} \]

where $N$ is the number of measurements and $m$ is the number of parameters of the model. In the case of a linear fit, $m = 2$. If the reduced $\chi^2$ is close to one, the quality of the fit is good. A more rigorous way to evaluate whether the quality of the fit is acceptable or not is to calculate the probability $P(\chi^2 \geq S)$. If this probability is larger than 5%, the fit is considered to be acceptable. This is done by integrating the $\chi^2$-distribution

\[ P(u)\,du = \frac{(u/2)^{\nu/2-1}\, e^{-u/2}}{2\,\Gamma(\nu/2)}\,du \tag{20} \]

according to

\[ P(\chi^2 \geq S) = 1 - \int_0^S \frac{(u/2)^{\nu/2-1}\, e^{-u/2}}{2\,\Gamma(\nu/2)}\,du. \tag{21} \]

In Eq. (20) and Eq. (21), $u = \chi^2$ (just to avoid confusion with the exponent) and

\[ \Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\,dt \tag{22} \]

is the Gamma function¹. The $\chi^2$-test and the $P(\chi^2 \geq S)$-test can serve as excellent tools when judging the quality of a function fit.
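The integral in Eq. (21) does not have to be evaluated by direct numerical quadrature. As a sketch (the substitution variable $t$, the lower incomplete gamma function $\gamma$ and the Matlab function name mentioned below are not part of the original text), substituting $t = u/2$ gives

\[ \int_0^S \frac{(u/2)^{\nu/2-1}\, e^{-u/2}}{2\,\Gamma(\nu/2)}\,du = \frac{1}{\Gamma(\nu/2)} \int_0^{S/2} t^{\nu/2-1} e^{-t}\,dt = \frac{\gamma(\nu/2,\, S/2)}{\Gamma(\nu/2)}, \qquad P(\chi^2 \geq S) = 1 - \frac{\gamma(\nu/2,\, S/2)}{\Gamma(\nu/2)}. \]

For $\nu = 2$ this reduces to $P(\chi^2 \geq S) = e^{-S/2}$, which is a convenient sanity check. The ratio $\gamma/\Gamma$ is the regularized incomplete gamma function, which many numerical libraries provide directly; in Matlab, gammainc(S/2, nu/2) should return this lower regularized value, although the argument order is worth checking in the installed version. Using it avoids evaluating $\Gamma(\nu/2)$ separately for large $\nu$.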
However, a visual check, made by plotting the data together with the fitted function, is essential for the final judgment.

¹ Footnote to Eq. (22): numerical problems may appear when calculating this integral for large values of $z$.

Example 2: The data in Table 1 have been obtained experimentally and a linear dependence between $x$ and $y$ is assumed. Eq. (14) to Eq. (18) give the values $a = 7.0 \pm 0.3$ and $b = 10.7 \pm 1.2$ with $\chi^2/\nu \approx 0.6$ and $P(\chi^2 \geq S) \approx 68\%$. The reduced $\chi^2$ is somewhat low, but $P(\chi^2 \geq S)$ is well above 5%, which tells us that the fit is good.

Table 1. Example of experimental data.

x        1    2    3    4    5    6    7
y       18   26   30   38   43   53   61
σ(y)     1    4    2    3    2    2    2

Figure 1. Linear fit to experimental data. [Plot of y versus x with the fitted line y=ax+b.]

Often, the linear fit cannot be performed directly on the measured values. For instance, if the measured quantity is of exponential nature, the problem must be transformed to a linear problem. Consider the decay of a radioactive source:

\[ N = \varepsilon N_0 e^{-\lambda t}, \tag{23} \]

where $N$ is the number of detected decays, $\varepsilon$ is the efficiency of the detector, $N_0$ is the decay rate at time zero and $\lambda$ is the decay constant. This relation can be transformed to a linear problem by taking the natural logarithm of both sides:

\[ \ln(N) = \ln(\varepsilon N_0) - \lambda t. \tag{24} \]

This is equivalent to the previously used linear model with the transformations

\[ y = \ln(N), \qquad x = t, \qquad a = -\lambda, \qquad b = \ln(\varepsilon N_0). \tag{25} \]

It is important to remember that the standard deviation, $\sigma_i(N_i)$, must be transformed as well. Using the Propagation of Error Formula, Eq. (3), the transformation of the standard deviation is given by

\[ \sigma(\ln(N)) = \sqrt{ \left( \frac{\partial \ln(N)}{\partial N}\,\sigma(N) \right)^2 } = \frac{1}{N}\,\sigma(N) = \frac{1}{N}\sqrt{N} = \frac{1}{\sqrt{N}}. \tag{26} \]

Example 3: Consider a measurement of a radioactive decay. The data in Table 2 have been collected. What is the half-life of the sample?

Table 2. Example of radioactive decay data.

t [s]        10    20    30    40    50    60    70
N [cps]²   1000   670   500   400   320   220   165

² Counts per second (s$^{-1}$).

After performing the linearization (Eq. (24)) and transformation (Eq. (25)), the parameters $a$ and $b$ can be calculated using Eq. (14) to Eq. (17). Note that Eq. (26) must be used when calculating the components of Eq. (13). The results are $a = -0.029 \pm 0.001$ and $b = 7.15 \pm 0.03$, and hence the decay constant is given by $\lambda = -a = 0.029 \pm 0.001$ s$^{-1}$. The reduced $\chi^2$ is found to be 1.7, which is somewhat high. On the other hand, $P(\chi^2 \geq S) \approx 13\%$ is above the limit of 5% and the fit is acceptable. The half-life is found from the relation

\[ t_{1/2} = \frac{\ln 2}{\lambda} = 23.7\ \text{s} \tag{27} \]

and by using the Propagation of Error Formula, Eq. (3), the standard deviation is

\[ \sigma(t_{1/2}) = \sqrt{ \left( \frac{\partial t_{1/2}}{\partial\lambda}\,\sigma(\lambda) \right)^2 } = \frac{\ln 2}{\lambda^2}\,\sigma(\lambda) = 0.8\ \text{s}. \tag{28} \]

Figure 2. Experimental data and function fitting in linear scale and logarithmic scale. [Two panels: "Exponential", N [cps] versus t [s], and "y=ax+b", ln(y) versus x.]

5 Mean Values

Let $y_i$, $i = 1, 2, 3, \ldots, N$, be the results of $N$ measurements, all of them made under the same conditions and with the same error. The arithmetic mean value, $\bar{y}$, is then given by

\[ \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i. \tag{29} \]

If the values $y_i$ are Gaussian distributed, for instance when performing successive measurements of distances etc., the best estimate of the error of the arithmetic mean value is the standard error

\[ \sigma(\bar{y}) = \frac{\sigma}{\sqrt{N}}, \tag{30} \]

where

\[ \sigma = \sqrt{ \frac{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}{N-1} } \tag{31} \]

is the ordinary definition of the standard deviation. On the other hand, if the data are Poisson distributed, the best estimate of the error of the arithmetic mean value is

\[ \sigma(\bar{y}) = \sqrt{\frac{\bar{y}}{N}}. \tag{32} \]

If the measurements were performed under different conditions, with a different standard deviation, $\sigma_i$, for each measurement, the weighted mean value, $\hat{\mu}$, should be used:

\[ \hat{\mu} = \frac{\sum_{i=1}^{N} y_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2}. \tag{33} \]

Such a situation might occur, for instance, if the same observable is measured using different techniques with different accuracy. The standard deviation of the weighted mean value is given by

\[ \sigma(\hat{\mu}) = \frac{1}{\sqrt{\sum_{i=1}^{N} 1/\sigma_i^2}}. \tag{34} \]

Note that if $\sigma_i$ is the same for all measurements, Eq. (33) reduces to Eq. (29). Eqs. (33) and (34) are valid only for Gaussian distributed data. However, the mean value of Poisson distributed data is approximately Gaussian distributed when the number of counts is sufficiently large.

6 Number of Accurate Figures

A common mistake is to include all figures from the last calculation step in the final presentation of the result and thereby give false information about the accuracy of the measurement and its result. If the calculation has been done without error estimation of the included parameters, the result may not contain more figures than the input data. For instance:

175/16.2 = 10.8024… ≈ 10.8

If the calculation has been performed on quantities with known errors, the result may not have more accurate figures than described by the error. Moreover, the error itself should be given with one accurate figure, or at most two if further analysis of the result is expected. Examples:

(175±1)/(16.2±0.1) = 10.8024…±0.0908… ≈ 10.80±0.09
(175±30)/(16.2±0.1) = 10.8024…±1.8530… ≈ 11±2

7 Preparation Tasks

1. Create a function, in for instance Matlab, that takes $x$, $y$ and $\sigma$ as input parameters and delivers $a$, $\sigma(a)$, $b$, $\sigma(b)$, $\chi^2/\nu$ and $P(\chi^2 \geq S)$. Try to reproduce the calculations in Examples 2 and 3 above. Hint: In Matlab, a function is defined by writing

    function [a, a_std, b, b_std, red_chisq, P] = lsq(x,y,std)

The function is then called by writing

    [a, a_std, b, b_std, red_chisq, P] = lsq(x,y,std)

from another m-file or from the command line.
Note: the filename of the function file must be the same as the function name (in this case lsq.m).

2. In reactor applications, cosine functions are frequently used to describe flux distributions etc. How does the standard deviation, $\sigma_i(y)$, transform when the cosine function

\[ y = A_{max}\cos\left( B(x - x_0) \right) \tag{35} \]

is transformed to a linear relation? Assume that $y$ is Poisson distributed and that the maximum amplitude, $A_{max}$, is known. Hint:

\[ \frac{\partial}{\partial x}\arccos(x) = -\frac{1}{\sqrt{1 - x^2}}. \tag{36} \]

3. Consider a fission chamber that measures the neutron flux in a reactor. Signals are collected during 10 s and the total number of events during this time is displayed. In order to achieve a sufficiently low statistical error, three identical measurements are performed according to Table 3. What is the number of counts per second and the corresponding statistical error?

Table 3. Experimental data.

Measurement number        1     2     3
y [counts per 10 s]     257   281   272

8 References

The information in this document has been collected mainly from:
1. W.R. Leo, Techniques for Nuclear and Particle Physics Experiments, Springer-Verlag, 1994.
2. G. Blom, Sannolikhetsteori och statistikteori med tillämpningar, Studentlitteratur, 1989.

Further reading:
3. W.T. Eadie, Statistical methods in experimental physics, Amsterdam, 1971.
4. D.L. Smith et al., Probability, Statistics and Data Uncertainties in Nuclear Science and Technology, OCDE/OECD, American Nuclear Society, LaGrange Park, Illinois, USA, 1991.

Appendix 1: Solutions to the Preparation Tasks

1.
    function [a, a_std, b, b_std, red_chisq, P] = lsq(x,y,std)
    % Weighted linear fit y = a*x + b; std holds the standard deviations of y
    % Weighted sums, Eq. (13)
    A=sum(x./std.^2);
    B=sum(1./std.^2);
    C=sum(y./std.^2);
    D=sum(x.^2./std.^2);
    E=sum(x.*y./std.^2);
    % Fit parameters and their standard deviations, Eqs. (14)-(17)
    a=(E*B-C*A)/(B*D-A^2);
    b=(D*C-E*A)/(B*D-A^2);
    a_std=sqrt(B/(B*D-A^2));
    b_std=sqrt(D/(B*D-A^2));
    % Reduced chi-squared, Eqs. (11), (18) and (19) with m=2
    S=sum((y-a*x-b).^2./std.^2);
    N=length(x);
    nu=N-2;
    red_chisq=S/(nu);
    % P(chi^2 >= S) by numerical integration, Eqs. (20) and (21)
    F=@(u)((u./2).^((nu./2)-1).*exp(-u./2))./(2.*gamma(nu/2));
    P=1-quad(F,0,S);

2. Transformation:

\[ \arccos\left( \frac{y}{A_{max}} \right) = Bx - Bx_0 \tag{37} \]

The error is transformed according to

\[ \sigma\left( \arccos\left( \frac{y}{A_{max}} \right) \right) = \sqrt{ \left( \frac{\partial}{\partial y}\arccos\left( \frac{y}{A_{max}} \right) \sigma(y) \right)^2 } = \left| \frac{\partial}{\partial y}\arccos\left( \frac{y}{A_{max}} \right) \right| \sigma(y) = \frac{1}{A_{max}}\,\frac{1}{\sqrt{1 - \left( y/A_{max} \right)^2}}\,\sqrt{y}. \tag{38} \]

3. Eq. (29) gives the mean value of the three measurements, $\bar{y} = 270$. Eq. (32) then gives the associated error, $\sigma(\bar{y}) \approx 9$. The number of counts per second is consequently $27.0 \pm 0.9$. Note that the error must be calculated before the values are transformed to counts per second.
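As a cross-check of solution 3, the arithmetic can be written out explicitly. The last equality, which pools the three measurements into a single 30 s measurement with 810 counts, is an equivalent view added here for illustration and is not part of the original solution:

\[ \bar{y} = \frac{257 + 281 + 272}{3} = 270, \qquad \sigma(\bar{y}) = \sqrt{\frac{270}{3}} = \sqrt{90} \approx 9.5 \ \text{counts per 10 s}, \]

\[ \frac{\bar{y}}{10\ \text{s}} = 27.0\ \text{s}^{-1}, \qquad \frac{\sigma(\bar{y})}{10\ \text{s}} \approx 0.95\ \text{s}^{-1} = \frac{\sqrt{810}}{30\ \text{s}}. \]

Both routes give the same error because the Poisson error is evaluated on the raw counts in each case. Applying the square root to the count rate instead, $\sqrt{27} \approx 5.2$, would overestimate the error considerably, which is why the conversion to counts per second is done last.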