Statistical uncertainty in calculation and measurement of radiation

It is important to understand statistical uncertainty in radiation measurement and in radiation transport calculation by the Monte Carlo method (hereafter "MC method"). This note is for a 3-hour lecture on statistical uncertainty and related topics. In the 1st hour, mean and variance are briefly reviewed, and the variance of a sum and the variance of a mean are derived. Then, we confirm that the statistical uncertainty derived in radiation transport calculation by the MC method agrees with the definition of the variance of the mean, and we derive the equation for statistical uncertainty used in radiation measurement from that definition. In the 2nd hour, we discuss the binomial, Poisson, and Gaussian distributions. In the 3rd hour, we discuss the basic equations of the least square method.

1 Uncertainty of mean

1. What is the meaning of "uncertainty of the mean", and how is it calculated?

2. In radiation measurement, we often evaluate the statistical uncertainty of a count N as \sqrt{N}. For example, in the statistics section of Isotope-tetyo [1], the total count and its uncertainty are expressed as

    N \pm \sqrt{N}    (1)

How was this derived?

3. What is a variance which does not depend on the number of data?

In this chapter, we investigate these questions using "uncertainty of sum" and "uncertainty of mean" as key words. Variance and standard deviation are introduced in section 1.1. Then, the uncertainty of a sum and the uncertainty of a mean are derived in sections 1.2 and 1.3. "N \pm \sqrt{N}" is derived from the uncertainty of a sum in section 1.4, and a "statistical deviation which does not depend on the number of data" is derived in section 1.5. The equations for statistical uncertainty in MC calculation (for example, Eq. (28)) are easily understood after these sections.

1.1 Variance and standard deviation

Let us assume that some quantity x is measured or calculated repeatedly, n times, and write the i-th value of x as x_i.
The mean of x is

    \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i    (2)

The difference between an individual x_i and the mean of x corresponds to the fluctuation of that x_i. The average of the square of this difference is therefore a useful quantity for estimating the fluctuation of the group of x_i. This quantity is called the variance and is calculated as

    s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2    (3)

The square root of the variance is called the standard deviation and is written as s:

    s = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 }    (4)

As x and the standard deviation have the same dimension, the standard deviation s is more convenient than s^2 for expressing the fluctuation of x.

1 2014-01-20, English version.

1.2 Uncertainty of sum

Now consider the sum of the x_i, whose number of data is n, and write it as y:

    y = \sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n = n\bar{x}    (5)

Let us now consider the uncertainty of y. Thinking of "propagation of error", the variance of y is written using the deviations of the x_i, i.e. \Delta x_i:

    \Delta y^2 = \left( \frac{\partial y}{\partial x_1} \right)^2 \Delta x_1^2 + \left( \frac{\partial y}{\partial x_2} \right)^2 \Delta x_2^2 + \cdots + \left( \frac{\partial y}{\partial x_n} \right)^2 \Delta x_n^2
               = 1 \cdot s^2 + 1 \cdot s^2 + \cdots + 1 \cdot s^2
               = n s^2

Here, the variance of y, s_y^2, is written as \Delta y^2 because it suits the partial-differentiation form of the equation. The standard deviation of y, s_y, is obtained as the square root of the variance of y:

    s_y = \sqrt{s_y^2} = \sqrt{n s^2} = \sqrt{n}\, s    (6)

If the x_i consist of n data, the uncertainty of their sum, s_y, is the product of the uncertainty of x_i and \sqrt{n}.

1.3 Uncertainty of mean

Let us consider the mean of the x_i. The number of data is again assumed to be n:

    \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \{ x_1 + x_2 + \cdots + x_n \}

Thinking of propagation of error, the uncertainty of the mean is calculated as

    \Delta \bar{x}^2 = \left( \frac{\partial \bar{x}}{\partial x_1} \right)^2 \Delta x_1^2 + \left( \frac{\partial \bar{x}}{\partial x_2} \right)^2 \Delta x_2^2 + \cdots + \left( \frac{\partial \bar{x}}{\partial x_n} \right)^2 \Delta x_n^2
                     = \left( \frac{1}{n} \right)^2 s^2 + \left( \frac{1}{n} \right)^2 s^2 + \cdots + \left( \frac{1}{n} \right)^2 s^2
                     = \frac{1}{n^2} \cdot n s^2
                     = \frac{1}{n} s^2

Here, the variance of \bar{x}, s_{\bar{x}}^2, is written as \Delta \bar{x}^2 because it suits the partial-differentiation form. The standard deviation of \bar{x}, s_{\bar{x}}, is calculated as the square root of the variance of \bar{x}.
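The two propagation results just derived, \Delta y^2 = n s^2 and \Delta \bar{x}^2 = s^2/n, can be checked numerically. Below is a minimal sketch in Python (the lecture's own exercises use Excel; the uniform random numbers on (0,1), whose standard deviation is \sqrt{1/12}, are an illustrative choice, not part of the lecture):

```python
import random

random.seed(1)

def spread(values):
    """Standard deviation by Eq. (4): sqrt((1/n) * sum (v - mean)^2)."""
    m = sum(values) / len(values)
    return (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5

n = 400        # data number per group
groups = 2000  # number of repeated groups
sums, means = [], []
for _ in range(groups):
    xs = [random.random() for _ in range(n)]
    sums.append(sum(xs))          # sum of one group, Eq. (5)
    means.append(sum(xs) / n)     # mean of one group, Eq. (2)

s = (1.0 / 12.0) ** 0.5           # std dev of a single uniform(0,1) draw
print(spread(sums), n ** 0.5 * s)   # fluctuation of the sum vs sqrt(n)*s
print(spread(means), s / n ** 0.5)  # fluctuation of the mean vs s/sqrt(n)
```

The observed fluctuation of the 2000 group sums should be close to \sqrt{n}\,s, and that of the group means close to s/\sqrt{n}.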
    s_{\bar{x}} = \sqrt{ s_{\bar{x}}^2 } = \sqrt{ \frac{1}{n} s^2 } = \frac{s}{\sqrt{n}}    (7)

When the number of data x_i is n, the uncertainty of the average of the x_i, s_{\bar{x}}, is equal to the product of the uncertainty of x_i and 1/\sqrt{n}.(2) Eq. (7) is an important equation for evaluating the uncertainty of a mean obtained from repeated measurement; many textbooks of statistics mention this equation.(3)

1.4 Derivation of N \pm \sqrt{N}

Consider a time interval so short that the count x_i in each interval is either 0 or 1 in radiation measurement. Or, consider the case that the result of an MC calculation is either 0 or 1 for each incident particle. Examples of the distribution of x_i for the case that the number of time intervals n equals 10 are shown in Fig. 1 and Fig. 2. The special binomial distribution whose value is either 0 or 1 is called the "unit binomial distribution", or Bernoulli distribution. Let us write the probability of a hit (the probability that the response is 1) as a.

Figure 1: Distribution of x_i. Number of time intervals n = 10; number of intervals with a count: 4.
Figure 2: Distribution of x_i. Number of time intervals n = 10; number of intervals with a count: 1.

Variance of x

The average of the x_i is

    \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \left( na \cdot 1 + n(1-a) \cdot 0 \right) = a,

since x_i = 1 and x_i = 0 occur na and n(1-a) times, respectively. Utilizing this, the variance s^2 is

    s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - a)^2 = \frac{1}{n} \left[ na(1-a)^2 + n(1-a)(0-a)^2 \right]

(2) Up to this point, an equation to evaluate uncertainty has been obtained. This equation is equivalent to the one used in EGS5 user codes to obtain statistical uncertainty. Therefore, the uncertainty calculated in an EGS user code is based on the definition of the uncertainty of the mean. While the EGS5 code is mentioned here, this lecture should also be useful for understanding other MC codes, e.g. EGSnrc. ucnaicgv.f is a sample user code of EGS5.
The explanation of statistical uncertainty in ucnaicgv.f and the contents of the calculation are shown in Appendix A.1 and A.2, respectively.

(3) Miyakawa [2] states: "If a sample is taken from a population whose mean and variance are \mu and \sigma^2, respectively, the expectation values of the mean and variance of the sample mean of size n are \mu and \sigma^2/n, respectively." Hoel [3] states: "The sample mean \bar{x} follows a normal distribution with mean \mu and standard deviation \sigma_{\bar{x}} = \sigma/\sqrt{n}. This standard deviation is called the standard error of the mean \bar{x}." Kuga [4] gives the relation between the experimental standard deviation s and the standard uncertainty \Delta\bar{x} of the experimental value \bar{x} as \Delta\bar{x} = s/\sqrt{n}.

Continuing the calculation,

    s^2 = \frac{1}{n} \, na(1-a) \left[ (1-a) + a \right] = a(1-a)    (8)

For example, in Fig. 1 and Fig. 2, a = 0.4 and a = 0.1, giving s^2 = 0.24 and s^2 = 0.09, respectively. s^2 depends only on a and does not depend on n. The standard deviation s is

    s = \sqrt{a(1-a)}    (9)

The relation of s and a is shown in Fig. 3; s lies on a half circle whose diameter runs between the points (a = 0, s = 0) and (a = 1, s = 0).

Uncertainty of sum

As shown in Eq. (6), the uncertainty of the sum y of the x_i is the product of the standard deviation of x and \sqrt{n}. Combining Eq. (6) and Eq. (8),

    s_y = \sqrt{n}\, s = \sqrt{n} \sqrt{a(1-a)} = \sqrt{na(1-a)} = \sqrt{y(1-a)}

The sum and its uncertainty are

    y \pm \sqrt{y(1-a)}    (10)

By this derivation, the variance of a sum of Bernoulli variables is shown to be y(1-a) [2]. If we assume that 1-a \simeq 1, i.e. the probability of a hit is small, we obtain

    y \pm \sqrt{y}    (11)

This agrees with Eq. (1); thus Eq. (1) has been derived.(4) As uncertainty estimation in MC calculation is based on the equation for the uncertainty of the mean, the equation for estimating uncertainty in MC calculation, Eq. (1), and the uncertainty of a sum of Bernoulli variables are equivalent.

Finally, the value of the uncertainty is evaluated using Eq. (10) and Eq. (11). Let us assume that the number of trials is n = 10000 and estimate s_y as a function of the hit probability a. Then, we obtain Fig. 4.
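The two uncertainty estimates can be reproduced directly from Eq. (10) and Eq. (11); a short sketch in Python (the hit probability a = 0.5 follows the example discussed in the text):

```python
# Reproduce the n = 10000 example behind Fig. 4 using Eq. (10) and Eq. (11).
n = 10000
a = 0.5
y = n * a                          # expected sum of counts
s_exact = (y * (1 - a)) ** 0.5     # Eq. (10): sqrt(y(1-a)), (1-a) kept
s_approx = y ** 0.5                # Eq. (11): sqrt(y), (1-a) neglected
print(f"{y:.0f} +- {s_exact:.0f}")   # with the (1-a) factor
print(f"{y:.0f} +- {s_approx:.0f}")  # (1-a) factor neglected
```

For a = 0.5 this prints the two values marked in Fig. 4, 5000 ± 50 and 5000 ± 71.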
Here, the red and black lines indicate the results in which the (1-a) factor is considered and neglected, respectively. For example, for a = 0.5, the count and its uncertainty are 5000 ± 50 and 5000 ± 71, respectively.

Uncertainty of mean

Dividing Eq. (10) by n, we obtain the mean and its uncertainty,

    a \pm \sqrt{ \frac{a(1-a)}{n} }    (12)

1.5 Variance independent of number of data

If the variance s^2 is estimated using Eq. (3) for several values of n, it is known that the value of the variance changes depending on n: the expectation value of s^2 is proportional to (n-1)/n. The reason for this n dependence is that \bar{x} is the mean of the x_i, so \bar{x} and the x_i are correlated. A variance which does not depend on n can be obtained by calculating the variance \sigma^2 against the expectation value of the mean, \mu:

    \sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2    (13)

(4) By this derivation, the variance of a sum of Bernoulli variables is shown to be y(1-a) = na(1-a). This is one way of deriving the variance of the binomial distribution [2]. A derivation from the probability function of the binomial distribution is given later. The variance of the binomial distribution can also be derived from the moment generating function.

Figure 3: Standard deviation of x, s, as a function of the hit probability a; the curves \sqrt{a(1-a)} and \sqrt{a} are compared.
Figure 4: Uncertainty of the sum y, s_y, for n = 10000; the curves \sqrt{na(1-a)} and \sqrt{na} are compared, with the points 5000 ± 50 and 5000 ± 71 marked.

While Eq. (13) is an equation for a variance, it is impossible to estimate the variance using (13), because the value of \mu is unknown. Another way to estimate a variance which is independent of n is to replace 1/n by 1/(n-1) in Eq. (3):

    \hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2    (14)

Some textbooks call \hat{\sigma}^2 the sample variance [2] or the unbiased estimate of the variance [3]. In this print, they are all simply called "variance"; they are distinguished by different characters when necessary.
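The n dependence of Eq. (3) and its removal by the 1/(n-1) factor of Eq. (14) can be seen in a small numerical experiment; a sketch in Python (a similar experiment with Excel's RAND() follows in the text; the choice n = 2 and 20000 sets is illustrative):

```python
import random

random.seed(0)

def s2_eq3(xs):    # Eq. (3): divide by n
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def s2_eq14(xs):   # Eq. (14): divide by n - 1
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n = 2              # smallest data number, where the bias is largest
sets = 20000
data = [[random.random() for _ in range(n)] for _ in range(sets)]
mean_eq3 = sum(s2_eq3(xs) for xs in data) / sets
mean_eq14 = sum(s2_eq14(xs) for xs in data) / sets
print(mean_eq3)    # close to (n-1)/n * 1/12 = 1/24
print(mean_eq14)   # close to 1/12, the true variance of uniform(0,1)
```

The average of the Eq. (3) estimate falls short of the true variance by the factor (n-1)/n, while the Eq. (14) estimate does not.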
Also, when n is large enough, the difference between n and n-1 is not meaningful, and Eq. (3) and Eq. (4) may then be used to obtain the variance and standard deviation.

Numerical calculation

The relationship of Eq. (3), Eq. (13) and Eq. (14) is investigated using random numbers in Excel. With the Excel RAND() function, we generate random numbers distributed uniformly on (0,1). One hundred sets of 10 random numbers are input. We obtain the mean of each set of 10 random numbers and calculate s^2 by Eq. (3). As 100 values of s^2 are obtained, their mean is calculated. Also, \sigma^2, which is based on the fluctuation from the expectation value of the mean (0.5 in this case), is estimated by Eq. (13); one hundred values of \sigma^2 are obtained and their mean is calculated. This calculation is repeated from n = 10 down to n = 1. The expectation value of \sigma^2 is

    \int_0^1 (x - 0.5)^2 \, dx = \int_{-0.5}^{0.5} y^2 \, dy = \left[ \frac{y^3}{3} \right]_{-0.5}^{0.5} = \frac{1}{3} \left( \frac{1}{8} - \frac{-1}{8} \right) = \frac{1}{12}.    (15)

The comparison of the values of the variance is shown in Fig. 5. The value of s^2 obtained by Eq. (3) clearly depends on n; it becomes small when n becomes small. On the other hand, the value of \sigma^2 based on Eq. (13) is independent of n, and its value is close to its expectation value of 1/12. Also, s^2 is close to \frac{1}{12}\frac{n-1}{n}, and \frac{n}{n-1} s^2, i.e. \hat{\sigma}^2 calculated by Eq. (14), is independent of n. From these results, s^2 has the problem of n dependence, which is most pronounced when n is small, while its definition is simple and intuitive; \sigma^2 and \hat{\sigma}^2 have the good property of n-independence.

Figure 5: Comparison of the values of the variance: s^2 from Eq. (3), \sigma^2 from Eq. (13), s^2 n/(n-1) from Eq. (14), and the reference lines 1/12 and (n-1)/n \cdot (1/12).

Certification of the data-number independence of Eq. (14)

The right-hand side of Eq. (13) is re-written as

    \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 = \frac{1}{n} \sum_{i=1}^{n} \{ (x_i - \bar{x}) + (\bar{x} - \mu) \}^2
    = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 + \frac{1}{n} \sum_{i} (\bar{x} - \mu)^2 + \frac{2}{n} \sum_{i=1}^{n} (x_i - \bar{x})(\bar{x} - \mu).
Here, the 3rd term of the right-hand side is 0, because it is a sum over i of a quantity proportional to (x_i - \bar{x}), and \sum_i (x_i - \bar{x}) = 0. Then, taking expectation values of both sides,

    E\left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \right] = E\left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right] + E\left[ \frac{1}{n} \sum_{i} (\bar{x} - \mu)^2 \right].    (16)

The left-hand side is the variance from the expectation value of the mean, calculated by Eq. (13); its expectation value is \sigma^2. The 1st term of the right-hand side is the variance from the mean of the data (Eq. (3)). The 2nd term of the right-hand side is the variance of the mean of the data, whose expectation value is \sigma^2/n. We can summarize this equation as: "Variance from \mu = Variance from \bar{x} + Variance of \bar{x}". By substituting the expectation values into Eq. (16), we can show the agreement of the expectation value of Eq. (13) with that of Eq. (14):

    \sigma^2 = E\left[ \frac{1}{n} \sum_i (x_i - \bar{x})^2 \right] + \frac{\sigma^2}{n}

    \frac{(n-1)\sigma^2}{n} = E\left[ \frac{1}{n} \sum_i (x_i - \bar{x})^2 \right]

    \sigma^2 = E\left[ \frac{1}{n-1} \sum_i (x_i - \bar{x})^2 \right] = E\left[ \hat{\sigma}^2 \right]

(End of certification.) The equation in ref. [2] was referenced.

2 Binomial, Poisson and Gaussian distribution

2.1 Binomial distribution

The number of counts in radiation measurement and MC calculation follows the binomial distribution in general. For example, consider an MC calculation of the response function of a NaI detector, and in particular the ratio of the full-energy absorption peak. Consider the distribution of the number of counts x in the full-energy absorption peak after the calculation has finished for n incident particles. Write the probability that an incident particle contributes to the full-energy absorption peak as p. As shown in Fig. 6, the probability that the x particles from the 1st to the x-th all contribute is p^x. As the probability of NO contribution is (1-p), the probability that the (n-x) particles from the (x+1)-th to the n-th do not contribute is (1-p)^{(n-x)}.
Accordingly, the probability that the x particles from the 1st to the x-th contribute and the (n-x) particles from the (x+1)-th to the n-th do not contribute is obtained by multiplying these probabilities:

    p^x (1-p)^{(n-x)}

In reality, it is very rare that only the beginning part contributes sequentially and the remaining part then fails to contribute sequentially; contribution and no contribution are mixed, seemingly at random. The number of combinations taking x from n is calculated as

    {}_n C_x = \frac{n!}{x!(n-x)!}.

Then, the probability that x particles contribute out of n incident particles is

    f(x) = {}_n C_x \, p^x (1-p)^{n-x}    (17)

The distribution given by this equation is called the binomial distribution. The distribution of Eq. (17) for n = 10 and p = 0.4 is shown in Fig. 7.

Figure 6: Concept of the MC calculation result for n incident particles. Contribution and no contribution to the full-energy absorption peak are denoted by p and q, respectively. The figure shows the extraordinary case in which the first x particles contribute to the full-energy peak and the following (n-x) particles do not.
Figure 7: Binomial distribution with n = 10 and p = 0.4.

The number of counts in radiation measurement follows the binomial distribution also. Here, we derive the sum, mean, and variance of the binomial distribution.

Sum

We calculate the sum of the probability function f(x) of the binomial distribution, \sum_{x=0}^{n} f(x). The "binomial theorem" is expressed by

    (a+b)^n = \sum_{x=0}^{n} \frac{n!}{x!(n-x)!} a^{n-x} b^x.

Rewriting a and b as q and p, respectively, and assuming p + q = 1, the left-hand side equals 1 and the right-hand side equals \sum_{x=0}^{n} f(x), so the equation can be written as

    \sum_{x=0}^{n} f(x) = 1

Mean

We write the expectation value of a discrete probability variable x by the following equation:

    E[x] = \sum_{x=0}^{n} x f(x).

The mean of the binomial distribution is written using E as

    E[x] = \sum_{x=0}^{n} x \, {}_n C_x \, p^x q^{n-x} = \sum_{x=0}^{n} x \frac{n!}{x!(n-x)!} p^x q^{n-x}.
The term with x = 0 is omitted, as it is 0, so the summation starts from x = 1:

    E[x] = \sum_{x=1}^{n} x \frac{n!}{x!(n-x)!} p^x q^{n-x} = \sum_{x=1}^{n} x \frac{n!}{x(x-1)!(n-x)!} p^x q^{n-x} = \sum_{x=1}^{n} \frac{n!}{(x-1)!(n-x)!} p^x q^{n-x}.

By replacing z = x - 1 and j = n - 1,

    E[x] = np \sum_{z=0}^{j} \frac{j!}{z!(j-z)!} p^z q^{j-z}    (18)
         = np    (19)

At the end of Eq. (18), the sum is the sum of a binomial distribution, which is equal to 1.

Variance

The variance of a probability variable x, V[x], is

    V[x] = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2
         = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \frac{2\bar{x}}{n} \sum_{i=1}^{n} x_i + \frac{1}{n} \sum_{i=1}^{n} \bar{x}^2
         = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \frac{2}{n} \bar{x}(n\bar{x}) + \frac{1}{n}(n\bar{x}^2)
         = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{x}^2
         = E[x^2] - E[x]^2

To obtain the variance of the binomial distribution, E[x^2] is calculated first:

    E[x^2] = \sum_{x=0}^{n} x^2 \, {}_n C_x \, p^x q^{n-x} = \sum_{x=0}^{n} x^2 \frac{n!}{x!(n-x)!} p^x q^{n-x} = \sum_{x=1}^{n} \frac{n! \, x}{(x-1)!(n-x)!} p^x q^{n-x}

At the last equality, the term with x = 0 is omitted, as its value is 0, and the summation starts from x = 1. By replacing z = x - 1 and j = n - 1,

    E[x^2] = np \sum_{z=0}^{j} \frac{j!(z+1)}{z!(j-z)!} p^z q^{j-z} = np \sum_{z=0}^{j} \frac{j! \, z}{z!(j-z)!} p^z q^{j-z} + np \sum_{z=0}^{j} \frac{j!}{z!(j-z)!} p^z q^{j-z}

Here, the first summation of the right-hand side is equal to jp, the mean of a binomial distribution, and the second summation is equal to 1, the sum of a binomial distribution. Utilizing these, the equation is rewritten as

    E[x^2] = np \cdot jp + np = n(n-1)p^2 + np

    V[x] = E[x^2] - E[x]^2 = n(n-1)p^2 + np - (np)^2 = np(1-p) = npq

2.2 Poisson distribution

It is known that the binomial distribution converges to the Poisson distribution when the probability of a hit is low. The probability function of the Poisson distribution is written as

    f(x) = e^{-\lambda} \frac{\lambda^x}{x!}

Here, \lambda = np. The features of the Poisson distribution are as follows.

• The Poisson distribution contains only one parameter, the mean value.
The Poisson distribution is simple compared with the binomial distribution, which contains two parameters, the number of trials and the hit probability.

• The parameter \lambda of the Poisson distribution is a continuous (real) value, whereas the corresponding parameter n of the binomial distribution is discrete (an integer).

A comparison of the binomial and Poisson distributions is shown in Fig. 8. Binomial distributions for the 3 parameter sets (n = 10, p = 0.4), (n = 20, p = 0.2) and (n = 40, p = 0.1) are displayed, together with the Poisson distribution with \lambda = 4. Here, the combinations of n and p are adjusted to maintain the relationship np = \lambda. The binomial distribution converges to the Poisson distribution as p decreases.

We derive the sum, mean, and variance of the Poisson distribution.

Sum

By the Maclaurin expansion of e^{\lambda}, we obtain

    e^{\lambda} = 1 + \lambda + \frac{1}{2!}\lambda^2 + \frac{1}{3!}\lambda^3 + \cdots = \sum_{x=0}^{\infty} \frac{\lambda^x}{x!}

On the other hand, the sum of the Poisson distribution is written as

    \sum_{x=0}^{\infty} e^{-\lambda} \frac{\lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = 1

Figure 8: Comparison of the binomial distributions (n = 10, p = 0.4), (n = 20, p = 0.2), (n = 40, p = 0.1) and the Poisson distribution with \lambda = 4.

Mean

    E[x] = \sum_{x=0}^{\infty} x e^{-\lambda} \frac{\lambda^x}{x!} = \sum_{x=1}^{\infty} e^{-\lambda} \frac{\lambda^x}{(x-1)!}

Here, the term with x = 0 was omitted, as it is 0.

    E[x] = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!}

Here, we replace j = x - 1:

    E[x] = \lambda e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^j}{j!}

The summation on the right-hand side is equal to e^{\lambda}. Then,

    E[x] = \lambda e^{-\lambda} e^{\lambda} = \lambda

Variance

    E[x^2] = \sum_{x=0}^{\infty} x^2 f(x) = \sum_{x=0}^{\infty} x^2 e^{-\lambda} \frac{\lambda^x}{x!}

We omit the term with x = 0, as it is 0:

    E[x^2] = \sum_{x=1}^{\infty} x^2 e^{-\lambda} \frac{\lambda^x}{x!} = \sum_{x=1}^{\infty} x e^{-\lambda} \frac{\lambda^x}{(x-1)!}

Here, we replace z = x - 1:

    E[x^2] = \sum_{z=0}^{\infty} (z+1) e^{-\lambda} \frac{\lambda^{z+1}}{z!} = \lambda \sum_{z=0}^{\infty} z e^{-\lambda} \frac{\lambda^z}{z!} + \lambda \sum_{z=0}^{\infty} e^{-\lambda} \frac{\lambda^z}{z!}

The first summation of the right-hand side is equal to the mean of the Poisson distribution (= \lambda), and the second summation is equal to the sum of the Poisson distribution (= 1).
    E[x^2] = \lambda^2 + \lambda

    V[x] = E[x^2] - E[x]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda

2.3 Gaussian distribution

When the number of trials is sufficiently large, the binomial or Poisson distribution can be approximated by the Gaussian distribution, which is expressed by the following equation:

    f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\}

In Fig. 9, a comparison of the binomial and Poisson distributions with the Gaussian distribution is shown. The mean is set to 20 for all distributions. The hit probability of the binomial distribution is set to p = 0.5. The parameter controlling the width of the Gaussian distribution is set to \sigma = \sqrt{10} and \sigma = \sqrt{20} to approximate the binomial and Poisson distributions, respectively.

Figure 9: Approximation of the binomial distribution (n = 40, p = 0.5) and the Poisson distribution (\lambda = 20) by Gaussian distributions (\mu = 20, \sigma = \sqrt{10} and \sigma = \sqrt{20}).

3 Derivation of basic formula for least square method

As shown in Fig. 10, three points (x_1, y_1), (x_2, y_2), and (x_3, y_3) exist on the (x, y)-plane. The goal is to draw a line y' = a + b(x - \bar{x}) so that the line comes close to these 3 points. Several strategies to realize "coming close" are possible. Here, we draw line segments from these points to the line, parallel to the y axis, and determine the coefficients a and b so that the sum of the squares of the segments d_1, d_2, and d_3 takes its minimum value. d_1 to d_3 are expressed as

    d_1 = y_1 - y_1' = y_1 - \{ a + b(x_1 - \bar{x}) \}
    d_2 = y_2 - y_2' = y_2 - \{ a + b(x_2 - \bar{x}) \}
    d_3 = y_3 - y_3' = y_3 - \{ a + b(x_3 - \bar{x}) \}.

The sum of the squares of d_1 to d_3 is

    S = d_1^2 + d_2^2 + d_3^2
      = \{ y_1 - a - b(x_1 - \bar{x}) \}^2 + \{ y_2 - a - b(x_2 - \bar{x}) \}^2 + \{ y_3 - a - b(x_3 - \bar{x}) \}^2
      = \{ y_1^2 + y_2^2 + y_3^2 \} + 3a^2 + b^2 \{ (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 \} - 2a(y_1 + y_2 + y_3)
        - 2b \{ y_1(x_1 - \bar{x}) + y_2(x_2 - \bar{x}) + y_3(x_3 - \bar{x}) \} + 2ab \{ (x_1 - \bar{x}) + (x_2 - \bar{x}) + (x_3 - \bar{x}) \}.

Here, the 2ab term is omitted because its { } part is 0.
The other terms of S can be rewritten as

    S = A + 3a^2 + b^2 B - 2aC - 2bD    (20)

where

    A = y_1^2 + y_2^2 + y_3^2
    B = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2
    C = y_1 + y_2 + y_3
    D = y_1(x_1 - \bar{x}) + y_2(x_2 - \bar{x}) + y_3(x_3 - \bar{x}).

Eq. (20) is the sum of a quadratic equation in a and one in b. We therefore differentiate S with respect to a and b to obtain the coefficients which minimize S:

    \frac{\partial S}{\partial a} = 6a - 2C = 0, \quad a = \frac{C}{3}    (21)

    \frac{\partial S}{\partial b} = 2Bb - 2D = 0, \quad b = \frac{D}{B}    (22)

Eq. (21) and Eq. (22) are the equations for the case that the number of points is 3. Modifying these formulas so that they are valid for n points,

    a = \bar{y}    (23)

    b = \frac{ \sum_{i=1}^{n} y_i (x_i - \bar{x}) }{ \sum_{i=1}^{n} (x_i - \bar{x})^2 }    (24)

In the derivation of these formulas, [8] was referenced.

Figure 10: Procedure of the least square method: the three points (x_1, y_1), (x_2, y_2), (x_3, y_3), the line y' = a + b(x - \bar{x}), and the vertical segments d_1, d_2, d_3.

4 Practice

1. Generate random numbers of 0 and 1 using the following procedure and calculate the mean, standard deviation and uncertainty of the mean.

   (a) Production of the distribution
       i. Assume that responses 1 and 0 are obtained 3 times and 7 times, respectively, in an MC calculation with 10 incident particles.
       ii. For simplification, assume that 1 appears in the first 3 trials and 0 appears in the following 7 trials. Call this x.
       iii. Input x into Excel or KaleidaGraph. (Input 1 and 0 into cells A1 to A3 and A4 to A10 of Excel.)

   (b) Calculate statistical quantities of x by hand calculation.
       i. Calculate the variance and standard deviation of x by Eq. (8) and Eq. (9).
       ii. Calculate the sum of x. Also, calculate the uncertainty of the sum of x by Eq. (1) and Eq. (10).
       iii. Calculate the mean of x. Also, calculate the uncertainty of the mean of x by Eq. (12).

   (c) Investigate the statistical nature of x by using functions of Excel. Read the values of the var, stdev, and stdevp functions over A1 to A10 and compare them with the results of the hand calculation.(5)

   (d) If possible, use functions of KaleidaGraph to investigate the statistical nature of x. Obtain the mean, variance, standard deviation and uncertainty of the mean by using the statistical functions of KaleidaGraph.

2.
Produce another distribution of random numbers consisting of 0 and 1. Investigate the mean, variance, standard deviation and uncertainty of the mean.

   (a) Production of the distribution
       i. Divide the lower 2 digits of your student number by 100 and call the result a.
       ii. Generate 10 random numbers r (0 < r < 1). Set x to 1 for r < a and to 0 for r > a.
       iii. Input x into Excel and KaleidaGraph.

   (b) Investigate the statistical nature of x by the same process as problem 1 (b) to (d).

3. Confirm that the variance by Eq. (3) depends on n and that the variances by Eq. (13) and Eq. (14) do not depend on n.

   (a) Numerical calculation: generate 100 sets of random numbers using Excel or other software. Each set should contain n random numbers (1 < n < 101). Calculate their variances based on Eq. (3), Eq. (13) and Eq. (14) to investigate the n dependence.

   (b) Certify that Eq. (14) does not depend on n.

(5) The stdev function returns the standard deviation based on the unbiased, i.e. (n-1), method. The stdevp function returns the standard deviation based on the biased, i.e. n, method. The "p" in stdevp stands for "population".

4. Calculate the following distributions. Excel may be used.

   (a) The binomial distribution with n = 10 and p = 0.4. Display it as a figure and compare it with Fig. 7.

   (b) The 3 kinds of binomial distributions and the Poisson distribution shown in Fig. 8. Which binomial distribution is closest to the Poisson distribution?

   (c) The binomial, Poisson, and Gaussian distributions shown in Fig. 9. Do the Gaussian distributions approximate the binomial and Poisson distributions well?

5. Put 3 points on the (x, y)-plane by the following procedure, and obtain the line which comes close to these points using the least square method.

   (a) Disintegrate your student number (or license number) into single digits and order them from smallest to largest. Assume that these digits correspond to (x_1, y_1) to (x_3, y_3). For example,
if the student number is 874913, the digits 8, 7, 4, 9, 1, and 3 are obtained, and they are used as (x_1, y_1) = (1, 3), (x_2, y_2) = (4, 7), (x_3, y_3) = (8, 9).

   (b) Calculate a and b by using Eq. (21) and Eq. (22), or Eq. (23) and Eq. (24).

   (c) Plot (x_1, y_1) to (x_3, y_3) and draw the line y' = a + b(x - \bar{x}) on the same graph to confirm that the line was successfully placed near the points by the least square method.

6. Derive the following equations.

   (a) The sum, mean and variance of the binomial distribution. The variance can also be derived from the hit probability, as in Eq. (8).

   (b) The sum, mean and variance of the Poisson distribution.

   (c) The equations for the coefficients a and b of the least square method.

References

[1] Isotope-tetyo, 10th ed., Isotope-kyokai (2001) (in Japanese).
[2] K. Miyagawa, "Elementary Statistics, 3rd ed.", Yuhikaku, Tokyo (2007) (in Japanese).
[3] P. G. Hoel, "Elementary Statistics, 4th ed.", John Wiley & Sons, Inc., New York (1976).
[4] T. Kuga, Parity 25, 54-60 (2010-06) (in Japanese).
[5] Nicholas Tsoulfanidis, "Measurement and Detection of Radiation", McGraw-Hill Book Company (1983), Section 2.8.
[6] G. F. Knoll, "Radiation Detection and Measurement, 3rd ed.", John Wiley & Sons, Inc. (2000).
[7] W. J. Price, "Nuclear Radiation Detection", McGraw-Hill Book Company, Inc. (1964).
[8] http://szksrv.isc.chubu.ac.jp/lms/lms1.html
[9] H. Hirayama and Y. Namito, "Lecture Notes of Response Calculation of NaI Detector (cg Version)", KEK Internal 2011-5 (Dec 2011).

Appendix A. Derivation of statistical uncertainty in Monte Carlo calculation

A.1 Calculation of statistical uncertainty in ucnaicgv.f

ucnaicgv.f is an egs5 sample user code, which is utilized in the short course on EGS5. In this user code, a photon is incident onto a NaI detector, and several quantities characterizing the detector, such as the peak efficiency, total efficiency, and distribution of absorbed energy, are calculated. In the manual of this user code [9], the statistical uncertainty is described as follows.
The uncertainty of the obtained quantity x is estimated in this user code using the method used in MCNP.

• Assume that the calculation calls for N "incident" particle histories.

• Assume that x_i is the result for the i-th history.

• Calculate the mean value of x:

    \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i    (25)

• Estimate the variance associated with the distribution of the x_i:

    s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \simeq \overline{x^2} - (\bar{x})^2, \quad \text{where } \overline{x^2} = \frac{1}{N} \sum_{i=1}^{N} x_i^2.    (26)

• Estimate the variance associated with the distribution of \bar{x}:

    s_{\bar{x}}^2 = \frac{1}{N} s^2 \simeq \frac{1}{N} \left[ \overline{x^2} - (\bar{x})^2 \right]    (27)

• Report the statistical uncertainty as:

    s_{\bar{x}} \simeq \left[ \frac{1}{N} \left( \overline{x^2} - \bar{x}^2 \right) \right]^{1/2}    (28)

A.2 Contents of the code and numerical values

The contents of the code and its numerical values are investigated to understand the equations in the previous subsection. In ucnaicgv.f, the following lines are included in the shower-call loop, which is the main part of the Monte Carlo calculation:

    if(depe.ge.ekein*0.999) then
      pefs=pefs+wtin
      pef2s=pef2s+wtin
    end if

With these lines, the weight of each particle which contributes to the peak efficiency is summed up. Here, ekein and depe are the incident energy and the energy absorbed in the sensitive volume for a single incident particle, respectively. Using these values in the if statement, each particle is analyzed to decide whether it is scored as a contribution to the full-energy absorption peak or not. pefs is the variable counting particles in the full-energy absorption peak. After the Monte Carlo simulation for all incident particles is finished, the results are analyzed in step 9:

    avpe=pefs/ncount
    pef2s=pef2s/ncount
    sigpe=dsqrt((pef2s-avpe*avpe)/ncount)

avpe and sigpe are printed as the peak efficiency and its uncertainty. The values are ncount=10000, wtin=1, pefs=3728, pef2s=3728. pef2s is used to score the square of the weight of each particle which contributes to the peak efficiency; as the particle weight is 1 in this user code, the values of pefs and pef2s are the same. Utilizing these, avpe=0.3728, pef2s=0.3728, and sigpe=4.8e-3 are obtained, and the peak efficiency is printed as 37.28 ± 0.48%.
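The quoted numbers can be checked against Eqs. (25)-(28) directly; the following is a Python transcription of the Fortran lines of step 9, written for illustration (it is a sketch, not part of EGS5 itself):

```python
# Check of the numerical values quoted above for ucnaicgv.f, Eqs. (25)-(28).
ncount = 10000    # number of incident histories, N
pefs = 3728.0     # summed weight in the full-energy peak
pef2s = 3728.0    # summed squared weight (all weights are 1 here)

avpe = pefs / ncount                              # mean, Eq. (25)
av2pe = pef2s / ncount                            # mean of x^2
sigpe = ((av2pe - avpe * avpe) / ncount) ** 0.5   # uncertainty, Eq. (28)
print(f"peak efficiency = {100 * avpe:.2f} +- {100 * sigpe:.2f} %")
```

This reproduces the printed result 37.28 ± 0.48%.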
Appendix B

B.1 Uncertainty of mean

The "uncertainty of the mean" is the value which expresses the uncertainty of a mean. Some confusion is seen around this value.

• Depending on the author, period, or field, this value is called by different names. If one searches Google for "error for the sample mean" or for words which may be used with the same meaning, several varieties are found, as shown in Table 1.

Table 1: Search results on Google

    Word                          Number of hits
    Error for the sample          3,520,000
    Standard error of mean        780,000
    Error for the sample mean     118,000

In Wikipedia, the "error of the mean" is called the "standard error" on the standard error page. In KaleidaGraph, this quantity is called the standard error. On the homepage of the Statistics Bureau, Ministry of Internal Affairs and Communications of the Japanese government, it is called the sample error. Hoel [3], an example of a textbook of basic statistics, calls it the "standard error of the mean \bar{x}".(6)

• An engineer who treats radiation learns basic statistics from textbooks on radiation detectors. But many textbooks on radiation detectors ignore this value [6, 7], and it is seldom described [5]. This contrasts with the fact that other basics of statistics, such as the binomial, Poisson, and Gaussian distributions, mean, variance, and standard deviation, are well described in textbooks on radiation detectors, and their names have not changed for a long time.

B.2 Description of the sum, mean, and variance of the binomial and Poisson distributions in textbooks on radiation detectors

To investigate the effectiveness of "learning the basic concepts of statistics from textbooks on radiation detectors", it is summarized here how the sum, mean and variance of the binomial and Poisson distributions are described in several textbooks on radiation detectors.

Nicholas Tsoulfanidis, "Measurement and Detection of Radiation": The sum, mean, and variance of the binomial and Poisson distributions are described.

Glenn F. Knoll, "Radiation Detection and Measurement": By mathematically simplifying the Poisson distribution, the Gaussian distribution is introduced.
It is mentioned that, as a property of the Gaussian distribution, the expected variance is equal to \bar{x}. But this is valid only if the Gaussian distribution is an approximation of a Poisson distribution. (In a general Gaussian distribution, the variance and mean are independent.) No derivation of the sum, mean, and variance of the binomial and Poisson distributions is included.

(6) The diverging situation may have arisen as follows. Initially, this value was called the "standard error for the sample mean \bar{x}". Either "sample" or "mean", or both, were omitted by custom. Also, by calling this value the standard error, it may be distinguished from the standard deviation.

William J. Price, "Nuclear Radiation Detection": The equation for the variance of the binomial distribution is shown, but that for the Poisson distribution is not. No derivation of the sum, mean and variance of the binomial and Poisson distributions is included.

The sum, mean, and variance of the binomial and Poisson distributions are useful knowledge of basic statistics which is necessary in Monte Carlo calculation and radiation measurement. In textbooks on radiation measurement, the derivations are rare; only the equations are included. Thus it is hard to obtain sufficient knowledge of basic statistics from these textbooks. On the other hand, there exist many textbooks of statistics whose levels range from basic, for undergraduates, to advanced, for specialists. It is recommended to learn basic statistics from such textbooks when systematic knowledge of the statistics related to Monte Carlo calculation and radiation measurement is necessary.