Statistical uncertainty in calculation and measurement of radiation
1
It is important to understand statistical uncertainty for radiation measurement or radiation transport calculation by Monte Carlo method (hereafter ’MC method’). This note is for a 3 hour lecture
on statistical uncertainty and related topics. In the 1st hour, mean and variance is briefly mentioned
and variance of sum and variance of mean are derived. Then, we certify that statistical uncertainty
derived in radiation transport calculation by MC method and definition of variance of mean agree. We
derive equation for statistical uncertainty used in radiation measurement from definition of variance
of mean. In 2nd hour, we mention on binomial, Poisson, and Gaussian distribution. In 3rd hour, we
mention on primary equation of least square method.
1 Uncertainty of mean

1. What is the meaning of "uncertainty of mean", and how is it calculated?

2. In radiation measurement, we often evaluate the statistical uncertainty of a count N as √N. For example, in the statistics section of isotope tetyo [1], the total count and its uncertainty are expressed as

   N ± √N    (1)

   How was this derived?

3. What is a variance which does not depend on the number of data?

In this chapter, we investigate these questions using 'uncertainty of sum' and 'uncertainty of mean' as key words. Variance and standard deviation are introduced in section 1.1, and the uncertainty of a sum and the uncertainty of a mean are discussed in sections 1.2 and 1.3. Then 'N ± √N' is derived from the uncertainty of a sum in section 1.4, and a 'variance which does not depend on the number of data' is derived in section 1.5. The equation for statistical uncertainty in MC calculation (for example, Eq. (28)) is easily understood after these sections.
1.1 Variance and standard deviation

Let's assume that some quantity x is measured or calculated repeatedly n times, and write the i-th value of x as xi. The mean of x is

   x̄ = (1/n) ∑_{i=1}^{n} xi    (2)

The difference between an individual xi and the mean of x corresponds to the fluctuation of that xi, so the average of the square of this difference is a useful quantity to estimate the fluctuation of the group of xi. This quantity is called the variance and is calculated as

   s² = (1/n) ∑_{i=1}^{n} (xi − x̄)²    (3)

The square root of the variance is called the standard deviation and is written as s:

   s = √[ (1/n) ∑_{i=1}^{n} (xi − x̄)² ]    (4)

As the dimension of x and that of the standard deviation are the same, the standard deviation s is more convenient than s² for expressing the fluctuation of x.
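The definitions of Eqs. (2)-(4) can be checked with a few lines of Python. This is an illustrative sketch (the function name is ours, not from any code mentioned in this note):

```python
import math

def mean_var_std(xs):
    """Return (mean, variance, standard deviation) following
    Eqs. (2)-(4), i.e. with the 1/n definition of the variance."""
    n = len(xs)
    xbar = sum(xs) / n                          # Eq. (2)
    s2 = sum((x - xbar) ** 2 for x in xs) / n   # Eq. (3)
    return xbar, s2, math.sqrt(s2)              # Eq. (4)

xbar, s2, s = mean_var_std([1.0, 2.0, 3.0, 4.0])
print(xbar, s2, s)  # 2.5, 1.25, 1.118...
```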
¹ 2014-01-20 English version
1.2 Uncertainty of sum

Now we consider the sum of n values xi and write it as y:

   y = ∑_{i=1}^{n} xi = x1 + x2 + · · · + xn = n x̄    (5)

Let's now consider the uncertainty of y. Using "propagation of error", the variance of y is written using the deviations of xi, i.e. ∆xi, as

   ∆y² = (∂y/∂x1)² ∆x1² + (∂y/∂x2)² ∆x2² + · · · + (∂y/∂xn)² ∆xn²
       = 1 · s² + 1 · s² + · · · + 1 · s²
       = n s²

Here the variance of y, sy², is written as ∆y² because this notation suits the equation of partial differentiation. The standard deviation of y, sy, is obtained as the square root of the variance of y:

   sy = √(sy²) = √(n s²) = √n · s    (6)

If xi consists of n data, the uncertainty of the sum of the xi, sy, is the product of the uncertainty of xi and √n.
1.3 Uncertainty of mean

Let's consider the mean of xi. The number of data is again assumed to be n.

   x̄ = (1/n) ∑_{i=1}^{n} xi = (1/n) {x1 + x2 + · · · + xn}

Using propagation of error, the uncertainty of the mean is calculated as

   ∆x̄² = (∂x̄/∂x1)² ∆x1² + (∂x̄/∂x2)² ∆x2² + · · · + (∂x̄/∂xn)² ∆xn²
       = (1/n)² · s² + (1/n)² · s² + · · · + (1/n)² · s²
       = n (1/n)² s²
       = (1/n) s²

Here the variance of x̄, sx̄², is written as ∆x̄² because this notation suits the equation of partial differentiation. The standard deviation of x̄, sx̄, is calculated as the square root of the variance of x̄:

   sx̄ = √(sx̄²) = √((1/n) s²) = (1/√n) s    (7)

When the number of data xi is n, the uncertainty of the average of the xi, sx̄, is equal to the product of the uncertainty of xi and 1/√n.² Eq. (7) is an important equation to evaluate the uncertainty of the mean when the mean is obtained from repeated measurements. Many textbooks of statistics mention this equation.³
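The 1/√n behavior of Eq. (7) can be verified by simulation: draw many samples of size n, measure the spread of their sample means, and compare it with σ/√n. A sketch (all names are ours; the Gaussian generator stands in for any repeated measurement):

```python
import random, math

random.seed(1)
n, trials = 100, 2000
sigma = 1.0  # population standard deviation of the generator

# spread of sample means over many repeated "measurements"
means = [sum(random.gauss(0.0, sigma) for _ in range(n)) / n
         for _ in range(trials)]
m = sum(means) / trials
spread = math.sqrt(sum((x - m) ** 2 for x in means) / trials)

print(spread, sigma / math.sqrt(n))  # both close to 0.1
```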
1.4 Derivation of N ± √N

Variance of x  Let's assume that the time interval is so short that the count xi in each time interval is either 0 or 1 in radiation measurement. Or, let's assume that the result of an MC calculation is either 0 or 1 for each incident particle. Examples of the distribution of xi for the case that the number of time intervals n equals 10 are shown in Fig. 1 and Fig. 2. The special binomial distribution whose value is either 0 or 1 is called the "unit binomial distribution" or Bernoulli distribution. Let's write the probability of a hit as a (the probability that the response is 1).

Figure 1: Distribution of xi. Number of time intervals n=10, number of intervals with count=4.

Figure 2: Distribution of xi. Number of time intervals n=10, number of intervals with count=1.

The average of xi, x̄, is
   x̄ = (1/n) ∑_{i=1}^{n} xi = (1/n) (na · 1 + n(1 − a) · 0) = a.
Here xi = 1 and xi = 0 occur na and n(1 − a) times, respectively. Utilizing this, the variance s² is

   s² = (1/n) ∑_{i=1}^{n} (xi − x̄)²
      = (1/n) ∑_{i=1}^{n} (xi − a)²
      = (1/n) [ na(1 − a)² + n(1 − a)(0 − a)² ]
      = (1/n) na(1 − a) [(1 − a) + a]
      = a(1 − a)    (8)

² Until this point, an equation to evaluate uncertainty has been obtained. This equation is equivalent to the equation used in the EGS5 user code to obtain statistical uncertainty; therefore, the uncertainty calculated in the EGS user code is based on the definition of the uncertainty of the mean. While the EGS5 code is mentioned, this lecture should be useful for understanding other MC codes, e.g. EGSnrc. ucnaicgv.f is a sample user code of EGS5. The explanation of statistical uncertainty in ucnaicgv.f and the contents of the calculation are shown in Appendix A.1 and A.2, respectively.

³ Miyagawa [2] states: "If a sample is taken from a population whose mean and variance are µ and σ², respectively, the expectation values of the mean and variance of a sample of size n are µ and σ²/n, respectively." Hoel [3] states: "The sample mean x̄ follows a normal distribution with mean µ and standard deviation σx̄ = σ/√n. This standard deviation is called the standard error of the mean x̄." Kuga [4] gives the relation of the experimental standard deviation s and the standard uncertainty ∆x̄ of the experimental value x̄ as ∆x̄ = s/√n.
For example, in Fig. 1 and Fig. 2, a = 0.4 and a = 0.1, giving s² = 0.24 and s² = 0.09, respectively. s² depends only on a and does not depend on n. The standard deviation s is

   s = √(a(1 − a))    (9)

The relation of s and a is shown in Fig. 3; s lies on the half circle whose diameter spans the points (a = 0, s = 0) and (a = 1, s = 0).
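The values quoted for Fig. 1 and Fig. 2 can be reproduced by applying Eq. (3) directly to a 0/1 sequence; a sketch (the function name is ours):

```python
def bernoulli_stats(n, hits):
    """Mean and 1/n variance of a 0/1 sequence with `hits` ones,
    as in the distributions of Fig. 1 and Fig. 2."""
    xs = [1] * hits + [0] * (n - hits)
    a = sum(xs) / n
    s2 = sum((x - a) ** 2 for x in xs) / n  # equals a(1-a), Eq. (8)
    return a, s2

# Fig. 1: n=10 with 4 hits; Fig. 2: n=10 with 1 hit
print(bernoulli_stats(10, 4))  # (0.4, 0.24)
print(bernoulli_stats(10, 1))  # (0.1, 0.09)
```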
Uncertainty of sum  As shown in Eq. (6), the uncertainty of the sum y of the xi is the product of the standard deviation of x and √n. Combining Eq. (6) and Eq. (8),

   sy = √n · s
      = √n · √(a(1 − a))
      = √(na(1 − a))
      = √(y(1 − a))

The sum and its uncertainty are

   y ± √(y(1 − a))    (10)

By this derivation, the variance of the sum of a Bernoulli distribution is shown to be y(1 − a) [2]. If we assume that 1 − a ≃ 1, i.e. the probability of a hit is small, we obtain

   y ± √y    (11)

This agrees with Eq. (1); thus we have derived Eq. (1).⁴ As uncertainty estimation in MC calculation is based on the equation for the uncertainty of the mean, the equation for estimating uncertainty in MC calculation, Eq. (1), and the uncertainty of the sum of a Bernoulli distribution are equivalent.
Finally, the value of the uncertainty is evaluated using Eq. (10) and Eq. (11). Let's assume the number of trials n = 10000 and estimate sy as a function of the hit probability a; we then obtain Fig. 4, where the red and black lines indicate results in which the (1 − a) factor is considered and neglected, respectively. For example, for a = 0.5, the count and its uncertainty are 5000 ± 50 and 5000 ± 71, respectively.
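The two numbers quoted for a = 0.5 can be checked directly from Eq. (10) and Eq. (11); a sketch (the function name is ours):

```python
import math

def sum_uncertainty(n, a, keep_factor=True):
    """Uncertainty of the sum y = n*a: Eq. (10) keeps the (1-a)
    factor, Eq. (11) neglects it (valid only for small a)."""
    y = n * a
    return math.sqrt(y * (1 - a)) if keep_factor else math.sqrt(y)

n = 10000
print(round(sum_uncertainty(n, 0.5, True)))   # 50, Eq. (10)
print(round(sum_uncertainty(n, 0.5, False)))  # 71, Eq. (11)
```

For a = 0.5 the two estimates differ by a factor √2, which is why neglecting (1 − a) is justified only when the hit probability is small.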
Uncertainty of mean  Dividing Eq. (10) by n, we obtain the mean and its uncertainty,

   a ± √( (1/n) a(1 − a) )    (12)

1.5 Variance independent of number of data
If the variance s² is estimated using Eq. (3) for several values of n, it is known that the value of the variance changes depending on n: the expectation value of s² is proportional to (n − 1)/n. Since x̄ is the mean of the xi, x̄ and the xi are correlated, and this correlation is the reason for the n dependence.
A variance which does not depend on n can then be obtained by calculating the variance σ² against the expectation value of the mean, µ,

   σ² = (1/n) ∑_{i=1}^{n} (xi − µ)²    (13)
⁴ By this derivation, the variance of the sum of a Bernoulli distribution is shown to be y(1 − a). This is one way of deriving the variance of a binomial distribution [2]. Derivation of the variance from the equation of the binomial distribution itself is given later. The variance of a binomial distribution can also be derived from the moment generating function.
Figure 3: Standard deviation of x, s, as a function of hit probability a (curves √a and √(a(1−a))).

Figure 4: Uncertainty of sum y, sy, as a function of hit probability a for n = 10000 (curves √(na) and √(na(1−a)); the points 5000 ± 71 and 5000 ± 50 are marked).
While Eq. (13) is an equation for a variance, it is impossible to estimate the variance using Eq. (13) in practice because the value of µ is unknown. Another way to estimate a variance which is independent of n is to replace 1/n by 1/(n − 1) in Eq. (3):

   σ̂² = (1/(n − 1)) ∑_{i=1}^{n} (xi − x̄)²    (14)

Some textbooks call σ̂² the sample variance [2] or the unbiased estimate of the variance [3]. In this print, these quantities are simply called variances, and they are distinguished by different characters when necessary. Also, when n is large enough, the difference between n and n − 1 is not meaningful, and Eq. (3) and Eq. (4) may be used to obtain the variance and standard deviation.
Numerical calculation  The relationship of Eq. (3), Eq. (13) and Eq. (14) is investigated using random numbers in Excel. With the Excel RAND() function, we generate random numbers distributed uniformly on (0,1). One hundred sets of 10 random numbers are input. We then obtain the mean of each set of 10 random numbers and calculate s² by Eq. (3); as 100 values of s² are obtained, their mean is calculated. Also, σ², which is based on the fluctuation from the expectation value of the mean (0.5 in this case), is estimated by Eq. (13); one hundred values of σ² are obtained and their mean is calculated. This calculation is repeated from n = 10 down to n = 1. The expectation value of σ² is
   ∫₀¹ (x − 0.5)² dx = ∫_{−0.5}^{0.5} y² dy = [y³/3]_{−0.5}^{0.5} = (1/3)(1/8 − (−1/8)) = 1/12.    (15)
The comparison of the values of the variance is shown in Fig. 5. The value of s² obtained by Eq. (3) clearly depends on n; it becomes small when n becomes small. On the other hand, the value of σ² based on Eq. (13) is independent of n and is close to its expectation value of 1/12. Also, s² is close to (1/12) · (n − 1)/n, and (n/(n − 1)) · s², i.e. σ̂² calculated by Eq. (14), is independent of n. From these results, s² has a problem of n dependence which is most pronounced when n is small, while the definition of s² is simple and intuitive; σ² and σ̂² have the good property of n-independence.
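The Excel experiment described above can be reproduced in any language; the sketch below (Python, our own variable names, uniform numbers on (0,1) with known mean µ = 0.5) shows the bias of Eq. (3) at the extreme case n = 2:

```python
import random

random.seed(0)
mu, sets = 0.5, 2000

def averaged_variances(n):
    """Average over many sets of the Eq. (3), Eq. (13) and Eq. (14)
    variance estimates for samples of size n."""
    s2 = v13 = v14 = 0.0
    for _ in range(sets):
        xs = [random.random() for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)
        s2 += ss / n                               # Eq. (3)
        v13 += sum((x - mu) ** 2 for x in xs) / n  # Eq. (13)
        v14 += ss / (n - 1)                        # Eq. (14)
    return s2 / sets, v13 / sets, v14 / sets

e3, e13, e14 = averaged_variances(2)
print(e3, e13, e14)  # roughly 1/24, 1/12, 1/12
```

For n = 2 the biased estimate averages near (n − 1)/n · 1/12 = 1/24, while Eq. (13) and Eq. (14) both stay near 1/12.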
Certification of data-number independence of Eq. (14)  The right-hand side of Eq. (13) is rewritten as

   (1/n) ∑_{i=1}^{n} (xi − µ)² = (1/n) ∑_{i=1}^{n} {(xi − x̄) + (x̄ − µ)}²
Figure 5: Comparison of the values of variance: s² from Eq. (3), σ² from Eq. (13), s² · n/(n−1) from Eq. (14), together with the lines 1/12 and (n−1)/n · (1/12).
   = (1/n) ∑_{i=1}^{n} (xi − x̄)² + (1/n) ∑_{i} (x̄ − µ)² + (2/n) ∑_{i=1}^{n} (xi − x̄)(x̄ − µ).
Here, the 3rd term of the right-hand side is 0 because it is a sum over i of a quantity proportional to (xi − x̄). Then, taking expectation values of both sides,
   E[ (1/n) ∑_{i=1}^{n} (xi − µ)² ] = E[ (1/n) ∑_{i=1}^{n} (xi − x̄)² ] + E[ (1/n) ∑_{i} (x̄ − µ)² ].    (16)
The left-hand side is the variance from the expectation value of the mean, calculated by Eq. (13); its expectation value is σ². The 1st term of the right-hand side is the variance from the mean of the data (Eq. (3)). The 2nd term of the right-hand side is the variance of the mean of the data, whose expectation value is σ²/n. We can therefore summarize this equation as: "Variance from µ = Variance from x̄ + Variance of x̄". By substituting the expectation values into Eq. (16), we can show the agreement of the expectation value of Eq. (13) with that of Eq. (14):
   σ² = E[ (1/n) ∑_{i} (xi − x̄)² ] + σ²/n

   ((n − 1)/n) σ² = E[ (1/n) ∑_{i} (xi − x̄)² ]

   (n − 1) σ² = E[ ∑_{i} (xi − x̄)² ]

   σ² = E[ (1/(n − 1)) ∑_{i} (xi − x̄)² ] = E[ σ̂² ]

(End of certification.) The equation in ref. [2] was referenced.
2 Binomial, Poisson and Gaussian distribution

2.1 Binomial distribution
Counts in radiation measurement and MC calculation follow a binomial distribution in general. For example, consider an MC calculation of the response function of a NaI detector, and in particular the ratio of the full-energy absorption peak. Let's think of the distribution of the number of counts x in the full-energy absorption peak after the calculation has finished for n incident particles, and write the probability that an incident particle contributes to the full-energy absorption peak as p. As shown in Fig. 6, the probability that the x particles from the 1st to the x-th all contribute is p^x. As the probability of no contribution is (1 − p), the probability that the (n − x) particles from the (x + 1)-th to the n-th all fail to contribute is (1 − p)^(n−x). Accordingly, the probability that the particles from the 1st to the x-th contribute and the (n − x) particles from the (x + 1)-th to the n-th do not contribute is obtained by multiplying these probabilities:

   p^x (1 − p)^(n−x)

In reality, it is very rare that only the beginning part contributes sequentially and the following part fails to contribute sequentially; contribution and no contribution are mixed, seemingly at random. The number of combinations to take x from n is calculated as
   nCx = n! / (x!(n − x)!).

Then, the probability that x particles contribute out of n incident particles is

   f(x) = nCx p^x (1 − p)^(n−x)    (17)

The distribution of this equation is called the binomial distribution. The distribution of Eq. (17) for n = 10 and p = 0.4 is shown in Fig. 7.
Figure 6: Concept of the MC calculation result for n incident particles. Contribution and no contribution to the full-energy absorption peak are shown by p and q, respectively. This figure shows the one extraordinary case in which the first x particles all contribute to the full-energy peak and the following (n − x) particles all fail to contribute.

Figure 7: Binomial distribution with n = 10 and p = 0.4.
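The binomial probabilities of Eq. (17), such as those plotted in Fig. 7, can be computed directly; a sketch (the function name is ours):

```python
from math import comb

def binom_pmf(x, n, p):
    """Eq. (17): f(x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# the distribution of Fig. 7: n=10, p=0.4
dist = [binom_pmf(x, 10, 0.4) for x in range(11)]
print(sum(dist))                               # ~1, sum of the distribution
print(max(range(11), key=lambda x: dist[x]))   # 4, the peak seen in Fig. 7
```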
Counts in radiation measurement also follow a binomial distribution. Here we derive the sum, mean, and variance of the binomial distribution.
Sum  We calculate the sum of the probability density function f(x) of the binomial distribution, ∑_{x=0}^{n} f(x). The "binomial theorem" is expressed by the following equation:

   (a + b)^n = ∑_{x=0}^{n} n!/(x!(n − x)!) a^(n−x) b^x.

Rewriting a and b as q and p, respectively, and assuming that p + q = 1, the left-hand side equals 1 and the right-hand side equals ∑_{x=0}^{n} f(x), so the equation can be written as

   ∑_{x=0}^{n} f(x) = 1
Mean  The expectation value of a discrete probability variable x is written as

   E[x] = ∑_{x=0}^{n} x f(x).

The mean of the binomial distribution is written using E as

   E[x] = ∑_{x=0}^{n} x · nCx p^x q^(n−x)
        = ∑_{x=0}^{n} x · n!/(x!(n − x)!) p^x q^(n−x).

The term with x = 0 is omitted as it is 0, so the summation starts from x = 1:

   E[x] = ∑_{x=1}^{n} x · n!/(x!(n − x)!) p^x q^(n−x)
        = ∑_{x=1}^{n} x · n!/(x(x − 1)!(n − x)!) p^x q^(n−x)
        = ∑_{x=1}^{n} n!/((x − 1)!(n − x)!) p^x q^(n−x).

Replacing z = x − 1 and j = n − 1,

   E[x] = np ∑_{z=0}^{j} j!/(z!(j − z)!) p^z q^(j−z)    (18)
        = np    (19)

At the end of Eq. (18), the sum is the sum of a binomial distribution (with j trials), which is equal to 1.
Variance  The variance of a probability variable x, V[x], is

   V[x] = (1/n) ∑_{i=1}^{n} (xi − x̄)²
        = (1/n) ∑_{i=1}^{n} xi² − (2/n) x̄ ∑_{i=1}^{n} xi + (1/n) ∑_{i=1}^{n} x̄²
        = (1/n) ∑_{i=1}^{n} xi² − (2/n) x̄ (n x̄) + (1/n)(n x̄²)
        = (1/n) ∑_{i=1}^{n} xi² − x̄²
        = E[x²] − E[x]²
To obtain the variance of the binomial distribution, E[x²] is calculated first:

   E[x²] = ∑_{x=0}^{n} x² · nCx p^x q^(n−x)
         = ∑_{x=0}^{n} x² · n!/(x!(n − x)!) p^x q^(n−x)
         = ∑_{x=0}^{n} x² · n!/(x(x − 1)!(n − x)!) p^x q^(n−x)
         = ∑_{x=1}^{n} n! x/((x − 1)!(n − x)!) p^x q^(n−x)

At the last equality, the term with x = 0 is omitted as its value is 0, and the summation starts from x = 1. Replacing z = x − 1 and j = n − 1,

   E[x²] = np ∑_{z=0}^{j} j!(z + 1)/(z!(j − z)!) p^z q^(j−z)
         = np ∑_{z=0}^{j} j! z/(z!(j − z)!) p^z q^(j−z) + np ∑_{z=0}^{j} j!/(z!(j − z)!) p^z q^(j−z)

Here, the first summation of the right-hand side is equal to jp, the mean of a binomial distribution with j trials, and the second summation is equal to 1, the sum of a binomial distribution. Utilizing these,

   E[x²] = np · jp + np = n(n − 1)p² + np

   V[x] = E[x²] − E[x]² = n(n − 1)p² + np − (np)² = np(1 − p) = npq
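The results E[x] = np and V[x] = npq can be verified numerically by summing over the probability mass function itself; a sketch for the n = 10, p = 0.4 case of Fig. 7:

```python
from math import comb

n, p = 10, 0.4
q = 1 - p
pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]

mean = sum(x * f for x, f in zip(range(n + 1), pmf))   # should equal np
ex2 = sum(x * x * f for x, f in zip(range(n + 1), pmf))
var = ex2 - mean ** 2                                  # should equal npq

print(mean, var)  # ~4.0 and ~2.4
```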
2.2 Poisson distribution

It is known that the binomial distribution converges to the Poisson distribution when the probability of a hit is low. The probability density function of the Poisson distribution is written as

   f(x) = e^(−λ) λ^x / x!

Here, λ = np. The features of the Poisson distribution are as follows:

• The Poisson distribution contains only one parameter, the mean value. It is simple compared to the binomial distribution, which contains two parameters, the number of trials and the hit probability.

• The parameter λ of the Poisson distribution is continuous (real), whereas the binomial distribution requires a discrete (integer) number of trials n.
A comparison of binomial and Poisson distributions is shown in Fig. 8. Binomial distributions for 3 sets of parameters, (n=10, p=0.4), (n=20, p=0.2) and (n=40, p=0.1), are displayed, together with the Poisson distribution with λ = 4. Here, the combinations of n and p are adjusted to maintain the relationship np = λ. The binomial distribution converges to the Poisson distribution as p decreases.
We now derive the sum, mean, and variance of the Poisson distribution.
Sum  By the Maclaurin expansion of e^λ, we obtain

   e^λ = 1 + λ + (1/2!)λ² + (1/3!)λ³ + · · · = ∑_{x=0}^{∞} λ^x/x!

On the other hand, the sum of the Poisson distribution is written as

   ∑_{x=0}^{∞} e^(−λ) λ^x/x! = e^(−λ) ∑_{x=0}^{∞} λ^x/x! = e^(−λ) e^λ = 1
Figure 8: Comparison of binomial distributions ((n=10, p=0.4), (n=20, p=0.2), (n=40, p=0.1)) and the Poisson distribution with λ = 4.
Mean

   E[x] = ∑_{x=0}^{∞} x e^(−λ) λ^x/x! = ∑_{x=1}^{∞} e^(−λ) λ^x/(x − 1)!

Here, the term with x = 0 was omitted as it is 0.

   E[x] = ∑_{x=1}^{∞} λ e^(−λ) λ^(x−1)/(x − 1)! = λ e^(−λ) ∑_{x=1}^{∞} λ^(x−1)/(x − 1)!

Here, we replace j = x − 1:

   E[x] = λ e^(−λ) ∑_{j=0}^{∞} λ^j/j!

The summation of the right-hand side is equal to e^λ. Then,

   E[x] = λ e^(−λ) e^λ = λ
Variance

   E[x²] = ∑_{x=0}^{∞} x² f(x) = ∑_{x=0}^{∞} x² e^(−λ) λ^x/x!

We omit the term with x = 0 as it is 0:

   E[x²] = ∑_{x=1}^{∞} x² e^(−λ) λ^x/x! = ∑_{x=1}^{∞} x e^(−λ) λ^x/(x − 1)!

Here, we replace z = x − 1:

   E[x²] = ∑_{z=0}^{∞} (z + 1) e^(−λ) λ^(z+1)/z! = ∑_{z=0}^{∞} (z + 1) λ e^(−λ) λ^z/z!
         = λ ∑_{z=0}^{∞} z e^(−λ) λ^z/z! + λ ∑_{z=0}^{∞} e^(−λ) λ^z/z!

The first summation of the right-hand side is equal to the mean of the Poisson distribution (= λ) and the second summation is equal to the sum of the Poisson distribution (= 1).

   E[x²] = λ² + λ

   V[x] = E[x²] − E[x]² = λ² + λ − λ² = λ
2.3 Gaussian distribution

When the number of trials is sufficiently large, the binomial or Poisson distribution can be approximated by a Gaussian distribution, which is expressed by the following equation:

   f(x) = 1/(√(2π) σ) exp{ −(x − µ)²/(2σ²) }

In Fig. 9, a comparison of the binomial and Poisson distributions with Gaussian distributions is shown. The mean is set to 20 for all distributions. The hit probability of the binomial distribution is set to p = 0.5. The parameter controlling the width of the Gaussian distribution is set to σ = √10 to approximate the binomial distribution and σ = √20 to approximate the Poisson distribution.
Figure 9: Approximation of the binomial (n=40, p=0.5) and Poisson (λ=20) distributions by Gaussian distributions (µ=20, σ=√10 and σ=√20).
3 Derivation of basic formula for least square method

As shown in Fig. 10, 3 points (x1, y1), (x2, y2), and (x3, y3) exist on the (x, y)-plane. The goal is to draw a line y′ = a + b(x − x̄) so that the line comes close to these 3 points. Several strategies may be possible to realize "come close". Here, we draw line segments from these points to the line, parallel to the y axis, and determine the coefficients a and b so that the sum of the squares of the segments d1, d2, and d3 becomes minimum. d1 ∼ d3 are expressed as

   d1 = y1 − y1′ = y1 − {a + b(x1 − x̄)}
   d2 = y2 − y2′ = y2 − {a + b(x2 − x̄)}
   d3 = y3 − y3′ = y3 − {a + b(x3 − x̄)}.
The sum of the squares of d1 ∼ d3 is

   S = d1² + d2² + d3²
     = {y1 − a − b(x1 − x̄)}² + {y2 − a − b(x2 − x̄)}² + {y3 − a − b(x3 − x̄)}²
     = {y1² + y2² + y3²} + 3a² + b²{(x1 − x̄)² + (x2 − x̄)² + (x3 − x̄)²} − 2a(y1 + y2 + y3)
       − 2b{y1(x1 − x̄) + y2(x2 − x̄) + y3(x3 − x̄)} + 2ab{(x1 − x̄) + (x2 − x̄) + (x3 − x̄)}.

Here, the term with 2ab is omitted because its { } part is 0. The other terms of S can be rewritten as

   S = A + 3a² + b²B − 2aC − 2bD    (20)
Here,

   A = y1² + y2² + y3²
   B = (x1 − x̄)² + (x2 − x̄)² + (x3 − x̄)²
   C = y1 + y2 + y3
   D = y1(x1 − x̄) + y2(x2 − x̄) + y3(x3 − x̄).
Eq. (20) is the sum of a quadratic expression in a and one in b. We therefore differentiate S with respect to a or b to obtain the coefficients that minimize S:

   ∂S/∂a = 6a − 2C = 0,  hence  a = C/3    (21)

   ∂S/∂b = 2Bb − 2D = 0,  hence  b = D/B    (22)
Eq. (21) and Eq. (22) are the equations for the case that the number of points is 3. Modifying these formulas so that they are valid for n points,

   a = ȳ    (23)

   b = ∑_{i=1}^{n} yi(xi − x̄) / ∑_{i=1}^{n} (xi − x̄)²    (24)

In the derivation of these formulas, [8] was referenced.
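Eqs. (23) and (24) translate directly into code; a sketch (the function name is ours), using as an example the 3 points from Practice problem 5 for student number 874913:

```python
def fit_line(xs, ys):
    """Least-squares line y' = a + b(x - xbar) via Eqs. (23) and (24)."""
    n = len(xs)
    xbar = sum(xs) / n
    a = sum(ys) / n                                      # Eq. (23)
    b = (sum(y * (x - xbar) for x, y in zip(xs, ys))
         / sum((x - xbar) ** 2 for x in xs))             # Eq. (24)
    return a, b, xbar

# points (1,3), (4,7), (8,9) from Practice problem 5
a, b, xbar = fit_line([1, 4, 8], [3, 7, 9])
print(a, b, xbar)
```

For these points a = 19/3 and b = 31/37, so the fitted line is y′ = 19/3 + (31/37)(x − 13/3).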
4 Practice

1. Generate random numbers of 0 and 1 using the following procedure and calculate the mean, standard deviation and uncertainty of the mean.

   (a) Production of the distribution
       i. Assume that responses 1 and 0 are obtained 3 and 7 times, respectively, in an MC calculation with 10 incident particles.
       ii. For simplification, assume that 1 appears in the first 3 trials and 0 appears in the following 7 trials. Call this sequence x.
Figure 10: Procedure of the least square method: three points (x1,y1), (x2,y2), (x3,y3), the fitted line y′ = a + b(x − x̄), and the residuals d1, d2, d3.
       iii. Input x into Excel or KaleidaGraph. (Input 1 in cells A1 to A3 and 0 in cells A4 to A10 of Excel.)

   (b) Calculate statistical quantities of x by hand calculation.
       i. Calculate the variance and standard deviation of x by Eq. (8) and Eq. (9).
       ii. Calculate the sum of x. Also, calculate the uncertainty of the sum of x by Eq. (1) and Eq. (10).
       iii. Calculate the mean of x. Also, calculate the uncertainty of the mean of x by Eq. (12).

   (c) Investigate the statistical nature of x using functions of Excel. Read the values of the var, stdev, and stdevp functions over A1 to A10 and compare them with the results of the hand calculation.⁵

   (d) If possible, use functions of KaleidaGraph to investigate the statistical nature of x. Obtain the mean, variance, standard deviation and uncertainty of the mean using the statistical functions of KaleidaGraph.
2. Produce another distribution of random numbers consisting of 0 and 1, and investigate its mean, variance, standard deviation and uncertainty of the mean.

   (a) Production of the distribution
       i. Divide the lower 2 digits of your student number by 100 and let the result be a.
       ii. Generate 10 random numbers r (0 < r < 1). Set x to 1 for r < a and to 0 for r > a.
       iii. Input x into Excel and KaleidaGraph.

   (b) Investigate the statistical nature of x by the same process as problem 1 (b) to (d).

3. Confirm that the variance by Eq. (3) depends on n and that the variances by Eq. (13) and Eq. (14) do not depend on n.

   (a) Numerical calculation: generate 100 sets of random numbers using Excel or other software. Each set should contain n random numbers (1 < n < 101). Calculate their variances based on Eq. (3), Eq. (13) and Eq. (14) to investigate the n dependence.

   (b) Certify that Eq. (14) does not depend on n.
⁵ The stdev function returns the standard deviation based on the unbiased, i.e. (n − 1), method. The stdevp function returns the standard deviation based on the biased, i.e. n, method. The p in stdevp stands for population.
4. Calculate the following distributions. Excel may be used.

   (a) Binomial distribution with n = 10 and p = 0.4. Display it as a figure and compare it with Fig. 7.
   (b) The 3 kinds of binomial distributions and the Poisson distribution shown in Fig. 8. Which binomial distribution is closest to the Poisson distribution?
   (c) The binomial, Poisson, and Gaussian distributions shown in Fig. 9. Does the Gaussian distribution approximate the binomial and Poisson distributions well?

5. Put 3 points on the (x, y)-plane by the following procedure and obtain the line which comes close to these points using the least square method.

   (a) Disintegrate your student number (or license number) into single digits and order them from smallest to largest. Assume that these digits correspond to (x1, y1) ∼ (x3, y3). For example, if the student number is 874913, the digits 8, 7, 4, 9, 1, and 3 are obtained and used as (x1, y1) = (1, 3), (x2, y2) = (4, 7), (x3, y3) = (8, 9).
   (b) Calculate a and b using Eq. (21) and Eq. (22), or Eq. (23) and Eq. (24).
   (c) Plot (x1, y1) ∼ (x3, y3) and draw the line y′ = a + b(x − x̄) on the same graph to confirm that the line was successfully placed near the points by the least square method.

6. Derive the following equations.

   (a) Sum, mean and variance of the binomial distribution. The variance can be derived from the hit probability.
   (b) Sum, mean and variance of the Poisson distribution.
   (c) The equations for the coefficients a and b of the least square method.
References

[1] Isotope-tetyo, 10th ed., Isotope-kyokai (2001) (in Japanese).
[2] K. Miyagawa, "Elementary Statistics, 3rd ed.", Yuhikaku, Tokyo (2007) (in Japanese).
[3] P. G. Hoel, "Elementary Statistics, 4th ed.", John Wiley & Sons, Inc., New York (1976).
[4] T. Kuga, Parity 25, 54-60 (2010-06) (in Japanese).
[5] Nicholas Tsoulfanidis, "Measurement and Detection of Radiation", McGraw-Hill Book Company (1983), Section 2.8.
[6] G. F. Knoll, "Radiation Detection and Measurement, 3rd ed.", John Wiley & Sons, Inc. (2000).
[7] W. J. Price, "Nuclear Radiation Detection", McGraw-Hill Book Company, Inc. (1964).
[8] http://szksrv.isc.chubu.ac.jp/lms/lms1.html
[9] H. Hirayama and Y. Namito, "Lecture Notes of Response Calculation of NaI Detector (cg Version)", KEK Internal 2011-5 (Dec 2011).
Appendix A. Derivation of statistical uncertainty in Monte Carlo calculation

A.1 Calculation of statistical uncertainty in ucnaicgv.f

ucnaicgv.f is an egs5 sample user code which is utilized in the short course on EGS5. In this user code, photons are incident onto a NaI detector, and several quantities which characterize the detector, such as peak efficiency, total efficiency, and distribution of absorbed energy, are calculated. In the manual of this user code [9], statistical uncertainty is described as follows.

The uncertainty of the obtained quantity, x, is estimated in this user code using the method used in MCNP.

• Assume that the calculation calls for N "incident" particle histories.

• Assume that xi is the result for the i-th history.

• Calculate the mean value of x:

   x̄ = (1/N) ∑_{i=1}^{N} xi    (25)

• Estimate the variance associated with the distribution of xi:

   s² = (1/(N − 1)) ∑_{i=1}^{N} (xi − x̄)² ≃ ⟨x²⟩ − (x̄)²,  where  ⟨x²⟩ = (1/N) ∑_{i=1}^{N} xi².    (26)

• Estimate the variance associated with the distribution of x̄:

   sx̄² = (1/N) s² ≃ (1/N) [⟨x²⟩ − (x̄)²]    (27)

• Report the statistical uncertainty as:

   sx̄ ≃ [ (1/N) (⟨x²⟩ − (x̄)²) ]^(1/2)    (28)
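The history-based estimate of Eqs. (25)-(28) can be sketched as follows. This is a simplified stand-in for the user-code bookkeeping (our own function and variable names, not the actual EGS5 source), using the same 0/1 scoring and the numbers quoted in A.2:

```python
import math

def mc_mean_and_uncertainty(history_results):
    """Eqs. (25)-(28): mean over N histories and its statistical
    uncertainty, estimated from the first and second moments."""
    N = len(history_results)
    m1 = sum(history_results) / N                  # Eq. (25), the mean
    m2 = sum(x * x for x in history_results) / N   # second moment <x^2>
    return m1, math.sqrt((m2 - m1 * m1) / N)       # Eq. (28)

# 0/1 scoring as in ucnaicgv.f: 3728 peak events out of 10000 histories
results = [1.0] * 3728 + [0.0] * 6272
m, sig = mc_mean_and_uncertainty(results)
print(m, sig)  # 0.3728 and about 0.0048
```

Because each history scores 0 or 1 here, this reproduces the Bernoulli result of section 1.4: the uncertainty of the mean is √(a(1 − a)/n).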
A.2 Contents of the code and numerical values

The contents of the code and its numerical values are investigated to understand the equations in the previous subsection. In ucnaicgv.f, the following lines are included in the shower-call loop, which is the main part of the Monte Carlo calculation:

   if(depe.ge.ekein*0.999) then
     pefs=pefs+wtin
     pef2s=pef2s+wtin
   end if

Using these lines, the weight of particles which contribute to the peak efficiency is summed up. Here, ekein and depe are the incident energy and the absorbed energy in the sensitive volume for a single incident particle, respectively. The if statement tests whether a particle should be scored as contributing to the full-energy absorption peak. pefs is a variable to count particles in the full-energy absorption peak.
After the Monte Carlo simulation for all incident particles is finished, the results are analyzed in step 9:

   avpe=pefs/ncount
   pef2s=pef2s/ncount
   sigpe=dsqrt((pef2s-avpe*avpe)/ncount)

avpe and sigpe are printed as the peak efficiency and its uncertainty. The values are ncount=10000, wtin=1, pefs=3728, pef2s=3728. pef2s is used to score the square of the weight of particles which contribute to the peak efficiency; as the particle weight is 1 in this user code, the values of pefs and pef2s are the same. Utilizing these, avpe=0.3728, pef2s=0.3728, and sigpe=4.8e-3 are obtained, and the peak efficiency is printed as 37.28 ± 0.48%.
Appendix B

B.1 Uncertainty of mean

The uncertainty of the mean is a value which expresses the uncertainty of a mean. Some confusion is seen around this value.

• Depending on the author, period, or field, this value is called by different names. If one searches by Google for "error for the sample mean" or words which may be used with the same meaning, a variety of words is seen, as shown in Table 1. In Wikipedia, the "error of mean" is called "standard error" in the page on standard error. In KaleidaGraph, this quantity is called the standard error. On the homepage of the Statistics Bureau, Ministry of Internal Affairs and Communications of the Japanese government, it is called the sample error. Hoel [3], an example of a textbook of basic statistics, calls it the "standard error of the mean x̄".⁶

   Table 1: Search results of Google

   Word                          Number of hits
   Error for the sample          3,520,000
   Standard error of mean        780,000
   Error for the sample mean     118,000

• An engineer who treats radiation typically learns basic statistics from a textbook on radiation detectors. But many textbooks on radiation detectors ignore this value [6, 7], and few describe it [5]. This is in contrast to the fact that other basic topics of statistics, such as the binomial, Poisson, and Gaussian distributions, mean, variance, and standard deviation, are well described in textbooks on radiation detectors, and their names have not changed for a long time.
B.2 Description of sum, mean, variance of binomial and Poisson distribution in textbooks on radiation detectors

To investigate the effectiveness of "learning basic concepts of statistics from a textbook on radiation detectors", we summarize how the sum, mean and variance of the binomial and Poisson distributions are described in several textbooks on radiation detectors.

Nicholas Tsoulfanidis, "Measurement and Detection of Radiation"  The sum, mean, and variance of the binomial and Poisson distributions are described.

Glenn F. Knoll, "Radiation Detection and Measurement"  By mathematically simplifying the Poisson distribution, the Gaussian distribution is introduced. It is mentioned that, as a property of the Gaussian distribution, the expected variance is equal to x̄. But this is valid only if the Gaussian distribution is an approximation of a Poisson distribution. (In a general Gaussian distribution, variance and mean are independent.) No description is included of the derivation of the sum, mean, and variance of the binomial and Poisson distributions.

⁶ The diverging terminology may have arisen as follows. Initially, this value was called the "standard error of the sample mean x̄". Either "sample" or "mean", or both of them, were omitted by custom. Also, by calling this value the standard error, it may be distinguished from the standard deviation.
William J. Price, "Nuclear Radiation Detection"  The equation for the variance of the binomial distribution is shown, but that for the Poisson distribution is not. No description is included of the derivation of the sum, mean and variance of the binomial and Poisson distributions.

The sum, mean, and variance of the binomial and Poisson distributions are useful knowledge of basic statistics which is necessary in Monte Carlo calculation and radiation measurement. In textbooks on radiation measurement, the derivations are rare, while only the equations are included; thus it is hard to obtain sufficient knowledge of basic statistics from these textbooks. On the other hand, there exist many textbooks of statistics whose level ranges from basic for undergraduates to advanced for specialists. It is recommended to learn basic statistics from these textbooks when it is necessary to obtain systematic knowledge of the statistics related to Monte Carlo calculation and radiation measurement.