Chapter 4 Fundamental knowledge of statistics for reliability engineering
4.1 Statistics and probability
As can be found from Chapter 3, some knowledge of statistics and probability is
necessary for reliability estimation.
Since nearly all undergraduate students majoring in mechanical engineering
scarcely take a course in ‘statistics’ or ‘probability’, the minimum requisite basic
knowledge of statistics and probability will be described in Chapters 4~6, as simply
as possible.
First, basic statistical terms and probability distributions are here described.
Statistics as a scientific discipline may be defined as
“the study that collects, arranges, classifies and summarizes data of the object of
interest to derive meaningful information, or analyses and interprets data or
information obtained in part to comprehend or infer the whole characteristics of the
object.”
The former part, the derivation of meaningful information by collection,
arrangement, classification and summarization of data, is called descriptive
statistics, and the latter part, the understanding and inference of the whole
characteristics by analysis and interpretation of data obtained in part, is called
inferential statistics.
When repeating observations of a certain phenomenon or repeating experiments,
many different outcomes appear, and a certain outcome appears more frequently
than others. The ratio of the frequency of a certain outcome to the total
frequency of all outcomes is called the relative frequency, and defining probability
simply by this relative frequency is the relative frequency concept of probability.
Other definitions of probability will be described in Section 4.2.
As mentioned above, when repeating observations or experiments, many different
outcomes occur without any special trend. This is called “uncertainty”,
“randomness” or “stochasticity.”
The term “stochastic” means “probabilistic” and is widely used in scientific
papers.
The dictionary defines “random” as “haphazard”, but in statistics the term
“random” means “without repeatability”, “of probability” or “probabilistic”, and a
random variable is a variable that varies probabilistically. A more detailed
definition of a random variable will be given later.
The term “randomness” normally means lack of pattern or predictability, but it
also has the meaning of “being of probability.” Phenomena having randomness
can be treated probabilistically.
4.2 Fundamental terminology of statistics and probability
4.2.1 Population and samples
The terms “population” and “samples” are already explained in Section 3.2.
Repeating it here: sometimes we need to know a certain thing or phenomenon, for
a simple example, the size of apples harvested this year in Korea. For that purpose,
we perform observation or measurement. The whole of the things to be observed
or measured is called a population. For the above example, all apples harvested
this year in Korea correspond to a population. Except for particular cases, it is
almost impossible to observe or measure the whole of a population, because a
population is usually too large in quantity. So, usually we can and do observe or
measure only a part of a population. The part of a population actually observed or
measured is called samples. Selecting samples from a population is called sampling.
4.2.2 Dispersion and distribution
When repeating measurements of a certain object, the measured values or data
are not a single constant value but differ from each other. This aspect is called
dispersion, and the state of dispersion is called the distribution.
4.2.3 Sample mean x̄ and sample variance s²
As the measured data are not constant but vary, the data have a certain
distribution. If we can express the state of the distribution quantitatively, it is very
convenient for comparing distributions with each other. The important properties
representing the state of a distribution are i) the location of the center of the
distribution and ii) the degree of dispersion.
The mean value is commonly used as a quantity representing the center of a
distribution. For samples, the following arithmetic average of the data is used,
denoted by x̄:

\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i    (4.2-1)
As a quantity representing the degree of dispersion, the following value, called
the sample variance s², is commonly used:

s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2    (4.2-2)

or

s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2    (4.2-3)
It is desirable to use Eq. (4.2-2), the unbiased estimator, for the sample variance.
Details of the unbiased estimator are given in Section 5.5.
As can be found below, the mean is the value which minimizes the variance.
Assume x₀ to be a value which minimizes the sample variance, so that

s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - x_0)^2

x₀ can be obtained from the condition

\frac{d(s^2)}{dx_0} = 0

From this,

-\frac{2}{n-1}\sum_{i=1}^{n} (x_i - x_0) = 0

and

x_0 = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}
4.2.4 Bias
The difference between the mean value and the true value is called the bias.
Reference: True value
The true value is essentially unknown. However, the true value in measurements is
defined as the value that would be obtained by an exemplar measurement method. The
exemplar method is a method which experts agree provides data accurate enough for the
specified purpose.
4.2.5 Random variable
Although there are many definitions for it, a random variable may be defined as
a variable that takes numerical values following a probability law.
Random variables are classified into two types according to the set of values that
they can take.
If the random variable takes only certain discrete values, it is called a discrete
random variable.
If the random variable takes any real value within a specified continuous region,
it is called a continuous random variable.
When referring to a random variable generically, we denote it by a capital
(uppercase) letter such as X, and the particular value taken by it by a lowercase
letter such as x. For example, we say that a random variable X takes the values
x1 =0.3, x2 =2.3, ….
4.2.6-0 Probability mass function for discrete random variables
The probability P(X = xi) that the discrete random variable X takes the value xi is
denoted as
P(X = xi)= p(xi)
A set of probability values p(xi) for all possible values xi is called the probability
mass function.
The probability values must satisfy the following conditions:

i) 0 \le p(x_i) \le 1

ii) \sum_i p(x_i) = 1
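For instance, the probability mass function of a fair six-sided die assigns
p(xᵢ) = 1/6 to each face; the short sketch below checks conditions i) and ii)
(the die is an arbitrary example, not one from the text).

```python
from fractions import Fraction

# p.m.f. of a fair six-sided die: p(x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

assert all(0 <= p <= 1 for p in pmf.values())  # condition i)
assert sum(pmf.values()) == 1                  # condition ii)
print(pmf[3])                                  # P(X = 3) = 1/6
```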
4.2.6 Probability density function (p.d.f.) for continuous random variables
As the probability density function is already described in detail in Section 3.2,
only an outline of it is given here.
When the probability P(a ≤ X ≤ b) that a random variable X lies between two
values a and b can be expressed as

P(a \le X \le b) = \int_a^b f(x)\,dx    (4.2-4)

the function y = f(x) is called the probability density function, for which the
abbreviation p.d.f. is often used.
It must satisfy the following conditions:

i) f(x) \ge 0 for all x

ii) \int_{-\infty}^{\infty} f(x)\,dx = 1    (4.2-5)
If any function satisfies the above conditions, it can be a probability density
function.
The probability P(a≤X≤b) is the area under the probability density function f(x)
between the points a and b as illustrated in Fig. 4.2-1a).
By the definition of probability in Eq. (4.2-4), the probability that a random
variable X takes any specific value x is 0, since

P(x \le X \le x) = \int_x^x f(x)\,dx = 0

However, there may be a case where we superficially need the probability for a
specific value x. In that case, for convenience, the probability can be defined as
the probability that the random variable X lies in the infinitesimal interval between
x and x + dx:

P(x \le X \le x + dx) = \int_x^{x+dx} f(x)\,dx = f(x)\,dx

This is the area between the points x and x + dx, as illustrated in Fig. 4.2-1b).
Fig. 4.2-1 Probability density function: (a) the probability P(a ≤ x ≤ b) as the area
under y = f(x) between a and b; (b) f(x)dx as the area over the infinitesimal
interval dx
4.2.7 Cumulative distribution function (c.d.f.)
The cumulative distribution function is defined as the probability that a random
variable X takes a value less than or equal to x, and is expressed as

F(x) = P(X \le x) = \sum_{x_i \le x} p(x_i)   for a discrete random variable    (4.2-6)

The summation extends over those values of i for which x_i ≤ x.
The probability F(x) corresponds to a cumulative probability, that is, the sum of
probabilities from the initial x₁ up to x. From this, the probability F(x) is called the
cumulative distribution function, for which the abbreviation c.d.f. is often used.
The cumulative distribution function is often called simply the distribution function.
F(x) = P(X \le x) = \int_{-\infty}^{x} f(x)\,dx   for a continuous random variable    (4.2-7)
The probability of Eq. (4.2-7) is the area under the probability density function f(x)
up to the point x as illustrated in Fig. 4.2-2.
Fig. 4.2-2 Cumulative distribution function: F(x) is the area under y = f(x) up to
the point x
For this case, the probability F(x) corresponds to the accumulated value (or sum)
of probabilities from -∞ to x.
As can easily be found from Eq. (4.2-7), the cumulative distribution function has
the following properties:

i) F(-\infty) = 0, \quad F(\infty) = 1

ii) \frac{dF(x)}{dx} = f(x)    (4.2-8)
Property ii) in Eq. (4.2-8) is well utilized for obtaining the probability
density function from the cumulative distribution function, which is easier to
quantify.
As can be found from Fig. 4.2-2, the cumulative distribution function can be
considered to imply a ratio to the total (total probability equal to one), which
corresponds to the relative frequency concept of probability mentioned in Section
4.1. This feature makes it easy to quantify the cumulative distribution function.
The cumulative distribution function can be conveniently utilized for calculating a
probability, as

P(a \le X \le b) = \int_a^b f(x)\,dx = \int_{-\infty}^{b} f(x)\,dx - \int_{-\infty}^{a} f(x)\,dx
                 = P(X \le b) - P(X \le a)
                 = F(b) - F(a)
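As a numerical check of this identity, the sketch below (assuming SciPy is
available) compares F(b) − F(a) with direct integration of the density, here for an
arbitrarily chosen exponential distribution.

```python
from scipy import stats, integrate

dist = stats.expon(scale=2.0)  # an arbitrarily chosen continuous distribution
a, b = 0.5, 3.0

p_cdf = dist.cdf(b) - dist.cdf(a)          # F(b) - F(a)
p_int, _ = integrate.quad(dist.pdf, a, b)  # direct integration of f(x) over [a, b]

print(p_cdf, p_int)  # the two values agree to numerical precision
```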
4.2.8 Population mean μ and population variance σ²
The population mean and population variance are important probabilistic properties
of a population, denoted by μ and σ², respectively. The quantities expressing the
probabilistic properties of a population, such as μ and σ², are called parameters.
4.2.9 Expected value of a random variable
It is already mentioned that the mean value is a quantity representing the center
of distribution.
The quantity termed the expected value or mathematical expectation is also
used for representing the center of a distribution, in other words, for describing the
central tendency of a random variable, and is denoted E(X).
The expected values are given by

E(X) = \sum_i x_i\, p(x_i)   for a discrete random variable    (4.2-9)

E(X) = \int_{-\infty}^{\infty} x f(x)\,dx   for a continuous random variable    (4.2-10)

The expected value of a random variable corresponds to the mean value of the
random variable.
Mathematically, the expected value is the probability-weighted (by p(x_i) or
f(x)dx) average (mean).
-9-
The expected value of a certain function of a random variable, for example
g(X) = X^3 + a^3, can be obtained as

E[g(X)] = \sum_i (x_i^3 + a^3)\, p(x_i)   for a discrete random variable    (4.2-11)

E[g(X)] = \int_{-\infty}^{\infty} (x^3 + a^3) f(x)\,dx   for a continuous random variable    (4.2-12)
The word “expected” may come from the fact that the expected value is what
you expect to happen on average.
4.2.10 Variance of a random variable Var(X)
In the past, the symbol V(X) was widely used.
Var(X) = \sum_i \{x_i - E(X)\}^2\, p(x_i) = E[\{X - E(X)\}^2]
   for a discrete random variable    (4.2-13)

Var(X) = \int_{-\infty}^{\infty} \{x - E(X)\}^2 f(x)\,dx = E[\{X - E(X)\}^2]
   for a continuous random variable    (4.2-14)

Since the value of E(X) is constant, the variance can be expressed as

Var(X) = \sum_i \{x_i - E(X)\}^2\, p(x_i) = \sum_i x_i^2\, p(x_i) - \{E(X)\}^2
   for a discrete random variable    (4.2-15)

Var(X) = \int_{-\infty}^{\infty} \{x - E(X)\}^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \{E(X)\}^2
   for a continuous random variable    (4.2-16)
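The shortcut form Var(X) = E(X²) − {E(X)}² of Eqs. (4.2-15, 16) is easy to verify
numerically; below is a minimal sketch for a discrete random variable, using a fair
die as an arbitrary example.

```python
# Fair die as an arbitrary example: x_i = 1..6, p(x_i) = 1/6
xs = range(1, 7)
p = 1 / 6

ex = sum(x * p for x in xs)                        # E(X), Eq. (4.2-9)
var_def = sum((x - ex) ** 2 * p for x in xs)       # Eq. (4.2-13): definition of Var(X)
var_short = sum(x ** 2 * p for x in xs) - ex ** 2  # Eq. (4.2-15): shortcut form

print(ex, var_def, var_short)  # 3.5, 2.9166..., 2.9166...
```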
4.2.11 Standard deviation s, sd(X) or SD(X)
The standard deviation is the square root of the variance.
The sample standard deviation is s in Eq. (4.2-2) as
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \quad\rightarrow\quad s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}

The standard deviation of a random variable is denoted by

sd(X) = SD(X) = \sqrt{Var(X)}    (4.2-17)
4.2.12 Coefficient of variation CV
CV = \frac{s}{\bar{x}} \quad \text{(for a sample; for a population, } CV = \sigma/\mu\text{)}
The coefficient of variation CV expresses the dispersion relative to the mean and
is a convenient measure for comparing the variability of different data sets or
random variables (probability distributions).
4.2.13 Median
The mean value is widely used to represent the central location of a distribution.
However, when a certain data value is extremely larger or smaller than the other
data values, that value influences the mean so significantly that the obtained mean
value does not always represent the central location of the distribution.
As an alternative representative value of the central location of a distribution,
free from such a fault, the median is frequently used and is defined as follows:
i) When arranging the data in increasing order, the median is the value in the
middle (when the number of data is even, the median is the average of the two
data values in the middle).
ii) For a continuous random variable X, the median is the value of X at which the
cumulative distribution function has a value of 0.5, that is,

0.5 = F(\tilde{x}) = \int_{-\infty}^{\tilde{x}} f(x)\,dx

The median is denoted by x̃ (x tilde).
4.2.14 Mode
As a quantity expressing an important feature of a distribution, there is the value
called the mode. The mode is the value at which the probability density function
has the peak value, as shown in Fig. 4.2-3, that is,

\frac{df(x)}{dx} = 0

Fig. 4.2-3 Mode: the value of x at which the probability density function f(x) peaks
For a discrete random variable X, the mode is the value of X at which the
probability mass function p(xi) has the highest value.
Conclusively, the mode is the value that has the highest probability.
4.2.15 Percentile
The 100pth percentile is the value a for which the cumulative distribution function
has the value p (0 < p < 1), that is,

F(a) = P(X \le a) = p
When p = 0.5, the 50th percentile is the median.
For a data set, when arranging the data in increasing order, the value at the
p(n+1)th location is the 100pth percentile. Unless the value p(n+1) is an integer,
the 100pth percentile is obtained by interpolation between the two sample values
neighboring the p(n+1)th location, as sketched below.
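A minimal sketch of this p(n+1) rule with linear interpolation follows; note that
statistical software offers several slightly different percentile conventions, so this
function implements only the rule stated above.

```python
def percentile(data, p):
    """100p-th percentile by the p(n+1) rule with linear interpolation."""
    xs = sorted(data)
    n = len(xs)
    pos = p * (n + 1)  # location in the ordered data (1-based)
    lo = int(pos)      # integer part
    frac = pos - lo    # fractional part used for interpolation
    if lo < 1:
        return xs[0]
    if lo >= n:
        return xs[-1]
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])

data = [12, 7, 9, 15, 10, 8, 11, 14, 13, 6]
print(percentile(data, 0.5))   # median: 10.5
print(percentile(data, 0.25))  # 25th percentile: 7.75
```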
4.3 Representative discrete probability distributions
Although there are many kinds of discrete probability distributions, only two
representative ones, the binomial and multinomial distributions, are necessary for
this course for the time being. So, these are described here.
4.3.1 Binomial distribution
The binomial distribution is a very convenient distribution for probabilistic
analyses of phenomena which have only two outcomes, such as tossing a coin
(head or tail) or testing to select defective products (pass or fail).
The probability that heads appear a particular number of times when a coin is
tossed several times, or the probability that i products are defective when testing n
products, can be calculated with the binomial distribution.
To try a random experiment for a phenomenon which has only two outcomes is
called a Bernoulli experiment or Bernoulli trial. If the two outcomes are defined as
success (labeled 1) and failure (labeled 0) and their probabilities are p and q,
respectively, then

p + q = 1    (4.3-1) (corresponds to (4.3-5)* in the Korean textbook)

If n Bernoulli trials are made independently and a random variable X counts the
number of successes, the probability that the number of successes will be x, that is,
X = x, is given by

p(x) = P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} = {}_{n}C_{x}\, p^x (1-p)^{n-x} = \frac{n!}{x!(n-x)!}\, p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n    (4.3-2)=(4.3-6)*
The notations \binom{n}{x} and {}_{n}C_{x} denote a combination of x objects from n
objects and are given by

\binom{n}{x} = {}_{n}C_{x} = \frac{n!}{x!(n-x)!}

\binom{n}{x} is also read as “n choose x”.
Eq. (4.3-2) is easy to understand.
If the number of successes is x, the number of failures will be (n − x). The
probability for this is p^x(1−p)^{n−x}. The x successes can occur anywhere among the n
trials, and the number of ways of arranging (or distributing) x successes in n trials is
given by a combination of x objects from n objects.
When the probability is expressed as Eq. (4.3-2), the random variable is said to
follow a binomial distribution and is written as
X~B(n, p)
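Eq. (4.3-2) can be instantiated directly with the combination function from the
Python standard library; the sketch below checks it against SciPy, which is
assumed to be available.

```python
from math import comb
from scipy import stats

def binom_pmf(x, n, p):
    """Eq. (4.3-2): P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
for x in range(n + 1):
    assert abs(binom_pmf(x, n, p) - stats.binom.pmf(x, n, p)) < 1e-12

print(binom_pmf(3, n, p))  # P(X = 3) ≈ 0.2668
```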
The expected value and variance of a B(n, p) random variable are calculated by
using the binomial theorem:

(a+b)^n = a^n + n a^{n-1} b + \frac{n(n-1)}{2} a^{n-2} b^2 + \cdots + \frac{n(n-1)\cdots(n-r+1)}{1 \cdot 2 \cdot 3 \cdots r}\, a^{n-r} b^r + \cdots + b^n = \sum_{r=0}^{n} {}_{n}C_{r}\, a^{n-r} b^r = \sum_{r=0}^{n} \binom{n}{r} a^{n-r} b^r    (4.3-3)=(4.3-7)*
The expected value and variance of a B(n, p) random variable are

E(X) = np    (4.3-4)=(4.3-8)*

Var(X) = np(1-p) = npq    (4.3-5)=(4.3-9)*

These values are obtained as follows:
E[X] = \sum_{x=0}^{n} x\, p(x) = \sum_{x=0}^{n} x\, {}_{n}C_{x}\, p^{x}(1-p)^{n-x}
     = \sum_{x=0}^{n} x\, \frac{n!}{x!(n-x)!}\, p^{x}(1-p)^{n-x}
     = \sum_{x=1}^{n} \frac{n!}{(x-1)!(n-x)!}\, p^{x}(1-p)^{n-x}
     = np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)![(n-1)-(x-1)]!}\, p^{x-1}(1-p)^{(n-1)-(x-1)}

(If we set y = x − 1)

     = np \sum_{y=0}^{n-1} \frac{(n-1)!}{y![(n-1)-y]!}\, p^{y}(1-p)^{(n-1)-y}
     = np\,[p + (1-p)]^{n-1} = np

V[X] = \sum_{x=0}^{n} x^2\, p(x) - (E[X])^2 = \sum_{x=0}^{n} x^2\, {}_{n}C_{x}\, p^{x}(1-p)^{n-x} - (np)^2
     = \sum_{x=0}^{n} x^2\, \frac{n!}{x!(n-x)!}\, p^{x}(1-p)^{n-x} - (np)^2
     = \sum_{x=0}^{n} [x(x-1) + x]\, \frac{n!}{x!(n-x)!}\, p^{x}(1-p)^{n-x} - (np)^2
     = \sum_{x=0}^{n} x(x-1)\, \frac{n!}{x!(n-x)!}\, p^{x}(1-p)^{n-x} + \sum_{x=0}^{n} x\, \frac{n!}{x!(n-x)!}\, p^{x}(1-p)^{n-x} - (np)^2
     = \sum_{x=2}^{n} \frac{n!}{(x-2)!(n-x)!}\, p^{x}(1-p)^{n-x} + E[X] - (np)^2
     = n(n-1)p^2 \sum_{x=2}^{n} \frac{(n-2)!}{(x-2)![(n-2)-(x-2)]!}\, p^{x-2}(1-p)^{(n-2)-(x-2)} + np - (np)^2

(By setting y = x − 2)

     = n(n-1)p^2 \sum_{y=0}^{n-2} \frac{(n-2)!}{y![(n-2)-y]!}\, p^{y}(1-p)^{(n-2)-y} + np - (np)^2
     = n(n-1)p^2\,[p + (1-p)]^{n-2} + np - (np)^2
     = n(n-1)p^2 + np - (np)^2 = -np^2 + np = np(1-p) = npq
A binomial distribution will be used in Sections 7.5 and 9.3.4.
If the success probability p is not too close to 0 or 1, and np ≥ 5 and n(1−p) ≥ 5,
the binomial distribution can be approximated by the normal distribution described
in the following section. For the approximation, the expected value and variance of
the binomial distribution, np and np(1−p), are used as the population mean μ and
variance σ² of the normal distribution, respectively.
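A quick numerical sketch of this approximation (SciPy assumed): with n = 100 and
p = 0.3, so that np = 30 ≥ 5 and n(1−p) = 70 ≥ 5, the binomial c.d.f. is compared
with the normal c.d.f. of matching mean and variance. A continuity correction of
0.5 is applied here, as is customary, although the text above does not mention it.

```python
from math import sqrt
from scipy import stats

n, p = 100, 0.3
mu, sigma = n * p, sqrt(n * p * (1 - p))  # np = 30 >= 5, n(1-p) = 70 >= 5

k = 35
exact = stats.binom.cdf(k, n, p)             # exact binomial P(X <= 35)
approx = stats.norm.cdf(k + 0.5, mu, sigma)  # normal approximation (continuity-corrected)
print(exact, approx)  # the two values are close
```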
If the random variable X is a binomial random variable, X ~ B(n, p), denoting the
number of successes, and the relative ratio of successes X/n is employed as a
random variable denoted by X̄, that is, X̄ = X/n, then the probability that X̄ takes
the value k/n is given by the binomial distribution

P\left(\bar{X} = \frac{k}{n}\right) = \binom{n}{k} p^k (1-p)^{n-k} = {}_{n}C_{k}\, p^k (1-p)^{n-k} = \frac{n!}{k!(n-k)!}\, p^k (1-p)^{n-k}    (4.3-6)=(4.3-10)*

The expected value and variance of X̄ are given by

E(\bar{X}) = p    (4.3-7)=(4.3-11)*

Var(\bar{X}) = \frac{p(1-p)}{n} = \frac{pq}{n}    (4.3-8)=(4.3-12)*

As k denotes the number of successes, the random variable X̄ = k/n has the
binomial distribution of Eq. (4.3-6). Its expected value and variance can be
obtained similarly to the derivation above.
The binomial distribution table may be found in some books on statistics. A part
of the table is shown in Table 4.3-1.
Table 4.3-1. Binomial distribution table (cumulative probabilities P(X ≤ k)) for n = 10

 k \ p  0.05    0.1     0.15    0.2     0.25    0.3     0.35    0.4     0.45    0.5
 0      0.5987  0.3487  0.1969  0.1074  0.0563  0.0282  0.0135  0.0060  0.0025  0.0010
 1      0.9139  0.7361  0.5443  0.3758  0.2440  0.1493  0.0860  0.0464  0.0233  0.0107
 2      0.9885  0.9298  0.8202  0.6778  0.5256  0.3828  0.2516  0.1673  0.0996  0.0547
 3      0.9990  0.9872  0.9500  0.8791  0.7759  0.6496  0.5138  0.3823  0.2660  0.1719
 4      0.9999  0.9984  0.9901  0.9672  0.9219  0.8497  0.7515  0.6331  0.5044  0.3770
 5      1.0000  0.9999  0.9986  0.9936  0.9803  0.9527  0.9051  0.8338  0.7384  0.6230
 6              1.0000  0.9999  0.9991  0.9965  0.9894  0.9740  0.9452  0.8980  0.8281
 7                      1.0000  0.9999  0.9996  0.9984  0.9952  0.9877  0.9726  0.9453
 8                              1.0000  1.0000  0.9999  0.9995  0.9983  0.9955  0.9893
 9                                              1.0000  1.0000  0.9999  0.9997  0.9990
As a binomial distribution can be approximated by a normal distribution if n is
large, the expected value and variance of Eqs. (4.3-7, 8) are widely used for the
inference of approval ratings in public opinion polls and so on.
4.3.2 Multinomial distribution
The multinomial distribution is used for probabilistic analyses of phenomena
which have three or more outcomes, such as throwing a die.
Consider a random event or a random trial which can have k possible outcomes
A₁, …, A_k that occur with probabilities p₁, …, p_k satisfying p₁ + p₂ + ⋯ + p_k = 1.
When n trials are performed, the probability that the outcomes A₁, …, A_k occur
x₁, …, x_k times, respectively, is given by

p(x_1, x_2, \ldots, x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}    (4.3-9)=(4.3-13)*

for nonnegative integers x₁, …, x_k with x₁ + x₂ + ⋯ + x_k = n.
The probability distribution expressed as Eq. (4.3-9) is called a multinomial
distribution and is denoted by

(X_1, \ldots, X_k) \sim MN_k(n, p_1, \ldots, p_k)
This distribution is used in Sections 7.2 and 7.5.
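Below is a minimal sketch of Eq. (4.3-9) for a fair die thrown n = 12 times (an
arbitrary example), checked against scipy.stats.multinomial, which is assumed to
be available.

```python
from math import factorial, prod
from scipy import stats

def multinomial_pmf(xs, ps):
    """Eq. (4.3-9): probability that outcome i occurs xs[i] times in n = sum(xs) trials."""
    n = sum(xs)
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)  # multinomial coefficient n!/(x1! x2! ... xk!)
    return coef * prod(p ** x for p, x in zip(ps, xs))

ps = [1 / 6] * 6         # fair die: p1 = ... = p6 = 1/6
xs = [2, 2, 2, 2, 2, 2]  # each face appearing twice in 12 throws

print(multinomial_pmf(xs, ps))
print(stats.multinomial.pmf(xs, n=12, p=ps))  # same value
```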
Although the Poisson distribution is described in the Korean textbook, it is
omitted here because it is not strictly necessary within the scope of this course.
4.4 The normal distribution as a representative continuous probability distribution
and the central limit theorem
4.4.1 Normal distribution and standard normal distribution
The normal distribution is the most important and representative continuous
probability distribution, describing well many naturally occurring phenomena and
the distribution of measurement errors.
Remember that the averaged strength model already described in Section 3.3.1 as
a material strength model is related to the normal distribution.
The probability density function (p.d.f.) of the normal distribution is given as

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty    (4.4-1)

where μ and σ² are the population mean and variance, respectively.
When a random variable X follows a normal distribution with mean μ and variance
σ², we denote X ~ N(μ, σ²) and say the random variable X is “normally distributed.”
As the normal distribution was discovered by Karl Friedrich Gauss (1777~1855)
through his study of error theory, it is also called the Gaussian distribution, a name
often used even more frequently than “normal distribution.” Remember it!
The normal distribution is symmetrical about its mean value μ and takes the
well-known bell-shaped curve as shown in Table 4.4-1.
The probability of a normal distribution,

P(a \le X \le b) = \int_a^b f(x)\,dx = \int_a^b \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx    (4.4-2)

cannot be obtained easily because there is no closed-form solution for the
integral.
In order to calculate the probability, Eq. (4.4-2) is standardized as follows.
If we transform the variable X of Eq. (4.4-2) into Z given by

Z = \frac{X - \mu}{\sigma}    (4.4-3)

Eq. (4.4-2) is changed to

P(a \le X \le b) = \int_a^b f(x)\,dx = \int_a^b \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = \int_{a'}^{b'} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}} dz = \int_{a'}^{b'} f(z)\,dz    (4.4-4)

where

a' = \frac{a - \mu}{\sigma}, \quad b' = \frac{b - \mu}{\sigma}    (4.4-5)

The new random variable Z = (X − μ)/σ is called the standard normal random
variable, and its probability density function f(z) is called the probability density
function of the standard normal distribution:

f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty    (4.4-6)
This probability density function has mean μ = 0 and variance σ² = 1.
The standard normal distribution is denoted by N(0, 1).
The probability of the standard normal distribution is given as a function of z in
tabular forms such as Table 4.4-1.
The probability in Table 4.4-1 corresponds to

P(0 \le Z \le z) = p = \int_0^z \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}} dx    (4.4-7)
Table 4.4-1 Table of the standard normal distribution-1
Depending on the table, the probability

P(Z \le z) = p = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}} dx    (4.4-8)

or

P(Z \ge z) = p = \int_{z}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}} dx    (4.4-9)

happens to be given instead. Caution is therefore needed when using tables of the
standard normal distribution.
The probability P(a ≤ X ≤ b) can be calculated through the transformation of the
variable as follows:

P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le \frac{X-\mu}{\sigma} \le \frac{b-\mu}{\sigma}\right) = P(a' \le Z \le b')
Example: Calculate the probability P(2 ≤ X ≤ 10) for X ~ N(μ, σ²) = N(5, 4²).
Solution)

P(2 \le X \le 10) = P\left(\frac{2-5}{4} \le \frac{X-5}{4} \le \frac{10-5}{4}\right) : transformation of variable
= P(-0.75 \le Z \le 1.25)
= P(-0.75 \le Z \le 0) + P(0 \le Z \le 1.25)
= P(0 \le Z \le 0.75) + P(0 \le Z \le 1.25) : use of symmetry
= 0.2734 + 0.3944 = 0.6678 : from Table 4.4-1
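The same probability can be checked with the standard normal c.d.f. Φ, since
P(2 ≤ X ≤ 10) = Φ(1.25) − Φ(−0.75); below is a sketch assuming SciPy is available.

```python
from scipy.stats import norm

mu, sigma = 5, 4
a_dash = (2 - mu) / sigma   # a' = -0.75
b_dash = (10 - mu) / sigma  # b' =  1.25

p = norm.cdf(b_dash) - norm.cdf(a_dash)
print(p)  # ≈ 0.6678, matching the table-based calculation
```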
A more detailed table of the standard normal distribution is shown for z ≥ 3.4 in
Table 4.4-2.
Table 4.4-2 Table of the standard normal distribution-2: entries are P(0 ≤ Z ≤ z),
where z = row value + column value (second decimal place)

 z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
 3.4   .49966  .49968  .49969  .49970  .49971  .49972  .49973  .49974  .49975  .49976
 3.5   .49977  .49978  .49978  .49979  .49980  .49981  .49981  .49982  .49983  .49983
 3.6   .49984  .49985  .49985  .49986  .49986  .49987  .49987  .49988  .49988  .49989
 3.7   .49989  .49990  .49990  .49990  .49991  .49991  .49992  .49992  .49992  .49992
 3.8   .49993  .49993  .49993  .49994  .49994  .49994  .49994  .49995  .49995  .49995
 3.9   .49995  .49995  .49996  .49996  .49996  .49996  .49996  .49996  .49997  .49997
The following probabilities are important properties of the normal distribution.
1. The probability that a variable takes a value within one standard deviation
about its mean, P(μ − σ ≤ X ≤ μ + σ):

P(\mu - \sigma \le X \le \mu + \sigma) = P\left(-1 \le \frac{X-\mu}{\sigma} \le 1\right) = P(-1 \le Z \le 1) = 2P(0 \le Z \le 1) = 2 \times 0.3413 = 0.6826    (4.4-10)

2. The probability that a variable takes a value within two standard deviations
about its mean, P(μ − 2σ ≤ X ≤ μ + 2σ):

P(\mu - 2\sigma \le X \le \mu + 2\sigma) = P(-2 \le Z \le 2) = 2P(0 \le Z \le 2) = 2 \times 0.4773 = 0.9546    (4.4-11)

3. The probability that a variable takes a value within three standard deviations
about its mean, P(μ − 3σ ≤ X ≤ μ + 3σ):

P(\mu - 3\sigma \le X \le \mu + 3\sigma) = P(-3 \le Z \le 3) = 2P(0 \le Z \le 3) = 2 \times 0.4987 = 0.9974    (4.4-12)
The probability that a variable is included in the range of ±3σ is 99.7%. This
means that almost all sampled data may be included in the range of ±3σ.
Therefore, the range of ±3σ has long been employed as an upper limit for sampled
data. Recently, the concept of six sigma has been introduced as a management
strategy, and the probability for values of z larger than 3 is then needed.
A table of the standard normal distribution for z ≥ 4 is shown in Table 4.4-3,
where the probability is defined as

P(Z \ge z) = \alpha = \int_z^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}} dx
Table 4.4-3 Table of the standard normal distribution-3: α = P(Z ≥ z)

 z     α            z     α            z     α            z     α             z     α
 4.00  3.1686×10⁻⁵  4.50  3.4008×10⁻⁶  5.00  2.8710×10⁻⁷  6.00  9.9012×10⁻¹⁰  7.00  1.2881×10⁻¹²
 4.05  2.5622×10⁻⁵  4.55  2.6849×10⁻⁶  5.10  1.7012×10⁻⁷  6.10  5.3238×10⁻¹⁰  7.10  6.2805×10⁻¹³
 4.10  2.0669×10⁻⁵  4.60  2.1146×10⁻⁶  5.20  9.9834×10⁻⁸  6.20  2.8347×10⁻¹⁰  7.20  3.0320×10⁻¹³
 4.15  1.6633×10⁻⁵  4.65  1.6615×10⁻⁶  5.30  5.8022×10⁻⁸  6.30  1.4947×10⁻¹⁰  7.30  1.4500×10⁻¹³
 4.20  1.3354×10⁻⁵  4.70  1.3023×10⁻⁶  5.40  3.3396×10⁻⁸  6.40  7.8049×10⁻¹¹  7.40  6.8612×10⁻¹⁴
 4.25  1.0696×10⁻⁵  4.75  1.0183×10⁻⁶  5.50  1.9036×10⁻⁸  6.50  4.0358×10⁻¹¹  7.50  3.2196×10⁻¹⁴
 4.30  8.5460×10⁻⁶  4.80  7.9435×10⁻⁷  5.60  1.0746×10⁻⁸  6.60  2.0665×10⁻¹¹  7.60  1.4988×10⁻¹⁴
 4.35  6.8121×10⁻⁶  4.85  6.1815×10⁻⁷  5.70  6.0077×10⁻⁹  6.70  1.0479×10⁻¹¹  7.70  6.8834×10⁻¹⁵
 4.40  5.4170×10⁻⁶  4.90  4.7987×10⁻⁷  5.80  3.3261×10⁻⁹  6.80  5.2616×10⁻¹²  7.80  3.1086×10⁻¹⁵
 4.45  4.2972×10⁻⁶  4.95  3.7163×10⁻⁷  5.90  1.8236×10⁻⁹  6.90  2.6161×10⁻¹²  7.90  1.4433×10⁻¹⁵
4.4.2 Central limit theorem
Consider the random variables X₁, X₂, …, X_n with means μ₁, μ₂, …, μ_n and
variances σ₁², σ₂², …, σ_n², and further another random variable Y expressed as

Y = \frac{\sum X_i - \sum \mu_i}{\sqrt{\sum \sigma_i^2}}    (4.4-13)
If the random variables X₁, X₂, …, X_n follow normal distributions, the random
variable Y will have the standard normal distribution N(0, 1). Even when the random
variables X₁, X₂, …, X_n do not follow normal distributions, if no single variable
contributes significantly to the sum, the limit distribution of the random variable Y
tends to the standard normal distribution N(0, 1) as n increases to infinity. This is
known as the central limit theorem.
The central limit theorem explains why many naturally occurring phenomena can
be approximated by the normal distribution. The theorem is known to hold well as
long as n ≥ 30.
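A small Monte Carlo sketch of the theorem (assuming NumPy): sample means of
n = 30 draws from a decidedly non-normal, exponential population are standardized
as in Eq. (4.4-13), and their distribution is compared with N(0, 1). The seed and
sample sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 2.0  # exponential with scale 2 has mean 2 and standard deviation 2
n, trials = 30, 100_000

samples = rng.exponential(scale=2.0, size=(trials, n))
y = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))  # standardized as in Eq. (4.4-13)

print(y.mean(), y.std())        # ≈ 0 and ≈ 1
print(np.mean(np.abs(y) <= 1))  # ≈ 0.683, as expected for N(0, 1)
```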
If all the random variables X₁, X₂, …, X_n are obtained from a single, identical
population with mean μ and variance σ², then

\sum \mu_i = \mu_1 + \mu_2 + \cdots + \mu_n = n\mu, \qquad \sum \sigma_i^2 = \sigma^2 + \sigma^2 + \cdots + \sigma^2 = n\sigma^2

and

\sum X_i = X_1 + X_2 + \cdots + X_n = n\bar{X}

are obtained. Consequently,

Y = \frac{\sum X_i - \sum \mu_i}{\sqrt{\sum \sigma_i^2}} = \frac{n\bar{X} - n\mu}{\sqrt{n\sigma^2}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}    (4.4-14)

Owing to the central limit theorem, Eq. (4.4-14) implies that the sample mean X̄
of a random variable X follows the normal distribution N(μ, σ²/n).
This can be successfully utilized for probabilistic analysis of the sample mean
obtained from a population that does not have a normal distribution.
The important point is that this holds for the sample mean, not for individual
sample data.
Example) A new material has been developed. Although the probability
distribution of the hardness of the material is not clear, the mean and standard
deviation of its Vickers hardness are known to be 180 and 18, respectively. When
36 samples are obtained from the material,
a) How can the distribution of the mean value of hardness be described, owing to
the central limit theorem?
b) Calculate the probability that the mean value of hardness exceeds 190.
c) Calculate the probability that the mean value of hardness is smaller than 175.
Solutions)
a) The sample mean X̄ follows the normal distribution N(μ, σ²/n).
From μ = 180, σ = 18 and n = 36, the distribution of X̄ is approximated by
N(180, 18²/36) = N(180, 3²)
b)

P(\bar{X} > 190) = P\left(\frac{\bar{X} - 180}{3} > \frac{190 - 180}{3}\right) = P(Z > 3.33) = 0.5 - P(0 \le Z \le 3.33) = 0.5 - 0.4996 = 0.0004

c)

P(\bar{X} < 175) = P\left(\frac{\bar{X} - 180}{3} < \frac{175 - 180}{3}\right) = P(Z < -1.667) = P(Z > 1.667) : use of symmetry
= 0.5 - P(0 \le Z \le 1.667) = 0.5 - 0.4522 = 0.0478
4.5 Main features of the normal distribution
As most naturally occurring phenomena follow the normal distribution, it is
very useful to understand well the important features of the normal distribution.
Feature-1)
If the random variable X has a normal distribution N(μ, σ²), then the random
variable Y = aX + b (a and b are constants) follows the normal distribution
N(aμ + b, a²σ²).
Feature-2)
If the independent random variables X and Y have normal distributions N(μx, σx²)
and N(μy, σy²), respectively, then the random variable U = aX + bY (a and b are
real numbers) follows the normal distribution N(aμx + bμy, a²σx² + b²σy²).
Feature-2) is an important, basic feature of the normal distribution. The
following Features-3) and 4) are derived from it. It is desirable to memorize
Feature-2).
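Feature-2) is easy to confirm by simulation; the sketch below (assuming NumPy)
uses arbitrarily chosen X ~ N(1, 2²), Y ~ N(3, 1²), a = 2 and b = −1, so that U
should follow N(2·1 − 3, 2²·2² + (−1)²·1²) = N(−1, 17).

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000
x = rng.normal(1, 2, N)  # X ~ N(1, 2^2)
y = rng.normal(3, 1, N)  # Y ~ N(3, 1^2)

u = 2 * x - y             # U = aX + bY with a = 2, b = -1
print(u.mean(), u.var())  # ≈ -1 (= 2*1 - 3) and ≈ 17 (= 2^2*2^2 + (-1)^2*1^2)
```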
Feature-3)
If all the random variables X₁, X₂, …, X_n follow a normal distribution N(μ, σ²),
then the arithmetic mean

\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i

follows the normal distribution N(μ, σ²/n).
Comparing the distribution of the arithmetic mean X̄ with that of the random
variable X, the distribution of X̄ has a thinner, sharper bell-shaped curve than that
of X, as shown in Fig. 4.5-1.
Fig. 4.5-1 Comparison between the distributions of X and X̄
Since Feature-3) is frequently used in statistics, it is very useful to memorize it.
Feature-4)
If the two independent sets of sample random variables X₁, X₂, …, X_n and
Y₁, Y₂, …, Y_m follow normal distributions N(μx, σx²) and N(μy, σy²), respectively,
then the difference (X̄ − Ȳ) between the two arithmetic means

\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{and} \quad \bar{Y} = \frac{1}{m}\sum_{i=1}^{m} Y_i

follows the normal distribution

N\left(\mu_x - \mu_y,\ \frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}\right)