Statistical Quality Control in Textiles
Module 2:
Statistical Description of Quality
Dr. Dipayan Das
Assistant Professor
Dept. of Textile Technology
Indian Institute of Technology Delhi
Phone: +91-11-26591402
E-mail: [email protected]
Random Variable
The science and application of statistics deal with quantities that can vary. Such quantities are called variables. Generally, there are two types of variables, depending on the origin and nature of the values that they can take.
A variable whose values are obtained by comparing the variable with a standard measuring scale, so that it can take any number, whether whole or fractional, is known as a continuous variable.
A variable whose values are obtained simply by counting, so that they are always whole numbers or integers, is known as a discontinuous or discrete variable.
Characteristics of Random Variable
A random variable is generally characterized by its statistical as well as its probability characteristics.
The basic statistical characteristics include the mean (a measure of the central tendency of the data) and the standard deviation (a measure of the variation in the data). To get more information on the data, the frequency distribution of the data needs to be evaluated.
The probability characteristics include the probability distribution and its parameters.
Continuous Random Variable
The Continuous Random Variable x
Let x be a continuous random variable, x ∈ [x_min, x_max]. Let the number of measurements be n. Then x takes the values x_1, x_2, x_3, …, x_n. Alternately, we write that x takes the values x_j, where j = 1, 2, 3, …, n. The measured values of x are different, because the variable x is a random variable. In practice, the number of measurements is limited mainly by the time available and the capacity of the measuring instrument. Here, we consider that the number of measurements is very large and can be increased without any limitation of time.
Statistical Characteristics of x
1 j n
Mean: x   x j
n j 1
2
1 j n
1 j n 2
s    x j  x     x j  2x j x  x 2  
n j 1
n j 1
Variance:
2
j n
1 j n 2
1 j n
1 j n 2
1
2 1
  x j  2 x  x j  x 1   x j  2 x x  x 2 n 
n j 1
n j 1
n j 1
n j 1
n
x
1 j n 2
1 j n 2
2
2
  x j  2x  x   x j  x 2  x2  x 2
n j 1
n j 1
where
1 j n 2
x   xj
n j 1
2
is the mean of the square values of x.
Standard deviation: s   s 2
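The identity s² = (mean of squares) − (square of mean) is easy to check numerically. The following Python sketch is only an illustration (the function name and the sample values are arbitrary, not from the module); it uses the 1/n convention adopted here:

```python
import math

def describe(values):
    """Return mean, variance (1/n convention), and standard deviation."""
    n = len(values)
    mean = sum(values) / n
    mean_sq = sum(v * v for v in values) / n   # mean of the square values
    variance = mean_sq - mean ** 2             # s^2 = mean(x^2) - mean(x)^2
    return mean, variance, math.sqrt(variance)

# Example with a few arbitrary measurements
x = [14.1, 15.0, 13.1, 15.8, 14.4]
mean, var, sd = describe(x)
print(f"mean={mean:.3f}, variance={var:.3f}, std={sd:.3f}")
```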
Distribution of x
Let us divide the data domain [x_min, x_max] into m classes of constant class interval δx as follows:
$$\delta x = \frac{x_{max}-x_{min}}{m}$$
Let us mark the classes by the serial number i = 1, 2, …, m. Then, we get

| Serial No. (i) | Class Interval (Lower Limit to Upper Limit) | Class Value (Mid-Value) (x_i) | Class Frequency (n_i) | Relative Frequency (g_i) | Relative Frequency Density (f_i) |
|---|---|---|---|---|---|
| 1 | x_min to x_min + δx | x_min + δx/2 | n_1 | g_1 = n_1/n | f_1 = g_1/δx |
| 2 | x_min + δx to x_min + 2δx | x_min + 3δx/2 | n_2 | g_2 = n_2/n | f_2 = g_2/δx |
| … | … | … | … | … | … |
| m | x_max − δx to x_max | x_max − δx/2 | n_m | g_m = n_m/n | f_m = g_m/δx |
| Total | | | n = Σn_i | 1 = Σg_i | |
Histogram of x [1]

[Figure: three histograms of the relative frequency density f_i over the domain x_min to x_max, drawn with m = 6, m = 12, and m = 24 classes of width δx; as m grows, the steps of the histogram become finer.]
Observation
As the number of classes increases, the width of each class decreases. The contours of the histogram keep roughly the same shape, but the steps become smoother until they "diminish" and become "infinitely small". This is valid if and only if the chosen "higher" number of classes is still very small compared with the number of measurements, that is, m << n. For example, if we choose the number of classes significantly higher than the number of measurements, then some classes may have zero or very small frequency, and the shape of the histogram will change significantly. Therefore, we modify our procedure: we always double the number of measurements before doubling the number of classes. Then the class frequencies n_i are "approximately" doubled and the relative frequencies of these classes remain "approximately" unchanged. Thus, we ensure that after doubling the number of classes it always holds that m << n.
Statistical Characteristics
For a finite (limited) number m of classes, the statistical characteristics of the random variable x are described below. It is known that, within a given class, a measured value x_j does not differ by more than δx/2 from the class value x_i. For simplicity, we consider that all values in a given class are equal to the class value x_i. Then,

Mean:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{m} x_i n_i = \frac{1}{n}\sum_{i=1}^{m} x_i n g_i = \sum_{i=1}^{m} x_i g_i = \sum_{i=1}^{m} x_i f_i\,\delta x$$

Mean of square values:
$$\overline{x^2} = \frac{1}{n}\sum_{i=1}^{m} x_i^2 n_i = \frac{1}{n}\sum_{i=1}^{m} x_i^2 n g_i = \sum_{i=1}^{m} x_i^2 g_i = \sum_{i=1}^{m} x_i^2 f_i\,\delta x$$
Statistical Characteristics (Continued)
Variance:
$$s^2 = \frac{1}{n}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 n_i = \frac{1}{n}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 n g_i = \sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 f_i\,\delta x$$
$$= \sum_{i=1}^{m}\left(x_i^2 - 2x_i\bar{x} + \bar{x}^2\right) f_i\,\delta x = \sum_{i=1}^{m}x_i^2 f_i\,\delta x - 2\bar{x}\sum_{i=1}^{m}x_i f_i\,\delta x + \bar{x}^2\sum_{i=1}^{m}f_i\,\delta x$$
$$= \overline{x^2} - 2\bar{x}\,\bar{x} + \bar{x}^2 \cdot 1 = \overline{x^2} - \bar{x}^2$$
(using Σf_i δx = Σg_i = 1)

Standard deviation:
$$s = \sqrt{s^2}$$
Discussion
Here, we used the class value x_i for all calculations. This value may differ by up to δx/2 from the real measured value x_j. As a result, the statistical characteristics obtained by using the class value are erroneous, and this error decreases as the class width δx decreases.
Let us now decrease the class width by (a) increasing the number of classes, say, twice and (b) also increasing the number of measurements, say, twice, and repeat this procedure to infinity.
As a result, intuitively, the class width becomes smaller and smaller until it becomes "infinitesimal". Such a class with infinitely small width is defined as an "elementary class"; its width is denoted by the differential symbol dx, instead of the symbol δx used to denote a higher or finite value. Then,
Discussion (Continued)
1.) The contours of the histogram keep roughly the same shape, but the steps become smoother until they "diminish" and become "infinitely small". The contours of the histogram change into a continuous function called the probability density function f(x).

[Figure: the probability density curve f(x) over the domain x_min to x_max, with an elementary class of width dx marked at the point x.]

2.) As the number of classes is infinitely high, it is impossible to identify them by serial numbers i. The elementary class having lower limit x and upper limit x + dx will simply be called the "elementary class of x".
3.) The area under each elementary class of x is f(x)dx. This product expresses the relative frequency of x in an elementary class of lower limit x and upper limit x + dx.
Discussion (Continued)
4.) The area under the probability density curve still remains one. Thus,
$$\int_{x_{min}}^{x_{max}} f(x)\,dx = 1.$$
In other words, the integration of all probabilities ("cumulative probability") from x_min to x_max equals one. It is possible to find the cumulative probability of x from the following expression:
$$F(x) = \int_{x_{min}}^{x} f(w)\,dw.$$
The function F(x) is known as the cumulative distribution function, or simply the distribution function.

Note: Here we use the integral expression $\int_{x_{min}}^{x_{max}}$ instead of the summation expression $\sum_{i=1}^{m}$.

Remark: For simplicity, we assume that the domain of values of x is the finite (closed) interval [x_min, x_max]. It can be proved that the results remain valid even when x_min → −∞ and x_max → ∞.
Statistical Characteristics
1) We use the "relative frequency" f(x)dx, which belongs to the "elementary class of x", instead of the relative frequency g_i = f_i δx of the i-th class. (As the elementary class width is infinitely small, the error of calculation mentioned before is thus eliminated.)
2) The value x is used as the class value of the elementary class of x, instead of the middle value x_i of the i-th class.
3) We use the integral expression $\int_{x_{min}}^{x_{max}}$ instead of the summation expression $\sum_{i=1}^{m}$.
Then, the following expressions are valid:

Mean:
$$\bar{x} = \int_{x_{min}}^{x_{max}} x\,f(x)\,dx$$

Mean of square values:
$$\overline{x^2} = \int_{x_{min}}^{x_{max}} x^2 f(x)\,dx$$
Statistical Characteristics (Continued)
Variance:
$$s^2 = \int_{x_{min}}^{x_{max}} \left(x-\bar{x}\right)^2 f(x)\,dx$$

Standard deviation:
$$s = \sqrt{s^2} = \sqrt{\int_{x_{min}}^{x_{max}} \left(x-\bar{x}\right)^2 f(x)\,dx}$$

r-th central moment:
$$m_r = \int_{x_{min}}^{x_{max}} \left(x-\bar{x}\right)^r f(x)\,dx$$

r-th non-central moment:
$$m_r' = \int_{x_{min}}^{x_{max}} \left(x-0\right)^r f(x)\,dx$$
Probability
According to the classical definition of probability, it is the ratio of the number of successful outcomes to the number of all outcomes. If we have n measurements and n_i measurements belong to the i-th class (i = 1, 2, …, m), then the probability that a randomly chosen value belongs to the i-th class is
$$P_i = \frac{n_i}{n} = g_i = f_i\,\delta x$$
We see that probability and relative frequency possess the same value; that is, probability is relative frequency and vice versa. Relative frequency is used when we would like to characterize a value which is already measured; it is used "ex post". In contrast, probability is used to explore the future based on past investigation; hence probability is used "ex ante".
The earlier concept of "relative frequency" and "probability" for a class of certain width is also applicable to an "elementary class". Thus, we understand f(x)dx not only as the relative frequency of x in the elementary class of lower limit x and upper limit x + dx, but also as the "probability of occurrence" (of future measured values) of x in the elementary class.
Normal Distribution
Normal Probability Distribution
Let us consider x_min → −∞ and x_max → ∞ and assume that x follows the normal probability distribution. Then, its probability density function takes the following form
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$
where
$$\mu = \int_{-\infty}^{\infty} x\,f(x)\,dx, \qquad \sigma^2 = \int_{-\infty}^{\infty} \left(x-\mu\right)^2 f(x)\,dx$$

[Figure: normal density curves f(x) with μ = 0 and σ = 0.5, σ = 1, and σ = 2; the smaller the σ, the taller and narrower the curve.]
Standard Normal Probability Distribution
Consider a variable u such that
$$u = \frac{x-\mu}{\sigma}, \qquad u \in (-\infty, \infty).$$
Assume that u follows the normal distribution with mean equal to zero and standard deviation equal to one. Then, its probability density function φ(u) is
$$\varphi(u) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{u^2}{2}\right)$$
u is called the standard normal variable.

| u | φ(u) |
|---|---|
| 0 | 0.3989 |
| 1 or −1 | 0.2419 |
| 2 or −2 | 0.0540 |
| 3 or −3 | 0.0044 |
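The tabulated values of φ(u) can be reproduced with a one-line density function; a minimal Python sketch (function name assumed for illustration):

```python
import math

def phi(u):
    """Standard normal probability density function."""
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

for u in (0, 1, 2, 3):
    print(f"phi({u}) = {phi(u):.4f}")   # 0.3989, 0.2420, 0.0540, 0.0044 (to rounding)
```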
Standard Normal Probability Distribution (Continued)
The distribution function of u is shown below:
$$\Phi(u) = \int_{-\infty}^{u} \varphi(v)\,dv = \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{v^2}{2}\right) dv$$
This integral is known as the Laplace-Gauss integral. It has no analytical solution, but it can be solved by numerical integration.

[Figure: the S-shaped distribution function Φ(u) plotted against u.]
Standard Normal Probability Distribution (Continued)
Take u = 1:
$$\Phi(1) = \int_{-\infty}^{1}\varphi(v)\,dv = \int_{-\infty}^{0}\varphi(v)\,dv + \int_{0}^{1}\varphi(v)\,dv = 0.500000 + \frac{1}{\sqrt{2\pi}}\int_{0}^{1}\exp\left(-\frac{v^2}{2}\right)dv$$
Expanding the integrand into a power series and integrating term by term,
$$\frac{1}{\sqrt{2\pi}}\int_{0}^{u}\left(1-\frac{v^2}{2}+\frac{v^4}{8}-\frac{v^6}{48}+\frac{v^8}{384}-\frac{v^{10}}{3840}+\frac{v^{12}}{46080}-\cdots\right)dv = \frac{1}{\sqrt{2\pi}}\left(u-\frac{u^3}{6}+\frac{u^5}{40}-\frac{u^7}{336}+\frac{u^9}{3456}-\frac{u^{11}}{42240}+\frac{u^{13}}{599040}-\cdots\right)$$
so that
$$\Phi(1) = 0.500000 + \frac{1}{\sqrt{2\pi}}\left(1-\frac{1}{6}+\frac{1}{40}-\frac{1}{336}+\frac{1}{3456}-\frac{1}{42240}+\frac{1}{599040}-\cdots\right) = 0.500000 + 0.341345 \approx 0.8413$$
Consequently,
$$\Phi(-1) = 1 - \Phi(1) = 0.1587$$
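The same series is easy to evaluate by machine. A minimal Python sketch (the function name and term count are assumptions; the erf-based line is only a cross-check using the standard library):

```python
import math

def Phi_series(u, terms=20):
    """Phi(u) from the term-by-term integrated power series of exp(-v^2/2)."""
    total, power = 0.0, u
    for k in range(terms):
        # k-th term: (-1)^k * u^(2k+1) / ((2k+1) * 2^k * k!)
        total += (-1) ** k * power / ((2 * k + 1) * 2 ** k * math.factorial(k))
        power *= u * u
    return 0.5 + total / math.sqrt(2 * math.pi)

print(Phi_series(1.0))                           # ~0.841345
print(0.5 * (1 + math.erf(1 / math.sqrt(2))))    # cross-check via erf
```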
Relationships
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{\left(x-\mu\right)^2}{2\sigma^2}\right] = \frac{1}{\sigma}\cdot\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] = \frac{1}{\sigma}\,\varphi\!\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{\sigma}\,\varphi(u)$$

$$F(x) = \int_{-\infty}^{x} f(w)\,dw = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{\left(w-\mu\right)^2}{2\sigma^2}\right] dw = \int_{-\infty}^{\frac{x-\mu}{\sigma}} \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{v^2}{2}\right) dv = \Phi\!\left(\frac{x-\mu}{\sigma}\right) = \Phi(u)$$

where we substituted w = σv + μ, dw = σ dv.
Relationships (Continued)

[Figure: the normal density f(x), or equivalently φ(u), drawn with both the x-scale and the u-scale. The area under the curve between x̄ − 1s and x̄ + 1s (u = ±1) is 0.6827, between x̄ − 2s and x̄ + 2s (u = ±2) it is 0.9545, and between x̄ − 3s and x̄ + 3s (u = ±3) it is 0.9973.]
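These areas can be verified with the closed-form relation Φ(u) = (1 + erf(u/√2))/2, which needs only the standard library; a minimal sketch:

```python
import math

def Phi(u):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))

for k in (1, 2, 3):
    print(f"P(|u| <= {k}) = {Phi(k) - Phi(-k):.4f}")  # 0.6827, 0.9545, 0.9973
```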
Practical Example
Example: Yarn Strength (cN.tex-1) Dataset
14.11  14.99  15.08  13.14  13.21  15.79  13.78  15.65  15.47  14.41
15.85  14.84  12.26  11.93  14.08  15.32  14.57  16.80  14.31  13.69
15.16  15.12  17.03  13.09  17.97  14.41  12.35  13.69  15.58  13.90
16.38  15.36  15.21  16.49  13.99  12.86  11.82  14.31  15.05  14.92
15.65  14.48  14.45  16.14  14.62  16.80  12.52  15.76  11.87  14.08
13.25  14.67  15.10  15.10  14.38  14.04  15.67  15.44  14.67  12.93
12.40  15.90  16.53  14.43  13.01  14.45  14.62  15.77  17.12  13.40
13.56  13.62  13.40  14.05  13.62  15.26  14.67  14.08  13.44  14.67
14.87  13.35  12.72  13.40  13.78  17.06  14.53  14.18  11.98  15.58
17.51  16.14  13.94  13.31  14.84  13.45  15.58  15.90  13.17  16.53
14.08  15.85  15.46  14.17  13.35  13.41  13.25  15.90  15.03  15.56
12.42  14.16  15.90  14.58  15.90  13.40  14.03  15.44  13.44  14.82
14.43  13.67  15.42  14.84  14.18  16.17  15.36  13.62  13.62  12.44
15.21  16.43  14.97  12.86  14.67  14.08  13.73  16.34  12.72  16.01
13.78  12.90  14.31  14.53  14.99  15.44  14.08  15.44  14.85  13.41
13.69  12.72  12.72  14.18  15.41  14.87  16.94  14.38  13.40  17.89
16.70  11.09  17.71  13.84  14.08  14.92  13.81  13.39  17.09  14.62
14.94  14.68  15.05  13.78  14.48  13.60  16.63  14.18  14.41  13.22
13.29  14.92  15.62  16.09  13.28  15.67  14.99  14.71  10.57  14.92
14.84  15.68  15.05  14.84  15.10  15.10  12.72  14.09  14.31  15.65
14.67  15.94  13.30  12.29  14.41  10.84  17.64  12.34  16.69  13.99
13.11  15.16  12.23  14.15  15.44  13.89  16.19  15.85  13.73  14.18
14.31  12.80  15.34  15.31  17.17  12.95  14.62  15.44  13.32  15.34
12.72  14.08  13.51  12.91  13.50  13.26  15.62  15.08  14.92  16.53
14.40  14.76  14.67  13.14  14.08  16.96  13.44  14.31  13.79  13.89
15.68  15.86  13.84  13.06  14.87  14.71  12.23  16.32  14.84  14.54
13.78  14.67  15.90  14.53  13.21  13.06  13.53  17.36  14.92  16.34
14.57  13.44  13.85  15.94  13.78  13.60  14.76  14.84  13.60  14.58
15.47  14.99  12.47  16.08  14.31  14.99  12.53  13.25  12.81  16.11
16.35  16.48  12.47  14.08  13.78  12.60  13.35  13.51  13.06  15.58
13.89  13.87  15.12  15.36  12.98  16.19  13.51  14.18  14.53  12.19
12.96  15.70  16.32  15.90  14.31  14.35  15.20  16.19  15.15  13.17
13.69  14.18  13.21  14.31  15.26  14.99  14.72  15.49  14.84  15.62
15.12  12.91  13.21  15.67  16.43  17.12  14.53  14.62  13.69  15.68
11.44  14.53  12.93  13.30  14.13  15.03  15.68  14.31  16.14  13.85
13.55  15.65  14.67  11.97  13.89  14.97  14.58  15.68  14.43  13.44
15.16  17.49  13.82  15.35  13.48  14.41  14.08  14.67  14.99  16.96
15.71  13.85  14.52  13.94  12.44  14.09  12.72  14.84  16.14  15.94
15.16  15.01  14.18  16.70  14.59  14.31  15.21  12.72  13.89  14.41
15.16  14.31  16.53  15.16  14.67  14.08  11.92  13.56  14.41  15.37
15.21  16.35  13.35  14.92  13.62  16.80  15.71  14.99  14.82  13.62
14.53  15.26  15.12  14.84  16.34  16.11  15.90  15.21  13.06  14.04
13.44  15.58  15.31  16.96  15.58  14.31  15.65  18.02  12.32  14.77
13.42  14.31  15.58  15.90  14.62  14.26  16.43  13.81  15.16  14.22
14.31  13.40  13.21  15.16  15.22  15.81  14.18  16.14  16.11  16.80
Original Dataset: Statistical Characteristics
Let us denote yarn strength by x. Then,

Mean:
$$\bar{x}\,[\mathrm{cN\,tex^{-1}}] = \frac{1}{450}\sum_{j=1}^{450} x_j\,[\mathrm{cN\,tex^{-1}}] = 14.57$$

Variance:
$$s^2\,[\mathrm{cN^2\,tex^{-2}}] = \frac{1}{450}\sum_{j=1}^{450} x_j^2\,[\mathrm{cN\,tex^{-1}}]^2 - \bar{x}^2\,[\mathrm{cN\,tex^{-1}}]^2 = 1.52$$

Standard deviation:
$$s\,[\mathrm{cN\,tex^{-1}}] = 1.23$$
Grouped Dataset: Frequency Distribution

| Class Interval (cN·tex⁻¹) | Class Value x_i (cN·tex⁻¹) | Frequency n_i (-) | Relative Frequency g_i (-) | Relative Frequency Density f_i (cN⁻¹·tex) |
|---|---|---|---|---|
| 10.00-11.00 | 10.50 | 2 | 0.0044 | 0.0044 |
| 11.00-12.00 | 11.50 | 8 | 0.0178 | 0.0178 |
| 12.00-13.00 | 12.50 | 37 | 0.0822 | 0.0822 |
| 13.00-14.00 | 13.50 | 102 | 0.2267 | 0.2267 |
| 14.00-15.00 | 14.50 | 140 | 0.3111 | 0.3111 |
| 15.00-16.00 | 15.50 | 104 | 0.2311 | 0.2311 |
| 16.00-17.00 | 16.50 | 43 | 0.0956 | 0.0956 |
| 17.00-18.00 | 17.50 | 13 | 0.0289 | 0.0289 |
| 18.00-19.00 | 18.50 | 1 | 0.0022 | 0.0022 |
| TOTAL | | 450 | 1.0000 | |

(Here δx = 1 cN·tex⁻¹, so f_i equals g_i numerically.)
Grouped Dataset: Histogram

[Figure: histogram of the relative frequency density f (cN⁻¹·tex), from 0 to 0.4, against yarn strength x (cN·tex⁻¹), from 10 to 19.]
Grouped Dataset: Statistical Characteristics
Mean:
$$\bar{x}\,[\mathrm{cN\,tex^{-1}}] = \frac{1}{450}\sum_{i=1}^{9} n_i x_i\,[\mathrm{cN\,tex^{-1}}] = 14.56$$

Variance:
$$s^2\,[\mathrm{cN^2\,tex^{-2}}] = \frac{1}{450}\sum_{i=1}^{9} n_i x_i^2\,[\mathrm{cN\,tex^{-1}}]^2 - \bar{x}^2\,[\mathrm{cN\,tex^{-1}}]^2 = 1.69$$

Standard deviation:
$$s\,[\mathrm{cN\,tex^{-1}}] = 1.30$$
Comparison

| Statistical Characteristic | Original Dataset | Grouped Dataset |
|---|---|---|
| x̄ (cN·tex⁻¹) | 14.57 | 14.56 |
| s² (cN²·tex⁻²) | 1.52 | 1.69 |
| s (cN·tex⁻¹) | 1.23 | 1.30 |

The differences between the two columns illustrate the grouping error.
Fitting with Normal Distribution
For each class, the experimental distribution gives the class value x_i, the standardized value u_i = (x_i − x̄)/s, the frequency n_i, and the relative frequency density f_i; the experimental value of φ(u_i) follows as s·f_i. The theoretical (normal) distribution gives φ(u_i) from the PDF of the standard normal distribution, the theoretical density f_i = φ(u_i)/s, and the theoretical frequency n·δx·φ(u_i)/s.

| x_i (cN·tex⁻¹) | u_i = (x_i − x̄)/s (-) | n_i (-) | f_i (cN⁻¹·tex) | φ(u_i) = s·f_i, experimental (-) | φ(u_i), theoretical (-) | Theoretical frequency (-) |
|---|---|---|---|---|---|---|
| 10.50 | −3.12 | 2 | 0.0044 | 0.0057 | 0.0031 | 1.08 ≈ 1 |
| 11.50 | −2.35 | 8 | 0.0178 | 0.0231 | 0.0252 | 8.73 ≈ 9 |
| 12.50 | −1.58 | 37 | 0.0822 | 0.1069 | 0.1145 | 39.65 ≈ 40 |
| 13.50 | −0.82 | 102 | 0.2267 | 0.2947 | 0.2850 | 98.64 ≈ 99 |
| 14.50 | −0.05 | 140 | 0.3111 | 0.4044 | 0.3984 | 137.93 ≈ 138 |
| 15.50 | 0.72 | 104 | 0.2311 | 0.3004 | 0.3079 | 106.56 ≈ 106 |
| 16.50 | 1.49 | 43 | 0.0956 | 0.1248 | 0.1315 | 45.54 ≈ 45 |
| 17.50 | 2.26 | 13 | 0.0289 | 0.0376 | 0.0310 | 10.71 ≈ 11 |
| 18.50 | 3.03 | 1 | 0.0022 | 0.0029 | 0.0040 | 1.40 ≈ 1 |
| TOTAL | | 450 | | | | 450 |
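The theoretical column can be reproduced in a few lines. A minimal Python sketch (variable names are assumptions; x̄ and s are the grouped-dataset characteristics):

```python
import math

x_bar, s, n, dx = 14.56, 1.30, 450, 1.0   # grouped-dataset characteristics
class_values = [10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5]

phi = lambda u: math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

for xi in class_values:
    ui = (xi - x_bar) / s                 # standardized class value
    ni_theory = n * dx * phi(ui) / s      # theoretical class frequency
    print(f"x_i = {xi:5.2f}   u_i = {ui:+5.2f}   n_i,T = {ni_theory:6.2f}")
```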
Fitting with Normal Distribution (Continued)
[Figure: experimental values of φ(u_i) (points) and the theoretical standard normal density (curve) plotted against u_i; the points follow the curve closely.]
Checking for Normality
Checking for normality can be done in various ways:
1) Goodness of fit: chi-square test
2) Probability plot
3) Quantile-Quantile plot (QQ plot)
Goodness of Fit: Chi-square Test
1.) Hypothesis: The experimental frequency distribution follows the theoretical normal probability distribution.
2.) Test statistic:
$$\chi^2 = \sum_{i=1}^{m} \frac{\left(n_{i,E} - n_{i,T}\right)^2}{n_{i,T}},$$
where n_{i,E} is the experimental frequency, n_{i,T} is the theoretical frequency, and m is the number of classes. The test statistic follows the chi-square distribution with (m − c) degrees of freedom, where c denotes the number of constraints. Here we have three constraints: the total number of data must be the same in the experimental and theoretical distributions; the mean value must be the same in both; and the variance must be the same in both. Thus, c = 3.
Goodness of Fit: Chi-square Test (Contd.)
3.) Choice of significance level: Let us choose a significance level of α = 0.05. Our hypothesis will be rejected if χ² > χ²_{m−c,α}, where χ²_{m−c,α} is the chi-square percent point function with (m − c) degrees of freedom and significance level α. The values of this function can be obtained from a standard table. We obtain χ²_{6,0.05} = 12.5920.
4.) Computation: χ² = 1.9463.
5.) Conclusion: As χ² = 1.9463 < χ²_{6,0.05} = 12.5920, there is no reason to reject the hypothesis. Hence we conclude that the experimental frequency distribution follows the theoretical normal probability distribution.
Probability Plot
Steps for constructing the probability plot for checking against the normal distribution:
Step 1) Arrange the observations in ascending order of magnitude; let x(j) denote the j-th order statistic.
Step 2) Calculate their cumulative relative frequencies (j − 0.5)/n, where n denotes the number of observations.
Step 3) Plot 100(j − 0.5)/n against x(j). If a straight line, chosen subjectively, can pass through the points, the observations can be regarded as taken from a normal distribution. A good rule of thumb is to draw the line approximately between the 25th and 75th percentile points. If all the points are covered by a "fat pencil" lying along the straight line, a normal distribution adequately describes the data.
Normal Probability Plot

| j | x(j) | (j − 0.5)/n, n = 9 |
|---|---|---|
| 1 | 10.50 | 0.0556 |
| 2 | 11.50 | 0.1667 |
| 3 | 12.50 | 0.2778 |
| 4 | 13.50 | 0.3889 |
| 5 | 14.50 | 0.5000 |
| 6 | 15.50 | 0.6111 |
| 7 | 16.50 | 0.7222 |
| 8 | 17.50 | 0.8333 |
| 9 | 18.50 | 0.9444 |

[Figure: normal probability plot of 100(j − 0.5)/n (probability scale, 0.05 to 0.95) against the data x(j), 12 to 18; the points lie close to a straight line.]

As the majority of the points fall on the straight line, the observations can be regarded as taken from a population following the normal distribution.
Quantile-Quantile Plot (QQ Plot)
Steps for constructing the QQ plot for checking against the normal distribution:
Step 1) Arrange the observations in ascending order of magnitude; let x(j) denote the j-th order statistic.
Step 2) Calculate their cumulative relative frequencies (j − 0.5)/n, where n denotes the number of observations.
Step 3) Find the standardized normal scores u_j by using the following formula:
$$\frac{j-0.5}{n} = P\left(U \le u_j\right) = \Phi\left(u_j\right)$$
Step 4) Plot x(j) against u_j. If a straight line, chosen subjectively, can pass through the points, the observations can be regarded as taken from a normal distribution.
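Steps 1-3 can be sketched in a few lines, assuming scipy is available for the percent point function (the inverse of Φ); the variable names are illustrative:

```python
from scipy.stats import norm

x_sorted = [10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5]
n = len(x_sorted)

for j, xj in enumerate(x_sorted, start=1):
    p = (j - 0.5) / n            # cumulative relative frequency
    uj = norm.ppf(p)             # standardized normal score, inverse of Phi
    print(f"j={j}  x(j)={xj:5.2f}  p={p:.4f}  u_j={uj:+.2f}")
```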
QQ Plot (Continued)

| j | x(j) | (j − 0.5)/n, n = 9 | u_j (from standard normal table) |
|---|---|---|---|
| 1 | 10.50 | 0.0556 | −1.59 |
| 2 | 11.50 | 0.1667 | −0.97 |
| 3 | 12.50 | 0.2778 | −0.59 |
| 4 | 13.50 | 0.3889 | −0.28 |
| 5 | 14.50 | 0.5000 | 0 |
| 6 | 15.50 | 0.6111 | 0.28 |
| 7 | 16.50 | 0.7222 | 0.59 |
| 8 | 17.50 | 0.8333 | 0.97 |
| 9 | 18.50 | 0.9444 | 1.59 |

[Figure: QQ plot of the sample data (quantiles of input sample, 8 to 20) against the standard normal quantiles u_j (−2 to 2); the points lie close to a straight line.]

As the majority of the points fall on the straight line, the observations can be regarded as taken from a population following the normal distribution.
Discrete Random Variable
The Discrete Random Variable x
Let x be a discrete random variable, x ∈ [x_min, x_max], such that it can take only whole-number (integer) values. Let the number of observations be n. Then x takes the values x_1, x_2, x_3, …, x_n. Alternately, we write that x takes the values x_j, where j = 1, 2, 3, …, n. The observed values of x are different, because the variable x is a random variable. In practice, the number of observations is limited mainly by the time available and the cost of the sample. Here, we consider that the number of observations is very large and can be increased without any limitation of time.
Statistical Characteristics of x
1 j n
Mean: x   x j
n j 1
2
1 j n
1 j n 2
s    x j  x     x j  2x j x  x 2  
n j 1
n j 1
Variance:
2
j n
1 j n 2
1 j n
1 j n 2
1
2 1
  x j  2 x  x j  x 1   x j  2 x x  x 2 n 
n j 1
n j 1
n j 1
n j 1
n
x
1 j n 2
1 j n 2
2
2
  x j  2x  x   x j  x 2  x2  x 2
n j 1
n j 1
where
1 j n 2
x   xj
n j 1
2
is the mean of the square values of x.
Standard deviation: s   s 2
Distribution of x
Let us divide the data domain [x_min, x_max] into m classes, where each class corresponds to one single value. Let us mark the classes by the serial number i = 1, 2, …, m. Then, we get

| Serial No. (i) | Class value (x_i) | Class frequency (n_i) | Relative frequency (g_i) | Cumulative relative frequency (h_i) |
|---|---|---|---|---|
| 1 | x_min | n_1 | g_1 = n_1/Σn_i | h_1 = g_1 |
| 2 | x_2 | n_2 | g_2 = n_2/Σn_i | h_2 = g_1 + g_2 |
| … | … | … | … | … |
| m | x_max | n_m | g_m = n_m/Σn_i | h_m = Σg_i = 1 |
| Total | | n = Σn_i | Σg_i = 1 | |
Histogram of x

[Figure: bar chart of the relative frequencies g_i at the discrete values x_min, x_2, x_3, …, x_max.]
Statistical Characteristics
For a finite (limited) number m of classes, the statistical characteristics of the random variable x are described below.

Mean:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{m} x_i n_i = \frac{1}{n}\sum_{i=1}^{m} x_i n g_i = \sum_{i=1}^{m} x_i g_i$$

Mean of square values:
$$\overline{x^2} = \frac{1}{n}\sum_{i=1}^{m} x_i^2 n_i = \frac{1}{n}\sum_{i=1}^{m} x_i^2 n g_i = \sum_{i=1}^{m} x_i^2 g_i$$
Statistical Characteristics (Continued)
Variance:
$$s^2 = \frac{1}{n}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 n_i = \frac{1}{n}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 n g_i = \sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 g_i$$
$$= \sum_{i=1}^{m}\left(x_i^2 - 2x_i\bar{x} + \bar{x}^2\right) g_i = \sum_{i=1}^{m}x_i^2 g_i - 2\bar{x}\sum_{i=1}^{m}x_i g_i + \bar{x}^2\sum_{i=1}^{m}g_i = \overline{x^2} - 2\bar{x}\,\bar{x} + \bar{x}^2 = \overline{x^2} - \bar{x}^2$$

Standard deviation:
$$s = \sqrt{s^2}$$
Binomial Distribution
Bernoulli Trial
Let us consider that a bundle of fibers is being drawn by rollers, as happens in a draw frame, speed frame, or ring frame. Let us select four fibers (red color) and study their movement, that is, the probability that each of these four fibers passes the strip (yellow color) at a given time. We denote the occurrence of passing by the symbol "Y" and the occurrence of not passing by the symbol "N". Assume that these events are independent of each other.
Bernoulli Trial (Continued)
Let us list all possible outcomes. Here, x denotes the number of occurrences in which a fiber passes the strip, that is, the number of occurrences of "Y", and n = 4.

| Outcome | x | Outcome | x |
|---|---|---|---|
| NNNN | 0 | NYYN | 2 |
| NNNY | 1 | NYNY | 2 |
| NNYN | 1 | NYYY | 3 |
| NYNN | 1 | YYYN | 3 |
| YNNN | 1 | YNYY | 3 |
| YYNN | 2 | YYNY | 3 |
| YNYN | 2 | YYYY | 4 |
| YNNY | 2 | | |
| NNYY | 2 | | |

Let us now find the probability of x = 2. The outcomes are YYNN, YNYN, YNNY, NNYY, NYYN, and NYNY. The probability is equal to
[P(Y)P(Y)P(N)P(N)] + [P(Y)P(N)P(Y)P(N)] + [P(Y)P(N)P(N)P(Y)] + [P(N)P(N)P(Y)P(Y)] + [P(N)P(Y)P(Y)P(N)] + [P(N)P(Y)P(N)P(Y)]
Bernoulli Trial (Continued)
If we take P(Y) = 0.1, then P(N) = 0.9 (complementary probability). Then the probability can be calculated as
(0.1 × 0.1 × 0.9 × 0.9) + (0.1 × 0.9 × 0.1 × 0.9) + (0.1 × 0.9 × 0.9 × 0.1) + (0.9 × 0.9 × 0.1 × 0.1) + (0.9 × 0.1 × 0.1 × 0.9) + (0.9 × 0.1 × 0.9 × 0.1)
= 6 × (0.1)² × (0.9)²
= ⁴C₂ (0.1)² (0.9)⁴⁻²
= 0.0486
Example
Each sample of a chemical used in a textile dyeing process has a 10% chance of containing a pollutant. Find the probability that, in the next 20 samples, exactly 2 contain the pollutant. Assume that the samples are independent with regard to the presence of the pollutant.
Let x be the number of samples that contain the pollutant in the next 20 samples to be analyzed. Then x is a binomial random variable with p = 0.1 and n = 20. Then
$$f(x=2) = {}^{20}C_2\,(0.1)^2\,(1-0.1)^{20-2} = 190 \times (0.1)^2 \times (0.9)^{18} = 0.2852$$
Example (Continued)
Determine the probability that at least four samples contain the pollutant.
The required probability is
$$f(x \ge 4) = \sum_{x=4}^{20} {}^{20}C_x\,(0.1)^x\,(1-0.1)^{20-x}$$
However, it is easier to calculate this as follows:
$$f(x \ge 4) = 1 - f(x < 4) = 1 - \sum_{x=0}^{3} {}^{20}C_x\,(0.1)^x\,(0.9)^{20-x} = 1 - \left[0.1216 + 0.2702 + 0.2852 + 0.1901\right] = 0.1330$$
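Both binomial calculations can be reproduced with the standard library; a minimal Python sketch (the helper name is an assumption; math.comb requires Python 3.8+):

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 20, 0.1
print(binom_pmf(2, n, p))                             # ~0.2852
print(1 - sum(binom_pmf(x, n, p) for x in range(4)))  # P(x >= 4) ~0.1330
```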
Poisson Distribution
Poisson Distribution
Consider the Bernoulli trial of the fiber drawing process. Let the random variable x equal the number of occurrences in which a fiber passes the strip at a given time, let n denote the number of fibers whose movement is studied, and let p denote the probability that a fiber passes the strip. Assume x follows the binomial distribution. Let λ = pn. Then
$$f(x) = {}^{n}C_x\,p^x\,(1-p)^{n-x} = {}^{n}C_x\left(\frac{\lambda}{n}\right)^x\left(1-\frac{\lambda}{n}\right)^{n-x}$$
Now suppose that the number n of fibers studied increases and the probability p of a fiber passing the strip decreases exactly enough that λ = pn remains constant. Then
$$\lim_{n\to\infty} f(x) = \frac{e^{-\lambda}\lambda^x}{x!}, \qquad x = 0, 1, 2, \ldots$$
x is then said to follow the Poisson distribution.
Example
Assume that the number of fibers present in the cross-section of a yarn follows the Poisson distribution with a mean of 100. Determine the probability that there are exactly 105 fibers present in some cross-sections of the yarn.
Let x denote the number of fibers present in the cross-section of the yarn. Then the mean value λ = 100. The required probability is
$$f(x=105) = \frac{e^{-100}\,(100)^{105}}{105!} = 0.0344$$
Another Example
The number of flaws in a cloth is assumed to be Poisson distributed with a mean of 0.1 flaw per square meter. (a) What is the probability that there are two flaws in 1 square meter of cloth? (b) What is the probability that there is one flaw in 10 square meters of cloth? (c) What is the probability that there are no flaws in 20 square meters of cloth? (d) What is the probability that there are at least two flaws in 10 square meters of cloth?

(a) Let x denote the number of flaws in 1 square meter of cloth. Then the mean value λ = 0.1 and
$$f(x=2) = \frac{e^{-0.1}\,(0.1)^2}{2!} = 0.0045$$

Another Example (Continued)
(b) Let x denote the number of flaws in 10 square meters of cloth. Then the mean value λ = 0.1 × 10 = 1 and the required probability is
$$f(x=1) = \frac{e^{-1}\,1^1}{1!} = 0.3679$$

(c) Let x denote the number of flaws in 20 square meters of cloth. Then the mean value λ = 0.1 × 20 = 2 and
$$f(x=0) = \frac{e^{-2}\,2^0}{0!} = 0.1353$$

(d) Let x denote the number of flaws in 10 square meters of cloth. Then the mean value λ = 0.1 × 10 = 1 and
$$f(x \ge 2) = 1 - f(x < 2) = 1 - \left[\frac{e^{-1}\,1^0}{0!} + \frac{e^{-1}\,1^1}{1!}\right] = 1 - 2e^{-1} = 0.2642$$
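All four Poisson calculations can be checked with a short script; a minimal sketch (the helper name is an assumption):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean count is lam."""
    return exp(-lam) * lam ** x / factorial(x)

print(poisson_pmf(2, 0.1))                             # (a) ~0.0045
print(poisson_pmf(1, 1.0))                             # (b) ~0.3679
print(poisson_pmf(0, 2.0))                             # (c) ~0.1353
print(1 - poisson_pmf(0, 1.0) - poisson_pmf(1, 1.0))   # (d) ~0.2642
```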
Frequently Asked Questions & Answers
Q1: Give two examples each of a continuous random variable and a discrete random variable.
A1: Fiber length and fiber strength are two examples of continuous random variables. The number of fibers in a yarn cross-section and the number of holes in a knitwear are two examples of discrete random variables.
Q2: Why do the statistical characteristics of the primary data often not exactly equal those of the grouped data?
A2: When calculating the statistical characteristics of grouped data, the different values that fall in a certain class are all taken to be numerically equal to the middle value of that class, and this makes the results differ.
Q3: Is it so that probability and relative frequency are the same?
A3: Probability and relative frequency possess the same value. Relative frequency is interpreted as "ex post", while probability is interpreted as "ex ante".
Frequently Asked Questions & Answers (Contd.)
Q4: Is the normal distribution an example of a two-parameter distribution?
A4: Yes, the normal distribution is described by two parameters, namely the mean and the standard deviation.
Q5: How can one conclude whether a sample can be regarded as taken from a population that follows the normal distribution?
A5: One can conclude this by using goodness-of-fit tests (objectively) and probability plots (subjectively).
Q6: Can the Poisson distribution be taken as a limiting form of the binomial distribution?
A6: Yes, a binomial distribution with the probability approaching zero and the number of samples approaching infinity tends to a Poisson distribution.
References
1. Neckar, B. and Ibrahim, S., Structural Theory of Fibrous Assemblies and Yarns, Part I: Structure of Fibrous Assemblies, Technical University of Liberec, Liberec, Czech Republic, 2003.
Sources of Further Reading
1. Leaf, G. A. V., Practical Statistics for the Textile Industry: Part I, The Textile Institute, UK, 1984.
2. Leaf, G. A. V., Practical Statistics for the Textile Industry: Part II, The Textile Institute, UK, 1984.
3. Gupta, S. C. and Kapoor, V. K., Fundamentals of Mathematical Statistics, Sultan Chand & Sons, New Delhi, 2002.
4. Gupta, S. C. and Kapoor, V. K., Fundamentals of Applied Statistics, Sultan Chand & Sons, New Delhi, 2007.
5. Montgomery, D. C., Introduction to Statistical Quality Control, John Wiley & Sons, Inc., Singapore, 2001.
6. Grant, E. L. and Leavenworth, R. S., Statistical Quality Control, Tata McGraw Hill Education Private Limited, New Delhi, 2000.
7. Montgomery, D. C. and Runger, G. C., Applied Statistics and Probability for Engineers, John Wiley & Sons, Inc., New Delhi, 2003.