Statistical Quality Control in Textiles
Module 2: Statistical Description of Quality
Dr. Dipayan Das, Assistant Professor, Dept. of Textile Technology, Indian Institute of Technology Delhi
Phone: +91-11-26591402, E-mail: [email protected]

Random Variable

The science and application of statistics deal with quantities that can vary; such quantities are called variables. Broadly, there are two types of variables, depending on the origin and nature of the values they assume. A variable whose values are obtained by comparison with a standard measuring scale, so that they may be any number, whole or fractional, is known as a continuous variable. A variable whose values are obtained simply by counting, so that they are always whole numbers (integers), is known as a discontinuous or discrete variable.

Characteristics of a Random Variable

A random variable is generally characterized by its statistical as well as its probability characteristics. The basic statistical characteristics include the mean (a measure of central tendency of the data) and the standard deviation (a measure of variation in the data). To extract more information from the data, the frequency distribution of the data needs to be evaluated. The probability characteristics include the probability distribution and its parameters.

Continuous Random Variable

The Continuous Random Variable x

Let x be a continuous random variable, x ∈ [x_min, x_max]. Let the number of measurements be n. Then x takes the values x_1, x_2, x_3, …, x_n; alternately, we write that x takes the values x_j, where j = 1, 2, 3, …, n. The measured values of x differ from one another because x is a random variable. In practice, the number of measurements is limited mainly by time and by the capacity of the measuring instrument. Here, however, we consider that the number of measurements is very large and can be increased without limitation of time.
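The two basic statistical characteristics just mentioned, mean and standard deviation, can be sketched in a few lines of Python. The data values and the helper name `mean_and_std` are illustrative assumptions, not from the lecture; the divisor n (rather than n − 1) matches the population-style formulas used throughout this module.

```python
# Minimal sketch: mean and standard deviation of n measurements,
# using s^2 = (mean of squares) - (square of mean) with divisor n.
# The sample values below are made up for illustration.
def mean_and_std(values):
    n = len(values)
    mean = sum(values) / n                    # x-bar
    mean_sq = sum(v * v for v in values) / n  # mean of the square values
    variance = mean_sq - mean ** 2            # s^2
    return mean, variance ** 0.5

x = [14.1, 15.0, 13.2, 14.8, 15.5]            # illustrative yarn strengths
m, s = mean_and_std(x)
```

The same two numbers summarize any set of repeated measurements of a continuous random variable, however large n becomes.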
Statistical Characteristics of x

Mean: $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$

Variance:
$s^2 = \frac{1}{n}\sum_{j=1}^{n}\left(x_j-\bar{x}\right)^2 = \frac{1}{n}\sum_{j=1}^{n}\left(x_j^2 - 2x_j\bar{x} + \bar{x}^2\right) = \frac{1}{n}\sum_{j=1}^{n}x_j^2 - \frac{2\bar{x}}{n}\sum_{j=1}^{n}x_j + \bar{x}^2 = \overline{x^2} - 2\bar{x}^2 + \bar{x}^2 = \overline{x^2} - \bar{x}^2$,

where $\overline{x^2} = \frac{1}{n}\sum_{j=1}^{n}x_j^2$ is the mean of the square values of x.

Standard deviation: $s = \sqrt{s^2}$

Distribution of x

Let us divide the data domain [x_min, x_max] into m classes of constant class width $\Delta x = \frac{x_{\max}-x_{\min}}{m}$, and mark the classes by serial numbers i = 1, 2, …, m. Then we get:

Serial No. (i) | Class Interval | Class Value (Mid-Value) x_i | Class Frequency n_i | Relative Frequency g_i | Relative Frequency Density f_i
1 | x_min to x_min + Δx | x_min + Δx/2 | n_1 | g_1 = n_1/n | f_1 = g_1/Δx
2 | x_min + Δx to x_min + 2Δx | x_min + 3Δx/2 | n_2 | g_2 = n_2/n | f_2 = g_2/Δx
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮
m | x_max − Δx to x_max | x_max − Δx/2 | n_m | g_m = n_m/n | f_m = g_m/Δx

with $\sum_{i=1}^{m} n_i = n$ and $\sum_{i=1}^{m} g_i = 1$.

Histogram of x [1]

[Figure: histograms of the relative frequency density f_i versus x over [x_min, x_max] for m = 6, m = 12, and m = 24 classes of width δx; as m increases, the steps become finer.]

Observation

As the number of classes increases, the width of each class decreases. The contours of the histogram keep roughly the same shape, but the steps become smoother until they "diminish" and become "infinitely small". This holds if and only if the chosen "higher" number of classes remains very small compared with the number of measurements, that is, m << n. If, for example, we chose the number of classes significantly higher than the number of measurements, some classes could have zero or very small frequencies, and the shape of the histogram would change significantly. We therefore modify our procedure: we always double the number of measurements before doubling the number of classes. Then the class frequencies n_i are "approximately" doubled and the relative frequencies of these classes remain "approximately" unchanged.
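The class-building procedure under "Distribution of x" above can be sketched as follows. The dataset and the choice m = 4 are illustrative assumptions; the class values, relative frequencies g_i, and densities f_i follow the definitions in the table.

```python
# Sketch: grouping n measurements into m equal-width classes and
# computing class mid-values x_i, relative frequencies g_i = n_i/n,
# and relative frequency densities f_i = g_i/dx.
# The data and m = 4 are illustrative, not from the lecture.
data = [12.1, 12.9, 13.4, 13.5, 14.2, 14.3, 14.8, 15.1, 15.9, 16.1]
m = 4
lo, hi = min(data), max(data)
dx = (hi - lo) / m                                 # constant class width

counts = [0] * m
for v in data:
    i = min(int((v - lo) / dx), m - 1)             # class index; top edge falls in last class
    counts[i] += 1

n = len(data)
class_values = [lo + (i + 0.5) * dx for i in range(m)]  # mid-values x_i
g = [c / n for c in counts]                             # relative frequencies g_i
f = [gi / dx for gi in g]                               # relative frequency densities f_i
```

Doubling m while doubling the amount of data, as the observation above suggests, keeps the g_i roughly stable while the histogram steps shrink.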
Thus, we ensure that after doubling the number of classes it always remains true that m << n.

Statistical Characteristics (Grouped Data)

For a finite (limited) number m of classes, the statistical characteristics of the random variable x are described below. In a given class, a measured value x_j does not differ by more than Δx/2 from the class value x_i. For simplicity, we consider all values in a given class to equal the class value x_i. Then,

Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{m} x_i n_i = \frac{1}{n}\sum_{i=1}^{m} x_i n g_i = \sum_{i=1}^{m} x_i g_i = \sum_{i=1}^{m} x_i f_i \Delta x$

Mean of square values: $\overline{x^2} = \frac{1}{n}\sum_{i=1}^{m} x_i^2 n_i = \sum_{i=1}^{m} x_i^2 g_i = \sum_{i=1}^{m} x_i^2 f_i \Delta x$

Statistical Characteristics (Continued)

Variance:
$s^2 = \frac{1}{n}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 n_i = \sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 f_i \Delta x = \sum_{i=1}^{m} x_i^2 f_i \Delta x - 2\bar{x}\sum_{i=1}^{m} x_i f_i \Delta x + \bar{x}^2\sum_{i=1}^{m} f_i \Delta x = \overline{x^2} - 2\bar{x}^2 + \bar{x}^2 = \overline{x^2} - \bar{x}^2$

(using $\sum_{i=1}^{m} f_i \Delta x = \sum_{i=1}^{m} g_i = 1$).

Standard deviation: $s = \sqrt{s^2}$

Discussion

Here, we used the class value x_i for all calculations. This value may differ by up to Δx/2 from the real measured value x_j. As a result, the statistical characteristics obtained from the class values are in error, and this error decreases as the class width Δx decreases. Let us now decrease the class width by (a) increasing the number of classes, say twice, while also (b) increasing the number of measurements, say twice, and repeat this procedure to infinity. Intuitively, the class width then becomes smaller and smaller until it is "infinitesimal". Such a class of infinitely small width is called an "elementary class"; its width is denoted by the differential symbol dx, instead of the symbol Δx used for a finite width. Then,

Discussion (Continued)

1) The contours of the histogram keep roughly the same shape, but the steps become smoother until they "diminish" and become "infinitely small".
The contours of the histogram change to a continuous function called the probability density function f(x).

[Figure: probability density curve f(x) over [x_min, x_max], with an elementary class of width dx marked at x.]

2) As the number of classes is infinitely high, it is impossible to identify them by serial numbers i. The elementary class with lower limit x and upper limit x + dx is simply called the "elementary class of x".

3) The area under each elementary class of x is f(x)dx. This product expresses the relative frequency of x in the elementary class with lower limit x and upper limit x + dx.

Discussion (Continued)

4) The area under the probability density curve still equals one. Thus, $\int_{x_{\min}}^{x_{\max}} f(x)\,dx = 1$. In other words, the integral of all probabilities ("cumulative probability") from x_min to x_max equals one. The cumulative probability of x can be found from the expression $F(x) = \int_{x_{\min}}^{x} f(w)\,dw$. The function F(x) is known as the cumulative distribution function, or simply the distribution function.

Note: Here we use the integral expression $\int_{x_{\min}}^{x_{\max}}$ instead of the summation expression $\sum_{i=1}^{m}$.

Remark: For simplicity, we assume that the domain of x is the finite (closed) interval [x_min, x_max]. It can be proved that the results remain valid even when x_min → −∞ and x_max → ∞.

Statistical Characteristics

1) We use the "relative frequency" f(x)dx, which belongs to the "elementary class of x", instead of the relative frequency of the i-th class, g_i = f_i Δx. (As the elementary class width is infinitely small, the calculation error mentioned before is eliminated.)

2) The value x itself is used as the class value of the elementary class, instead of the middle value x_i of the i-th class.
3) We use the integral expression $\int_{x_{\min}}^{x_{\max}}$ instead of the summation expression $\sum_{i=1}^{m}$.

Then the following expressions are valid:

Mean: $\bar{x} = \int_{x_{\min}}^{x_{\max}} x\, f(x)\,dx$

Mean of square values: $\overline{x^2} = \int_{x_{\min}}^{x_{\max}} x^2 f(x)\,dx$

Statistical Characteristics (Continued)

Variance: $s^2 = \int_{x_{\min}}^{x_{\max}} \left(x-\bar{x}\right)^2 f(x)\,dx$

Standard deviation: $s = \sqrt{s^2} = \sqrt{\int_{x_{\min}}^{x_{\max}} \left(x-\bar{x}\right)^2 f(x)\,dx}$

r-th central moment: $m_r = \int_{x_{\min}}^{x_{\max}} \left(x-\bar{x}\right)^r f(x)\,dx$

r-th non-central moment: $m_r' = \int_{x_{\min}}^{x_{\max}} \left(x-0\right)^r f(x)\,dx$

Probability

According to the classical definition, probability is the ratio of the number of successful outcomes to the number of all outcomes. If we have n measurements and n_i of them belong to the i-th class (i = 1, 2, …, m), then the probability that a randomly chosen value belongs to the i-th class is

$P_i = \frac{n_i}{n} = g_i = f_i \Delta x$

We see that probability and relative frequency possess the same value. Relative frequency is used to characterize values that have already been measured; it is used "ex post". In contrast, probability is used to predict the future based on past investigation; it is used "ex ante". The earlier concepts of "relative frequency" and "probability" for a class of finite width also apply to an "elementary class". Thus, we understand f(x)dx not only as the relative frequency of x in the elementary class with lower limit x and upper limit x + dx, but also as the "probability of occurrence" of (future measured values of) x in that elementary class.

Normal Distribution

Normal Probability Distribution

Let us consider x_min → −∞ and x_max → ∞ and assume that x follows the normal probability distribution. Then its probability density function takes the form

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$

where −∞ < μ < ∞ and σ > 0, and it holds that

$f(x) > 0, \quad \int_{-\infty}^{\infty} f(x)\,dx = 1, \quad \mu = \int_{-\infty}^{\infty} x\, f(x)\,dx, \quad \sigma^2 = \int_{-\infty}^{\infty} \left(x-\mu\right)^2 f(x)\,dx$

Standard Normal Probability Distribution

Consider a variable u such that $u = \frac{x-\mu}{\sigma}$, u ∈ (−∞, ∞).
Assume that u follows the normal distribution with mean equal to zero and standard deviation equal to one. Then its probability density function φ(u) is

$\varphi(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right)$

u is called the standard normal variable.

u | φ(u)
0 | 0.3989
±1 | 0.2419
±2 | 0.0540
±3 | 0.0044

Standard Normal Probability Distribution (Continued)

The distribution function of u is

$\Phi(u) = \int_{-\infty}^{u} \varphi(v)\,dv = \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{v^2}{2}\right) dv$

This integral is known as the Laplace–Gauss integral. It has no analytical solution, but it can be evaluated by numerical integration.

Standard Normal Probability Distribution (Continued)

Take u = 1. Then

$\Phi(1) = \int_{-\infty}^{1} \varphi(v)\,dv = \int_{-\infty}^{0} \varphi(v)\,dv + \int_{0}^{1} \varphi(v)\,dv = 0.5 + \frac{1}{\sqrt{2\pi}} \int_{0}^{1} \exp\left(-\frac{v^2}{2}\right) dv$

Expanding the integrand as a power series and integrating term by term:

$\Phi(1) = 0.5 + \frac{1}{\sqrt{2\pi}} \int_{0}^{1}\left(1 - \frac{v^2}{2} + \frac{v^4}{8} - \frac{v^6}{48} + \frac{v^8}{384} - \frac{v^{10}}{3840} + \frac{v^{12}}{46080} - \cdots\right) dv$

$= 0.5 + \frac{1}{\sqrt{2\pi}}\left[v - \frac{v^3}{6} + \frac{v^5}{40} - \frac{v^7}{336} + \frac{v^9}{3456} - \frac{v^{11}}{42240} + \frac{v^{13}}{599040} - \cdots\right]_0^1$

$= 0.5 + \frac{1}{\sqrt{2\pi}}\left(1 - \frac{1}{6} + \frac{1}{40} - \frac{1}{336} + \frac{1}{3456} - \frac{1}{42240} + \frac{1}{599040}\right) \approx 0.5 + 0.3413 = 0.8413$

and, by symmetry, $\Phi(-1) = 1 - \Phi(1) = 0.1587$.

Relationships

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right] = \frac{1}{\sigma}\,\varphi(u), \quad u = \frac{x-\mu}{\sigma}$

$F(x) = \int_{-\infty}^{x} f(w)\,dw = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{w-\mu}{\sigma}\right)^2\right] dw = \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{v^2}{2}\right) dv = \Phi(u)$,

substituting $w = \mu + \sigma v$, $dw = \sigma\,dv$.

Relationships (Continued)

[Figure: normal density f(x) with the intervals μ ± σ, μ ± 2σ, and μ ± 3σ marked on the x-scale (u = ±1, ±2, ±3 on the u-scale), containing 0.6827, 0.9545, and 0.9974 of the total probability, respectively.]

Practical Example

Example: Yarn Strength (cN·tex⁻¹) Dataset

14.11 14.99 15.08 13.14 13.21 15.79 13.78 15.65 15.47 14.41 15.85 14.84 12.26 11.93 14.08 15.32 14.57 16.80 14.31 13.69 15.16 15.12 17.03 13.09 17.97 14.41 12.35 13.69 15.58 13.90 16.38 15.36 15.21 16.49 13.99 12.86 11.82 14.31 15.05 14.92 15.65 14.48 14.45 16.14 14.62 16.80 12.52 15.76 11.87 14.08 13.25 14.67 15.10 15.10 14.38 14.04 15.67 15.44 14.67 12.93 12.40 15.90 16.53 14.43 13.01 14.45 14.62 15.77 17.12 13.40 13.56 13.62 13.40 14.05 13.62 15.26 14.67 14.08 13.44 14.67 14.87 13.35 12.72 13.40 13.78 17.06 14.53 14.18 11.98 15.58 17.51 16.14 13.94 13.31 14.84 13.45 15.58 15.90 13.17 16.53 14.08 15.85 15.46 14.17 13.35 13.41 13.25 15.90 15.03 15.56 12.42 14.16 15.90 14.58 15.90 13.40 14.03 15.44 13.44 14.82 14.43 13.67 15.42 14.84 14.18 16.17 15.36 13.62 13.62 12.44 15.21
16.43 14.97 12.86 14.67 14.08 13.73 16.34 12.72 16.01 13.78 12.90 14.31 14.53 14.99 15.44 14.08 15.44 14.85 13.41 13.69 12.72 12.72 14.18 15.41 14.87 16.94 14.38 13.40 17.89 16.70 11.09 17.71 13.84 14.08 14.92 13.81 13.39 17.09 14.62 14.94 14.68 15.05 13.78 14.48 13.60 16.63 14.18 14.41 13.22 13.29 14.92 15.62 16.09 13.28 15.67 14.99 14.71 10.57 14.92 14.84 15.68 15.05 14.84 15.10 15.10 12.72 14.09 14.31 15.65 14.67 15.94 13.30 12.29 14.41 10.84 17.64 12.34 16.69 13.99 13.11 15.16 12.23 14.15 15.44 13.89 16.19 15.85 13.73 14.18 14.31 12.80 15.34 15.31 17.17 12.95 14.62 15.44 13.32 15.34 12.72 14.08 13.51 12.91 13.50 13.26 15.62 15.08 14.92 16.53 14.40 14.76 14.67 13.14 14.08 16.96 13.44 14.31 13.79 13.89 15.68 15.86 13.84 13.06 14.87 14.71 12.23 16.32 14.84 14.54 13.78 14.67 15.90 14.53 13.21 13.06 13.53 17.36 14.92 16.34 14.57 13.44 13.85 15.94 13.78 13.60 14.76 14.84 13.60 14.58 15.47 14.99 12.47 16.08 14.31 14.99 12.53 13.25 12.81 16.11 16.35 16.48 12.47 14.08 13.78 12.60 13.35 13.51 13.06 15.58 13.89 13.87 15.12 15.36 12.98 16.19 13.51 14.18 14.53 12.19 12.96 15.70 16.32 15.90 14.31 14.35 15.20 16.19 15.15 13.17 13.69 14.18 13.21 14.31 15.26 14.99 14.72 15.49 14.84 15.62 15.12 12.91 13.21 15.67 16.43 17.12 14.53 14.62 13.69 15.68 11.44 14.53 12.93 13.30 14.13 15.03 15.68 14.31 16.14 13.85 13.55 15.65 14.67 11.97 13.89 14.97 14.58 15.68 14.43 13.44 15.16 17.49 13.82 15.35 13.48 14.41 14.08 14.67 14.99 16.96 15.71 13.85 14.52 13.94 12.44 14.09 12.72 14.84 16.14 15.94 15.16 15.01 14.18 16.70 14.59 14.31 15.21 12.72 13.89 14.41 15.16 14.31 16.53 15.16 14.67 14.08 11.92 13.56 14.41 15.37 15.21 16.35 13.35 14.92 13.62 16.80 15.71 14.99 14.82 13.62 14.53 15.26 15.12 14.84 16.34 16.11 15.90 15.21 13.06 14.04 13.44 15.58 15.31 16.96 15.58 14.31 15.65 18.02 12.32 14.77 13.42 14.31 15.58 15.90 14.62 14.26 16.43 13.81 15.16 14.22 14.31 13.40 13.21 15.16 15.22 15.81 14.18 16.14 16.11 16.80 Original Dataset: Statistical Characteristics Let us denote yarn strength by x. 
Then,

Mean: $\bar{x} = \frac{1}{450}\sum_{j=1}^{450} x_j = 14.57\ \mathrm{cN\,tex^{-1}}$

Variance: $s^2 = \frac{1}{450}\sum_{j=1}^{450} x_j^2 - \bar{x}^2 = 1.52\ \mathrm{cN^2\,tex^{-2}}$

Standard deviation: $s = 1.23\ \mathrm{cN\,tex^{-1}}$

Grouped Dataset: Frequency Distribution

Class Interval (cN·tex⁻¹) | Class Value x_i (cN·tex⁻¹) | Frequency n_i (–) | Relative Frequency g_i (–) | Relative Frequency Density f_i (cN⁻¹·tex)
10.00–11.00 | 10.50 | 2 | 0.0044 | 0.0044
11.00–12.00 | 11.50 | 8 | 0.0178 | 0.0178
12.00–13.00 | 12.50 | 37 | 0.0822 | 0.0822
13.00–14.00 | 13.50 | 102 | 0.2267 | 0.2267
14.00–15.00 | 14.50 | 140 | 0.3111 | 0.3111
15.00–16.00 | 15.50 | 104 | 0.2311 | 0.2311
16.00–17.00 | 16.50 | 43 | 0.0956 | 0.0956
17.00–18.00 | 17.50 | 13 | 0.0289 | 0.0289
18.00–19.00 | 18.50 | 1 | 0.0022 | 0.0022
TOTAL | | 450 | 1.0000 |

(With Δx = 1 cN·tex⁻¹, the density f_i = g_i/Δx has the same numerical value as g_i.)

Grouped Dataset: Histogram

[Figure: histogram of f (cN⁻¹·tex) versus x (cN·tex⁻¹) over the interval 10–19.]

Grouped Dataset: Statistical Characteristics

Mean: $\bar{x} = \frac{1}{450}\sum_{i=1}^{9} n_i x_i = 14.56\ \mathrm{cN\,tex^{-1}}$

Variance: $s^2 = \frac{1}{450}\sum_{i=1}^{9} n_i x_i^2 - \bar{x}^2 = 1.69\ \mathrm{cN^2\,tex^{-2}}$

Standard deviation: $s = 1.30\ \mathrm{cN\,tex^{-1}}$

Comparison

Statistical Characteristic | Original Dataset | Grouped Dataset
$\bar{x}$ (cN·tex⁻¹) | 14.57 | 14.56
$s^2$ (cN²·tex⁻²) | 1.52 | 1.69
$s$ (cN·tex⁻¹) | 1.23 | 1.30

Grouping error!
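As a check, the grouped-dataset characteristics can be recomputed directly from the frequency table above; this short sketch reproduces the grouped mean, variance, and standard deviation.

```python
# Recomputing the grouped statistics of the yarn-strength example
# from the class values x_i and frequencies n_i of the table above.
class_values = [10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, 17.5, 18.5]
frequencies  = [2, 8, 37, 102, 140, 104, 43, 13, 1]

n = sum(frequencies)                                               # 450
mean = sum(x * f for x, f in zip(class_values, frequencies)) / n   # x-bar
mean_sq = sum(x * x * f for x, f in zip(class_values, frequencies)) / n
variance = mean_sq - mean ** 2                                     # s^2 = x2-bar - x-bar^2
std = variance ** 0.5
```

The results (14.56, 1.69, 1.30) differ slightly from the original-dataset values (14.57, 1.52, 1.23); this difference is precisely the grouping error shown in the comparison.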
Fitting with Normal Distribution

Using $\bar{x}$ = 14.56 cN·tex⁻¹, s = 1.30 cN·tex⁻¹, n = 450, and class width Δx = 1 cN·tex⁻¹:

x_i (cN·tex⁻¹) | u_i = (x_i − x̄)/s | Exp. n_i | Exp. f_i (cN⁻¹·tex) | Exp. s·f_i | Theor. φ(u_i) (from the pdf of the standard normal distribution) | Theor. frequency n·φ(u_i)·Δx/s | Rounded
10.50 | −3.12 | 2 | 0.0044 | 0.0057 | 0.0031 | 1.08 | 1
11.50 | −2.35 | 8 | 0.0178 | 0.0231 | 0.0252 | 8.73 | 9
12.50 | −1.58 | 37 | 0.0822 | 0.1069 | 0.1145 | 39.65 | 40
13.50 | −0.82 | 102 | 0.2267 | 0.2947 | 0.2850 | 98.64 | 99
14.50 | −0.05 | 140 | 0.3111 | 0.4044 | 0.3984 | 137.93 | 138
15.50 | 0.72 | 104 | 0.2311 | 0.3004 | 0.3079 | 106.56 | 106
16.50 | 1.49 | 43 | 0.0956 | 0.1248 | 0.1315 | 45.54 | 45
17.50 | 2.26 | 13 | 0.0289 | 0.0376 | 0.0310 | 10.71 | 11
18.50 | 3.03 | 1 | 0.0022 | 0.0029 | 0.0040 | 1.40 | 1
TOTAL | | 450 | | | | | 450

Fitting with Normal Distribution (Continued)

[Figure: experimental values s·f_i and theoretical standard normal density φ(u) plotted against u; the fitted curve follows the experimental points closely.]

Checking for Normality

Checking for normality can be done in various ways:
1) Goodness-of-fit test: chi-square test
2) Probability plot
3) Quantile-quantile plot (QQ plot)

Goodness of Fit: Chi-square Test

1) Hypothesis: the experimental frequency distribution follows the theoretical normal probability distribution.

2) Test statistic: $\chi^2 = \sum_{i=1}^{m} \frac{\left(n_{i,E} - n_{i,T}\right)^2}{n_{i,T}}$, where $n_{i,E}$ is the experimental frequency, $n_{i,T}$ is the theoretical frequency, and m is the number of classes. The test statistic follows the chi-square distribution with m − c degrees of freedom, where c denotes the number of constraints. Here we have three constraints: the total number of data, the mean, and the variance must each be the same in the experimental and theoretical distributions. Thus c = 3.

Goodness of Fit: Chi-square Test (Contd.)

3) Choice of significance level: let us choose a significance level of α = 0.05.
Therefore, our hypothesis will be rejected if $\chi^2 > \chi^2_{m-c,\alpha}$, where $\chi^2_{m-c,\alpha}$ is the chi-square percent point function with m − c degrees of freedom at significance level α. Its values can be obtained from a standard table; we obtain $\chi^2_{6,0.05} = 12.5920$.

4) Computation: $\chi^2 = 1.9463$.

5) Conclusion: since $\chi^2 < \chi^2_{6,0.05}$ (1.9463 < 12.5920), there is no reason to reject the hypothesis. Hence we conclude that the experimental frequency distribution follows the theoretical normal probability distribution.

Probability Plot

Steps for constructing the probability plot for checking normality:

Step 1) Arrange the observations in ascending order of magnitude; let x_(j) denote the j-th order statistic.
Step 2) Calculate the cumulative relative frequencies (j − 0.5)/n, where n denotes the number of observations.
Step 3) Plot 100(j − 0.5)/n against x_(j). If a straight line, chosen subjectively, can pass through the points, the observations can be regarded as taken from a normal distribution. A good rule of thumb is to draw the line approximately between the 25th and 75th percentile points. If all the points are covered by a "fat pencil" lying along the straight line, a normal distribution adequately describes the data.

Normal Probability Plot (Continued)

j | x_(j) | (j − 0.5)/n (n = 9)
1 | 10.50 | 0.0556
2 | 11.50 | 0.1667
3 | 12.50 | 0.2778
4 | 13.50 | 0.3889
5 | 14.50 | 0.5000
6 | 15.50 | 0.6111
7 | 16.50 | 0.7222
8 | 17.50 | 0.8333
9 | 18.50 | 0.9444

[Figure: normal probability plot of 100(j − 0.5)/n versus x_(j) on a probability scale; the points fall close to a straight line.]

As the majority of the points fall on the straight line, the observations can be regarded as taken from a population following the normal distribution.

Quantile-Quantile Plot (QQ Plot)

Steps for constructing the QQ plot for checking normality:

Step 1) Arrange the observations in ascending order of magnitude; let x_(j) denote the j-th order statistic.
Step 2) Calculate the cumulative relative frequencies (j − 0.5)/n, where n denotes the number of observations.
Step 3) Find the standardized normal scores u_j from $P(U \le u_j) = \Phi(u_j) = \frac{j-0.5}{n}$.
Step 4) Plot x_(j) against u_j. If a straight line, chosen subjectively, can pass through the points, the observations can be regarded as taken from a normal distribution.

QQ Plot (Continued)

j | x_(j) | (j − 0.5)/n (n = 9) | u_j (from standard normal table)
1 | 10.50 | 0.0556 | −1.59
2 | 11.50 | 0.1667 | −0.97
3 | 12.50 | 0.2778 | −0.59
4 | 13.50 | 0.3889 | −0.28
5 | 14.50 | 0.5000 | 0
6 | 15.50 | 0.6111 | 0.28
7 | 16.50 | 0.7222 | 0.59
8 | 17.50 | 0.8333 | 0.97
9 | 18.50 | 0.9444 | 1.59

[Figure: QQ plot of sample quantiles x_(j) versus standard normal quantiles u_j; the points fall close to a straight line.]

As the majority of the points fall on the straight line, the observations can be regarded as taken from a population following the normal distribution.

Discrete Random Variable

The Discrete Random Variable x

Let x be a discrete random variable, x ∈ [x_min, x_max], which can assume only whole-number (integer) values. Let the number of observations be n. Then x takes the values x_1, x_2, x_3, …, x_n; alternately, we write that x takes the values x_j, where j = 1, 2, 3, …, n. The observed values of x differ from one another because x is a random variable. In practice, the number of observations is limited mainly by time and by the cost of the sample. Here, however, we consider that the number of observations is very large and can be increased without limitation of time.

Statistical Characteristics of x

Mean: $\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j$

Variance: $s^2 = \frac{1}{n}\sum_{j=1}^{n}\left(x_j-\bar{x}\right)^2 = \overline{x^2} - \bar{x}^2$, where $\overline{x^2} = \frac{1}{n}\sum_{j=1}^{n} x_j^2$ is the mean of the square values of x (the derivation is the same as in the continuous case).

Standard deviation: $s = \sqrt{s^2}$

Distribution of x

Let us divide the data domain [x_min, x_max] into m classes, where each class corresponds to one single value, and mark the classes by serial numbers i = 1, 2, …, m.
Then, we get:

Serial No. (i) | Class Value (x_i) | Class Frequency (n_i) | Relative Frequency (g_i) | Cumulative Relative Frequency (h_i)
1 | x_min | n_1 | g_1 = n_1/n | h_1 = g_1
2 | x_2 | n_2 | g_2 = n_2/n | h_2 = g_1 + g_2
⋮ | ⋮ | ⋮ | ⋮ | ⋮
m | x_max | n_m | g_m = n_m/n | h_m = Σg_i = 1
TOTAL | | n = Σn_i | Σg_i = 1 |

Histogram of x

[Figure: bar diagram of the relative frequencies g at the discrete values x_min, x_2, x_3, …, x_max.]

Statistical Characteristics

For a finite (limited) number m of classes, the statistical characteristics of the random variable x are:

Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{m} x_i n_i = \frac{1}{n}\sum_{i=1}^{m} x_i n g_i = \sum_{i=1}^{m} x_i g_i$

Mean of square values: $\overline{x^2} = \frac{1}{n}\sum_{i=1}^{m} x_i^2 n_i = \sum_{i=1}^{m} x_i^2 g_i$

Statistical Characteristics (Continued)

Variance:
$s^2 = \frac{1}{n}\sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 n_i = \sum_{i=1}^{m}\left(x_i-\bar{x}\right)^2 g_i = \sum_{i=1}^{m} x_i^2 g_i - 2\bar{x}\sum_{i=1}^{m} x_i g_i + \bar{x}^2\sum_{i=1}^{m} g_i = \overline{x^2} - 2\bar{x}^2 + \bar{x}^2 = \overline{x^2} - \bar{x}^2$

Standard deviation: $s = \sqrt{s^2}$

Binomial Distribution

Bernoulli Trial

Let us consider a bundle of fibers being drawn by rollers, as happens in a draw frame, speed frame, or ring frame. Let us select four fibers (red color) and study their movement, that is, the probability that these four fibers pass the strip (yellow color) at a given time. We denote the occurrence of passing by the symbol "Y" and the occurrence of not passing by the symbol "N". Assume that these events are independent of each other.

Bernoulli Trial (Continued)

Let us list all possible outcomes. Here, x denotes the number of fibers that pass the strip, that is, the number of occurrences of "Y", and n = 4.

Outcome | x
NNNN | 0
NNNY | 1
NNYN | 1
NYNN | 1
YNNN | 1
NNYY | 2
NYYN | 2
NYNY | 2
YYNN | 2
YNYN | 2
YNNY | 2
NYYY | 3
YYYN | 3
YNYY | 3
YYNY | 3
YYYY | 4

Let us now find the probability of x = 2. The favourable outcomes are: YYNN, YNYN, YNNY, NNYY, NYYN, NYNY.
The probability is equal to

P(Y)P(Y)P(N)P(N) + P(Y)P(N)P(Y)P(N) + P(Y)P(N)P(N)P(Y) + P(N)P(N)P(Y)P(Y) + P(N)P(Y)P(Y)P(N) + P(N)P(Y)P(N)P(Y)

Bernoulli Trial (Continued)

If we take P(Y) = 0.1, then P(N) = 0.9 (complementary probability). The probability can then be calculated as

(0.1)(0.1)(0.9)(0.9) + (0.1)(0.9)(0.1)(0.9) + (0.1)(0.9)(0.9)(0.1) + (0.9)(0.9)(0.1)(0.1) + (0.9)(0.1)(0.1)(0.9) + (0.9)(0.1)(0.9)(0.1) = 6(0.1)²(0.9)² = ⁴C₂(0.1)²(0.9)⁴⁻² = 0.0486

Example

Each sample of a chemical used in a textile dyeing process has a 10% chance of containing a pollutant. Find the probability that, of the next 20 samples, exactly 2 contain the pollutant. Assume that the samples are independent with regard to the presence of the pollutant. Let x be the number of samples containing the pollutant among the next 20 samples to be analyzed. Then x is a binomial random variable with p = 0.1 and n = 20, and

$f(x=2) = {}^{20}C_2\,(0.1)^2 (1-0.1)^{20-2} = 190\,(0.1)^2 (0.9)^{18} = 0.2852$

Example (Continued)

Determine the probability that at least four samples contain the pollutant. The required probability is

$f(x \ge 4) = \sum_{x=4}^{20} {}^{20}C_x\,(0.1)^x (1-0.1)^{20-x}$

However, it is easier to calculate this as

$f(x \ge 4) = 1 - f(x < 4) = 1 - \sum_{x=0}^{3} {}^{20}C_x\,(0.1)^x (0.9)^{20-x} = 1 - \left[0.1216 + 0.2702 + 0.2852 + 0.1901\right] = 0.1330$

Poisson Distribution

Poisson Distribution

Consider the Bernoulli trial of the fiber drawing process. Let the random variable x equal the number of fibers that pass the strip at a given time, let n denote the number of fibers whose movement is studied, and let p denote the probability that a fiber passes the strip. Assume x follows the binomial distribution.
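As a numerical check of the pollutant example above, and as a preview of the Poisson limit derived next, the following sketch evaluates the binomial probabilities and the corresponding Poisson value with λ = np. The helper names are illustrative.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """Binomial probability f(x) = nCx * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """Poisson probability f(x) = e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

n, p = 20, 0.1
p2 = binom_pmf(2, n, p)                                  # exactly 2 of 20 contain the pollutant
p_ge4 = 1 - sum(binom_pmf(x, n, p) for x in range(4))    # at least 4 contain it
approx = poisson_pmf(2, n * p)                           # Poisson approximation, lam = np = 2
```

Even at n = 20, the Poisson value for x = 2 (about 0.2707) is already close to the exact binomial 0.2852; the agreement improves as n grows and p shrinks with np held constant, which is exactly the limit taken below.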
Let λ = pn. Then

$f(x) = {}^nC_x\, p^x (1-p)^{n-x} = {}^nC_x \left(\frac{\lambda}{n}\right)^x \left(1-\frac{\lambda}{n}\right)^{n-x}$

Now suppose that the number n of fibers studied increases and the probability p of a fiber passing the strip decreases, in exactly such a way that λ = pn remains constant. Then

$\lim_{n\to\infty} f(x) = \frac{e^{-\lambda}\,\lambda^x}{x!}, \qquad x = 0, 1, 2, \ldots$

x is then said to follow the Poisson distribution.

Example

Assume that the number of fibers present in the cross-section of a yarn follows the Poisson distribution with a mean of 100. Determine the probability that exactly 105 fibers are present in some cross-section of the yarn. Let x denote the number of fibers present in the cross-section. Then the mean value λ = 100, and the required probability is

$f(x=105) = \frac{e^{-100}\,100^{105}}{105!} = 0.0344$

Another Example

The number of flaws in a cloth is assumed to be Poisson distributed with a mean of 0.1 flaw per square meter.
(a) What is the probability that there are two flaws in 1 square meter of cloth?
(b) What is the probability that there is one flaw in 10 square meters of cloth?
(c) What is the probability that there are no flaws in 20 square meters of cloth?
(d) What is the probability that there are at least two flaws in 10 square meters of cloth?

(a) Let x denote the number of flaws in 1 square meter of cloth. Then the mean value λ = 0.1 and

$f(x=2) = \frac{e^{-0.1}\,(0.1)^2}{2!} = 0.0045$

Another Example (Continued)

(b) Let x denote the number of flaws in 10 square meters of cloth. Then the mean value λ = 0.1 × 10 = 1 and

$f(x=1) = \frac{e^{-1}\,1^1}{1!} = 0.3679$

(c) Let x denote the number of flaws in 20 square meters of cloth. Then the mean value λ = 0.1 × 20 = 2 and

$f(x=0) = \frac{e^{-2}\,2^0}{0!} = 0.1353$

(d) Let x denote the number of flaws in 10 square meters of cloth. Then the mean value λ = 0.1 × 10 = 1 and

$f(x \ge 2) = 1 - f(x < 2) = 1 - \sum_{x=0}^{1} \frac{e^{-1}\,1^x}{x!} = 1 - \left(\frac{e^{-1}}{0!} + \frac{e^{-1}}{1!}\right) = 0.2642$

Frequently Asked Questions & Answers

Q1: Give two examples each of a continuous random variable and a discrete random variable.
A1: Fiber length and fiber strength are two examples of continuous random variables. The number of fibers in a yarn cross-section and the number of holes in a knitwear are two examples of discrete random variables.

Q2: Why do the statistical characteristics of the primary data often not exactly equal those of the grouped data?
A2: When calculating the statistical characteristics of grouped data, the different values that fall in a certain class are all taken to be numerically equal to the middle value of that class, and this makes the results differ.

Q3: Are probability and relative frequency the same?
A3: Probability and relative frequency possess the same value. Relative frequency is interpreted as "ex post", while probability is interpreted as "ex ante".

Frequently Asked Questions & Answers (Contd.)

Q4: Is the normal distribution an example of a two-parameter distribution?
A4: Yes, the normal distribution is described by two parameters, namely the mean and the standard deviation.

Q5: How can one conclude whether a sample can be regarded as taken from a population that follows the normal distribution?
A5: By using goodness-of-fit tests (objectively) and probability plots (subjectively), one can conclude this.

Q6: Can the Poisson distribution be taken as a limiting form of the binomial distribution?
A6: Yes, a binomial distribution with the probability approaching zero and the number of trials approaching infinity, such that their product remains constant, tends to a Poisson distribution.

References

1. Neckar, B.
and Ibrahim, S., Structural Theory of Fibrous Assemblies and Yarns, Part I: Structure of Fibrous Assemblies, Technical University of Liberec, Liberec, Czech Republic, 2003.

Sources of Further Reading

1. Leaf, G. A. V., Practical Statistics for the Textile Industry: Part I, The Textile Institute, UK, 1984.
2. Leaf, G. A. V., Practical Statistics for the Textile Industry: Part II, The Textile Institute, UK, 1984.
3. Gupta, S. C. and Kapoor, V. K., Fundamentals of Mathematical Statistics, Sultan Chand & Sons, New Delhi, 2002.
4. Gupta, S. C. and Kapoor, V. K., Fundamentals of Applied Statistics, Sultan Chand & Sons, New Delhi, 2007.
5. Montgomery, D. C., Introduction to Statistical Quality Control, John Wiley & Sons, Inc., Singapore, 2001.
6. Grant, E. L. and Leavenworth, R. S., Statistical Quality Control, Tata McGraw Hill Education Private Limited, New Delhi, 2000.
7. Montgomery, D. C. and Runger, G. C., Applied Statistics and Probability for Engineers, John Wiley & Sons, Inc., New Delhi, 2003.