ST 435/535 Statistical Methods for Quality and Productivity Improvement / Statistical Process Control

Modeling Process Quality

Statistical methodology plays an important role in quality control and quality improvement.
Descriptive statistics of a sample of data display the variation in a quality characteristic.
Probability distributions are our main tool for modeling the variation in a quality characteristic.

Describing Variation

Characteristics of the quality of goods and services always show variability.
Examples:
  Net content of a can of a soft drink, when measured precisely;
  Time to complete an on-line banking transaction.
We shall use both graphical and numerical tools to describe this variability.

Stem-and-Leaf Plot

The stem-and-leaf display is a useful pencil-and-paper technique.
It shows the distribution of the data, like a histogram.
It makes it easy to find the median, quartiles, and other quantiles (percentiles).

Example 3.1
Table 3.1, Days to pay a health insurance claim
In R:
    Table03p01 <- read.csv("Data/Table-03-01.csv")
    stem(Table03p01$Days)

Time Series Plot

The stem-and-leaf plot shows many aspects of the data, but not the time order.
The time series plot (run chart) is essentially a control chart without control limits.
It can show:
  Shift in the mean;
  Trend;
  Change in variability.
In R:
    plot(Days ~ Claim, data = Table03p01)
    # or simply
    plot(Table03p01$Days)

Histogram

The number of leaves on each stem in a stem-and-leaf plot shows the distribution of the data.
The histogram shows similar counts, with fewer restrictions on the bins.
In R:
    hist(Table03p01$Days)
    hist(Table03p01$Days, right = FALSE) # same counts as in stem

Example 3.2
Table 3.2, thickness of vapor deposition layers on silicon
In R:
    Table03p02 <- read.csv("Data/Table-03-02.csv")
    hist(Table03p02$Thickness)
    # To look like Figure 3.3:
    hist(Table03p02$Thickness, right = FALSE, col = "brown", border = "white")

Numerical Summary of Data

Graphical summaries like the stem-and-leaf plot and the histogram give a visual impression of the distribution of a set of data.
Two key properties are:
  Central tendency;
  Dispersion.
We also need numerical summaries of these properties.

For a set of observations $x_1, x_2, \ldots, x_n$, numerical summaries of central tendency include:
  The sample average
    $$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n};$$
  The sample median;
  Other trimmed means.
In R:
    mean(Table03p01$Days) # 33.25
    median(Table03p02$Thickness) # 450
    mean(Table03p02$Thickness, trim = 0.25) # 449.94

The most common numerical summary of dispersion is the sample standard deviation $s = \sqrt{s^2}$, where
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
is the sample variance.
In R:
    sd(Table03p01$Days) # 9.374679
    sd(Table03p02$Thickness) # 13.42732

Box Plot

The box plot is a graphical display that shows 5 useful numerical summaries:
  The median;
  The two quartiles;
  The two extremes.
It often also shows possible outliers.
In R:
    # no outliers, whiskers extend to extremes:
    boxplot(Table03p01$Days)
    # possible outliers, whiskers extend only to non-outliers:
    boxplot(Table03p02$Thickness)

Comparison Box Plots

Box plots are also useful for comparing multiple sets of data.
In R:
    batteries <- read.csv("Data/batteries.csv")
    boxplot(Life ~ Temperature, data = batteries)
When a control chart is based on samples of characteristics, a comparison box plot is a good way to look for shifts in center or dispersion, possible outliers, and so on.

Probability Distributions

We shall be working with two kinds of distribution:
  Discrete distributions, characterized by a probability mass function (pmf), like the binomial distribution; the pmf is $p(x_i) = P(X = x_i)$ for each possible value $x_i$.
  Continuous distributions, characterized by a probability density function (pdf), like the normal distribution; the pdf $f(x)$ is interpreted through integrals:
    $$P(a \le X \le b) = \int_a^b f(x)\,dx.$$

Mean and Variance

The mean $\mu$ and variance $\sigma^2$ of a probability distribution are defined similarly to those of a sample:
$$\mu = \begin{cases} \sum_{i=1}^{\infty} x_i\, p(x_i), & \text{discrete } X \\ \int_{-\infty}^{\infty} x f(x)\,dx, & \text{continuous } X \end{cases}$$
$$\sigma^2 = \begin{cases} \sum_{i=1}^{\infty} (x_i - \mu)^2\, p(x_i), & \text{discrete } X \\ \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx, & \text{continuous } X \end{cases}$$

Examples
    # binomial pmf:
    plot(0:10, dbinom(0:10, size = 10, prob = 0.4), type = "h")
    points(0:10, dbinom(0:10, size = 10, prob = 0.4))
    # normal pdf:
    curve(dnorm(x), from = -3, to = 3)

Some Discrete Distributions

Hypergeometric Distribution

Sampling without replacement from a finite population.
Population size $N$, of which $D$ have some characteristic of interest.
Sample size $n$.
The random variable $X$ is the number in the sample that are found to have that characteristic.
Then the probability mass function is
$$p(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}, \quad \max(0, n + D - N) \le x \le \min(n, D).$$

The mean and variance are
$$E(X) = \frac{nD}{N} \quad\text{and}\quad \operatorname{Var}(X) = \frac{nD}{N}\left(1 - \frac{D}{N}\right)\frac{N-n}{N-1}.$$

Application
Acceptance sampling: $N$ is lot size, $D$ is number defective in the lot.
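The hypergeometric formulas above can be checked numerically in R. The sketch below uses made-up acceptance-sampling numbers (a lot of 100 with 5 defectives and a sample of 10; these values and the zero-defectives acceptance rule are assumptions for illustration, not from the text). It relies on R's dhyper(x, m, n, k), where m plays the role of D, n of N - D, and k of the sample size.

```r
# Hypothetical acceptance-sampling numbers, chosen only for illustration
N <- 100   # lot size
D <- 5     # number defective in the lot
n <- 10    # sample size

# Support of X: max(0, n + D - N) <= x <= min(n, D)
x <- max(0, n + D - N):min(n, D)

# dhyper(x, m, n, k): m = D items of interest, n = N - D others, k = sample size
pmf <- dhyper(x, m = D, n = N - D, k = n)

# The same pmf written directly from the formula
pmf_formula <- choose(D, x) * choose(N - D, n - x) / choose(N, n)

# Mean and variance computed from the pmf, to compare with the closed forms
mean_X <- sum(x * pmf)
var_X  <- sum((x - mean_X)^2 * pmf)
mean_formula <- n * D / N
var_formula  <- n * D / N * (1 - D / N) * (N - n) / (N - 1)

# Chance of accepting the lot under an assumed rule:
# accept only if the sample contains no defectives
p_accept <- dhyper(0, m = D, n = N - D, k = n)

print(cbind(x, pmf))
print(c(mean_X = mean_X, mean_formula = mean_formula))
print(c(var_X = var_X, var_formula = var_formula))
print(p_accept)
```

The factor (N - n)/(N - 1) in the variance is the finite population correction; when n is small relative to N it is close to 1 and the variance is close to the binomial value with p = D/N.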
Binomial Distribution

Sampling with replacement from a finite population, or sampling from an infinite population.
A sequence of $n$ independent trials, each of which results in “success” or “failure”.
The probability of success in each trial is $p$.
The random variable $X$ is the number of successes found in the $n$ trials.
Then the probability mass function is
$$p(x) = P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}, \quad x = 0, 1, \ldots, n.$$

The mean and variance are
$$E(X) = np \quad\text{and}\quad \operatorname{Var}(X) = np(1 - p).$$

Sample fraction
The quantity $\hat{p} = X / n$ is the fraction of trials that result in success.
$\hat{p}$ is the sample analog of the population probability $p$, and is the natural estimator of it.

Poisson Distribution

Simplest model for random counts with no upper limit.
For example, the number of blemishes in a new car’s paint.
The probability mass function is
$$p(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots,$$
where $\lambda$ is a positive parameter.
The mean and variance are
$$E(X) = \operatorname{Var}(X) = \lambda.$$

Negative Binomial and Geometric Distribution

Recall the sequence of independent trials, each with success probability $p$.
For some integer $r \ge 1$, the random variable $X$ is the number of the trial when the $r$th success is seen.
Then the probability mass function is
$$p(x) = P(X = x) = \binom{x-1}{r-1} p^r (1 - p)^{x-r}, \quad x = r, r+1, \ldots$$

The mean and variance are
$$E(X) = \frac{r}{p} \quad\text{and}\quad \operatorname{Var}(X) = \frac{r(1 - p)}{p^2}.$$

In the special case $r = 1$, that is, waiting for the first success, the pmf simplifies to
$$p(x) = p(1 - p)^{x-1}, \quad x \ge 1,$$
the Geometric Distribution.
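One point to watch when checking these formulas in R: dnbinom() and dgeom() count the number of failures before the r-th (or first) success, not the trial number, so the X defined here corresponds to failures + r. The sketch below, with illustrative values r = 3 and p = 0.2 (assumptions, not from the text), compares the formula with R's functions after that shift. The infinite support is truncated at 400 trials, where the remaining probability is negligible for these values.

```r
# Hypothetical values, chosen only for illustration
r <- 3       # number of successes to wait for
p <- 0.2     # success probability in each trial
x <- r:400   # trial numbers; truncated because the support is infinite

# pmf from the formula: X = number of the trial on which the r-th success occurs
pmf_formula <- choose(x - 1, r - 1) * p^r * (1 - p)^(x - r)

# R's dnbinom() counts failures before the r-th success, so shift by r
pmf_r <- dnbinom(x - r, size = r, prob = p)

# Mean and variance from the (truncated) pmf, to compare with the closed forms
mean_X <- sum(x * pmf_formula)
var_X  <- sum((x - mean_X)^2 * pmf_formula)
mean_formula <- r / p
var_formula  <- r * (1 - p) / p^2

# Geometric special case (r = 1): dgeom() also counts failures, so shift by 1
x1 <- 1:10
geom_formula <- p * (1 - p)^(x1 - 1)
geom_r <- dgeom(x1 - 1, prob = p)

print(c(mean_X = mean_X, mean_formula = mean_formula))
print(c(var_X = var_X, var_formula = var_formula))
print(cbind(x1, geom_formula, geom_r))
```

Under R's failure-count parameterization the mean is r(1 - p)/p rather than r/p; the variance is unchanged by the shift.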