ST 435/535
Statistical Methods for Quality and Productivity Improvement / Statistical Process Control
Modeling Process Quality
Statistical methodology plays an important role in quality control and
quality improvement.
Descriptive statistics of a sample of data display the variation in a
quality characteristic.
Probability distributions are our main tool for modeling the variation
in a quality characteristic.
Describing Variation
Characteristics of the quality of goods and services always show
variation:
Net content of a can of a soft drink, when measured precisely;
Time to complete an on-line banking transaction.
We shall use both graphical and numerical tools to describe this
variation.
Stem-and-Leaf Plot
The stem-and-leaf display is a useful pencil-and-paper technique.
Shows the distribution of data, like a histogram.
Makes it easy to find the median, quartiles, and other quantiles.
Example 3.1
Table 3.1, Days to pay a health insurance claim
In R:
Table03p01 <- read.csv("Data/Table-03-01.csv")
stem(Table03p01$Days)
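The link between the ordered display and the quantiles can be sketched on a small made-up data set. The vector `days_demo` is hypothetical and is not the Table 3.1 data; `quantile()` is used with its default definition, which is only one of several conventions for quartiles.

```r
# Hypothetical data, for illustration only (not Table 3.1)
days_demo <- c(12, 15, 21, 23, 23, 27, 30, 31, 34, 36, 38, 41, 45)

# Stem-and-leaf display: the leaves on each stem are printed in sorted order
stem(days_demo)

# Because the display is ordered, quantiles can be found by counting leaves
median_demo <- median(days_demo)                      # the 7th of 13 sorted values
quartiles_demo <- quantile(days_demo, c(0.25, 0.75))  # default (type 7) quartiles

print(median_demo)
print(quartiles_demo)
```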
Time Series Plot
The stem-and-leaf plot shows many aspects of the data, but not the
time order.
The time series plot shows the observations in time order and is the
starting point for a control chart, which adds a center line and
control limits. It can show:
Shift in the mean;
Change in variability.
In R:
plot(Days ~ Claim, data = Table03p01)
# or simply plot(Table03p01$Days)
Histogram
The number of leaves on each stem in a stem-and-leaf plot shows the
distribution of the data.
The histogram shows similar counts, with fewer restrictions on the
choice of class intervals (bins).
In R:
hist(Table03p01$Days, right = FALSE) # same counts as in stem
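As a rough check of why `right = FALSE` matters, the sketch below compares stem counts with histogram counts on hypothetical data (`days_demo` is made up, not Table 3.1). With `right = FALSE` the intervals are closed on the left, so a value such as 30 is counted with the 30s, as it would be on a stem.

```r
# Hypothetical data, for illustration only (not Table 3.1)
days_demo <- c(12, 15, 21, 23, 23, 27, 30, 31, 34, 36, 38, 41, 45)

# Number of leaves per stem when the stem is the tens digit
stem_counts <- as.vector(table(days_demo %/% 10))

# Histogram counts on [10, 20), [20, 30), ...; plot = FALSE returns counts without drawing
h <- hist(days_demo, breaks = seq(10, 50, by = 10), right = FALSE, plot = FALSE)

print(stem_counts)
print(h$counts)
```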
Example 3.2
Table 3.2, thickness of vapor deposition layers on silicon
In R:
Table03p02 <- read.csv("Data/Table-03-02.csv");
# To look like Figure 3.3:
hist(Table03p02$Thickness, right = FALSE,
col = "brown", border = "white")
Numerical Summary of Data
Graphical summaries like the stem-and-leaf plot and the histogram
give a visual impression of the distribution of a set of data.
Two key properties are:
Central tendency;
Dispersion (variability).
We also need numerical summaries of these properties.
For a set of observations $x_1, x_2, \ldots, x_n$, numerical summaries of
central tendency include:
The sample average
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i;$$
The sample median;
Other trimmed means.
In R:
# 33.25
# 450
mean(Table03p02$Thickness, trim = 0.25) # 449.94
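A minimal sketch of these three summaries on made-up values (`thick_demo` is hypothetical, not Table 3.2). It includes one unusually large value to show how the mean reacts more strongly than the median or a trimmed mean.

```r
# Hypothetical data, for illustration only (not Table 3.2)
thick_demo <- c(438, 445, 447, 449, 450, 451, 453, 455, 462, 490)

n <- length(thick_demo)
xbar_formula <- sum(thick_demo) / n      # (x1 + ... + xn) / n
xbar <- mean(thick_demo)                 # same value from mean()
med <- median(thick_demo)                # n is even: average of the two middle values
trimmed <- mean(thick_demo, trim = 0.1)  # drops the lowest and highest 10% before averaging

print(c(mean = xbar, median = med, trimmed = trimmed))
```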
The most common numerical summary of dispersion is the sample
standard deviation $s = \sqrt{s^2}$, where
$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
is the sample variance.
In R:
# 9.374679
sd(Table03p02$Thickness) # 13.42732
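To connect the formula to the built-in functions, the sketch below computes the sample variance by hand, with divisor n - 1, and compares it with `var()` and `sd()`. The data vector `thick_demo` is hypothetical.

```r
# Hypothetical data, for illustration only (not Table 3.2)
thick_demo <- c(438, 445, 447, 449, 450, 451, 453, 455, 462, 490)

n <- length(thick_demo)
xbar <- mean(thick_demo)

# Sample variance from the definition: sum of squared deviations over (n - 1)
s2_formula <- sum((thick_demo - xbar)^2) / (n - 1)
s_formula <- sqrt(s2_formula)

print(c(formula = s2_formula, var = var(thick_demo)))
print(c(formula = s_formula, sd = sd(thick_demo)))
```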
Box Plot
The box plot is a graphical display that shows 5 useful numerical
summaries:
The median;
The two quartiles;
The two extremes.
It often also shows possible outliers.
In R:
# no outliers, whiskers extend to extremes:
# possible outliers, whiskers extend only to non-outliers:
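The two whisker conventions can be sketched on made-up data (`thick_demo` is hypothetical). The `range` argument of `boxplot()` controls how far the whiskers may extend: `range = 0` sends them to the extremes, while the default `range = 1.5` flags points more than 1.5 box lengths from the box as possible outliers.

```r
# Hypothetical data, for illustration only
thick_demo <- c(438, 445, 447, 449, 450, 451, 453, 455, 462, 490)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(thick_demo)
print(five)

# Points that the default rule would flag as possible outliers
flagged <- boxplot.stats(thick_demo)$out
print(flagged)

# Whiskers extend to the extremes, nothing is flagged
boxplot(thick_demo, range = 0)
# Whiskers stop at the most extreme non-flagged points; flagged points are drawn separately
boxplot(thick_demo)
```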
Comparison Box Plots
Box plots are also useful for comparing multiple sets of data.
In R:
batteries <- read.csv("Data/batteries.csv");
boxplot(Life ~ Temperature, data = batteries)
When a control chart is based on samples of characteristics, a
comparison box plot is a good way to look for shifts in center or
dispersion, possible outliers, and so on.
Probability Distributions
We shall be working with two kinds of distribution:
Discrete distributions, characterized by a probability mass
function (pmf), like the binomial distribution; the pmf is
$p(x_i) = P(X = x_i)$ for each possible value $x_i$.
Continuous distributions, characterized by a probability density
function (pdf), like the normal distribution; the pdf $f(x)$ is
interpreted through integrals:
$$P(a \le X \le b) = \int_a^b f(x)\,dx.$$
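A short sketch of the two interpretations, using a binomial pmf and the standard normal pdf. The parameter values and the interval endpoints `a` and `b` are arbitrary choices for illustration.

```r
# Discrete case: probabilities come directly from the pmf and sum to 1
pmf <- dbinom(0:10, size = 10, prob = 0.4)
total <- sum(pmf)
p_2_to_4 <- sum(dbinom(2:4, size = 10, prob = 0.4))  # P(2 <= X <= 4)

# Continuous case: P(a <= X <= b) is the integral of the pdf from a to b
a <- -1
b <- 2
p_int <- integrate(dnorm, lower = a, upper = b)$value  # numerical integration
p_cdf <- pnorm(b) - pnorm(a)                           # same probability from the cdf

print(c(total = total, p_2_to_4 = p_2_to_4))
print(c(integral = p_int, cdf = p_cdf))
```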
Mean and Variance
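The mean $\mu$ and variance $\sigma^2$ of a probability distribution are defined
similarly to those of a sample:
$$\mu = \begin{cases} \sum_{i=1}^{\infty} x_i \, p(x_i) & \text{discrete } X \\ \int_{-\infty}^{\infty} x f(x)\,dx & \text{continuous } X \end{cases}$$
$$\sigma^2 = \begin{cases} \sum_{i=1}^{\infty} (x_i - \mu)^2 \, p(x_i) & \text{discrete } X \\ \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx & \text{continuous } X \end{cases}$$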
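These definitions can be checked numerically. The sketch below uses a binomial distribution for the discrete case and a normal distribution with mean 1 and standard deviation 2 for the continuous case; both choices are assumptions made only for illustration.

```r
# Discrete X: sums over the possible values weighted by the pmf
x <- 0:10
p <- dbinom(x, size = 10, prob = 0.4)
mu_d <- sum(x * p)
var_d <- sum((x - mu_d)^2 * p)

# Continuous X: integrals weighted by the pdf (normal with mean 1, sd 2)
f <- function(x) dnorm(x, mean = 1, sd = 2)
mu_c <- integrate(function(x) x * f(x), -Inf, Inf)$value
var_c <- integrate(function(x) (x - mu_c)^2 * f(x), -Inf, Inf)$value

print(c(mu_d = mu_d, var_d = var_d))
print(c(mu_c = mu_c, var_c = var_c))
```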
In R:
# binomial pmf:
plot(0:10, dbinom(0:10, size = 10, prob = 0.4), type = "h");
points(0:10, dbinom(0:10, size = 10, prob = 0.4))
# normal pdf:
curve(dnorm(x), from = -3, to = 3)
Some Discrete Distributions
Hypergeometric Distribution
Sampling without replacement from a finite population.
Population size $N$, of which $D$ have some characteristic of
interest.
Sample size $n$. The random variable $X$ is the number in the
sample that are found to have that characteristic.
Then the probability mass function is
$$p(x) = P(X = x) = \frac{\binom{D}{x} \binom{N-D}{n-x}}{\binom{N}{n}}, \quad \max(0, n + D - N) \le x \le \min(n, D).$$
The mean and variance are
$$E(X) = \frac{nD}{N}$$
$$\mathrm{Var}(X) = \frac{nD}{N} \left(1 - \frac{D}{N}\right) \frac{N-n}{N-1}$$
Acceptance sampling: N is lot size, D is number defective in the lot.
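A sketch of the hypergeometric pmf for an acceptance-sampling setting. The lot size, number defective, and sample size are assumed values chosen for illustration. Note that R's `dhyper()` is parameterized by the number of items with the characteristic (`m`), the number without it (`n`), and the sample size (`k`), so its argument `n` is not the sample size used in the formula above.

```r
# Assumed illustrative values: lot of N = 100 items, D = 5 defective, sample of n = 10
N <- 100
D <- 5
n <- 10
x <- max(0, n + D - N):min(n, D)   # possible values of X

# pmf from the formula
p_formula <- choose(D, x) * choose(N - D, n - x) / choose(N, n)

# Same pmf from dhyper(): m = D, n = N - D, k = sample size
p_dhyper <- dhyper(x, m = D, n = N - D, k = n)

# Mean and variance computed from the pmf
mu <- sum(x * p_dhyper)
sigma2 <- sum((x - mu)^2 * p_dhyper)

print(round(p_dhyper, 4))
print(c(mean = mu, variance = sigma2))
```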
Binomial Distribution
Sampling with replacement from a finite population, or sampling
from an infinite population.
A sequence of n independent trials, each of which results in
“success” or “failure”.
The probability of success in each trial is p.
The random variable X is the number of successes found in the
n trials.
Then the probability mass function is
$$p(x) = P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}, \quad x = 0, 1, \ldots, n.$$
The mean and variance are
$$E(X) = np$$
$$\mathrm{Var}(X) = np(1 - p)$$
Sample fraction
The quantity $\hat{p} = X/n$ is the fraction of trials that result in success.
$\hat{p}$ is the sample analog of the population probability $p$, and is the
natural estimator of it.
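A brief sketch that checks the binomial pmf, mean, and variance numerically and then computes a sample fraction from simulated trials. The values of `n`, `p`, the number of simulated trials, and the seed are all arbitrary assumptions.

```r
n <- 10
p <- 0.4
x <- 0:n

# pmf from the formula and from dbinom()
p_formula <- choose(n, x) * p^x * (1 - p)^(n - x)
p_dbinom <- dbinom(x, size = n, prob = p)

# Mean and variance computed from the pmf
mu <- sum(x * p_dbinom)
sigma2 <- sum((x - mu)^2 * p_dbinom)
print(c(mean = mu, variance = sigma2))

# Sample fraction: number of successes in simulated trials, divided by the number of trials
set.seed(1)
n_trials <- 1000
X <- rbinom(1, size = n_trials, prob = p)
p_hat <- X / n_trials
print(p_hat)
```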
Poisson Distribution
Simplest model for random counts with no upper limit.
For example, the number of blemishes in a new car’s paint.
The probability mass function is
$$p(x) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$
where $\lambda$ is a positive parameter.
The mean and variance are
$$E(X) = \mathrm{Var}(X) = \lambda$$
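The equality of mean and variance can be checked numerically. In this sketch the rate `lambda` is an assumed value, and the infinite sum is truncated at 50, beyond which the probabilities are negligible for this rate.

```r
lambda <- 2.5   # assumed rate, for illustration only
x <- 0:50       # truncation of the infinite range

# pmf from the formula and from dpois()
p_formula <- exp(-lambda) * lambda^x / factorial(x)
p_dpois <- dpois(x, lambda)

# Mean and variance computed from the (truncated) pmf
mu <- sum(x * p_dpois)
sigma2 <- sum((x - mu)^2 * p_dpois)

print(c(mean = mu, variance = sigma2))
```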
Negative Binomial and Geometric Distribution
Recall the sequence of independent trials, each with success
probability p.
For some integer $r > 0$, the random variable $X$ is the number of the
trial on which the $r$th success is seen.
Then the probability mass function is
$$p(x) = P(X = x) = \binom{x-1}{r-1} p^r (1 - p)^{x-r}, \quad x = r, r+1, \ldots$$
The mean and variance are
$$E(X) = \frac{r}{p}$$
$$\mathrm{Var}(X) = \frac{r(1 - p)}{p^2}$$
In the special case $r = 1$, that is, waiting for the first success, the pmf
simplifies to
$$p(x) = p(1 - p)^{x-1}, \quad x \ge 1,$$
the Geometric Distribution.
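One caution when using R here: `dnbinom()` and `dgeom()` describe the number of failures before the target success, not the trial number, so the argument has to be shifted. The sketch below uses assumed values of `r` and `p`, and truncates the infinite range at 500 trials, where the remaining probability is negligible.

```r
r <- 3      # assumed number of successes to wait for
p <- 0.2    # assumed success probability
x <- r:500  # trial number of the r-th success (truncated range)

# pmf from the formula, in terms of the trial number x
p_formula <- choose(x - 1, r - 1) * p^r * (1 - p)^(x - r)

# dnbinom() counts failures before the r-th success, so use x - r
p_dnbinom <- dnbinom(x - r, size = r, prob = p)

# Mean and variance computed from the (truncated) pmf
mu <- sum(x * p_formula)
sigma2 <- sum((x - mu)^2 * p_formula)
print(c(mean = mu, variance = sigma2))

# Geometric case (r = 1): dgeom() counts failures before the first success
xg <- 1:10
g_formula <- p * (1 - p)^(xg - 1)
g_dgeom <- dgeom(xg - 1, prob = p)
```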