TM 720 Lecture 03: Describing/Using Variation, SPC Process

ENGM 720 - Lecture 03
Describing & Using
Distributions, SPC Process
5/24/2017
ENGM 720: Statistical Process Control
Assignment:
Reading:
• Chapter 2: Finish reading
• Chapter 3: Start reading
Assignment 2:
• Obtain access to MS Excel
  • Verify access to the Data Analysis Add-In
• Access the class website:
  • Download the Normal Plot data spreadsheet
  • Download Assignment 2 Instructions (Materials page)
What is Quality
Many definitions:
• Better performance
• Better service
• Better value
• Whatever the customer says it is…
For SPC, quality means better:
• Understanding of process variation,
• Control of the variation in the process, and
• Improvement in the process variation.
Understanding Process Variation
Three Aspects:
• Location
• Spread
• Shape
Basic Statistics:
• Quantify
• Communicate
Location: Mode
The mode is the value (or values) that occurs most frequently in a distribution.
To find the mode:
1. Sort the values into order (with no repeats).
2. Tally up how many times each value appears in the original distribution.
3. The mode (or modes) has the largest tally.
Dist. 1 has two modes: 20 and 15 (four times each)
Dist. 2 has one mode: 15 (appearing seven times)
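The tally procedure above can be sketched in Python. This is only an illustration: the slides do not list the values in Dist. 1 or Dist. 2, so the sample below is hypothetical, and the function name `modes` is an assumption of this sketch.

```python
from collections import Counter

def modes(values):
    """Return every value that ties for the largest tally, in sorted order."""
    tally = Counter(values)        # steps 1-2: one tally per distinct value
    top = max(tally.values())      # step 3: the largest tally
    return sorted(v for v, count in tally.items() if count == top)

# Hypothetical sample (not the slide's Dist. 1): two values tie at three each
sample = [15, 15, 15, 18, 20, 20, 20, 22]
print(modes(sample))
```

Returning a list rather than a single value keeps the "mode or modes" case from the slide explicit.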
Location: Median
Half of the values will fall above and half of the values will fall below the median value.
To estimate the median:
• Sort the values (keeping the duplicates in the list), and then count from one end until you get to one half (rounding down) of the total number of values.
• For an odd number of values, the median is the next value.
• For an even number of values, the median value is half of the sum of the current value and the next sorted value.
Dist. 1 median is 19.5
Dist. 2 median is 15
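A minimal Python sketch of the counting procedure follows. The input lists are hypothetical, since the slide data are not given, and the function name `median` is chosen here for illustration.

```python
def median(values):
    """Median by the counting procedure: sort, count halfway, then pick or average."""
    ordered = sorted(values)       # duplicates stay in the list
    n = len(ordered)
    half = n // 2                  # one half of the count, rounded down
    if n % 2 == 1:
        # odd count: the median is the next value after counting `half` values
        return ordered[half]
    # even count: half the sum of the current value and the next sorted value
    return (ordered[half - 1] + ordered[half]) / 2

print(median([12, 15, 15, 19, 20]))   # odd number of values
print(median([12, 15, 19, 20]))       # even number of values
```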
Location: Mean
The mean has a special notation: $\bar{x}$ for a sample ($\mu$ for the entire population).
To calculate the mean:
1. Add up all of the values.
2. Divide the sum by the number of values:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Mean is influenced by outliers
Dist. 1 mean is 18.6
Dist. 2 mean is 15.0
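To see how a single outlier pulls the mean, here is a short Python sketch. Both lists are hypothetical illustrations, not the slide's distributions.

```python
def mean(values):
    """Sample mean: the sum of the values divided by the number of values."""
    return sum(values) / len(values)

# Hypothetical data: the same five values with and without one outlier
typical = [14, 15, 15, 16, 15]
with_outlier = typical + [45]

print(mean(typical))        # 15.0
print(mean(with_outlier))   # 20.0
```

In this example one extreme value moves the mean from 15 to 20 even though five of the six observations lie between 14 and 16.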
Spread: Range
Range is the difference between the maximum and the minimum values, denoted R.
$$R = \max(x_i) - \min(x_i)$$
This value gives us the extreme limits of the distribution spread.
• Much easier to calculate than other measures
• Very sensitive to outliers
Range of Dist. 1 is 11
Range of Dist. 2 is 4
Spread: Variance
Variance has the symbol $\sigma^2$ when referring to the entire population ($S^2$ for a sample variance).
• The formula for the variance is:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
If the population is known, use n in the denominator!
• Measures the dispersion with less emphasis on outliers
• Units for variance aren’t very intuitive
• Manual calculation is unpleasant (a calculating equation could be used)
The variance for Dist. 1 is 10.58; for Dist. 2 it is 1.63
Spread: Standard Deviation
The standard deviation ($\sigma$ for the population, or $S$ for a sample) is the square root of the variance.
• Defn.:
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
• Special calculating formula:
$$S = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}}$$
If the population is known, use n in the denominator!
• Not as easily influenced by outliers
• Has the same units as the measure of location
Std deviation for Dist. 1 is 3.25
Std deviation for Dist. 2 is 1.28
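The definitional and calculating formulas should agree, which a short Python sketch can check. The data values are hypothetical, and the function names and the `population` flag are assumptions made for this illustration.

```python
import math

def variance(values, population=False):
    """Definitional formula: divide by n for a full population, n - 1 for a sample."""
    n = len(values)
    xbar = sum(values) / n
    sum_sq_dev = sum((x - xbar) ** 2 for x in values)
    return sum_sq_dev / (n if population else n - 1)

def std_dev_calculating(values):
    """Calculating formula: sqrt((sum(x^2) - (sum(x))^2 / n) / (n - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

data = [12, 15, 15, 18, 20]               # hypothetical sample
print(variance(data))                     # sample variance (n - 1 denominator)
print(variance(data, population=True))    # population variance (n denominator)
print(std_dev_calculating(data))          # matches sqrt of the sample variance
```

The calculating formula needs only the running sums of x and x squared, which is why it is convenient for manual work.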
Shape: Prob. Density Functions
The shape of a distribution is a function that maps each potential x-value to the likelihood that it would appear if we sampled at random from the distribution. This is the probability density function (PDF).
Area Under the Normal Curve:
• μ ± 1σ: 68.27% of the total area
• μ ± 2σ: 95.45% of the total area
• μ ± 3σ: 99.73% of the total area
[Figure: Normal curve with the horizontal axis marked at μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]
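The areas under the Normal curve can be checked numerically. The sketch below uses the error function from Python's standard library instead of a printed table; the helper names `phi` and `area_within` are assumptions of this example. The results agree with the slide's percentages to within table rounding.

```python
import math

def phi(z):
    """Cumulative area under the standard Normal curve from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_within(k):
    """Fraction of the total area within k standard deviations of the mean."""
    return phi(k) - phi(-k)

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {100 * area_within(k):.2f}% of the total area")
```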
Shape: Stem-and-Leaf Plot
Data:
48 53 49 52 51 52 63 60 53 64
59 54 47 49 45 64 79 65 62 60
Divide each number into:
• Stem – one or more of the leading digits
• Leaf – remaining digits (may be ordered)
• Choose between 4 and 20 stems
Example:
4| 8 9 7 9 5
5| 3 2 1 2 3 4
5| 9
6| 3 0 4 4 2 0
6| 5
7|
7| 9
Done!
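One way to build the split-stem display from the slide's 20 data values is sketched below in Python. The splitting rule (leaves 0-4 on the first row of a stem, leaves 5-9 on the second) is inferred from the example rather than stated on the slide, and the function name is hypothetical.

```python
def stem_and_leaf(values):
    """Rows of a split-stem plot for two-digit data, leaves in their original order."""
    rows = {}
    for v in values:
        stem, leaf = divmod(v, 10)
        # key is (stem, upper half?): leaves 0-4 go on the first row, 5-9 on the second
        rows.setdefault((stem, leaf >= 5), []).append(leaf)
    lo, hi = min(rows), max(rows)
    keys = [(s, upper) for s in range(lo[0], hi[0] + 1) for upper in (False, True)]
    keys = [k for k in keys if lo <= k <= hi]   # keep empty rows between the extremes
    return [f"{s}|" + "".join(f" {leaf}" for leaf in rows.get((s, upper), []))
            for s, upper in keys]

data = [48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
        59, 54, 47, 49, 45, 64, 79, 65, 62, 60]
print("\n".join(stem_and_leaf(data)))
```

Keeping the empty `7|` row preserves the gap before 79, which is what makes that value stand out as a possible outlier.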
Shape: Box (and Whisker) Plot
[Figure: Box-and-Whisker Plot. Vertical axis: Value (45 to 85). Labeled features: Max value, Third quartile, Mean, Median, First quartile, Min value]
Visual display of
• central tendency, variability, symmetry, outliers
Shape: Histogram
A histogram is a vertical bar chart that takes the shape of the distribution of the data. The process for creating a histogram depends on the purpose for making the histogram.
• One purpose of a histogram is to see the shape of a distribution. To do this, we would like to have as much data as possible, and use a fine resolution.
• A second purpose of a histogram is to observe the frequency with which a class of problems occurs. The resolution is controlled by the number of problem classes.
Histogram Example (Excel)
[Figure: Excel histogram. Horizontal axis: Bin (511 to 530); vertical axis: Frequency (0 to 25)]
Goals of Statistical Quality Improvement
• Find special causes
• Head off shifts in process
• Obtain predictable output
• Continually improve the process
[Figure: Statistical Quality Control and Improvement: Improving Process Capability and Performance. Distributions shown over Time between LSL and USL, with stages labeled: Continually Improve the System; Characterize Stable Process Capability; Head Off Shifts in Location, Spread; Identify Special Causes - Bad (Remove); Identify Special Causes - Good (Incorporate); Reduce Variability; Center the Process]
Distributions
• Distributions quantify the probability of an event
• Events near the mean are most likely to occur; events further away are less likely to be observed
[Figure: Normal curve with the specification 35.0 ± 2.5 marked. Axis: 30.4 (μ − 3σ), 32.6 (μ − 2σ), 34.8 (μ − σ), 37 (μ), 39.2 (μ + σ), 41.4 (μ + 2σ), 43.6 (μ + 3σ)]
Normal Distribution
[Figure: Normal Distribution PDF, f(x) versus X, for mean 0 and std. dev. 1. X runs from -4 to 4; f(x) runs from 0 to 0.4]
Notation: r.v. x ~ N(μ, σ)
• This is read: “x is normally distributed with mean μ and standard deviation σ.”
Standard Normal Distribution
• r.v. z ~ N(μ = 0, σ = 1)
(z represents a Standard Normal r.v.)
Simple Interpretation of Standard Deviation of Normal Distribution
$$P(\mu - \sigma \le x \le \mu + \sigma) = .6827$$
$$P(\mu - 2\sigma \le x \le \mu + 2\sigma) = .9545$$
$$P(\mu - 3\sigma \le x \le \mu + 3\sigma) = .9973$$
Standard Normal Distribution
• The Standard Normal Distribution has a mean (μ) of 0 and a standard deviation (σ) of 1
• Total area under the curve, Φ(z), from z = –∞ to z = ∞ is exactly 1
• The curve is symmetric about the mean
• Half of the total area lies on either side, so:
Φ(–z) = 1 – Φ(z)
[Figure: Standard Normal curve with the cumulative area Φ(z) shaded up to z]
Standard Normal Distribution
• How likely is it that we would observe a data point more than 2.57 standard deviations beyond the mean?
• The area under the curve from –∞ to z = 2.57 is found by using the table on pp. 716-717 to look up the cumulative area Φ(2.57); subtracting that cumulative area from 1 gives the area beyond z = 2.57.
• Answer: 1 – .99492 = .00508, or about 5 times in 1000
[Figure: Standard Normal curve with the cumulative area Φ(z) shaded up to z]
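The table lookup can be cross-checked in Python. This sketch computes the cumulative area with the standard library's error function rather than the textbook table; the helper name `phi` is an assumption of the example.

```python
import math

def phi(z):
    """Cumulative area under the standard Normal curve from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

upper_tail = 1.0 - phi(2.57)   # area beyond z = 2.57
print(f"P(z > 2.57) = {upper_tail:.4f}")

# Symmetry check: the area below -2.57 equals the area above +2.57
print(abs(phi(-2.57) - upper_tail) < 1e-12)
```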
What if the distribution isn’t a Standard Normal Distribution?
If it is from any Normal Distribution, we can express the difference from an observation to the mean in units of the standard deviation, and this converts it to a Standard Normal Distribution.
• Conversion formula is:
$$z = \frac{x - \mu}{\sigma}$$
where:
x is the point in the interval,
μ is the population mean, and
σ is the population standard deviation.
What if the distribution isn’t even a Normal Distribution?
The Central Limit Theorem allows us to take the sum of several means, regardless of their distribution, and approximate this sum using the Normal Distribution if the number of observations is large enough.
• Most assemblies are the result of adding together components, so if we take the sum of the means for each component as an estimate for the entire assembly, we meet the CLT criteria.
• If we take the mean of a sample from a distribution, we meet the CLT criteria (think of how the mean is computed).
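A small simulation can illustrate the second bullet. The sketch below draws samples from a Uniform(0, 1) distribution, which is clearly not Normal, and looks at how the sample means behave. The sample size of 30, the 2000 trials, and the fixed seed are arbitrary choices for this illustration, not values from the lecture.

```python
import math
import random
import statistics

def sample_means(n, trials, seed=0):
    """Means of `trials` samples, each of n draws from a Uniform(0, 1) distribution."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.random() for _ in range(n)) for _ in range(trials)]

means = sample_means(n=30, trials=2000)
center = statistics.fmean(means)
spread = statistics.stdev(means)

# Fraction of sample means within one standard deviation of their center;
# for a Normal distribution this would be about 0.68
within_one = sum(abs(m - center) <= spread for m in means) / len(means)

print(center)       # near the Uniform(0, 1) mean of 0.5
print(spread)       # near sqrt(1/12) / sqrt(30)
print(within_one)
```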
Example: Process Yield
Specifications are often set irrespective of process distribution, but if we understand our process we can estimate yield / defects.
• Assume a specification calls for a value of 35.0 ± 2.5.
• Assume the process is Normally distributed, with a mean of 37.0 and a standard deviation of 2.20.
• Estimate the proportion of the process output that will meet specifications.
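The slide leaves the estimate as an exercise. One way to work it, assuming the specification limits are 32.5 and 37.5, is to convert each limit to a z value and take the difference of the cumulative areas. The Python sketch below does this with the standard library; the helper names are assumptions of the example.

```python
import math

def phi(z):
    """Cumulative area under the standard Normal curve from -infinity to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_score(x, mu, sigma):
    """Distance from x to the mean in units of the standard deviation."""
    return (x - mu) / sigma

mu, sigma = 37.0, 2.20                 # process mean and standard deviation
lsl, usl = 35.0 - 2.5, 35.0 + 2.5      # specification limits from 35.0 +/- 2.5

z_low = z_score(lsl, mu, sigma)
z_high = z_score(usl, mu, sigma)
yield_fraction = phi(z_high) - phi(z_low)

print(f"z_low = {z_low:.3f}, z_high = {z_high:.3f}")
print(f"estimated yield = {yield_fraction:.3f}")
```

Under these assumptions only a little over half of the output meets specifications, because the process mean sits close to the upper limit.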
Continuous & Discrete Distributions
Continuous
• Probability of a range of outcomes is the area under the PDF (integration)
[Figure: continuous Normal curve with the specification 35.0 ± 2.5 marked. Axis: 30.4 (μ − 3σ), 32.6 (μ − 2σ), 34.8 (μ − σ), 37 (μ), 39.2 (μ + σ), 41.4 (μ + 2σ), 43.6 (μ + 3σ)]
Discrete
• Probability of a range of outcomes is the area under the PDF (sum discrete outcomes)
[Figure: discrete distribution with the specification 35.0 ± 2.5 marked. Axis from 30 to 42 in steps of 2]
Discrete Distribution Example
Sum of two six-sided dice:
• Outcomes range from 2 to 12.
• Counting the possible ways to obtain each individual sum forms a histogram.
• What is the most frequently occurring sum that you could roll?
  • Most likely outcome is a sum of 7 (there are 6 ways to obtain it)
• What is the probability of obtaining the most likely sum in a single roll of the dice?
  • 6 ÷ 36 = .167
• What is the probability of obtaining a sum greater than 2 and less than 11?
  • 32 ÷ 36 = .889
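The dice counts can be verified by enumerating all 36 outcomes. This is a minimal Python sketch; the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction

# Tally the number of ways to roll each sum with two six-sided dice
ways = Counter(a + b for a in range(1, 7) for b in range(1, 7))
total = sum(ways.values())                      # 36 equally likely outcomes

most_likely = max(ways, key=ways.get)           # sum with the largest tally
p_most_likely = Fraction(ways[most_likely], total)

# Sum the discrete outcomes strictly between 2 and 11
p_between = Fraction(sum(c for s, c in ways.items() if 2 < s < 11), total)

print(most_likely, ways[most_likely])           # 7 6
print(float(p_most_likely))                     # about .167
print(float(p_between))                         # about .889
```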
How do we know what the distribution is when all we have is a sample?
Theory – “CLT applies to measurements taken consisting of many assemblies…”
Experience – “past use of a distribution has generated very good results…”
“Testing” – combination of the above … in this case, anyway!
• If we know the generating function for a distribution, we can construct a grid (probability paper) that will allow us to observe a straight line when sufficient data from that distribution are plotted on the grid
• Easiest grid to create is the Standard Normal Distribution …
  • because it is an easy transformation to “standard” parameters
Normal Probability Plots
• Take raw data and count observations (n)
• Set up a column of j values (1 to n)
• Compute Φ(zj) for each j value
  • Φ(zj) = (j - 0.5)/n
• Get zj value for each Φ(zj) in Standard Normal Table
  • Find table entry (Φ(zj)), then read index value (zj)
• Set up a column of sorted, observed data
  • Sorted in increasing value
• Plot zj values versus sorted data values
• Approximate with a sketched line through the 25% and 75% points
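The plotting positions can be computed in Python as a rough sketch of these steps. Here `statistics.NormalDist().inv_cdf` stands in for the reverse table lookup, the function name `normal_plot_points` is hypothetical, and the data are made up; the class spreadsheet is not reproduced here.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (sorted value, z_j) pairs for a Normal probability plot."""
    ordered = sorted(data)                    # observed data in increasing order
    n = len(ordered)                          # count of observations
    std_normal = NormalDist()                 # mean 0, standard deviation 1
    points = []
    for j, x in enumerate(ordered, start=1):  # j runs from 1 to n
        cum_area = (j - 0.5) / n              # Phi(z_j)
        z_j = std_normal.inv_cdf(cum_area)    # replaces the reverse table lookup
        points.append((x, z_j))
    return points

# Hypothetical measurements
for x, z in normal_plot_points([36.1, 38.4, 35.2, 37.0, 39.3]):
    print(f"{x:6.1f}  {z:7.3f}")
```

Plotting the z values against the sorted data and checking for a straight line is then a matter of any charting tool, including the Excel workflow the assignment uses.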
Interpreting Normal Plots
Assess Equal-Variance and Normality assumptions
• Data from a Normal sample should tend to fall along the line, so if a “fat pencil” covers almost all of the points, then a normality assumption is supported
• The slope of the line reflects the variance of the sample, so equal slopes support the equal variance assumption
Theoretically:
• Sketched line should intercept the zj = 0 axis at the mean value
Practically:
• Close is good enough for comparing means
• Closer is better for comparing variances
• If the slopes differ much for two samples, use a test that assumes the variances are not the same
Relationship with Hypothesis Tests
Assuming that our process is Normally Distributed and centered at the mean, how far apart should our specification limits be to obtain 99.5% yield?
• Proportion defective will be 1 – .995 = .005, and if the process is centered, half of those defectives will occur on the right tail (.0025), and half on the left tail.
• Getting 1 – .0025 = 99.75% yield before the right tail requires the upper specification limit to be set at μ + 2.81σ.
• By symmetry, the remaining .25% defective should occur at the left side, with the lower specification limit set at μ – 2.81σ.
• If we specify our process in this manner and made a lot of parts, we would only produce bad parts .5% of the time.
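The 2.81 multiplier can be reproduced with the inverse of the standard Normal cumulative distribution. The Python sketch below assumes a centered process, as the slide does; the function name `centered_spec_z` is hypothetical.

```python
from statistics import NormalDist

def centered_spec_z(target_yield):
    """z multiplier so that mu +/- z*sigma contains target_yield of a centered Normal process."""
    tail = (1.0 - target_yield) / 2.0         # defectives split evenly between the tails
    return NormalDist().inv_cdf(1.0 - tail)   # z with cumulative area 1 - tail

z = centered_spec_z(0.995)
print(f"USL = mu + {z:.2f} sigma, LSL = mu - {z:.2f} sigma")
```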
Questions & Issues