ENGM 720 - Lecture 03 Describing & Using Distributions, SPC Process
5/24/2017 ENGM 720: Statistical Process Control

Assignment
Reading:
• Chapter 2: finish reading
• Chapter 3: start reading
Assignment 2:
• Obtain access to MS Excel
• Verify access to the Data Analysis Add-In
Access the class website:
• Download the Normal Plot data spreadsheet
• Download the Assignment 2 Instructions (Materials page)

What is Quality
Many definitions:
• Better performance
• Better service
• Better value
• Whatever the customer says it is…
For SPC, quality means better:
• Understanding of process variation,
• Control of the variation in the process, and
• Improvement in the process variation.

Understanding Process Variation
Three Aspects:
• Location
• Spread
• Shape
Basic Statistics:
• Quantify
• Communicate

Location: Mode
The mode is the value (or values) that occurs most frequently in a distribution.
To find the mode:
1. Sort the values into order (with no repeats).
2. Tally up how many times each value appears in the original distribution.
3. The mode (or modes) has the largest tally.
Dist. 1 has two modes: 20 and 15 (four times each)
Dist. 2 has one mode: 15 (appearing seven times)

Location: Median
Half of the values will fall above and half of the values will fall below the median value.
To estimate the median:
• Sort the values (keeping the duplicates in the list), and then count from one end until you get to one half (rounding down) of the total number of values.
• For an odd number of values, the median is the next value.
• For an even number of values, the median is half of the sum of the current value and the next sorted value.
Dist. 1 median is 19.5
Dist. 2 median is 15

Location: Mean
The mean has a special notation: $\bar{x}$ for a sample ($\mu$ for the entire population).
To calculate the mean:
1. Add up all of the values.
2. Divide the sum by the number of values.
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
The mean is influenced by outliers.
Dist. 1 mean is 18.6, Dist. 2 mean is 15.0

Spread: Range
Range is the difference between the maximum and the minimum values, denoted $R$.
$$R = \max(x_i) - \min(x_i)$$
This value gives us the extreme limits of the distribution spread.
• Much easier to calculate than other measures
• Very sensitive to outliers
Range of Dist. 1 is 11
Range of Dist. 2 is 4

Spread: Variance
Variance has the symbol $\sigma^2$ when referring to the entire population ($s^2$ for a sample variance).
• The formula for the variance is:
$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
• Measures the dispersion with less emphasis on outliers than the range
• Units for variance aren't very intuitive
• Manual calculation is unpleasant (the calculating equation could be used)
If the population is known, use $n$ in the denominator!
The variance for Dist. 1 is 10.58, for Dist. 2 it is 1.63

Spread: Standard Deviation
The standard deviation ($\sigma$ for the population, or $s$ for a sample) is the square root of the variance.
• Definition:
$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
• Special calculating formula:
$$S = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}}$$
• Not as easily influenced by outliers
• Has the same units as the measure of location
If the population is known, use $n$ in the denominator!
Std deviation for Dist. 1 is 3.25
Std deviation for Dist. 2 is 1.28

Shape: Prob. Density Functions
The shape of a distribution is a function that maps each potential x-value to the likelihood that it would appear if we sampled at random from the distribution. This is the probability density function (PDF).
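The location and spread measures defined above can be sketched in a few lines of Python. The slides do not list the values behind Dist. 1 and Dist. 2, so the `sample` list and the `describe` helper below are hypothetical and the printed numbers will not match the slide results. The sketch also computes the sample variance twice, once from the definition and once from the calculating formula, to show that they agree.

```python
import statistics

def describe(values):
    """Return the location and spread measures defined on the slides."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: squared deviations from the mean, divided by n - 1.
    s2 = sum((x - mean) ** 2 for x in values) / (n - 1)
    # Calculating formula: same quantity from the raw sums.
    s2_calc = (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)
    return {
        "modes": statistics.multimode(values),  # every value with the top tally
        "median": statistics.median(values),
        "mean": mean,
        "range": max(values) - min(values),
        "variance": s2,
        "variance_calc": s2_calc,
        "std_dev": s2 ** 0.5,
    }

# Hypothetical sample, chosen only for illustration.
sample = [15, 15, 17, 19, 20, 20, 22, 24]
summary = describe(sample)
for name, value in summary.items():
    print(name, value)
```

For a full population rather than a sample, the divisor `n - 1` would become `n`, as the slides note.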
Area Under the Normal Curve:
• $\mu \pm 1\sigma$: 68.27% of the total area
• $\mu \pm 2\sigma$: 95.45% of the total area
• $\mu \pm 3\sigma$: 99.73% of the total area
[Figure: normal curve with the axis marked from $\mu - 3\sigma$ to $\mu + 3\sigma$]

Shape: Stem-and-Leaf Plot
Data: 48 53 49 52 51 52 63 60 53 64 59 54 47 49 45 64 79 65 62 60
Divide each number into:
• Stem: one or more of the leading digits
• Leaf: remaining digits (may be ordered)
• Choose between 4 and 20 stems
Example:
4| 8 9 7 9 5
5| 3 2 1 2 3 4
5| 9
6| 3 0 4 4 2 0
6| 5
7|
7| 9
Done!

Shape: Box (and Whisker) Plot
[Figure: Box-and-Whisker Plot; vertical axis "Value" from 45 to 85; labeled features: max value, third quartile, mean, median, first quartile, min value]
Visual display of:
• central tendency, variability, symmetry, outliers

Shape: Histogram
A histogram is a vertical bar chart that takes the shape of the distribution of the data. The process for creating a histogram depends on the purpose for making the histogram.
• One purpose of a histogram is to see the shape of a distribution. To do this, we would like to have as much data as possible, and use a fine resolution.
• A second purpose of a histogram is to observe the frequency with which a class of problems occurs. The resolution is controlled by the number of problem classes.
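The stem-and-leaf construction described above is essentially a histogram laid on its side, and it can be sketched as follows. The function name `stem_and_leaf` and its `split` flag are illustrative choices, not part of the lecture. The sketch assumes two-digit whole numbers, takes the tens digit as the stem, and splits each stem into a low row (leaves 0-4) and a high row (leaves 5-9), as the slide example does.

```python
from collections import defaultdict

def stem_and_leaf(values, split=True):
    """Group two-digit values by tens digit (stem); leaves are the units digits.

    With split=True each stem gets two rows: leaves 0-4, then leaves 5-9.
    """
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)  # leaves stay in data order
    halves = (0, 1) if split else (0,)
    first, last = min(rows), max(rows)
    lines = []
    for stem in range(first[0], last[0] + 1):
        for half in halves:
            key = (stem, half)
            if first <= key <= last:  # keep empty rows only inside the data range
                leaves = " ".join(str(leaf) for leaf in rows.get(key, []))
                lines.append(f"{stem}| {leaves}".rstrip())
    return lines

# Data from the Stem-and-Leaf Plot slide.
data = [48, 53, 49, 52, 51, 52, 63, 60, 53, 64,
        59, 54, 47, 49, 45, 64, 79, 65, 62, 60]
print("\n".join(stem_and_leaf(data)))
```

Keeping the empty `7|` row inside the data range preserves the gap before 79, which is what makes the possible outlier visible.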
Histogram Example (Excel)
[Figure: Excel histogram; horizontal axis "Bin" (511 to 530), vertical axis "Frequency" (0 to 25)]

Goals of Statistical Quality Improvement
• Find special causes
• Head off shifts in process
• Obtain predictable output
• Continually improve the process
[Figure: Statistical Quality Control and Improvement, "Improving Process Capability and Performance", plotted over time between LSL and USL. Labels: Identify Special Causes - Bad (Remove); Identify Special Causes - Good (Incorporate); Reduce Variability; Center the Process; Head Off Shifts in Location, Spread; Characterize Stable Process Capability; Continually Improve the System]

Distributions
• Distributions quantify the probability of an event
• Events near the mean are most likely to occur; events farther away are less likely to be observed
[Figure: distribution with specification 35.0 ± 2.5; axis marked 30.4 ($\mu - 3\sigma$), 32.6 ($\mu - 2\sigma$), 34.8 ($\mu - \sigma$), 37 ($\mu$), 39.2 ($\mu + \sigma$), 41.4 ($\mu + 2\sigma$), 43.6 ($\mu + 3\sigma$)]

Normal Distribution
[Figure: Normal Distribution with mean 0 and std. dev. 1; f(x) from 0 to 0.4 plotted against X from -4 to 4]
Notation:
• r.v. $x \sim N(\mu, \sigma)$
This is read: "x is normally distributed with mean $\mu$ and standard deviation $\sigma$."
Standard Normal Distribution:
• r.v. $z \sim N(0, 1)$ (z represents a Standard Normal r.v.)
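As a rough illustration of the $N(\mu, \sigma)$ notation, the sketch below evaluates the normal density and the area between $\mu - k\sigma$ and $\mu + k\sigma$. The helper names `normal_pdf` and `area_within` are assumptions made for this example. The values 37.0 and 2.2 are read off the axis of the Distributions figure. The area uses the standard identity that the central area within $k$ standard deviations equals $\operatorname{erf}(k/\sqrt{2})$.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density f(x) of a normal distribution with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_within(k):
    """Area under any normal curve between mu - k*sigma and mu + k*sigma."""
    return math.erf(k / math.sqrt(2.0))

mu, sigma = 37.0, 2.2  # read off the axis of the Distributions figure
for k in (1, 2, 3):
    lo, hi = mu - k * sigma, mu + k * sigma
    print(f"{lo:.1f} to {hi:.1f}: {area_within(k):.4f}")

# Peak height of the standard normal curve (the top of the f(x) axis).
print(f"f(0) for N(0, 1): {normal_pdf(0.0):.4f}")
```

The area depends only on $k$, not on $\mu$ or $\sigma$, which is why a single standard normal table serves every normal distribution.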
Simple Interpretation of Standard Deviation of Normal Distribution
$$P(\mu - \sigma \le x \le \mu + \sigma) = .6827$$
$$P(\mu - 2\sigma \le x \le \mu + 2\sigma) = .9545$$
$$P(\mu - 3\sigma \le x \le \mu + 3\sigma) = .9973$$

Standard Normal Distribution
• The Standard Normal Distribution has a mean ($\mu$) of 0 and a standard deviation ($\sigma$) of 1
• The total area under the curve, from $z = -\infty$ to $z = \infty$, is exactly 1; $\Phi(z)$ is the cumulative area from $-\infty$ up to $z$
• The curve is symmetric about the mean
• Half of the total area lies on either side, so: $\Phi(-z) = 1 - \Phi(z)$
[Figure: standard normal curve with the area $\Phi(z)$ shaded up to $z$]

Standard Normal Distribution
• How likely is it that we would observe a data point more than 2.57 standard deviations beyond the mean?
• The area under the curve from $-\infty$ to $z = 2.57$ is found by using the table on pp. 716-717: look up the cumulative area for $z = 2.57$, and then subtract the cumulative area from 1.
• Answer: 1 – .99492 = .00508, or about 5 times in 1000

What if the distribution isn't a Standard Normal Distribution?
If it is from any Normal Distribution, we can express the difference from an observation to the mean in units of the standard deviation, and this converts it to a Standard Normal Distribution.
• Conversion formula is:
$$z = \frac{x - \mu}{\sigma}$$
where: $x$ is the point in the interval, $\mu$ is the population mean, and $\sigma$ is the population standard deviation.

What if the distribution isn't even a Normal Distribution?
The Central Limit Theorem lets us approximate the sum of several means with the Normal Distribution, regardless of the distributions they come from, if the number of observations is large enough.
• Most assemblies are the result of adding together components, so if we take the sum of the means for each component as an estimate for the entire assembly, we meet the CLT criteria.
• If we take the mean of a sample from a distribution, we meet the CLT criteria (think of how the mean is computed).

Example: Process Yield
Specifications are often set irrespective of process distribution, but if we understand our process we can estimate yield / defects.
• Assume a specification calls for a value of 35.0 ± 2.5.
• Assume the process is Normally distributed, with a mean of 37.0 and a standard deviation of 2.20.
• Estimate the proportion of the process output that will meet specifications.

Continuous & Discrete Distributions
Continuous
• Probability of a range of outcomes is the area under the PDF (integration)
Discrete
• Probability of a range of outcomes is the sum of the probabilities of the discrete outcomes in that range
[Figure: a continuous distribution (specification 35.0 ± 2.5; axis marked 30.4 ($\mu - 3\sigma$) to 43.6 ($\mu + 3\sigma$)) beside a discrete distribution (specification 35.0 ± 2.5; axis marked 30 to 42)]

Discrete Distribution Example
Sum of two six-sided dice:
• Outcomes range from 2 to 12.
• Counting the possible ways to obtain each individual sum forms a histogram.
• What is the most frequently occurring sum that you could roll?
• Most likely outcome is a sum of 7 (there are 6 ways to obtain it)
• What is the probability of obtaining the most likely sum in a single roll of the dice?
• 6/36 = .167
• What is the probability of obtaining a sum greater than 2 and less than 11?
• 32/36 = .889

How do we know what the distribution is when all we have is a sample?
• Theory: "CLT applies to measurements taken consisting of many assemblies…"
• Experience: "past use of a distribution has generated very good results…"
• "Testing": combination of the above … in this case, anyway!
• If we know the generating function for a distribution, we can construct a grid (probability paper) that will allow us to observe a straight line when sufficient data from that distribution are plotted on the grid
• Easiest grid to create is the Standard Normal Distribution … because it is an easy transformation to "standard" parameters

Normal Probability Plots
• Take raw data and count observations ($n$)
• Set up a column of $j$ values (1 to $n$)
• Compute $\Phi(z_j)$ for each $j$ value: $\Phi(z_j) = (j - 0.5)/n$
• Get the $z_j$ value for each $\Phi(z_j)$ in the Standard Normal Table: find the table entry ($\Phi(z_j)$), then read the index value ($z_j$)
• Set up a column of sorted, observed data (sorted in increasing value)
• Plot $z_j$ values versus sorted data values
• Approximate with a sketched line at the 25% and 75% points

Interpreting Normal Plots
Assess Equal-Variance and Normality assumptions:
• Data from a Normal sample should tend to fall along the line, so if a "fat pencil" covers almost all of the points, then a normality assumption is supported
• The slope of the line reflects the variance of the sample, so equal slopes support the equal variance assumption
Theoretically:
• Sketched line should intercept the $z_j = 0$ axis at the mean value
Practically:
• Close is good enough for comparing means
• Closer is better for comparing variances
• If the slopes differ much for two samples, use a test that assumes the variances are not the same

Relationship with Hypothesis Tests
Assuming that our process is Normally Distributed and centered at the mean, how far apart should our specification limits be to obtain 99.5% yield?
• Proportion defective will be 1 – .995 = .005, and if the process is centered, half of those defectives will occur on the right tail (.0025), and half on the left tail.
• To get 1 – .0025 = 99.75% yield before the right tail requires the upper specification limit to be set at $\mu + 2.81\sigma$.
• By symmetry, the remaining .25% defective should occur at the left side, with the lower specification limit set at $\mu - 2.81\sigma$.
• If we specify our process in this manner and made a lot of parts, we would only produce bad parts .5% of the time.

Questions & Issues
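The table lookups used in this lecture can be cross-checked in software. The sketch below is one possible check, not the course's method (the slides use the printed Standard Normal table and Excel). It relies on Python's standard-library `statistics.NormalDist`, and the variable names are illustrative. It covers the tail area beyond $z = 2.57$, the yield for the 35.0 ± 2.5 specification with a process distributed as N(37.0, 2.20), and the symmetric limits for a 99.5% yield.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Tail area beyond z = 2.57: one minus the cumulative area.
tail = 1.0 - std_normal.cdf(2.57)

# Process yield: specification 35.0 +/- 2.5, process ~ N(37.0, 2.20).
# Each limit is standardized with z = (x - mu) / sigma.
mu, sigma = 37.0, 2.20
lsl, usl = 35.0 - 2.5, 35.0 + 2.5
z_lower = (lsl - mu) / sigma
z_upper = (usl - mu) / sigma
process_yield = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

# Symmetric limits (in units of sigma) for a centered process and a
# target yield of 99.5%: half of the defectives go in each tail.
target = 0.995
z_limit = std_normal.inv_cdf(1.0 - (1.0 - target) / 2.0)

print(f"tail beyond 2.57: {tail:.5f}")
print(f"yield within spec: {process_yield:.4f}")
print(f"spec limits: mu +/- {z_limit:.2f} sigma")
```

Because the process mean (37.0) sits near the upper specification limit (37.5), most of the nonconforming output in the yield example falls above the upper limit rather than below the lower one.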