IENG 486 - Lecture 05
Interpreting Variation Using Distributions
4/29/2017
IENG 486 Statistical Quality & Process Control
Assignment:
• Reading:
   - Chapter 1: (1.1, 1.3 – 1.4.5)
      - Cursory – get Fig. 1.12, p. 34; Deming Management, 1.4.4 Liability
   - Chapter 2: (2.2 – 2.7)
      - Cursory – Define, Measure, Analyze, Improve, Control
   - Chapter 3: (3.1, 3.3.1, 3.4.1)
• HW 1: Chapter 3 Exercises:
   - 1, 3, 4 – using exam calculator
   - 10 (use Normal Plots spreadsheet from Materials page)
   - 43, 46, 47 (use Exam Tables from Materials page – Normal Dist.)
Distributions
• Distributions quantify the probability of an event
• Events near the mean are most likely to occur; events farther away are less likely to be observed
[Figure: Normal curve with the specification 35.0 ± 2.5 marked; axis ticks at 30.4 (μ − 3σ), 32.6 (μ − 2σ), 34.8 (μ − σ), 37 (μ), 39.2 (μ + σ), 41.4 (μ + 2σ), 43.6 (μ + 3σ)]
Normal Distribution
[Figure: Normal Distribution density f(x) versus X for mean 0, std. dev. 1; X axis from -4 to 4, f(x) axis from 0 to 0.4]
• Notation: r.v. $x \sim N(\mu, \sigma)$
   - This is read: “x is normally distributed with mean μ and standard deviation σ.”
• Standard Normal Distribution: r.v. $z \sim N(\mu = 0, \sigma = 1)$
   - (z represents a Standard Normal r.v.)
Simple Interpretation of Standard Deviation of Normal Distribution
$$P(\mu - \sigma \le x \le \mu + \sigma) = .6827$$
$$P(\mu - 2\sigma \le x \le \mu + 2\sigma) = .9546$$
$$P(\mu - 3\sigma \le x \le \mu + 3\sigma) = .9973$$
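The three probabilities above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is an assumption made for illustration and is not part of the lecture. The computed values agree with the slide to within rounding in the fourth decimal place.

```python
from statistics import NormalDist

# Standard Normal: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def prob_within(k):
    """P(mu - k*sigma <= x <= mu + k*sigma) for any Normal distribution.

    After standardizing, the interval is always -k <= z <= k, so the
    result does not depend on the particular mu and sigma.
    """
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```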
Standard Normal Distribution
• The Standard Normal Distribution has a mean (μ) of 0 and a variance (σ²) of 1 (thus, standard deviation is also 1)
• Total area under the curve, Φ(z), from z = –∞ to z = ∞ is exactly 1
• The curve is symmetric about the mean
• Half of the total area lies on either side, so: $\Phi(-z) = 1 - \Phi(z)$
[Figure: Standard Normal curve with the area Φ(z) and the point z marked]
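The symmetry relation can be spot-checked for a few values of z. This is a small sketch that assumes Python's `statistics.NormalDist` as a stand-in for the printed table; the name `phi` is chosen here for illustration.

```python
from statistics import NormalDist

# phi(z) is the cumulative area under the Standard Normal curve up to z.
phi = NormalDist().cdf

for z in (0.5, 1.0, 2.57):
    left_tail = phi(-z)          # area below -z
    right_tail = 1.0 - phi(z)    # area above +z
    print(f"z = {z}: Phi(-z) = {left_tail:.5f}, 1 - Phi(z) = {right_tail:.5f}")
```

Because the two columns match, a table that lists only non-negative z values is enough to find areas for negative z as well.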
Standard Normal Distribution
• How likely is it that we would observe a data point more than 2.57 standard deviations above the mean?
• The area under the curve from –∞ to z = 2.57 is the cumulative area Φ(2.57), found in the table on pp. 693-694; subtracting that cumulative area from 1 gives the area beyond z = 2.57.
[Figure: Standard Normal curve with the area Φ(z) and the point z marked]
• Answer: 1 – .99492 = .00508, or about 5 times in 1000
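The table lookup and subtraction can be reproduced in a few lines. This sketch assumes Python's `statistics.NormalDist` in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

z = 2.57
cumulative = NormalDist().cdf(z)   # area from -infinity up to z (the table entry)
upper_tail = 1.0 - cumulative      # area beyond z

print(f"Phi({z}) = {cumulative:.5f}")
print(f"P(z > {z}) = {upper_tail:.5f}")
```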
What if the distribution isn’t a Standard Normal Distribution?
• If it is from any Normal Distribution, we can express the difference from a sample point to the population mean in units of the standard deviation, and this converts it to a Standard Normal Distribution.
• Conversion formula is:
$$z = \frac{x - \mu}{\sigma}$$
   where:
   - x is the sample location point,
   - μ is the population mean, and
   - σ is the population standard deviation.
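As a quick illustration of the conversion formula, here is a minimal sketch. The function name `standardize` and the numeric values are assumptions chosen for illustration only.

```python
def standardize(x, mu, sigma):
    """Convert a point x from N(mu, sigma) into Standard Normal units (z)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Illustrative values: a point at 41.4 from a process with mean 37.0 and
# standard deviation 2.2 lies two standard deviations above the mean.
z = standardize(41.4, mu=37.0, sigma=2.2)
print(f"z = {z:.2f}")
```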
What if the distribution isn’t even a Normal Distribution?
• The Central Limit Theorem allows us to take the sum of several means, regardless of their distribution, and approximate this sum using the Normal Distribution if the number of observations is large enough.
   - Most assemblies are the result of adding together components, so if we take the sum of the means for each component as an estimate for the entire assembly, we meet the CLT criteria.
   - If we take the mean of a sample from a distribution, we meet the CLT criteria (think of how the mean is computed).
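A small simulation can make the sample-mean case concrete. The sketch below is an illustration under stated assumptions, not part of the lecture: it draws from an exponential distribution (chosen because it is strongly skewed), uses a sample size of 30, and fixes an arbitrary seed so the run is repeatable.

```python
import random
import statistics


def sample_means(draw, n, trials, seed=486):
    """Return `trials` sample means, each computed from n independent draws."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]


def draw_exponential(rng):
    # Exponential with rate 1: mean 1, standard deviation 1, far from Normal.
    return rng.expovariate(1.0)


n = 30
means = sample_means(draw_exponential, n=n, trials=5000)

center = statistics.fmean(means)   # should be near the population mean, 1
spread = statistics.stdev(means)   # should be near 1 / sqrt(n)
within_one = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"std. dev. of sample means: {spread:.3f} (1/sqrt(n) = {n ** -0.5:.3f})")
print(f"fraction within one std. dev.: {within_one:.3f}")
```

If the sample means behave like a Normal distribution, the last figure should land close to the .6827 quoted for one standard deviation.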
Example: Process Yield
• Specifications are often set irrespective of process distribution, but if we understand our process we can estimate yield / defects.
   - Assume a specification calls for a value of 35.0 ± 2.5.
   - Assume the process output is Normally distributed, with a mean of 37.0 and a standard deviation of 2.20.
   - Estimate the proportion of the process output that will meet specifications.
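One way to work this example is to convert both specification limits to z values and take the difference of the cumulative areas. The sketch below assumes Python's `statistics.NormalDist` instead of the exam tables; the variable names (`lsl`, `usl`, `yield_fraction`) are chosen here for illustration. Under these assumptions the estimated yield comes out at roughly 57%.

```python
from statistics import NormalDist

# Specification: 35.0 +/- 2.5
target, tolerance = 35.0, 2.5
lsl, usl = target - tolerance, target + tolerance   # lower / upper spec limits

# Process: Normal with mean 37.0 and standard deviation 2.20
process = NormalDist(mu=37.0, sigma=2.20)

# Convert each limit to Standard Normal units: z = (x - mu) / sigma
z_lower = (lsl - process.mean) / process.stdev
z_upper = (usl - process.mean) / process.stdev

# Yield is the area between the two limits.
std_normal = NormalDist()
yield_fraction = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)
defect_fraction = 1.0 - yield_fraction

print(f"z_lower = {z_lower:.2f}, z_upper = {z_upper:.2f}")
print(f"estimated yield = {yield_fraction:.4f}, defects = {defect_fraction:.4f}")
```

Most of the defects fall above the upper limit, because the process mean (37.0) sits well above the target (35.0).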
Continuous & Discrete Distributions
• Continuous: Probability of a range of outcomes is the area under the PDF (integration)
• Discrete: Probability of a range of outcomes is the area under the PDF (sum discrete outcomes)
[Figures: a continuous Normal curve with axis marked from 30.4 (μ − 3σ) to 43.6 (μ + 3σ), and a discrete histogram with axis marked from 30 to 42; the specification 35.0 ± 2.5 is shown on both]
Discrete Distribution Example
• Sum of two six-sided dice:
   - Outcomes range from 2 to 12.
   - Counting the possible ways to obtain each individual sum forms a histogram
• What is the most frequently occurring sum that you could roll?
   - Most likely outcome is a sum of 7 (there are 6 ways to obtain it)
• What is the probability of obtaining the most likely sum in a single roll of the dice?
   - 6 ÷ 36 = .167
• What is the probability of obtaining a sum greater than 2 and less than 11?
   - 32 ÷ 36 = .889
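The counts in this example can be reproduced by enumerating all 36 equally likely rolls. This is a minimal sketch; the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count the ways to obtain each sum over all 36 (die 1, die 2) outcomes.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))
total = sum(ways.values())

# Text histogram: one '#' per way of rolling that sum.
for s in sorted(ways):
    print(f"{s:2d} {'#' * ways[s]}")

most_likely_sum, most_ways = ways.most_common(1)[0]
p_most_likely = Fraction(most_ways, total)

# Discrete case: the probability of a range is a sum over outcomes in the range.
p_between = Fraction(sum(c for s, c in ways.items() if 2 < s < 11), total)

print(f"most likely sum: {most_likely_sum} ({most_ways} ways)")
print(f"P(most likely sum) = {float(p_most_likely):.3f}")
print(f"P(2 < sum < 11) = {float(p_between):.3f}")
```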
How do we know what the distribution is when all we have is a sample?
• Theory – “CLT applies to measurements taken consisting of many assemblies…”
• Experience – “past use of a distribution has generated very good results…”
• “Testing” – combination of the above … in this case, anyway!
   - If we know the generating function for a distribution, we can construct a grid (probability paper) that will allow us to observe a straight line when sufficient data from that distribution are plotted on the grid
   - Easiest grid to create is the Standard Normal Distribution … because it is an easy transformation to “standard” parameters
Normal Probability Plots
• Take raw data and count observations (n)
• Set up a column of j values (1 to n)
• Compute F(zj) for each j value
   - F(zj) = (j - 0.5)/n
• Get zj value for each F(zj) in Standard Normal Table
   - Find table entry (F(zj)), then read index value (zj)
• Set up a column of sorted, observed data
   - Sorted in increasing value
• Plot zj values versus sorted data values
• Approximate with sketched line through the 25% and 75% points
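The steps above can be sketched in code. This is an illustration only: it uses `statistics.NormalDist().inv_cdf` in place of the reverse table lookup, the function name `normal_plot_points` is hypothetical, and the five data values are made up.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Return (j, F(z_j), z_j, sorted value) rows for a normal probability plot."""
    n = len(data)
    if n == 0:
        raise ValueError("need at least one observation")
    std_normal = NormalDist()
    rows = []
    for j, x in enumerate(sorted(data), start=1):
        f_j = (j - 0.5) / n              # cumulative fraction for rank j
        z_j = std_normal.inv_cdf(f_j)    # z value whose cumulative area is F(z_j)
        rows.append((j, f_j, z_j, x))
    return rows


# Hypothetical observations, for illustration only.
sample = [36.1, 39.4, 37.0, 34.8, 38.2]
for j, f_j, z_j, x in normal_plot_points(sample):
    print(f"j = {j}  F(z_j) = {f_j:.2f}  z_j = {z_j:+.3f}  x = {x}")
```

Plotting the z_j column against the sorted data column gives the points of the normal probability plot.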
Interpreting Normal Plots
• Assess Equal-Variance and Normality assumptions
   - Data from a Normal sample should tend to fall along the line, so if a “fat pencil” covers almost all of the points, then a normality assumption is supported
   - The slope of the line reflects the variance of the sample, so equal slopes support the equal variance assumption
• Theoretically:
   - Sketched line should intercept the zj = 0 axis at the mean value
• Practically:
   - Close is good enough for comparing means
   - Closer is better for comparing variances
   - If the slopes differ much for two samples, use a test that assumes the variances are not the same
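To see how the sketched line relates to the mean and the spread, the line through the 25% and 75% points can be computed directly. The sketch below rests on several assumptions that are not from the lecture: the data value is treated as a function of z (so the slope is in data units and approximates the standard deviation, rather than its reciprocal), quartiles come from `statistics.quantiles` with its default method, and the sample is simulated with an arbitrary seed from a Normal distribution with mean 37.0 and standard deviation 2.2.

```python
from statistics import NormalDist, quantiles


def quartile_line(data):
    """Line x = intercept + slope * z through the 25% and 75% points."""
    q1, _, q3 = quantiles(data, n=4)      # sample quartiles
    std_normal = NormalDist()
    z25 = std_normal.inv_cdf(0.25)        # z at the 25% point
    z75 = std_normal.inv_cdf(0.75)        # z at the 75% point
    slope = (q3 - q1) / (z75 - z25)       # data units per unit of z
    intercept = q1 - slope * z25          # value of the line at z = 0
    return intercept, slope


# Simulated sample, for illustration only.
sample = NormalDist(mu=37.0, sigma=2.2).samples(2000, seed=486)
intercept, slope = quartile_line(sample)
print(f"value at z = 0: {intercept:.2f} (compare with the mean)")
print(f"slope: {slope:.2f} (compare with the standard deviation)")
```

Running the same function on two samples gives two slopes to compare when judging the equal variance assumption.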
Normal Probability Plots
Tools for constructing Normal Probability Plots:
• (Normal) Probability Paper
   - In-class handout
• Normal Plots Template
   - Materials Page on course website
• Interpretation
   - Fat Pencil Test