2/20/2014
Statistics
ƒ The collection, tabulation, analysis, interpretation, and presentation of numerical data
ƒ Deductive Statistics (descriptive statistics)
à Describe a population or a complete group of data
à Each entity in the population must be studied
ƒ Inductive Statistics
à Deals with a limited amount of data or a representative sample of the population
à Used for samples to predict the population
Basic Concepts of Variation
ƒ Variation defined = Change over time
ƒ Everything varies – no two things are exactly the same
ƒ Variation can be measured (quantified)
Statistical Thinking
A PHILOSOPHY of learning and action based on the following fundamental principles:
ƒ Work needs to be viewed as a process that can be studied and improved
ƒ All work occurs in a system of interconnected processes
ƒ Variation exists in all processes
ƒ Understanding and reducing variation are keys to success
DATA
ƒ Data Sources = computer or manual reports, log books, special studies, vendor data, memos and notes, people's memory
ƒ VARIABLE = Continuous Scale, Data that is measured
ƒ ATTRIBUTE = Go/No‐Go, Data that is counted, Characteristic with limited choices/aspects
Basic Concepts of Variation (continued)
ƒ Variation has a pattern
ƒ Groups of like measurements tend to cluster around a middle value
ƒ The shape of a distribution curve can be determined for any process
ƒ Variations from assignable causes tend to distort the normal distribution curve
Common Terms
ƒ Tolerance – range of acceptance
ƒ Specifications – given values by a customer
ƒ Population – all of a group, collection of all possible elements, values, or items associated with a situation
ƒ Sample – part of a group, subset of elements or measurements taken from a population, Must be randomized to represent the population
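The relationship between a population and a randomized sample can be sketched in a few lines of Python. This is only an illustration: the population of 1,000 values and the helper name `estimate_mean` are hypothetical, not taken from the slides.

```python
import random
from statistics import mean


def estimate_mean(population, sample_size, seed=None):
    """Draw a simple random sample (without replacement) and return it
    together with its mean, used as an estimate of the population mean."""
    rng = random.Random(seed)            # seeded so the sketch is repeatable
    sample = rng.sample(population, sample_size)
    return sample, mean(sample)


# Hypothetical population: the whole numbers 1..1000
population = list(range(1, 1001))
sample, estimate = estimate_mean(population, 50, seed=1)

print("population mean:", mean(population))
print("sample estimate:", estimate)
```

Because every element has the same chance of being picked, the sample mean tends to land near the population mean, though any single sample will differ somewhat.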
Quality Tool #6: CONTROL CHARTS
ƒ Has average and upper & lower control limits, tolerances (range of acceptance)
ƒ Used to determine:
à Process centering, averages and ranges
à Variability of process
à Capability of a process
à Control of a process
à Non-normal patterns or trends
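As a rough illustration of the centerline and control limits, the sketch below places the limits three standard deviations either side of the average of some baseline data. This is a simplifying assumption: the slides do not say how the limits are computed, and textbook charts often estimate the spread from subgroup ranges instead. The data and the function names are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (LCL, centerline, UCL) as mean -/+ k sample standard deviations.

    Assumption: k = 3 and the plain sample standard deviation are used
    here for simplicity only.
    """
    center = mean(baseline)
    sigma = stdev(baseline)
    return center - k * sigma, center, center + k * sigma


def out_of_control(points, lcl, ucl):
    """Indexes of points that fall outside the control limits."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]


baseline = [9, 10, 11, 10, 9, 11, 10, 10]   # hypothetical in-control data
lcl, center, ucl = control_limits(baseline)
flagged = out_of_control([10, 13, 9], lcl, ucl)  # hypothetical new points
print(lcl, center, ucl, flagged)
```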
Control Chart showing centerline, UCL, LCL
Run Chart
ƒ Data over time
Donna C.S. Summers
Quality, 3e
Copyright ©2003 by Pearson Education, Inc.
Upper Saddle River, New Jersey 07458
All rights reserved.
Control Chart of Inner/Outer Diameter Concentricity
Quality Tool #7: HISTOGRAM
ƒ Pictorial representation of a set of data using a bar graph which divides the measurements into cells
ƒ Indicates status, not the cause of a problem
ƒ Process:
à Determine/Interpret the shape of the data set (show frequency distributions and range)
à Determine dispersion & central tendency
à Compare to specifications
Percentage of Measurements Falling Within Each Standard Deviation
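For a normal distribution, the percentages referred to in this figure title can be computed directly. The sketch below uses the standard identity that the fraction of a normal population within k standard deviations of the mean is erf(k / sqrt(2)); the function name is hypothetical.

```python
import math


def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Percentages within 1, 2, and 3 standard deviations, rounded to 2 places
percentages = {k: round(100 * fraction_within(k), 2) for k in (1, 2, 3)}
print(percentages)  # {1: 68.27, 2: 95.45, 3: 99.73}
```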
HISTOGRAM TYPES
GENERAL
LEFT SKEW
RIGHT PRECIPICE
COMB
PLATEAU
TWIN PEAK
ISOLATED PEAK
CENTRAL TENDENCY
ƒ “Middle” value of a distribution (determined from the mean, median, or mode)
ƒ Typical value representing a population
ƒ The “mean” alone is not enough to go on
[Figure: Different Distributions with Same Averages and Ranges]
Construction - Clutch Plate Thickness
[Figure: Frequency Distribution for Clutch Plate Thickness]
The number of cells affects what we see and don't see. Here, using too few cells hides a potential problem.
HISTOGRAM Construction
à Find the largest and smallest values, subtract to calculate range
à Select the # cells (divisions)
à Determine the width of each cell
‚ Divide range by # cells
‚ Round off to convenient (odd) number
à Compute cell boundaries
‚ Use smallest value of set as midpoint of first cell
‚ Subtract and add half of cell width to midpoint for first cell boundary
‚ Add cell width to each upper boundary until value is greater than the largest value of set
à Use tick marks to assign each measurement to its cell
à Count the tick marks to complete frequency chart
à Construct the graph
‚ Vertical axis is frequency, Horizontal axis shows cell boundary
‚ Draw bars
à Overlay specification limits
à Interpret capability and shape
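The construction steps above can be sketched in Python. This is a minimal illustration under stated assumptions: the eight measurements are hypothetical, the function names `build_cells` and `tally` are made up, and the cell width is passed in because rounding to a "convenient (odd) number" is a judgment call. `Decimal` is used so that cell boundaries are exact.

```python
from decimal import Decimal


def build_cells(values, width):
    """Cell boundaries per the listed steps: the first cell is centered on the
    smallest value, then the width is added until the upper boundary exceeds
    the largest value."""
    smallest, largest = min(values), max(values)
    lower = smallest - width / 2
    cells = []
    while True:
        upper = lower + width
        cells.append((lower, upper))
        if upper > largest:
            return cells
        lower = upper


def tally(values, cells):
    """Count how many measurements fall in each cell (lower <= value < upper)."""
    counts = [0] * len(cells)
    for v in values:
        for i, (lo, hi) in enumerate(cells):
            if lo <= v < hi:
                counts[i] += 1
                break
    return counts


# Hypothetical measurements, recorded to one decimal place
data = [Decimal(s) for s in
        ("10.1", "10.4", "10.2", "10.9", "10.6", "10.3", "10.5", "10.4")]

n_cells = 4
raw_width = (max(data) - min(data)) / n_cells   # 0.8 / 4 = 0.2
width = Decimal("0.3")                          # rounded up to an odd last digit

cells = build_cells(data, width)
counts = tally(data, cells)
for (lo, hi), c in zip(cells, counts):
    print(f"{lo} to {hi}: {'/' * c} ({c})")
```

With an odd width, half the width carries one more decimal place than the data, so no measurement can sit exactly on a boundary and each value fits into exactly one cell.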
Construction - Number of Cells
à NOTE: Different rules for this – affects what we see in histograms (no single value can fit into 2 cells)
# Data Points    # Classes K
Under 50         5-7
50 – 100         6-10
100 – 250        7-12
Over 250         10-20
One rule: K = square root of n
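Both rules for choosing the number of cells are easy to express in code. In the sketch below the function names are hypothetical, and because the table's bands overlap at 100 and 250, assigning those counts to the lower band is an assumption.

```python
import math


def cells_by_sqrt(n):
    """Square-root rule: number of cells is roughly the square root of n."""
    return round(math.sqrt(n))


def cells_by_table(n):
    """Suggested (min, max) number of classes K from the table above.

    Assumption: n = 100 and n = 250 are placed in the lower band.
    """
    if n < 50:
        return (5, 7)
    if n <= 100:
        return (6, 10)
    if n <= 250:
        return (7, 12)
    return (10, 20)


print(cells_by_sqrt(100))    # 10
print(cells_by_table(100))   # (6, 10)
```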
Example data (measurements)
1.005  0.995  1.002  1.006  1.000  0.984  0.987  0.992
0.994  1.002  0.994  0.994  0.988  0.990  1.006  0.998
1.002  1.015  0.991  1.007  1.006  0.997  0.987  1.002
0.993  1.002  1.008  1.006  0.988  1.007  0.995  0.987
0.991  1.004  1.008  0.989  1.004  1.001  0.993  0.990
0.982  0.992  0.996  1.003  1.001  0.995  1.002  0.997
0.992  0.999  1.002  0.992  0.984  1.010  1.004  0.990
1.003  1.000  0.996  1.010  1.021  0.992  0.985  0.984
0.984  0.995  0.992  1.019  1.005  0.997  0.987  0.990
1.002  1.016  1.008  0.989  1.014  0.986  1.012
1.002  0.992  0.985  0.985  1.000  1.007  1.001  1.005
0.995  0.997  1.013  1.012  1.002  1.000  0.997  0.990
HISTOGRAM Construction
Range
L (largest) = 1.021
S (smallest) = 0.982
Range = L − S = 1.021 − 0.982 = 0.039
Cell width
Range / No. Cells = 0.039 / 10 = 0.0039
Round to .004
HISTOGRAM Construction
Calculate First Cell Boundaries
Midpoint = Smallest − Half of last place = 0.982 − 0.0005 = 0.9815 (1 decimal place beyond!)
Lower boundary = Midpoint − 1/2 Cell width = 0.9815 − 0.0020 = 0.9795
Upper boundary = Midpoint + 1/2 Cell width = 0.9815 + 0.0020 = 0.9835
Why not use half of .004???
CELL #   CELL START   MID POINT   CELL END   FREQUENCY   TALLY
1        0.9795       0.9815      0.9835     8           ////\ ///
2        0.9835       0.9855      0.9875     9           ////\ ////
3        0.9875       0.9895      0.9915     17          ////\ ////\ ////\ //
4        0.9915       0.9935      0.9955     16          ////\ ////\ ////\ /
5        0.9955       0.9975      0.9995     9           ////\ ////
6        0.9995       1.0015      1.0035     19          ////\ ////\ ////\ ////
7        1.0035       1.0055      1.0075     11          ////\ ////\ /
8        1.0075       1.0095      1.0115     6           ////\ /
9        1.0115       1.0135      1.0155     3           ///
10       1.0155       1.0175      1.0195     2           //
[Figure: histogram of the frequencies above, with specification limits LSL = .986 and USL = 1.012]
STATISTICAL MEASURES
ƒ Central Tendency
à Mean ‐ average of a set of values
‚ (sum of values divided by the # of values)
‚ Statistical formula: $\bar{X} = \frac{\sum X_i}{n}$
‚ Simplified formula: $\bar{X} = \frac{11 + 12 + 13 + 13 + 14 + 15}{6} = \frac{78}{6} = 13$
à Median ‐ The middle number in a set of values, e.g. 12 in 10, 11, 12, 13, 14
à Mode ‐ The most often occurring value in a set of values, e.g. 4 in 1, 2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 9
ƒ Dispersion
à Range ‐ The largest value in a sample minus the smallest: $R = X_{large} - X_{small}$
à Variance ‐ The sum of the squared differences between each value and the average, divided by the degrees of freedom (number of values or number of values minus 1): $S^2 = \frac{\sum (\bar{X} - X_i)^2}{n}$ or $S^2 = \frac{\sum (\bar{X} - X_i)^2}{n - 1}$
à Standard Deviation ‐ The square root of the variance: $S = \sqrt{\frac{\sum (\bar{X} - X_i)^2}{n}}$ or $S = \sqrt{\frac{\sum (\bar{X} - X_i)^2}{n - 1}}$
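The three measures of central tendency can be checked against the slide examples with Python's standard `statistics` module. The variable names are illustrative; `multimode` returns a list because a data set can have more than one mode.

```python
from statistics import mean, median, multimode

# Mean: sum of values divided by the number of values
mean_example = mean([11, 12, 13, 13, 14, 15])       # 78 / 6

# Median: the middle number of the ordered values
median_example = median([10, 11, 12, 13, 14])

# Mode(s): the most often occurring value(s)
mode_example = multimode([1, 2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 9])

print(mean_example, median_example, mode_example)   # 13 12 [4]
```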
Normal Curve
[Figure: Normal Curve, labeled with mean, median, and mode]
Skewness
[Figure: Skewness, labeled with mean, median, and mode]
Seven Numbers: 36, 35, 39, 40, 35, 38, 41
Find: Range, Average, Standard Deviation
RANGE
$R = X_{large} - X_{small}$
$R = 41 - 35$
$R = 6$
AVERAGE
$\bar{X} = \frac{\sum X_i}{n}$
$\bar{X} = \frac{36 + 35 + 39 + 40 + 35 + 38 + 41}{7}$
$\bar{X} = \frac{264}{7}$
$\bar{X} = 37.7$
STANDARD DEVIATION
$S = \sqrt{\frac{\sum (\bar{X} - X_i)^2}{n - 1}}$
$S = \sqrt{\frac{(37.7 - 36)^2 + (37.7 - 35)^2 + (37.7 - 39)^2 + (37.7 - 40)^2 + (37.7 - 35)^2 + (37.7 - 38)^2 + (37.7 - 41)^2}{7 - 1}}$
$S = 2.43$
Manual Calculation of Standard Deviation
Three numbers: 7, 10, 13
Find: AVERAGE, STANDARD DEVIATION
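The seven-number example can be reproduced step by step. The sketch below is one way to code the standard deviation formula, with a hypothetical `sample` flag to switch between the n − 1 and n divisors; it also applies the same function to the three numbers 7, 10, 13.

```python
import math


def std_dev(values, sample=True):
    """Square root of the summed squared differences from the average,
    divided by n - 1 (sample=True) or by n (sample=False)."""
    n = len(values)
    x_bar = sum(values) / n
    squared_diffs = sum((x_bar - x) ** 2 for x in values)
    return math.sqrt(squared_diffs / (n - 1 if sample else n))


seven = [36, 35, 39, 40, 35, 38, 41]
value_range = max(seven) - min(seven)     # 41 - 35
average = sum(seven) / len(seven)         # 264 / 7
s = std_dev(seven)

print(value_range, round(average, 1), round(s, 2))   # 6 37.7 2.43

three = [7, 10, 13]
print(sum(three) / len(three), std_dev(three))       # 10.0 3.0
```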
TERMS
à Data
‚ Variable ‐ Quality characteristics that can be measured
‚ Attribute ‐ Quality characteristics that are observed to be either present or absent, conforming or nonconforming
‚ Relative ‐ Quality characteristics which are assigned a value which cannot be actually measured
à Accuracy
‚ How far from the actual or real value the measurement is
‚ The location of X or X bar
à Precision
‚ The ability to repeat a series of measurements and get the same value each time
‚ Repeatability
‚ The variability of measurements
à Measurement Error
‚ The difference between a value measured and the true value
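Accuracy, precision, and measurement error can be told apart numerically. In the sketch below the true value and the four repeated readings are hypothetical: accuracy is summarized as the distance of the average reading from the true value, and precision as the standard deviation of the repeated readings.

```python
from statistics import mean, stdev

true_value = 10.0                       # hypothetical known reference value
readings = [10.2, 10.1, 10.3, 10.2]     # hypothetical repeated measurements

# Measurement error: difference between each measured value and the true value
errors = [x - true_value for x in readings]

# Accuracy: how far the average reading (X bar) sits from the true value
bias = mean(readings) - true_value

# Precision: variability of the repeated measurements
spread = stdev(readings)

print(round(bias, 3), round(spread, 4))
```

Here the readings agree closely with one another (good precision) but sit consistently above the reference value (poorer accuracy).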