NDGTA
Higher National Certificate
in Engineering
Unit 36 – LO1.3
Learning Outcome 1.3
• LO 1.3
– Use selected data to construct frequency
distribution and calculate mean, range and
standard deviation
Data Collection
• So there are two basic types of data –
– Variable type data - continuous
– Attribute type data - binary
Recording the Data Collected
• Data should be recorded in such a way that it
is easy to use.
• Calculations of totals, averages and ranges
are often necessary and the format used for
recording the data can make things easier
Recording the Data Collected
Handout 2: Percentage Impurity, by time of day (rows) and date (columns)
Operator:
Time         15th   16th   17th   18th   19th   Week Total   Week Average
8 a.m.       0.26   0.24   0.28   0.30   0.26   1.34         0.27
10 a.m.      0.31   0.33   0.33   0.30   0.31   1.58         0.32
12 noon      0.33   0.33   0.34   0.31   0.31   1.62         0.32
2 p.m.       0.32   0.34   0.36   0.32   0.32   1.66         0.33
4 p.m.       0.28   0.24   0.26   0.28   0.27   1.33         0.27
6 p.m.       0.27   0.25   0.24   0.28   0.26   1.30         0.26
Day Total    1.77   1.73   1.81   1.79   1.73   8.83
Day Average  0.30   0.29   0.30   0.30   0.29                0.29
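The totals and averages on a recording sheet like Handout 2 can be checked with a few lines of code. The sketch below is illustrative only: it assumes the readings are laid out as one row per sampling time and one value per date, and it uses Python's decimal module with half-up rounding (an assumption about how the handout figures were rounded).

```python
from decimal import Decimal, ROUND_HALF_UP

# Percentage impurity readings from Handout 2: one row per sampling time,
# one value per date (15th to 19th). Decimal avoids binary rounding surprises.
dates = ["15th", "16th", "17th", "18th", "19th"]
readings = {
    "8 a.m.":  "0.26 0.24 0.28 0.30 0.26",
    "10 a.m.": "0.31 0.33 0.33 0.30 0.31",
    "12 noon": "0.33 0.33 0.34 0.31 0.31",
    "2 p.m.":  "0.32 0.34 0.36 0.32 0.32",
    "4 p.m.":  "0.28 0.24 0.26 0.28 0.27",
    "6 p.m.":  "0.27 0.25 0.24 0.28 0.26",
}
table = {time: [Decimal(v) for v in row.split()] for time, row in readings.items()}


def average(values):
    """Mean rounded half-up to two decimal places (assumed rounding rule)."""
    mean = sum(values) / len(values)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Row-wise: week total and week average for each sampling time.
week_totals = {time: sum(row) for time, row in table.items()}
week_averages = {time: average(row) for time, row in table.items()}

# Column-wise: day total and day average for each date.
columns = list(zip(*table.values()))
day_totals = {date: sum(col) for date, col in zip(dates, columns)}
day_averages = {date: average(col) for date, col in zip(dates, columns)}

grand_total = sum(week_totals.values())
grand_average = average([v for row in table.values() for v in row])

for time in table:
    print(time, week_totals[time], week_averages[time])
print("Day totals:", [str(day_totals[d]) for d in dates], "overall", grand_total)
print("Day averages:", [str(day_averages[d]) for d in dates], "overall", grand_average)
```

Laying the sheet out so that rows and columns can be summed directly is what makes these calculations easy, which is the point the slide makes about the recording format.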
Recording the Data Collected
• Careful design of data collection will
facilitate easier and more meaningful
analysis.
– Step 1: agree on the exact event to be observed –
ensure that everyone involved is monitoring the
same thing
– Step 2: decide both how often the events will be
observed (the frequency) and over what time
period (the duration)
Recording the Data Collected
– Step 3: design a draft format – keep it simple
and leave adequate space for the entry of the
observations to be made.
– Step 4: tell the observers how to use the format
and put it into trial use – be careful to note their
initial observations, let them know that it will be
reviewed after a period of use, and make sure that
they accept that there is adequate time for them to
record the information required.
Recording the Data Collected
– Step 5: make sure that the observers record the
actual observations and not a ‘tick’ to show that
they made an observation
– Step 6: review the format with the observers to
discuss how easy / difficult it has proved to be in
use and also how the data has been of value after
analysis.
– Step 7: put the recording system into practice.
Recording the Data Collected
• Presenting the recorded data in a format that
is understandable is important if potential
information is to be used to the full.
• Refer to your table of recorded lengths of
wire – note it's just a table of figures –
presenting data in this format is not very
helpful!
• Now refer to Handout 3
Recording the Data Collected
Handout 3
Dimensions of pistons (mm) - raw data
56.1  56.0  55.7  55.4  55.5  55.9  55.7  55.4
55.1  55.8  55.3  55.4  55.5  55.5  55.2  55.8
55.6  55.7  55.1  56.2  55.6  55.7  55.3  55.5
55.0  55.6  55.4  55.9  55.2  56.0  55.7  55.6
55.9  55.8  55.6  55.4  56.1  55.7  55.8  55.3
55.6  56.0  55.8  55.7  55.5  56.0  55.3  55.7
55.9  55.4  55.9  55.5  55.8  55.5  55.6  55.2
What’s the biggest value?
What’s the smallest value?
What’s the average?
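The three questions on this slide are hard to answer by eye but trivial once the raw figures are held as a list. A minimal sketch, assuming the 56 readings are typed in exactly as they appear on Handout 3:

```python
# Piston dimensions (mm) from Handout 3, in the order recorded.
raw = """
56.1 56.0 55.7 55.4 55.5 55.9 55.7 55.4
55.1 55.8 55.3 55.4 55.5 55.5 55.2 55.8
55.6 55.7 55.1 56.2 55.6 55.7 55.3 55.5
55.0 55.6 55.4 55.9 55.2 56.0 55.7 55.6
55.9 55.8 55.6 55.4 56.1 55.7 55.8 55.3
55.6 56.0 55.8 55.7 55.5 56.0 55.3 55.7
55.9 55.4 55.9 55.5 55.8 55.5 55.6 55.2
"""
diameters = [float(value) for value in raw.split()]

largest = max(diameters)
smallest = min(diameters)
average = sum(diameters) / len(diameters)

print(f"n = {len(diameters)}")
print(f"biggest = {largest} mm, smallest = {smallest} mm")
print(f"average = {average:.2f} mm")
```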
Recording the Data Collected
Arranging each column from lowest to highest helps identify the lowest and highest values
55.0  55.4  55.1  55.4  55.2  55.5  55.2  55.2
55.1  55.6  55.3  55.4  55.5  55.5  55.3  55.3
55.6  55.7  55.4  55.4  55.5  55.7  55.3  55.4
55.6  55.8  55.6  55.5  55.5  55.7  55.6  55.5
55.9  55.8  55.7  55.7  55.6  55.9  55.7  55.6
55.9  56.0  55.8  55.9  55.8  56.0  55.7  55.7
56.1  56.0  55.9  56.2  56.1  56.0  55.8  55.8
Dimensions of pistons (mm) - raw data
Recording the Data Collected
Arranging in ascending order immediately reveals the highest and lowest values
55.0  55.1  55.1  55.2  55.2  55.2  55.3  55.3
55.3  55.3  55.4  55.4  55.4  55.4  55.4  55.4
55.5  55.5  55.5  55.5  55.5  55.5  55.5  55.6
55.6  55.6  55.6  55.6  55.6  55.6  55.7  55.7
55.7  55.7  55.7  55.7  55.7  55.7  55.8  55.8
55.8  55.8  55.8  55.8  55.9  55.9  55.9  55.9
55.9  56.0  56.0  56.0  56.0  56.1  56.1  56.2
Dimensions of pistons (mm) - raw data
Recording the Data Collected
Tally chart and frequency distribution of diameters of pistons (mm)
Diameter   Tally        Frequency
55.0       |            1
55.1       ||           2
55.2       |||          3
55.3       ||||         4
55.4       ||||/ |      6
55.5       ||||/ ||     7
55.6       ||||/ ||     7
55.7       ||||/ |||    8
55.8       ||||/ |      6
55.9       ||||/        5
56.0       ||||         4
56.1       ||           2
56.2       |            1
Total                   56
(||||/ denotes a five-bar gate: four strokes crossed by a fifth.)
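A tally chart is simply a count of how often each value occurs, so it can be reproduced with a counter. The sketch below assumes the Handout 3 readings as input and keeps them as text so that equal readings compare exactly; the plain `|` strokes stand in for hand-drawn tally marks.

```python
from collections import Counter

# Piston dimensions (mm) from Handout 3, in the order recorded.
raw = """
56.1 56.0 55.7 55.4 55.5 55.9 55.7 55.4
55.1 55.8 55.3 55.4 55.5 55.5 55.2 55.8
55.6 55.7 55.1 56.2 55.6 55.7 55.3 55.5
55.0 55.6 55.4 55.9 55.2 56.0 55.7 55.6
55.9 55.8 55.6 55.4 56.1 55.7 55.8 55.3
55.6 56.0 55.8 55.7 55.5 56.0 55.3 55.7
55.9 55.4 55.9 55.5 55.8 55.5 55.6 55.2
"""
readings = raw.split()          # strings such as "55.7"
frequency = Counter(readings)   # value -> number of occurrences

print("Diameter  Tally     Frequency")
for value in sorted(frequency, key=float):
    count = frequency[value]
    print(f"{value:<9} {'|' * count:<9} {count}")
print(f"Total {sum(frequency.values())}")
```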
Recording the Data Collected
[Histogram: Piston Diameter. Horizontal axis: piston diameter; vertical axis: frequency, scaled 0 to 9]
Recording the Data Collected
• When there are a large number of observations, it is
often more useful to present data in the format of a
grouped frequency distribution.
• Guidelines:
– Make the class intervals of equal width (if possible)
– Choose the class boundaries so that they lie between
possible observations
– Determine the approximate number of class intervals
using Sturges' rule: $K = 1 + 3.3\log_{10} N$, where
N = number of observations
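As a worked illustration of these guidelines, the sketch below applies the rule K = 1 + 3.3 log10 N to the 56 piston diameters from the tally chart and then groups them. The class width of 0.2 mm and the first boundary of 54.95 mm are assumptions chosen for this example (so that boundaries fall halfway between possible readings); they are not taken from the handouts.

```python
import math


def sturges_classes(n):
    """Approximate number of class intervals, K = 1 + 3.3 log10(N)."""
    return 1 + 3.3 * math.log10(n)


# Frequencies from the piston tally chart, keyed by diameter in tenths of a mm
# (550 means 55.0 mm). Integers avoid floating-point edge cases at boundaries.
frequency = {550: 1, 551: 2, 552: 3, 553: 4, 554: 6, 555: 7, 556: 7,
             557: 8, 558: 6, 559: 5, 560: 4, 561: 2, 562: 1}

n = sum(frequency.values())
k = round(sturges_classes(n))   # about 6.77, so use 7 classes

# Assumed for this example: equal classes 0.2 mm wide starting at 54.95 mm.
width = 2        # tenths of a mm
lower = 549.5    # tenths of a mm, halfway between 54.9 and 55.0

grouped = [0] * k
for value, count in frequency.items():
    index = int((value - lower) // width)
    grouped[index] += count

for i, count in enumerate(grouped):
    start = (lower + i * width) / 10
    end = (lower + (i + 1) * width) / 10
    print(f"{start:.2f} to {end:.2f} mm: {count}")
```

Because no reading can equal a boundary such as 55.15 mm, every observation falls unambiguously into one class.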
Recording the Data Collected
• Handout 4
• Given the following raw data, determine the
number of class intervals using Sturges'
rule.
• Develop a tally chart
• Determine the class boundaries
• Using graph paper, develop a histogram to
represent the data.
Interpreting the Data Collected
• Organising information into classes and
representing it in the form of a histogram is
useful, but more information can be gleaned
by employing other parameters to describe
the distribution.
• These parameters are collectively known as
measures of central tendency and
dispersion.
Central Tendency – Mean
• The mean (average) is a single number that is
often used to describe the whole of the
represented data.
• It is a measure of ‘central tendency’ of the
distribution and is represented by $\bar{x}$:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (for discrete data)
or
$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$ (for grouped data, with $x_i$ being the class mid point)
Central Tendency – Mean
• Take the values 144cm, 146cm, 154cm and
146cm (i.e. discrete data), then
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = (144 + 146 + 154 + 146)/4 = 590/4 = 147.5$ cm
• Now determine the mean for the population given in
Handout 4
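Both forms of the mean can be sketched in a few lines. The first call repeats the worked example above. The second applies the frequency-weighted formula to the piston tally chart, treating each recorded diameter as its own class mid point, which is a simplification made for this illustration.

```python
def mean(values):
    """Mean of discrete data: sum of the values divided by their number."""
    return sum(values) / len(values)


def grouped_mean(mid_points, frequencies):
    """Mean of grouped data: sum of f_i * x_i divided by sum of f_i."""
    weighted_total = sum(f * x for f, x in zip(frequencies, mid_points))
    return weighted_total / sum(frequencies)


heights = [144, 146, 154, 146]
discrete_result = mean(heights)

# Piston diameters (mm) and their frequencies from the tally chart.
diameters = [55.0, 55.1, 55.2, 55.3, 55.4, 55.5, 55.6,
             55.7, 55.8, 55.9, 56.0, 56.1, 56.2]
counts = [1, 2, 3, 4, 6, 7, 7, 8, 6, 5, 4, 2, 1]
grouped_result = grouped_mean(diameters, counts)

print(f"discrete mean = {discrete_result} cm")
print(f"grouped mean  = {grouped_result:.2f} mm")
```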
Central Tendency – Mode
• The mode is the most commonly occurring value.
• Thus for the discrete values 144cm, 146cm, 154cm
and 146cm, the value 146cm occurs twice. This
then is the modal value.
– Note 1: it is possible for a series of numbers to
have more than one mode.
– Note 2: for grouped data the modal value can be
found graphically from the class having the
highest frequency.
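The mode and the two notes above can be illustrated briefly. This sketch uses `statistics.multimode` from the Python standard library (available from Python 3.8); the second list of values is hypothetical and is included only to show a series with two modes.

```python
from statistics import multimode

# Worked example from the slide: 146 occurs twice.
modes = multimode([144, 146, 154, 146])

# Hypothetical series with two modes (Note 1).
two_modes = multimode([19, 21, 21, 23, 23])

# For a frequency distribution, the modal value is the one with the highest
# frequency (Note 2). Frequencies are from the piston tally chart.
frequency = {55.0: 1, 55.1: 2, 55.2: 3, 55.3: 4, 55.4: 6, 55.5: 7, 55.6: 7,
             55.7: 8, 55.8: 6, 55.9: 5, 56.0: 4, 56.1: 2, 56.2: 1}
modal_diameter = max(frequency, key=frequency.get)

print(modes, two_modes, modal_diameter)
```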
Central Tendency – Mode
• For grouped data the modal value can be
determined graphically
• Using the histogram of the data in handout 4,
determine the modal value of the grouped
data.
Central Tendency – Median
• The median is the measure of central tendency that
splits the series of numbers in half (i.e. it is the
middle value).
– Thus for the discrete values 19mm, 21mm and
23mm (an odd number of values) the median is
21mm.
– For the discrete values 19mm, 21mm, 23mm, 25mm
(i.e. an even number of values) the median is
determined by taking the middle two values, in this
case 21mm and 23mm, adding them together (44mm)
and dividing by 2. Thus the median is 22mm.
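The odd and even cases can be written as one small function. This is a sketch of the rule described above, not a replacement for `statistics.median` in the standard library, which gives the same results.

```python
def median(values):
    """Middle value of the sorted data; mean of the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_case = median([19, 21, 23])        # three values: the middle one
even_case = median([19, 21, 23, 25])   # four values: (21 + 23) / 2

print(odd_case, even_case)
```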
Central Tendency – Median
• For grouped data, the median is a little more tricky
to determine.
– Method 1:
• Using the histogram of the data in Handout 4, add the
frequencies of the classes from both ends to determine the
class that contains the median value.
• Determine the class boundaries of the median class.
• Determine the proportion required to ‘split’ the median
class such that 50% of the population lies either side of this
value.
Central Tendency – Median
– Method 2:
• (Hand out graph paper and use the information from
Handout 4)
• Determine the class boundaries for each of the class
intervals.
• Determine the cumulative frequency and convert to
percentage values.
• Plot class boundaries against cumulative frequency
(percentage values).
• Identify the 50% value and read onto the scale to determine
the median value.
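Both methods amount to finding the point at which the cumulative frequency reaches 50% and interpolating within that class. The sketch below does this numerically. Since the Handout 4 data is not reproduced here, it uses an assumed grouping of the piston diameters into 0.2 mm classes; the boundaries and frequencies are illustrative.

```python
from itertools import accumulate

# Assumed grouping of the 56 piston diameters (illustrative only).
boundaries = [54.95, 55.15, 55.35, 55.55, 55.75, 55.95, 56.15, 56.35]
frequencies = [3, 7, 13, 15, 11, 6, 1]


def grouped_median(boundaries, frequencies):
    """Interpolate within the class where cumulative frequency reaches 50%."""
    half = sum(frequencies) / 2
    cumulative = 0
    for i, f in enumerate(frequencies):
        if f and cumulative + f >= half:
            lower, upper = boundaries[i], boundaries[i + 1]
            # Proportion of the median class needed to reach the halfway point.
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("no observations")


# Cumulative frequency as a percentage at each upper class boundary (Method 2).
total = sum(frequencies)
cumulative_pct = [100 * c / total for c in accumulate(frequencies)]

estimate = grouped_median(boundaries, frequencies)
print([round(p, 1) for p in cumulative_pct])
print(f"median is approximately {estimate:.2f} mm")
```

The interpolation assumes observations are spread evenly within the median class, which is the same assumption made when reading the value off a cumulative frequency plot.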
Dispersion – Range
• The range is, quite simply, the difference between
the highest and the lowest observations. It is a measure
of scatter (spread of values).
• Thus using our discrete values 144cm, 146cm,
154cm and 146cm, the range is 154 – 144 = 10cm
• The range is usually denoted by $R_i$, and the mean of
a set of ranges is given by
$\bar{R} = \frac{1}{k}\sum_{i=1}^{k} R_i$
where k is the number of ranges in the set
Dispersion – Range
• The range is thus a measure of the ‘scatter’.
• There are two major problems with its use…
1. The value of the range tends to depend upon the number of
observations in the sample.
Consider the following values (cm):
144  146  154  146  151  150  134  153
Range for the first two values is 2cm
Range for the first three values is 10cm
Range for the first six values is 10cm
Range for all eight values is 20cm
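The dependence of the range on sample size is easy to reproduce. A minimal sketch using the eight values listed above:

```python
values = [144, 146, 154, 146, 151, 150, 134, 153]


def value_range(sample):
    """Range: highest observation minus lowest observation."""
    return max(sample) - min(sample)


# Range of the first 2, 3, 6 and 8 values, as on the slide.
ranges_by_size = {n: value_range(values[:n]) for n in (2, 3, 6, 8)}

for n, r in ranges_by_size.items():
    print(f"first {n} values: range = {r} cm")
```

The range can only stay the same or grow as more observations are added, which is why ranges from samples of different sizes are not directly comparable.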
Dispersion – Range
• The second major problem with its use…
2. The calculation of range uses only a portion of
the data obtained. The range remains the same
despite changes in the values lying between the
lowest and the highest values.
• We can use a measure that is not subject to
these problems
Dispersion – Standard Deviation
• The standard deviation, σ, takes all the data
into account and is a measure of the deviation
of the values from the mean.
• Take our discrete values 144cm, 146cm,
154cm and 146cm; the mean is 147.5cm.
• The deviation of each of the values from the
mean is given by $(x - \bar{x})$, i.e. (144 - 147.5) = -3.5,
-1.5, +6.5 and -1.5 respectively.
Dispersion – Standard Deviation
• Adding -3.5, -1.5, +6.5 and -1.5 gives a total
deviation from the mean of zero, which
obviously does not describe the spread!
• Thus we square the numbers to make each of
them positive, i.e. $(x - \bar{x})^2$. Hence 12.25, 2.25,
42.25 and 2.25.
• Summing these, i.e. $\sum_{i=1}^{n} (x_i - \bar{x})^2 = 59.00$
• This value is the sum of the squared deviations;
its mean is what we call the variance!
Dispersion – Standard Deviation
• Finding the mean of the summed squared deviations
(the variance)...
$\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = 59.00/4 = 14.75$
• The standard deviation, σ, is given as the
square root of this value…
$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{14.75} = 3.84$ cm
Dispersion – Standard Deviation
• It should be noted that for samples of a
population, the formula for standard deviation
is modified by replacing n by (n-1)
$\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{59.00/3} = 4.43$ cm
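The whole calculation, from deviations to the two forms of the standard deviation, can be followed step by step in code. This sketch reuses the four values from the worked example; the standard library functions `statistics.pstdev` and `statistics.stdev` are shown only as a cross-check.

```python
import math
from statistics import pstdev, stdev

values = [144, 146, 154, 146]
n = len(values)

mean = sum(values) / n                          # 147.5
deviations = [x - mean for x in values]         # these sum to zero
squared = [d ** 2 for d in deviations]          # 12.25, 2.25, 42.25, 2.25
sum_of_squares = sum(squared)                   # 59.0

variance = sum_of_squares / n                   # divide by n (population)
population_sd = math.sqrt(variance)
sample_sd = math.sqrt(sum_of_squares / (n - 1)) # divide by n - 1 (sample)

print(f"deviations sum to {sum(deviations)}")
print(f"variance = {variance}, population s.d. = {population_sd:.2f} cm")
print(f"sample s.d. = {sample_sd:.2f} cm")

# Cross-check against the standard library.
assert math.isclose(population_sd, pstdev(values))
assert math.isclose(sample_sd, stdev(values))
```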
The Normal Distribution
• The meaning of standard deviation is perhaps
most easily understood in terms of the normal
distribution (Gaussian distribution).
• If a continuous variable is monitored (such as
the length of a straw from a cutting process, or
the volume of paint in tins from a filling process,
or the weights of tablets from a pelletizing
process, or monthly sales of a product, etc.),
the variable will usually be distributed about a
mean value, μ.
The Normal Distribution
• The spread of values may be measured in
terms of the population standard deviation,
σ, which defines the width of a bell shaped
curve.
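[Figure: The Normal Distribution. Bell-shaped curve centred on the mean $\bar{x}$, with the intervals $\bar{x} \pm \sigma$, $\bar{x} \pm 2\sigma$ and $\bar{x} \pm 3\sigma$ marked along the Range axis]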
The Normal Distribution
• Note:
– 68.3 % of the population lie within ± 1 s.d. of
the mean i.e. μ ± σ
– 95.4 % within μ ± 2σ
– 99.7 % within μ ± 3σ
• Now refer to LO1.2 for Common Cause
and Special Cause variation
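The three percentages quoted above follow from the normal distribution itself and can be verified numerically. For a normal variable, the proportion lying within k standard deviations of the mean is erf(k/√2); the sketch below evaluates this with `math.erf` from the standard library.

```python
import math


def proportion_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


coverage = {k: proportion_within(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within mean ± {k} s.d.: {p:.1%}")
```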
The Run Chart
• Refer to Handout 5 (hand out graph paper)
• Trying to interpret data in a table can be
difficult, so presenting the information
graphically (i.e. ‘painting a picture’) can help.
• Using the information, plot the data as a
line graph, with the x-axis representing time and
the y-axis, sales.
The Run Chart
• Representing data in this manner is what is
referred to as a run chart.
• As can be clearly seen, there is variation in
the process which reflects our expectation
that Sales will vary month-by-month.
• However in order to understand this variation
we need to understand its causes – this will
then help us to make decisions!
Variability
• How much variation there is in a process and its
nature (i.e. common cause and special cause) may be
determined by carrying out simple statistical
calculations on the process data.
• From this control limits may be set for use with a
simple run chart.
• These control limits describe the extent of the
variation that is being seen in the process due to all
the common causes and will help indicate the
presence of special causes
Variability
• If or when the special causes have been identified,
accounted for and eliminated, the control limits will
allow managers to predict the future performance
of the process with some confidence.
• Traditionally two charts are used to help interpret
what is happening in a process (i.e. whether the
process is in or out of control)
– Range charts (R-charts)
– Mean chart (x-bar Charts).
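Mean and range charts plot one mean and one range per sample. As an illustration of the values that would be plotted, the sketch below treats each day's six percentage impurity readings from Handout 2 as one sample. That choice of subgroup is an assumption made for this example, and no control limits are calculated here.

```python
# Percentage impurity readings from Handout 2, grouped by date.
# Treating each day as one sample of six readings is an assumption.
samples = {
    "15th": [0.26, 0.31, 0.33, 0.32, 0.28, 0.27],
    "16th": [0.24, 0.33, 0.33, 0.34, 0.24, 0.25],
    "17th": [0.28, 0.33, 0.34, 0.36, 0.26, 0.24],
    "18th": [0.30, 0.30, 0.31, 0.32, 0.28, 0.28],
    "19th": [0.26, 0.31, 0.31, 0.32, 0.27, 0.26],
}

# One point per sample for the mean chart and one for the range chart.
sample_means = {day: sum(v) / len(v) for day, v in samples.items()}
sample_ranges = {day: max(v) - min(v) for day, v in samples.items()}

# Centre lines: the mean of the sample means and the mean of the ranges.
grand_mean = sum(sample_means.values()) / len(sample_means)
mean_range = sum(sample_ranges.values()) / len(sample_ranges)

for day in samples:
    print(f"{day}: mean = {sample_means[day]:.3f}, range = {sample_ranges[day]:.2f}")
print(f"grand mean = {grand_mean:.3f}, mean range = {mean_range:.3f}")
```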
What is a Control Chart?
• A control chart is a device intended to be
used at a point of operation where the process
is carried out and by the operators of that
process.
• Results of observations/measurements are
plotted on a chart which reflects the variation
in the process.
• Essentially a control chart has three zones
What is a Control Chart?
[Figure: control chart of a variable or attribute plotted against time. From top to bottom: Action Zone (3), upper control limit, Warning Zone (2), upper warning limit, Stable Zone (1), central line, Stable Zone (1), lower warning limit, Warning Zone (2), lower control limit, Action Zone (3)]
What is a Control Chart?
• Zone 1 – Stable zone – common cause
variation is prevalent
• Zone 2 – Warning Zone – need to keep an
eye on this zone as the process might be
going out of control.
• Zone 3 – Action Zone – requires action to
bring process back into control and an
investigation to prevent a recurrence
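Once warning and control limits have been set, deciding which zone a plotted point falls in is a simple comparison. The sketch below shows that logic only. The function name and the limit values are hypothetical and chosen for illustration; how the limits are calculated is not covered here.

```python
def zone(value, lower_control, lower_warning, upper_warning, upper_control):
    """Return the control chart zone (1, 2 or 3) for a plotted value."""
    if value > upper_control or value < lower_control:
        return 3   # action zone: act and investigate
    if value > upper_warning or value < lower_warning:
        return 2   # warning zone: keep an eye on the process
    return 1       # stable zone: common cause variation only


# Hypothetical limits for a percentage impurity chart (illustrative values).
limits = dict(lower_control=0.23, lower_warning=0.25,
              upper_warning=0.33, upper_control=0.35)

plotted = [0.30, 0.34, 0.36, 0.24, 0.22]
zones = [zone(v, **limits) for v in plotted]

for v, z in zip(plotted, zones):
    print(f"{v:.2f} -> zone {z}")
```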
Variability and Control
• When a process is found to be out-of-control
the first reaction must be to investigate the
assignable (special) cause of variability.
• This may require in some cases the charting
of process parameters rather than the product
parameters which appear in the specification.
Variability
• For example it may be that viscosity of a chemical
product is directly affected by the pressure in the
reactor vessel which in turn may be affected by the
reactor temperature.
• A control chart for pressure with recorded changes
in temperature may be the first step in breaking into
the complexity of the relationship involved.
• The important point is to ensure that all adjustments
to the process are recorded and the relevant data
charted.
Variability
• There can be no compromise on processes
which are shown to be ‘not in control’.
• Simply changing the charting method and / or the
control limits will not bring the process into
control!
• A proper process investigation must take
place.
Variability
• There are numerous potential special causes
for processes being out of control.
• It is extremely difficult, even dangerous, to try
to find an association between types of causes
and patterns shown on control charts.
• There are clearly many causes which could
give rise to different patterns in different
industries and conditions.
Potential Special Causes
[Figure: cause-and-effect diagram. Causes: People, Plant / Equipment, Procedures / Processes, Materials, Environment. Effect: Process out of Control]
Special Causes
• People:
– Fatigue or illness
– Lack of training / novices
– Unsupervised
– Unaware
– Attitudes / motivation
– Changes / improvement
– Rotation of shifts
Special Causes
• Plant / equipment
– Rotation of machines
– Differences in test or measuring devices
– Scheduled preventative maintenance
– Lack of maintenance
– Badly designed equipment
– Worn equipment
– Gradual deterioration of plant / equipment
Special Causes
• Processes /procedures
– Unsuitable techniques of operation or test
– Untried / new processes
– Changes in methods, inspection or check
Special Causes
• Materials
– Merging or mixing of batches, parts, components,
subassemblies, intermediates, etc.
– Accumulation of waste products
– Homogeneity
– Changes in supplier / material
Special Causes
• Environment
– Gradual deterioration in conditions
– Temperature changes
– Humidity
– Noise
– Dusty atmospheres