STATISTICS & NUMERICAL METHODS
FOR PLANT ENGINEERS
AGE-214
By
S. O. Duffuaa
Systems Engineering
Department
Salih Duffuaa
Dr. Duffuaa is a Professor of Industrial and Systems Engineering at the
Department of Systems Engineering at King Fahd University of Petroleum
and Minerals, Dhahran, Saudi Arabia. He received his PhD in Operations
Research from the University of Texas at Austin, USA. His research
interests are in the areas of operations research, optimization, quality
control, process improvement, and maintenance engineering and
management.
He teaches courses in the areas of statistics, quality control, production
and inventory control, maintenance and reliability engineering, and
operations management. He has consulted for industry on maintenance, quality
control and facility planning. He authored a book on maintenance planning
and control published by John Wiley and Sons and edited a book on
maintenance optimization and control. He is the Editor of the Journal of
Quality in Maintenance Engineering, published by Emerald in the United
Kingdom.
King Fahd University of Petroleum & Minerals
Department of Systems Engineering
A closed short course for Saudi Aramco Employees
On
Statistics for Plant Engineers and
Lab Scientists
October 18-22, 2008
Sessions: morning 7:30 - 11:45 A.M., afternoon 12:45 - 3:45 P.M.

Saturday: Dr. Salih Duffuaa, Introduction to probability, frequency and probability distributions, mean and variance.
Sunday: Dr. Mohammad Haboubi, The normal distribution, the central limit theorem and sampling distributions.
Monday: Dr. Hesham Al-Fares, Point and interval estimation, statistical significance tests.
Tuesday: Dr. Mohammad Al Salamah, Simple regression, residual analysis.
Wednesday: Dr. Shokri Selim, Multiple regression, adequacy of a regression model, applications.
Course Outcomes
• Apply probability concepts and laws to solve basic lab problems.
• Summarize and present data in meaningful ways.
• Compute probabilities from probability distributions.
• Construct confidence intervals for sample data.
• Test statistical hypotheses.
• Construct regression models and use them in various applications such as equipment calibration and prediction.
Day 1 Module Objectives
• Concept and definition of probability.
• Axioms of probability
• Laws of probability
Day 1 Module Objectives
• Data Summary
– Measures of central tendency
• Mean (x-bar) and median (M)
– Measures of variability
• Range R, variance S², standard deviation S and coefficient of variation (CoV)
• Frequency Distribution.
• Distributions
• Expected value
Day 1 Module Objectives
• Random variables.
• Mass and distribution functions
• Expected value
Examples of a Random Experiment
• Measuring a current in a wire.
• Number of samples analyzed per day.
• Time to do a task. Time to analyze a sample.
• Yearly rainfall in Dhahran.
Examples of a Random Experiment
• Throwing a coin
• Number of accidents on campus per month.
• Students must generate at least 5 examples.
Random Experiments
• Every time the experiment is repeated, a different outcome results.
• The set of all possible outcomes is called the sample space, denoted by S.
• In the experiment of throwing the coin, the sample space is S = {H, T}.
Random Experiments
• In the experiment on the number of
defective parts in three parts the sample
space S = { 0, 1, 2, 3}.
• Number of weekly traffic accidents on
KFUPM campus.
Event
• An event E is a subset of the sample
space.
• Examples of events in the experiment of the number of defectives in a sample of 3 parts are:
• E1 = {0}, E2 = {0, 1}, E3 = {1, 2}
Example of Events
• A sample of polycarbonate plastic is analyzed for scratch resistance and shock resistance. The results from 49 samples are:

                              Shock resistance
                              High      Low
Scratch resistance   High      40        4
                     Low        2        3

Let A denote the event that a sample has high shock resistance and B denote the event that a sample has high scratch resistance. Determine the number of samples in A ∩ B, A ∪ B and A′.
Solution of Example
• |A| = 42, |B| = 44
• |A ∩ B| = 40
• |A ∪ B| = 46
• |A′| = 7, |B′| = 5
Exercise
• Refer to the event example and answer
the following:
• Find the number in AB
• Find the number of elements in AB
• Find the number of elements in AB
Listing of Sample Spaces
• Tree Diagrams
• Experience
Listing of Sample Spaces
• The experiment of throwing a coin twice.
[Tree diagram: first toss H or T, then second toss H or T]
S = {HH, HT, TH, TT}
Example on Listing Sample Spaces
• Draw the tree diagram for finding the sample space for the number of defective items in a sample of size three taken from a production line producing chips.
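The tree diagram can be checked programmatically. A minimal Python sketch (the labels 'D' for defective and 'G' for good are illustrative assumptions, not from the slides) enumerates the eight branches for three parts and collects the possible defect counts:

```python
from itertools import product
from collections import Counter

# Each of the three parts is either defective ('D') or good ('G').
outcomes = list(product("DG", repeat=3))            # the 8 branches of the tree
defect_counts = [seq.count("D") for seq in outcomes]

print(outcomes)                      # ('D','D','D'), ('D','D','G'), ...
print(sorted(set(defect_counts)))    # sample space for the count: [0, 1, 2, 3]

# If each branch is equally likely, the counts occur with these relative frequencies.
freq = Counter(defect_counts)
print({k: v / len(outcomes) for k, v in sorted(freq.items())})
# {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```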
Types of Sample Spaces
• A sample space is discrete if it consists of a finite (or countably infinite) set of outcomes. Examples are:
• S = {H, T}, S = {1, 2, 3, …}
• A sample space is continuous if it contains an interval (finite or infinite): {T: 0 ≤ T ≤ 60}.
• Students should give more examples.
Notation
• P denotes a probability.
• A, B, ... denote specific events.
• P(A) denotes the probability of event A occurring.
Concepts and Definition of
Probability
Four definitions of probability:
• Classical or a priori probability
• Statistical or a posteriori probability
• Subjective probability (used in
Bayesian methods).
• Mathematical probability
CLASSICAL OR A PRIORI
PROBABILITY
P(A) = (# of ways A can occur, i.e. # of favorable cases) / (total # of possible cases)
STATISTICAL OR A POSTERIORI
PROBABILITY
Pr(A) = (# of successes) / (# of trials), in the limit as the # of trials goes to infinity.
SUBJECTIVE PROBABILITY
• A measure of the degree of belief.
• There is a 10% chance it will rain
today.
• There is a 95% chance you can see
the new moon tomorrow morning.
• Subjective probability is the basis for
Bayesian methods.
Probability Limits
• The probability of an impossible event is 0.
• The probability of an event that is certain to occur is 1.
• 0 ≤ P(A) ≤ 1, where 0 means impossible to occur and 1 means certain to occur.
MATHEMATICAL PROBABILITY
• A measure of uncertainty (or possibility) that satisfies the following conditions:
• 0 ≤ P(A) ≤ 1
• P(S) = 1
• P(A ∪ B) = P(A) + P(B) if A ∩ B = ∅
Possible Values for Probabilities
[Probability scale: 1 = certain, values near 1 = likely, 0.5 = 50-50 chance, values near 0 = unlikely, 0 = impossible]
Probability of an Event
• For a discrete sample space, the probability of an event E, denoted P(E), equals the sum of the probabilities of the outcomes in E.
• Example: S = {1, 2, 3, 4, 5}, each outcome equally likely. E is the even numbers within S: E = {2, 4}, so P(E) = 2/5.
Axioms of Probability
• If S is the sample space and E is any event, then the axioms of probability are:
1. P(S) = 1
2. 0 ≤ P(E) ≤ 1
3. If E1 and E2 are events such that E1 ∩ E2 = ∅, then P(E1 ∪ E2) = P(E1) + P(E2)
Mutually Exclusive Events
• Events A and B are mutually exclusive if they cannot occur simultaneously.
Definition
[Venn diagrams, total area = 1: overlapping events A and B share the region P(A and B); nonoverlapping events have no region in common]
Complementary Events
The complement of event A, denoted by A′ (read "not A"), consists of all outcomes in which event A does not occur.
Rules for Complementary Events
P(A) + P(A′) = 1
P(A′) = 1 - P(A)
P(A) = 1 - P(A′)
Venn Diagram for the Complement of Event A
[Venn diagram: total area = 1, with regions P(A) and P(A′) = 1 - P(A)]
Probability of ‘At Least One’
 ‘At least one’ is equivalent to one or
more.
 The complement of getting at
least one item of a particular type is
that you get no items of that type.
Probability of ‘At Least One’
• If P(A) = P(getting at least one), then
• P(A) = 1 - P(A′)
• where P(A′) is P(getting none).
Definitions
• Compound event: any event combining 2 or more events.
• Notation:
• P(A or B) = P(event A occurs or event B occurs or they both occur)
• P(A and B) = P(event A occurs and event B occurs)
Addition Rule
• P(A or B) = P(event A occurs or event B occurs or they both occur).
• P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
• If A ∩ B = ∅, then
• P(A ∪ B) = P(A) + P(B)
Addition Rule
[Venn diagram of overlapping events A and B]
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Addition Rule: Example
• Let S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
• A = {2, 3, 4, 5, 6}, B = {4, 5, 6, 7, 9, 10}
• A ∪ B = {2, 3, 4, 5, 6, 7, 9, 10}, A ∩ B = {4, 5, 6}
• P(A) = 5/10 = 0.5, P(B) = 6/10 = 0.6, P(A ∩ B) = 3/10 = 0.3
• P(A ∪ B) = 8/10 = 0.8
• P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 0.5 + 0.6 - 0.3 = 0.8
Problem
• Assume A, B and C are events from the sample space S, with P(A) = 0.4, P(B) = 0.5, P(C) = 0.3, P(B ∩ C) = 0.1, P(A ∩ C) = 0.2, and A and B mutually exclusive. Compute the following:
(i) P(A′), (ii) P(A ∪ C), (iii) P[(A ∪ B) ∩ C], (iv) P(A ∪ B ∪ C), (v) P(B′ ∪ C′)
Note: A′ means the complement of A.
Conditional Probability
• Conditional probability concept
• P(A|B) = P(A ∩ B) / P(B) for P(B) > 0
• Give examples
• Solve problems
Example on Conditional Probability
• Let S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
• A = {2, 3, 4, 5, 6}, B = {4, 5, 6, 7, 9, 10}
• P(A|B) = P(A ∩ B) / P(B) = 0.3/0.6 = 0.5
• This is as if we consider B to be our sample space and count how many elements of A are in B. This makes P(B) = 1.
Conditional Probability: Dependent Events
P(A and B) = P(A) • P(B|A)
Formal: P(B|A) = P(A and B) / P(A)
Intuitive: the conditional probability of B given A can be found by assuming the event A has occurred and, operating under that assumption, calculating the probability that event B will occur.
Independent Events
• Two events A and B are independent if the occurrence of one does not affect the probability of the occurrence of the other.
• P(A|B) = P(A)
Dependent Events
• If A and B are not independent, they are said to be dependent.
Formal Multiplication Rule
• P(A and B) = P(A) • P(B) if A and B are independent
• P(A ∩ B) = P(A) • P(B)
Figure 3-9 Applying the Multiplication Rule
[Flowchart for P(A and B): if A and B are independent, use P(A and B) = P(A) • P(B); if not, use P(A and B) = P(A) • P(B|A)]
Generalization of Addition Rules
• Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
• If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B)
• This rule can be generalized to k events:
• If Ei ∩ Ej = ∅ for all i ≠ j, then
• P(E1 ∪ E2 ∪ … ∪ Ek) = P(E1) + P(E2) + … + P(Ek)
Multiplication Rule
• P(A B) = P(AB) ) P(B) = P(BA) ) P(A)
• Example:
The probability that an automobile battery
subject to high engine compartment
temperature suffer low charging is 0.7. The
probability a battery is subject to high engine
compartment temperature is 0.05.
What is the probability a battery is subject to
low charging current and high engine
compartment temperature?
Solution of Example
• Let A denote the event that a battery suffers low charging current. Let B denote the event that a battery is subject to high engine compartment temperature. The probability that the battery is subject to both low charging current and high engine compartment temperature is the probability of the intersection of A and B.
• P(A ∩ B) = P(A|B) P(B) = 0.7 x 0.05 = 0.035
Example On Conditional and
Multiplication ( Product) Rule
• Consider a town that has a population of 900 persons, of whom 600 are males. The rest are females. A total of 600 are employed, of whom 500 are males. Let M denote male, F female, E employed and NE not employed. A person is picked at random. Find the following probabilities: P(M), P(E), P(E|F), P(E ∩ F), P(E ∪ F).
Town Population Example
• Town population distribution:

              Male    Female    Total
Employed       500       100      600
Unemployed     100       200      300
Total          600       300      900
Solution of Example
• P(M) = 600/900 = 2/3
• P(E) = 600/900 = 2/3
• P(E|F) = 100/300 = 1/3
• P(E ∩ F) = P(E|F) P(F) = (1/3) x (1/3) = 1/9
• P(E ∪ F) = P(E) + P(F) - P(E ∩ F) = 2/3 + 1/3 - 1/9 = 8/9
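These probabilities can be read straight off the counts in the table; a minimal Python sketch of the same calculation (the dictionary layout is just an illustrative choice):

```python
from fractions import Fraction as F

# Counts from the town population table: (employment status, sex) -> count.
counts = {("E", "M"): 500, ("E", "F"): 100, ("NE", "M"): 100, ("NE", "F"): 200}
total = sum(counts.values())                                 # 900

P_M = F(counts[("E", "M")] + counts[("NE", "M")], total)     # 2/3
P_E = F(counts[("E", "M")] + counts[("E", "F")], total)      # 2/3
P_F = 1 - P_M                                                # 1/3
P_E_and_F = F(counts[("E", "F")], total)                     # 1/9
P_E_given_F = P_E_and_F / P_F                                # 1/3
P_E_or_F = P_E + P_F - P_E_and_F                             # 8/9

print(P_M, P_E, P_E_given_F, P_E_and_F, P_E_or_F)
```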
Statistical Independence
• Two events are statistically independent if knowledge that one has occurred does not affect the probability of the other happening. Mathematically:
• P(A|B) = P(A), which implies
• P(A ∩ B) = P(A) P(B). Why?
Example of Independence
• Let us consider the experiment of throwing the coin twice. Let B denote the event of having a head (H) in the first throw and A denote having a tail (T) in the second throw.
• P(A|B) = 1/2 = P(A)
• P(A ∩ B) = 1/2 x 1/2 = 1/4 = P(A) P(B)
• Therefore A and B are independent.
Example of Dependent Events
• A daily production of manufactured parts contains 50 parts that do not meet specifications while 800 meet specifications. Two parts are selected at random without replacement from the batch. Let A denote the event that the first part is defective and B the event that the second part is defective.
• Are A and B independent?
• The answer is NO. Work it out before you see the next slide.
Example of Dependent Events
• P(B|A) = 49/849. Why?
• P(B) = P(B|A) P(A) + P(B|A′) P(A′)
       = (49/849)(50/850) + (50/849)(800/850)
       = 50/850
Since P(B|A) = 49/849 ≠ 50/850 = P(B), A and B are not independent.
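A short sketch with exact fractions confirms the dependence; it just reproduces the arithmetic above:

```python
from fractions import Fraction as F

N_def, N_good = 50, 800
N = N_def + N_good                            # 850 parts in the batch

P_A = F(N_def, N)                             # first draw defective: 50/850
P_B_given_A = F(N_def - 1, N - 1)             # 49/849
P_B_given_notA = F(N_def, N - 1)              # 50/849

# Total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')
P_B = P_B_given_A * P_A + P_B_given_notA * (1 - P_A)

print(P_B)                                    # 1/17, i.e. 50/850
print(P_B_given_A == P_B)                     # False -> A and B are dependent
```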
Problem 1
• A box contains 9 components of which
3 are defective.
• If one unit is drawn at random what is
the probability it is defective?
• If two components are chosen at
random
– What is the probability both are defective?
– what is the probability just one is
defective?
Problem 2
• A company has two machines, one less reliable than the other. The better one has probability 0.9 of working throughout the week without repair. The probability for the other is 0.7.
• What is the probability that both machines work satisfactorily throughout the week?
• What is the probability that at least one of them requires repair?
Total Probability Rule
Motivational Problem
• Aramco has three labs that perform the same oil analysis: Lab 1 in Abqaiq, Lab 2 in Ras Tanura and Lab 3 in Dhahran. 100 samples were analyzed in Lab 1, 70 samples in Lab 2 and 30 samples in Lab 3. The chance of error is 5% in Lab 1, 10% in Lab 2 and 8% in Lab 3.
What is the probability that a sample analyzed by Aramco is in error? If the analysis of a sample is found to be in error, what is the chance the analysis was done in Lab 1?
Total Probability Rule
• In a chip manufacturing process, 20% of the chips produced are subjected to a high level of contamination. The probability that such a chip causes product failure is 0.1. The probability that a chip not subjected to high contamination levels during manufacturing causes a product failure is 0.005.
• What is the probability that a product using one of these chips fails?
Total Probability Rule
• Let B be the event that a chip causes product failure. We can write B as the part of B in A plus the part of B in A′:
• B = (B ∩ A) ∪ (B ∩ A′)
• P(B) = P(B ∩ A) + P(B ∩ A′)
• P(B) = P(B|A) P(A) + P(B|A′) P(A′)
• Graphically on the next slide.
Graphical Representation
[Venn diagram: the sample space split into A and A′, with B partitioned into B ∩ A and B ∩ A′]
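A minimal numerical sketch of the total probability rule applied to the chip example above (A = high contamination, B = product failure):

```python
# Chip example: A = chip subjected to high contamination, B = chip causes product failure.
P_A = 0.20            # 20% of chips see high contamination
P_B_given_A = 0.10    # failure probability for highly contaminated chips
P_B_given_notA = 0.005

# Total probability rule: P(B) = P(B|A)P(A) + P(B|A')P(A')
P_B = P_B_given_A * P_A + P_B_given_notA * (1 - P_A)
print(P_B)            # 0.024
```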
General Form of Total Probability Rule
• Assume E1, E2, …, Ek are mutually exclusive and exhaustive events. Then
P(B) = P(B ∩ E1) + P(B ∩ E2) + … + P(B ∩ Ek)
     = P(B|E1) P(E1) + P(B|E2) P(E2) + … + P(B|Ek) P(Ek)
Bayes Rule
• P(A ∩ B) = P(A|B) P(B) = P(B|A) P(A)
• implies
• P(A|B) = P(B|A) P(A) / P(B), for P(B) > 0
• Or, referring to the slide on the general total probability rule, we get
• P(Ei|B) = P(Ei ∩ B) / P(B) = P(B|Ei) P(Ei) / P(B)
          = P(B|Ei) P(Ei) / [P(B|E1) P(E1) + P(B|E2) P(E2) + … + P(B|Ek) P(Ek)]
Example on Bayes Theorem
• Refer to the example about the chip production. If you know a chip caused failure, what is the chance that the chip was subjected to a high level of contamination when it was produced?
• We want P(A|B).
• P(A|B) = P(B|A) P(A) / P(B) = (0.1)(0.2)/0.024 = 5/6 = 0.833
What is the probability that the chip was not subjected to a high level of contamination when produced?
Answer in two ways.
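Continuing the numerical sketch, Bayes' rule gives the posterior probability of high contamination given a failure, using P(B) = 0.024 from the total probability step:

```python
P_A = 0.20
P_B_given_A = 0.10
P_B_given_notA = 0.005
P_B = P_B_given_A * P_A + P_B_given_notA * (1 - P_A)    # 0.024

# Bayes rule: P(A|B) = P(B|A) P(A) / P(B)
P_A_given_B = P_B_given_A * P_A / P_B
print(round(P_A_given_B, 3))                            # 0.833

# The complement, computed two ways:
print(round(1 - P_A_given_B, 3))                        # 0.167
print(round(P_B_given_notA * (1 - P_A) / P_B, 3))       # 0.167
```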
Aramco Example Solution
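The slide leaves the worked solution blank (it was done in class); the following is only a minimal sketch of the computation implied by the problem statement (100, 70 and 30 samples with error rates 5%, 10% and 8%), not the instructor's worked solution:

```python
samples = {"Lab1": 100, "Lab2": 70, "Lab3": 30}
error_rate = {"Lab1": 0.05, "Lab2": 0.10, "Lab3": 0.08}
total = sum(samples.values())                         # 200 samples in all

# Prior probability that a randomly chosen sample came from each lab.
prior = {lab: n / total for lab, n in samples.items()}

# Total probability rule: P(error) = sum over labs of P(error|lab) P(lab)
P_error = sum(error_rate[lab] * prior[lab] for lab in samples)
print(round(P_error, 3))                              # 0.072

# Bayes rule: P(Lab1 | error)
P_lab1_given_error = error_rate["Lab1"] * prior["Lab1"] / P_error
print(round(P_lab1_given_error, 3))                   # 0.347
```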
KFUPM Example on Bayes Theorem
• KFUPM students driving to Building 24 use two roads: the main road that passes in front of Gate 1 and a second road that passes in front of Gate 2. The students use the main road 80% of the time because it is shorter. The radar is on 60% of the time on the main road and 30% of the time on the other road. The students are always speeding. Find the chance a student will be caught speeding. If you know a student was caught speeding, what is the probability he was coming to Building 24 by the main road? Answer the same question for the other road.
Solution of KFUPM Example
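The solution slide is left blank for class; a minimal sketch of the calculation implied by the stated numbers (80%/20% road usage, radar on 60% and 30% of the time), not the instructor's worked solution:

```python
P_main, P_other = 0.8, 0.2               # road choice
P_caught_given_main = 0.6                # radar on-time on the main road
P_caught_given_other = 0.3               # radar on-time on the other road

# Total probability: chance an (always speeding) student is caught
P_caught = P_caught_given_main * P_main + P_caught_given_other * P_other
print(P_caught)                                              # 0.54

# Bayes rule: which road, given that the student was caught
print(round(P_caught_given_main * P_main / P_caught, 3))     # 0.889 (main road)
print(round(P_caught_given_other * P_other / P_caught, 3))   # 0.111 (other road)
```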
Data Table: Compressive Strength of 80 Aluminum-Lithium Alloy Specimens (psi)

105  221  183  186  121  181  180  143
 97  154  153  174  120  168  167  141
245  228  174  199  181  158  176  110
163  131  154  115  160  208  158  133
207  180  190  193  194  133  156  123
134  178   76  167  184  135  229  146
218  157  101  171  165  172  158  169
199  151  142  163  145  171  148  158
160  175  149   87  160  237  150  135
196  201  200  176  150  170  118  149
How to Summarize The Data in
The Table Above
• Point summary
• Tabular format
• Graphical format
Point Summary
• Measures of Central Tendency
• Measures of Variation
• Why are we interested in these types of measures?
Measures of Central Tendency or
Location
• Central tendency measures
– Mean = Σ xi / n
– Median: the middle value
– Mode: the most frequent value
Percentiles: Measure of Location
• The Pth percentile of the data is a value such that at least P% of the data takes on this value or less and at least (100 - P)% of the data takes on this value or more.
• The median is the 50th percentile (Q2).
• The first quartile Q1 is the 25th percentile.
• The third quartile Q3 is the 75th percentile.
Measures of Variability
• Range = max xi - min xi
• Variance = V = Σ (xi - x̄)² / (n - 1)
• Standard deviation S = square root of V
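As a sketch, these point summaries can be computed with NumPy for the compressive strength data; the array name `strength` is an illustrative choice:

```python
import numpy as np

# The 80 compressive strength observations (psi) from the data table above.
strength = np.array([
    105, 221, 183, 186, 121, 181, 180, 143,
     97, 154, 153, 174, 120, 168, 167, 141,
    245, 228, 174, 199, 181, 158, 176, 110,
    163, 131, 154, 115, 160, 208, 158, 133,
    207, 180, 190, 193, 194, 133, 156, 123,
    134, 178,  76, 167, 184, 135, 229, 146,
    218, 157, 101, 171, 165, 172, 158, 169,
    199, 151, 142, 163, 145, 171, 148, 158,
    160, 175, 149,  87, 160, 237, 150, 135,
    196, 201, 200, 176, 150, 170, 118, 149,
])

print("mean:", strength.mean())
print("median (Q2):", np.median(strength))
print("Q1, Q3:", np.percentile(strength, [25, 75]))
print("range:", strength.max() - strength.min())
print("sample variance:", strength.var(ddof=1))    # divides by n - 1
print("standard deviation:", strength.std(ddof=1))
```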
Tabular and Graphical Summary
• Frequency distribution table.
• Histogram
• Cumulative frequency plot.
Steps for Constructing a Frequency Table
• Determine the number of classes/intervals:
– between 5 and 20,
– close to the square root of n, the number of data points.
• Count how many data points fall in each class. This is the frequency.
• The relative frequency is the frequency divided by n.
• The cumulative frequency is the sum of the frequencies up to a certain class.
Class Interval (psi)   Tally                      Frequency   Relative Frequency   Cumulative Relative Frequency
70 ≤ x < 90            ||                             2           0.0250               0.0250
90 ≤ x < 110           |||                            3           0.0375               0.0625
110 ≤ x < 130          |||| |                         6           0.0750               0.1375
130 ≤ x < 150          |||| |||| ||||                14           0.1750               0.3125
150 ≤ x < 170          |||| |||| |||| |||| ||        22           0.2750               0.5875
170 ≤ x < 190          |||| |||| |||| ||             17           0.2125               0.8000
190 ≤ x < 210          |||| ||||                     10           0.1250               0.9250
210 ≤ x < 230          ||||                           4           0.0500               0.9750
230 ≤ x < 250          ||                             2           0.0250               1.0000
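A sketch of how the same frequency table can be produced with NumPy, reusing the `strength` array defined in the descriptive-statistics sketch above (class boundaries 70, 90, ..., 250 psi as in the table):

```python
import numpy as np

# 'strength' is the 80-observation array from the earlier descriptive-statistics sketch.
bins = np.arange(70, 251, 20)                  # 70, 90, ..., 250
freq, edges = np.histogram(strength, bins=bins)

rel_freq = freq / freq.sum()
cum_rel_freq = np.cumsum(rel_freq)

for lo, hi, f, rf, crf in zip(edges[:-1], edges[1:], freq, rel_freq, cum_rel_freq):
    print(f"{lo:3d} <= x < {hi:3d}   {f:2d}   {rf:.4f}   {crf:.4f}")
```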
Histogram
• Plot class versus frequency or relative
frequency to get the histogram.
• Plot the classes versus cumulative frequency
to get the cumulative frequency plot.
Histogram of Compressive Strength Data
[Histogram: frequency (0 to 25) versus compressive strength (psi), classes from 70 to 250 psi]
Cumulative Frequency Plot of Compressive
Strength Data
[Cumulative frequency plot: cumulative frequency (0 to 80) versus strength (100 to 250 psi)]
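A sketch of both plots with Matplotlib, again reusing the `strength` array from the descriptive-statistics sketch:

```python
import matplotlib.pyplot as plt
import numpy as np

# 'strength' is the 80-observation array defined in the descriptive-statistics sketch.
bins = np.arange(70, 251, 20)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram of compressive strength.
ax1.hist(strength, bins=bins, edgecolor="black")
ax1.set_xlabel("Compressive Strength (psi)")
ax1.set_ylabel("Frequency")

# Cumulative frequency plot.
freq, edges = np.histogram(strength, bins=bins)
ax2.step(edges[1:], np.cumsum(freq), where="post")
ax2.set_xlabel("Strength (psi)")
ax2.set_ylabel("Cumulative Frequency")

plt.tight_layout()
plt.show()
```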
Concept of Distribution
Histogram Shapes
• While the number of shapes that a histogram can take is unlimited, certain shapes appear more often than others.
• Drawing a line that connects the edges of the bars in a histogram forms a curve. We can make certain inferences about the data from the shape of the curve.
Distribution
Random Experiment and Random Variables
• Throwing a coin
• S = {H, T}
• Define a mapping X: {H, T} → R
• X(H) = 1 and X(T) = 0; the probabilities of 1 and 0 are the same as those of H and T.
• Then we call X a random variable.
Random Experiment and random
variables
• In the experiment on the number of defective parts in three parts, the sample space is S = {0, 1, 2, 3}.
• Find P(0), P(1), P(2) and P(3).
• Assuming the eight outcomes of the tree diagram are equally likely: P(0) = 1/8, P(1) = 3/8, P(2) = 3/8 and P(3) = 1/8.
Probability Mass Function

x       0     1     2     3
f(x)   1/8   3/8   3/8   1/8

Properties of f(x):
• f(x) ≥ 0
• Σ f(x) = 1
Give many examples in class.
Probability Mass Function
• Build the probability mass functions
for the following random variables:
– Number of traffic accidents per month on
campus.
– Class grade distribution
– Number of “F” grades in SE 205 per semester
– Number of students that register for SE 205
every semester.
Cumulative Distribution Function
• It is a function that provides the cumulative probability up to a point for a random variable (r.v.). It is defined as follows for a discrete r.v.:
• F(x) = P(X ≤ x) = Σ f(t), where the sum is over all t ≤ x.
Cumulative Distribution Function
(CDF)
• Example of a cumulative distribution function:
F(x) = 0     for x < -2
     = 0.2   for -2 ≤ x < 0
     = 0.7   for 0 ≤ x < 2
     = 1.0   for x ≥ 2
What is the mass function corresponding to the above F(x)? Note: you need to subtract successive values of F.
Probability Mass Function Corresponding to Previous CDF

x      -2     0     2
f(x)   0.2   0.5   0.3

The above mass function is the one corresponding to the previous CDF.
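A minimal sketch of the subtraction step: differencing the CDF values at the jump points recovers the mass function.

```python
import numpy as np

support = np.array([-2, 0, 2])       # jump points of the CDF
F = np.array([0.2, 0.7, 1.0])        # F(x) evaluated at those points

# f(x) = F(x) - F(previous x); the first jump starts from F = 0.
f = np.diff(np.concatenate(([0.0], F)))
print(dict(zip(support, f)))         # {-2: 0.2, 0: 0.5, 2: 0.3}
```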
Mean/Expected Value of a Discrete Random Variable (r.v.)
• The mean of a discrete r.v. X, denoted E(X) and also called the expected value, is given by:
• E(X) = μ = Σ x f(x), summed over all x
• The expected value provides a good idea about the center of the r.v.
• Compute the mean of the r.v. in the previous slide:
• E(X) = (-2)(0.2) + (0)(0.5) + (2)(0.3) = 0.2
Variance of a Random Variable
• The variance is a measure of variability. What is variability?
• The variance is defined as:
• V(X) = σ² = E[(X - μ)²] = Σ (x - μ)² f(x)
• Compute the variance of the r.v. in the slide before the previous one:
• σ² = (-2 - 0.2)²(0.2) + (0 - 0.2)²(0.5) + (2 - 0.2)²(0.3) = 0.968 + 0.02 + 0.972 = 1.96
Expected Value of a Function of a r.v.
• Let X be a r.v. with p.m.f. f(x) and let h(X) be a function of X. Then the expected value of h(X) is given by:
• E(h(X)) = Σ h(x) f(x), summed over all x
• Compute the expected value of h(X) = X² - X for the r.v. in the previous slides.
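A sketch that evaluates these quantities for the mass function above (x = -2, 0, 2 with f = 0.2, 0.5, 0.3); it reproduces E(X) = 0.2 and σ² = 1.96 and works the h(X) = X² - X exercise:

```python
import numpy as np

x = np.array([-2.0, 0.0, 2.0])
f = np.array([0.2, 0.5, 0.3])

mean = np.sum(x * f)                      # E(X) = 0.2
variance = np.sum((x - mean) ** 2 * f)    # V(X) = 1.96

h = x**2 - x                              # h(X) = X^2 - X on the support of X
E_h = np.sum(h * f)                       # E(h(X))

print(mean, variance, E_h)                # 0.2 1.96 1.8
```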
Problem 3
• Compute the expected value and the
variance for the random variable on
slide 94 page 47.