STATISTICS & NUMERICAL METHODS FOR PLANT ENGINEERS
AGE-214
By S. O. Duffuaa
Systems Engineering Department

Salih Duffuaa
• Dr. Duffuaa is a Professor of Industrial and Systems Engineering in the Department of Systems Engineering at King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia. He received his PhD in Operations Research from the University of Texas at Austin, USA. His research interests are in operations research, optimization, quality control, process improvement, and maintenance engineering and management. He teaches courses in statistics, quality control, production and inventory control, maintenance and reliability engineering, and operations management. He has consulted for industry on maintenance, quality control and facility planning. He authored a book on maintenance planning and control published by John Wiley and Sons and edited a book on maintenance optimization and control. He is the Editor of the Journal of Quality in Maintenance Engineering, published by Emerald in the United Kingdom.

King Fahd University of Petroleum & Minerals
Department of Systems Engineering
A closed short course for Saudi Aramco employees on Statistics for Plant Engineers and Lab Scientists
Oct. 18-22, 2008

Schedule (morning session 7:30 – 11:45 A.M., afternoon session 12:45 – 3:45 P.M.)
• Saturday, Dr. Salih Duffuaa: Introduction to probability, frequency and probability distributions, mean and variance.
• Sunday, Dr. Mohammad Haboubi: The normal distribution, the central limit theorem and sampling distributions.
• Monday, Dr. Hesham Al-Fares: Point and interval estimation, statistical significance tests.
• Tuesday, Dr. Mohammad Al Salamah: Simple regression, residual analysis.
• Wednesday, Dr. Shokri Selim: Multiple regression, adequacy of a regression model, applications.

Course Outcomes
• Apply probability concepts and laws to solve basic lab problems.
• Summarize and present data in meaningful ways.
• Compute probabilities from probability distributions.
• Construct confidence intervals for sample data.
• Test statistical hypotheses.
• Construct regression models and use them in applications such as equipment calibration and prediction.

Day 1 Module Objectives
• Concept and definition of probability.
• Axioms of probability.
• Laws of probability.

Day 1 Module Objectives
• Data summary
  – Measures of central tendency: mean X-bar and median M.
  – Measures of variability: range R, variance S², standard deviation S and coefficient of variation (CoV).
• Frequency distribution.
• Distributions.
• Expected value.

Day 1 Module Objectives
• Random variables.
• Mass and distribution functions.
• Expected value.

Examples of a Random Experiment
• Measuring a current in a wire.
• Number of samples analyzed per day.
• Time to do a task; time to analyze a sample.
• Yearly rainfall in Dhahran.

Examples of a Random Experiment
• Throwing a coin.
• Number of accidents on campus per month.
• Students must generate at least 5 examples.

Random Experiments
• Every time the experiment is repeated, a different outcome may result.
• The set of all possible outcomes is called the sample space, denoted by S.
• In the experiment of throwing a coin, the sample space is S = {H, T}.

Random Experiments
• In the experiment on the number of defective parts in a sample of three parts, the sample space is S = {0, 1, 2, 3}.
• Number of weekly traffic accidents on the KFUPM campus.

Event
• An event E is a subset of the sample space.
• Examples of events in the experiment on the number of defectives in a sample of 3 parts:
• E1 = {0}, E2 = {0, 1}, E3 = {1, 2}

Example of Events
• A sample of polycarbonate plastic is analyzed for scratch resistance and shock resistance. The results from 49 samples are:

                            Shock resistance
                            High    Low
  Scratch resistance  High   40      4
                      Low     2      3

Let A denote the event that a sample has high shock resistance and B the event that a sample has high scratch resistance.
Determine the number of samples in A ∩ B, A ∪ B and A'.

Solution of Example
• |A| = 42, |B| = 44
• |A ∩ B| = 40
• |A ∪ B| = 46
• |A'| = 7, |B'| = 5

Exercise
• Refer to the event example and answer the following:
• Find the number in AB
• Find the number of elements in AB
• Find the number of elements in AB

Listing of Sample Spaces
• Tree diagrams
• Experience

Listing of Sample Spaces
• The experiment of throwing a coin twice:
[Tree diagram: the first throw branches to H or T, and each branch splits again into H or T for the second throw]
S = {HH, HT, TH, TT}

Example on Listing Sample Spaces
• Draw the tree diagram for finding the sample space for the number of defective items in a sample of size three taken from a production line producing chips.

Types of Sample Spaces
• A sample space is discrete if it consists of a finite (or countably infinite) set of outcomes. Examples are:
• S = {H, T}, S = {1, 2, 3, …}
• A sample space is continuous if it contains an interval (finite or infinite), for example {T: 0 ≤ T ≤ 60}.
• Students should give more examples.

Notation
• P denotes a probability.
• A, B, ... denote specific events.
• P(A) denotes the probability of event A occurring.

Concepts and Definition of Probability
Four definitions of probability:
• Classical or a priori probability
• Statistical or a posteriori probability
• Subjective probability (used in Bayesian methods)
• Mathematical probability

CLASSICAL OR A PRIORI PROBABILITY
P(A) = (number of ways A can occur, i.e. favorable cases) / (total number of possible cases)

STATISTICAL OR A POSTERIORI PROBABILITY
Pr(A) = (number of successes) / (number of trials), in the limit as the number of trials → ∞

SUBJECTIVE PROBABILITY
• A measure of the degree of belief.
• There is a 10% chance it will rain today.
• There is a 95% chance you can see the new moon tomorrow morning.
• Subjective probability is the basis for Bayesian methods.

Probability Limits
• The probability of an impossible event is 0.
• The probability of an event that is certain to occur is 1.
• 0 ≤ P(A) ≤ 1 for any event A, where 0 means impossible to occur and 1 means certain to occur.

MATHEMATICAL PROBABILITY
• A measure of uncertainty (or possibility) that satisfies the following conditions:
  – 0 ≤ P(A) ≤ 1
  – P(S) = 1
  – P(A ∪ B) = P(A) + P(B) if A ∩ B = ∅

Possible Values for Probabilities
[Scale from 0 to 1: 0 = impossible, below 0.5 = unlikely, 0.5 = 50-50 chance, above 0.5 = likely, 1 = certain]

Probability of an Event
• For a discrete sample space, the probability of an event E, denoted P(E), equals the sum of the probabilities of the outcomes in E.
• Example: S = {1, 2, 3, 4, 5}, each outcome equally likely. E is the set of even numbers in S: E = {2, 4}, so P(E) = 2/5.

Axioms of Probability
• If S is the sample space and E is any event, then the axioms of probability are:
  1. P(S) = 1
  2. 0 ≤ P(E) ≤ 1
  3. If E1 and E2 are events such that E1 ∩ E2 = ∅ (mutually exclusive), then P(E1 ∪ E2) = P(E1) + P(E2)
• Events A and B are mutually exclusive if they cannot occur simultaneously.

Definition
[Venn diagrams, total area = 1. Overlapping events: the regions P(A) and P(B) share the region P(A and B). Nonoverlapping events: the regions P(A) and P(B) are disjoint.]

Complementary Events
• The complement of event A, denoted by A' (read "not A"), consists of all outcomes in which event A does not occur.

Rules for Complementary Events
• P(A) + P(A') = 1
• P(A') = 1 – P(A)
• P(A) = 1 – P(A')

Venn Diagram for the Complement of Event A
[Venn diagram, total area = 1: the region inside the circle is P(A); the region outside it is P(A') = 1 – P(A)]

Probability of 'At Least One'
• 'At least one' is equivalent to 'one or more'.
• The complement of getting at least one item of a particular type is that you get no items of that type.
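The 'at least one' idea can be checked by brute-force counting. The sketch below is only an illustration: it assumes a fair coin thrown three times (an experiment not taken from the slides), so that all eight outcomes are equally likely and the classical definition applies. The names `outcomes` and `classical_probability` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

# Assumption: a fair coin thrown three times, so all 8 outcomes are equally likely.
outcomes = list(product("HT", repeat=3))

def classical_probability(event, sample_space):
    """Classical probability: favorable cases / total possible cases."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

# Direct count: outcomes that contain at least one head.
p_direct = classical_probability(lambda o: "H" in o, outcomes)

# Complement rule: 1 - P(no heads at all).
p_none = classical_probability(lambda o: "H" not in o, outcomes)
p_complement = 1 - p_none

print(p_direct, p_complement)  # both print 7/8
```

Counting the single "no heads" outcome is usually less work than counting the seven outcomes with at least one head, which is why the complement rule is preferred for this kind of question.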
Probability of 'At Least One'
• If P(A) = P(getting at least one), then
• P(A) = 1 – P(A')
• where P(A') is P(getting none).

Definitions
• Compound event: any event combining 2 or more events.

Notation
• P(A or B) = P(event A occurs or event B occurs or they both occur)
• P(A and B) = P(event A occurs and event B occurs)

Addition Rule
• P(A or B) = P(event A occurs or event B occurs or they both occur)
• P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
• If A ∩ B = ∅, then
• P(A ∪ B) = P(A) + P(B)

Addition Rule
[Venn diagram of two overlapping events A and B]
• P(A ∪ B) = P(A) + P(B) – P(A ∩ B)

Addition Rule: Example
• Let S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
• A = {2, 3, 4, 5, 6}, B = {4, 5, 6, 7, 9, 10}
• A ∪ B = {2, 3, 4, 5, 6, 7, 9, 10} and A ∩ B = {4, 5, 6}
• P(A) = 5/10 = 0.5, P(B) = 6/10 = 0.6, P(A ∩ B) = 3/10 = 0.3
• By direct count, P(A ∪ B) = 8/10 = 0.8
• By the addition rule, P(A ∪ B) = 0.5 + 0.6 – 0.3 = 0.8

Problem
• Assume A, B and C are events from the sample space S, with P(A) = 0.4, P(B) = 0.5, P(C) = 0.3, P(B ∩ C) = 0.1, P(A ∩ C) = 0.2, and A and B mutually exclusive. Compute the following: (i) P(A'), (ii) P(A ∪ C), (iii) P[(A ∪ B) ∩ C], (iv) P(A ∪ B ∪ C), (v) P(B' ∪ C'). Note: A' means the complement of A.

Conditional Probability
• Conditional probability concept
• P(A|B) = P(A ∩ B) / P(B) for P(B) > 0
• Give examples
• Solve problems

Example on Conditional Probability
• Let S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
• A = {2, 3, 4, 5, 6}, B = {4, 5, 6, 7, 9, 10}
• P(A|B) = P(A ∩ B)/P(B) = 0.3/0.6 = 0.5
• This is as if we take B as our sample space and count how many elements of A are in B (3 out of 6). Within this reduced sample space, P(B) = 1.

Conditional Probability
Dependent Events
• P(A and B) = P(A) • P(B|A)
• Formal: P(B|A) = P(A and B) / P(A)
• Intuitive: The conditional probability of B given A can be found by assuming the event A has occurred and, operating under that assumption, calculating the probability that event B will occur.

Independent Events
• Two events A and B are independent if the occurrence of one does not affect the probability of the occurrence of the other.
• P(A|B) = P(A)

Dependent Events
• If A and B are not independent, they are said to be dependent.

Formal Multiplication Rule
• P(A and B) = P(A) • P(B) if A and B are independent
• P(A ∩ B) = P(A) • P(B)

Figure 3-9 Applying the Multiplication Rule
[Flowchart: to find P(A and B), ask "Are A and B independent?" Yes: P(A and B) = P(A) • P(B). No: P(A and B) = P(A) • P(B|A).]

Generalization of Addition Rules
• Addition rule: P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
• If A ∩ B = ∅, then
• P(A ∪ B) = P(A) + P(B)
• This rule can be generalized to k events:
• If Ei ∩ Ej = ∅ for all i ≠ j, then
• P(E1 ∪ E2 ∪ … ∪ Ek) = P(E1) + P(E2) + … + P(Ek)

Multiplication Rule
• P(A ∩ B) = P(A|B) P(B) = P(B|A) P(A)
• Example: The probability that an automobile battery subject to high engine compartment temperature suffers low charging current is 0.7. The probability that a battery is subject to high engine compartment temperature is 0.05. What is the probability that a battery is subject to low charging current and high engine compartment temperature?

Solution of Example
• Let A denote the event that a battery suffers low charging current. Let B denote the event that a battery is subject to high engine compartment temperature. The event that the battery is subject to both low charging current and high engine compartment temperature is the intersection of A and B.
• P(A ∩ B) = P(A|B) P(B) = 0.7 × 0.05 = 0.035

Example on Conditional and Multiplication (Product) Rule
• Consider a town with a population of 900 persons, of whom 600 are males and the rest are females. A total of 600 are employed, of whom 500 are males. Let M denote male, F female, E employed and NE not employed. A person is picked at random. Find the following probabilities: P(M), P(E), P(E|F), P(E ∩ F), P(E ∪ F).
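For readers who want to verify their answers to the town example, the sketch below tabulates the four sex/employment counts implied by the problem statement and applies the definitions of conditional probability and the addition rule. The dictionary layout and the helper `prob` are illustrative choices, not part of the course material.

```python
from fractions import Fraction

# Counts implied by the problem statement: 900 people, 600 male,
# 600 employed, and 500 of the employed are male.
counts = {
    ("M", "E"): 500,
    ("F", "E"): 100,
    ("M", "NE"): 100,
    ("F", "NE"): 200,
}
total = sum(counts.values())

def prob(event):
    """P(event) for a person picked at random; event is a predicate on (sex, status)."""
    favorable = sum(n for key, n in counts.items() if event(*key))
    return Fraction(favorable, total)

p_m = prob(lambda sex, status: sex == "M")
p_e = prob(lambda sex, status: status == "E")
p_f = prob(lambda sex, status: sex == "F")
p_e_and_f = prob(lambda sex, status: sex == "F" and status == "E")

p_e_given_f = p_e_and_f / p_f        # conditional probability P(E|F)
p_e_or_f = p_e + p_f - p_e_and_f     # addition rule P(E ∪ F)

print(p_m, p_e, p_e_given_f, p_e_and_f, p_e_or_f)  # 2/3 2/3 1/3 1/9 8/9
```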
Town Population Example
• Town population distribution:

                Male   Female   Total
  Employed       500     100      600
  Unemployed     100     200      300
  Total          600     300      900

Solution of Example
• P(M) = 600/900 = 2/3
• P(E) = 600/900 = 2/3
• P(E|F) = 100/300 = 1/3
• P(E ∩ F) = P(E|F) P(F) = (1/3) × (1/3) = 1/9
• P(E ∪ F) = P(E) + P(F) – P(E ∩ F) = 2/3 + 1/3 – 1/9 = 8/9

Statistical Independence
• Two events are statistically independent if knowledge that one has occurred does not affect the probability of the other happening. Mathematically:
• P(A|B) = P(A)
• P(A ∩ B) = P(A) P(B). Why?

Example of Independence
• Consider the experiment of throwing a coin twice. Let B denote the event of a head (H) on the first throw and A the event of a tail (T) on the second throw.
• P(A|B) = ½ = P(A)
• P(A ∩ B) = ¼ = ½ × ½ = P(A) P(B)
• Therefore A and B are independent.

Example of Dependent Events
• A daily production of manufactured parts contains 50 parts that do not meet specifications and 800 that do. Two parts are selected at random without replacement from the batch. Let A denote the event that the first part is defective and B the event that the second part is defective.
• Are A and B independent?
• The answer is NO. Work it out before you see the next slide.

Example of Dependent Events
• P(B|A) = 49/849. Why?
• P(B) = P(B|A) P(A) + P(B|A') P(A') = (49/849)(50/850) + (50/849)(800/850) = 50/850
• Since P(B|A) = 49/849 differs from P(B) = 50/850, A and B are not independent.

Problem 1
• A box contains 9 components, of which 3 are defective.
• If one unit is drawn at random, what is the probability it is defective?
• If two components are chosen at random:
  – What is the probability both are defective?
  – What is the probability just one is defective?

Problem 2
• A company has two machines, one less reliable than the other. The better one has probability 0.9 of working throughout the week without repair. The probability for the other is 0.7.
• What is the probability both machines work satisfactorily throughout the week?
• What is the probability at least one of them requires repair?

Total Probability Rule: Motivational Problem
• Aramco has three labs that perform the same oil analysis: Lab 1 in Abqaiq, Lab 2 in Ras Tanura and Lab 3 in Dhahran. 100 samples were analyzed in Lab 1, 70 samples in Lab 2 and 30 samples in Lab 3. The chance of error is 5% in Lab 1, 10% in Lab 2 and 8% in Lab 3. What is the probability that a sample analyzed by Aramco is in error? If the analysis of a sample is found to be in error, what is the chance the analysis was done in Lab 1?

Total Probability Rule
• In a chip manufacturing process, 20% of the chips produced are subjected to a high level of contamination. A fraction 0.1 of these chips cause product failure. The probability is 0.005 that a chip that is not subjected to high contamination levels during manufacturing causes a product failure.
• What is the probability that a product using one of these chips fails?

Total Probability Rule
• Let B be the event that a chip causes product failure and A the event that the chip is subjected to a high level of contamination. We can write B as the part of B in A together with the part of B in A'.
• B = (B ∩ A) ∪ (B ∩ A')
• P(B) = P(B ∩ A) + P(B ∩ A')
• P(B) = P(B|A) P(A) + P(B|A') P(A') = (0.1)(0.2) + (0.005)(0.8) = 0.024
• Graphically on next slide.

Graphical Representation
[Venn diagram: the sample space is split into A and A'; event B overlaps both, giving the pieces B ∩ A and B ∩ A']

General Form of Total Probability Rule
• Assume E1, E2, …, Ek are mutually exclusive and exhaustive events. Then
• P(B) = P(B ∩ E1) + P(B ∩ E2) + … + P(B ∩ Ek)
       = P(B|E1) P(E1) + P(B|E2) P(E2) + … + P(B|Ek) P(Ek)

Bayes Rule
• P(A ∩ B) = P(A|B) P(B) = P(B|A) P(A)
• This implies P(A|B) = P(B|A) P(A) / P(B), for P(B) > 0
• Or, referring to the slide on the general total probability rule, we get
• P(Ei|B) = P(Ei ∩ B) / P(B) = P(B|Ei) P(Ei) / P(B)
          = P(B|Ei) P(Ei) / [P(B|E1) P(E1) + P(B|E2) P(E2) + … + P(B|Ek) P(Ek)]

Example on Bayes Theorem
• Refer to the example about chip production. If you know a chip caused a failure, what is the chance that the chip was subjected to a high level of contamination when it was produced?
• We want P(A|B).
• P(A|B) = P(B|A) P(A) / P(B) = (0.1)(0.2)/0.024 = 5/6 = 0.833
• What is the probability that the chip was not subjected to a high level of contamination when produced? Answer in two ways.

Aramco Example Solution

KFUPM Example on Bayes Theorem
• KFUPM students driving to Building 24 use two roads: the main road, which passes in front of gate 1, and a second road, which passes in front of gate 2. The students use the main road 80% of the time because it is shorter. The radar is on 60% of the time on the main road and 30% of the time on the other road. The students are always speeding. Find the chance a student will be caught speeding. If you know a student was caught speeding, what is the probability he came to Building 24 by the main road? Answer the same question for the other road.

Solution of KFUPM Example

Data Table: Compressive Strength of 80 Aluminum-Lithium Alloy Specimens

  105  221  183   97  154  153  245  228  174  163
  131  154  207  180  190  134  178   76  218  157
  101  199  151  142  160  175  149  196  201  200
  186  121  181  174  120  168  199  181  158  115
  160  208  193  194  133  167  184  135  171  165
  172  163  145  171   87  160  237  176  150  170
  180  167  176  158  156  229  158  148  150  118
  143  141  110  133  123  146  169  158  135  149

How to Summarize the Data in the Table Above
• Point summary
• Tabular format
• Graphical format

Point Summary
• Measures of central tendency
• Measures of variation
• Why are we interested in these types of measures?

Measures of Central Tendency or Location
• Central tendency measures:
  – Mean: x̄ = Σ xi / n
  – Median: middle value
  – Mode: most frequent value

Percentiles: Measure of Location
• The Pth percentile of the data is a value such that at least P% of the data take on this value or less and at least (100 – P)% of the data take on this value or more.
• The median is the 50th percentile (Q2).
• The first quartile Q1 is the 25th percentile.
• The third quartile Q3 is the 75th percentile.
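The point summaries defined above can be computed for the compressive strength table with Python's standard `statistics` module. This is a minimal sketch: the variable names are illustrative, and the quartiles use that module's default interpolation, which is only one of several conventions and may differ slightly from values obtained by hand or from other software.

```python
import statistics

# Compressive strength data (psi) from the data table, 80 values.
strength = [
    105, 221, 183, 97, 154, 153, 245, 228, 174, 163,
    131, 154, 207, 180, 190, 134, 178, 76, 218, 157,
    101, 199, 151, 142, 160, 175, 149, 196, 201, 200,
    186, 121, 181, 174, 120, 168, 199, 181, 158, 115,
    160, 208, 193, 194, 133, 167, 184, 135, 171, 165,
    172, 163, 145, 171, 87, 160, 237, 176, 150, 170,
    180, 167, 176, 158, 156, 229, 158, 148, 150, 118,
    143, 141, 110, 133, 123, 146, 169, 158, 135, 149,
]

n = len(strength)
mean = statistics.mean(strength)          # sum of x_i divided by n
median = statistics.median(strength)      # middle value (50th percentile)
modes = statistics.multimode(strength)    # most frequent value(s); ties are all returned

# Quartiles Q1, Q2, Q3 (25th, 50th and 75th percentiles).
q1, q2, q3 = statistics.quantiles(strength, n=4)

print(f"n = {n}, mean = {mean:.4f}, median = {median}")
print(f"mode(s) = {modes}")
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
```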
Measures of Variability
• Range = max xi – min xi
• Variance: V = S² = Σ (xi – x̄)² / (n – 1)
• Standard deviation: S = square root of V

Tabular and Graphical Summary
• Frequency distribution table
• Histogram
• Cumulative frequency plot

Steps for Constructing Frequency Table
• Determine the number of classes/intervals:
  – Between 5 and 20
  – Close to the square root of n, the number of data points
• Count how many data points fall in each class. This is the frequency.
• The relative frequency is the frequency divided by n.
• The cumulative frequency is the sum of the frequencies up to and including a given class.

  Class Interval (psi)   Frequency   Relative Frequency   Cumulative Relative Frequency   Tally
  70 ≤ x < 90                2           0.0250                 0.0250                    ||
  90 ≤ x < 110               3           0.0375                 0.0625                    |||
  110 ≤ x < 130              6           0.0750                 0.1375                    |||| |
  130 ≤ x < 150             14           0.1750                 0.3125                    |||| |||| ||||
  150 ≤ x < 170             22           0.2750                 0.5875                    |||| |||| |||| |||| ||
  170 ≤ x < 190             17           0.2125                 0.8000                    |||| |||| |||| ||
  190 ≤ x < 210             10           0.1250                 0.9250                    |||| ||||
  210 ≤ x < 230              4           0.0500                 0.9750                    ||||
  230 ≤ x < 250              2           0.0250                 1.0000                    ||

Histogram
• Plot class versus frequency or relative frequency to get the histogram.
• Plot the classes versus cumulative frequency to get the cumulative frequency plot.

[Figure: Histogram of Compressive Strength Data. Horizontal axis: Compressive Strength (psi), 70 to 250. Vertical axis: Frequency.]

[Figure: Cumulative Frequency Plot of Compressive Strength Data. Horizontal axis: Strength. Vertical axis: Cumulative Frequency.]

Concept of Distribution

Histogram Shapes
• While the number of shapes that a histogram can take is unlimited, certain shapes appear more often than others.
• Drawing a line that connects the tops of the bars in a histogram forms a curve. We can make certain inferences about the data from the shape of the curve.

Distribution

Random Experiment and Random Variables
• Throwing a coin
• S = {H, T}
• Define a mapping X: {H, T} → R
• X(H) = 1 and X(T) = 0. The probabilities of 1 and 0 are the same as those of H and T.
• Then we call X a random variable.

Random Experiment and Random Variables
• In the experiment on the number of defective parts in three parts, the sample space is S = {0, 1, 2, 3}.
• Find P(0), P(1), P(2) and P(3).
• P(0) = 1/8, P(1) = 3/8, P(2) = 3/8 and P(3) = 1/8 (these values correspond to each part being defective or good with equal probability, independently of the others).

Probability Mass Function

  x       0     1     2     3
  f(x)   1/8   3/8   3/8   1/8

Properties of f(x)
• f(x) ≥ 0
• Σ f(x) = 1
• Give many examples in class.

Probability Mass Function
• Build the probability mass functions for the following random variables:
  – Number of traffic accidents per month on campus.
  – Class grade distribution.
  – Number of "F" grades in the SE 205 class per semester.
  – Number of students who register for SE 205 every semester.

Cumulative Distribution Function
• A function that gives the cumulative probability up to a point for a random variable (r.v.). For a discrete r.v. it is defined as:
• F(x) = P(X ≤ x) = Σ f(t), summed over all t ≤ x

Cumulative Distribution Function (CDF)
• Example of a cumulative distribution function:
  F(x) = 0     for x < -2
       = 0.2   for -2 ≤ x < 0
       = 0.7   for 0 ≤ x < 2
       = 1.0   for 2 ≤ x
• What is the probability mass function for the above F(x)? Note that you need to subtract: f(x) is the size of the jump in F at x.

Probability Mass Function Corresponding to Previous CDF

  x      -2     0     2
  f(x)   0.2   0.5   0.3

• The above probability mass function is the one corresponding to the previous CDF.

Mean / Expected Value of a Discrete Random Variable (r.v.)
• The mean of a discrete r.v., denoted E(X) and also called the expected value, is given by:
• E(X) = μ = Σ x f(x), summed over all x
• The expected value gives a good idea about the center of the r.v.
• Compute the mean of the r.v. in the previous slide:
• E(X) = (-2)(0.2) + (0)(0.5) + (2)(0.3) = 0.2

Variance of a Random Variable
• The variance is a measure of variability.
• What is variability?
• The variance is defined as:
• V(X) = σ² = E(X – μ)² = Σ (x – μ)² f(x)
• Compute the variance of the r.v. in the slide before the previous one.
• σ² = (-2 – 0.2)²(0.2) + (0 – 0.2)²(0.5) + (2 – 0.2)²(0.3) = 1.96

Expected Value of a Function of a r.v.
• Let X be a r.v. with p.m.f. f(x) and let h(X) be a function of X.
Then the expected value of h(X) is given by:
• E(h(X)) = Σ h(x) f(x), summed over all x
• Compute the expected value of h(X) = X² – X for the r.v. in the previous slides.

Problem 3
• Compute the expected value and the variance for the random variable on slide 94 page 47.
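The formulas for E(X), V(X) and E(h(X)) can be checked numerically on the p.m.f. with values -2, 0 and 2 used in the preceding slides. The sketch below uses exact fractions to avoid rounding; the helper name `expectation` is an illustrative choice.

```python
from fractions import Fraction

# p.m.f. from the slides: x = -2, 0, 2 with f(x) = 0.2, 0.5, 0.3.
pmf = {-2: Fraction(1, 5), 0: Fraction(1, 2), 2: Fraction(3, 10)}

def expectation(h, pmf):
    """E[h(X)] = sum over x of h(x) * f(x) for a discrete random variable."""
    return sum(h(x) * p for x, p in pmf.items())

# A valid p.m.f. is non-negative and sums to 1.
assert all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

mu = expectation(lambda x: x, pmf)                # mean E(X)
var = expectation(lambda x: (x - mu) ** 2, pmf)   # variance E[(X - mu)^2]
e_h = expectation(lambda x: x**2 - x, pmf)        # E[h(X)] with h(X) = X^2 - X

print(mu, var, e_h)  # 1/5 49/25 9/5, i.e. 0.2, 1.96 and 1.8
```

Because the mean and the variance are both special cases of E(h(X)), a single helper covers all three calculations.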