Introduction to Probability and Statistics
Chapter 12

Topics
• Types of Probability
• Fundamentals of Probability
• Statistical Independence and Dependence
• Expected Value
• The Normal Distribution

Sample Space and Event
• Probability is associated with performing an experiment whose outcomes occur randomly.
• The sample space contains all the outcomes of an experiment.
• An event is a subset of the sample space.
• The probability of an event is always greater than or equal to zero.
• The probabilities of all the outcomes in the sample space must sum to one.
• Events in an experiment are mutually exclusive if only one can occur at a time.

Types of Probability
• Objective Probability
  – Stated prior to the occurrence of the event.
  – Based on the logic of the process producing the outcomes.
  – Relative frequency is the more widely used definition of objective probability.
• Subjective Probability
  – Based on personal belief, experience, or knowledge of a situation.
  – Frequently used in making business decisions.
  – Different people often arrive at different subjective probabilities.

Fundamentals of Probability: Distributions
• Frequency Distribution
  – An organization of numerical data about the events.
• Probability Distribution
  – A list of corresponding probabilities for each event.
• Mutually Exclusive Events
  – Two or more events that cannot occur at the same time.
  – The probability that one or another of the events will occur is found by summing the individual probabilities of the events: P(A or B) = P(A) + P(B)

Fundamentals of Probability: A Frequency Distribution Example
• Grades for past four years.
Event (Grade)   Number of Students   Relative Frequency   Probability
A                 300                  300/3,000            .10
B                 600                  600/3,000            .20
C               1,500                1,500/3,000            .50
D                 450                  450/3,000            .15
F                 150                  150/3,000            .05
Total           3,000                                      1.00

Fundamentals of Probability: Non-Mutually Exclusive Events & Joint Probability
• The probability that non-mutually exclusive events M and F, or both, will occur is expressed as: P(M or F) = P(M) + P(F) - P(MF)
• A joint (intersection) probability, P(MF), is the probability that two or more events that are not mutually exclusive occur simultaneously.

Fundamentals of Probability: Cumulative Probability Distribution
• Determined by adding the probability of an event to the sum of all previously listed probabilities.

Event (Grade)   Probability   Cumulative Probability
A               .10            .10
B               .20            .30
C               .50            .80
D               .15            .95
F               .05           1.00
Total          1.00

• Probability that a student will get a grade of C or higher:
  P(A or B or C) = P(A) + P(B) + P(C) = .10 + .20 + .50 = .80

Statistical Independence and Dependence: Independent Events
• Events that do not affect each other are independent.
• The probability that independent events occur together is computed by multiplying the probabilities of each event: P(AB) = P(A) P(B)
• For a coin tossed three consecutive times, the probability of getting a head on the first toss, a tail on the second, and a tail on the third is:
  P(HTT) = P(H) P(T) P(T) = (.5)(.5)(.5) = .125

Statistical Independence and Dependence: Independent Events – Bernoulli Process Definition
• Properties of a Bernoulli process:
  – Two possible outcomes for each trial.
  – Probability of the outcome remains constant over time.
  – Outcomes of the trials are independent.
  – Number of trials is discrete and integer.

Binomial Distribution
• Used to determine the probability of a number of successes in n trials:
  $P(r) = \frac{n!}{r!(n-r)!} p^r q^{n-r}$
  where:
  p = probability of a success
  q = 1 - p = probability of a failure
  n = number of trials
  r = number of successes in n trials
• Determine the probability of getting exactly two tails in three tosses of a coin:
  $P(2 \text{ tails}) = P(r = 2) = \frac{3!}{2!(3-2)!}(.5)^2(.5)^{3-2} = \frac{(3 \cdot 2 \cdot 1)}{(2 \cdot 1)(1)}(.25)(.5) = \frac{6}{2}(.125) = .375$

Example
• Microchips are inspected at the quality control station.
• From every batch, four are selected and tested for defects.
• Given a defective rate of 20%, what is the probability that a sample of four contains exactly two defectives?

Binomial Distribution Example – Quality Control
• Probability that a sample of four contains exactly two defectives:
  $P(r = 2 \text{ defectives}) = \frac{4!}{2!(4-2)!}(.2)^2(.8)^2 = \frac{(4 \cdot 3 \cdot 2 \cdot 1)}{(2 \cdot 1)(2 \cdot 1)}(.04)(.64) = \frac{24}{4}(.0256) = .1536$
• Probability of getting two or more defectives:
  $P(r \ge 2) = \frac{4!}{2!(4-2)!}(.2)^2(.8)^2 + \frac{4!}{3!(4-3)!}(.2)^3(.8)^1 + \frac{4!}{4!(4-4)!}(.2)^4(.8)^0 = .1536 + .0256 + .0016 = .1808$
• Probability of fewer than two defectives:
  P(r<2) = P(r=0) + P(r=1) = 1.0 - [P(r=2) + P(r=3) + P(r=4)] = 1.0 - .1808 = .8192

Dependent Events
• If the occurrence of one event affects the probability of the occurrence of another event, the events are dependent.
• A coin toss selects a bucket, and a ball is then drawn from that bucket.
• If a tail occurs, there is a 1/6 chance of drawing a blue ball from bucket 2; if a head results, there is no possibility of drawing a blue ball from bucket 1.
• The probability of the event “drawing a blue ball” is dependent on the event “flipping a coin.”

Dependent Events – Conditional Probabilities
• Unconditional: P(H) = .5; P(T) = .5, which must sum to one.
• Conditional: P(R|H) = .33, P(W|H) = .67, P(R|T) = .83, P(W|T) = .17

Math Formulation of Conditional Probabilities
• Given two dependent events A and B:
  P(A|B) = P(AB)/P(B) or P(AB) = P(A|B) P(B)
• With data from the previous example:
  P(RH) = P(R|H) P(H) = (.33)(.5) = .165
  P(WH) = P(W|H) P(H) = (.67)(.5) = .335
  P(RT) = P(R|T) P(T) = (.83)(.5) = .415
  P(WT) = P(W|T) P(T) = (.17)(.5) = .085

Summary of Example Problem Probabilities

Bayesian Analysis
• In Bayesian analysis, additional information is used to alter (improve) the marginal probability of the occurrence of an event.
• The improved probability is called the posterior probability.
• A posterior probability is the altered marginal probability of an event based on additional information.
• Bayes’ Rule for two events, A and B, and a third event, C, conditionally dependent on A and B:
  $P(A|C) = \frac{P(C|A)P(A)}{P(C|A)P(A) + P(C|B)P(B)}$

Bayesian Analysis – Example (1 of 2)
• Machine setup: if correct, 10% chance of a defective part; if incorrect, 40%.
• There is a 50% chance that the setup is correct and a 50% chance that it is incorrect.
• What is the probability that the machine setup is incorrect if a sample part is defective?
• Solution: P(C) = .50, P(IC) = .50, P(D|C) = .10, P(D|IC) = .40
  where C = correct, IC = incorrect, D = defective
  $P(IC|D) = \frac{P(D|IC)P(IC)}{P(D|IC)P(IC) + P(D|C)P(C)} = \frac{(.40)(.50)}{(.40)(.50) + (.10)(.50)} = .80$

Statistical Independence and Dependence: Bayesian Analysis – Example (2 of 2)
• Previously, the manager knew that there was a 50% chance that the machine was set up incorrectly.
• Now, after testing the part, he knows that if it is defective, there is a 0.8 probability that the machine was set up incorrectly.

Expected Value: Random Variables
• When the values of variables occur in no particular order or sequence, the variables are referred to as random variables.
• Random variables are represented by a letter: x, y, z, etc.
• It is possible to assign a probability to the occurrence of each possible value.
• Examples: the possible values of the number of heads in a set of coin tosses; the possible values of demand per week.

Expected Value: Example (1 of 4)
• Machines break down 0, 1, 2, 3, or 4 times per month.
• Relative frequency of breakdowns, or a probability distribution:

Random Variable x (Number of Breakdowns)   P(x)
0                                           .10
1                                           .20
2                                           .30
3                                           .25
4                                           .15
Total                                      1.00

Expected Value: Example (2 of 4)
• The expected value is computed by multiplying each possible value of the variable by its probability and summing these products.
• It is the weighted average, or mean, of the probability distribution of the random variable.
• Expected value of number of breakdowns per month:
  E(x) = (0)(.10) + (1)(.20) + (2)(.30) + (3)(.25) + (4)(.15) = 0 + .20 + .60 + .75 + .60 = 2.15 breakdowns

Expected Value: Example (3 of 4)
• Variance is a measure of the dispersion of random variable values about the mean.
• Variance is computed as follows:
  1. Square the difference between each value and the expected value.
  2. Multiply the resulting amounts by the probability of each value.
  3. Sum the values compiled in step 2.
• General formula:
  $\sigma^2 = \sum_{i=1}^{n} [x_i - E(x)]^2 P(x_i)$

Expected Value: Example (4 of 4)
• Standard deviation is computed by taking the square root of the variance.
• For the example data:

xi      P(xi)   xi – E(x)   [xi – E(x)]^2   [xi – E(x)]^2 P(xi)
0       .10     -2.15       4.62             .462
1       .20     -1.15       1.32             .264
2       .30     -0.15       0.02             .006
3       .25      0.85       0.72             .180
4       .15      1.85       3.42             .513
Total  1.00                                 1.425

  $\sigma^2 = 1.425$
  standard deviation $= \sigma = \sqrt{1.425} = 1.19$ breakdowns per month

Poisson Distribution
• Based on the number of outcomes occurring during a given time interval or in a specified region.
• Examples
  – # of accidents that occur on a given highway during a 1-week period
  – # of customers coming to a bank during a 1-hour interval
  – # of TVs sold at a department store during a given week
  – # of breakdowns of a washing machine per month

Conditions
• Consider the # of breakdowns of a washing machine per month example
  – Each breakdown is called an occurrence
  – Occurrences are random in that they do not follow any pattern (unpredictable)
  – An occurrence is always considered with respect to an interval (one month)

The Probability Mass Distribution
• X = number of counts in the interval
• X is a Poisson random variable with parameter $\lambda > 0$
• PMF: $f(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$
• Mean and variance: $E[X] = \lambda$, $V(X) = \lambda$

Example
• If a bank gets on average $\lambda = 6$ bad checks per day, what are the probabilities that it will receive four bad checks on any given day? 10 bad checks on any two consecutive days?
• Solution
  – With $x = 4$ and $\lambda = 6$: $f(4) = \frac{6^4 e^{-6}}{4!} = 0.134$
  – With $\lambda = 12$ and $x = 10$: $f(10) = \frac{12^{10} e^{-12}}{10!} = 0.105$

Example
• The number of failures of a testing instrument from contamination particles on the product is a Poisson random variable with a mean of 0.02 failure per hour.
  – What is the probability that the instrument does not fail in an 8-hour shift?
  – What is the probability of at least one failure in one 24-hour day?

Solution
a) Let X denote the number of failures in 8 hours. Then X has a Poisson distribution with $\lambda = (0.02)(8) = 0.16$:
   $P(X = 0) = e^{-0.16} = 0.8521$
b) Let Y denote the number of failures in 24 hours. Then Y has a Poisson distribution with $\lambda = (0.02)(24) = 0.48$:
   $P(Y \ge 1) = 1 - P(Y = 0) = 1 - e^{-0.48} = 0.3812$

The Normal Distribution: Continuous Random Variables
• A continuous random variable can take on an infinite number of values within some interval.
• Continuous random variables have values that are not countable.
• A unique probability cannot be assigned to each value.

The Normal Distribution: Definition
• The normal distribution is a continuous probability distribution that is symmetrical on both sides of the mean.
• The center of a normal distribution is its mean $\mu$.
• The area under the normal curve represents probability, and the total area under the curve sums to one.

The Normal Distribution: Example (1 of 5)
• Mean weekly carpet sales of 4,200 yards, with standard deviation of 1,400 yards.
• What is the probability of sales exceeding 6,000 yards?
• $\mu = 4,200$ yd; $\sigma = 1,400$ yd; the probability that the number of yards of carpet will be equal to or greater than 6,000 is expressed as $P(x \ge 6,000)$.

The Normal Distribution: Example (2 of 5)

The Normal Distribution: Standard Normal Curve (1 of 2)
• The area or probability under a normal curve is measured by determining the number of standard deviations from the mean.
• The number of standard deviations a value lies from the mean is designated Z.
  $Z = (x - \mu)/\sigma$

The Normal Distribution: Standard Normal Curve (2 of 2)

The Normal Distribution: Example (3 of 5)
  $Z = (x - \mu)/\sigma = (6,000 - 4,200)/1,400 = 1.29$ standard deviations
  $P(x \ge 6,000) = .5000 - .4015 = .0985$

The Normal Distribution: Example (4 of 5)
• Determine the probability that demand will be 5,000 yards or less.
  $Z = (x - \mu)/\sigma = (5,000 - 4,200)/1,400 = .57$ standard deviations
  $P(x \le 5,000) = .5000 + .2157 = .7157$

The Normal Distribution: Example (5 of 5)
• Determine the probability that demand will be between 3,000 yards and 5,000 yards.
  $Z = (3,000 - 4,200)/1,400 = -1,200/1,400 = -.86$
  $P(3,000 \le x \le 5,000) = .2157 + .3051 = .5208$

Different Table (cumulative standard normal probabilities)
• $P(3,000 \le x \le 5,000)$
• $= P((3,000 - 4,200)/1,400 \le z \le (5,000 - 4,200)/1,400)$
• $= P(-0.86 \le z \le 0.57)$
• $= P(z \le 0.57) - P(z \le -0.86)$
• $= P(z \le 0.57) - P(z \ge 0.86)$
• $= P(z \le 0.57) - [1 - P(z \le 0.86)]$
• $= (0.7157) - [1 - 0.8051] = 0.5208$

The Normal Distribution: Sample Mean and Variance
• The population mean and variance are for the entire set of data being analyzed.
• The sample mean and variance are derived from a subset of the population data and are used to make inferences about the population.

The Normal Distribution: Computing the Sample Mean and Variance
  Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
  Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
  Sample standard deviation: $s = \sqrt{s^2}$

The Normal Distribution: Example Problem Re-Done

Week i   Demand xi
1         2,900
2         5,400
3         3,100
4         4,700
5         3,800
6         4,300
7         6,800
8         2,900
9         3,600
10        4,500
Total    42,000

  Sample mean = 42,000/10 = 4,200 yd
  Sample variance = $[\sum x_i^2 - (\sum x_i)^2/n]/(n - 1)$ = [(190,060,000) - (1,764,000,000/10)]/9 ≈ 1,517,778
  Sample std. dev. = sqrt(1,517,778) = 1,232 yd

The Normal Distribution: Chi-Square Test for Normality (1 of 2)
• It can never be simply assumed that data are normally distributed.
• A statistical test must be performed to check whether the data are consistent with a particular distribution.
• The chi-square test is used to determine whether a set of data fits a particular distribution.
• It compares an observed frequency distribution with a theoretical frequency distribution that would be expected to occur if the data followed a particular distribution (testing the goodness of fit).

The Normal Distribution: Chi-Square Test for Normality (2 of 2)
• In the test, the observed frequency in each range of the frequency distribution is compared to the theoretical frequency that should occur in that range if the data follow a particular distribution.
• A chi-square statistic is then calculated and compared to a number, called a critical value, from a chi-square table.
• If the test statistic is greater than the critical value, the hypothesis that the data follow the distribution being tested is rejected; if it is less, the hypothesis is not rejected, and the data are treated as consistent with that distribution.
• The chi-square test is a form of hypothesis testing.

Statistical Analysis with Excel (1–3 of 3)
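
The chapter's worked examples can also be cross-checked in code rather than a spreadsheet. What follows is a minimal Python sketch using only the standard library; the function and variable names are illustrative assumptions and do not come from the slides. It recomputes the binomial, Bayesian, expected value, Poisson, normal, and sample-statistics examples from the inputs stated above.

```python
import math
import statistics


def binomial_pmf(r, n, p):
    """P(r successes in n Bernoulli trials with success probability p)."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)


def posterior(prior_a, like_a, prior_b, like_b):
    """Bayes' rule for two alternatives A and B given evidence C.

    like_a is P(C|A) and like_b is P(C|B); the result is P(A|C).
    """
    return like_a * prior_a / (like_a * prior_a + like_b * prior_b)


def mean_and_variance(dist):
    """Expected value and variance of a discrete distribution {x: P(x)}."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, var


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


# Binomial: defectives in a sample of four with a 20% defect rate.
p_two = binomial_pmf(2, 4, 0.2)
p_two_or_more = sum(binomial_pmf(r, 4, 0.2) for r in range(2, 5))

# Bayes: probability the setup is incorrect given a defective part.
p_incorrect = posterior(0.50, 0.40, 0.50, 0.10)

# Expected value and variance of monthly breakdowns.
breakdowns = {0: 0.10, 1: 0.20, 2: 0.30, 3: 0.25, 4: 0.15}
ev, var = mean_and_variance(breakdowns)

# Poisson: four bad checks in a day (mean 6); no failure in 8 hours (mean 0.16).
p_four_checks = poisson_pmf(4, 6)
p_no_failure = poisson_pmf(0, 0.16)

# Normal: carpet demand with mean 4,200 yd and standard deviation 1,400 yd.
p_over_6000 = 1 - normal_cdf(6000, 4200, 1400)
p_3000_to_5000 = normal_cdf(5000, 4200, 1400) - normal_cdf(3000, 4200, 1400)

# Sample statistics for the ten weeks of demand (n - 1 in the denominator).
demand = [2900, 5400, 3100, 4700, 3800, 4300, 6800, 2900, 3600, 4500]
sample_mean = statistics.mean(demand)
sample_std = statistics.stdev(demand)

results = {
    "P(r = 2 defectives)": p_two,
    "P(r >= 2 defectives)": p_two_or_more,
    "P(incorrect | defective)": p_incorrect,
    "E(x) breakdowns": ev,
    "Var(x) breakdowns": var,
    "P(4 bad checks)": p_four_checks,
    "P(no failure in 8 h)": p_no_failure,
    "P(x >= 6000)": p_over_6000,
    "P(3000 <= x <= 5000)": p_3000_to_5000,
    "sample mean": sample_mean,
    "sample std. dev.": sample_std,
}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

The normal probabilities printed here differ slightly from the table-based values because the table lookups round Z to two decimals, and the variance differs in the third decimal because the worked table rounds the squared deviations before weighting them.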