LESSON FIVE: INTRODUCTION TO PROBABILITY DISTRIBUTIONS

Introduction to probability distributions and random variables

The concept of a probability distribution was introduced briefly in lesson four, where it was described as a list of every possible outcome with its corresponding probability. The corresponding probabilities were calculated as simple relative frequencies. In lesson four you may also have noted that probability calculations, even for simple real-life situations, became complex very quickly. Under certain well-defined conditions, however, it is possible to derive formulae to calculate the probabilities.

Random variables

In statistics the outcome of a random experiment is variable and determined by chance. The numeric value of the outcome is called a random variable. For example, when a die is thrown the face value that turns up may be any one of the numbers 1, 2, 3, 4, 5 or 6. The face value that turns up is a random variable.

In other situations the outcome of a random experiment is not numeric, so a formula or rule is used to assign a numeric value to each outcome. For example, if three coins are tossed and the number of Heads is counted, the values are:
3 (when the outcome is three Heads)
2 (when the outcome is two Heads and one Tail)
1 (when the outcome is one Head and two Tails)
0 (when the outcome is three Tails)

Hence the more general definition of a random variable is the rule that assigns a numeric value to each outcome of a random experiment.

The numeric values of a random variable may be discrete or continuous. A discrete random variable can assume only distinct, separate values (a finite number of them, or a countable sequence such as 0, 1, 2, ...). A continuous random variable can assume any value within a continuous interval (example: the time to process an online customer order).

The probability that the random variable X will assume the value x on a trial of a random experiment is written as P(X = x) or simply P(x). A trial of a random experiment is a single execution of the experiment.
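The three-coin example can be sketched in a few lines of Python. This is only an illustration: it assumes the coins are fair (so all eight outcomes are equally likely), and the names `number_of_heads` and `distribution` are introduced here and do not come from the lesson.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 8 outcomes of tossing three coins, e.g. ('H', 'T', 'H').
# Assumption: the coins are fair, so every outcome is equally likely.
outcomes = list(product("HT", repeat=3))

def number_of_heads(outcome):
    """The random variable: a rule that maps an outcome to a number."""
    return outcome.count("H")

# Count how many outcomes map to each value of the random variable,
# then convert the counts into probabilities.
counts = Counter(number_of_heads(o) for o in outcomes)
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X={x}) = {p}")
```

The function plays the role of the "rule" in the definition, while the dictionary pairs each value of the random variable with its probability.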
Probability distributions

Definitions

An empirical probability distribution is a list of every outcome of a random experiment with the corresponding probability.

A discrete probability distribution is the probability distribution of a discrete random variable. The Uniform, Bernoulli and Poisson distributions are discrete distributions.

A continuous probability distribution is the probability distribution of a continuous random variable: the outcomes can be any value from a continuous interval.

The discrete uniform probability distribution

In a discrete uniform distribution every outcome has the same probability (for a fair die, P(x) = 1/6 for each face). As always,

$$\sum P(x) = 1,$$

that is, the sum of the probabilities must be unity. This is an essential property of all probability distributions.

Graphs of a probability distribution

The outcomes (x) of the experiment are marked on the horizontal axis and the probabilities (y) on the vertical axis. A probability histogram depicts probability as area for discrete distributions. The area of a probability histogram must be unity. Probability histograms are useful as a graphical tool to aid visualization of certain ideas, such as reading probabilities from tables and approximating discrete probabilities with continuous ones. For continuous random variables, however, it will be necessary to calculate probabilities in terms of area under the probability curve.

The Binomial probability distribution

The Binomial formula may be used to calculate probabilities when the following conditions are true:
1. There are only two mutually exclusive outcomes (success and failure) on each trial;
2. The probability of success is the same for each trial;
3. The trials are independent;
4. The number of trials is finite.

The Binomial probability distribution function (Binomial formula) is

$$P(x) = \binom{n}{x} p^{x} q^{n-x} = \frac{n!}{(n-x)!\,x!}\, p^{x} q^{n-x} = {}^{n}C_{x}\, p^{x} q^{n-x}$$

where n is the number of trials, x the number of successes, p the probability of success on a trial and q = 1 - p the probability of failure. ${}^{n}C_{x}$ gives the number of outcomes (or arrangements) that satisfy the condition of x successes out of n trials.

Example 1

An exam consists of four multiple-choice questions.
If each question has only one correct answer and there is a choice of four possible answers, calculate the probability that a student randomly selects the correct answer to (a) all four questions; (b) any three questions; (c) any two questions; (d) any one question; (e) none of the questions.

When there is a choice of four answers to a question, p = 0,25 and q = 0,75.

$$P(x=0) = \frac{4!}{(4-0)!\,0!}\, 0{,}25^{0}\, 0{,}75^{4} = 1 \times 0{,}3164 = 0{,}3164$$

$$P(x=1) = \frac{4!}{(4-1)!\,1!}\, 0{,}25^{1}\, 0{,}75^{3} = \frac{4\cdot 3\cdot 2\cdot 1}{3\cdot 2\cdot 1} \times 0{,}25 \times 0{,}75^{3} = 4 \times 0{,}25 \times 0{,}4219 = 0{,}4219$$

$$P(x=2) = \frac{4!}{(4-2)!\,2!}\, 0{,}25^{2}\, 0{,}75^{2} = \frac{4\cdot 3\cdot 2\cdot 1}{2\cdot 2} \times 0{,}0625 \times 0{,}5625 = 6 \times 0{,}0625 \times 0{,}5625 = 0{,}2109$$

$$P(x=3) = \frac{4!}{(4-3)!\,3!}\, 0{,}25^{3}\, 0{,}75^{1} = \frac{4\cdot 3\cdot 2\cdot 1}{3\cdot 2\cdot 1} \times 0{,}25^{3} \times 0{,}75 = 4 \times 0{,}015625 \times 0{,}75 = 0{,}0469$$

$$P(x=4) = \frac{4!}{(4-4)!\,4!}\, 0{,}25^{4}\, 0{,}75^{0} = 0{,}0039$$

The answers are therefore (a) 0,0039; (b) 0,0469; (c) 0,2109; (d) 0,4219; (e) 0,3164.

Example 2

It is known that one out of every five tax returns contains errors and is classified as faulty. An inspector randomly selects a sample of 20 tax returns. Calculate the probability that in the sample of 20 (i) seven are faulty; (ii) at most two are faulty.

p = 0,20 and q = 0,80.

$$P(x=7) = \frac{20!}{(20-7)!\,7!}\, 0{,}20^{7}\, 0{,}80^{13} = \frac{390700800}{5040} \times 0{,}0000128 \times 0{,}0549756 = 0{,}0545$$

The probability that at most two are faulty is P(x=0) + P(x=1) + P(x=2).

$$P(x=0) = 0{,}80^{20} = 0{,}0115$$

$$P(x=1) = \frac{20!}{(20-1)!\,1!}\, 0{,}20^{1}\, 0{,}80^{19} = 20 \times 0{,}20 \times 0{,}01441 = 0{,}0576$$

$$P(x=2) = \frac{20!}{(20-2)!\,2!}\, 0{,}20^{2}\, 0{,}80^{18} = \frac{20\cdot 19}{2} \times 0{,}20^{2} \times 0{,}80^{18} = 190 \times 0{,}04 \times 0{,}01801 = 0{,}1369$$

Hence P(x ≤ 2) = 0,0115 + 0,0576 + 0,1369 = 0,2060.

Discrete cumulative probability distributions and applications

In Lesson 1 a cumulative frequency was the sum of all frequencies up to and including the frequency for a given interval, say interval r: $\sum_{i=1}^{r} f_i$. Similarly, a cumulative probability is the sum of all the probabilities up to and including the probability P(x = r) of r successes. The cumulative probability is written as

$$P(x \le r) = \sum_{x=0}^{r} P(x).$$

A cumulative probability distribution is a list of every outcome x with the corresponding cumulative probability $P(x \le r)$, r = 0, 1, 2, ..., n.
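The Binomial formula and the cumulative sum defined above can be checked with a short Python sketch. The function names `binomial_pmf` and `binomial_cdf` are chosen here for illustration only; `math.comb` (Python 3.8 or later) supplies the number of arrangements.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x): probability of x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(r, n, p):
    """P(X <= r): sum of P(X = x) for x = 0, 1, ..., r."""
    return sum(binomial_pmf(x, n, p) for x in range(r + 1))

# Example 1: four questions, four possible answers each (p = 0.25).
example_1 = [round(binomial_pmf(x, 4, 0.25), 4) for x in range(5)]
print("Example 1, P(x=0..4):", example_1)

# Example 2: sample of 20 tax returns, p = 0.20.
p_seven = binomial_pmf(7, 20, 0.20)
p_at_most_two = binomial_cdf(2, 20, 0.20)
print(f"Example 2, P(x=7)  = {p_seven:.4f}")
print(f"Example 2, P(x<=2) = {p_at_most_two:.4f}")
```

Calling `binomial_cdf(r, 5, 0.20)` for r = 0 to 5 gives a cumulative distribution for n = 5 and p = 0,20, which can be compared with a printed table.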
Example: p = 0,20, n = 5

r         0       1       2       3       4       5
P(x=r)    0,3277  0,4096  0,2048  0,0512  0,0064  0,0003
P(x≤r)    0,3277  0,7373  0,9421  0,9933  0,9997  1,0000

The Poisson probability distribution

Binomial probability calculations require a finite sample size. There are many situations where a sample size does not feature. For example, sample size does not feature in the calculation of the probability of x emergency calls per hour or x faults per km of cable. In situations such as these (subject to the assumptions outlined below), a formula called the "Poisson probability formula" is used to calculate the probability of x occurrences of an event over a given interval of time or length (area, volume, etc.).

The Poisson formula is

$$P(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

where λ is the average number of occurrences per interval. This formula is used to calculate probabilities in situations where a rare, random event occurs at a uniform rate.

Example

An ambulance service receives an average of four calls in 1 hour during off-peak hours. Calculate the probability that during off-peak hours there are (a) three calls in 1 hour; (b) at most two calls in 1 hour; (c) one call in 30 minutes.

Assumptions

1. The average rate at which an event occurs per interval is uniform. For the ambulance service the average rate was given as λ = four calls in 1 hour. Since the rate is uniform, it may be halved to give an average of two calls in 30 minutes, doubled to give an average of eight calls in 2 hours, etc.
2. The number of calls is not influenced by what happened in the previous interval. So if there were no calls in 1 hour, this has no effect on the chance of x calls in the next hour. This property is often quoted as "the Poisson process has no memory".
3. There is practically no chance of more than one call arriving in a very short time. For example, the probability of one or more calls in a minute is only 0,0645 and in a second only 0,0011, so the probability of more than one call is smaller still. As the interval becomes smaller, the probability of more than one call approaches zero.
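Assumptions 1 and 3 can be illustrated numerically. The sketch below is only an illustration under the ambulance-service rate of four calls per hour; the names `poisson_pmf` and `prob_more_than_one` are introduced here and are not part of the lesson.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) when events occur at an average rate lam per interval."""
    return lam**x * exp(-lam) / factorial(x)

def prob_more_than_one(lam):
    """P(X > 1) = 1 - P(X = 0) - P(X = 1)."""
    return 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

calls_per_hour = 4.0
intervals_in_hours = {
    "1 hour": 1.0,
    "30 minutes": 0.5,
    "1 minute": 1 / 60,
    "1 second": 1 / 3600,
}

for label, hours in intervals_in_hours.items():
    # Assumption 1: a uniform rate scales in proportion to the interval length.
    lam = calls_per_hour * hours
    # Assumption 3: P(more than one call) shrinks as the interval shrinks.
    print(f"{label:>10}: lambda = {lam:.5f}, "
          f"P(more than one call) = {prob_more_than_one(lam):.7f}")
```

The printed probabilities fall steadily as the interval shortens, which is the behaviour assumption 3 describes.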
The Poisson probability distribution function is given by the formula

$$P(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

To use it we must know λ.

Example

Calculate the probability of (a) three calls in 1 hour; (b) at most two calls in 1 hour; (c) one call in 30 minutes; (d) one or more calls in 1 minute, and in 1 second.

a) x = 3, λ = 4 in 1 hour:

$$P(x=3) = \frac{4^{3} e^{-4}}{3!} = \frac{64 \times 0{,}01832}{6} = 0{,}1954$$

b) "At most two" means "0 OR 1 OR 2":

$$P(\text{at most two}) = \frac{4^{0} e^{-4}}{0!} + \frac{4^{1} e^{-4}}{1!} + \frac{4^{2} e^{-4}}{2!} = e^{-4}(1 + 4 + 8) = 0{,}2381$$

c) Since the rate is uniform, for 30 minutes λ = 2. Hence the probability of one call in 30 minutes is

$$P(x=1) = \frac{2^{1} e^{-2}}{1!} = 0{,}2707$$

d) The probability of one or more calls is P(x ≥ 1) = 1 - P(x = 0).

For 1 minute, λ = 4/60, hence

$$P(x \ge 1) = 1 - \frac{(4/60)^{0}\, e^{-4/60}}{0!} = 1 - 0{,}9355 = 0{,}0645$$

For 1 second, λ = 4/(60 × 60) = 4/3600, hence

$$P(x \ge 1) = 1 - \frac{(4/3600)^{0}\, e^{-4/3600}}{0!} = 1 - 0{,}9989 = 0{,}0011$$

Cumulative Poisson probabilities

Example: λ = 4

r         0       1       2       3       4       5       6       7
P(x=r)    0,0183  0,0733  0,1465  0,1954  0,1954  0,1563  0,1042  0,0595
P(x≤r)    0,0183  0,0916  0,2381  0,4335  0,6288  0,7851  0,8893  0,9489

The Normal probability distribution

The Normal curve was introduced in 1733 by De Moivre as an approximation to certain Binomial distributions. Gauss gave a rigorous account of its properties in 1809. The Normal distribution is a continuous probability distribution and it is the most important distribution in statistics. It will feature in the remainder of the text, not just in probability but also in statistical inference. It has long been recognized that large numbers of measurements, when sorted and plotted in a probability (relative frequency) histogram, tend to assume a bell-shaped form.

The equation of the Normal curve is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$

f(x) is called a probability density function. μ and σ are the mean and the standard deviation of the distribution. x can take any value between minus infinity and plus infinity. The value of μ determines the location of the curve; the value of σ determines the width of the curve.
The area under the curve is always unity, a property of any probability distribution. The probability that a random variable, x, has a value between x = a and x = b is given by the area under the curve between x = a and x = b. Areas under curves are usually calculated by integration:

$$P(a \le x \le b) = \int_{x=a}^{x=b} f(x)\,dx$$

Fortunately, for the Normal probability curve, it will not be necessary to use integration to calculate areas. The Normal curve has special characteristics which allow us to find the area from a single set of tables.

Special properties of the Normal distribution

1. The total area under the curve is one.
2. The curve is symmetrical about the mean. The area to the left of the mean is 0,5 and the area to the right of the mean is 0,5.
3. The area under the curve between the mean and any point x depends on the number of standard deviations between x and μ. For example, the area between the mean and a point which is one standard deviation (1 × σ) greater (or less) than the mean is 0,3413. The area between the mean and a point which is two standard deviations (2 × σ) greater (or less) than the mean is 0,4772.

Use the Normal probability tables to determine areas under the Normal curve.

Example

The time taken to complete a transaction at an ATM is normally distributed with a mean of 11 minutes and a standard deviation of 3 minutes. Use Normal probability tables to calculate the probability that a transaction will take between
a) (i) 11 and 14 minutes; (ii) 8 and 14 minutes;
b) (i) 11 and 17 minutes; (ii) 5 and 17 minutes;
c) (i) 11 and 20 minutes; (ii) 2 and 20 minutes.

Z is the number of standard deviations between a point x and the mean. In this example, where μ = 11 and σ = 3, the values of Z for parts (a), (b) and (c) are respectively +1 and -1; +2 and -2; +3 and -3. The probability between Z = -1 and Z = +1 is 0,6826. The probability between Z = -2 and Z = +2 is 0,9544. The probability between Z = -3 and Z = +3 is 0,9974.
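The areas quoted above can also be obtained without tables. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `prob_between` is introduced here for illustration.

```python
from statistics import NormalDist

# ATM example: mean 11 minutes, standard deviation 3 minutes.
atm = NormalDist(mu=11, sigma=3)

def prob_between(dist, a, b):
    """P(a <= X <= b) as a difference of two cumulative probabilities."""
    return dist.cdf(b) - dist.cdf(a)

# Areas within 1, 2 and 3 standard deviations of the mean.
for k in (1, 2, 3):
    low = atm.mean - k * atm.stdev
    high = atm.mean + k * atm.stdev
    print(f"between {low:g} and {high:g} minutes: "
          f"{prob_between(atm, low, high):.4f}")

# Area between the mean and one standard deviation above it.
print(f"between 11 and 14 minutes: {prob_between(atm, 11, 14):.4f}")
```

Because printed tables round the tail areas to four decimal places, the computed values may differ from table-based answers in the last decimal place.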
a) (i) Since μ = 11 and σ = 3, x = 14 is one standard deviation above the mean, so Z = +1. From the Normal probability tables the tail area from Z = 1 is 0,1587, so the area between 11 and 14 is 0,5 - 0,1587 = 0,3413.
(ii) The point x = 8 is three minutes, or one standard deviation, below the mean; Z = -1. Because of symmetry, areas equidistant on each side of the mean are equal. Since the area on the right is 0,3413, the area on the left is also 0,3413. The total area is 0,6826. The probability that a transaction will take between 8 and 14 minutes is 0,6826.

b) (i) Since μ = 11 and σ = 3, x = 17 is two standard deviations (six minutes) from the mean, hence Z = 2,00. From the Normal probability tables the tail area from Z = 2,00 is 0,0228. Hence the area between the mean and Z = 2,00 is 0,5 - 0,0228 = 0,4772. The probability that a transaction will take between 11 and 17 minutes is 0,4772.
(ii) By symmetry, the total area between Z = -2 and Z = 2 is 0,9544. The probability that a transaction will take between 5 and 17 minutes is 0,9544.

c) (i) Since μ = 11 and σ = 3, x = 20 is three standard deviations (nine minutes) from the mean, so Z = 3,00. From the Normal probability tables the tail area from Z = 3 is 0,0013. Hence the area between the mean and Z = 3 is 0,5 - 0,0013 = 0,4987. The probability that a transaction will take between 11 and 20 minutes is 0,4987.
(ii) By symmetry, the total area between Z = -3 and Z = 3 is 0,9974. The probability that a transaction will take between 2 and 20 minutes is 0,9974.

In most problems, the Z-values must be calculated by the formula:

$$Z = \frac{x - \mu}{\sigma}$$

Example

The time taken to process an email enquiry is normally distributed with a mean time of 500 seconds and a standard deviation of 10 seconds. What is the probability that a randomly selected email enquiry will be processed in (a) more than 505 seconds; (b) less than 485 seconds; (c) between 485 and 505 seconds?
a) For x = 505,

$$Z = \frac{x - \mu}{\sigma} = \frac{505 - 500}{10} = 0{,}50$$

When Z = 0,50 the area in the tail is 0,3085, which is the required probability.

b) For x = 485,

$$Z = \frac{x - \mu}{\sigma} = \frac{485 - 500}{10} = -1{,}50$$

When Z is negative, look up the tables for Z = +1,5; the tail area is 0,0668. By symmetry, the area below Z = -1,5 is the same as the area above Z = 1,5, that is 0,0668.

c) The area between the two points is what remains after removing both tails: 1 - 0,3085 - 0,0668 = 0,6247.

Method for calculating the limits that contain a given percentage of values under the Normal curve

In some situations it might be useful to know that it takes between P and Q seconds to process, say, 95% of enquiries. Calculating the values of P and Q is possible when the times are Normally distributed with known mean and standard deviation. The method for calculating the two limits, symmetrical about the mean, that contain a given percentage of all the data is set out as follows:

Step 1. Sketch a Normal curve, marking and labelling the given area and the tail areas.
Step 2. From the tail areas, look up the tables to find Z.
Step 3. From the value of Z, calculate the number of units (d) between x and the mean. The number of units is d = Zσ.
Step 4. Calculate the values of P and Q: Q = μ + Zσ, P = μ - Zσ.

Example

Within what time should 94% of the email enquiries be processed? Since 94% of the values lie below x, this leaves 6% in the upper tail. From the Normal probability tables, the value of Z corresponding to a tail area of 6% is Z = 1,56. Hence Q = 500 + 1,56(10) = 515,6 seconds.
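The email example and the four-step method can be sketched with `statistics.NormalDist` (Python 3.8 or later), whose `inv_cdf` method plays the role of the reverse table lookup in Step 2. The helper `symmetric_limits` and the variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

# Email example: mean 500 seconds, standard deviation 10 seconds.
emails = NormalDist(mu=500, sigma=10)

p_more_than_505 = 1 - emails.cdf(505)              # upper tail
p_less_than_485 = emails.cdf(485)                  # lower tail
p_between = emails.cdf(505) - emails.cdf(485)      # area between the tails

def symmetric_limits(dist, coverage):
    """Limits P and Q, symmetric about the mean, containing `coverage` of values."""
    tail = (1 - coverage) / 2                # area in each tail (Step 1)
    z = NormalDist().inv_cdf(1 - tail)       # Z for that tail area (Step 2)
    d = z * dist.stdev                       # distance from the mean (Step 3)
    return dist.mean - d, dist.mean + d      # P and Q (Step 4)

# One-sided limit: the time below which 94% of enquiries are processed.
q_94 = emails.inv_cdf(0.94)

# Two-sided limits containing 95% of processing times.
p_95, q_95 = symmetric_limits(emails, 0.95)

print(f"P(x > 505) = {p_more_than_505:.4f}")
print(f"P(x < 485) = {p_less_than_485:.4f}")
print(f"P(485 < x < 505) = {p_between:.4f}")
print(f"94% processed within {q_94:.1f} seconds")
print(f"95% processed between {p_95:.1f} and {q_95:.1f} seconds")
```

Table lookups round Z to two decimal places, so limits computed this way can differ slightly from table-based answers.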
In general, if α represents the area in both tails of a Normal curve (that is, α/2 is the area in each tail), then (1 - α) represents the area between the two tails, and (1 - α)100% of the area under the Normal curve will fall between

$$\mu - Z_{\alpha/2}\,\sigma \quad \text{and} \quad \mu + Z_{\alpha/2}\,\sigma$$

The standard Normal distribution

The method for calculating Normal probabilities required the calculation of the number of standard deviations (Z) between x and μ by the formula

$$Z = \frac{x - \mu}{\sigma}$$

This formula transforms (or rescales) any Normal probability distribution that has a mean μ and a standard deviation σ to the standard Normal distribution. The standard Normal distribution has a mean value μ = 0 and a standard deviation σ = 1. Hence the standard Normal tables (Z) are tables for the standard Normal distribution.

Sums or differences of Normal independent random variables

The distribution of the sums and differences of two normally distributed, independent random variables (NIRVs) features in many applications. The distribution of the difference between NIRVs is fundamental to making inferences about the difference between population means and proportions.

RULE

Suppose the random variable
X1 is Normally distributed with mean μ1 and standard deviation σ1, and
X2 is Normally distributed with mean μ2 and standard deviation σ2,
and that X1 and X2 are independent.
Then the sum of the random variables X1 and X2 is a random variable that is Normally distributed with mean μ = μ1 + μ2 and variance $\sigma^{2} = \sigma_{1}^{2} + \sigma_{2}^{2}$. The difference X1 - X2 is likewise Normally distributed, with mean μ1 - μ2 and the same variance $\sigma_{1}^{2} + \sigma_{2}^{2}$ (the variances add even when the variables are subtracted).

Expected values (mathematical expectations)

The expected value of a random variable is defined as its mean value. It could also be described as "the value that you can expect on average". The formula for calculating expected values follows directly from the formula for calculating the mean value of grouped data. The expected value is a very important mathematical tool in the theory of statistics. It is used in many applications, such as the calculation of expected profits and losses and risk analysis.
The mean value of a random variable

The expected value of the random variable X is

$$\mu = E(x) = \sum x\,P(x)$$

The expected value of any function (or formula) of the random variable is defined as

$$E(f(X)) = \sum f(x)\,P(x)$$

For example, $E(X^{2}) = \sum x^{2}\,P(x)$.

The variance of a random variable is

$$V(x) = \sigma^{2} = E\left[(x - \mu)^{2}\right] = \sum (x - \mu)^{2}\,P(x)$$

For continuous random variables

$$E(x) = \int_{-\infty}^{+\infty} x\,f(x)\,dx$$

$$V(x) = \int_{-\infty}^{+\infty} (x - \mu)^{2}\,f(x)\,dx$$
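The discrete formulas above can be applied directly to a small distribution. The sketch below uses the number of Heads in three coin tosses, assuming fair coins (probabilities 1/8, 3/8, 3/8, 1/8); the function name `expectation` is introduced here for illustration.

```python
from fractions import Fraction

# Number of Heads in three tosses; fair coins are an assumption of this sketch.
distribution = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

def expectation(dist, f=lambda x: x):
    """E(f(X)) = sum of f(x) * P(x) over every value x of the random variable."""
    return sum(f(x) * p for x, p in dist.items())

mu = expectation(distribution)                                  # E(X)
second_moment = expectation(distribution, lambda x: x**2)       # E(X^2)
variance = expectation(distribution, lambda x: (x - mu) ** 2)   # E[(X - mu)^2]

print(f"E(X) = {mu}, E(X^2) = {second_moment}, V(X) = {variance}")
```

Using `Fraction` keeps the arithmetic exact, so the identity V(X) = E(X²) - μ² can be confirmed without rounding error.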