Probability

The definition – probability of an event E

$$P[E] = \frac{n(E)}{n(S)} = \frac{n(E)}{N} = \frac{\text{no. of outcomes in } E}{\text{total no. of outcomes}}$$

Applies only to the special case when
1. The sample space has a finite no. of outcomes, and
2. Each outcome is equi-probable.
If this is not true, a more general definition of probability is required.

Summary of the Rules of Probability

The additive rule
$P[A \cup B] = P[A] + P[B] - P[A \cap B]$
and $P[A \cup B] = P[A] + P[B]$ if $A \cap B = \emptyset$

The rule for complements
For any event E, $P[\bar{E}] = 1 - P[E]$

Conditional probability
$$P[A \mid B] = \frac{P[A \cap B]}{P[B]}$$

The multiplicative rule of probability
$P[A \cap B] = P[A]\,P[B \mid A]$ if $P[A] \ne 0$
$P[A \cap B] = P[B]\,P[A \mid B]$ if $P[B] \ne 0$
and $P[A \cap B] = P[A]\,P[B]$ if A and B are independent. This is the definition of independence.

Counting techniques

Summary of counting results

Rule 1
$n(A_1 \cup A_2 \cup A_3 \cup \dots) = n(A_1) + n(A_2) + n(A_3) + \dots$ if the sets $A_1, A_2, A_3, \dots$ are pairwise mutually exclusive (i.e. $A_i \cap A_j = \emptyset$ for $i \ne j$)

Rule 2
$N = n_1 n_2$ = the number of ways that two operations can be performed in sequence if
– $n_1$ = the number of ways the first operation can be performed
– $n_2$ = the number of ways the second operation can be performed once the first operation has been completed.

Rule 3
$N = n_1 n_2 \cdots n_k$ = the number of ways the k operations can be performed in sequence if
– $n_1$ = the number of ways the first operation can be performed
– $n_i$ = the number of ways the ith operation can be performed once the first (i − 1) operations have been completed, i = 2, 3, … , k

Basic counting formulae
1. Orderings: $n!$ = the number of ways you can order n objects
2. Permutations: ${}_nP_k = \frac{n!}{(n-k)!}$ = the number of ways that you can choose k objects from n in a specific order
3. Combinations: ${}_nC_k = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ = the number of ways that you can choose k objects from n (order of selection irrelevant)

Applications to some counting problems
• The trick is to use the basic counting formulae together with the Rules
• We will illustrate this with examples
• Counting problems are not easy.
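As a quick illustration of the three basic counting formulae above, here is a minimal Python sketch. The function names `orderings`, `permutations` and `combinations` and the sample values n = 5, k = 2 are chosen here for illustration and do not come from the slides. The results are cross-checked against the standard library's `math.perm` and `math.comb` (available in Python 3.8 and later).

```python
import math

def orderings(n: int) -> int:
    """Number of ways to order n objects: n!"""
    return math.factorial(n)

def permutations(n: int, k: int) -> int:
    """nPk = n! / (n - k)!  (ordered choice of k objects from n)."""
    return math.factorial(n) // math.factorial(n - k)

def combinations(n: int, k: int) -> int:
    """nCk = n! / (k! (n - k)!)  (order of selection irrelevant)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Illustrative values only (not taken from the slides).
n, k = 5, 2
print(orderings(n))        # 5! = 120
print(permutations(n, k))  # 5!/3! = 20
print(combinations(n, k))  # 5!/(2! 3!) = 10

# Cross-check the hand-written formulae against the standard library.
assert permutations(n, k) == math.perm(n, k)
assert combinations(n, k) == math.comb(n, k)
```

Note that each permutation count equals the combination count times k!, since every unordered selection of k objects can be ordered in k! ways.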
The more you practice, the better your technique becomes.

Random Variables

Numerical quantities whose values are determined by the outcome of a random experiment.

Random variables are either
• Discrete
– Integer valued
– The possible values for X are integers
• Continuous
– The possible values for X are all real numbers
– Range over a continuum.

Examples
• Discrete
– A die is rolled and X = number of spots showing on the upper face.
– Two dice are rolled and X = total number of spots showing on the two upper faces.
– A coin is tossed n = 100 times and X = number of times the coin toss resulted in a head.
– We observe X, the number of hurricanes in the Caribbean from April 1 to September 30 for a given year.

Examples
• Continuous
– A person is selected at random from a population and X = weight of that individual.
– A patient who has received a kidney transplant is measured for his serum creatinine level, X, 7 days after transplant.
– A sample of n = 100 individuals is selected at random from a population (i.e. all samples of n = 100 have the same probability of being selected). X = the average weight of the 100 individuals.

The probability distribution of a random variable

A mathematical description of the possible values of the random variable together with the probabilities of those values.

The probability distribution of a discrete random variable is described by its probability function p(x).
p(x) = the probability that X takes on the value x.
This can be given in either a tabular form or in the form of an equation. It can also be displayed in a graph.

Example 1
• Discrete
– A die is rolled and X = number of spots showing on the upper face.

x      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6

Formula
– p(x) = 1/6 if x = 1, 2, 3, 4, 5, 6

Graphs
To plot a graph of p(x), draw bars of height p(x) above each value of x.

[Figure: bar graph of p(x) for rolling a die]

Example 2
– Two dice are rolled and X = total number of spots showing on the two upper faces.
x      2     3     4     5     6     7     8     9     10    11    12
p(x)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Formula:
$$p(x) = \begin{cases} \dfrac{x-1}{36} & x = 2, 3, 4, 5, 6 \\[2mm] \dfrac{13-x}{36} & x = 7, 8, 9, 10, 11, 12 \end{cases}$$

[Figure: bar graph of p(x) for rolling two dice, with the 36 possible outcomes for rolling two dice]

Comments:
Every probability function must satisfy:
1. The probability assigned to each value of the random variable must be between 0 and 1, inclusive: $0 \le p(x) \le 1$
2. The sum of the probabilities assigned to all the values of the random variable must equal 1: $\sum_x p(x) = 1$
3. $P[a \le X \le b] = \sum_{x=a}^{b} p(x) = p(a) + p(a+1) + \dots + p(b)$

Example
In baseball the number of individuals, X, on base when a home run is hit ranges in value from 0 to 3. The probability distribution is known and is given below:

x      0     1     2     3
p(x)   6/14  4/14  3/14  1/14

Note: This chart implies the only values x takes on are 0, 1, 2, and 3. If the random variable X is observed repeatedly, the probabilities, p(x), represent the proportion of times the value x appears in that sequence.

$P[\text{the random variable } X \text{ equals } 2] = p(2) = \frac{3}{14}$
$P[\text{the random variable } X \text{ is at least } 2] = p(2) + p(3) = \frac{3}{14} + \frac{1}{14} = \frac{4}{14}$

[Figure: bar graph "No. of persons on base when a home run is hit"; p(x) against # on base, with bar heights 0.429, 0.286, 0.214, 0.071 for x = 0, 1, 2, 3]

Discrete Random Variables

Discrete Random Variable: A random variable usually assuming an integer value.
• A discrete random variable assumes values that are isolated points along the real line.
That is, neighbouring values are not “possible values” for a discrete random variable.

Note: Usually associated with counting
• The number of times a head occurs in 10 tosses of a coin
• The number of auto accidents occurring on a weekend
• The size of a family

Continuous Random Variables

Continuous Random Variable: A quantitative random variable that can vary over a continuum.
• A continuous random variable can assume any value along a line interval, including every possible value between any two points on the line.

Note: Usually associated with a measurement
• Blood pressure
• Weight gain
• Height

Probability Distributions of Continuous Random Variables

Probability Density Function
The probability distribution of a continuous random variable is described by a probability density curve f(x).
Notes:
The total area under the probability density curve is 1.
The area under the probability density curve from a to b is P[a < X < b].

[Figure: Normal probability distribution (bell-shaped curve); the area under the curve between a and b is P(a ≤ x ≤ b)]

Mean and Variance (standard deviation) of a Discrete Probability Distribution
• Describe the centre and spread of a probability distribution.
• The mean (denoted by the Greek letter μ (mu)) measures the centre of the distribution.
• The variance (σ²) and the standard deviation (σ) measure the spread of the distribution. σ (sigma) is the Greek letter for s.

Mean of a Discrete Random Variable
• The mean, μ, of a discrete random variable x is found by multiplying each possible value of x by its own probability and then adding all the products together:

$$\mu = \sum_x x\,p(x) = x_1 p(x_1) + x_2 p(x_2) + \dots + x_k p(x_k)$$

Notes:
The mean is a weighted average of the values of X.
The mean is the long-run average value of the random variable.
The mean is the centre of gravity of the probability distribution of the random variable.

[Figure: bar graph of a probability distribution on x = 1, …, 11]

Variance and Standard Deviation

Variance of a Discrete Random Variable: The variance, σ², of a discrete random variable x is found by multiplying each possible value of the squared deviation from the mean, (x − μ)², by its own probability and then adding all the products together:

$$\sigma^2 = \sum_x (x-\mu)^2 p(x) = \sum_x x^2 p(x) - \Big[\sum_x x\,p(x)\Big]^2 = \sum_x x^2 p(x) - \mu^2$$

Standard Deviation of a Discrete Random Variable: The positive square root of the variance:

$$\sigma = \sqrt{\sigma^2}$$

Example
The number of individuals, X, on base when a home run is hit ranges in value from 0 to 3.

x      p(x)    x p(x)   x²   x² p(x)
0      0.429   0.000    0    0.000
1      0.286   0.286    1    0.286
2      0.214   0.429    4    0.857
3      0.071   0.214    9    0.643
Total  1.000   0.929         1.786

• Computing the mean:
$\mu = \sum_x x\,p(x) = 0.929$
Note:
• 0.929 is the long-run average value of the random variable
• 0.929 is the centre of gravity of the probability distribution of the random variable

• Computing the variance:
$\sigma^2 = \sum_x x^2 p(x) - \Big[\sum_x x\,p(x)\Big]^2 = 1.786 - (0.929)^2 = 0.923$

• Computing the standard deviation:
$\sigma = \sqrt{\sigma^2} = \sqrt{0.923} = 0.961$
The Binomial distribution

An important discrete distribution

Situation in which the binomial distribution arises
• We have a random experiment that has two outcomes
– Success (S) and Failure (F)
– p = P[S], q = 1 − p = P[F]
• The random experiment is repeated n times independently
• X = the number of times S occurs in the n repetitions
• Then X has a binomial distribution

Example
• A coin is tossed n = 20 times
– X = the number of heads
– Success (S) = {head}, Failure (F) = {tail}
– p = P[S] = 0.50, q = 1 − p = P[F] = 0.50
• An eye operation has an 85% chance of success. It is performed n = 100 times
– X = the number of successes (S)
– p = P[S] = 0.85, q = 1 − p = P[F] = 0.15
• In a large population 30% support the death penalty. A sample of n = 50 individuals is selected at random
– X = the number who support the death penalty (S)
– p = P[S] = 0.30, q = 1 − p = P[F] = 0.70

The Binomial distribution
1. We have an experiment with two outcomes – Success (S) and Failure (F).
2. Let p denote the probability of S (Success).
3. In this case q = 1 − p denotes the probability of Failure (F).
4. This experiment is repeated n times independently.
5. X denotes the number of successes occurring in the n repetitions.

The possible values of X are 0, 1, 2, 3, 4, … , (n − 2), (n − 1), n and p(x) for any of the above values of x is given by:

$$p(x) = \binom{n}{x} p^x (1-p)^{n-x} = \binom{n}{x} p^x q^{n-x}$$

X is said to have the Binomial distribution with parameters n and p.

Summary:
X is said to have the Binomial distribution with parameters n and p.
1.
X is the number of successes occurring in the n repetitions of a Success-Failure Experiment.
2. The probability of success is p.
3. The probability function is
$$p(x) = \binom{n}{x} p^x (1-p)^{n-x}$$

Example:
1. A coin is tossed n = 5 times. X is the number of heads occurring in the 5 tosses of the coin. In this case p = ½ and

$$p(x) = \binom{5}{x} \left(\tfrac{1}{2}\right)^x \left(\tfrac{1}{2}\right)^{5-x} = \binom{5}{x} \left(\tfrac{1}{2}\right)^5 = \frac{1}{32}\binom{5}{x}$$

x      0     1     2      3      4     5
p(x)   1/32  5/32  10/32  10/32  5/32  1/32

Note: $\binom{5}{x} = \frac{5!}{x!\,(5-x)!}$, so that

$\binom{5}{0} = \frac{5!}{0!\,5!} = 1$
$\binom{5}{1} = \frac{5!}{1!\,4!} = 5$
$\binom{5}{2} = \frac{5!}{2!\,3!} = \frac{5 \cdot 4}{2 \cdot 1} = 10$
$\binom{5}{3} = \frac{5!}{3!\,2!} = \frac{5 \cdot 4}{2 \cdot 1} = 10$
$\binom{5}{4} = \frac{5!}{4!\,1!} = 5$
$\binom{5}{5} = \frac{5!}{5!\,0!} = 1$

[Figure: bar graph of p(x) against number of heads]

Computing the summary parameters for the distribution – μ, σ², σ

x      p(x)      x p(x)   x²   x² p(x)
0      0.03125   0.000    0    0.000
1      0.15625   0.156    1    0.156
2      0.31250   0.625    4    1.250
3      0.31250   0.938    9    2.813
4      0.15625   0.625    16   2.500
5      0.03125   0.156    25   0.781
Total  1.000     2.500         7.500

• Computing the mean:
$\mu = \sum_x x\,p(x) = 2.5$

• Computing the variance:
$\sigma^2 = \sum_x x^2 p(x) - \Big[\sum_x x\,p(x)\Big]^2 = 7.5 - (2.5)^2 = 1.25$

• Computing the standard deviation:
$\sigma = \sqrt{\sigma^2} = \sqrt{1.25} = 1.118$

Example:
• A surgeon performs a difficult operation n = 10 times.
• X is the number of times that the operation is a success.
• The success rate for the operation is 80%. In this case p = 0.80 and
• X has a Binomial distribution with n = 10 and p = 0.80.
$$p(x) = \binom{10}{x} (0.80)^x (0.20)^{10-x}$$

Computing p(x) for x = 0, 1, 2, 3, … , 10

x    p(x)        x    p(x)
0    0.0000      6    0.0881
1    0.0000      7    0.2013
2    0.0001      8    0.3020
3    0.0008      9    0.2684
4    0.0055      10   0.1074
5    0.0264

[Figure: bar graph of p(x) against number of successes, x]

Computing the summary parameters for the distribution – μ, σ², σ

x      p(x)     x p(x)   x²    x² p(x)
0      0.0000   0.000    0     0.000
1      0.0000   0.000    1     0.000
2      0.0001   0.000    4     0.000
3      0.0008   0.002    9     0.007
4      0.0055   0.022    16    0.088
5      0.0264   0.132    25    0.661
6      0.0881   0.528    36    3.171
7      0.2013   1.409    49    9.865
8      0.3020   2.416    64    19.327
9      0.2684   2.416    81    21.743
10     0.1074   1.074    100   10.737
Total  1.000    8.000          65.600

• Computing the mean:
$\mu = \sum_x x\,p(x) = 8.0$

• Computing the variance:
$\sigma^2 = \sum_x x^2 p(x) - \Big[\sum_x x\,p(x)\Big]^2 = 65.6 - (8.0)^2 = 1.60$

• Computing the standard deviation:
$\sigma = \sqrt{\sigma^2} = \sqrt{1.60} = 1.265$

Notes
The values of many binomial probabilities are found in Tables posted on the Stats 245 site.
The value that is tabulated for n = 1, 2, 3, …, 20; 25 and various values of p is:

$$P[X \le c] = \sum_{x=0}^{c} \binom{n}{x} p^x (1-p)^{n-x} = \sum_{x=0}^{c} p(x) = p(0) + p(1) + p(2) + \dots + p(c)$$

Hence
p(c) = [Tabled value for c] − [Tabled value for c − 1]

The other table tabulates p(x).
Thus when using this table you will have to sum up the values to obtain a cumulative probability.

Example (table for n = 8)
Suppose n = 8 and p = 0.70 and we want to compute P[X = 5] = p(5).

       p
c      0.05   0.10   0.20   0.30   0.40   0.50   0.60   0.70   0.80   0.90   0.95
0      0.663  0.430  0.168  0.058  0.017  0.004  0.001  0.000  0.000  0.000  0.000
1      0.943  0.813  0.503  0.255  0.106  0.035  0.009  0.001  0.000  0.000  0.000
2      0.994  0.962  0.797  0.552  0.315  0.145  0.050  0.011  0.001  0.000  0.000
3      1.000  0.995  0.944  0.806  0.594  0.363  0.174  0.058  0.010  0.000  0.000
4      1.000  1.000  0.990  0.942  0.826  0.637  0.406  0.194  0.056  0.005  0.000
5      1.000  1.000  0.999  0.989  0.950  0.855  0.685  0.448  0.203  0.038  0.006
6      1.000  1.000  1.000  0.999  0.991  0.965  0.894  0.745  0.497  0.187  0.057
7      1.000  1.000  1.000  1.000  0.999  0.996  0.983  0.942  0.832  0.570  0.337
8      1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000

The table value for n = 8, p = 0.70 and c = 5 is 0.448 = P[X ≤ 5].
P[X = 5] = p(5) = P[X ≤ 5] − P[X ≤ 4] = 0.448 − 0.194 = 0.254

We can also compute Binomial probabilities using Excel.
The function =BINOMDIST(x, n, p, FALSE) will compute p(x).
The function =BINOMDIST(c, n, p, TRUE) will compute

$$P[X \le c] = \sum_{x=0}^{c} \binom{n}{x} p^x (1-p)^{n-x} = \sum_{x=0}^{c} p(x) = p(0) + p(1) + p(2) + \dots + p(c)$$

Mean, Variance and Standard Deviation of Binomial Random Variables

Recall that the mean, μ, of a discrete random variable x is
$$\mu = \sum_x x\,p(x) = x_1 p(x_1) + x_2 p(x_2) + \dots + x_k p(x_k)$$
the variance, σ², is
$$\sigma^2 = \sum_x (x-\mu)^2 p(x) = \sum_x x^2 p(x) - \Big[\sum_x x\,p(x)\Big]^2 = \sum_x x^2 p(x) - \mu^2$$
and the standard deviation is the positive square root of the variance, $\sigma = \sqrt{\sigma^2}$.

The Binomial distribution
X is said to have the Binomial distribution with parameters n and p.
1. X is the number of successes occurring in the n repetitions of a Success-Failure Experiment.
2.
The probability of success is p.
3. The probability function is
$$p(x) = \binom{n}{x} p^x (1-p)^{n-x}$$

Mean, Variance & Standard Deviation of the Binomial Distribution
• The mean, variance and standard deviation of the binomial distribution can be found by using the following three formulas:
1. $\mu = np$
2. $\sigma^2 = npq = np(1-p)$
3. $\sigma = \sqrt{npq} = \sqrt{np(1-p)}$

Example: Find the mean and standard deviation of the binomial distribution when n = 20 and p = 0.75.

Solutions:
1) n = 20, p = 0.75, q = 1 − 0.75 = 0.25
$\mu = np = (20)(0.75) = 15$
$\sigma = \sqrt{npq} = \sqrt{(20)(0.75)(0.25)} = \sqrt{3.75} = 1.936$

2) These values can also be calculated using the probability function:
$$p(x) = \binom{20}{x} (0.75)^x (0.25)^{20-x} \quad \text{for } x = 0, 1, 2, \dots, 20$$

Table of probabilities

x      p(x)     x p(x)   x²    x² p(x)
0      0.0000   0.000    0     0.000
1      0.0000   0.000    1     0.000
2      0.0000   0.000    4     0.000
3      0.0000   0.000    9     0.000
4      0.0000   0.000    16    0.000
5      0.0000   0.000    25    0.000
6      0.0000   0.000    36    0.001
7      0.0002   0.001    49    0.008
8      0.0008   0.006    64    0.048
9      0.0030   0.027    81    0.244
10     0.0099   0.099    100   0.992
11     0.0271   0.298    121   3.274
12     0.0609   0.731    144   8.768
13     0.1124   1.461    169   18.997
14     0.1686   2.361    196   33.047
15     0.2023   3.035    225   45.525
16     0.1897   3.035    256   48.559
17     0.1339   2.276    289   38.696
18     0.0669   1.205    324   21.691
19     0.0211   0.402    361   7.632
20     0.0032   0.063    400   1.268
Total  1.000    15.000         228.750

• Computing the mean:
$\mu = \sum_x x\,p(x) = 15.0$

• Computing the variance:
$\sigma^2 = \sum_x x^2 p(x) - \Big[\sum_x x\,p(x)\Big]^2 = 228.75 - (15.0)^2 = 3.75$

• Computing the standard deviation:
$\sigma = \sqrt{\sigma^2} = \sqrt{3.75} = 1.936$

[Figure: histogram of p(x) against no. of successes, with μ and σ marked]
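The binomial example with n = 20 and p = 0.75 can be checked numerically. The sketch below is one possible way to do it in Python; the helper names `binomial_pmf`, `binomial_cdf` and `mean_and_variance` are hypothetical and are not part of the course material. It evaluates the probability function directly, computes the mean and variance by summation, and compares them with np and npq. It also repeats the earlier table lookup P[X = 5] for n = 8, p = 0.70 as a difference of two cumulative probabilities.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """p(x) = C(n, x) p^x (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(c: int, n: int, p: float) -> float:
    """P[X <= c] = p(0) + p(1) + ... + p(c)."""
    return sum(binomial_pmf(x, n, p) for x in range(c + 1))

def mean_and_variance(n: int, p: float) -> tuple[float, float]:
    """Mean and variance by direct summation over x = 0, ..., n."""
    mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
    second_moment = sum(x**2 * binomial_pmf(x, n, p) for x in range(n + 1))
    return mean, second_moment - mean**2

n, p = 20, 0.75
mean, variance = mean_and_variance(n, p)
print(f"summation: mean={mean:.3f} variance={variance:.3f} sd={math.sqrt(variance):.3f}")
print(f"formulas:  mean={n * p:.3f} variance={n * p * (1 - p):.3f}")

# P[X = 5] for n = 8, p = 0.70 as a difference of cumulative probabilities.
p5 = binomial_cdf(5, 8, 0.70) - binomial_cdf(4, 8, 0.70)
print(f"P[X = 5] = {p5:.3f}")
```

Both routes should agree up to floating-point rounding, which is the point of the shortcut formulas: np and npq avoid building the full table.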
Normal Probability Distributions

[Figure: bell-shaped density curve; the area under the curve between a and b is P(a ≤ x ≤ b)]

• The normal probability distribution is the most important distribution in all of statistics.
• Many continuous random variables have normal or approximately normal distributions.

The Normal Probability Distribution

[Figure: normal curve with points of inflection marked; horizontal axis labelled μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]

Main characteristics of the Normal Distribution
• Bell shaped, symmetric.
• Points of inflection on the bell-shaped curve are at μ − σ and μ + σ, that is, one standard deviation from the mean.
• Area under the bell-shaped curve between μ − σ and μ + σ is approximately 2/3.
• Area under the bell-shaped curve between μ − 2σ and μ + 2σ is approximately 95%.

There are many Normal distributions, depending on μ and σ.

[Figure: three normal densities f(x) for x from 0 to 200: Normal μ = 100, σ = 20; Normal μ = 100, σ = 40; Normal μ = 140, σ = 20]

The Standard Normal Distribution
μ = 0, σ = 1

[Figure: standard normal density for z from −3 to 3]

• There are infinitely many normal probability distributions (differing in μ and σ).
• Area under the Normal distribution with mean μ and standard deviation σ can be converted to area under the standard normal distribution.
• If X has a Normal distribution with mean μ and standard deviation σ, then
$$z = \frac{X - \mu}{\sigma}$$
has a standard normal distribution.
• z is called the standard score (z-score) of X.
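The z-score conversion and the "2/3" and "95%" areas quoted above can be illustrated with a short Python sketch. It assumes the standard identity Φ(z) = ½[1 + erf(z/√2)] for the standard normal cumulative area, which is not stated in the slides; the function names `standard_normal_cdf` and `normal_probability` are hypothetical, and μ = 100, σ = 20 is just one of the normal distributions mentioned above.

```python
import math

def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_probability(a: float, b: float, mu: float, sigma: float) -> float:
    """P[a <= X <= b] for X normal with mean mu and standard deviation sigma.

    The limits are converted to z-scores, then the area is taken
    under the standard normal distribution.
    """
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return standard_normal_cdf(z_b) - standard_normal_cdf(z_a)

mu, sigma = 100.0, 20.0
within_one_sd = normal_probability(mu - sigma, mu + sigma, mu, sigma)
within_two_sd = normal_probability(mu - 2 * sigma, mu + 2 * sigma, mu, sigma)
print(f"within 1 sd: {within_one_sd:.4f}")  # roughly 2/3
print(f"within 2 sd: {within_two_sd:.4f}")  # roughly 95%
```

Because only the z-scores enter the calculation, the same two areas result for any choice of μ and σ.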
Converting area under the Normal distribution with mean μ and standard deviation σ to area under the standard normal distribution

Perform the z-transformation
$$z = \frac{X - \mu}{\sigma}$$
then
$$P[a \le X \le b] = P\left[\frac{a-\mu}{\sigma} \le \frac{X-\mu}{\sigma} \le \frac{b-\mu}{\sigma}\right] = P\left[\frac{a-\mu}{\sigma} \le z \le \frac{b-\mu}{\sigma}\right]$$

The left-hand side is an area under the Normal distribution with mean μ and standard deviation σ; the right-hand side is an area under the standard normal distribution.

[Figure: the area P[a ≤ X ≤ b] under the Normal distribution with mean μ and standard deviation σ, and the equal area between (a − μ)/σ and (b − μ)/σ under the standard normal distribution]

Using the tables for the Standard Normal distribution

Table, posted on the Stats 245 web site
• The table contains the area under the standard normal curve between −∞ and a specific value of z.

Example
Find the area under the standard normal curve between z = −∞ and z = 1.45.
[Table: a portion of Table 3; the row z = 1.4 and the column 0.05 give the tabled area 0.9265]
P(z < 1.45) = 0.9265

Example
Find the area to the left of −0.98; P(z < −0.98)
P(z < −0.98) = 0.1635

Example
Find the area under the normal curve to the right of z = 1.45; P(z > 1.45)
P(z > 1.45) = 1.0000 − 0.9265 = 0.0735

Example
Find the area between z = 0 and z = 1.45; P(0 < z < 1.45)
P(0 < z < 1.45) = 0.9265 − 0.5000 = 0.4265
• Area between two points = difference in two tabled areas

Notes
Use the fact that the area above zero and the area below zero are each 0.5000.
When finding normal distribution probabilities, a sketch is always helpful.

Example: Find the area between the mean (z = 0) and z = −1.26
P(−1.26 < z < 0) = 0.5000 − 0.1038 = 0.3962

Example: Find the area between z = −2.30 and z = 1.80
P(−2.30 < z < 1.80) = 0.9641 − 0.0107 = 0.9534

Example: Find the area between z = −1.40 and z = −0.50
P(−1.40 < z < −0.50) = 0.3085 − 0.0808 = 0.2277

Computing areas under the general Normal distributions (mean μ, standard deviation σ)

Approach:
1. Convert the random variable, X, to its z-score.
$$z = \frac{X - \mu}{\sigma}$$
2.
Convert the limits on the random variable, X, to their z-scores.
3. Convert area under the distribution of X to area under the standard normal distribution.
$$P[a \le X \le b] = P\left[\frac{a-\mu}{\sigma} \le z \le \frac{b-\mu}{\sigma}\right]$$

Example 1: Suppose a man aged 40-45 is selected at random from a population.
• X is the blood pressure of the man.
• X is a random variable.
• Assume that X has a Normal distribution with mean μ = 180 and standard deviation σ = 15.
• Suppose that we are interested in the probability that X is between 170 and 210.

Let
$z = \frac{X - \mu}{\sigma} = \frac{X - 180}{15}$
$\frac{a - \mu}{\sigma} = \frac{170 - 180}{15} = -0.667$
$\frac{b - \mu}{\sigma} = \frac{210 - 180}{15} = 2.000$

Hence
$P[170 \le X \le 210] = P[-0.667 \le z \le 2.000] \approx 0.977 - 0.252 = 0.725$

[Figure: probability density of X with the area between 170 and 210 shaded, and the corresponding area under the standard normal curve]

Example 2
A bottling machine is adjusted to fill bottles with a mean of 32.0 oz of soda and a standard deviation of 0.02 oz. Assume the amount of fill is normally distributed and a bottle is selected at random:
1) Find the probability the bottle contains between 32.00 oz and 32.025 oz
2) Find the probability the bottle contains more than 31.97 oz

Solution, part 1)
When x = 32.00: $z = \frac{32.00 - \mu}{\sigma} = \frac{32.00 - 32.0}{0.02} = 0.00$
When x = 32.025: $z = \frac{32.025 - \mu}{\sigma} = \frac{32.025 - 32.0}{0.02} = 1.25$

[Figure: area asked for, between x = 32.0 (z = 0) and x = 32.025 (z = 1.25)]

$$P(32.0 < X < 32.025) = P\left(\frac{32.0 - 32.0}{0.02} < \frac{X - 32.0}{0.02} < \frac{32.025 - 32.0}{0.02}\right) = P(0 < z < 1.25) = 0.3944$$

Example 2, part 2)

[Figure: area asked for, to the right of x = 31.97 (z = −1.50)]

$$P(x > 31.97) = P\left(\frac{x - 32.0}{0.02} > \frac{31.97 - 32.0}{0.02}\right) = P(z > -1.50) = 1.0000 - 0.0668 = 0.9332$$

Summary

Random Variables
Numerical quantities whose values are determined by the outcome of a random experiment.

Types of Random Variables
• Discrete: possible values are integers
• Continuous: possible values vary over a continuum

The probability distribution of a random variable
A mathematical description of the possible values of the random variable together with the probabilities of those values.

The probability distribution of a discrete random variable is described by its probability function p(x).
p(x) = the probability that X takes on the value x.
[Figure: bar graph of p(x) against number of successes, x]

The Binomial distribution
X is said to have the Binomial distribution with parameters n and p.
1. X is the number of successes occurring in the n repetitions of a Success-Failure Experiment.
2. The probability of success is p.
3. The probability function is
$$p(x) = \binom{n}{x} p^x (1-p)^{n-x}$$

Probability Distributions of Continuous Random Variables

Probability Density Function
The probability distribution of a continuous random variable is described by a probability density curve f(x).
Notes:
The total area under the probability density curve is 1.
The area under the probability density curve from a to b is P[a < X < b].

The Normal Probability Distribution

[Figure: normal curve with points of inflection marked; horizontal axis labelled μ − 3σ, μ − 2σ, μ − σ, μ, μ + σ, μ + 2σ, μ + 3σ]

Normal approximation to the Binomial distribution

Using the Normal distribution to calculate Binomial probabilities

[Figure: Binomial distribution with n = 20, p = 0.70 (bars) and the approximating Normal distribution with μ = np = 14 and σ = √(npq) = 2.049 (curve)]

Normal Approximation to the Binomial distribution

$$P[X = a] \approx P\left[a - \tfrac{1}{2} \le Y \le a + \tfrac{1}{2}\right]$$

• X has a Binomial distribution with parameters n and p
• Y has a Normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$
• The ½ is the continuity correction

[Figure: the bar of the Binomial distribution at a, of height P[X = a], and the area under the approximating Normal curve between a − ½ and a + ½]

Example
• X has a Binomial distribution with parameters n = 20 and p = 0.70
We want P[X = 13].

The exact value:
$$P[X = 13] = \binom{20}{13} (0.70)^{13} (0.30)^{7} = 0.1643$$

Using the Normal approximation to the Binomial distribution:
$$P[X = 13] \approx P\left[12\tfrac{1}{2} \le Y \le 13\tfrac{1}{2}\right]$$
where Y has a Normal distribution with
$\mu = np = 20(0.70) = 14$
$\sigma = \sqrt{npq} = \sqrt{20(0.70)(0.30)} = 2.049$

Hence
$$P[12.5 \le Y \le 13.5] = P\left[\frac{12.5 - 14}{2.049} \le \frac{Y - 14}{2.049} \le \frac{13.5 - 14}{2.049}\right] = P[-0.73 \le Z \le -0.24] = 0.4052 - 0.2327 = 0.1725$$
Compare with 0.1643.

Normal Approximation to the Binomial distribution

$$P[a \le X \le b] = p(a) + p(a+1) + \dots + p(b) \approx P\left[a - \tfrac{1}{2} \le Y \le b + \tfrac{1}{2}\right]$$

• X has a Binomial distribution with parameters n and p
• Y has a Normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$
• The ½ is the continuity correction

[Figure: the bars of the Binomial distribution from a to b, with total height P[a ≤ X ≤ b], and the area under the approximating Normal curve between a − ½ and b + ½]

Example
• X has a Binomial distribution with parameters n = 20 and p = 0.70
We want P[11 ≤ X ≤ 14].

The exact value:
$$P[11 \le X \le 14] = p(11) + p(12) + p(13) + p(14) = \binom{20}{11} (0.70)^{11} (0.30)^{9} + \dots + \binom{20}{14} (0.70)^{14} (0.30)^{6}$$
$$= 0.0654 + 0.1144 + 0.1643 + 0.1916 = 0.5357$$

Using the Normal approximation to the Binomial distribution:
$$P[11 \le X \le 14] \approx P\left[10\tfrac{1}{2} \le Y \le 14\tfrac{1}{2}\right]$$
where Y has a Normal distribution with
$\mu = np = 20(0.70) = 14$
$\sigma = \sqrt{npq} = \sqrt{20(0.70)(0.30)} = 2.049$

Hence
$$P[10.5 \le Y \le 14.5] = P\left[\frac{10.5 - 14}{2.049} \le \frac{Y - 14}{2.049} \le \frac{14.5 - 14}{2.049}\right] = P[-1.71 \le Z \le 0.24] = 0.5948 - 0.0436 = 0.5512$$
Compare with 0.5357.

Comment:
• The accuracy of the normal approximation to the binomial increases with increasing values of n.

Example
• The success rate for an eye operation is 85%
• The operation is performed n = 2000 times
Find the probability that
1. The number of successful operations is between 1680 and 1720 (inclusive).
2. The number of successful operations is at most 1800.

Solution
• X has a Binomial distribution with parameters n = 2000 and p = 0.85
We want
$$P[1680 \le X \le 1720] \approx P[1679.5 \le Y \le 1720.5]$$
where Y has a Normal distribution with
$\mu = np = 2000(0.85) = 1700$
$\sigma = \sqrt{npq} = \sqrt{2000(0.85)(0.15)} = 15.969$

Hence
$$P[1680 \le X \le 1720] \approx P\left[\frac{1679.5 - 1700}{15.969} \le \frac{Y - 1700}{15.969} \le \frac{1720.5 - 1700}{15.969}\right] = P[-1.28 \le Z \le 1.28] = 0.9004 - 0.0996 = 0.8008$$

Solution – part 2.
We want
$$P[X \le 1800] \approx P[Y \le 1800.5] = P\left[\frac{Y - 1700}{15.969} \le \frac{1800.5 - 1700}{15.969}\right] = P[Z \le 6.29] = 1.000$$

Next topic: Sampling Theory
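The continuity-corrected approximation used in the n = 20, p = 0.70 examples can be compared with the exact binomial values in a few lines of Python. This is a sketch under stated assumptions rather than the course's own method: the function names are hypothetical, and the standard normal area is computed from the error function instead of being read from the printed table, so the approximate values may differ from the slide values in the third decimal place (the slides round z to two decimals).

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial probability p(x) = C(n, x) p^x (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_approximation(a: int, b: int, n: int, p: float) -> float:
    """Approximate P[a <= X <= b] by P[a - 1/2 <= Y <= b + 1/2],
    where Y is normal with mean np and standard deviation sqrt(npq)."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    z_low = (a - 0.5 - mu) / sigma   # continuity correction on the lower limit
    z_high = (b + 0.5 - mu) / sigma  # continuity correction on the upper limit
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)

n, p = 20, 0.70

exact_point = binomial_pmf(13, n, p)
approx_point = normal_approximation(13, 13, n, p)
print(f"P[X = 13]:        exact {exact_point:.4f}, approx {approx_point:.4f}")

exact_range = sum(binomial_pmf(x, n, p) for x in range(11, 15))
approx_range = normal_approximation(11, 14, n, p)
print(f"P[11 <= X <= 14]: exact {exact_range:.4f}, approx {approx_range:.4f}")
```

Increasing n in this sketch (for example n = 2000, p = 0.85) is a simple way to see the comment above in action: the gap between the exact and approximate values shrinks as n grows.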