CHAPTER 4
Probability Distributions
Probability Distribution (PD) of a Random variable (RV) – what values occur
and how often. A PD may be expressed by a table, graph, or formula.
Definition. The probability distribution of a discrete random variable is
a table, graph, formula, or other device used to specify all possible values of a
discrete random variable along with their respective probabilities.
Example.
 x      frequency   P(X = x)   P(X ≤ x)
−1         45         .15        .15
 0         90         .30        .45
 1         60         .20        .65
 2         45         .15        .80
 3         30         .10        .90
 4         30         .10       1.00
Total     300        1.00
This is a table based on experimental probability. For classical (theoretical)
probability, we would not need the second column. The third column here
gives the relative frequency of occurrence of the corresponding value of X.
Based on this third column, a relative frequency histogram follows:
[Figure: relative frequency histogram of P(X = x) from the table above.]
Note.
(1) 0 ≤ P(X = x) ≤ 1
(2) Σ_{all x} P(X = x) = 1
A graph of the cumulative probability distribution, taken from column 4 of
the table, called an ogive, follows:
[Figure: ogive of the cumulative distribution P(X ≤ x).]
(a) Find P(1 ≤ X ≤ 3).
P(1 ≤ X ≤ 3) = P(X = 1) + P(X = 2) + P(X = 3) = .20 + .15 + .10 = .45
or
P(1 ≤ X ≤ 3) = P(X ≤ 3) − P(X ≤ 0) = .90 − .45 = .45
(b) Find P(X > 2).
P(X > 2) = 1 − P(X ≤ 2) = 1 − .80 = .20
Mean and Variance of Discrete Probability Distributions
µ = Σ x p(x)
σ² = Σ (x − µ)² p(x) = Σ x² p(x) − µ²
Example.
µ = (−1)(.15) + 0(.30) + 1(.20) + 2(.15) + 3(.1) + 4(.1) = 1.05
σ² = 1(.15) + 0(.30) + 1(.2) + 4(.15) + 9(.10) + 16(.10) − 1.05² = 2.3475
σ = √2.3475 = 1.532
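A short Python sketch (not part of the original notes) that recomputes these values directly from the definitions:

# Mean, variance, and standard deviation of the discrete distribution above.
xs = [-1, 0, 1, 2, 3, 4]
ps = [.15, .30, .20, .15, .10, .10]

mu = sum(x * p for x, p in zip(xs, ps))              # sum of x*p(x)
var = sum(x**2 * p for x, p in zip(xs, ps)) - mu**2  # sum of x^2*p(x) - mu^2
sd = var ** 0.5

print(mu, var, sd)   # approximately 1.05, 2.3475, 1.532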
The Binomial Distribution – based on a process called the Bernoulli process.
Definition. A sequence of Bernoulli trials forms a Bernoulli process under
the following conditions:
(1) Each trial results in one of two possible, mutually exclusive outcomes.
One of the possible outcomes is denoted (arbitrarily) as a success, and
the other is denoted a failure.
(2) The probability of a success, denoted by p, remains constant from trial
to trial. The probability of a failure, 1 − p, is denoted by q.
(3) The trials are independent; that is, the outcome of any particular trial is
not affected by the outcome of any other trial.
Often, we let 1 = success and 0 = failure.
Example. Find P (two ones) in 4 rolls of a fair die.
P(1 0 0 1) = (1/6)(5/6)(5/6)(1/6) = (1/6)²(5/6)² = 25/1296
P(1 0 1 0) = (1/6)(5/6)(1/6)(5/6) = (1/6)²(5/6)² = 25/1296
In how many ways can we get our two successes? This is similar to picking a
committee of 2 from a group of 4.
Definition. A combination of n objects taken x at a time is an unordered
subset of x of the objects.
The number of combinations of n objects that can be formed by taking x of
them at a time is nCx (read n choose x) where
nCx = n! / (x!(n − x)!)
Example.
4C2 = 4!/(2!2!) = (4 · 3 · 2 · 1)/((2 · 1)(2 · 1)) = 6
20C15 = 20!/(15!5!) = (20 · 19 · 18 · 17 · 16 · 15!)/((5 · 4 · 3 · 2 · 1)(15!)) = 15,504
Also,
20C5 = 15,504
Why, in general, is nC(n−x) = nCx?
What about picking a representative and an alternate from our group of 4?
There are 4 · 3 = 12 ways. Order comes into play here.
Definition. A permutation of n objects taken x at a time is an ordered
subset of x of the n objects.
The number of permutations of n objects taken x at a time is
nPx = n(n − 1)(n − 2) · · · (n − x + 1) = n!/(n − x)!
Example.
4P2 = 4 · 3 = 4!/2! = 12
20P5 = 20 · 19 · 18 · 17 · 16 = 20!/15! = 1,860,480
Note.
nCx = nPx/x!
The order is divided out.
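These counts can be checked in Python; a small sketch using the standard-library functions math.comb and math.perm (Python 3.8 or later):

import math

# Combinations: order does not matter.
print(math.comb(4, 2))     # 6
print(math.comb(20, 15))   # 15504, the same as math.comb(20, 5)

# Permutations: order matters.
print(math.perm(4, 2))     # 12
print(math.perm(20, 5))    # 1860480

# The note above: nCx = nPx / x!  (the order is divided out)
print(math.perm(20, 5) // math.factorial(5) == math.comb(20, 5))   # True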
Binomial Distribution (formally)
f(x) = P(X = x) = nCx p^x q^(n−x) for x = 0, 1, 2, . . . , n, and 0 elsewhere.
Example.
f(2) = 4C2 p²q² = 6 (1/6)²(5/6)² = 25/216
We make a table for all the cases of n = 4 and p = 1/6.
Notation.
P(X = x | n, p) = P(X = x | 4, 1/6)

x   f(x) = P(X = x)                      P(X ≤ x)
0   4C0 (1/6)^0 (5/6)^4 = 625/1296       625/1296
1   4C1 (1/6)^1 (5/6)^3 = 125/324        125/144
2   4C2 (1/6)^2 (5/6)^2 = 25/216         425/432
3   4C3 (1/6)^3 (5/6)^1 = 5/324          1295/1296
4   4C4 (1/6)^4 (5/6)^0 = 1/1296         1
Note.
P(X = 2) = P(X ≤ 2) − P(X ≤ 1) = 425/432 − 125/144 = 25/216
P(1 ≤ X ≤ 3) = P(X ≤ 3) − P(X ≤ 0) = 1295/1296 − 625/1296 = 335/648
P(X > 2) = 1 − P(X ≤ 2) = 1 − 425/432 = 7/432
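A short Python sketch (an illustration, not part of the notes) that reproduces the table above, using Fraction so the probabilities stay in exact fraction form:

from math import comb
from fractions import Fraction

n, p = 4, Fraction(1, 6)
q = 1 - p

cum = Fraction(0)
for x in range(n + 1):
    f = comb(n, x) * p**x * q**(n - x)   # f(x) = nCx p^x q^(n-x)
    cum += f                             # running total gives P(X <= x)
    print(x, f, cum)
# 0 625/1296 625/1296
# 1 125/324 125/144
# 2 25/216 425/432
# 3 5/324 1295/1296
# 4 1/1296 1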
In the next problem, we use Table B in the Appendix and the following equations:
P(X = x | n, p > .5) = P(X = n − x | n, 1 − p)
P(X ≤ x | n, p > .5) = P(X ≥ n − x | n, 1 − p)
P(X ≥ x | n, p > .5) = P(X ≤ n − x | n, 1 − p)
Problem (4.3.8). n = 15, p = .75
(a)
P(X = 6 | 15, .75) =
P(X = 9 | 15, .25) =
P(X ≤ 9 | 15, .25) − P(X ≤ 8 | 15, .25) =
.9992 − .9958 = .0034
(b)
P(X ≥ 7 | 15, .75) =
1 − P(X ≤ 6 | 15, .75) =
1 − P(X ≥ 9 | 15, .25) =
P(X ≤ 8 | 15, .25) = .9958
(c)
P(X ≤ 5 | 15, .75) =
P(X ≥ 10 | 15, .25) =
1 − P(X ≤ 9 | 15, .25) =
1 − .9992 = .0008
(d)
P(6 ≤ X ≤ 9 | 15, .75) =
P(X ≤ 9 | 15, .75) − P(X ≤ 5 | 15, .75) =
P(X ≥ 6 | 15, .25) − P(X ≥ 10 | 15, .25) =
[1 − P(X ≤ 5 | 15, .25)] − [1 − P(X ≤ 9 | 15, .25)] =
P(X ≤ 9 | 15, .25) − P(X ≤ 5 | 15, .25) =
.9992 − .8516 = .1476
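When no table is at hand, the same four answers can be computed directly from the binomial formula; a sketch that also verifies one of the symmetry identities numerically:

from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(x, n, p):
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 15, .75
print(binom_pmf(6, n, p))                       # about .0034   (a)
print(1 - binom_cdf(6, n, p))                   # about .9958   (b), P(X >= 7)
print(binom_cdf(5, n, p))                       # about .0008   (c)
print(binom_cdf(9, n, p) - binom_cdf(5, n, p))  # about .1476   (d)

# Symmetry: P(X = 6 | 15, .75) = P(X = 9 | 15, .25)
print(abs(binom_pmf(6, 15, .75) - binom_pmf(9, 15, .25)) < 1e-12)   # True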
Properties of the Binomial Distribution
(1) Completely determined by n and p.
(2) The mean is µ = np.
Example. µ = 4(1/6) = 2/3
(3) The variance is σ² = np(1 − p) = npq.
Example. σ² = 4(1/6)(5/6) = 5/9 (a numerical check of (2) and (3) follows this list)
(4) Can be used for sampling from:
(a) infinite populations.
(b) finite populations with replacement (if n is small relative to N, say
N ≥ 10n, we can probably do without replacement).
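A quick numerical check of properties (2) and (3) for the n = 4, p = 1/6 example, summing over the probability distribution rather than quoting the formulas (a sketch, not part of the notes):

from math import comb

n, p = 4, 1/6
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

mu = sum(x * f for x, f in enumerate(pmf))
var = sum(x**2 * f for x, f in enumerate(pmf)) - mu**2

print(mu, n * p)              # both about 0.6667 (= 2/3)
print(var, n * p * (1 - p))   # both about 0.5556 (= 5/9)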
The Poisson Distribution – used extensively as a probability model in biology
and medicine.
Let x = the number of occurrences of some random event in an interval of time
or space (or some volume of matter).
Let λ = the average number of occurrences of the random event in the interval
(or volume). λ is called the parameter of the distribution.
Then
f(x) = P(X = x) = (λ^x e^(−λ))/x!, x = 0, 1, 2, 3, . . . .
The Poisson Process – the process underlying the Poisson distribution.
(1) The occurrences of the events are independent. The occurrence of an event
in an interval of space or time has no e↵ect on the probability of a second
occurrence of the event in the same, or any other, interval.
(2) Theoretically, an infinite number of occurrences of the event must be possible
in the interval.
(3) The probability of the single occurrence of the event in a given interval is
proportional to the length of the interval.
(4) In any infinitesimally small portion of the interval, the probability of more
than one occurrence of the event is infinitesimal.
Note.
f(x) ≥ 0 for all x and Σ_{all x} f(x) = 1 (an infinite series)
mean: µ = λ
variance: σ² = λ
Problem (4.4.4). We are given λ = .5 and we see Table C.
(a) P(X = 1 | .5) = P(X ≤ 1 | .5) − P(X ≤ 0 | .5) = .910 − .607 = .303
(b) P(X = 0 | .5) = .607
(c) P(X = 4 | .5) = P(X ≤ 4 | .5) − P(X ≤ 3 | .5) = 1.000 − .998 = .002
(d) P(X ≥ 1 | .5) = 1 − P(X = 0 | .5) = 1 − .607 = .393
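The Table C values used above can be reproduced directly from the Poisson formula; a short sketch (not part of the notes):

from math import exp, factorial

def pois_pmf(x, lam):
    return lam**x * exp(-lam) / factorial(x)

def pois_cdf(x, lam):
    return sum(pois_pmf(k, lam) for k in range(x + 1))

lam = .5
print(pois_cdf(1, lam) - pois_cdf(0, lam))   # about .303   (a)
print(pois_pmf(0, lam))                      # about .607   (b)
print(pois_cdf(4, lam) - pois_cdf(3, lam))   # about .002   (c)
print(1 - pois_pmf(0, lam))                  # about .393   (d), P(X >= 1)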
Reese’s Pieces Simulation – includes looking ahead a bit.
Reese’s Pieces come in 3 colors – yellow, orange, and brown. The proportions of
these colors is evidently a trade secret. Suppose the actual proportion of orange
pieces is .45. The following javascript simulation takes same-sized random
samples of Reese’s pieces, counts the number of orange ones, and then plots
the proportion on a number line. The applet is located at
http://www.rossmanchance.com/applets/OneProp/OneProp.htm?candy=1
This is one of several applets at
http://www.rossmanchance.com/applets/
designed by Allan Rossman and Beth Chance, two prominent statistics educators.
Fill out the screen as seen below, then click Draw Samples 10 times.
Now click on Count.
We see that, since the standard deviation for the distribution here is roughly
.1, 86.83% of the samples are greater than .35. Change the .35 to .55, again
click Count, and you get 18.98% of the samples are greater than .55. Thus
86.83%-18.98%=64.85% of the samples have proportions within approximately
1 standard deviation of the mean. Now use .25 and .65. We have that 95.83%
of our samples have proportions within approximately two standard deviations
of the mean. Finally, choose .15 and .75. We have that 99.80% of our samples
have proportions within approximately three standard deviations of the mean.
Notice how closely these values of
64.85 — 95.83 — 99.80
match
68.3 — 95.4 — 99.7,
the corresponding percents for the normal curve.
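The applet's experiment can be imitated with a short Python sketch. The sample size is an assumption here (the screenshots are not reproduced); 25 pieces per sample is consistent with the quoted standard deviation of roughly .1, since √(.45 · .55/25) ≈ .0995.

import random

# Repeatedly draw samples of candies, record the sample proportion of orange,
# and see how the sample proportions spread out around .45.
p_orange = .45      # assumed true proportion of orange, as in the text
n_pieces = 25       # assumed sample size (not shown in the missing screenshots)
n_samples = 10000

props = []
for _ in range(n_samples):
    orange = sum(1 for _ in range(n_pieces) if random.random() < p_orange)
    props.append(orange / n_pieces)

mean_prop = sum(props) / n_samples
within_1sd = sum(1 for ph in props if .35 < ph < .55) / n_samples

print(mean_prop)    # close to .45
print(within_1sd)   # roughly two-thirds, as with the 64.85% observed above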
Continuous Probability Distribution (or Probability Density Function) – a function f(x) such that
(1) f(x) ≥ 0 for all x.
(2) ∫_{−∞}^{∞} f(x) dx = 1.
Example. Consider the function
f(x) = 1/(5π(1 + (x − 1)²/25))
where X is a random variable.
P(−10 ≤ X ≤ 20) = ∫_{−10}^{20} f(x) dx.
In general,
P(a ≤ X ≤ b) = ∫_{a}^{b} f(x) dx.
Also,
∫_{−∞}^{∞} f(x) dx ≈ ∫_{−10⁶}^{10⁶} f(x) dx = 1.
Thus f (x) meets the requirements of being a probability density function.
Finally, for every real number a,
P(X = a) = ∫_{a}^{a} f(x) dx = 0.
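This f(x) is in fact a Cauchy density with location 1 and scale 5, so the integrals above can be checked both in closed form (its antiderivative is an arctangent) and numerically; a sketch:

from math import atan, pi

def f(x):
    # The density from the example: f(x) = 1 / (5*pi*(1 + (x - 1)**2 / 25))
    return 1 / (5 * pi * (1 + (x - 1)**2 / 25))

def F(x):
    # An antiderivative of f: F(x) = (1/pi) * arctan((x - 1)/5)
    return atan((x - 1) / 5) / pi

# P(-10 <= X <= 20), in closed form and by a crude Riemann sum.
print(F(20) - F(-10))                                   # about .78
dx = 0.001
print(sum(f(-10 + k * dx) * dx for k in range(30000)))  # about .78 as well

# The total area, approximated on [-10**6, 10**6], is essentially 1.
print(F(10**6) - F(-10**6))                             # about .999997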
We have been using normal curves. Let’s take a closer look at them.
Definition. A normal curve with mean µ and standard deviation σ is the graph of the function
f(x) = (1/(σ√(2π))) e^(−(x − µ)²/(2σ²))
Thus normal curves are completely determined by their mean and standard
deviation.
For the three normal curves above, one has a mean of 70 and a standard
deviation of 5, another has a mean of 70 and a standard deviation of 10, and
the third has a mean of 50 and a standard deviation of 10. Which is which?
Every normal curve extends from −∞ to ∞ on the horizontal axis with the
area under the curve always equal to one.
Consider the normal curve below with mean µ and standard deviation σ, or, in
the case of the standard normal curve, with mean 0 and standard deviation 1.
Moving from the peak to the right, every normal curve changes from concave
down to concave up exactly 1 standard deviation from the mean.
From the above, we can also see that every normal curve with mean µ and
standard deviation σ and variable x can be transformed into a standard normal
curve with variable z by the formula
z = (x − µ)/σ.
Similarly, every standard normal curve with variable z can be transformed into
a normal curve with mean µ and standard deviation σ with variable x by the
formula
x = µ + σz.
As an example, the graph of the normal curve with mean 20 and standard
deviation 5 and the corresponding standard normal curve follow below.
Characteristics of a Normal Distribution
(1) It is symmetric about its mean µ.
(2) Mean = median = mode = µ.
(3) The total area between the curve and the x-axis is 1, with 50% of the
area on each side of the mean.
(4) The 68–95–99.7 rule says that 68% of the area is within one SD of the
mean, 95% within two, and 99.7% within three.
(5) The graph has a maximum at µ and points of inflection at µ ± σ.
(6) The normal distribution is completely determined by µ and σ. µ is often
called a location parameter, while σ, which determines the peakedness
or flatness of the graph, is called a shape parameter.
Example (Using the Standard Normal Distribution with Table D).
(1) P(−1.3 ≤ z ≤ 2.53) = P(z ≤ 2.53) − P(z < −1.3) = .9943 − .0968 = .8975
(2) P(z > .7) = 1 − P(z ≤ .7) = 1 − .7580 = .2420
(3) P(−1.24 ≤ z ≤ 1.24) = 1 − 2P(z < −1.24) = 1 − 2(.1075) = .785
(4) If P(z ≤ z1) = .9265, what is z1? z1 = 1.45
(5) If P(z > z1) = .2611, what is z1? P(z ≤ z1) = 1 − .2611 = .7389 ⟹ z1 = .64
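These Table D lookups can be checked without a table: the standard normal CDF is Φ(z) = (1/2)(1 + erf(z/√2)), and the standard library also provides an inverse CDF through statistics.NormalDist. A sketch:

from math import erf, sqrt
from statistics import NormalDist

def Phi(z):
    # Standard normal cumulative distribution function.
    return 0.5 * (1 + erf(z / sqrt(2)))

print(Phi(2.53) - Phi(-1.3))          # about .8975   (1)
print(1 - Phi(.7))                    # about .2420   (2)
print(1 - 2 * Phi(-1.24))             # about .785    (3)
print(NormalDist().inv_cdf(.9265))    # about 1.45    (4)
print(NormalDist().inv_cdf(.7389))    # about .64     (5)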
Many measurable characteristics in nature are at least approximately normal, so
we can use the normal distribution to model the distribution of these variables.
Problem (4.7.2). µ = 140, σ = 50
(a) P(X ≥ 200) = P(z ≥ (200 − 140)/50) = P(z ≥ 1.2) =
1 − P(z ≤ 1.2) = 1 − .8849 = .1151
(b) P(X < 100) = P(z < (100 − 140)/50) = P(z < −.8) = .2119
(c) P(100 ≤ X ≤ 200) = P((100 − 140)/50 ≤ z ≤ (200 − 140)/50) = P(−.8 ≤ z ≤ 1.2) =
P(z ≤ 1.2) − P(z < −.8) = .8849 − .2119 = .673
(d) P(200 ≤ X ≤ 250) = P((200 − 140)/50 ≤ z ≤ (250 − 140)/50) = P(1.2 ≤ z ≤ 2.2) =
P(z ≤ 2.2) − P(z < 1.2) = .9861 − .8849 = .1012
(e) From (a), P(X > 200) = .1151 ⟹ .1151(10,000) = 1151 have a ridge
count of 200 or more.
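A sketch that checks Problem 4.7.2 with the standard library's NormalDist, which performs the z-transformation internally (the 10,000 in part (e) is the group size given in the problem):

from statistics import NormalDist

X = NormalDist(mu=140, sigma=50)

print(1 - X.cdf(200))                    # about .1151   (a)
print(X.cdf(100))                        # about .2119   (b)
print(X.cdf(200) - X.cdf(100))           # about .6730   (c)
print(X.cdf(250) - X.cdf(200))           # about .1012   (d)
print(round((1 - X.cdf(200)) * 10000))   # about 1151 people   (e)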