Lecture 5. Binomial Distribution

"Bernoulli trials" are experiments satisfying 3 conditions:
1. The experiment has only 2 possible outcomes: Success (S) and Failure (F).
2. The probability of S is fixed (does not change) from trial to trial: P(S) = p, 0 < p < 1, and P(F) = 1 − P(S) = 1 − p = q.
3. n independent trials of the experiment are performed.

Let X = # of S (successes) in n Bernoulli trials. X has a Binomial distribution with number of trials n and probability of success p: X ~ Bin(n, p). X is a discrete r.v. with 2 parameters, n and p. X counts the number of S (successes) in n Bernoulli trials (Binomial type experiment).

EXAMPLES
1. Toss a coin 10 times. Record the number of H. Is this a Binomial type experiment?
2. Toss a coin until you get a T. Record the number of tosses. Is this a Binomial type experiment?
3. Toss a die 20 times. Record the number of "5". Is this a Binomial type experiment?
4. Toss a die 20 times. Record the number of times an "even" face comes up. Is this a Binomial type experiment?

Probability distribution of a Bin(n, p) r.v.
n = # of trials. An outcome of the experiment with k successes (0 ≤ k ≤ n) and n − k failures, for example
$$\underbrace{SS\cdots S}_{k\ \text{times}}\ \underbrace{FF\cdots F}_{n-k\ \text{times}},$$
has probability
$$\underbrace{p\,p\cdots p}_{k\ \text{times}}\ \underbrace{(1-p)(1-p)\cdots(1-p)}_{n-k\ \text{times}} = p^k (1-p)^{n-k}.$$
P(k successes out of n trials) = (# ways to place k S among n trials) × $p^k (1-p)^{n-k}$, and
$$\#\ \text{ways to place } k \text{ S among } n \text{ trials} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!},$$
where n! = "n factorial" = 1 ⋅ 2 ⋯ (n − 2) ⋅ (n − 1) ⋅ n. Finally,
$$P(k \text{ successes out of } n \text{ trials}) = \binom{n}{k} p^k (1-p)^{n-k}.$$

Summary: X ~ Bin(n, p), n = number of trials, p = probability of S, 0 < p < 1. Values of X: 0, 1, …, n.
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$

Example:
$$\binom{5}{2} = \frac{5!}{2!\,3!} = \frac{1\cdot 2\cdot 3\cdot 4\cdot 5}{(1\cdot 2)\cdot(1\cdot 2\cdot 3)} = 10.$$

NOTE: Table A-1 in the Appendix lists Binomial probabilities for n=2, 3, …, 15 and p=0.1, 0.2, 0.3, …, 0.99.

EXAMPLE
What is the probability that in a family of 5 children exactly 2 are girls?
What is the probability of having all girls?

Solution. Trial/experiment: parents have a child, Girl (S) or Boy (F). Our family has 5 children, i.e. n = 5 trials, with p = P(S) = P(girl) = 0.5. X = # girls among 5 children; X ~ Bin(5, 0.5).
$$P(X = 2) = \binom{5}{2}\, 0.5^2 (1-0.5)^{5-2} = \frac{10}{2^5} = \frac{5}{16}.$$
Probability of having all girls:
$$P(X = 5) = \binom{5}{5}\, 0.5^5 (1-0.5)^{5-5} = \frac{1}{2^5} = \frac{1}{32}.$$

EXAMPLE
A commuter plane has 10 seats. The airline books 12 people on the flight. Suppose the chance that a person who makes a reservation actually shows up is 0.8. Find the probability that someone is bumped and the probability that at least one seat is empty.

Solution. Trial/experiment: a person with a reservation decides to show up (S) or not show up (F) for the flight. Total # of people with reservations = 12, so total # of trials = 12. P(S) = 0.8. X = # people who show up for the flight; X ~ Bin(12, 0.8). I used Table A-1 for binomial probabilities.
P(someone is bumped) = P(more than 10 people show up) = P(X = 11 or X = 12) = P(X = 11) + P(X = 12) = 0.206 + 0.069 = 0.275.
P(at least one seat is empty) = P(at most 9 people show up) = 1 − P(X = 10 or X = 11 or X = 12) = 1 − (0.283 + 0.206 + 0.069) = 0.442.

Mean and variance of a binomial random variable
If X ~ Bin(n, p), then the mean of X is µ_X = EX = np and the standard deviation of X is $\sigma_X = \sqrt{np(1-p)}$.
NOTES:
1. Variance of X is σ² = np(1 − p).
2. The mean of a binomial r.v. (mean number of successes) is the number of trials × the probability of success.

EXAMPLE
1. A fair coin was tossed 3 times. Let X = # of heads in the 3 tosses. What are the mean and standard deviation of X?
Solution. X ~ Bin(3, 0.5). Mean of X: µ_X = EX = np = 3(0.5) = 1.5. Standard deviation of X: $\sigma_X = \sqrt{np(1-p)} = \sqrt{3(0.5)(1-0.5)} = \sqrt{3/4} = 0.866$.
2. The overbooking airline example. What are the mean and standard deviation of the number of passengers that show up for the flight?
Solution. X ~ Bin(12, 0.8). Mean of X: µ_X = EX = np = 12(0.8) = 9.6.
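The binomial probabilities and means in the examples so far can be checked numerically. The following is a minimal Python sketch, not part of the lecture: the helper name `binom_pmf` is an assumption introduced here for illustration, and it uses only the standard library (`math.comb` needs Python 3.8 or later).

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Family of 5 children, P(girl) = 0.5
p_two_girls = binom_pmf(2, 5, 0.5)   # exactly 2 girls
p_all_girls = binom_pmf(5, 5, 0.5)   # all 5 girls

# Overbooked flight: 12 reservations, 10 seats, P(show up) = 0.8
n, p = 12, 0.8
p_bumped = binom_pmf(11, n, p) + binom_pmf(12, n, p)       # P(X = 11 or 12)
p_empty_seat = sum(binom_pmf(k, n, p) for k in range(10))  # P(X <= 9)

# Mean and standard deviation for the coin example, X ~ Bin(3, 0.5)
coin_mean = 3 * 0.5
coin_sd = sqrt(3 * 0.5 * (1 - 0.5))

# Mean number of passengers showing up, X ~ Bin(12, 0.8)
flight_mean = n * p

print(p_two_girls, p_all_girls)
print(round(p_bumped, 3), round(p_empty_seat, 3))
print(coin_mean, round(coin_sd, 3), round(flight_mean, 1))
```

Computing the probabilities directly from the formula avoids the rounding in a printed table, so the last digit may differ slightly from a table lookup.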
Standard deviation of X is $\sigma_X = \sqrt{np(1-p)} = \sqrt{12(0.8)(1-0.8)} = \sqrt{1.92} \approx 1.39$.

NORMAL DISTRIBUTION
The normal distribution is a continuous distribution. Normal curve: bell shaped, unimodal (single peak at the center), symmetric. It is completely described by its center of symmetry, the mean µ, and its spread, the standard deviation σ.
Random variable with normal distribution: normal random variable with mean µ and st. dev. σ: X ~ N(µ, σ).
Standard normal random variable: mean 0 and st. dev. 1: Z ~ N(0, 1).

NORMAL DISTRIBUTION: CHANGING LOCATION AND SCALE
[Figure: two panels of normal density curves plotted for x from −4 to 4. Left panel, CHANGING LOCATION µ: 0 = µ1 < µ2 = 1. Right panel, CHANGING SCALE σ: 1 = σ1 < σ2 = 2, with a "peaky" density for σ1 and a "flatter" density for σ2.]
Changes in mean/location cause shifts in the density curve along the x-axis.
Changes in spread/standard deviation cause changes in the shape of the density curve.

Why Bother with Normal Distributions?
Normal distributions are great descriptions, models, and approximations for many data sets such as weights, heights, exam scores, experimental errors, etc. They are great descriptions of the results (outcomes) of many chance driven experiments. Statistical inference based on the normal distribution works well for many (approximately) symmetric distributions. HOWEVER, remember that not everything or everybody is normal!

AREAS UNDER THE NORMAL CURVE
Normal probabilities = areas under the normal curve. They are tabulated for the standard normal distribution (Table A-2 in the Appendix). In looking for probabilities keep in mind: symmetry of the normal curve, and P(Z = a) = 0 for any a.
FIND:
P(Z < 0.01) = 0.504
P(Z ≤ −0.01) = 0.496
P(Z < 0) = 0.5
P(Z < 2.92) = 0.9982
P(Z > 2.92) = 1 − 0.9982 = 0.0018 or, by symmetry, = P(Z < −2.92) = 0.0018
P(−1.32 < Z < 1.2) = 0.8849 − 0.0934 = 0.7915
SUMMARY OF RULES we used above:
P(Z > a) = 1 − P(Z < a) = P(Z < −a)
P(a < Z < b) = P(Z < b) − P(Z < a)

NORMAL PERCENTILES
Given that P(Z < z) = 0.95, find z. Here z is called the 95th percentile of Z. Inside the table I looked for 0.95. Found 0.9495 and 0.9505.
Used the z-value midway between the two table entries (z = 1.64 and z = 1.65), since 0.95 is the midpoint of the two available probabilities: z = 1.645. If an available probability is closer to the one we need, use the z-value corresponding to that probability.

Example
Scientific thermometers should give a reading of 0 °C at the freezing point of water. However, due to the usual random variability of the readings, the actual readings (in °C) are normally distributed with mean 0 and standard deviation 1. If a thermometer is randomly selected, find the probability that the reading is
A. below 1.58°,
B. above −1.23°,
C. between −2.00° and 1.50°.
D. Also, find the temperature corresponding to the 90th percentile of the temperature readings.
E. Find the temperatures separating the bottom 2.5% and the top 2.5% of readings.
Solution. T = temperature reading at the freezing point of water on a randomly chosen thermometer. T ~ N(0, 1).
A. P(T < 1.58) = 0.9429
B. P(T > −1.23) = 1 − P(T < −1.23) = 1 − 0.1093 = 0.8907
C. P(−2.00 < T < 1.50) = 0.9332 − 0.0228 = 0.9104
D. P(T < 1.28) = 0.8997, closest to 0.9, so use z = 1.28 as the 90th percentile.
E. P(T < −1.96) = 0.025; by symmetry P(T > 1.96) = 0.025. Thus, T = −1.96 separates the bottom 2.5% and T = 1.96 separates the top 2.5% of temperatures.

GENERAL NORMAL DISTRIBUTION
If X ~ N(µ, σ), then
$$Z = \frac{X-\mu}{\sigma} \sim N(0, 1) \quad \text{(standard normal)}.$$
This transformation is called standardization.

Example. Suppose that the weight of people in NV follows a normal distribution with mean 150 lb and standard deviation 20 lb. Find the probability that a randomly selected Nevadan weighs a) at most 160 lb; b) over 160 lb.
Solution. Let X = weight of a randomly selected Nevadan. X ~ N(150, 20).
a) $P(X \le 160) = P\left(\frac{X-150}{20} \le \frac{160-150}{20}\right) = P(Z < 0.5) = 0.6915.$
b) P(X > 160) = 1 − P(X ≤ 160) = 1 − 0.6915 = 0.3085.

NORMAL PERCENTILES
EXAMPLE. Suppose scores X on a test follow a normal distribution with mean 430 and standard deviation 100. Find the 90th percentile of the scores, that is, the score x such that P(X ≤ x) = 0.9.
Solution.
Since we start with a normal but NOT STANDARD normal distribution, we have to standardize at some point:
$$0.9 = P(X \le x) = P\left(\frac{X-430}{100} \le \frac{x-430}{100}\right) = P\left(Z < \frac{x-430}{100}\right).$$
From the table, P(Z < z) = 0.90 gives z = 1.28, so we get the equation
$$\frac{x-430}{100} = 1.28, \qquad x - 430 = 128, \qquad x = 558.$$
90% of students scored 558 or less.

EXAMPLE
Height of women follows a normal distribution with mean 64.5 and standard deviation 2.5 inches. Find
a) The probability that a woman is shorter than 70 in.
b) The probability that a woman is between 60 and 70 in tall.
c) The height that 10% of women are shorter than, i.e. the 10th percentile of women's heights.
SOLUTION. X = woman's height; X ~ N(64.5, 2.5).
a) P(X < 70) = P(Z < (70 − 64.5)/2.5) = P(Z < 2.2) = 0.9861.
b) P(60 < X < 70) = P((60 − 64.5)/2.5 < Z < (70 − 64.5)/2.5) = P(−1.8 < Z < 2.2) = P(Z < 2.2) − P(Z < −1.8) = 0.9861 − 0.0359 = 0.9502.
c) 10th percentile of X = ? 0.1 = P(X < x) = P(Z < (x − 64.5)/2.5). From the table, P(Z < −1.28) ≈ 0.10, so −1.28 = (x − 64.5)/2.5 and x = 64.5 − 1.28(2.5) = 61.3. 10% of women are shorter than 61.3 in.
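
The normal probabilities and percentiles in the last two examples can also be computed without a table. The following is a minimal sketch using Python's standard-library `statistics.NormalDist` (Python 3.8 or later); the variable names are assumptions introduced here for illustration, and the code is not part of the lecture.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)            # standard normal
scores = NormalDist(mu=430, sigma=100)   # test scores example
heights = NormalDist(mu=64.5, sigma=2.5) # women's heights example

# Standardization: P(X < 70) equals P(Z < (70 - mu) / sigma)
z = (70 - heights.mean) / heights.stdev
p_shorter_70 = Z.cdf(z)

# P(60 < X < 70) = P(X < 70) - P(X < 60)
p_between = heights.cdf(70) - heights.cdf(60)

# Percentiles: inv_cdf(q) returns x such that P(X <= x) = q
score_90th = scores.inv_cdf(0.90)
height_10th = heights.inv_cdf(0.10)

print(round(p_shorter_70, 4), round(p_between, 4))
print(round(score_90th, 1), round(height_10th, 1))
```

The percentiles computed this way differ slightly from the table-based answers because the table rounds z to two decimals (1.28 rather than about 1.2816).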