Sta220 - Statistics
Mr. Smith
Room 310
Class #11
Section 4.1-4.2

4.1 - Two Types of Random Variables

A random variable is a variable that assumes numerical values associated with the random outcomes of an experiment, where one (and only one) numerical value is assigned to each sample point. A random variable is a numerical quantity whose value depends on chance.

Example 1
You are tossing a coin twice and will bet on the number of heads. The outcome is a number (0, 1, or 2) that depends on chance. The number of heads is a random variable.

Example 2
You are tossing a coin twice and will bet on a specific outcome such as "first a head, then a tail" (HT). The outcome depends on chance, but it is not a number. This is NOT a random variable.

Example 3
You go to Las Vegas and begin to put quarters in a slot machine. Let X be the number of quarters you play before your first win of any amount. X is a number and depends on chance. X is a random variable.

Example 4.1
A panel of 10 experts for the Wine Spectator (a national publication) is asked to taste a new white wine and assign it a rating of 0, 1, 2, or 3. A score is then obtained by adding together the ratings of the 10 experts. How many values can this random variable assume?

Solution
A sample point is a sequence of 10 numbers, one rating per expert, for example {1, 0, 0, 1, 2, 0, 0, 3, 1, 0}. The lowest possible score is 0 (every expert gives a 0) and the highest is 30 (every expert gives a 3), so the possible scores are x = 0, 1, 2, …, 30. The random variable, denoted by the symbol x, can assume 31 values.
Note: The sample point shown here gives x = 8.

There are two different types of random variables, discrete and continuous.

Random variables that can assume a countable number of values are called discrete.

Random variables that can assume values corresponding to any of the points contained in an interval are called continuous.

Examples of Discrete Random Variables
1. The number of seizures an epileptic patient has in a given week.
2. The shoe size of a tennis player: x = …, 5, 5.5, 6, 6.5, 7, 7.5, ….
3. The change received for paying a bill: x = $0.01, $0.02, …, $1, $1.01, $1.02, ….
4. The number of customers waiting to be served in a restaurant at a particular time: x = 0, 1, 2, ….

Examples of Continuous Random Variables
1. The length of time (in seconds) between arrivals at a hospital clinic: 0 ≤ x < ∞
2. The length of time (in minutes) it takes a student to complete a one-hour exam: 0 ≤ x ≤ 60
3. The depth (in feet) at which a successful oil-drilling venture first strikes oil: 0 ≤ x ≤ c, where c is the maximum depth obtainable.

Watch out for similar situations. The number of checkout lanes open at a grocery store is a discrete random variable, while the amount of time spent standing in line is a continuous random variable.

4.2 - Probability Distributions for Discrete Random Variables

A complete description of a discrete random variable requires that we specify all the values the random variable can assume and the probability associated with each value.

The probability distribution of a discrete random variable is a graph, table, or formula that specifies the probability associated with each possible value that the random variable can assume.

Requirements for the Probability Distribution of a Discrete Random Variable x
1. p(x) ≥ 0 for all values of x
2. ∑ p(x) = 1, where the summation of p(x) is over all possible values of x.

Example 4.4-1
Recall the experiment of tossing two coins, and let x be the number of heads observed. Find the probability associated with each value of the random variable x, assuming that the two coins are fair.

Solution
The sample space for this experiment consists of the sample points HH, HT, TH, and TT. Note that the random variable x can assume values 0, 1, 2.
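The sample space can also be enumerated directly. The following is a minimal Python sketch, not part of the original slides; the function name `heads_distribution` is an illustrative choice. It lists every equally likely sequence of heads and tails, counts the heads in each, and converts the counts into exact fractions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins):
    """Return {x: p(x)} where x is the number of heads in n_coins fair tosses."""
    # Each sequence of H/T is one equally likely sample point.
    sample_points = list(product("HT", repeat=n_coins))
    # Tally how many sample points produce each number of heads.
    counts = Counter(point.count("H") for point in sample_points)
    total = len(sample_points)
    return {x: Fraction(counts[x], total) for x in sorted(counts)}


two_coins = heads_distribution(2)
for x, p in two_coins.items():
    print(f"p({x}) = {p}")
```

Using `Fraction` keeps the probabilities exact, so the two requirements for a probability distribution can be checked without rounding error. Calling `heads_distribution(3)` gives the corresponding distribution for three fair coins.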
The probabilities of the sample points associated with each value of x are:
P(x = 0) = p(0) = 1/4
P(x = 1) = p(1) = 1/2
P(x = 2) = p(2) = 1/4

This dual specification (the values of x together with their probabilities) completely describes the random variable and is referred to as the probability distribution, denoted by the symbol p(x).

Example 4.4-2
Let's look at the experiment of tossing three coins, and let x be the number of heads observed. Find the probability associated with each value of the random variable x, assuming that the three coins are fair.

Picture on the white board.

Note that the random variable x can assume values 0, 1, 2, 3. The probabilities of the sample points associated with each value of x are:
P(x = 0) = p(0) = 1/8
P(x = 1) = p(1) = 3/8
P(x = 2) = p(2) = 3/8
P(x = 3) = p(3) = 1/8

Since probability distributions are related to the relative frequency distributions of Chapter 2, it should be no surprise that the mean and standard deviation are useful descriptive measures.

Measuring Central Tendency; Expected Value

The mean, or expected value, of a discrete random variable x is
μ = E(x) = ∑ x p(x)

The expected value is the mean of the probability distribution, or a measure of its central tendency.

Example 4.7
Suppose you work for an insurance company and you sell a $10,000 one-year term insurance policy at an annual premium of $290. Actuarial tables show that the probability of death during the next year for a person of your customer's age, sex, health, etc., is 0.001. What is the expected gain (amount of money made by the company) for a policy of this type?

Solution
The experiment is to observe whether the customer survives the upcoming year. There are two sample points, Live and Die, with probabilities .999 and .001, respectively. If the customer lives, the company gains the $290 premium as profit. If the customer dies, the gain is negative because the company must pay $10,000, for a net "gain" of $(290 − 10,000) = -$9,710.
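Before going further, the two outcomes just described can be written as a small probability table and checked against the two requirements for a discrete probability distribution stated earlier. This is a minimal Python sketch; the names `gain_distribution` and `is_valid_distribution` are illustrative and do not come from the slides.

```python
import math

# Gain x (in dollars) mapped to its probability, taken from the example.
gain_distribution = {
    290: 0.999,           # customer lives: the company keeps the premium
    290 - 10_000: 0.001,  # customer dies: premium minus the $10,000 payout
}


def is_valid_distribution(p):
    """Check that p(x) >= 0 for every x and that the probabilities sum to 1."""
    non_negative = all(prob >= 0 for prob in p.values())
    # isclose tolerates floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(p.values()), 1.0)
    return non_negative and sums_to_one


print(gain_distribution)
print(is_valid_distribution(gain_distribution))
```

The comparison uses `math.isclose` rather than `==` because decimal probabilities are stored as binary floating-point numbers and may not add up to exactly 1.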
The random variable you are interested in is the gain x, which can assume the values shown in the following table:

Gain x      Sample Point    Probability
$290        Lives           .999
-$9,710     Dies            .001

The expected gain is therefore
μ = E(x) = ∑ x p(x) = (290)(.999) + (−9,710)(.001) = $280

In other words, if the company were to sell a very large number of $10,000 one-year policies to customers possessing the characteristics described, it would (on the average) net $280 per sale in the next year.

NOTE
E(x) need not equal a possible value of x. That is, the expected value is $280, but x will equal either -$9,710 or $290 each time the experiment is performed. The expected value is a measure of central tendency, and in this case represents the average over a very large number of one-year policies, but it is not a possible value of x.

The variance of a random variable x is
σ² = E[(x − μ)²] = ∑ (x − μ)² p(x) = ∑ x² p(x) − μ²

The standard deviation of a discrete random variable is equal to the square root of the variance, or
σ = √(σ²)

Procedure
Copyright © 2013 Pearson Education, Inc. All rights reserved.

Example 4.8
Medical research has shown that a certain type of chemotherapy is successful 70% of the time when used to treat skin cancer. Suppose five skin cancer patients are treated with this type of chemotherapy, and let x equal the number of successful cures out of the five. The probability distribution for the number x of successful cures out of five is given in the following table:

x       0      1      2      3      4      5
p(x)    .002   .029   .132   .309   .360   .168

a. Find μ = E(x). Interpret the result.
b. Find σ = √(E[(x − μ)²]). Interpret the result.
c. Graph p(x). Locate μ and the interval μ ± 2σ on the graph. Use either Chebyshev's rule or the empirical rule to approximate the probability that x falls into the interval. Compare your result with the actual probability.
d. Would you expect to observe fewer than two successful cures out of five?

Solution
a. Applying the formula for μ, we obtain
μ = E(x) = ∑ x p(x) = 0(.002) + 1(.029) + 2(.132) + 3(.309) + 4(.360) + 5(.168) = 3.5
On average, the number of successful cures out of five skin cancer patients treated with chemotherapy will equal 3.5. Remember that this expected value has meaning only when the experiment (treating five skin cancer patients with chemotherapy) is repeated a large number of times.

b. Now we calculate the variance of x, using μ = 3.5:
σ² = E[(x − μ)²] = ∑ (x − μ)² p(x)
= (0 − 3.5)²(.002) + (1 − 3.5)²(.029) + (2 − 3.5)²(.132) + (3 − 3.5)²(.309) + (4 − 3.5)²(.360) + (5 − 3.5)²(.168) = 1.05
The standard deviation is
σ = √1.05 = 1.02
This value measures the spread of the probability distribution of x, the number of successful cures out of five. A more useful interpretation is obtained by answering parts c and d.

c. The interval within two standard deviations of the mean is μ ± 2σ = 3.5 ± 2(1.02), or (1.46, 5.54). Note particularly that μ = 3.5 locates the center of the probability distribution. Since this distribution is a theoretical relative frequency distribution that is moderately mound shaped, we expect at least 75% (by Chebyshev's rule) and, more likely, approximately 95% (by the empirical rule) of observed x values to fall between 1.46 and 5.54. The actual probability that x falls in the interval is the sum of p(x) for the values x = 2, x = 3, x = 4, and x = 5:
p(2) + p(3) + p(4) + p(5) = .132 + .309 + .360 + .168 = .969
Therefore 96.9% of the probability distribution lies within two standard deviations of the mean. This percentage is consistent with both Chebyshev's rule and the empirical rule.

d. Fewer than two successful cures out of five implies that x = 0 or x = 1. Both of these values of x lie outside the interval μ ± 2σ, and the empirical rule tells us that such a result is unlikely. The exact probability, P(x ≤ 1), is p(0) + p(1) = .002 + .029 = .031.
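The calculations in Examples 4.7 and 4.8 follow directly from the formulas for μ and σ², so they are easy to check by machine. The following is a minimal Python sketch under the assumption that a distribution is stored as a dictionary mapping each value x to p(x); the helper names `mean`, `variance`, and `prob_within` are illustrative, not from the slides.

```python
import math


def mean(p):
    """Expected value: the sum of x * p(x) over all x."""
    return sum(x * prob for x, prob in p.items())


def variance(p):
    """Variance: the sum of (x - mu)^2 * p(x) over all x."""
    mu = mean(p)
    return sum((x - mu) ** 2 * prob for x, prob in p.items())


def prob_within(p, k=2):
    """Probability that x lies within k standard deviations of the mean."""
    mu = mean(p)
    sigma = math.sqrt(variance(p))
    low, high = mu - k * sigma, mu + k * sigma
    return sum(prob for x, prob in p.items() if low <= x <= high)


# Example 4.8: number of successful cures out of five.
cures = {0: .002, 1: .029, 2: .132, 3: .309, 4: .360, 5: .168}
mu = mean(cures)
sigma = math.sqrt(variance(cures))
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"P(within 2 sigma) = {prob_within(cures):.3f}")
print(f"P(x <= 1) = {cures[0] + cures[1]:.3f}")

# Example 4.7: expected gain on the insurance policy.
print(f"expected gain = {mean({290: .999, -9710: .001}):.2f}")
```

The code keeps σ unrounded, so its interval endpoints differ slightly in the second decimal place from (1.46, 5.54), which is built from σ rounded to 1.02. The same values x = 2, 3, 4, 5 fall inside the interval either way, so the probability within two standard deviations is unaffected.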