STA 220H1F LEC0201
Week 7: More Probability: Discrete Random Variables

Recall: A sample space for a random experiment is the set of all possible outcomes of the experiment.

Random Variables

A random variable takes each outcome in the sample space and assigns it a numerical value. Random variables are denoted by upper-case letters (e.g., \(X\)) and the values they take are denoted by the corresponding lower-case letters (e.g., \(x\)).

Random variables can be discrete or continuous. A discrete random variable can take one of a countable list of distinct values. Its set of possible values is a collection of isolated points on the number line. A continuous random variable can take any value in an interval (or collection of intervals).

A scenario: Imagine you are planning an outdoor party. You want to pick a date to optimize the chance that it will be enjoyable. Some random variables to consider:
- The temperature
- The number of planes that will fly overhead

The probability distribution of a discrete random variable \(X\) gives the probability, \(P(X = x)\) or \(p(x)\), of each possible value \(x\) of \(X\). The probabilities satisfy

\[ \sum_x P(X = x) = 1, \]

where the sum is taken over the possible values of \(x\).

Example: Chuck-a-luck

Chuck-a-luck is a carnival game, popular in England. The rules:
- You pick a number from 1 to 6.
- The game operator rolls 3 dice.
- If the number you pick comes up on all 3 dice, you win $3.
- If it comes up on 2 of the 3 dice, you win $2.
- If it comes up on 1 of the 3 dice, you win $1.
- If your number doesn't come up on any dice, you pay $1.

Let \(X\) be a random variable for your winnings.

Probability distribution of \(X\):

Values of x            3        2         1         -1
P(X = x) or p(x)     1/216    15/216    75/216    125/216

Mean and Standard Deviation of Random Variables

The mean (\(\mu\) or \(E(X)\), sometimes \(\mu_X\)) and standard deviation (\(\sigma\) or \(SD(X)\), sometimes \(\sigma_X\)) of a random variable \(X\) can be calculated from its probability model.

(Note that in this context, the mean and variance are parameters of the probability model, and are not calculated from data.
When data are collected and the (sample) mean and (sample) standard deviation are calculated, the goal is often to use the data to estimate the corresponding parameters of the probability model.)

Expectation (or Mean)

The expectation or mean of a random variable is the average value it would take in the long run. For a discrete random variable, it is the weighted average of the possible outcomes, where the weights are the probabilities of the outcomes:

\[ E(X) = \mu = \sum_x x \, P(X = x), \]

where the sum is over all of the possible values of \(x\).

\(\mu\) describes where the probability distribution of \(X\) is centred. (For data, \(\bar{x}\) is often used to estimate \(\mu\).)

Example: Winnings in Chuck-a-luck

\[ E(X) = 3\left(\frac{1}{216}\right) + 2\left(\frac{15}{216}\right) + 1\left(\frac{75}{216}\right) + (-1)\left(\frac{125}{216}\right) = -\frac{17}{216} \approx -0.08 \]

How do casinos make money if people sometimes win big?

A question: If you were faced with the following alternatives, which one would you choose?
(a) A gift of $240 guaranteed.
OR
(b) A 25% chance to win $1000 and a 75% chance of getting nothing.

If your choices were the following, which one would you choose?
(a) A sure loss of $740.
OR
(b) A 75% chance to lose $1000 and a 25% chance to lose nothing.

What if you were given the opportunity to do this 1,000,000 times?

Properties of Expectation:

For random variables \(X\) and \(Y\) and constants \(a\) and \(b\):
1. \(E(X + a) = E(X) + a\)
2. \(E(aX) = a\,E(X)\)
   And it follows that \(E(aX + b) = a\,E(X) + b\)
3. \(E(X + Y) = E(X) + E(Y)\)
   And it follows that \(E(aX + bY) = a\,E(X) + b\,E(Y)\)

Standard Deviation (and Variance)

Variance is the weighted average (weighted by probability) of the squared deviation of the value of a random variable from its mean. For discrete random variables:

\[ \mathrm{Var}(X) = \sigma^2 = \sum_x (x - \mu)^2 \, P(X = x), \]

where the sum is taken over all possible values of \(x\).

The standard deviation of a random variable is the square root of its variance: \(SD(X) = \sigma = \sqrt{\mathrm{Var}(X)}\).

Small values of the standard deviation (and variance) indicate that the values of \(X\) tend to be close to the mean value.

Properties of Variance:

For random variables \(X\) and \(Y\) and constants \(a\) and \(b\):
1. \(\mathrm{Var}(X + a) = \mathrm{Var}(X)\)
2. \(\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)\)
   It follows that: \(\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)\)
3. \(\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)\)

What is \(\mathrm{Cov}(X, Y)\)? This is called the covariance between \(X\) and \(Y\). It is a measure of the linear relationship between \(X\) and \(Y\).
The covariance divided by the product of the standard deviations of \(X\) and \(Y\) is called the correlation. (And \(r\) is used to estimate the correlation from data.)

If \(X\) and \(Y\) are independent, their covariance is 0, which gives:
4. If \(X\), \(Y\) are independent then \(\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)\)
   And it follows that: \(\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)\) and \(\mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y)\)

A linear function of a random variable \(X\) is \(aX + b\), where \(a\) and \(b\) are constants.

A linear combination of random variables \(X\), \(Y\) is \(aX + bY\), where \(a\) and \(b\) are constants.

The properties of means and variances are often used to calculate the means and variances of linear functions or combinations of random variables.

Note: The variance properties are for variances and not standard deviations. If you want the standard deviation for a linear function or combination of random variables, work with the variances and take the square root to change to standard deviation as the last step.

Example: Mean and variance of the sample mean

Suppose \(X_1, X_2, \ldots, X_n\) are independent random variables with the same probability distribution. Then they all have the same mean, \(\mu\), and standard deviation, \(\sigma\).

Let \(\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i\). Then

\[ E(\bar{X}) = E\left(\frac{1}{n}(X_1 + \cdots + X_n)\right) = \frac{1}{n}\bigl(E(X_1) + \cdots + E(X_n)\bigr) = \frac{1}{n}(\mu + \cdots + \mu) = \frac{1}{n}(n\mu) = \mu \]

The expectation of the average of random variables with the same mean is the mean of the individual random variables.

And

\[ \mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}(X_1 + \cdots + X_n)\right) = \frac{1}{n^2}\bigl(\mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)\bigr) = \frac{1}{n^2}(\sigma^2 + \cdots + \sigma^2) = \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n} \]

The variance of the average of independent random variables with the same variance is the variance of the original random variables divided by the sample size.

And \(SD(\bar{X}) = \sigma / \sqrt{n}\): the standard deviation is the standard deviation of the original random variables divided by the square root of the sample size.

Bernoulli Random Variables

New Example: Toss a coin once

This is an example of a Bernoulli trial. A Bernoulli trial has the following characteristics:
- There are two possible outcomes ("success" or "failure").
- The probability of success is \(p\) (and the probability of failure is \(1 - p\)).

A Bernoulli random variable is 1 if the trial was a success and 0 if it was a failure.
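Looking back at the Chuck-a-luck example and the sample-mean example, the formulas for the mean, variance, and the variance of an average can be checked numerically. The following is only a sketch: the helper names (`mean`, `variance`, `sample_mean_dist`) and the choice of n = 3 plays are illustrative assumptions, not part of the lecture. It uses exact fractions and enumerates every combination of outcomes, so no simulation is involved.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def mean(dist):
    """E(X): sum of x * P(X = x) over the possible values x."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X): sum of (x - mu)^2 * P(X = x) over the possible values x."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


# Chuck-a-luck winnings X: value -> probability (from the example).
chuck = {
    3: Fraction(1, 216),
    2: Fraction(15, 216),
    1: Fraction(75, 216),
    -1: Fraction(125, 216),
}
assert sum(chuck.values()) == 1  # probabilities must sum to 1

mu = mean(chuck)        # exactly -17/216, about -0.08
var = variance(chuck)   # about 1.24
sd = sqrt(var)          # square root taken as the last step


def sample_mean_dist(dist, n):
    """Exact distribution of the average of n independent copies of X."""
    out = {}
    for combo in product(dist.items(), repeat=n):
        avg = Fraction(sum(x for x, _ in combo), n)
        prob = Fraction(1)
        for _, p in combo:
            prob *= p  # independence: multiply the probabilities
        out[avg] = out.get(avg, 0) + prob
    return out


# Average winnings over n = 3 plays (n chosen only for illustration).
xbar = sample_mean_dist(chuck, 3)
print("E(X) =", mu, " Var(X) =", var, " SD(X) =", sd)
print("E(Xbar) == mu:", mean(xbar) == mu)
print("Var(Xbar) == var / 3:", variance(xbar) == var / 3)
```

Because the arithmetic is exact, the last two comparisons test the identities themselves rather than an approximation to them.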
Mean of a Bernoulli random variable, \(X\):

\[ E(X) = 1 \cdot p + 0 \cdot (1 - p) = p \]

Variance of a Bernoulli random variable, \(X\):

\[ \mathrm{Var}(X) = (0 - p)^2 (1 - p) + (1 - p)^2 p = p(1 - p)\bigl(p + (1 - p)\bigr) = p(1 - p) \]

Standard deviation of a Bernoulli random variable, \(X\):

\[ SD(X) = \sqrt{p(1 - p)} \]

Another New Example: Toss a coin 6 times

How many outcomes are there in the sample space?

Which outcome is more likely to occur?
A. HHHTTT
B. HTHHTT
C. HTHTHT

What is the probability of getting 3 heads?

Recall: The number of ways of choosing \(k\) things from \(n\) where order doesn't matter is denoted \(\binom{n}{k}\), read "\(n\) choose \(k\)".

\[ \binom{n}{k} = \frac{n!}{k!\,(n - k)!} \quad \text{where } n! = n(n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1 \]

The number of ways you can get exactly 3 heads in 6 tosses of a coin is \(\binom{6}{3} = \frac{6!}{3!\,3!} = 20\).

A Common Probability Model 2: The Binomial Distribution

Properties of a Binomial Experiment:
- The experiment consists of a fixed number \(n\) of observations; each is a trial.
- Each trial has one of only two outcomes, success or failure.
- The trials are independent.
- Each trial has the same probability, \(p\), of success. (So the probability of failure is \(1 - p\) for each trial.)

We are interested in a probability model for the number of successes in the \(n\) trials.

If \(X\) is a binomial random variable, write \(X \sim \mathrm{Binomial}(n, p)\).

The probability distribution function for a Binomial random variable is

\[ P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \quad \text{for } k = 0, 1, \ldots, n \]

Example: Let \(X\) be the number of heads in 6 tosses of a coin.

\[ P(X = 3) = \binom{6}{3} \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^3 = \frac{20}{64} \]

Note that we can think of \(X\) as the sum of \(n\) independent Bernoulli random variables, each with probability of success \(p\).

Example: Your exam is multiple choice. Suppose there are 20 questions. If there are 5 choices for each question and you randomly guess, what is the probability that you get at least half of the questions correct?

Let \(X\) be the random variable for the number of questions you get correct.

\(X \sim \mathrm{Binomial}(20, 1/5)\)

We want \(P(X \ge 10)\).

It would be tedious to calculate

\[ P(X \ge 10) = P(X = 10) + P(X = 11) + \cdots + P(X = 20) \]

Use Minitab: Calc > Probability Distributions, choose Binomial, and select Cumulative probability to get \(P(X \le 9)\). Then

\[ P(X \ge 10) = 1 - P(X \le 9) \]

Mean and Variance of a Binomial Random Variable

Use the fact that a Binomial random variable is the sum of independent Bernoulli random variables.

Suppose \(X_1, \ldots, X_n\) are independent Bernoulli(\(p\)) random variables. If \(X = X_1 + \cdots + X_n\), then \(X \sim \mathrm{Binomial}(n, p)\).

Then

\[ E(X) = E(X_1 + \cdots + X_n) = E(X_1) + \cdots + E(X_n) = p + \cdots + p = np \]

and

\[ \mathrm{Var}(X) = \mathrm{Var}(X_1 + \cdots + X_n) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n) = p(1 - p) + \cdots + p(1 - p) = np(1 - p) \]

In a binomial setting, instead of being interested in the number of successes, we are often interested in the proportion of times we get a success.

Example: A poll asks 1000 people: Do you approve of the way Rob Ford is running the city of Toronto?

We are interested in the number of people who say "yes". Let \(X\) denote the number of people who responded "yes" (successes) and let \(p\) denote the probability that a randomly chosen person from the population will respond yes. Suppose the 1000 people polled are a SRS from the population. What kind of distribution does \(X\) have?

Consider a sample of size \(n\) selected without replacement from a population consisting of \(N\) individuals or objects. If \(n/N \le 0.10\) (i.e., if at most 10% of the population is sampled), then the binomial distribution gives a good approximation to the probability distribution of \(X\), the number of successes in the sample.

Typically in a poll like this, we are interested in the proportion of the population who support Rob Ford. Estimate it by

\[ \hat{p} = \frac{X}{n} \]

Then

\[ E(\hat{p}) = \frac{1}{n} E(X) = \frac{np}{n} = p \]

And

\[ \mathrm{Var}(\hat{p}) = \frac{1}{n^2} \mathrm{Var}(X) = \frac{np(1 - p)}{n^2} = \frac{p(1 - p)}{n} \]
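The binomial calculations in these notes can also be reproduced without Minitab. The sketch below uses only Python's standard library; the function names `binom_pmf` and `binom_cdf` are illustrative assumptions, not something the course prescribes. It evaluates the coin-toss probability, the "at least half correct" exam probability through the complement rule, and the mean and variance of the exam score computed directly from the probability distribution.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """Cumulative probability P(X <= k), summing the pmf from 0 to k."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))


# Six tosses of a fair coin: probability of exactly 3 heads (20/64).
p_three_heads = binom_pmf(3, 6, 0.5)

# Exam example: 20 questions, 5 choices each, random guessing, so p = 1/5.
n, p = 20, 0.2
p_at_least_10 = 1 - binom_cdf(9, n, p)  # P(X >= 10) = 1 - P(X <= 9)

# Mean and variance from the definition, to compare with np and np(1 - p).
mean_x = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
var_x = sum((k - mean_x) ** 2 * binom_pmf(k, n, p) for k in range(n + 1))

print("P(3 heads in 6 tosses) =", p_three_heads)
print("P(X >= 10) =", p_at_least_10)
print("E(X) =", mean_x, "compared with np =", n * p)
print("Var(X) =", var_x, "compared with np(1 - p) =", n * p * (1 - p))
```

The mean and variance computed from the definition should match np = 4 and np(1 - p) = 3.2 up to floating-point rounding, and the exam probability comes out well under 1%, so guessing is very unlikely to get half the questions right.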