Review
● In most card games, cards are dealt without replacement. What is the probability of being dealt an ace and then a 3? Choose the closest answer.
a) 0.0045 b) 0.0059 c) 0.0060 d) 0.1553

Review
● What is the probability of throwing two 6s in a row with a fair die?
a) 0.0278 b) 0.0333 c) 0.1389 d) 0.333

Tree Diagrams
● Tree diagrams help us think through conditional probabilities by showing sequences of events as paths that look like branches of a tree
● We often make tree diagrams when reversing the conditioning

● Suppose we want to know Prob(A | B), but we know only Prob(A), Prob(B), and Prob(B | A)
● We also know Prob(A and B), since Prob(A and B) = Prob(A) x Prob(B | A)
● From this information, we can find Prob(A | B) = Prob(A and B) / Prob(B)
● When we reverse the conditioning of the conditional probability that we are originally given, we use Bayes' Theorem

Example – false positive rates
● Assume there is a screening test for a certain cancer that is positive 95 percent of the time if someone has the cancer. Also assume that if someone doesn't have the cancer, the test is positive just 1 percent of the time. Assume further that 0.5 percent of people actually have this type of cancer. What is the probability that someone who tested positive for this cancer does not actually have the cancer, i.e. what is the false positive rate?

Example – false positive rates
Using Bayes' Rule:
Prob(no cancer | positive) = Prob(positive | no cancer) x Prob(no cancer) / Prob(positive)
= (0.01 x 0.995) / (0.01 x 0.995 + 0.95 x 0.005)
= 0.00995 / 0.0147 ≈ 0.68
● About 68% of people who test positive for cancer do not actually have cancer!

Example – false positive rates
● What percent of the people who test positive for this cancer actually have cancer?

Example – HIV test
● HIV prevalence is 0.006 in the US population, so 0.994 do not have HIV. There is an HIV test that says positive 99% of the time if you have the disease (1% false negative). If you don't have the disease, the test says negative 98% of the time (2% false positive). What is the probability that someone actually has HIV if the test says positive?
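The cancer-screening calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `posterior` and its parameter names are made up for this sketch and do not come from the slides.

```python
def posterior(prior, p_pos_given_a, p_pos_given_not_a):
    """Return Prob(A | positive test) using Bayes' rule.

    prior             -- Prob(A), e.g. the prevalence of the disease
    p_pos_given_a     -- Prob(positive | A)
    p_pos_given_not_a -- Prob(positive | not A)
    (All names here are hypothetical, chosen for this sketch.)
    """
    # The two branches of the tree diagram that end in a positive test
    p_pos_and_a = prior * p_pos_given_a
    p_pos_and_not_a = (1 - prior) * p_pos_given_not_a
    # Prob(positive) is the sum of those two branches
    p_pos = p_pos_and_a + p_pos_and_not_a
    return p_pos_and_a / p_pos


# Cancer screening example: 0.5% prevalence, 95% positive if cancer,
# 1% positive if no cancer
p_cancer = posterior(prior=0.005, p_pos_given_a=0.95, p_pos_given_not_a=0.01)
p_no_cancer = 1 - p_cancer
print(round(p_no_cancer, 2))  # 0.68
```

The same function could be reused for the HIV example by passing a prior of 0.006, a positive rate of 0.99 for people with HIV, and a positive rate of 0.02 for people without it.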
Chapter 6: Modeling Random Events: The Normal and Binomial Models

Probability Model and Distributions
● A probability model is a description of how a statistician thinks data are produced
  ● Uniform
  ● Linear
  ● Normal
  ● Other
● A probability distribution or probability distribution function (pdf) is a table or graph that gives all the outcomes of a random experiment and their probabilities

Discrete vs. Continuous
● A random variable is called discrete if the outcomes are values that can be listed or counted
  ● Number of classes taken
  ● The roll of a die
● A random variable is called continuous if the outcomes cannot be listed because they occur over a range
  ● Time to finish the exam
  ● Exact weight

Discrete or Continuous
Classify the following as discrete or continuous
● Length of your left thumb
● Number of children in a family
● Number of devices in the house that connect to the Internet
● Sodium concentration in the bloodstream

Discrete Probability Distributions
● The most common way to display a pdf for discrete data is with a table
● The probability distribution table always has two columns (or rows)
  ● The first, x, displays all the possible outcomes
  ● The second, P(x), displays the probabilities for these outcomes

Examples of Probability Distribution tables
● Important: The sum of all the probabilities must equal 1

Die Roll
x      P(x)
1      1/6
2      1/6
3      1/6
4      1/6
5      1/6
6      1/6

Raffle Prize
x      P(x)
95     0.01
995    0.005
-5     0.985

Example – Playing Dice
● Roll a fair six-sided die. You will win $4 if you roll a 5 or a 6. You will lose $5 if you roll a 1. You will lose $1 if you roll a 2. For any other outcome, you will win or lose $0. What is the probability distribution table for the amount you will win?

Continuous Probability Distribution Functions
● Often represented as a curve
● The area under the curve between two values of x represents the probability of x being between the two values
● The total area under the curve must equal 1
● The curve cannot lie below the x-axis

The Normal Model
● The Normal Model is a good fit if:
  ● The distribution is unimodal
  ● The distribution is approximately symmetric
  ● The distribution is approximately bell shaped
● A Normal distribution is defined by the mean μ and standard deviation σ. Shorthand for a Normal distribution is N(μ, σ)
● The Normal distribution is also called the Gaussian distribution or the Bell Curve

Standardizing with z-scores
● Reminder: z-scores are standardized scores
● Z-scores are used to compare individual data values to their mean relative to their standard deviation
● The formula for calculating the z-score of a data value is:
z = (x - μ) / σ, where x is the data value, μ is the mean, and σ is the standard deviation

z-scores
● Standardizing data into z-scores shifts the data by subtracting the mean and rescales the values by dividing by their standard deviation
● Standardizing into z-scores does not change the shape of the distribution
● Standardizing into z-scores changes the center by making the mean 0
● Standardizing into z-scores changes the spread by making the standard deviation 1

Shape, center, and spread of z-scores
● Z-scores for normally distributed variables are also normally distributed, but with mean 0 and standard deviation 1: z ~ N(0, 1)
● Z-scores for a variable with some other distribution (right skewed, uniform, etc.) will follow the same shape as the original distribution, but with mean 0 and standard deviation 1

When is a z-score big?
● A z-score gives us an indication of how unusual a value is because it tells us how far it is from the mean, measured in standard deviations
● Remember that a negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean
● The larger a z-score is in absolute value (negative or positive), the more unusual it is

Calculating percentiles and probabilities with normal models
● Since z-scores tell us whether or not an observation is unusual, they can also tell us how unusual the observation is (i.e. how likely it is to observe such a value)
● So far we have only been able to tell how unusual an observation is if it was exactly 1, 2, or 3 standard deviations from the mean (using the Empirical Rule)
● What happens if we have a z-score of 2.5 or -1.3?

Calculating percentiles using the z-table
● ACT scores are distributed normally with mean 21 and standard deviation 5. If Adam got a 27 on his ACT, what is his percentile score?
Note: percentile score means what percent of scores is below the observed value
● First we compute our z-score: z = (27 - 21) / 5 = 1.20
● Now we go to the z-table.

Using the z-table
● We have z = 1.20
● z-values occur on the outer edges of the z-table; probabilities are in the middle
Note: It's best to round z-scores to 2 decimal places since the z-table displays z-scores up to two decimal places

Calculating percentiles using the z-table
● With a z-score of 1.20 we found the value 0.8849
● Adam's score is at the 88.49th percentile, i.e. he scored higher than 88.49% of the test takers.

Percentiles to Probabilities
If a score of 27 is higher than about 88.49% of all scores on this test, this means that the probability of scoring lower than 27 is 0.8849.
P(ACT score < 27) = 0.8849
Similarly, the probability of scoring higher than 27 is the complement of this probability:
P(ACT score > 27) = 1 - 0.8849 = 0.1151
Note: Complementary probabilities add up to 1; the area under the normal curve is equal to 1, so when we know the probability of one side, we get the other side by subtracting it from 1.

Example – z-scores
What percent of the standard normal distribution is found where z < -1.1? Draw a picture first.

Example – z-scores
What percent of the standard normal distribution is found where z > -2.09? Drawing a picture first may help.
a) 2.09% b) 98.17% c) 1.83% d) 0.0183%

Example – z-scores
What percent of the standard normal distribution is found where -1 < z < 2.5?

Example – z-scores
What percent of the standard normal distribution is found where z > 13?
a) approximately 100% b) approximately 0% c) 1% d) Cannot calculate with the z-table given; the table does not go up to z = 13

Example – z-scores
ACT scores are distributed normally with mean 21 and standard deviation 5. What percent of scores fall between 19 and 28 on the ACT?

Example – finding observed value from percentile
Let's assume SAT scores are ~ N(1500, 300). If Sophie scored at the 76th percentile, what was her actual score?
● We are given the percentile, so now we start in the middle of the z-table and work outward to find the z-score

Example – finding observed value from percentile
● From the table we found the corresponding z-score of 0.71 for the 76th percentile, so:
x = 1500 + 0.71 x 300 = 1713

Example – finding observed value from percentile
Let's assume SAT scores are ~ N(1500, 300). If Snookie scored at the 3rd percentile, what was her actual score?

Example – finding observed value from percentile
Let's assume SAT scores are ~ N(1500, 300). Between what two scores do the middle 50% of SAT test takers score?
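The two worked z-table examples above (Adam's ACT percentile and Sophie's SAT score) can also be checked in software. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of the printed z-table; this is a substitute for the table lookup, not the method used in the slides, and the variable names are chosen only for illustration.

```python
from statistics import NormalDist

# ACT scores ~ N(21, 5), as assumed in the example
act = NormalDist(mu=21, sigma=5)

# z-score for Adam's score of 27: (value - mean) / standard deviation
z_adam = (27 - act.mean) / act.stdev

# Percentile = area under the curve to the left of 27
p_below = act.cdf(27)
# Complement = area to the right of 27
p_above = 1 - p_below

print(round(z_adam, 2), round(p_below, 4), round(p_above, 4))

# SAT scores ~ N(1500, 300): go from a percentile back to a score.
# inv_cdf does the "start in the middle of the table" lookup without
# rounding z to two decimals, so the result is close to, but not exactly,
# the table-based answer.
sat = NormalDist(mu=1500, sigma=300)
sophie_score = sat.inv_cdf(0.76)
print(round(sophie_score))
```

Because the z-table rounds z to two decimal places (0.71), the table-based answer of 1713 differs slightly from the value computed here.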