6.2: The normal distribution

Normal Distributions
• Characterized by a symmetric, bell-shaped (mound-shaped) curve.
• Examples: heights, weights, standardized test scores.
• A particular normal distribution is determined by
  ◦ the mean, µ
  ◦ the standard deviation, σ

Normal Distribution and Deviation from the Mean

Example: Adult Heights
• 95% of female adult heights are between 58 and 72 inches.
• 95% of male adult heights are between 62 and 78 inches.

Z-scores (revisited)
• The multiples 1, 2, and 3, that is, the number of standard deviations from the mean, are denoted by z.
• For a particular observation x, its z-score is computed by z = (x - µ) / σ.
• For each fixed number z, the probability within z standard deviations of the mean is the area under the normal curve between µ - zσ and µ + zσ.

Finding Probabilities for the Normal Distribution
• What if we want the probability within 1.43 standard deviations of the mean?
• For normal distributions there is a table we can use (Table A in the back of the book). It tabulates the normal cumulative probability falling below the point µ + zσ.

Finding Probabilities for the Normal Distribution: P(-1.43 < z < 1.43)
To find P(-1.43 < z < 1.43), we proceed in 3 steps:
1) Find P(z < 1.43), using Table A or a calculator.
2) This also gives P(z > 1.43) = 1 - P(z < 1.43), which by symmetry equals P(z < -1.43).
3) P(-1.43 < z < 1.43) = P(z < 1.43) - P(z < -1.43).

To use Table A:
• Find the corresponding z-score.
• Look up the closest standardized score (z) in the table.
  ◦ The first column gives z to the first decimal place.
  ◦ The first row gives the second decimal place of z.
• The corresponding probability found in the body of the table gives the probability of falling below the z-score.

[Table: Part of Table A]

The Probability Less Than 1.43 Standard Deviations
P(height < 70) = P(z < 1.43) = 0.9236

Example 1: Mensa
Mensa is a society of high-IQ people with IQ test scores at the 98th percentile or higher.
The Stanford-Binet IQ test scores that are used for admission are approximately normally distributed with a mean of 100 and a standard deviation of 16.
• How many standard deviations above the mean is the 98th percentile?
• What is the IQ score for that percentile?

Example 1: Mensa Solution
• The 98th percentile corresponds to a z-score of 2.05.
• So a person needs an IQ score of at least 100 + 2.05(16) = 132.8, or about 133.

Example 2: SAT and ACT scores
• The SAT and ACT exams are the two primary college entrance exams. Both have a mathematics component.
• The scores for the SAT range from 200 to 800 and are normally distributed with a mean of 500 and a standard deviation of 100.
• The scores for the ACT range from 1 to 36 and are normally distributed with a mean of 21 and a standard deviation of 4.7.
• Which is better, a 650 on the SAT or a 30 on the ACT? We will answer by looking at percentiles. Begin by finding z-scores!

Example 2: SAT and ACT Solution
• 650 on the SAT (mean is 500, std. dev. is 100)
  ◦ The z-score is (650 - 500)/100 = 1.50. From Table A, this is in the 93rd percentile. In other words, 7% of people scored above 650.
• 30 on the ACT (mean is 21, std. dev. is 4.7)
  ◦ The z-score is (30 - 21)/4.7 = 1.91. From Table A, this is in the 97th percentile. In other words, 3% of people scored above 30.
• Thus, a 30 on the ACT is better than a 650 on the SAT.

Finding Probabilities on TI-83/84
• Normalcdf(low, high, mean, std. dev.)
  ◦ Calculates P(a < X < b) when X has a normal distribution with mean mu and standard deviation sigma.
  ◦ On calculator: “2nd” “DISTR” “2” “a,b,mu,sigma) ENTER”
• Invnorm(% to left, mean, std. dev.)
  ◦ Finds the value a so that P(X ≤ a) = p, when X has a normal distribution with mean mu and standard deviation sigma.
  ◦ On calculator: “2nd” “DISTR” “3” “p,mu,sigma) ENTER”

Finding Probabilities on TI-83/84 (cont.)
• Normalcdf(low, high, mean, std. dev.)
  ◦ What percent of women are between 65 and 70 inches?
  ◦ On calculator: “2nd” “DISTR” “2” “65,70,65,3.5) ENTER”
  ◦ Answer: 42.34%
• Invnorm(% to left, mean, std. dev.)
  ◦ How tall does a woman need to be to be in the top 10%?
  ◦ On calculator: “2nd” “DISTR” “3” “.9,65,3.5) ENTER”
  ◦ Answer: 69.5 inches, or 5’9.5”

Building an Interval that Contains a Certain Percentage of the Data
• Suppose we have a normal distribution. We want the interval that contains 95% of the data (in terms of z values, i.e., between -z* and z*).
• The Empirical Rule told us “about 2 standard deviations,” but we want to be more precise.
• This means that 5% of the data must lie outside the interval, and of this amount 2.5% will be to the left of -z*.
• Since 2.5% is 0.0250, we look in Table A for a z-score with an entry of 0.0250. This gives us -1.96. We conclude that 95% of the data lies between z = -1.96 and z = 1.96.

Building an Interval that Contains a Certain Percentage of the Data (cont.)
• For normal distributions, 95% of the data has a z-score between -1.96 and 1.96.
• Recall that female adult heights are normally distributed with a mean of 65 inches and a standard deviation of 3.5 inches.
• We can convert the z-scores into heights: 65 - 1.96(3.5) = 58.14 and 65 + 1.96(3.5) = 71.86. We conclude that 95% of adult women have a height between 58.14 inches and 71.86 inches.

“Unusual” Observations
• Adult male heights are normally distributed with a mean of 70 inches and a standard deviation of 4 inches.
• Consider these two men:
  ◦ Sam is 79 inches tall (z-score is 2.25; corresponds to 0.9878 in Table A).
  ◦ Joe is 61 inches tall (z-score is -2.25; corresponds to 0.0122 in Table A).
• For a given person, we can think of “unusual” in two ways:
  ◦ Sam is unusually tall; he is in the rarest 1.22% on the tall side.
  ◦ Joe is unusually short; he is in the rarest 1.22% on the short side.
  ◦ Both have an unusual height; counting both tails, they are in the rarest 2.44%.

P-Values
• The P-value is a measure of just how unusual the data is, in terms of what percentage of the data is even more unusual than the given data.
• Recall that
  ◦ Sam is unusually tall; he is in the rarest 1.22% on the tall side.
  ◦ Joe is unusually short; he is in the rarest 1.22% on the short side.
  ◦ Both have an unusual height; counting both tails, they are in the rarest 2.44%.
• This can be restated as
  ◦ Sam’s one-tail (right-tail) P-value is 0.0122.
  ◦ Joe’s one-tail (left-tail) P-value is 0.0122.
  ◦ Both have a two-tail P-value of 0.0244.

[Figure: Graphical Depiction of P-Values]

Other Types of Distributions
• We will also work with other distributions. Some will not be symmetric.
• For a distribution like this, we are only interested in one-tail (right-tail) P-values.
• A commuting time of 45 minutes has a P-value of 0.15.

Finding Probabilities on TI-83/84
• Normalcdf(low, high)
  ◦ What is the percentage of data between 1 and 1.75 standard deviations above the mean? Normalcdf(1, 1.75)
• Normalcdf(low, high, mean, std. dev.)
  ◦ What is the percentage of women between 62 and 70 inches? Normalcdf(62, 70, 65, 3.5)
• Invnorm(% to left)
  ◦ What is the z-score for data in the top 10%? Invnorm(0.9)
• Invnorm(% to left, mean, std. dev.)
  ◦ How tall does a woman need to be to be in the top 10%? Invnorm(0.9, 65, 3.5)
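
For readers without a TI-83/84, the calculator commands used in this section can be approximated in Python. The following is a minimal sketch, not part of the original slides. It assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`. The function names `normalcdf` and `invnorm` are hypothetical helpers chosen here to mirror the calculator commands. The default mean of 0 and standard deviation of 1 play the role of the two-argument calculator forms, which work in z-scores.

```python
from statistics import NormalDist


def normalcdf(low, high, mean=0.0, sd=1.0):
    """Return P(low < X < high) for X normal with the given mean and sd.

    Hypothetical helper mirroring the calculator's Normalcdf command.
    """
    dist = NormalDist(mean, sd)
    return dist.cdf(high) - dist.cdf(low)


def invnorm(p, mean=0.0, sd=1.0):
    """Return the value a with P(X <= a) = p for X normal with mean and sd.

    Hypothetical helper mirroring the calculator's Invnorm command.
    """
    return NormalDist(mean, sd).inv_cdf(p)


if __name__ == "__main__":
    # The four calculator questions that close this section.
    print("Between 1 and 1.75 SDs above the mean:", normalcdf(1, 1.75))
    print("Women between 62 and 70 inches:", normalcdf(62, 70, 65, 3.5))
    print("z-score cutting off the top 10%:", invnorm(0.9))
    print("Height cutting off the top 10%:", invnorm(0.9, 65, 3.5))

    # Earlier worked examples, for comparison with the slide values.
    # Slides report 42.34% of women between 65 and 70 inches.
    print("Women between 65 and 70 inches:", normalcdf(65, 70, 65, 3.5))
    # Slides report z* = 1.96 for the central 95% interval.
    print("z* for a 95% interval:", invnorm(0.975))

    # One-tail and two-tail P-values for Sam (79 inches; mean 70, sd 4).
    men = NormalDist(70, 4)
    right_tail = 1 - men.cdf(79)
    print("Sam's right-tail P-value:", right_tail)      # slides report 0.0122
    print("Two-tail P-value:", 2 * right_tail)          # slides report 0.0244
```

The results may differ from the Table A answers in the last digit or two, because the table rounds z to two decimal places while the code uses the exact z-score.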