IQL Chapter 5 – What is Normal?
Statistical Reasoning for Everyday Life, Bennett, Briggs, Triola, 3rd Edition

5.1 – The Normal Shape

THE NORMAL SHAPE
All normal distributions have the same characteristic bell shape, but they can differ in their mean and in their variation. A normal distribution can be fully described with just two numbers: its mean and its standard deviation.

Definition
The normal distribution is a symmetric, bell-shaped distribution with a single peak. Its peak corresponds to the mean, median, and mode of the distribution. Its variation can be characterized by the standard deviation of the distribution.

THE NORMAL DISTRIBUTION AND RELATIVE FREQUENCIES
The relative frequency for any range of data values is the area under the curve covering that range of values.

Relative Frequencies and the Normal Distribution
The area that lies under the normal distribution curve corresponding to a range of values on the horizontal axis is the relative frequency of those values. Because the total relative frequency must be 1, the total area under the normal distribution curve must equal 1, or 100%.

WHEN CAN WE EXPECT A NORMAL DISTRIBUTION?

Conditions for a Normal Distribution
A data set that satisfies the following four criteria is likely to have a nearly normal distribution:
1. Most data values are clustered near the mean, giving the distribution a well-defined single peak.
2. Data values are spread evenly around the mean, making the distribution symmetric.
3. Larger deviations from the mean become increasingly rare, producing the tapering tails of the distribution.
4. Individual data values result from a combination of many different factors, such as genetic and environmental factors.
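The link between area and relative frequency can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the mean of 100, the standard deviation of 15, and the helper name `relative_frequency` are hypothetical choices made for illustration, not values from the text.

```python
from statistics import NormalDist


def relative_frequency(dist, low, high):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)


# Hypothetical normal distribution (assumed values, not from the text).
dist = NormalDist(mu=100, sigma=15)

# Symmetry: half of the area lies to the left of the peak at the mean.
below_mean = dist.cdf(100)

# Relative frequency of values between 85 and 115.
middle = relative_frequency(dist, 85, 115)

# A range wide enough to cover essentially the whole curve.
almost_all = relative_frequency(dist, 0, 200)

print(f"below the mean:      {below_mean:.3f}")
print(f"between 85 and 115:  {middle:.3f}")
print(f"between 0 and 200:   {almost_all:.3f}")
```

The last value comes out very close to 1, which matches the statement that the total area under the curve equals 1, or 100%.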
5.2 – Properties of the Normal Distribution

Our friends the Greeks:
µ = mean
σ = standard deviation

The 68-95-99.7 Rule for a Normal Distribution
• About 68% (more precisely, 68.3%), or just over two-thirds, of the data points fall within 1 standard deviation of the mean.
• About 95% (more precisely, 95.4%) of the data points fall within 2 standard deviations of the mean.
• About 99.7% of the data points fall within 3 standard deviations of the mean.

APPLYING THE 68-95-99.7 RULE
We can apply the 68-95-99.7 rule to determine when data values lie 1, 2, or 3 standard deviations from the mean. For example, suppose that 1,000 students take an exam and the scores are normally distributed with a mean of µ = 75 and a standard deviation of σ = 7.

Figure 5.19 A normal distribution of test scores with a mean of 75 and a standard deviation of 7. (a) 68% of the scores lie within 1 standard deviation of the mean. (b) 95% of the scores lie within 2 standard deviations of the mean.

Identifying Unusual Results
In statistics, we often need to distinguish values that are typical, or “usual,” from values that are “unusual.” By applying the 68-95-99.7 rule, we find that about 95% of all values from a normal distribution lie within 2 standard deviations of the mean. This implies that, among all values, about 5% lie more than 2 standard deviations away from the mean. We can use this property to identify values that are relatively “unusual”:

Unusual values are values that are more than 2 standard deviations away from the mean.

STANDARD SCORES
The 68-95-99.7 rule applies only to data values that are 1, 2, or 3 standard deviations from the mean. We can generalize this rule if we know precisely how many standard deviations from the mean (µ) a particular value lies. The number of standard deviations a data value lies above or below the mean (µ) is called its standard score (or z-score), often abbreviated by the letter z.
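As a rough sketch, the exam example and the two-standard-deviation test for unusual values can be worked through in Python. The mean, standard deviation, and number of students come from the text; the function names (`rule_range`, `standard_score`, `is_unusual`) and the two sample scores, 82 and 58, are illustrative assumptions.

```python
mean = 75.0      # exam mean from the text
std_dev = 7.0    # exam standard deviation from the text
students = 1000  # number of test takers from the text

# Approximate shares given by the 68-95-99.7 rule.
RULE = {1: 0.68, 2: 0.95, 3: 0.997}


def rule_range(k):
    """Range of scores within k standard deviations of the mean."""
    return mean - k * std_dev, mean + k * std_dev


def standard_score(value):
    """Standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


def is_unusual(value):
    """Unusual means more than 2 standard deviations from the mean."""
    return abs(standard_score(value)) > 2


for k, share in RULE.items():
    low, high = rule_range(k)
    count = round(share * students)
    print(f"within {k} SD: {low:.0f} to {high:.0f}, about {count} students")

# Two hypothetical scores: one typical, one unusual.
for score in (82, 58):
    z = standard_score(score)
    print(f"score {score}: z = {z:.2f}, unusual = {is_unusual(score)}")
```

A score of 82 sits exactly 1 standard deviation above the mean, so it counts as usual, while 58 lies more than 2 standard deviations below the mean and is flagged as unusual.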
Some examples of standard scores:
• The standard score of the mean is z = 0, because it is 0 standard deviations from the mean.
• The standard score of a data value 1.5 standard deviations above the mean is z = 1.5.
• The standard score of a data value 2.4 standard deviations below the mean is z = -2.4.

Computing Standard Scores
The number of standard deviations a data value lies above or below the mean is called its standard score (or z-score), defined by

z = standard score = (data value − mean) / standard deviation

The standard score is positive for data values above the mean and negative for data values below the mean.

STANDARD SCORES AND PERCENTILES
Once we know the standard score of a data value, the properties of the normal distribution allow us to find its percentile in the distribution. This is usually done with a standard score table, such as Table 5.1.

TOWARD PROBABILITY
Suppose you pick a baby at random and ask whether the baby was born more than 15 days prior to his or her due date. Because births are normally distributed around the due date with a standard deviation of 15 days, we know that 16% of all births occur more than 15 days prior to the due date (see Example 3). For an individual baby chosen at random, we can therefore say that there’s a 0.16 chance (about 1 in 6) that the baby was born more than 15 days early. In other words, the properties of the normal distribution allow us to make a probability statement about an individual. In this case, our statement is that the probability of a birth occurring more than 15 days early is 0.16. This example shows that the properties of the normal distribution can be restated in terms of probability.

5.3 – The Central Limit Theorem

The Central Limit Theorem
Suppose we take many random samples of size n for a variable with any distribution (not necessarily a normal distribution) and record the distribution of the means of each sample. Then,
1. The distribution of means will be approximately a normal distribution for large sample sizes.
2. The mean of the distribution of means approaches the population mean, µ, for large sample sizes.
3. The standard deviation of the distribution of means approaches σ/√n for large sample sizes, where σ is the standard deviation of the population.
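The three statements of the Central Limit Theorem can be explored with a small simulation. This is only a sketch under stated assumptions: the population is a hypothetical exponential distribution (clearly not normal, with mean 1 and standard deviation 1), and the sample size of 50, the 5,000 repetitions, and the function names are arbitrary illustrative choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: exponential with mean 1 and SD 1.
POP_MEAN = 1.0
POP_SD = 1.0


def sample_mean(n):
    """Mean of one random sample of size n from the population."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


def distribution_of_means(n, num_samples=5000):
    """Means of many independent random samples of size n."""
    return [sample_mean(n) for _ in range(num_samples)]


n = 50
means = distribution_of_means(n)

observed_mean = statistics.fmean(means)   # compare with the population mean
observed_sd = statistics.stdev(means)     # compare with sigma / sqrt(n)
predicted_sd = POP_SD / n ** 0.5

# Share of sample means within 1 predicted standard deviation of the
# population mean; a normal distribution would put about 68% there.
within_one = sum(abs(m - POP_MEAN) <= predicted_sd for m in means) / len(means)

print(f"mean of sample means: {observed_mean:.3f} (population mean {POP_MEAN})")
print(f"SD of sample means:   {observed_sd:.3f} (predicted {predicted_sd:.3f})")
print(f"share within 1 SD:    {within_one:.3f}")
```

Rerunning with a larger n should pull the observed standard deviation down in line with σ/√n, and the share within one standard deviation should stay near 68% even though the underlying population is strongly skewed.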