The Standard Deviation as a Ruler and the Normal Model

The results of a 50-point test look bad, and I know that I made the test too hard compared to the test from last semester. So I decide to add 5 points to each student's score.

Ann: 40 pts
Bob: 25 pts
Carol: 37 pts
Don: 14 pts
Eva: 14 pts

Adding (or subtracting) a constant to every data value adds (or subtracts) the same constant to all measures of position: center, percentiles, maximum, and minimum. The shape of the distribution and the measures of spread (range, IQR, standard deviation) remain unchanged.

[Figure: histograms showing the shift from men's actual weights to kilograms above recommended weight.]

After I "curve" the scores, I realize that my syllabus indicates that each test is worth 150 points. How can I fix the scores to align with the policy listed on the syllabus?

Student Ann: 45 pts 50 pts
Student Bob: 30 pts 35 pts
Student Carol: 42 pts 47 pts
Student Don: 19 pts 24 pts
Student Eva: 19 pts 24 pts

When we multiply (or divide) all the data values by any constant, all measures of position (such as the mean, median, and percentiles) and measures of spread (such as the range, the IQR, and the standard deviation) are multiplied (or divided) by that same constant.

The men's weight data set measured weights in kilograms. If we want to think about these weights in pounds, we would rescale the data.

[Figure: the men's weights rescaled from kilograms to pounds.]

Scores on the ACT college entrance exam in a recent year were roughly normal, with mean 21.2 and standard deviation 4.8. John scores 27 on the ACT. Scores on the SAT Reasoning college entrance exam in the same year were roughly normal, with mean 1511 and standard deviation 194. Susan scores 1718 on the SAT. Who has a higher score? Which student is considered a better student?

The trick in comparing very different-looking values is to use standard deviations as our rulers.
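Returning briefly to the shifting and rescaling rules, a minimal Python sketch can check them on the five raw test scores. The multiplier 3 (moving a 50-point test onto a 150-point scale) and the helper name `summarize` are assumptions made for illustration; they do not come from the slides.

```python
import statistics

# Raw scores on the 50-point test, taken from the example in the text.
scores = [40, 25, 37, 14, 14]


def summarize(values):
    """Return (mean, median, range, sample standard deviation)."""
    return (
        statistics.mean(values),
        statistics.median(values),
        max(values) - min(values),
        statistics.stdev(values),
    )


# Shift: add the 5-point curve to every score.
shifted = [y + 5 for y in scores]

# Rescale: multiply every score by a constant.
# The factor 3 (50 points -> 150 points) is an assumption, not from the slides.
rescaled = [y * 3 for y in shifted]

for label, data in [("raw", scores), ("shifted", shifted), ("rescaled", rescaled)]:
    mean, median, value_range, sd = summarize(data)
    print(f"{label:>8}: mean={mean:.2f} median={median} "
          f"range={value_range} sd={sd:.2f}")
```

In the printed summary, the shift should move the mean and median by 5 while leaving the range and standard deviation alone, and the rescaling should multiply all four numbers by 3.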
The standard deviation tells us how the whole collection of values varies, so it's a natural ruler for comparing an individual to a group. As the most common measure of variation, the standard deviation plays a crucial role in how we look at data.

We compare individual data values to their mean, relative to their standard deviation, using the following formula:

$z = \dfrac{y - \bar{y}}{s}$

We call the resulting values standardized values, or z-scores. Standardized values have no units. z-scores measure the distance of each data value from the mean in standard deviations. A negative z-score tells us that the data value is below the mean, while a positive z-score tells us that the data value is above the mean.

Standardized values have been converted from their original units to the standard statistical unit of standard deviations from the mean. Thus, we can compare values that are measured on different scales, with different units, or from different populations.

Standardizing data into z-scores shifts the data by subtracting the mean and rescales the values by dividing by their standard deviation (SD).
◦ Standardizing into z-scores does not change the shape of the distribution.
◦ Standardizing into z-scores changes the center by making the mean 0.
◦ Standardizing into z-scores changes the spread by making the standard deviation 1.

A useful family of models for unimodal, symmetric distributions, usually represented by a bell-shaped curve, is called the Normal model. Biological measures (like height, weight, heart rate, head circumference, etc.) are often normally distributed.

We write N(μ,σ) to represent a Normal model with a mean of μ and a standard deviation of σ. Summaries of data, like the sample mean $\bar{y}$ and standard deviation $s$, are written with Latin letters. Such summaries of data are called statistics.
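As a rough illustration of the z-score formula, the sketch below standardizes John's ACT score and Susan's SAT score using the means and standard deviations quoted in the text, then standardizes the five raw test scores from the opening example. The function name `z_score` is a hypothetical helper, not something from the slides.

```python
import statistics


def z_score(y, mean, sd):
    """Distance of y from the mean, measured in standard deviations."""
    return (y - mean) / sd


# ACT and SAT figures as quoted in the text.
john_z = z_score(27, 21.2, 4.8)
susan_z = z_score(1718, 1511, 194)
print(f"John:  z = {john_z:.2f}")
print(f"Susan: z = {susan_z:.2f}")

# Standardizing a whole data set: subtract the mean, divide by the SD.
scores = [40, 25, 37, 14, 14]
y_bar = statistics.mean(scores)
s = statistics.stdev(scores)
z_values = [z_score(y, y_bar, s) for y in scores]

# Up to floating-point rounding, the z-scores have mean 0 and SD 1.
print(f"mean of z-scores = {statistics.mean(z_values):.6f}")
print(f"SD of z-scores   = {statistics.stdev(z_values):.6f}")
```

John's z-score comes out near 1.21 and Susan's near 1.07, so measured in standard deviations John's result sits slightly further above its mean.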
When we standardize Normal data, we still call the standardized value a z-score, and we write

$z = \dfrac{y - \mu}{\sigma}$

Once we have standardized, we need only one model: the N(0,1) model, which is called the standard Normal model (or the standard Normal distribution).

Normal models give us an idea of how extreme a value is by telling us how likely it is to find one that far from the mean. It turns out that in a Normal model:
◦ about 68% of the values fall within 1 SD of the mean
◦ about 95% of the values fall within 2 SD of the mean
◦ about 99.7% of the values fall within 3 SD of the mean.

A newborn baby weighs 8.5 lbs. Is that normal?

When we use the Normal model, we are assuming the distribution is Normal. We cannot check this assumption in practice, so when we have the actual data, we make a histogram of the distribution and check the Nearly Normal Condition:
◦ Is the shape of the data's distribution unimodal?
◦ Is it bell-shaped?
◦ Is it roughly symmetric?

When a data value doesn't fall exactly 1, 2, or 3 standard deviations from the mean, we can look it up in a table of Normal percentiles. Table Z in Appendix D provides us with Normal percentiles, but many calculators and statistics computer packages provide these as well. Table Z is the standard Normal table, so we have to convert our data to z-scores before using it.

[Figure: finding the area to the left of a z-score of 1.80 in Table Z.]

Sometimes we start with areas and need to find the corresponding z-score or even the original data value.

Example: What z-score represents the first quartile in a Normal model? Look in Table Z for an area of 0.2500. The exact area is not there, but 0.2514 is pretty close. This area is associated with z = -0.67, so the first quartile is 0.67 standard deviations below the mean.

The distribution of scores on tests such as the SAT college entrance examination is close to normal.
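For readers without Table Z at hand, the lookups described above can be approximated in software. The sketch below uses `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later) as a stand-in for the table; the variable names are illustrative only.

```python
from statistics import NormalDist

# The standard Normal model N(0, 1).
standard_normal = NormalDist(mu=0, sigma=1)

# Share of values within 1, 2, and 3 SDs of the mean
# (the 68-95-99.7 rule rounds these figures).
within = {}
for k in (1, 2, 3):
    within[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD: {within[k]:.4f}")

# Area to the left of z = 1.80, the lookup shown in the Table Z figure.
area_left = standard_normal.cdf(1.80)
print(f"area to the left of z = 1.80: {area_left:.4f}")

# Going from an area back to a z-score: the first quartile.
q1_z = standard_normal.inv_cdf(0.25)
print(f"z-score of the first quartile: {q1_z:.4f}")
```

The inverse lookup returns a value slightly more precise than the two-decimal z = -0.67 read from the table, because the table only lists z-scores to two decimal places.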
Scores on each of the three sections (math, critical reading, writing) of the SAT are adjusted so that the mean score is about 500 and the standard deviation is about 100.
◦ What percent of scores fall between 200 and 800?
◦ What percent of scores are above 700?
◦ How high must a student score to fall in the top 25%?
◦ What proportion of SAT scores are above 640?

The scale of scores on an IQ test is approximately normal with mean 100 and standard deviation 15. The organization MENSA, which calls itself "the high IQ society," requires an IQ score of 130 or higher for membership. What percent of adults would qualify for membership?
a) 95%
b) 5%
c) 2.5%
d) 17%

Scores on the ACT college entrance exam in a recent year were roughly normal, with mean 21.2 and standard deviation 4.8. John scores 27 on the ACT. Scores on the SAT Reasoning college entrance exam in the same year were roughly normal, with mean 1511 and standard deviation 194. Susan scores 1718 on the SAT. Who has a higher score?

Pages 147–152, Problems 3, 5, 9, 11, 15, 19, 27, 29, 33, 35, 37, 39, 41, 43, 49, 53, 57.
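One possible way to check answers to the SAT section and IQ questions above is sketched here, assuming the scores follow the stated Normal models exactly (N(500, 100) for an SAT section, N(100, 15) for IQ). It again relies on `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# SAT section scores modeled as N(500, 100), as stated in the text.
sat = NormalDist(mu=500, sigma=100)

# Percent of scores between 200 and 800 (3 SDs either side of the mean).
between_200_800 = sat.cdf(800) - sat.cdf(200)

# Percent of scores above 700 (2 SDs above the mean).
above_700 = 1 - sat.cdf(700)

# Score needed to be in the top 25%: the 75th percentile.
top_25_cutoff = sat.inv_cdf(0.75)

# Proportion of scores above 640 (z = 1.4).
above_640 = 1 - sat.cdf(640)

# IQ scores modeled as N(100, 15); membership requires 130 or higher.
iq = NormalDist(mu=100, sigma=15)
mensa_share = 1 - iq.cdf(130)

print(f"between 200 and 800: {between_200_800:.2%}")
print(f"above 700:           {above_700:.2%}")
print(f"top 25% cutoff:      {top_25_cutoff:.0f}")
print(f"above 640:           {above_640:.2%}")
print(f"IQ of 130 or higher: {mensa_share:.2%}")
```

An IQ of 130 is 2 standard deviations above the mean, so the 68-95-99.7 rule puts about 2.5% of adults above it; the model's own figure is a little lower, about 2.3%, because the 95% in the rule is itself rounded.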