Probability and Statistics
Chapter 3 – Modeling Distribution of Data

3.1 Measuring Location in a Distribution

Objective:
• MEASURE position using percentiles
• INTERPRET cumulative relative frequency graphs
• MEASURE position using z-scores
• TRANSFORM data
• DEFINE and DESCRIBE density curves

Measuring Position: Percentiles

The pth percentile of a distribution is the value with p percent of the observations less than it.

Example: Here are the scores of all 25 students in Mr. Pryor’s statistics class on their first test:

79 77 81 83 80 86 77 90 73 79 83 85 74 83 93 89 78 84 80 82 75 77 67 72 73

Problem: Use the scores on Mr. Pryor’s test to find the percentiles for the following students (how did they perform relative to their classmates?):
a) Jenny, who earned an 86.
b) Norman, who earned a 72.
c) Katie, who earned a 93.
d) The two students who earned scores of 80.

Example: PSAT scores

In October 2007, about 1.4 million college-bound high school juniors took the PSAT. The mean score on the Critical Reading test was 46.7 and the standard deviation was 11.3. Nationally, 6 percent of test takers earned a score higher than 65 on the Critical Reading test’s 20 to 80 scale.

Scott was one of 50 junior boys to take the PSAT at his school. He scored 65 on the Critical Reading test. This placed Scott at the 68th percentile within the group of boys. Looking at all 50 boys’ Critical Reading scores, the mean was 58.2 and the standard deviation was 9.4.

Write a sentence or two comparing Scott’s percentile among the national group of test takers and among the 50 boys at his school.

Measuring Position: z-Scores

A z-score tells us how many standard deviations from the mean an observation falls, and in what direction. To compare data from distributions with different means and standard deviations, we need to find a common scale. We accomplish this by using standard deviation units (z-scores) as our scale. Changing to these units is called standardizing.
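The percentile problem for Mr. Pryor’s class can be checked with a short Python sketch. It applies the definition used in these notes (percent of observations strictly less than the value); the function name percentile_of is only an illustrative choice, and other textbooks and software use slightly different percentile conventions.

```python
# Scores of all 25 students on Mr. Pryor's first test (from the example).
scores = [79, 77, 81, 83, 80, 86, 77, 90, 73, 79, 83, 85, 74,
          83, 93, 89, 78, 84, 80, 82, 75, 77, 67, 72, 73]


def percentile_of(value, data):
    """Percent of observations in `data` that are strictly less than `value`.

    This follows the definition in these notes; it is a hypothetical helper,
    not a library function.
    """
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


students = [("Jenny", 86), ("Norman", 72), ("Katie", 93),
            ("the two students with 80", 80)]
for name, score in students:
    pct = percentile_of(score, scores)
    print(f"{name}: a score of {score} is at percentile {pct:.0f}")
```

For example, 21 of the 25 scores are below Jenny’s 86, so she is at the 84th percentile.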
Standardizing data shifts the data by subtracting the mean and rescales the values by dividing by their standard deviation:

$z = \frac{\text{data value} - \text{mean}}{\text{standard deviation}}$ or $z = \frac{x - \mu}{\sigma}$

Standardizing does not change the shape of the distribution. It changes the center (shifts it to zero) and the spread (the standard deviation becomes one).

Example: PSAT scores (continued)
Refer to the previous example. Calculate and compare Scott’s z-score among these same two groups of test takers.

Assignment 3.1 Part 1, page 105 #3.1-3.6

Transforming Data

Transforming converts the original observations from the original units of measurement to another scale. Transformations can affect the shape, center, and spread of a distribution.

Effect of Adding (or Subtracting) a Constant
Adding the same number a (either positive, zero, or negative) to each observation:
• adds a to measures of center and location (mean, median, quartiles, percentiles), but
• does not change the shape of the distribution or measures of spread (range, IQR, standard deviation).

Effect of Multiplying (or Dividing) by a Constant
Multiplying (or dividing) each observation by the same nonzero number b (positive or negative):
• multiplies (divides) measures of center and location by b
• multiplies (divides) measures of spread by |b|, but
• does not change the shape of the distribution (a negative b mirrors it left to right).

Example:

Remember: Exploring Quantitative Data
To describe a distribution:
- Make a graph
- Look for overall patterns (shape, center, and spread) and outliers
- Calculate a numerical summary to describe the center (mean, median) and spread (minimum, maximum, Q1, Q3, range, IQR, standard deviation)

In addition to the above, sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve.
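The z-score formula and the transformation rules from this section can be checked numerically. The sketch below computes Scott’s two z-scores from the PSAT figures given earlier, then applies an illustrative shift (a = 5) and rescaling (b = 2) to Mr. Pryor’s test scores. The constants 5 and 2 and the helper name z_score are assumptions chosen only for the demonstration.

```python
from statistics import mean, pstdev


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma


# Scott's PSAT Critical Reading score of 65 in each group.
z_national = z_score(65, 46.7, 11.3)  # about 1.62
z_school = z_score(65, 58.2, 9.4)     # about 0.72
print(f"national z = {z_national:.2f}, school z = {z_school:.2f}")

# Mr. Pryor's 25 test scores, used to illustrate the transformation rules.
scores = [79, 77, 81, 83, 80, 86, 77, 90, 73, 79, 83, 85, 74,
          83, 93, 89, 78, 84, 80, 82, 75, 77, 67, 72, 73]

a, b = 5, 2  # illustrative constants, not from the notes
shifted = [x + a for x in scores]  # center moves by a, spread unchanged
scaled = [b * x for x in scores]   # center and spread both multiplied by b

print(mean(scores), mean(shifted), mean(scaled))
print(pstdev(scores), pstdev(shifted), pstdev(scaled))

# Standardizing every score gives mean 0 and standard deviation 1.
mu, sigma = mean(scores), pstdev(scores)
standardized = [z_score(x, mu, sigma) for x in scores]
print(round(mean(standardized), 6), round(pstdev(standardized), 6))
```

Scott’s score is further above the mean, in standard deviation units, in the national group than among the boys at his school, which matches the comparison of his percentiles.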
Density Curves

A density curve describes the overall pattern of a distribution. A density curve:
o Is always on or above the horizontal axis
o Has area exactly 1 underneath it
o The area under the curve and above any range of values is the proportion of all observations that fall in that range

The overall pattern of this histogram of the scores of all 947 seventh-grade students in Gary, Indiana, on the vocabulary part of the Iowa Test of Basic Skills (ITBS) can be described by a smooth curve drawn through the tops of the bars.

Median and Mean of a Density Curve
The median of a density curve is the equal-areas point, the point that divides the area under the curve in half.
The mean of a density curve is the balance point, at which the curve would balance if made of solid material.

Examples: Use the figure shown to answer the following questions.
1. Explain why this is a legitimate density curve.
2. About what proportion of observations lie between 7 and 8?
3. Mark the approximate location of the median.
4. Mark the approximate location of the mean. Explain why the mean and median have the relationship that they do in this case.

Examples:

Assignment 3.1 Part 2, page 110 #3.7-3.12
Assignment 3.1 Part 3, page 113 #3.13-3.20

3.2 Normal Distributions

Objectives:
• DESCRIBE and APPLY the 68-95-99.7 Rule
• DESCRIBE the standard Normal Distribution
• PERFORM Normal distribution calculations
• ASSESS Normality

Normal Distributions N(μ, σ) and the 68-95-99.7 Rule

All Normal curves have the same overall shape: symmetric, single-peaked, bell-shaped. A Normal distribution is described by a Normal density curve. A Normal distribution can be fully described by two parameters, its mean μ and standard deviation σ.

The mean, μ, of a Normal distribution is at the center of the symmetric Normal curve and is the same as the median.

The standard deviation σ controls the spread of a Normal curve. Curves with larger standard deviations are more spread out.
The standard deviation, σ, is the distance from the center to the change-of-curvature points on either side. A shortcut notation for the Normal distribution is N(μ, σ).

All Normal curves obey the 68-95-99.7% (Empirical) Rule. This rule tells us that in a Normal distribution approximately 68% of the data values fall within one standard deviation (1σ) of the mean, 95% of the values fall within 2σ of the mean, and 99.7% (almost all) of the values fall within 3σ of the mean.

Application of the 68-95-99.7 Rule
Distribution of the heights of young women aged 18 to 24
What is the mean μ?
What is the standard deviation σ?
What is the height range for 95% of young women?
What is the percentile for 64.5 in.?
What is the percentile for 59.5 in.?
What is the percentile for 67 in.?
What is the percentile for 72 in.?

Example: SAT performance
Students’ scores on the SAT Critical Reading test follow a Normal distribution with mean 500 and standard deviation 100. What percent of students earn scores above 700?

Assignment 3.2 Part 1, page 121 #3.21-3.26

The Standard Normal Distribution

The standard Normal distribution is the Normal distribution with mean 0 and standard deviation 1. If a variable x has any Normal distribution N(μ, σ) with mean μ and standard deviation σ, then the standardized variable

$z = \frac{x - \mu}{\sigma}$

has the standard Normal distribution, N(0, 1).

Z-Score Table

Because all Normal distributions are the same when we standardize, we can find areas under any Normal curve from a single table. Table A is a table of areas under the standard Normal curve. The table entry for each value z is the area under the curve to the left of z.

Example: Table A practice
Use Table A to find the proportion of observations from a standard Normal distribution that falls in each of the following regions. In each case, sketch a standard Normal curve and shade the area representing the region.
(a) z ≤ −2.25
(b) z ≥ −2.25
(c) z > 1.77
(d) −2.25 < z < 1.77

Example: Finding z-scores from proportions
Use Table A to find the value z of a standard Normal variable that satisfies each of the following conditions. In each case, sketch a standard Normal curve with your value of z marked on the axis.
(a) The point z with 70% of the observations falling below it.
(b) The point z with 85% of the observations falling above it.
(c) Find the number z such that the proportion of observations less than z is 0.8.
(d) Find the number z such that 90% of all observations are greater than z.

Assignment 3.2 Part 2, page 127 #3.27-3.32

4-Step Process: How to Solve Problems Involving Normal Distributions
Step 1:
Step 2:
Step 3:
Step 4:

Normal calculations

Example:
a. Women’s heights are approximately Normal, with distribution N(64.5, 2.5). What proportion of all young women are less than 68 inches tall?
b. On the driving range, Tiger Woods practices his swing with a particular club by hitting many, many balls. When Tiger hits his driver, the distance the ball travels follows a Normal distribution with mean 304 yards and standard deviation 8 yards. What percent of Tiger’s drives travel at least 290 yards?
c. What percent of Tiger’s drives travel between 305 and 325 yards?

Using Table A in Reverse

Example:
d. What distance would a ball have to travel to be at the 80th percentile of Tiger’s drive lengths?

Assignment 3.2 Part 3, page 132 #3.33-3.38

Summary
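The Normal calculations in Section 3.2 can be checked in software instead of Table A. The sketch below uses NormalDist from Python’s standard statistics module (Python 3.8 or later); its cdf method returns the area to the left of a value, the same quantity Table A reports, and inv_cdf works like using Table A in reverse. The variable names are illustrative, and software results can differ slightly from table answers because Table A rounds z to two decimal places.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # the standard Normal, N(0, 1)

# 68-95-99.7 rule: area within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}
for k, area in within.items():
    print(f"within {k} sd of the mean: {area:.4f}")

# SAT Critical Reading, N(500, 100): proportion scoring above 700 (z = 2).
# The 68-95-99.7 rule gives about 2.5%; the exact area is a little smaller.
sat = NormalDist(500, 100)
p_above_700 = 1 - sat.cdf(700)

# a. Women's heights, N(64.5, 2.5): proportion shorter than 68 inches (z = 1.4).
heights = NormalDist(64.5, 2.5)
p_under_68 = heights.cdf(68)

# b. Tiger's drives, N(304, 8): proportion of at least 290 yards (z = -1.75).
drives = NormalDist(304, 8)
p_at_least_290 = 1 - drives.cdf(290)

# c. Proportion of drives between 305 and 325 yards.
p_between = drives.cdf(325) - drives.cdf(305)

# d. Table A in reverse: the distance at the 80th percentile.
d80 = drives.inv_cdf(0.80)

print(f"SAT above 700:          {p_above_700:.4f}")
print(f"height under 68 in.:    {p_under_68:.4f}")
print(f"drive at least 290 yd:  {p_at_least_290:.4f}")
print(f"drive 305 to 325 yd:    {p_between:.4f}")
print(f"80th percentile drive:  {d80:.1f} yards")
```

Each calculation follows the same pattern as the table method: state the distribution, convert the boundary to a z-score (done internally here), and read off the area to the left.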