AP Statistics
Chapter 2: The Normal Distribution

Section 2.1: Describing Location in a Distribution

Percentiles and Ogives
• An ogive can determine the relative standing of an observation in a quantitative distribution.
• Ogive = relative cumulative frequency graph.
• The pth percentile of a distribution is the value with p percent of the observations at or below it.
• An ogive is a line graph that rises from left to right and never falls.

Standard Normal Calculations
• All normal distributions can be measured in units of size σ about the mean µ as center.
• Changing to these units is called standardizing.
• Standardized values have no units.
• Standardizing makes all normal distributions into a single distribution, a distribution that is still normal.
• Standardizing a variable that has any normal distribution produces a new variable that has the standard normal distribution.

Z-scores
• Tell how many standard deviations the observation falls away from the mean.
• Measure the distance of each data point from the mean in standard deviations. Ex: a z-score of 1.5 is one and a half standard deviations above the mean.
• Give the direction from the mean as well.
  o Positive z-score = the observation lies above the mean
  o Negative z-score = the observation lies below the mean
• Each z-score has a corresponding proportion of area under the curve assigned to it (found on the z-table).
• Example: The height of young women is approximately normal with µ = 64.5 inches and σ = 2.5 inches, so
  $z = \frac{\text{height} - 64.5}{2.5}$

Linear Transformations
• Change the original variable x into a new variable a + bx.
• a shifts all observations upward or downward.
• b changes the size of the unit of measurement.
• Adding a to each observation adds a to the mean and median but does not change the spread.
• Multiplying each observation by a positive number b multiplies the mean, median, IQR, and standard deviation by b.
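As a rough illustration of the z-score and linear-transformation rules above, the sketch below treats standardizing as one particular transformation a + bx, with a = −µ/σ and b = 1/σ. The list of heights is made up for illustration, and the helper names `z_score` and `linear_transform` are hypothetical; only µ = 64.5 and σ = 2.5 come from the notes.

```python
from statistics import mean, pstdev

# Hypothetical heights in inches (made-up values, for illustration only).
heights = [60.0, 62.5, 64.5, 66.0, 69.5]

MU, SIGMA = 64.5, 2.5  # model parameters from the young-women example


def z_score(x, mu=MU, sigma=SIGMA):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


def linear_transform(data, a, b):
    """Apply x -> a + b*x to every observation."""
    return [a + b * x for x in data]


# Standardizing is itself a linear transformation: a = -mu/sigma, b = 1/sigma.
z_values = linear_transform(heights, a=-MU / SIGMA, b=1 / SIGMA)

# Changing units (inches to centimeters): a = 0, b = 2.54.
heights_cm = linear_transform(heights, a=0.0, b=2.54)

# Adding a constant moves the center but leaves the spread alone.
shifted = linear_transform(heights, a=10.0, b=1.0)

print(z_score(68.25))                         # 1.5 standard deviations above the mean
print(mean(shifted) - mean(heights))          # the mean moves by a
print(pstdev(shifted) - pstdev(heights))      # the spread does not change
print(pstdev(heights_cm) / pstdev(heights))   # the spread is multiplied by b
```

A height of 68.25 inches is used because it sits exactly 1.5 standard deviations above the mean, matching the z-score example in the notes.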
Section 2.2: Density Curves and the Normal Distributions
• Determine the definition and properties of density curves.
• Apply knowledge of mean, median, standard deviation, and quartiles to density curves.
• Determine the definition and properties of normal distributions.
• Use the Empirical Rule with normal distributions.

Mathematical Model
• Gives the overall pattern of the data but ignores minor irregularities as well as any outliers.
• Models do not match reality; they only model it.
• Models are something we can look at and manipulate in order to learn more about the real world.
• Models of data give us summaries we can learn from and use, even though they don't fit each data point exactly.
• A model is an idealized description.

Density Curve
• Describes the overall pattern of the distribution.
Important properties:
• The area underneath the curve is exactly 1.
• The area under the curve and above any range of values is the proportion of all observations that fall in that range.
• The curve is always on or above the horizontal axis.
• Deviations from the overall pattern are not described by the curve.
• It is an approximation that is easy to use and accurate enough for practical use.

Skewness of Density Curves
• Can be skewed to the left, skewed to the right, or roughly symmetric. A curve can appear normal without actually being normal.
• A normal curve is symmetric; real data modeled by it are only approximately symmetric.

Mean and Median of Density Curves
Median
• The median of a density curve is the equal-areas point.
• It is the point with half the area under the curve to its left and the remaining half to its right.
• Quartiles split the area under the curve into quarters.
Finding the Median
• Easy to spot on a symmetric density curve.
• Difficult to locate by eye on a skewed curve (it can be found mathematically).
Mean
• Arithmetic average of the observations.
• It is the point at which the curve would balance if made of solid material.
Symmetric curve
• Mean and median are equal.
• Both lie at the center.
Skewed curve
• The mean is pulled away from the median in the direction of the long tail.
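The density-curve properties above can be checked numerically. The sketch below uses a made-up density, f(x) = 2x on the interval from 0 to 1, which is not from the notes; it was chosen because its long tail is on the left. The helper names `area` and `median_of` are hypothetical, and the midpoint rule is just one simple way to approximate an area.

```python
def density(x):
    """Hypothetical density curve: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(f, lo, hi, steps=10_000):
    """Approximate the area under f between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


def median_of(f, lo, hi, tol=1e-6):
    """Bisection search for the equal-areas point (area to the left is 0.5)."""
    left, right = lo, hi
    while right - left > tol:
        mid = (left + right) / 2
        if area(f, lo, mid) < 0.5:
            left = mid
        else:
            right = mid
    return (left + right) / 2


total_area = area(density, 0.0, 1.0)               # should be 1 for a density curve
median = median_of(density, 0.0, 1.0)              # equal-areas point
mean = area(lambda x: x * density(x), 0.0, 1.0)    # balance point of the curve

print(total_area, median, mean)
```

For this curve the mean (about 0.667) comes out below the median (about 0.707), which agrees with the rule that the mean is pulled toward the long tail.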
Normal Distributions

Normal Curves
• A model that shows up over and over in statistics.
• Appropriate for distributions whose shapes are unimodal and roughly symmetric.
• Density curves that are symmetric, single-peaked, and bell-shaped.
• These curves describe normal distributions of data.

Notation = N(µ, σ)
• Normal Distribution (mean, standard deviation)
• µ = mean = not a numerical summary of the data but part of the model
• σ = standard deviation = not a numerical summary of the data but part of the model

Properties of All Normal Distributions
§ Same overall shape.
§ Shape is completely determined by the mean and standard deviation.
§ Normal density curves are described by the mean and standard deviation.
§ Spread is controlled by the standard deviation.

Normal Distributions
• Good descriptions of some real data.
• Certain data tend to follow a normal curve:
  o Heights of males or females
  o Blood pressure
  o Amount of sleep students get per night
• Good approximations to the results of many kinds of chance outcomes.
• Many statistical inference procedures are based on normal distributions.
• Give an idea of how extreme a value is by telling us how likely it is to find an observation that far from the mean.

Empirical Rule
o About 68%, 95%, and 99.7% of the observations in a normal distribution fall within 1, 2, and 3 standard deviations of the mean.
o All normal distributions follow this rule.
o The rule provides approximations and can be used when appropriate.
o Exact values can be calculated, and calculating them is the preferred practice.

In-Class Example: In the 2006 Winter Olympics men's combined event, Jean-Baptiste Grange of France skied the slalom in 88.46 seconds, about 1 standard deviation faster than the mean. If a normal model is useful in describing slalom times, about how many of the 35 skiers finishing the event would you expect to have skied the slalom faster than Jean-Baptiste?

Normal Distribution Calculations
• The proportion of observations that lie in some range of values can be found by finding the area under the curve.
• Because all normal distributions are the same when we standardize, we can find areas under any normal curve from a single table, a table that gives areas under the curve for the standard normal distribution.
• Z-table = use it to answer any question about proportions of observations in a normal distribution that has been standardized. (This is the old method; we use the calculator.)

Outline of the method for finding the proportion of the distribution in any region:
• Draw a picture of the distribution and shade the area of interest under the curve.
• Standardize the given value in terms of a standard normal variable z (find the z-score).
• Find the proportion of observations that falls in the shaded area of interest (use the calculator).
• Answer the question within the context of the problem. The value you find represents the quantity of interest.

Example: Scores on the SAT Verbal test in recent years follow approximately the N(505, 110) distribution. How high must a student score in order to place in the top 20% of all students taking the SAT? The top 5%?

Section 2.2: Using Your Calculator for Normal Distributions
• Use the normalcdf feature on your graphing calculator.
• Use the invNorm feature on your graphing calculator.

normalcdf() = Normal Cumulative Distribution Function
• DOES NOT display a picture of the normal distribution and the problem's corresponding shaded region.
• JUST provides the proportion of observations: the "specified area under the curve."
• Use with nonstandardized information from a given problem.
• Must specify an interval (which will change depending on the problem):
  o normalcdf(observation, upper bound, mean, standard deviation)
  o normalcdf(lower bound, observation, mean, standard deviation)
• To determine the upper/lower bound:
  o Select a number very far into the tail of the distribution (over 5 standard deviations from the mean).
Use the calculator's "normalcdf" function to verify your answers to the following.

invNorm() feature:
• Calculates the raw data value.
• OR it can calculate the standardized normal value (z-score).
  o invNorm(proportion of observations, mean, standard deviation) gives the raw data value
  o invNorm(proportion of observations) gives the z-score
• No upper/lower bound is needed here: the proportion entered is the area to the left of the value.
Use the calculator's "invNorm" function to verify your answers to the following.

Section 2.2 (again): Assessing Normality
• Use two methods to assess the normality of a distribution.
• Construct a normal probability plot with your calculator.
• Interpret a normal probability plot.

2 Methods for Assessing Normality
• Inference procedures in later chapters are based on the condition (sometimes assumption) that the population is approximately normally distributed.
• When normality is a condition of an inference procedure, you must show, state, or calculate how you know the population is approximately normal.

Method 1: Apply the Empirical Rule using the values for µ and σ
• Make a histogram or stemplot of the data.
• More exact in the information gathered.
• More time consuming.
• Small data sets rarely fit the Empirical Rule, even when the larger population is normal.

Method 2: Normal Probability Plot
• Visual (graphs the data on the calculator in a plot).
• Provides an assessment of the adequacy of the normal model for a data set.
• Interpret the shape of the plot to assess normality.

Method 1: Apply the Empirical Rule using the values for µ and σ
• Make a histogram or stemplot of the data.
• Put the data into a graph (histogram, stemplot, boxplot, etc.) to verify that it is approximately bell-shaped and symmetric about the mean.
• The graph will show the shape (symmetry or skewness) as well as any unusual features (outliers, gaps, clusters).
• Set up the horizontal axis to reflect $\bar{x}$, $\bar{x} \pm 1s$, $\bar{x} \pm 2s$, $\bar{x} \pm 3s$.
• Compare the observations in each interval to the Empirical Rule.
• Check whether the percent of observations within each number of standard deviations is roughly in line with the Empirical Rule.
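The Method 1 steps above can be sketched in a few lines. The sample below is made up for illustration, and the function name `empirical_rule_check` is hypothetical; the sketch simply counts how many observations fall within 1, 2, and 3 sample standard deviations of the sample mean and lists the Empirical Rule targets (68%, 95%, 99.7%) next to them. Judging whether the match is "roughly in line" is still left to the reader.

```python
from statistics import mean, stdev

# Hypothetical sample of 20 observations (made-up values, for illustration only).
sample = [58, 60, 61, 62, 62, 63, 63, 64, 64, 64,
          65, 65, 65, 66, 66, 67, 67, 68, 69, 71]


def empirical_rule_check(data):
    """Return {k: (observed proportion, Empirical Rule target)} for k = 1, 2, 3."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation
    targets = {1: 0.68, 2: 0.95, 3: 0.997}
    report = {}
    for k, target in targets.items():
        lo, hi = x_bar - k * s, x_bar + k * s
        inside = sum(lo <= x <= hi for x in data)
        report[k] = (inside / len(data), target)
    return report


report = empirical_rule_check(sample)
for k, (observed, target) in report.items():
    print(f"within {k} s: observed {observed:.2f}, Empirical Rule {target}")
```

For this made-up sample the observed proportions are 0.70, 0.90, and 1.00. With only 20 observations that is about as close as one can expect, which echoes the caution that small data sets rarely fit the rule exactly.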
Method 2: Normal Probability Plot (technology toolbox on page 125)
• Use the calculator to make a quick normal probability plot (located in the same place as a histogram, but choose the last graph option).
• Takes each data value and plots it against the z-score that point would be expected to have if the distribution were perfectly normal.
• X-axis = data values (observations)
• Y-axis = z-scores of a perfectly normal distribution
• When these values match up well, the line is straight and the data appear to come from a normal distribution/model.
• Skewed data produce a curved probability plot.
• Data from a normal distribution produce roughly a diagonal straight line.

Interpreting a Normal Probability Plot
• Normal distribution: the plotted points will form roughly a diagonal straight line.
• Non-normal distribution: the plotted points will form a nonlinear trend.
• Outliers will appear on the plot as obvious deviations from the overall pattern.

Normal Probability Plot
• Use only when the distribution is unimodal and symmetric.
• Meaning: CHECK the data quickly in a histogram, stemplot, boxplot, etc.
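To show what the calculator is doing behind a normal probability plot, here is a minimal sketch. It rests on assumptions: the plotting position (i − 0.5)/n used for the expected z-scores is one common convention, and a particular calculator may use a slightly different one. Both data sets are made up, the helper names are hypothetical, and using a correlation as a "straightness" measure is an addition for illustration, not part of the notes.

```python
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Pair each sorted observation with the z-score expected under a normal model."""
    ordered = sorted(data)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(ordered, start=1):
        # Assumed plotting position: (i - 0.5) / n.
        expected_z = standard_normal.inv_cdf((i - 0.5) / n)
        points.append((x, expected_z))
    return points


def straightness(points):
    """Correlation between data values and expected z-scores (near 1 = nearly straight)."""
    xs = [x for x, _ in points]
    zs = [z for _, z in points]
    x_bar, z_bar = mean(xs), mean(zs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    szz = sum((z - z_bar) ** 2 for z in zs)
    return sxz / (sxx * szz) ** 0.5


# Hypothetical data sets (made-up values, for illustration only).
symmetric_sample = [58, 60, 61, 62, 62, 63, 63, 64, 64, 64,
                    65, 65, 65, 66, 66, 67, 67, 68, 69, 71]
skewed_sample = [1, 1, 1, 2, 2, 2, 3, 3, 4, 4,
                 5, 6, 7, 9, 11, 14, 18, 24, 33, 50]

r_symmetric = straightness(normal_plot_points(symmetric_sample))
r_skewed = straightness(normal_plot_points(skewed_sample))
print(r_symmetric, r_skewed)
```

The roughly bell-shaped sample gives points that lie close to a straight line, while the right-skewed sample gives a visibly lower value, which corresponds to the curved pattern described above.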