STATS CH 2 Notes

AP STATS NOTES

Section 2-1

Percentile Rank

If you are trying to find the percentile corresponding to a certain score x:

\[ \text{Percentile} = \frac{\text{number of scores} < x}{\text{total number of scores}} \cdot 100 \]
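
The percentile formula can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_rank` and the list of scores are hypothetical and do not come from the notes.

```python
def percentile_rank(scores, x):
    """Percent of scores strictly below x (number below / total * 100)."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)


# Hypothetical set of ten test scores (illustrative data only).
scores = [52, 61, 67, 70, 74, 78, 83, 88, 91, 95]

# Six of the ten scores are below 83, so 83 sits at the 60th percentile.
print(percentile_rank(scores, 83))  # 60.0
```

Note that the count uses "strictly below", matching the "number of scores < x" in the formula.
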
Percentiles are often used when reporting academic scores such as SAT scores. Let's say you get a 620 on the math portion of the SAT. The score report might also indicate that you are in the "78th percentile". That means that you scored better than 78% of all students taking that particular SAT.

Z-scores (Measures of Relative Standing)

The z-score tells us how many standard deviations a particular piece of data is away from the mean of the distribution. It therefore allows us to make comparisons across distributions, which makes the z-score one of the most useful tools in statistics.

If x is an observation from a distribution that has a mean μ and a standard deviation σ, then the standardized value of x (often called the z-score) is:

\[ z = \frac{x - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma} \]
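
As a quick check of the z-score formula, here is a minimal Python sketch. The function name `z_score` is hypothetical. The score of 620 is the SAT math score mentioned above, but the mean of 500 and standard deviation of 100 are assumed values chosen only for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# 620 comes from the SAT example; mean=500 and std_dev=100 are assumptions.
z = z_score(620, mean=500, std_dev=100)
print(round(z, 2))  # 1.2
```

A positive z-score means the observation is above the mean, and a negative one means it is below.
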
Often we are asked to compare two pieces of data that do not come from the same distribution. In order to decide which score is in fact higher, we must first standardize the scores.

Example: Suppose there is one spot left in the University of Michigan class of 2014 and the admissions department has the decision narrowed down to 2 applicants. Everything about these 2 students is similar except their GPAs. Ferris has a 2.8 while Cameron has a 3.1. On paper it seems like Cameron has the higher GPA, but what about the difference in academic rigor of the students' respective high schools? That must count for something, and it does. Ferris' high school mean GPA is 2.41 and the standard deviation is 1.09. Cameron's high school mean GPA is 2.91 with a 0.87 standard deviation. An important assumption to make is that the distribution of GPAs is approximately symmetric. So the question is: who deserves the spot?

Now we can figure out whose GPA is more impressive, Ferris' or Cameron's, by standardizing each one:

\[ z_{\text{Ferris}} = \frac{2.8 - 2.41}{1.09} \approx 0.36, \qquad z_{\text{Cameron}} = \frac{3.1 - 2.91}{0.87} \approx 0.22 \]

Ferris' GPA is about 0.36 standard deviations above the mean at his school, while Cameron's GPA is only about 0.22 standard deviations above the mean at Cameron's school. Relative to their own schools, Ferris has the more impressive GPA.

Operations on statistics:

Statistic            Addition of constant a   Subtraction of constant a   Multiplication by constant a   Division by constant a
Mean                 Increase by a            Decrease by a               Multiply by a                  Divide by a
Median               Increase by a            Decrease by a               Multiply by a                  Divide by a
Standard Deviation   No change                No change                   Multiply by |a|                Divide by |a|
Quartiles            Increase by a            Decrease by a               Multiply by a                  Divide by a
IQR                  No change                No change                   Multiply by |a|                Divide by |a|
Min and Max          Increase by a            Decrease by a               Multiply by a                  Divide by a
Range                No change                No change                   Multiply by |a|                Divide by |a|
Percentiles          Increase by a            Decrease by a               Multiply by a                  Divide by a

The following example shows what happens to each statistic when every data value has 2 added to it, has 2 subtracted from it, is multiplied by 2, or is divided by 2.

Statistic       Starting Value   Add 2        Subtract 2   Multiply by 2   Divide by 2
Mean            5                7            3            10              2.5
Median          4                6            2            8               2
Standard Dev.   2                2            2            4               1
Quartiles       3 and 6          5 and 8      1 and 4      6 and 12        1.5 and 3
IQR             3                3            3            6               1.5
Min and Max     1 and 11         3 and 13     -1 and 9     2 and 22        0.5 and 5.5
Range           10               10           10           20              5
Percentiles     3.5 is 40%       5.5 is 40%   1.5 is 40%   7 is 40%        1.75 is 40%

HW A: 1,5,9,11,13,15

Density Curves

So far we have worked only with jagged histograms and stem plots to analyze data. As we begin to explore more fully the many statistical calculations and analyses one can perform on data, it will become clear that working with smooth curves is much easier than working with jagged histograms. These smooth curves are called density curves.

Characteristics of a density curve
• A density curve is a smooth curve that describes the overall pattern of a distribution by showing what proportions of observations (not counts) fall into a range of values.
• Areas under a density curve represent proportions of observations.
• The scale of a density curve is adjusted in such a way that the total area under the curve is always equal to 1.
• The density curve is always on or above the horizontal axis.
• The median of a density curve is the equal-areas point.
• The mean of a density curve is the balance point: the point at which the curve would balance (like a see-saw) if it were made of solid material.
• The mean and the median are equal in a symmetric density curve.
• The mean and the median are not equal in a skewed density curve.

HW B: 19,21,23,31,33-38

Normal Distributions

A particularly important class of density curves is the normal distributions. A density curve that is normally distributed has the following characteristics:
• Symmetric
• Single peak
• Bell shape

The mean of a density curve (including the normal curve) is denoted by μ (the Greek lowercase letter mu) and the standard deviation is denoted by σ (the Greek lowercase letter sigma). All normal distributions have the same overall shape. Any differences can be explained by μ and σ.

Calculus connection: On the normal curve, there are two points at a distance of σ on either side of the mean μ.
These points are called inflection points and they signify where the curve changes concavity. Moving from left to right, the normal curve changes from concave up to concave down at μ − σ and back from concave down to concave up at μ + σ. To find the inflection points, you would need to find the second derivative of the function and determine where it changes sign.

Empirical Rule

A remarkable property that all normal curves share is that they follow the 68-95-99.7 rule. In a normal distribution with mean μ and standard deviation σ:
• Approximately 68% of the observations fall within 1 standard deviation of the mean.
• Approximately 95% of the observations fall within 2 standard deviations of the mean.
• Approximately 99.7% of the observations fall within 3 standard deviations of the mean.

The normal curve with mean = 0 and standard deviation = 1 is called the standard normal curve.

Reminder: Area under the curve = 1

Equation: [equation not preserved in this transcript]

We can find the percentile to the left of a z-value using Table A, or use the AP Stats Program.

Examples:
• What is the probability to the left of z = 2.63?
• What is the probability to the right of z = 1.58?
• What is the probability between z = 0.2 and z = 1.5?
• What is the probability outside of z = 1 to z = 2.3?
• What z-value has a probability to the left equal to 0.75?
• What z-value has a probability to the right equal to 0.4?

NOTE: To find the probability between two z-values, find the two probabilities to the left and subtract them.

HW C: 41,43,45,47,49,51
HW D: 53,55,57,59

ASSESSING NORMALITY

Method 1: Construct a histogram, stem-and-leaf plot, or box plot to determine if the shape is approximately bell shaped with symmetry about the mean. This is fairly easy because if you load the data into your calculator, you can check a histogram very quickly.

Method 2: Check the normal probability plot (on TI-83). This is an easy and quick way to check for normality. You are looking for a normal probability plot that has a linear trend.

[Figure panels: Normal Distribution, Bimodal Distribution, Skewed Right Distribution, Skewed Left Distribution]
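
The "linear trend" that Method 2 looks for can also be checked numerically. The sketch below is one possible approach in Python, not the TI-83 procedure itself: it pairs the sorted data with expected z-scores and computes their correlation, which should be close to 1 for roughly normal data. The plotting positions (i − 0.5)/n are an assumed convention (calculators and software may use a slightly different one), and the function names and both data sets are hypothetical.

```python
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected z-scores for n sorted observations, using positions (i - 0.5)/n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def probability_plot_correlation(data):
    """Correlation between the sorted data and their normal scores."""
    xs = normal_scores(len(data))
    ys = sorted(data)
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical samples (illustrative data only).
roughly_symmetric = [61, 64, 66, 67, 68, 69, 70, 71, 72, 74, 77]
skewed_right = [1, 1, 2, 2, 3, 3, 4, 6, 9, 15, 40]

# A value near 1 suggests a nearly straight normal probability plot.
print(probability_plot_correlation(roughly_symmetric))
print(probability_plot_correlation(skewed_right))
```

The correlation is only a summary, so it is still worth looking at the plot itself, since curvature at the ends is what reveals skewness.
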
Method 3: You can improve upon the accuracy of methods 1 and 2 by checking to see if the 68-95-99.7 rule applies (approximately) to the data. Find the mean and standard deviation of the data. Then find out if approximately 68% of the data points are within 1 standard deviation of the mean, approximately 95% are within 2 standard deviations, and approximately 99.7% are within 3 standard deviations. This last method is cumbersome, so only use it as a back-up plan.

Relating IQR to the Normal Curve:

Reminder, to find outliers: any value below Q1 − 1.5·IQR or above Q3 + 1.5·IQR is flagged as an outlier.

HW E: 63,65,66,68-74
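
Method 3 and the outlier reminder can both be sketched in Python. This is an illustration under stated assumptions rather than the procedure from the notes: the function names and the data are hypothetical, the population standard deviation (`pstdev`) is an arbitrary choice where the sample version would be equally defensible, and the second function simply applies the Q1 − 1.5·IQR and Q3 + 1.5·IQR fences to the quartiles of a normal curve.

```python
from statistics import NormalDist, mean, pstdev


def empirical_rule_fractions(data):
    """Fractions of the data within 1, 2, and 3 standard deviations of the mean."""
    mu = mean(data)
    sigma = pstdev(data)  # assumption: population SD; stdev() is also reasonable
    return [
        sum(1 for x in data if abs(x - mu) <= k * sigma) / len(data)
        for k in (1, 2, 3)
    ]


def normal_outlier_fences(mu=0.0, sigma=1.0):
    """Q1 - 1.5*IQR and Q3 + 1.5*IQR computed for a normal curve."""
    dist = NormalDist(mu, sigma)
    q1, q3 = dist.inv_cdf(0.25), dist.inv_cdf(0.75)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical data set (illustrative only).
data = [61, 64, 66, 67, 68, 69, 70, 71, 72, 74, 77]

# Compare these three fractions with 0.68, 0.95, and 0.997.
print(empirical_rule_fractions(data))

# On the standard normal curve the fences land about 2.7 SDs from the mean.
low, high = normal_outlier_fences()
print(round(low, 2), round(high, 2))  # -2.7 2.7
```

With a small data set the fractions can only take a few possible values, so expect rough agreement with 68-95-99.7 at best.
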