Z-Scores, Shifting and Scaling
A z-score is calculated as follows:
$z = \frac{y - \bar{y}}{s}$, where $s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$
Shifting (adding or subtracting a constant):
- Changes measures of center
- DOES NOT change measures of spread
Scaling (multiplying or dividing by a constant):
- Changes measures of center
- Changes measures of spread

Normal Models
- “Bell-shaped curves” are called Normal Models
- Appropriate for distributions whose shapes are unimodal and roughly symmetric

Normal Models
- Each is a model
- For symmetric, unimodal distributions, a Normal Model provides a measure of how extreme a z-score is
- There is a Normal Model for every possible combination of mean and standard deviation

Notation
$N(\mu, \sigma)$
This represents a Normal Model with a mean of $\mu$ and a standard deviation of $\sigma$.
Why the Greek?
- This mean and standard deviation are not numerical summaries of data. They are part of a model.
- They don’t come from data. They are numbers that we choose to specify our model.
- They are called parameters.

Notation Continued
$N(\mu, \sigma)$
- We don’t want to confuse our parameters with summaries of the data such as $\bar{y}$ and $s$
- Summaries of the data are called statistics

Z-Scores and Normal Models
If we model data with a Normal Model and standardize them using the corresponding $\mu$ and $\sigma$, we still call the standardized value a z-score, and we write:
$z = \frac{y - \mu}{\sigma}$

Z-Scores and Normal Models
- It is usually easier to standardize data first (using its mean and standard deviation)
- Then we need only the model $N(0, 1)$
- $N(0, 1)$ is called the standard Normal model or standard Normal distribution

Normality Assumption
- In using the Normal Model to model our data, we must have a unimodal and symmetric distribution
- The Normality Assumption is that the data are unimodal and symmetric
- But they probably aren’t exactly that...

Nearly Normal Condition
- The shape of the data’s distribution is unimodal and symmetric
- Check this by making a histogram
- All models make assumptions: always point out the assumptions you make for your model.
- You must also check the conditions in the data to make sure that those assumptions are reasonable.

Normal Models
Normal Models tell us how extreme a value is by telling us how likely it is to find one that far from the mean.

68-95-99.7 Rule
In a Normal Model:
- About 68% of values fall within 1 SD of the mean
- About 95% of values fall within 2 SD of the mean
- About 99.7% of values fall within 3 SD of the mean

Sample Problems
Jean-Baptiste Grange of France skied the slalom in 88.46 sec, approximately 1 SD faster than the mean. If a Normal Model is useful in describing these slalom times, about how many of the 35 skiers finishing the event would you expect to have skied the slalom faster than Jean-Baptiste?
We expect 68% of skiers to be within 1 SD of the mean. Of the remaining 32%, we expect half on the high end and half on the low end. 16% of 35 is 5.6, so conservatively, we’d expect about 5 skiers to do better than Jean-Baptiste.

The Dutch
The Dutch are among the tallest people in the world: the average Dutch man is 184 cm tall, just over six feet. The average Dutch woman is just over 5’ 7’’ tall. If the Normal Model is appropriate and the SD for men is about 8 cm, what percentage of Dutch men will be over 2 meters (6’ 6’’) tall?

The Dutch
- Mean = 184 cm
- SD = 8 cm
- 2 meters = 200 cm
- 200 cm = 2 SD above the mean (184 + 2 × 8 = 200)
- We expect 5% of men to be more than 2 SD below or above the mean
- Half of those, about 2.5%, are likely to be above 2 meters

Driving
It takes you 20 minutes, on average, to drive to school, with a standard deviation of 2 minutes. Suppose a Normal Model is appropriate for the distribution of driving times.
A) How often will you arrive at school in less than 22 minutes?
Answer: 68% of the time we’ll be within 1 SD, or 2 minutes, of the average 20 minutes. So 32% of the time we’ll arrive in less than 18 minutes or in more than 22 minutes.
Half of those times (16%) will be greater than 22 minutes, so 84% will be less than 22 minutes.

B) How often will it take you more than 24 minutes?
Answer: 24 minutes is 2 SD above the mean. By the 95% rule, we know 2.5% of the times will be more than 24 minutes.

C) Do you think the distribution of your driving times is unimodal and symmetric?
Answer: “Good” traffic will speed up your time by a bit, but traffic incidents may occasionally increase the time it takes, so times may be skewed to the right and there may be outliers.

D) What does the shape of the distribution then say about the accuracy of your predictions?
Answer: If this is the case, the Normal Model is not appropriate and the percentages we predict would not be accurate.

pg 129, # 1, 2, 3, 5, 7, 9, 24 (Handed in Tomorrow for Real)

Working With Normal Models
1. Make a Picture
2. Make a Picture
3. Make a Picture

How to Draw a Normal Curve:
- Bell shaped, symmetric about the mean: start at the middle and sketch the left and right
- Only need to draw out to 3 SD
- The place where the bell shape changes from curving downward to curving back up (the inflection point) is located exactly one standard deviation from the mean
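
Checking the Rule with Software
The rule-of-thumb answers in the sample problems above can be compared with exact Normal Model areas. The following is a minimal sketch, not part of the original slides, using NormalDist from Python's standard statistics module. The helper names z_score and fraction_within are hypothetical, chosen only for illustration, and the skier estimate assumes a finish exactly 1 SD faster than the mean.

```python
from statistics import NormalDist


def z_score(y, mu, sigma):
    """Standardize y: how many SDs it lies from the mean."""
    return (y - mu) / sigma


def fraction_within(k):
    """Fraction of a Normal Model lying within k SDs of the mean."""
    standard = NormalDist(0, 1)  # the standard Normal model N(0, 1)
    return standard.cdf(k) - standard.cdf(-k)


# 68-95-99.7 Rule: areas within 1, 2 and 3 SDs of the mean.
for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.4f}")

# The Dutch: 200 cm under N(184, 8).
dutch_z = z_score(200, 184, 8)
dutch_over_2m = 1 - NormalDist(184, 8).cdf(200)
print(f"z for 200 cm: {dutch_z}, fraction taller: {dutch_over_2m:.4f}")

# Driving: times under N(20, 2).
driving = NormalDist(20, 2)
drive_under_22 = driving.cdf(22)      # part A
drive_over_24 = 1 - driving.cdf(24)   # part B
print(f"under 22 min: {drive_under_22:.4f}, over 24 min: {drive_over_24:.4f}")

# Skiers: expected number (of 35) more than 1 SD faster than the mean.
skiers_faster = 35 * NormalDist(0, 1).cdf(-1)
print(f"expected skiers faster: {skiers_faster:.2f}")
```

The exact areas are close to the rounded figures used in the answers: the area beyond 2 SD on one side is about 2.3% rather than 2.5%, and the area below 1 SD above the mean is about 84%. These exact values are only as trustworthy as the Nearly Normal Condition for the data in question.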