Stats: Modeling the World
Chapter 6: The Standard Deviation as a Ruler and the Normal Model

Shifting and Scaling Data: The Rules
When we shift data by adding the same constant to (or subtracting it from) every data value, measures of location/center change, but measures of spread stay the same.
When we rescale data by multiplying or dividing every data value by the same constant, measures of location/center AND measures of spread change.

Who is the best athlete? How do we compare?
We are going to use something called a z-score to help us make comparisons.
A z-score simply measures how many standard deviations a data value is from the mean.

What can z-scores tell us?
A negative z-score tells us that the data value is below the mean.
A positive z-score tells us that the data value is above the mean.
A z-score of 0 tells us that the data value is right at the mean.
The farther a z-score is from 0, the more unusual the data value is.

Example

Another Example

Looking at that formula again...
z = (x - x̄) / s
When we convert a data value to a z-score, we are:
shifting it by subtracting the mean (which sets the NEW mean at 0)
rescaling it by dividing by the standard deviation (which sets the NEW standard deviation to 1)

Standardizing does not change the shape of the distribution!
Standardizing sets the center (mean) at 0.
Standardizing sets the spread (standard deviation) to 1.

The Normal Model
If a distribution is "bell-shaped", symmetric, and unimodal, it can possibly be modeled by a Normal distribution.
N(μ, σ) indicates a Normal model with population mean μ and population standard deviation σ.
μ and σ represent the population parameters of our model.
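Stepping back to the shifting, rescaling, and z-score rules stated earlier, a short numerical check can make them concrete. The sketch below is only an illustration: the data values are invented, and the helper names `summarize` and `z_scores` are hypothetical, not part of the original slides. It uses the sample standard deviation from Python's standard `statistics` module.

```python
import statistics

def summarize(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)

def z_scores(values):
    """Standardize: subtract the mean, then divide by the standard deviation."""
    mean, sd = summarize(values)
    return [(v - mean) / sd for v in values]

# Hypothetical data, invented purely for illustration.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

shifted = [v + 10 for v in data]   # add the same constant to every value
rescaled = [v * 3 for v in data]   # multiply every value by the same constant
standardized = z_scores(data)      # shift by the mean, rescale by the sd

for label, values in [
    ("original", data),
    ("shifted", shifted),
    ("rescaled", rescaled),
    ("z-scores", standardized),
]:
    mean, sd = summarize(values)
    print(f"{label:>9}: mean = {mean:.3f}, sd = {sd:.3f}")
```

Running it should show the shifted data with a different mean but the same standard deviation, the rescaled data with both changed, and the z-scores with a mean of essentially 0 and a standard deviation of 1 (up to floating-point rounding).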
x̄ and s represent the sample's statistics.

Making Connections
If we take a Normal model (like IQs) and standardize it using z-scores, we get a special distribution called the Standard Normal Distribution.
The Standard Normal Distribution is the Normal model N(0, 1).

The Empirical Rule
Percentile ranks by z-score:
z = -3: 0.15th
z = -2: 2.5th
z = -1: 16th
z = 0: 50th
z = +1: 84th
z = +2: 97.5th
z = +3: 99.85th

An Example
A forester measured 27 of the trees in a large woods that is up for sale. He found a mean diameter of 10.4 inches and a standard deviation of 4.7 inches. Suppose that these trees provide an accurate description of the whole forest and that a Normal model applies.
What size would you expect the central 95% of all trees to be?
About what percent of the trees should be less than an inch in diameter?
About what percent of the trees should be between 5.7 and 10.4 inches in diameter?
About what percent of the trees should be over 15 inches in diameter?

Instead of estimating, we can find probabilities for a Normal model using a table.

Are you Normal?
To determine if a dataset follows a Normal model, either:
-- Look at a histogram or stemplot and check for non-Normal features (like gaps, outliers, and skewness)
-- Compare your actual data to the Empirical Rule
-- Look at a Normal probability plot and check whether the plot approximates a diagonal straight line

Assessing Normality
Nearly Normal data have a roughly symmetric, bell-shaped histogram and a Normal probability plot that is close to a diagonal straight line.

Non-Normal Data
A skewed distribution might have a lopsided histogram and a Normal probability plot that bends away from a straight line.

What Can Go Wrong?
Only use the Normal model for symmetric and unimodal distributions!
Be careful of outliers.
Remember that the mean and standard deviation are non-resistant!
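The forester questions above can be estimated with the Empirical Rule (roughly 2.5% of trees under 1 inch, 34% between 5.7 and 10.4 inches, and about 16% over 15 inches, since 15 is close to one standard deviation above the mean). As a minimal sketch of finding the probabilities "instead of estimating", the code below computes them from the Normal model N(10.4, 4.7) with the standard library's error function in place of a printed table. The helper name `normal_cdf` is hypothetical, and the calculation assumes the Normal model really does apply to the forest.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X < x) under a Normal model N(mu, sigma), via the error function."""
    z = (x - mu) / sigma  # standardize to a z-score first
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Sample mean and standard deviation from the forester example,
# treated here as the model parameters (an assumption the slide makes too).
MU, SIGMA = 10.4, 4.7

# Central 95% by the Empirical Rule: within 2 standard deviations of the mean.
low, high = MU - 2 * SIGMA, MU + 2 * SIGMA

p_under_1 = normal_cdf(1.0, MU, SIGMA)
p_between = normal_cdf(10.4, MU, SIGMA) - normal_cdf(5.7, MU, SIGMA)
p_over_15 = 1.0 - normal_cdf(15.0, MU, SIGMA)

print(f"central 95% of diameters: {low:.1f} to {high:.1f} inches")
print(f"P(diameter < 1 inch)      = {p_under_1:.3f}")
print(f"P(5.7 < diameter < 10.4)  = {p_between:.3f}")
print(f"P(diameter > 15 inches)   = {p_over_15:.3f}")
```

The model-based values should land close to the Empirical Rule estimates; small differences arise because the rule rounds its percentages and because 15 inches is not exactly one standard deviation above the mean.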