9/1/2016
Measures of Location & Spread

Summary Statistics: Measures of Location and Spread
• Measures of location illustrate where most of the data are found:
  – e.g., means, medians, modes
• Measures of spread illustrate how variable the data are:
  – e.g., standard deviation, variance, standard error

Statistics versus Parameters
• Statistics describe the sample
• Parameters describe the [unknown?] population

Measures of Location: Mean
• Arithmetic Mean
  – All observations weighted equally in calculation

Arithmetic Mean
• Unbiased estimate of $\mu$ if:
  – Observations come from random individuals
  – Samples are independent of each other
  – Observations are drawn from a large population that can be described by a normal random variable, $X \sim N(\mu, \sigma)$

Other Means
• Geometric Mean
  – Example from exponential population growth: numbers that are multiplied on an arithmetic scale can be added on a logarithmic scale ...
  – So it depends how you use the ‘mean’

Median and Mode
• Median: the ‘middle’ observation (unless tied)
• Mode: the observation that occurs most frequently

Which measure of location?
• Arithmetic mean most common
  – Supported by Central Limit Theorem
• Geometric mean most appropriate for multiplicative measures
• Median or Mode when distribution doesn’t match a standard probability distribution
• Pay attention to what measure is supplied and always be suspicious of any measure of location that is not accompanied by a measure of spread!

Measures of Spread
• Variance and Standard Deviation
• Sum of Squares (SS):
$$ SS = \sum_{i=1}^{n} (X_i - \bar{X})^2 $$
• Variance:
$$ s^2 = \frac{SS}{n - 1} $$
  – Unbiased estimate of $\sigma^2$
• Standard Deviation:
$$ s = \sqrt{s^2} $$

Degrees of Freedom
• The number of independent observations that we have for estimating statistical parameters
• ‘Usually’ ... n - 1

Standard Error of the Mean
• Think of the standard error (of the mean) as an estimate of the standard deviation of the sample mean, which is our estimate of the POPULATION MEAN
• If inference is about the sample: provide the SD (s)
• If inference is about the means: provide the SE

Skewness, Kurtosis, and Central Moments
• A central moment is the average of the deviations of all observations in a dataset from the mean of the observations, raised to a power r:
$$ CM_r = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^r $$
• r = 1 (1st moment) is always 0
• r = 2 (2nd moment) is the variance

Skewness
• r = 3 (3rd moment) divided by $s^3$ = skewness ($g_1$)
• Skewness describes how the sample differs in shape from a symmetrical distribution
• $g_1 = 0$: symmetrical, e.g., a normal distribution
• $g_1 > 0$: right-skewed (longer tail of observations to the right of the mean)
• $g_1 < 0$: left-skewed (longer tail of observations to the left of the mean)

Kurtosis
• Based on the 4th central moment (r = 4)
• Measures the extent to which probability is distributed in the tails versus the center of the distribution
• Clumped, or platykurtic, distributions have $g_2 < 0$ (less probability in the tails)
• Leptokurtic distributions have $g_2 > 0$ (more probability in the tails)

Skewness and Kurtosis
• Should be tested, but both measures are sensitive to outliers ...
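
The measures of location, spread, and shape listed above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the lecture's own code: the sample values and the function names `central_moment` and `summarize` are hypothetical, the standard error is taken as s divided by the square root of n, and kurtosis is assumed to be the 4th central moment divided by s^4 minus 3, so that a normal distribution scores about 0, consistent with the g2 < 0 and g2 > 0 cases on the slides.

```python
import math
import statistics


def central_moment(values, r):
    """Average of the deviations from the sample mean, raised to the power r."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** r for x in values) / len(values)


def summarize(values):
    """Return common measures of location, spread, and shape for a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    ss = sum((x - mean) ** 2 for x in values)  # sum of squares
    variance = ss / (n - 1)                    # n - 1 degrees of freedom
    sd = math.sqrt(variance)
    return {
        "mean": mean,
        "geometric_mean": statistics.geometric_mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "ss": ss,
        "variance": variance,
        "sd": sd,
        "se": sd / math.sqrt(n),               # standard error of the mean
        "skewness": central_moment(values, 3) / sd ** 3,
        # Assumption: subtract 3 so a normal distribution is close to 0.
        "kurtosis": central_moment(values, 4) / sd ** 4 - 3,
    }


# Hypothetical sample, chosen only for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = summarize(sample)

# The geometric mean is the back-transformed arithmetic mean of the logs,
# which is why multiplication on the arithmetic scale becomes addition
# on the logarithmic scale.
log_scale_mean = math.exp(statistics.fmean(math.log(x) for x in sample))

for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Because the sample has a longer tail to the right, its skewness comes out positive, and the first central moment is zero up to rounding error.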
Quantiles
• Box plots of quantiles can portray the distribution of data more accurately than plots of means and standard deviations

Other Measures
• Coefficient of Variation (CV)
  – Variability ‘independent of the mean’
• Coefficient of Dispersion
  – For discrete variables: the ‘variance-to-mean’ ratio
  – Measure of clumping, but dependent on ‘scale’

Distribution of Points
• For normally distributed random variables:
  – About 68% of observations occur within 1 SD of the mean
  – About 95% of observations occur within 2 SD of the mean

Confidence Intervals
• Interpretation:
  – 95% of the time such an interval will contain the true value of $\mu$
  – NOT: “there is a 95% chance that the true $\mu$ occurs within the interval”; it either does or does not ...
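
The remaining measures, and the repeated-sampling interpretation of a confidence interval, can be illustrated with a short Python sketch. Several things here are assumptions rather than content from the slides: the function names, the example data, the simulated population (mean 10, SD 2), and the use of a large-sample normal multiplier (about 1.96) instead of a t-based one, which makes the interval slightly too narrow for small samples.

```python
import math
import random
import statistics

# Assumption: large-sample normal multiplier for a 95% interval (about 1.96).
Z_95 = statistics.NormalDist().inv_cdf(0.975)


def coefficient_of_variation(values):
    """Sample SD divided by the sample mean."""
    return statistics.stdev(values) / statistics.fmean(values)


def coefficient_of_dispersion(counts):
    """Variance-to-mean ratio for discrete counts; above 1 suggests clumping."""
    return statistics.variance(counts) / statistics.fmean(counts)


def confidence_interval(values, z=Z_95):
    """Approximate 95% interval for the mean: mean +/- z * SE."""
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return mean - z * se, mean + z * se


def coverage(mu, sigma, n, trials, seed=1):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = confidence_interval(sample)
        hits += low <= mu <= high
    return hits / trials


# Share of a normal distribution within 1 and 2 SD of the mean.
std_normal = statistics.NormalDist()
within_1sd = std_normal.cdf(1) - std_normal.cdf(-1)
within_2sd = std_normal.cdf(2) - std_normal.cdf(-2)

# Hypothetical counts per sampling unit, for illustration only.
counts = [0, 1, 0, 5, 0, 6]
dispersion = coefficient_of_dispersion(counts)

# Hypothetical population: across many repeated samples, close to 95% of
# the intervals contain mu; any single interval either does or does not.
observed_coverage = coverage(mu=10.0, sigma=2.0, n=50, trials=2000)

print(f"within 1 SD: {within_1sd:.4f}, within 2 SD: {within_2sd:.4f}")
print(f"coefficient of dispersion: {dispersion:.2f}")
print(f"simulated coverage: {observed_coverage:.3f}")
```

The coverage figure is a property of the procedure over many samples, which is the point of the interpretation above: it is not a probability statement about any one computed interval.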