9/1/2016 Measures of Location & Spread

Summary Statistics: Measures of Location and Spread
• Measures of location illustrate where the majority of observations are found:
  – e.g., means, medians, modes
• Measures of spread illustrate how variable the data are:
  – e.g., standard deviation, variance, standard error

Statistics versus Parameters
• Statistics describe the sample
• Parameters describe the [unknown?] population

Measures of Location: Mean
• Arithmetic Mean
  – All observations weighted equally in calculation

Arithmetic Mean
• Unbiased estimate of μ if:
  – Observations come from random individuals
  – Samples are independent of each other
  – Observations are drawn from a large population that can be described by a normal random variable, X ~ N(μ, σ)

Other Means
• Geometric Mean
  – Example from exponential population growth: numbers that are multiplied on an arithmetic scale can be added on a logarithmic scale ...
  – So it depends how you use the ‘mean’

Median and Mode
• Median: the ‘middle’ observation (unless tied)
• Mode: the observation(s) that occur most frequently

Which measure of location?
• Arithmetic mean most common
  – Supported by Central Limit Theorem
• Geometric mean most appropriate for multiplicative measures
• Median or Mode when the distribution doesn’t match a standard probability distribution
• Pay attention to what measure is supplied, and always be suspicious of any measure of location that is not accompanied by a measure of spread!

Measures of Spread
• Variance and Standard Deviation
• Sum of Squares (SS): the sum of the squared deviations of the observations from their mean
• Variance: SS divided by the degrees of freedom
  – Unbiased estimate of σ²
• Standard Deviation: the square root of the variance

Degrees of Freedom
• The number of independent observations that we have for estimating statistical parameters
• ‘Usually’ n-1

Standard Error of the Mean
• Think of the standard error (of the mean) as an estimate of the standard deviation of our estimate of the POPULATION MEAN
• If the inference is about the sample: provide the SD (s)
• If the inference is about the means: provide the SE

Skewness, Kurtosis, and Central Moments
• A central moment is the average of the deviations of all observations in a dataset from the mean of the observations, raised to a power r
• r = 1 (1st moment) is always 0
• r = 2 (2nd moment) is the variance

Skewness
• r = 3 (3rd moment) divided by s³ = skewness (g1)
• Skewness describes how the sample differs in shape from a symmetrical distribution
• g1 = 0: symmetrical, as in a normal distribution
• g1 > 0: right-skewed (longer tail of observations to the right of the mean)
• g1 < 0: left-skewed (longer tail of observations to the left of the mean)

Kurtosis
• Based on the 4th central moment (r = 4)
• Measures the extent to which probability is concentrated in the tails versus the center of the distribution
• Clumped, or platykurtic, distributions have g2 < 0 (less probability in the tails)
• Leptokurtic distributions have g2 > 0 (more probability in the tails)

Skewness and Kurtosis
• Should be tested, but both measures are sensitive to outliers ...
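As a rough illustration of the measures above, the following Python sketch computes them for a small made-up sample. The function names (central_moment, geometric_mean, summarize) and the data are hypothetical and not from the slides. It assumes g1 is the 3rd central moment divided by s³, and that g2 is the 4th central moment divided by s⁴ minus 3, so that a normal distribution gives g2 near 0. The subtraction of 3 is an assumption, because the slide's own formula is not available here.

```python
import math
import statistics


def central_moment(values, r):
    """Average of the deviations from the sample mean, each raised to the power r."""
    mean = statistics.fmean(values)
    return sum((x - mean) ** r for x in values) / len(values)


def geometric_mean(values):
    """Arithmetic mean taken on the log scale, then back-transformed (values > 0)."""
    logs = [math.log(x) for x in values]
    return math.exp(statistics.fmean(logs))


def summarize(values):
    """Return location, spread, and shape statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    ss = sum((x - mean) ** 2 for x in values)  # sum of squares (SS)
    variance = ss / (n - 1)                    # SS over n - 1 degrees of freedom
    sd = math.sqrt(variance)                   # standard deviation (s)
    se = sd / math.sqrt(n)                     # standard error of the mean
    return {
        "mean": mean,
        "median": statistics.median(values),
        "modes": statistics.multimode(values),
        "variance": variance,
        "sd": sd,
        "se": se,
        # Assumed definitions: moments divide by n, s uses n - 1.
        "g1": central_moment(values, 3) / sd ** 3,
        "g2": central_moment(values, 4) / sd ** 4 - 3,
    }


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up data
stats = summarize(sample)
for name, value in stats.items():
    print(name, value)
```

Note that central_moment(sample, 2) divides by n, so it is slightly smaller than the n-1 variance returned by summarize. The sample has a longer right tail, so g1 comes out positive.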
Quantiles
• Box plots of quantiles can portray the distribution of data more accurately than plots of means and standard deviations

Other Measures
• Coefficient of Variation (CV)
  – Variability ‘independent of the mean’
• Coefficient of Dispersion
  – For discrete variables: the ‘variance-to-mean’ ratio
  – Measure of clumping, but dependent on ‘scale’

Distribution of Points
• For normally distributed random variables:
  – About 68% of observations occur within 1 SD of the mean
  – About 95% of observations occur within 2 SD of the mean

Confidence Intervals
• Interpretation:
  – 95% of the time such an interval will contain the true value of μ
  – NOT: “there is a 95% chance that the true μ occurs within the interval”; it either does or does not ...
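The interpretation of a confidence interval can be checked by simulation. The Python sketch below is a minimal example under stated assumptions: it builds the interval as mean ± z × SE using a large-sample normal approximation (the formula on the original slides is not available here, and a t-based interval would be the usual choice for small samples). All function names and parameter values are hypothetical. It also computes the normal coverage within k standard deviations, the CV, and the variance-to-mean ratio.

```python
import math
import random
import statistics
from statistics import NormalDist


def coefficient_of_variation(values):
    """Sample SD divided by the sample mean."""
    return statistics.stdev(values) / statistics.fmean(values)


def coefficient_of_dispersion(values):
    """Variance-to-mean ratio, typically used for counts."""
    return statistics.variance(values) / statistics.fmean(values)


def normal_coverage(k):
    """Proportion of a normal distribution within k SD of its mean."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)


def mean_confidence_interval(values, level=0.95):
    """Approximate interval: mean +/- z * SE (normal approximation, an assumption)."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se


def simulated_coverage(true_mean=10.0, true_sd=2.0, n=50, trials=2000, seed=1):
    """Fraction of repeated-sample intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = mean_confidence_interval(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


print("within 1 SD:", normal_coverage(1))
print("within 2 SD:", normal_coverage(2))
print("share of intervals containing the true mean:", simulated_coverage())
```

Each simulated interval either contains the true mean or does not. The 95% describes how often the procedure succeeds over many repeated samples, so the printed share should land near 0.95.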