Parameters and Statistics
A statistic is a descriptive measure computed from a sample of data. A parameter is a descriptive measure computed from an entire population of data.

Measures of Central Tendency - Arithmetic Mean
The arithmetic mean of a set of data is the sum of the data values divided by the number of observations.

Sample Mean
If the data set is from a sample, then the sample mean, $\bar{X}$, is:
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

Population Mean
If the data set is from a population, then the population mean, $\mu$, is:
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$$

Measures of Central Tendency - Median
An ordered array is an arrangement of data in either ascending or descending order. Once the data are arranged in ascending order, the median is the value such that 50% of the observations are smaller and 50% of the observations are larger.
If the sample size n is an odd number, the median, $X_m$, is the middle observation. If the sample size n is an even number, the median, $X_m$, is the average of the two middle observations. The median will be located in the 0.50(n+1)th ordered position.

Measures of Central Tendency - Mode
The mode, if one exists, is the most frequently occurring observation in the sample or population.

Shape of the Distribution
The shape of the distribution is said to be symmetric if the observations are balanced, or evenly distributed, about the mean. In a symmetric distribution the mean and median are equal.
A distribution is skewed if the observations are not symmetrically distributed above and below the mean. A positively skewed (or skewed to the right) distribution has a tail that extends to the right in the direction of positive values. A negatively skewed (or skewed to the left) distribution has a tail that extends to the left in the direction of negative values.
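The definitions of the mean, median, and mode above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names and the sample values are hypothetical, and returning an empty list when no value repeats is one assumed reading of "if one exists".

```python
from collections import Counter


def sample_mean(data):
    """Sum of the data values divided by the number of observations."""
    return sum(data) / len(data)


def median(data):
    """Value at the 0.50(n+1)th ordered position (1-based)."""
    ordered = sorted(data)
    n = len(ordered)
    position = 0.50 * (n + 1)
    lower = int(position)
    if position == lower:
        # n is odd: the position is a whole number, so take the middle value.
        return ordered[lower - 1]
    # n is even: the position falls halfway between the two middle values.
    return (ordered[lower - 1] + ordered[lower]) / 2


def modes(data):
    """Most frequently occurring value(s); empty list if no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # assumption: no repeated value means no mode exists
    return [value for value, count in counts.items() if count == highest]


# Hypothetical sample used only for illustration.
sample = [3, 5, 5, 6, 8, 9, 13]
print(sample_mean(sample))   # 7.0
print(median(sample))        # 6
print(median([3, 5, 6, 8]))  # 5.5
print(modes(sample))         # [5]
```

For the seven-value sample the median position is 0.50(7+1) = 4, so the fourth ordered value is returned. For the four-value sample the position is 2.5, so the second and third ordered values are averaged.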

Shapes of the Distribution
[Figure: three frequency histograms showing a symmetric distribution, a negatively skewed distribution, and a positively skewed distribution.]

Measures of Variability - The Range
The range of a set of data is the difference between the largest and smallest observations.

Measures of Variability - Sample Variance
The sample variance, $s^2$, is the sum of the squared differences between each observation and the sample mean, divided by the sample size minus 1.
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$$

Measures of Variability - Short-cut Formulas for $s^2$
Short-cut formulas for the sample variance, $s^2$, are:
$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} \quad \text{or} \quad s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{X}^2}{n - 1}$$

Measures of Variability - Population Variance
The population variance, $\sigma^2$, is the sum of the squared differences between each observation and the population mean, divided by the population size, N.
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

Measures of Variability - Sample Standard Deviation
The sample standard deviation, s, is the positive square root of the sample variance, and is defined as:
$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$$

Measures of Variability - Population Standard Deviation
The population standard deviation, $\sigma$, is:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

The Empirical Rule (the 68%, 95%, or almost all rule)
For a set of data with a mound-shaped histogram, the Empirical Rule is:
• approximately 68% of the observations are contained within a distance of one standard deviation around the mean, $\mu \pm 1\sigma$;
• approximately 95% of the observations are contained within a distance of two standard deviations around the mean, $\mu \pm 2\sigma$;
• almost all of the observations are contained within a distance of three standard deviations around the mean, $\mu \pm 3\sigma$.

Coefficient of Variation
The Coefficient of Variation, CV, is a measure of relative dispersion that expresses the standard deviation as a percentage of the mean (provided the mean is positive).
The sample coefficient of variation is:
$$CV = \frac{s}{\bar{X}} \times 100 \quad \text{if } \bar{X} > 0$$
The population coefficient of variation is:
$$CV = \frac{\sigma}{\mu} \times 100 \quad \text{if } \mu > 0$$

Percentiles and Quartiles
Data must first be in ascending order. Percentiles separate large ordered data sets into 100ths. The Pth percentile is a number such that P percent of all the observations are at or below that number.
Quartiles are descriptive measures that separate large ordered data sets into four quarters.
The first quartile, $Q_1$, is another name for the 25th percentile. The first quartile divides the ordered data such that 25% of the observations are at or below this value. $Q_1$ is located in the 0.25(n+1)th position when the data are in ascending order. That is,
$$Q_1 = \frac{n + 1}{4}\text{th ordered position}$$
The third quartile, $Q_3$, is another name for the 75th percentile.
The third quartile divides the ordered data such that 75% of the observations are at or below this value. $Q_3$ is located in the 0.75(n+1)th position when the data are in ascending order. That is,
$$Q_3 = \frac{3(n + 1)}{4}\text{th ordered position}$$

Interquartile Range
The Interquartile Range (IQR) measures the spread in the middle 50% of the data; that is, the difference between the observations at the 75th and the 25th percentiles:
$$IQR = Q_3 - Q_1$$
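
The measures of variability and position above can be checked numerically with a short Python sketch. The function names and the sample values are hypothetical. The slides do not say how to handle a fractional ordered position, so the linear interpolation between neighbouring observations used here is an assumption.

```python
import math


def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def shortcut_variance(data):
    """Short-cut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)


def population_variance(data):
    """Sum of squared deviations from the population mean, divided by N."""
    size = len(data)
    mu = sum(data) / size
    return sum((x - mu) ** 2 for x in data) / size


def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of a positive mean."""
    mean = sum(data) / len(data)
    if mean <= 0:
        raise ValueError("CV is defined here only for a positive mean")
    return math.sqrt(sample_variance(data)) / mean * 100


def value_at_position(ordered, position):
    """Value at a 1-based ordered position.

    Assumption: fractional positions are linearly interpolated.
    """
    lower = int(position)
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    fraction = position - lower
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


def quartiles(data):
    """Q1 and Q3 from the 0.25(n+1)th and 0.75(n+1)th ordered positions."""
    ordered = sorted(data)
    n = len(ordered)
    q1 = value_at_position(ordered, 0.25 * (n + 1))
    q3 = value_at_position(ordered, 0.75 * (n + 1))
    return q1, q3


# Hypothetical sample used only for illustration.
sample = [3, 5, 5, 6, 8, 9, 13]
q1, q3 = quartiles(sample)
print(sample_variance(sample))             # 11.0
print(shortcut_variance(sample))           # 11.0
print(math.sqrt(sample_variance(sample)))  # sample standard deviation
print(q1, q3, q3 - q1)                     # 5.0 9.0 4.0
```

With n = 7 the quartile positions are 0.25(8) = 2 and 0.75(8) = 6, so no interpolation is needed for this sample. The short-cut formula gives the same variance as the definition because both numerators equal 66.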