Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Ch.2 Descriptive Statistics Prepared by: M.S Nurzaman, MIDEc. ( deden ) [email protected] (021)33783504 / 081310315562 Introduction The purpose of this lecture is to help you to understand conceptually the meanings of measures of locations (i.e., mean, median, and mode) and measures of variability (i.e., range, variance, standard deviation, and coefficient of variation). Measures of Location And Variability for Ungrouped or Raw Data: 1. By Location Arithmetic Mean: The arithmetic mean (or the average or simply mean) is computed by summing all numbers and dividing by the number of observations Weighted Mean: In some cases the data in the sample or population should not be weighted equally, and each value weighted according to its importance Types of Statistics: Median: The median is the middle value in an ordered array of observations Mode: The mode is the most frequently occurring value in a set of observation 2. by Variability Measures of variability represent the dispersion of a set of data Note that a small value for a measure of dispersion indicates that the data are around the mean; therefore, the mean is a good representative of the data set. On the other hand, a large measure of dispersion indicates that the mean is not a good representative of the data set Also, measures of dispersion can be used when we want to compare the distributions of two or more sets of data Range: The range is the difference between the largest observation of a data set and the smallest observation Variance: An important measure of variability is variance. Variance is the average of the squared deviations from the arithmetic mean The following steps are used to calculate the variance: 1. Find the arithmetic mean. 2. Find the difference between each observation and the mean. 3. Square these differences. 4. Sum the squared differences. 5. Since the data is a sample, divide the number (from step 4 above) by the number of observations minus one, i.e., n-1 (where n is equal to the number of observations in the data set). Later on, this term (n-1) will be called the degrees of freedom. As you see in the above example, the variance is not expressed in the same units as the observations. In other words, the variance is hard to understand because the deviations from the mean are squared, making it too large for logical explanation. These problems can be solved by working with the square root of the variance, which is called standard deviation. Standard Deviation: Both variance and standard deviation provide the same information; one can always be obtained from the other. In other words, the process of computing a standard deviation always involves computing a variance. As we said, since standard deviation is the square root of the variance, it is always expressed in the same units as the raw data Meaning of Standard Deviation: One way to explain the standard deviation as a measure of variation of a data set is to answer questions such as how many measurements are within one, two, and three standard deviations from the mean. To answer questions such as this, we need to talk about empirical rule and Chebyshev's rule Empirical Rule: This rule generally applies to mound-shaped data, but specifically to the data that are normally distributed, i.e., bell shaped. The rule is as follows: Approximately 68% of the measurements (data) will fall within one standard deviation of the mean, 95% fall within two standard deviations, and 97.7% (or almost 100% ) fall within three standard deviations. For example, in the height problem, the mean height was 70 inches with a standard deviation of 3.4 inches. Thus, 68% of the heights fall between 66.6 and 73.4 inches, one standard deviation, i.e., (mean + 1 standard deviation) = (70 + 3.4) = 73.4, and (mean - 1 standard deviation) = 66.6. Ninety five percent (95%) of the heights fall betweeen 63.2 and 76.8 inchesd, two standard deviations. Ninety nine and seven tenths percent (99.7%) fall between 59.8 and 80.2 inches, three standard deviations. See the following figure: The coefficient of variation expresses the standard deviation as a percentage of the mean, i.e., it reflects the variation in a distribution relative to the mean: Coefficient of Variation (C.V.) = (standard deviation / mean) x 100 Wallahu A’lam Bisshowab Next : Discussion