Statistics for Business
Lecture Notes (01.12.2016)
Dr. Cansu Unver Erbas

Measure of variability (measure of dispersion)

Range

The simplest measure of dispersion is the range. It is the difference between the maximum and the minimum values in a data set. In the form of an equation:

$$\text{Range} = \text{Maximum value} - \text{Minimum value}$$

Example 1: What is the range for the given data?

1000 1050 3000 2500 1780 2210 2540 1980 3650 4970 5000 8500 7010

Solution 1: Range = Maximum value - Minimum value = 8500 - 1000 = 7500

Interquartile range

A measure of variability that overcomes the dependency on extreme values is the interquartile range (IQR). This measure of variability is simply the difference between the third quartile, $Q_3$, and the first quartile, $Q_1$, that is, $\text{IQR} = Q_3 - Q_1$. In other words, the interquartile range is the range for the middle 50 per cent of the data. See Figure 1 for an illustration.

Figure 1: Interquartile range (IQR)

Variance

The variance is a measure of variability that uses all the data. The variance is based on the difference between each data value $x_i$ and the mean. This difference is called a deviation about the mean. For a sample, a deviation about the mean is written $(x_i - \bar{x})$. For a population, it is written $(x_i - \mu)$. In the computation of the variance, the deviations about the mean are squared.

Population variance:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

In most statistical applications, the data being analysed are for a sample. When we compute a sample variance, we are often interested in using it to estimate the population variance $\sigma^2$. Although the detailed explanation is beyond the scope of this lecture, it can be shown that if the sum of the squared deviations about the sample mean is divided by $n - 1$, and not $n$, the resulting sample variance provides an unbiased estimate of the population variance. For this reason, the sample variance, denoted by $s^2$, is defined as follows.
Sample variance:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Example 2: Consider the data on class size for the sample of five university classes given below:

Class    Number of students
1        46
2        48
3        47
4        50
5        44

Compute the mean class size $\bar{x}$, the deviation about the mean $(x_i - \bar{x})$, the squared deviation about the mean $(x_i - \bar{x})^2$, and the sample variance.

Solution 2:

Number of students    Mean class size    Deviation about the mean    Squared deviation about the mean
$x_i$                 $\bar{x}$          $(x_i - \bar{x})$           $(x_i - \bar{x})^2$
46                    47                 -1                          1
48                    47                 1                           1
47                    47                 0                           0
50                    47                 3                           9
44                    47                 -3                          9
Total                                    0                           20

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{20}{5 - 1} = \frac{20}{4} = 5$$

Practice 1: Consider the starting salaries in the table below for the 10 business school graduates. Compute the deviation about the mean and the variance.

Monthly salary    Sample mean monthly salary    Deviation about the mean    Squared deviation about the mean
$x_i$             $\bar{x}$                     $(x_i - \bar{x})$           $(x_i - \bar{x})^2$
1500              1765                          -265                        70,225
1650              1765                          -115                        13,225
1800              1765                          35                          1,225
1950              1765                          185                         34,225
2000              1765                          235                         55,225
1750              1765                          -15                         225
1900              1765                          135                         18,225
1560              1765                          -205                        42,025
1940              1765                          175                         30,625
1600              1765                          -165                        27,225
Total                                           0                           292,450

Thus, the sample variance is:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{292{,}450}{10 - 1} \approx 32{,}494.44$$

Standard Deviation

The standard deviation is defined to be the positive square root of the variance. Following the notation we adopted for a sample variance and a population variance, we use $s$ to denote the sample standard deviation and $\sigma$ to denote the population standard deviation. The standard deviation is derived from the variance in the following way:

$$\text{Sample standard deviation} = s = \sqrt{s^2} \qquad \text{and} \qquad \text{Population standard deviation} = \sigma = \sqrt{\sigma^2}$$

What is gained by converting the variance to its corresponding standard deviation? Recall that the units associated with the variance are squared. For example, the sample variance for monthly salary is $s^2 \approx 32{,}494.44$ (£²). Because the standard deviation is the square root of the variance, the units are converted to pounds in the standard deviation. Hence the standard deviation of the starting salary data is approximately £180.26.
In other words, the standard deviation is measured in the same units as the original data. For this reason the standard deviation is more easily compared to the mean and other statistics that are measured in the same units as the original data.

Practice 2: Find the standard deviation for class size and starting monthly salary from the previous examples.

Coefficient of Variation

In some situations we may be interested in a descriptive statistic that indicates how large the standard deviation is relative to the mean. This measure is called the coefficient of variation and is usually expressed as a percentage.

$$\text{Coefficient of variation} = \left( \frac{\text{Standard deviation}}{\text{Mean}} \times 100 \right)\%$$

For instance, for a sample where the standard deviation is 4 and the sample mean is 50, the coefficient of variation is

$$\frac{4}{50} \times 100\% = 8\%$$

This means that the sample standard deviation is 8 per cent of the value of the sample mean. As another example, suppose the standard deviation and the sample mean of a sample are 2 and 46 respectively. The coefficient of variation is

$$\frac{2}{46} \times 100\% \approx 4.35\%$$

which tells us the sample standard deviation is only 4.35 per cent of the value of the sample mean. In general, the coefficient of variation is a useful statistic for comparing the variability of variables that have different standard deviations and different means.

Summary

Statistics is the art and science of collecting, analysing, presenting and interpreting data. Data consists of the facts and figures that are collected and analysed. A set of measurements obtained for a particular element is an observation. For purposes of statistical analysis, data can be classified as qualitative or quantitative. Qualitative data use labels or names to identify an attribute of each element. Quantitative data are numeric values that indicate how much or how many. The following figure summarises the methods available for each type of data.

Figure: Tabular and graphical methods for summarising data

Data
  Qualitative data
    Tabular methods
      - Frequency distribution
      - Relative frequency distribution
      - Percentage frequency distribution
      - Cross-tabulation
    Graphical methods
      - Bar chart
      - Pie chart
  Quantitative data
    Tabular methods
      - Frequency distribution
      - Relative frequency distribution
      - Percentage frequency distribution
      - Cumulative frequency distribution
      - Cumulative relative frequency distribution
      - Cumulative percentage frequency distribution
      - Cross-tabulation
    Graphical methods
      - Dot plot
      - Histogram
      - Ogive
      - Scatter diagram

Practice 3: Please work on questions 9 to 16 from the Exercises on pages 81-82.
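
The measures of variability defined in these notes can be checked numerically. The following is a minimal Python sketch, not part of the original lecture: the function names are illustrative, and the quartile convention is an assumption (the notes do not say how $Q_1$ and $Q_3$ are located, so the sketch uses the default "exclusive" method of Python's `statistics.quantiles`; other conventions can give slightly different quartiles).

```python
import math
import statistics


def data_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


def interquartile_range(values):
    """IQR = Q3 - Q1, using the default quartile method of statistics.quantiles
    (an assumption: the lecture notes do not fix a quartile convention)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def sample_variance(values):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std_dev(values):
    """Positive square root of the sample variance."""
    return math.sqrt(sample_variance(values))


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    mean = sum(values) / len(values)
    return sample_std_dev(values) / mean * 100


# Data taken from the examples in the notes.
range_data = [1000, 1050, 3000, 2500, 1780, 2210, 2540,
              1980, 3650, 4970, 5000, 8500, 7010]
class_sizes = [46, 48, 47, 50, 44]
salaries = [1500, 1650, 1800, 1950, 2000, 1750, 1900, 1560, 1940, 1600]

print(data_range(range_data))                # 7500
print(interquartile_range(range_data))       # depends on the quartile convention
print(sample_variance(class_sizes))          # 5.0
print(round(sample_variance(salaries), 2))   # 32494.44
print(round(sample_std_dev(salaries), 2))    # 180.26
print(round(coefficient_of_variation(salaries), 2))
```

The variance is written out by hand so that the division by $n - 1$ stays visible; the standard library's `statistics.variance` and `statistics.stdev` use the same sample definitions and could be used to cross-check the results.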