GEOGRAPHICAL STATISTICS
GE 2110
ZAKARIA A. KHAMIS

Descriptive statistics

Statistics are interesting ... "only when they are set in wider context that they begin to come to life"

Five Rules for using statistics, by Danny Dorling
1. Often there is little point in using statistics
2. If you do use statistics, make sure they can be understood
3. Do not overuse statistics in your work
4. If you find a complex statistic useful, then explain it clearly
5. Recognize and harness the power of statistics

Zakaria Khamis 5/24/2017

Measures of Central Tendency

In most cases, it is helpful to describe data by a single number that is most representative of the entire collection of data. The single numbers that tend to appear in the middle of the data distribution are the measures of central tendency (MCT). They act as the fulcrum (center of gravity) at which the data balance.

Means

Means are of many types. The most commonly used is the arithmetic mean; others include the geometric and harmonic means.

Arithmetic Mean

This is simply the average of the observations (data). The arithmetic mean is in most cases referred to simply as the mean and is denoted by $\bar{x}$.

The mean, or average, of n numbers is the sum of the numbers divided by n. Mathematically,

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

where $x_i$ denotes the value of observation i, and n denotes the number of observations. The mean value is influenced by extreme measurements.

Geometric Mean

The geometric mean only applies to positive numbers.
It is also often used for a set of numbers whose values are meant to be multiplied together or are exponential in nature, such as data on the growth of the human population or the interest rates of a financial investment.

The geometric mean of n numbers is the nth root of the product of the numbers. Mathematically,

$$GM = \sqrt[n]{\prod_{i=1}^{n} x_i}$$

where $x_i$ denotes the value of observation i, and n denotes the number of observations. This is rarely used in statistical analysis.

Harmonic Mean

This is most commonly used when an average rate is what is of interest, e.g. the average speed of a car or the average rate of population increase. The harmonic mean of n numbers is given by

$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$

Mode and Median

The median is defined as the observation that splits the ranked list of observations (arranged in ascending or descending order) in half. When the number of observations is odd, the median is simply the middle value of the ranked list. When the number of observations is even, the median is the average of the two values in the middle of the ranked list.

The mode refers to the most frequently occurring value. If two numbers tie for most frequent occurrence, the collection has two modes and is called bimodal.

Which of the three measures of central tendency is the most representative? The answer is that it depends on the distribution of the data and the way in which you plan to use the data.

Class examples: 12, 33, 11, 45, 45, 34, 20, 67, 87, 19, 12, 12
Mean =
Mode =
Median =

Measures of Dispersion/Variability

The phenomena and aspects of the world we live in change spatially (from place to place) and temporally (from time to time). For example:
Changes in human population, in the standard of living, in the literacy rate and in prices attract experts to study them in detail and then relate these changes to human life.

In statistics, the MCT measure the center of the data, while dispersion measures how the observations spread away from the center. If the observations are close to the center (arithmetic mean or median), the dispersion, scatter or variation is small. If the observations are spread away from the center, the dispersion is large.

Suppose we have three groups of students who have obtained the following marks in a test:

Group A: 46, 48, 50, 52, 54 (Mean = 50)
Group B: 30, 40, 50, 60, 70 (Mean = 50)
Group C: 40, 50, 60, 70, 80 (Mean = 60)

The idea of dispersion is important in the study of wages of workers, prices of commodities, standards of living of different people, distribution of wealth, distribution of land among farmers and various other fields of life. It helps to identify such variation and to solve any problems that might arise from it.

Range

The range is the difference between the highest and the lowest value in a series of data:

$$\text{Range} = x_{\max} - x_{\min}$$

Variance and Standard Deviation

The variance represents the average squared deviation of an observation from the mean:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$

The standard deviation is the square root of the variance:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$

The standard deviation of a set is a measure of how much a typical number in the set differs from the mean. The greater the standard deviation, the more the numbers in the set vary from the mean.

Imagine a researcher examining the monthly salaries of Zanzibar secondary school teachers.
He took a sample of 10 secondary school teachers: 44, 50, 38, 96, 42, 47, 40, 39, 46, 50 (figures in '0000).

He calculated the mean = 49.2. This tells us that, on average, secondary school teachers receive 49.2 per month. However, there might be variation, because there are different categories of teacher in Zanzibar: diploma, bachelor's degree and master's degree holders, in privately and publicly owned schools.

Standard deviation = 17
Mean ± standard deviation:
49.2 - 17 = 32.2
49.2 + 17 = 66.2

This means that most secondary school teachers receive between 32.20 and 66.20 TSh.

Quartiles

While the standard deviation (SD) is the measure of dispersion associated with the mean, quartiles measure dispersion associated with the median. Consider an ordered set of numbers whose median is m. The lower quartile is the median of the numbers that occur before m. The upper quartile is the median of the numbers that occur after m.
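The salary example and the quartile definition above can be checked with a short Python sketch. It relies only on the standard library `statistics` module; the helper name `quartiles_by_halves` is made up for illustration. One point worth noticing: dividing by n gives a standard deviation of about 16.1 for this sample, while dividing by n - 1 gives about 17.0, which is the figure quoted in the example.

```python
import statistics

# Monthly salaries from the example (figures in '0000, as in the slide).
salaries = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

mean = statistics.mean(salaries)

# Standard deviation with divisor n (population form) and with
# divisor n - 1 (sample form).
sd_n = statistics.pstdev(salaries)
sd_n_minus_1 = statistics.stdev(salaries)


def quartiles_by_halves(values):
    """Return (lower quartile, median, upper quartile).

    Hypothetical helper: the quartiles are the medians of the values
    before and after the overall median. When the number of
    observations is odd, the middle observation belongs to neither half.
    """
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower_half = ranked[:half]
    upper_half = ranked[n - half:]
    return (
        statistics.median(lower_half),
        statistics.median(ranked),
        statistics.median(upper_half),
    )


q1, median, q3 = quartiles_by_halves(salaries)

print(f"mean = {mean}")
print(f"SD (divisor n) = {sd_n:.2f}, SD (divisor n - 1) = {sd_n_minus_1:.2f}")
print(f"lower quartile = {q1}, median = {median}, upper quartile = {q3}")
```

For this sample the ranked halves are 38, 39, 40, 42, 44 and 46, 47, 50, 50, 96, so the sketch reports a lower quartile of 40, a median of 45 and an upper quartile of 50. Other quartile conventions exist and can give slightly different values.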
Inter-Quartile Range

In some statistical analyses we need the difference between the quartiles; for this, the inter-quartile range is calculated. The inter-quartile range is the difference between the 75th and 25th percentiles. When the data have been ranked from lowest to highest, with n observations, the 25th percentile is represented by observation

$$\frac{n + 1}{4}$$

and the 75th percentile is represented by observation

$$\frac{3(n + 1)}{4}$$

This provides much more detailed information about the data, for it gives a picture of the variability within the data with the outlying values removed.

Skewness and Kurtosis

Skewness measures the degree of asymmetry exhibited by the data. The data can exhibit positive or negative skewness. If the mean of the data is greater than its median, the data are positively skewed; if the mean is less than the median, the data are negatively skewed. Mathematically,

$$\text{skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3}$$

Kurtosis measures the peakedness of the data relative to the normal distribution. Data with a high degree of peakedness are said to be leptokurtic and have a kurtosis value of more than 3. Flat data have a kurtosis value of less than 3 and are called platykurtic. Mathematically,

$$\text{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4}$$
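As a rough illustration of the two formulas above, the following Python sketch computes skewness and kurtosis for two data sets taken from earlier in the lecture. The function name `skewness_and_kurtosis` is hypothetical, and the sketch assumes that s is the standard deviation computed with divisor n, to match the way the formulas are written.

```python
import math


def skewness_and_kurtosis(values):
    """Return (skewness, kurtosis) from the moment formulas.

    Assumption: s is the standard deviation with divisor n.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    skewness = sum((x - mean) ** 3 for x in values) / (n * s ** 3)
    kurtosis = sum((x - mean) ** 4 for x in values) / (n * s ** 4)
    return skewness, kurtosis


# Teacher salary sample and Group A test marks from earlier slides.
salaries = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]
group_a = [46, 48, 50, 52, 54]

salary_skew, salary_kurt = skewness_and_kurtosis(salaries)
group_a_skew, group_a_kurt = skewness_and_kurtosis(group_a)

print(f"salaries: skewness = {salary_skew:.2f}, kurtosis = {salary_kurt:.2f}")
print(f"group A:  skewness = {group_a_skew:.2f}, kurtosis = {group_a_kurt:.2f}")
```

The salary sample has a mean (49.2) above its median (45) because of the single value 96, so its skewness is positive and its kurtosis is above 3. Group A is symmetric about 50, so its skewness is 0 and its kurtosis is below 3.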