Basic statistics
Week 10 Lecture 1
Thursday, May 20, 2004
ISYS3015 Analytic methods for IS professionals
School of IT, University of Sydney

Meanings of statistics
- Two meanings of statistics
  - Statistics as a group of computational procedures that allow us to find meaning in numerical data
  - A statistic as the value (number) you get by performing one of those procedures on a sample
- Population parameters and sample statistics: the symbols used to designate each factor

  Factor               Population parameter   Sample statistic
  Mean                 μ                      x̄ or M
  Standard deviation   σ                      S or SD
  Number or total      N                      n

Functions of statistics
- Descriptive statistics
  - Describe what the data look like
- Inferential statistics
  - Draw inferences about a large population from a sample

Descriptive statistics
Points of central tendency: the central point around which the data revolve.
- Mode
  - The category or observation that appears most frequently in the distribution
  - The only appropriate measure of central tendency for nominal variables
- Median
  - The midpoint of a distribution
  - Frequently used to describe the central tendency of ordinal variables
- Mean
  - Arithmetic average of the values within a data set
  - $M = \frac{\sum x_i}{n}$
  - Appropriate for interval and ratio variables

Descriptive statistics (cont)
Example: high school student Joe's daily grades in February

            Monday   Tuesday   Wednesday   Thursday   Friday
  Week 1    92       69        91          70         90
  Week 2    89       72        87          73         86
  Week 3    85       75        84          76         83
  Week 4    83       77        81          78         79

Descriptive statistics (cont)
Measures of variation: dispersion or deviation
- Range
- Average deviation
  - $AD = \frac{\sum |x_i - M|}{n}$
- Standard deviation
  - The standard measure of variability in most statistical operations
  - $s = \sqrt{\frac{\sum (x_i - M)^2}{n}}$
- Variance
  - The standard deviation squared
  - $s^2 = \frac{\sum (x_i - M)^2}{n}$

Shape of the distribution
- The frequency of values from different ranges of the variable
- Use a histogram to visually inspect the shape
[Figure: histogram of Joe's daily score. x-axis: Joe's daily score, in bins of width 5 from 65.0 to 95.0; y-axis: Frequency. Std. Dev = 7.11, Mean = 81.0, N = 20.]

Shape of the distribution: normal distribution
- Many characteristics of human populations follow a normal distribution
- Horizontally symmetrical and bell shaped
  - Most of the scores in a normal distribution tend to occur near the center, while more extreme scores on either side of the center become increasingly rare
  - The mean, median, and mode of the normal distribution are the same

Features of normal distribution
Predictable percentages of the population lie within any given portion of the curve
- About 68% of the population lie within 1 standard deviation of the mean
- About 95.45% of the cases lie within 2 standard deviations of the mean
- 95% of the cases fall within 1.96 standard deviation units of the mean

The family of normal curves
- The mean determines where the midpoint of the distribution falls
- The standard deviation changes the shape of the distribution without affecting the midpoint
- Standard normal distribution
  - Mean: 0
  - Standard deviation: 1

Measuring relative performance
Example
- A student, John, obtained 60 out of 100 in a math exam and 50 out of 100 in an English exam
- Mean score? Standard deviation?
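The measures of central tendency and variation defined earlier can be checked against Joe's February grades. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the lecture. It assumes that the "Std. Dev" printed on the histogram uses the n - 1 denominator, which differs from the n denominator in the lecture formulas.

```python
import statistics

# Joe's daily grades in February (Monday to Friday), copied from the example table.
grades = [
    92, 69, 91, 70, 90,  # week 1
    89, 72, 87, 73, 86,  # week 2
    85, 75, 84, 76, 83,  # week 3
    83, 77, 81, 78, 79,  # week 4
]

n = len(grades)
mean = statistics.mean(grades)            # M = sum(x_i) / n
median = statistics.median(grades)        # midpoint of the sorted grades
mode = statistics.mode(grades)            # most frequent grade
value_range = max(grades) - min(grades)   # highest grade minus lowest grade

# Average deviation: mean absolute distance of each grade from M.
average_deviation = sum(abs(x - mean) for x in grades) / n

# Variance and standard deviation with n in the denominator, as in the lecture formulas.
variance_n = statistics.pvariance(grades)
sd_n = statistics.pstdev(grades)

# Assumption: the histogram's "Std. Dev" divides by n - 1 instead of n.
sd_n_minus_1 = statistics.stdev(grades)

print(f"n = {n}, mean = {mean}, median = {median}, mode = {mode}")
print(f"range = {value_range}, average deviation = {average_deviation:.2f}")
print(f"variance (n) = {variance_n}, SD (n) = {sd_n:.2f}, SD (n - 1) = {sd_n_minus_1:.2f}")
```

For these 20 grades the mean is 81, the median 82 and the mode 83. The n - 1 standard deviation is about 7.11, which matches the histogram annotation, while the n-denominator formula gives about 6.93.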

Standard scores
- Z-score
  - Measures the distance, in standard deviation units, of any value in a distribution from the mean
  - $Z = (x - \mu)/\sigma$
- John's standard scores
  - $Z_{math} = (60 - 55)/10 = 0.5$
  - $Z_{English} = (50 - 45)/5 = 1$

Create index by Z scores in survey research
- Triangulation of measures
- We have measures of income and years of education, and we want to combine them to form a socio-economic index
  - Annual incomes vary from 5,000 to 500,000
  - Years of education vary from 0 to 20

Computing an index score using Z scores
Given population values:

                       Income    Years of education
  Mean                 65,000    11
  Standard deviation   22,000    4

Suppose 2 individuals:

                       Income    Years of education
  A                    64,000    16
  B                    86,000    9

Case A
- Income: (64,000 - 65,000)/22,000 = -0.05
- Education: (16 - 11)/4 = 1.25
- Socio-economic index score: -0.05 + 1.25 = 1.20
Case B
- Income: (86,000 - 65,000)/22,000 = 0.95
- Education: (9 - 11)/4 = -0.50
- Socio-economic index score: 0.95 + (-0.50) = 0.45

Correlation: measure of relationship
- A measure of the relation between two or more variables
- Statistic used: correlation coefficient
  - Between -1 and 1
- Direction of relationship
  - The sign of the correlation coefficient
- Strength of relationship
  - The magnitude (absolute value) of the correlation coefficient

Pearson r correlation
- Simple linear correlation
- The measurement scales used should be at least interval scales
- A scattergram (scatter plot) provides a visual inspection of linear correlation
- Excel can calculate correlation
- Correlation does not indicate causation

Example
- We collected data about the salaries (y) and years of experience (x) for a sample of 50 auditors
- Is there any relationship between salary and years of experience?
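As an alternative to Excel, the Pearson r for a question like the one above can be sketched in a few lines of Python. The auditors' actual data are not reproduced in these notes, so the six (x, y) pairs below are hypothetical values chosen only to make the sketch runnable, and `pearson_r` is an illustrative helper name. The formula used is the standard one: the sum of products of deviations divided by the square root of the product of the two sums of squared deviations.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_sq_x = sum(dx * dx for dx in dev_x)
    sum_sq_y = sum(dy * dy for dy in dev_y)
    if sum_sq_x == 0 or sum_sq_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sum_products / math.sqrt(sum_sq_x * sum_sq_y)


# Hypothetical illustration only: NOT the lecture's sample of 50 auditors.
years_experience = [1, 3, 4, 6, 8, 10]    # x
salary_thousands = [30, 34, 37, 41, 47, 52]  # y

r = pearson_r(years_experience, salary_thousands)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.3f}, direction = {direction}, strength = {abs(r):.3f}")
```

The sign of `r` gives the direction of the relationship and its absolute value gives the strength. A high `r` still says nothing about whether experience causes higher salaries.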