Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Measures of Central Tendency Your specific assignment is to select a type of quantitative data to collect from your own life. Some examples could be: How many minutes do you spend on studying every day. How many hours do you spend at work every day? Number of phone calls you get each day. In a brief paper, describe the data you are going to collect. Start collecting data today so you have can have at least 10 observations, preferably more. Note: you only have to choose one variable, and then collect 10 days worth of data on that one variable. Using the data you have collected in above: Calculate the mean, median, and mode. Are these numbers higher or lower than you would have expected? Which of these measures of central tendency do you think most accurately describes the lifestyle variable you are looking at? Write a brief paper with your calculations and the answers to these questions. Plan for data collection I am going to collect information on the number of emails received in my inbox each day. My habit is to open the inbox at exactly 7 am on each day. Then I open all unread emails and read them and send a reply if necessary. For data collection purpose, I have decided to note down the number of such emails found unread in the inbox at exactly 7 am each day. Collected data I have collected the data for 13 days. The following table contains the number of emails received in my inbox each day for 13 days. Days 1 2 3 4 5 6 7 8 9 10 11 12 13 Number of emails received 17 12 10 17 17 18 11 14 13 16 7 12 8 Statistical Analysis After collecting the data, the next step is to compute certain numbers which are used to summarize the data. They are called measures of central tendency. The mean, the median and the mode are three among the most important measures of central tendency. Computation of the mean The mean is defined as the sum of the observations divided by the number of observations. βπ₯ Mean = π , where π is the number of observations and β π₯ is the sum of the observations. Number of observations = n = 13 Sum of the observations = β π₯ = 17+ 12 + 10 + 17 + 18 + 11 + 14 + 13 + 16 + 7 + 12 + 8 = 172 Mean = βπ₯ π = 172 13 = 13.23 (correct to 2 places of decimal) Computation of the median The median is defined as the middle most observation when the observations are arranged in the order of magnitude. The following are the observations arranged in the increasing order of magnitude. Order of Observations 1 2 3 4 5 6 7 8 9 10 11 12 13 Value of observations 7 8 10 11 12 12 13 14 16 17 17 17 18 Since there are 13 observations, the middlemost observation is the (13+1)/2 = 7th observation. Therefore, Median = 7th observation = 13 Computation of the mode The mode is defined as the most frequently occurring observation. It is observation with maximum frequency. The following table gives the frequencies of the observations. Observations 7 8 10 11 12 13 14 16 17 18 Frequency 1 1 1 1 2 1 1 1 3 1 The observation with maximum frequency is 17. Mode = 17. Comparison of the three measures Mean = 13.23 Median = 13 Mode = 17 Here the mean and median appears to be satisfactory as a measure of central tendency. But the mode is greater than what is expected as a central value. Both mean and median describes the life style somewhat accurately. But the mean has the disadvantage that it is not a whole number, even though all observations are whole numbers. The mode has the advantage that it is most repeating item. Out of 13 observations, the observation 17 repeats 3 times. Collection of additional data The process of collecting data is continued for 5 more days. The following table contains the number of emails received in my inbox each day for 18 days. Days 1 17 Number of emails received 2 12 3 10 4 17 5 17 6 18 7 11 8 14 9 13 10 16 11 7 12 12 13 8 14 9 15 12 16 17 17 18 18 14 The important measures of central tendency are computed for this larger sample. Computation of the mean The mean is defined as the sum of the observations divided by the number of observations. βπ₯ Mean = π , where π is the number of observations and β π₯ is the sum of the observations. Number of observations = n = 18 Sum of the observations = β π₯ = 17+ 12 + 10 + 17 + 18 + 11 + 14 + 13 + 16 + 7 + 12 + 8 + 9 + 12 + 17 +1 8 + 14 = 242 Mean = βπ₯ π = 242 18 = 13.44 (correct to 2 places of decimal) Computation of the median The median is defined as the middle most observation when the observations are arranged in the order of magnitude. The following are the observations arranged in the increasing order of magnitude. Days 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 Number of emails received 7 8 9 10 11 12 12 12 13 14 14 16 17 17 17 17 18 18 Since there are 18 observations, the middlemost observation is the (18+1)/2 = 9.5th observation. Therefore, Median = (9th Obs + 10th Obs)/2 = (13 + 14)/2 = 13.5 Computation of the mode The mode is defined as the most frequently occurring observation. It is observation with maximum frequency. The following table gives the frequencies of the observations. Observations 7 8 9 10 11 12 13 14 16 17 18 Frequency 1 1 1 1 1 2 1 2 1 4 2 The observation with maximum frequency is 17. Mode = 17. Comparison of the three measures Mean = 13.44 Median = 13.5 Mode = 17 Here the mean and median appears to be satisfactory as a measure of central tendency. But the mode is greater than what is expected as a central value. Both mean and median describes the life style somewhat accurately. But the mean has the disadvantage that it is not a whole number, even though all observations are whole numbers. The mode has the advantage that it is the most repeating item. Out of 18 observations, the observation 17 repeats 4 times. When the sample size increased from 13 to 18, there is no change in the value of the mode. The mean and the median changed only slightly. So far as the present study is concerned, the central value is more or less stable. There is no necessity of increasing the sample size as far as the present study is concerned. Part II ANALYSIS USING EXCEL ENTERIG THE DATA IN EXCEL In cell A1 write βDaysβ. From cells A2 to A19 enter the numbers of the day on which the observations are taken. In cell B1 write βEmailsβ. From cells B2 to B19 enter the number of emails on each day. Thus the data is entered in to excel. PREPARATION OF FREQUENCY DISTRIBUTION AND DESCRIPTIVE MEASURES In cell A21 write βMinimumβ In cell B21 write the formula β=MIN(B2:B19)β and press Enter In cell A22 write βMaximumβ In cell B22 write the formula β=MAX(B2:B19)β and press Enter Thus we have computed the minimum and maximum of the data as 7 and 18 In a similar way mean, median, mode and standard deviation are calculated in cells B23, B24, B25 and B26 using the formulae β=AVERAGE(B2:B19)β, β=MEDIAN(B2:B19)β, β=MODE(B2:B19)β, β=STDEV(B2:B19)β. In cell D1 write βValueβ. In cell E1 write βFrequencyβ. From cells D2 to D13 enter the numbers 7 to 18 Select cells E2 to E13. Enter the formula β=FREQUENCY(B2:B19,D2:D13)β and press shift + ctrl + Enter Fitting a normal distribution The mean is 13.44 and standard deviation is 3.55 For a normal distribution, approximately 68% of the data is within 1SD of the mean, approximately 95% of the data is within 2SDs of the mean, approximately 99% of the data is within 3SDs of the mean. For our data, 50% of the data is within 1SD of the mean, 100% of the data is within 2SDs of the mean, 100% of the data is within 3SDs of the mean. We can infer that our data is approximately normally distributed.