INA: Element 2 Descriptive Statistics
Max Carroll
6/15/2012

...An explanation to the spreadsheet

Contents
Element 2: Descriptive Statistics
Calculating the Mode and the Median
Producing a frequency table (with suitable class intervals)
Calculating the mean based on the frequency table and using Excel's own standard formulae
Calculating the standard deviation based on the frequency table and using Excel's own standard formulae

INA: Element 2 Max Carroll 21010175

Element 2: Descriptive Statistics

Calculating the Mode and the Median

First, the data was copied and pasted into an Excel spreadsheet. The data range was given the label "UNITS" so that it could be referenced from formulas without typing in the cell range each time. In formulas, the label "UNITS" is interchangeable with "K4:O9", the cell range that contains all of the raw data.

[Figure: cells K4 to O9 selected, containing the data set from the case study and carrying the label "UNITS".]

The next step was to create formulas that compute the mode and the median.

[Figure: formulas used to calculate the median and mode from the data set in K4:O9 ("UNITS"). The left shows the formulas and the right shows the calculated values.]
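As a rough cross-check outside Excel, the mode and median step can be sketched in Python. The `units` list below is a hypothetical stand-in for the 30 values in K4:O9 (the report's actual data set is not reproduced here); only its size, minimum (23) and maximum (77) are taken from the report. The spreadsheet formulas themselves are not shown in the text, so the comparison with Excel's MODE and MEDIAN functions is an assumption.

```python
import statistics

# Hypothetical stand-in for the 30 values labelled "UNITS" (K4:O9).
# Only the count (30), minimum (23) and maximum (77) come from the report.
units = [23, 31, 35, 38, 40, 41, 42, 42, 43, 44,
         44, 45, 45, 45, 46, 46, 47, 48, 48, 49,
         50, 51, 52, 53, 55, 57, 60, 64, 70, 77]

# Most frequent value; assumed to play the role of Excel's MODE function.
mode_value = statistics.mode(units)

# Middle value; with an even count this is the average of the two
# middle values, which is also how Excel's MEDIAN function behaves.
median_value = statistics.median(units)

print("mode:", mode_value)
print("median:", median_value)
```

Binding the data to a single name, as the report does with "UNITS", means each statistic is computed from the same source without repeating the range.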
I also calculated a few other values that I thought might be useful: the mean based on the raw data (to compare later with the mean calculated from the frequency table), and the minimum and maximum values, to help decide the class interval width and the lower and upper class limits.

Producing a frequency table (with suitable class intervals)

First, the minimum and maximum values were calculated from the data (as described in the previous section). The lowest lower class limit was set by rounding the minimum value down to the nearest 10 (23 to 20), and the highest upper class limit was set by rounding the maximum value up to the nearest 10 (77 to 80). A class width of 5 was chosen, which divides the range from 20 to 80 into a table of 12 rows.

[Figure: the frequency table values, followed by a table of the formulae used in the same table.]

Calculating the mean based on the frequency table and using Excel's own standard formulae

The (f) Frequency and (fx) columns were summed. The mean was then calculated by dividing the totals: Σ(fx)/Σ(f)

[Figure: the values used in the calculation (left), the mean value (upper right) and the formula used in the cell to calculate the mean (lower right).]

The Excel formula =AVERAGE(UNITS) returned an answer of 45.633. Calculating the average using the formula Σ(fx)/Σ(f) gave a value of 45.833. The two answers differ by only 0.2, which is negligible in comparison with the magnitude of the numbers.

Calculating the standard deviation based on the frequency table and using Excel's own standard formulae

Totals were calculated for the (f), (fx) and (fx^2) columns of the frequency table. The total of the (fx) column was then squared to give the [Σ(fx)]^2 value.
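The frequency table and its totals can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the spreadsheet itself: `units` is hypothetical data (only the count, minimum and maximum match the report), each class is assumed to include its lower limit and exclude its upper limit, and x is assumed to be the class midpoint, none of which the report states explicitly.

```python
import math

# Hypothetical stand-in for the 30 "UNITS" values (not the report's data).
units = [23, 31, 35, 38, 40, 41, 42, 42, 43, 44,
         44, 45, 45, 45, 46, 46, 47, 48, 48, 49,
         50, 51, 52, 53, 55, 57, 60, 64, 70, 77]

CLASS_WIDTH = 5
lower = math.floor(min(units) / 10) * 10   # round minimum down: 23 -> 20
upper = math.ceil(max(units) / 10) * 10    # round maximum up:   77 -> 80

rows = []
for lo in range(lower, upper, CLASS_WIDTH):
    hi = lo + CLASS_WIDTH
    # Assumption: a class holds values with lo <= value < hi.
    f = sum(1 for u in units if lo <= u < hi)
    # Assumption: x is the class midpoint.
    x = (lo + hi) / 2
    rows.append({"class": (lo, hi), "f": f, "x": x,
                 "fx": f * x, "fx2": f * x * x})

sum_f = sum(r["f"] for r in rows)       # Σ(f)
sum_fx = sum(r["fx"] for r in rows)     # Σ(fx)
sum_fx2 = sum(r["fx2"] for r in rows)   # Σ(fx^2)

grouped_mean = sum_fx / sum_f           # Σ(fx)/Σ(f)
raw_mean = sum(units) / len(units)      # analogous to =AVERAGE(UNITS)

print(len(rows), "classes; grouped mean", grouped_mean, "raw mean", raw_mean)
```

Under the midpoint assumption, every value is replaced by a number at most half a class width away, which is one plausible reason a grouped mean and a raw mean differ slightly.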
[Figure: calculated values (left) and the formulae used to calculate them (right).]

The above values were then substituted into the equation below to obtain the answer. However, the manner in which the equation was applied affected the final value.

$$s = \sqrt{\frac{\sum f x^2 - \frac{\left(\sum f x\right)^2}{n}}{n - 1}}$$

where n = Σ(f).

[Figure: the calculated standard deviations, with the formulae used to calculate those values shown to the right.]

The first value (cell D31) was calculated by soft-coding the formula, using cell references to the calculated values of the frequency table. However, when the calculation was done in a slightly different way (cell D35), the result was slightly different. The differences are as follows:

- Cell D35 takes the square root in a second stage, as opposed to everything being done in a single stage.
- The cell references E19 (30) and E19-1 have been replaced with the numbers 30 and 29 respectively.
- Instead of using the [Σ(fx)]^2 value, the Σ(fx) value is squared in the formula itself (i.e. F23 became F19^2).

Although the above alterations either refer to or substitute for identical values, they seemed to produce different answers. I assume this is because of the way Excel rounds figures in formulas in different circumstances. What I can conclude, however, is that the method in which the square root was taken in a separate step gave an answer closer to Excel's built-in formula for calculating standard deviation.
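The two ways of applying the standard deviation formula can be sketched in Python for comparison. As before, `units` is hypothetical data rather than the report's data set, the class midpoint convention is an assumption, and `statistics.stdev` (the sample standard deviation, dividing by n - 1) is used here only as a stand-in for Excel's built-in function, which the report does not name.

```python
import math
import statistics
from collections import Counter

# Hypothetical stand-in for the 30 "UNITS" values (not the report's data).
units = [23, 31, 35, 38, 40, 41, 42, 42, 43, 44,
         44, 45, 45, 45, 46, 46, 47, 48, 48, 49,
         50, 51, 52, 53, 55, 57, 60, 64, 70, 77]

# Frequency of each class midpoint for width-5 classes [lo, lo + 5).
# Assumption: x is the class midpoint, e.g. 23 falls in 20-25 -> 22.5.
freq = Counter((u // 5) * 5 + 2.5 for u in units)

n = sum(freq.values())                              # Σ(f)
sum_fx = sum(f * x for x, f in freq.items())        # Σ(fx)
sum_fx2 = sum(f * x * x for x, f in freq.items())   # Σ(fx^2)

# One stage: square root and fraction in a single expression, using the
# computed totals throughout (in the spirit of cell D31).
s_one_stage = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))

# Two stages: variance first with 30 and 29 typed in as literals, then the
# square root taken separately (in the spirit of cell D35).
variance = (sum_fx2 - sum_fx ** 2 / 30) / 29
s_two_stage = math.sqrt(variance)

# Sample standard deviation of the raw, ungrouped values.
s_raw = statistics.stdev(units)

print(s_one_stage, s_two_stage, s_raw)
```

In this sketch the one-stage and two-stage results are expected to agree to within floating-point rounding, because both substitute identical values; the grouped results differ from the raw-data result because of the midpoint approximation.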