Download 1 - TU Delft: CiTG

1. Determine the range and estimate a reasonable number of intervals as well as the class limits. For a small dataset of size 7, a reasonable number of classes is between 3 and 7 (but not more than 7).
2. Estimate the relative and absolute frequencies, as well as both frequencies for the cumulative frequency distributions. Not much can go wrong here.
3. Make a histogram and a cumulative frequency distribution (figures). Interpret these briefly (one sentence!). The interpretation should state whether the histogram looks symmetrical or is skewed to the left or to the right.
4. Calculate the median, mode, and arithmetic mean. The mode is the value with the largest frequency, or the average value of the measurements in the class with the largest frequency. If several classes have the same maximum frequency, then the averages of those classes form the modes of the dataset.
5. Calculate the variance, standard deviation and coefficient of variation. Not much can go wrong here. For the standard deviation you may use the expression that divides by N (for populations) or by N-1 (for samples of populations).
6. Calculate the skewness and kurtosis. Interpret the results briefly (one sentence!). There are several expressions available to calculate the skewness and kurtosis of a dataset, and in this computer exercise all possible expressions are marked OK.

The equation for skewness in Excel is defined as:

$$\frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^{3}$$

The formula for skewness on slide 223 of the PPT presentation is different and given by:

$$a_3 = \frac{m_3}{m_2^{3/2}}, \qquad \alpha_3 = \frac{n^2}{(n-1)(n-2)}\, a_3$$

However, in both cases you should check whether the skewness is smaller or larger than 0, in order to identify whether the tail to the left is longer than the tail to the right, or vice versa.
The equation for kurtosis in Excel is defined as:

$$\frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^{4} - \frac{3(n-1)^2}{(n-2)(n-3)}$$

The formula for kurtosis on slide 226 of the PPT presentation is given by:

$$a_4 = \frac{m_4}{m_2^{2}}, \qquad \alpha_4 = \frac{n^3}{(n-1)(n-2)(n-3)}\, a_4$$

These two expressions give very different outcomes, because the Excel formula gives a so-called excess kurtosis. If the excess kurtosis is larger than 0, the dataset is steeper (more peaked) than a normal distribution; a value smaller than 0 corresponds to a distribution that is flatter than a normal distribution. For the expressions on slide 226, α4 should be compared with the value 3.

7. Calculate the parameters for a normal distribution and plot it as a graph (figure). Is the normal distribution suitable for this dataset? The parameters may be calculated with the sample average and sample standard deviation formulas of questions 4 and 5. Suitability should be judged on normal probability paper, where the data are plotted in ascending order on the horizontal axis against i/(N+1) on the vertical axis for i = 1, …, 7 (that is, 12.5%, 25%, 37.5%, …, 87.5%). If an acceptable straight line can be drawn through the data points, it can be concluded that the data can be modeled by a normal distribution.
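The skewness and kurtosis expressions of question 6 and the plotting positions of question 7 can be sketched in Python as follows. This is only an illustration under stated assumptions: the seven data values are invented, m_k is taken to be the k-th central moment with division by n, s is taken to be the sample standard deviation (division by n-1), and the Excel expressions follow the documented `SKEW` and `KURT` definitions. The function names are hypothetical and do not come from the exercise.

```python
import math
from statistics import NormalDist

# Hypothetical 7-value sample, invented purely for illustration.
sample = [2.0, 3.0, 3.0, 4.0, 5.0, 7.0, 11.0]


def mean(data):
    return sum(data) / len(data)


def central_moment(data, k):
    """k-th central moment m_k with division by n (assumed slide convention)."""
    xbar = mean(data)
    return sum((x - xbar) ** k for x in data) / len(data)


def sample_std(data):
    """Standard deviation with division by n - 1 (sample formula)."""
    xbar = mean(data)
    return math.sqrt(sum((x - xbar) ** 2 for x in data) / (len(data) - 1))


def skewness_excel(data):
    """Skewness following the Excel-style expression."""
    n, xbar, s = len(data), mean(data), sample_std(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - xbar) / s) ** 3 for x in data)


def skewness_slides(data):
    """alpha_3 = n^2 / ((n-1)(n-2)) * a_3, with a_3 = m_3 / m_2^(3/2)."""
    n = len(data)
    a3 = central_moment(data, 3) / central_moment(data, 2) ** 1.5
    return n ** 2 / ((n - 1) * (n - 2)) * a3


def kurtosis_excel(data):
    """Excess kurtosis following the Excel-style expression (compare with 0)."""
    n, xbar, s = len(data), mean(data), sample_std(data)
    fourth = sum(((x - xbar) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def kurtosis_slides(data):
    """alpha_4 = n^3 / ((n-1)(n-2)(n-3)) * a_4, with a_4 = m_4 / m_2^2 (compare with 3)."""
    n = len(data)
    a4 = central_moment(data, 4) / central_moment(data, 2) ** 2
    return n ** 3 / ((n - 1) * (n - 2) * (n - 3)) * a4


def probability_paper_points(data):
    """Sorted values with plotting position i/(N+1) and the matching
    standard normal quantile, which is what normal probability paper encodes."""
    n = len(data)
    points = []
    for i, x in enumerate(sorted(data), start=1):
        p = i / (n + 1)
        points.append((x, p, NormalDist().inv_cdf(p)))
    return points


print("skewness (Excel-style):", skewness_excel(sample))
print("skewness (slides):     ", skewness_slides(sample))
print("kurtosis (Excel-style, excess):", kurtosis_excel(sample))
print("kurtosis (slides):             ", kurtosis_slides(sample))
for x, p, z in probability_paper_points(sample):
    print(f"x = {x:5.1f}   i/(N+1) = {p:.3f}   normal quantile = {z:+.3f}")
```

Plotting the sorted values against the normal quantile column on ordinary linear axes is equivalent to plotting them against i/(N+1) on normal probability paper, so an approximately straight line in that plot supports the normal model.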
