PROGRAMME 27: STATISTICS

STROUD. Worked examples and exercises are in the text.

Programme 27: Statistics

Introduction
Arrangement of data
Histograms
Measure of central tendency
Dispersion
Frequency polygons
Frequency curves
Normal distribution curve
Standardized normal curve

Introduction

Statistics is concerned with the collection, ordering and analysis of data. Data consists of sets of recorded observations or values. Any quantity that can have a number of values is a variable. A variable may be one of two kinds:

(a) Discrete – a variable whose possible values can be counted
(b) Continuous – a variable whose values can be measured on a continuous scale

Arrangement of data

Table of values
Tally diagram
Grouped data
Grouping with continuous data
Relative frequency
Rounding off data
Class boundaries

Table of values

A set of data:

28 31 29 27 30 29 29 26 30 28
28 29 27 26 32 28 32 31 25 30
27 30 29 30 28 29 31 27 28 28

can be arranged in ascending order:

25 26 26 27 27 27 27 28 28 28
28 28 28 28 29 29 29 29 29 29
30 30 30 30 30 31 31 31 32 32

Once the data is in ascending order it can be entered into a table. The number of occasions on which any particular value occurs is called the frequency, denoted by f.

Value    Number of times (f)
25       1
26       2
27       4
28       7
29       6
30       5
31       3
32       2

Tally diagram

When dealing with large numbers of readings, instead of writing all the values in ascending order, it is more convenient to compile a tally diagram, recording the range of values of the variable and adding a stroke for each occurrence of that reading.

[Figure: tally diagram]

Grouped data

If the range of values of the variable is large, it is often helpful to consider these values arranged in regular groups or classes.

Grouping with continuous data

With continuous data the group boundaries are given to the same number of significant figures or decimal places as the data.

[Table: grouped continuous data]

Relative frequency

If the frequency of any one group is divided by the sum of the frequencies, the ratio is called the relative frequency of that group. Relative frequencies can be expressed as percentages.

[Table: relative frequencies]

Rounding off data

If the value 21.7 is expressed to two significant figures, the result is rounded up to 22. Similarly, 21.4 is rounded down to 21. To maintain consistency of group boundaries, middle values will always be rounded up, so that 21.5 is rounded up to 22 and 42.5 is rounded up to 43.
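The arrangement steps described so far (frequency table, relative frequency, and rounding middle values up) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are assumptions made here, and the rounding helper is written with positive data in mind. Python's built-in round() sends ties to the nearest even digit, so the decimal module is used to get the "middle values always round up" rule.

```python
from collections import Counter
from decimal import Decimal, ROUND_HALF_UP

# The 30 readings from the "Table of values" example.
data = [
    28, 31, 29, 27, 30, 29, 29, 26, 30, 28,
    28, 29, 27, 26, 32, 28, 32, 31, 25, 30,
    27, 30, 29, 30, 28, 29, 31, 27, 28, 28,
]


def frequency_table(values):
    """Return {value: frequency f}, with the values in ascending order."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}


def relative_frequencies(freqs):
    """Divide each frequency by the sum of the frequencies."""
    total = sum(freqs.values())
    return {value: f / total for value, f in freqs.items()}


def round_half_up(x, places=0):
    """Round so that middle values go up (intended for positive data).

    x is passed through str() so that 21.5 is treated as exactly 21.5
    rather than as its binary floating-point approximation.
    """
    quantum = Decimal(10) ** -places  # places=0 -> 1, places=1 -> 0.1
    return Decimal(str(x)).quantize(quantum, rounding=ROUND_HALF_UP)


freqs = frequency_table(data)
rel = relative_frequencies(freqs)

# Print value, frequency, and relative frequency as a percentage.
for value, f in freqs.items():
    print(f"{value:>5} {f:>3} {100 * rel[value]:6.2f}%")

print(round_half_up("21.7"), round_half_up("21.4"))  # 22 21
print(round_half_up("21.5"), round_half_up("42.5"))  # 22 43
```

The frequencies produced this way match the table of values above, and the relative frequencies sum to 1.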
Therefore, when a result is quoted to two significant figures as 37 on a continuous scale, this includes all possible values between 36.50000… and 37.49999…

Class boundaries

A class or group boundary lies midway between the data values. For example, for data in the class or group labelled 7.1 – 7.3:

(a) The class values 7.1 and 7.3 are the lower and upper limits of the class and their difference gives the class width.
(b) The class boundaries are 0.05 below the lower class limit and 0.05 above the upper class limit.
(c) The class interval is the difference between the upper and lower class boundaries.
(d) The central value (or mid-value) of the class interval lies midway between the class boundaries: it is the lower class boundary plus one half of the difference between the upper and lower class boundaries.

For the class 7.1 – 7.3 the class boundaries are 7.05 and 7.35, the class interval is 7.35 − 7.05 = 0.30, and the central value is 7.05 + 0.15 = 7.20.

Histograms

Frequency histogram
Relative frequency histogram

Frequency histogram

A histogram is a graphical representation of a frequency distribution in which vertical rectangular blocks are drawn so that:

(a) the centre of the base indicates the central value of the class, and
(b) the area of the rectangle represents the class frequency.

For example, the measurement of the lengths of 50 brass rods gave the following frequency distribution:

[Table: frequency distribution of the lengths of 50 brass rods]
This gives rise to the histogram:

[Figure: frequency histogram of the brass rod lengths]

Relative frequency histogram

A relative frequency histogram is identical in shape to the frequency histogram but differs in that the vertical axis measures relative frequency.

Measure of central tendency

Mean
Coding for calculating the mean
Decoding
Coding with a grouped frequency distribution
Mode of a set of data
Mode of a grouped frequency distribution
Median of a set of data
Median with grouped data

Mean

The arithmetic mean $\bar{x}$ of a set of $n$ observations is their average:

mean = (sum of observations) / (number of observations), that is

\[ \bar{x} = \frac{\sum x}{n} \]

When calculating from a frequency distribution, this becomes:

\[ \bar{x} = \frac{\sum x f}{n} = \frac{\sum x f}{\sum f} \]

Coding for calculating the mean

A good deal of tedious work can be avoided by coding with a false mean. Choose a convenient value of $x$ near the middle of the range (the false mean), subtract it from every value of $x$, and then divide by a suitable data interval to give the coded value $x_c$.
Proceed to find the mean of the coded values, $\bar{x}_c$.

For example:

[Table: coded values $x_c$ and products $x_c f$ for the example data]

\[ \bar{x}_c = \frac{\sum x_c f}{\sum f} = \frac{-2.0}{60} = -0.0333 \text{ to 4 dp} \]

Decoding

Decoding requires the coding process to be reversed. This means multiplying by the appropriate data interval and then adding the false mean:

\[ \bar{x}_c = \frac{\sum x_c f}{\sum f} = \frac{-2.0}{60} = -0.0333 \text{ to 4 dp, where } x_c = \frac{x - 30.8}{0.2} \]

Therefore:

\[ \bar{x} = (-0.0333)(0.2) + 30.8 = 30.79 \text{ to 2 dp} \]

Coding with a grouped frequency distribution

The procedure is similar, with the false mean taken as the centre value $x_m$ of a convenient class.

[Table: coded centre values $x_c$ and products $x_c f$ for the grouped data]

\[ \bar{x}_c = \frac{\sum x_c f}{\sum f} = \frac{11}{50} = 0.22 \]

Decoding again requires the coding process to be reversed. This means multiplying by the appropriate data interval and then adding the false mean:

\[ \bar{x}_c = \frac{\sum x_c f}{\sum f} = \frac{11}{50} = 0.22, \text{ where } x_c = \frac{x_m - 2.30}{0.03} \]

Therefore:

\[ \bar{x}_m = (0.22)(0.03) + 2.30 = 2.3066 \]

giving $\bar{x} = 2.307$ to 3 dp.

Mode of a set of data

The mode of a set of data is that value of the variable that occurs most often. The mode of 2, 2, 6, 7, 7, 7, 10, 13 is clearly 7. The mode may not be unique; for instance, the modes of 23, 25, 25, 25, 27, 27, 28, 28, 28 are 25 and 28.

Mode of a grouped frequency distribution

The modal class of grouped data is the class with the greatest population. For example, the modal class of:

[Table: grouped frequency distribution]

is the third class.
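As a rough numerical check on the mean, coding and mode ideas above, the following Python sketch reuses the frequency table of the 30 readings from the table of values. The function names, the false mean of 28 and the data interval of 1 are choices made here purely for illustration; they are not taken from the text.

```python
from statistics import multimode

# Frequency table of the 30 readings from the "Table of values" example.
freqs = {25: 1, 26: 2, 27: 4, 28: 7, 29: 6, 30: 5, 31: 3, 32: 2}


def mean_from_frequencies(freqs):
    """Direct mean: sum(x * f) / sum(f)."""
    return sum(x * f for x, f in freqs.items()) / sum(freqs.values())


def coded_mean(freqs, false_mean, interval):
    """Mean via coding: x_c = (x - false_mean) / interval, then decode."""
    n = sum(freqs.values())
    mean_coded = sum(((x - false_mean) / interval) * f
                     for x, f in freqs.items()) / n
    # Decode: multiply by the data interval, then add the false mean back.
    return mean_coded * interval + false_mean


direct = mean_from_frequencies(freqs)                      # 862 / 30
via_coding = coded_mean(freqs, false_mean=28, interval=1)  # 22 / 30 + 28

# Decoding step of the worked example: coded mean -2.0/60, interval 0.2,
# false mean 30.8.
decoded_example = (-2.0 / 60) * 0.2 + 30.8

# Mode examples from the text; multimode returns every most-common value.
modes_a = multimode([2, 2, 6, 7, 7, 7, 10, 13])
modes_b = multimode([23, 25, 25, 25, 27, 27, 28, 28, 28])

print(round(direct, 4), round(via_coding, 4))  # 28.7333 28.7333
print(round(decoded_example, 2))               # 30.79
print(modes_a, modes_b)                        # [7] [25, 28]
```

Both routes to the mean agree, which is the point of coding: the arithmetic is done on small coded numbers and the result is decoded at the end.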
Plotting the histogram of the data enables the mode to be found:

[Figure: histogram of the grouped data showing the construction for the mode, with points A, B, C and D marked on the modal class]

The mode can also be calculated algebraically. If:

L = lower boundary value of the modal class
l = AB = difference in frequency on the lower boundary
u = CD = difference in frequency on the upper boundary
c = class interval

the mode is then:

\[ \text{mode} = L + \left(\frac{l}{l + u}\right) c \]

Median of a set of data

The median is the value of the middle datum when the data is arranged in ascending or descending order. If there is an even number of values, the median is the average of the two middle data.

Median with grouped data

In the case of grouped data the median divides the population into two equal parts, and in this example it falls within the largest block of the histogram, dividing that block into two parts A and B:

[Figure: histogram with block frequencies 6, 12, 15, (A + B), 13, 9, 5, where the median line splits the largest block into parts A and B]

In this frequency distribution A + B = 20. The total frequency is 80, so 40 values lie below the median; the three blocks below the largest block account for 6 + 12 + 15 = 33 of them, so that A = 40 − 33 = 7:

\[ \text{width of } A = \frac{7}{20} \times \text{class interval} = 0.35 \times 0.3 = 0.105 \]

Therefore, Median = 30.85 + 0.105 = 30.96 to 2 dp.

Dispersion

Range
Standard deviation
Alternative formula for the standard deviation

Range

The mean, mode and median give important information about the central tendency of data but they do not tell us anything about the spread or dispersion about the centre. For example, the two sets of data:

26, 27, 28, 29, 30 and 5, 19, 20, 36, 60

both have a mean of 28, but one is clearly more tightly arranged about the mean than the other. The simplest measure of dispersion is the range: the difference between the highest and the lowest values.

Standard deviation

The standard deviation is the most widely used measure of dispersion. The variance of a set of data is the average of the square of the difference in value of a datum from the mean:

\[ \text{variance} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n} \]

This has the disadvantage of being measured in the square of the units of the data. The standard deviation is the square root of the variance:

\[ \text{standard deviation} = \sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}} \]

Alternative formula for the standard deviation

Since $\sum_{i=1}^{n} x_i = n\bar{x}$:

\[ \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} = \frac{\sum_{i=1}^{n} \left(x_i^2 - 2x_i\bar{x} + \bar{x}^2\right)}{n} = \frac{\sum_{i=1}^{n} x_i^2 - 2n\bar{x}^2 + n\bar{x}^2}{n} = \frac{\sum_{i=1}^{n} x_i^2}{n} - \bar{x}^2 \]

That is:

\[ \sigma = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n} - \bar{x}^2} = \sqrt{\overline{x^2} - \bar{x}^2} \]

Frequency polygons

If the centre points of the tops of the rectangular blocks of a frequency histogram are joined by straight lines, the resulting figure is called a frequency polygon.

Frequency curves

If the frequency polygon is smoothed out, the resulting figure is a frequency curve.

Normal distribution curve

Values within 1 standard deviation of the mean
Values within 2 standard deviations of the mean
Values within 3 standard deviations of the mean

Values within 1 standard deviation of the mean

When very large numbers of observations are made and the range is divided into a very large number of 'narrow' classes, the resulting frequency curve, in many cases, approximates closely to a standard curve known as the normal distribution curve. The normal distribution curve is symmetrical about its centre line, which coincides with the mean of the observations.

There are two points on the normal distribution curve where the concavity switches, one from concave to convex and the other from convex to concave. The horizontal distance of each of these two points from the mean line is one standard deviation.
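The dispersion measures defined in the Dispersion section (range, variance, standard deviation and the alternative formula) can be checked numerically before going further with the normal curve. The sketch below applies them to the two five-value data sets quoted under Range; the function names are illustrative assumptions, and the variance divides by n, as in the definition above.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Average squared deviation from the mean (divides by n)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def std_dev(values):
    """Square root of the variance."""
    return sqrt(variance(values))


def std_dev_alternative(values):
    """Alternative formula: sqrt(mean of x^2 minus (mean of x)^2)."""
    mean_of_squares = sum(x * x for x in values) / len(values)
    return sqrt(mean_of_squares - mean(values) ** 2)


tight = [26, 27, 28, 29, 30]
spread = [5, 19, 20, 36, 60]

for name, values in (("tight", tight), ("spread", spread)):
    data_range = max(values) - min(values)
    print(name, mean(values), data_range,
          round(variance(values), 1),
          round(std_dev(values), 3),
          round(std_dev_alternative(values), 3))
# tight 28.0 4 2.0 1.414 1.414
# spread 28.0 55 352.4 18.772 18.772
```

Both sets have the same mean of 28, but the second has a far larger range and standard deviation, and the two standard deviation formulas agree. The standard library function statistics.pstdev computes the same divide-by-n quantity.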
Of the area beneath the normal distribution curve, approximately 68% lies within one standard deviation from the mean.

Values within 2 standard deviations of the mean

Of the area beneath the normal distribution curve, approximately 95% lies within two standard deviations from the mean.

Values within 3 standard deviations of the mean

Of the area beneath the normal distribution curve, approximately 99.7% lies within three standard deviations from the mean.

[Figure: page 1164, Frame 51, bottom diagram]

The following diagram summarizes this information:

[Figure: page 1165, Frame 51, top of page]

Standardized normal curve

The standardized normal curve is the same shape as the normal curve but the axis of symmetry is the vertical axis; the horizontal axis carries a scale of z-values where:

\[ z = \frac{x - \bar{x}}{\sigma} \]

with $\bar{x}$ the mean and $\sigma$ the standard deviation, and the area beneath the curve is 1.
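The three percentages quoted for the normal curve, and the change of scale to z-values, can be checked with a short Python sketch. It relies on the standard identity that the area under a normal curve within k standard deviations of the mean equals erf(k/√2); the function names and the sample numbers passed to z_value are hypothetical, chosen only for illustration.

```python
from math import erf, sqrt


def area_within(k):
    """Fraction of the area under a normal curve within k standard
    deviations of the mean, using P(|z| <= k) = erf(k / sqrt(2))."""
    return erf(k / sqrt(2))


def z_value(x, mean, std_dev):
    """Standardize an observation: z = (x - mean) / std_dev."""
    return (x - mean) / std_dev


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * area_within(k):.2f}%")
# within 1 standard deviation(s): 68.27%
# within 2 standard deviation(s): 95.45%
# within 3 standard deviations(s) would read 99.73% for k = 3

# Illustrative values only: an observation of 30 with mean 28 and
# standard deviation 2 lies one standard deviation above the mean.
print(z_value(30, 28, 2))  # 1.0
```

The rounded figures 68%, 95% and 99.7% used in the text are these values quoted to fewer figures.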
The equation of the standardized normal curve is:

\[ \phi(z) = \frac{1}{\sqrt{2\pi}} \, e^{-z^2/2} \]

Learning outcomes

Distinguish between discrete and continuous data
Construct frequency and relative frequency tables for grouped and ungrouped discrete data
Determine class boundaries, class intervals and central values for discrete and continuous data
Construct a histogram and a frequency polygon
Determine the mean, median and mode of grouped and ungrouped data
Determine the range, variance and standard deviation of discrete data
Measure dispersion of data using the normal and standard normal curves