Kinds of data

[Diagram: kinds of data, split into Qualitative and Quantitative, with Quantitative split further into Continuous and Discrete. Example labels in the diagram: names (fred, lissy, max, jack, callum, zoe, luke, stephen); 10 red, 15 blue, 5 green; sizes 12, 14, 16, 18; baby weights 5lb3oz, 6lb10oz, 7lb12oz, 11lb1oz; heights 160cm, 172cm, 181cm; 4 bedroomed, 3 bedroomed, 2 bedroomed.]

Page 10 Exercise 2A Q1 - 5 and 7

Averages

There are three types of average and they all begin with M:
..... most popular value or class
..... middle value if all values are placed in order
..... the sum of all the values divided by how many values there are

Mean

How would you say the mean average differs from the median average? In which circumstances might you use the mean rather than the median, and vice versa?

Rather than describing how to find the mean in words, we need to learn some notation.
The sum of all the x values should be written as: $\sum x$
The sum of the values in the fx column should be written as: $\sum fx$

Page 15 Exercise 2B Q1, 2, 4, 6 and 7

In what kind of questions will we need to add the values in the fx column rather than just add together all the x values?

Try finding the mean of the year 7 girl heights and compare it to the year 7 boy heights. Would you use the mean or the median to summarise this data, and why? If you used the raw data for the above calculation, why? If you used the grouped data from the frequency polygon or histogram work, why am I going to tell you you've made a mistake?

Mean Formulae

for a list of data: $\bar{x} = \frac{\sum x}{n}$
for a frequency table of data: $\bar{x} = \frac{\sum fx}{\sum f}$
for grouped data: it is necessary to find the midpoint of each class first, use this as the value for x, and then use the frequency table formula above.

We use the "x bar" notation, $\bar{x}$, to represent a sample mean. If I were using all the data possible it would be called a population mean, and we use the "mu" notation, $\mu$.

Page 18 Exercise 2C Q1, 4 and 5

Stem and Leaf

Put the data below into a suitable stem and leaf diagram.
127, 135, 147, 147, 149, 139, 145, 155, 149, 155, 151, 159, 139, 141, 155, 160, 138, 144, 155, 148, 156, 143, 147, 157, 152, 150, 161, 133, 146, 155

The data represents the heights of a first year class in a boys' school. How else can you summarise or represent this data?

Below is a list of the heights of 30 year 7 girls. Add these to the other side of your stem and leaf diagram and make some comparative statements based on suitable summary data you find.

127, 145, 147, 147, 149, 149, 145, 165, 139, 157, 152, 169, 129, 121, 158, 160, 148, 141, 155, 148, 156, 143, 157, 156, 152, 150, 161, 133, 146, 155

Page 55 Exercise 4A Q2 and 5

If you have grouped your data with equal class widths, you have little to worry about. However, if the class widths are uneven you will need to plot them against frequency density rather than just frequency.

$\text{Frequency Density} = \frac{\text{Class Frequency}}{\text{Class Width}}$

Sometimes Relative Frequency Density is plotted on the y axis. This can be calculated by dividing the class's relative frequency (Class Frequency / Total Frequency) by its class width:

$\text{Rel Freq Dens} = \frac{\text{Class Frequency}}{\text{Total Frequency} \times \text{Class Width}}$

Histograms

Histograms are similar to bar charts apart from the consideration of areas. In a bar chart, all of the bars are the same width and the only thing that matters is the height of the bar. In a histogram, the area is the important thing. Histograms are best plotted against frequency density or relative frequency density. They should also only be drawn with continuous data. Discrete or qualitative data can be plotted in bar charts, but their bars should not really touch as the categories aren't connected.

Page 64 Exercise 4E Q1, 4 and 5

We may also need to find the average from these grouped continuous data sets.

Page 22 Exercise 2D Q1 - 5

Summarising Data

What types of data summary have you come up with so far? How do they differ? Give examples of when one type would be better than another.
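The frequency density rule for uneven class widths, and the midpoint method for the mean of grouped data, can both be tried out in a few lines of Python. This is only a sketch: the grouped table below is hypothetical (it is not taken from the notes or the textbook), and the function names are made up for illustration.

```python
# Hypothetical grouped data, invented for illustration only:
# each entry is (lower bound, upper bound, frequency), with uneven class widths.
classes = [
    (120, 140, 6),
    (140, 150, 12),
    (150, 155, 7),
    (155, 170, 5),
]


def frequency_densities(classes):
    """Frequency density = class frequency / class width, one value per class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


def grouped_mean(classes):
    """Estimate the mean using each class midpoint as x: sum(fx) / sum(f)."""
    total_fx = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    total_f = sum(freq for _, _, freq in classes)
    return total_fx / total_f


for (lower, upper, freq), fd in zip(classes, frequency_densities(classes)):
    print(f"{lower}-{upper}: frequency {freq}, frequency density {fd:.2f}")

print(f"Estimated mean: {grouped_mean(classes):.1f}")
```

Notice that the 140-150 class has the highest frequency but not the highest frequency density, which is why the bar heights in a histogram with uneven widths can differ from what a plain frequency plot would suggest.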
You will recall that finding the median from a list of data involves adding one to the number of values and then halving it to find out which value (placed in order) you should use. For example, in a list of 7 numbers the median is the (7 + 1) / 2 th value, the 4th value.

Quartiles

3, 5, 5, 6, 8, 10, 10

If you consider the quartiles of this list you can see that they are the 2nd and 6th values. The position of the lower quartile can be found by dividing the number of values by 4 (use three quarters of the number of values for the upper quartile). If this gives a whole number, find the average of the value in that position and the value above it. If it gives a decimal rather than a whole number, always round UP to the value above it. Here 7 / 4 = 1.75, so the lower quartile is the 2nd value, and 3 x 7 / 4 = 5.25, so the upper quartile is the 6th value.

In a list of 14 numbers the lower quartile would be taken as the 14 / 4 th value, or 3.5th value, so you'd take the 4th value.

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14

Page 34 Exercise 3A Q1 and 2

Interquartile Ranges and Boxplots

We use the quartiles and the minimum and maximum values to draw boxplots. These are great for comparing the spread of data between two or more data sets. The interquartile range is Q3 - Q1, the width of the box.

[Diagram: box plot labelled Q0, Q1, Q2, Q3 and Q4]

Where
Q0 = min value
Q1 = lower quartile
Q2 = median
Q3 = upper quartile
Q4 = max value

Draw box plots to compare the year 7 height data you put into stem and leaf diagrams earlier.

Read page 57
Page 58 Exercise 4B Q1 - 2
Page 59 Exercise 4C Q1 - 2
Page 61 Exercise 4D Q1 - 2

Cumulative Frequency

We have seen how to find the quartiles from a list of data or stem and leaf diagrams. We have also seen that data is often stored in frequency distributions. If these are grouped it becomes difficult to find these quartiles. Why? We used to overcome this by drawing cumulative frequency curves.

[Graph: cumulative frequency curve with two marked points: 30 students got below 40 marks; 60 students got below 50 marks]

By joining up these two points we pretend the 30 students with between 40 and 50 marks are spread evenly throughout the band.

Interpolation

We could actually find this value much quicker by using some simple mathematics known as interpolating.
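Before moving on, the counting rule for quartiles from a list (described in the Quartiles notes above) can be checked with a short Python sketch. Two things here are assumptions rather than statements from the notes: the upper quartile is taken at three quarters of the number of values, and the median is found with the same rule at half the number of values, which lands on the same position as the (n + 1) / 2 rule. The function names are made up for illustration.

```python
import math


def value_at_position(sorted_values, position):
    """Apply the rounding rule to a 1-based position that may be fractional."""
    if position == int(position):
        k = int(position)
        # Whole number: average the k-th value and the value above it.
        return (sorted_values[k - 1] + sorted_values[k]) / 2
    # Decimal: always round UP to the next position.
    return sorted_values[math.ceil(position) - 1]


def quartiles(values):
    """Return (lower quartile, median, upper quartile) for a list of data."""
    data = sorted(values)
    n = len(data)
    return (
        value_at_position(data, n / 4),      # lower quartile
        value_at_position(data, n / 2),      # median (assumed equivalent rule)
        value_at_position(data, 3 * n / 4),  # upper quartile (assumed 3n/4)
    )


print(quartiles([3, 5, 5, 6, 8, 10, 10]))  # (5, 6, 10)
print(quartiles(list(range(1, 15))))       # (4, 7.5, 11)
```

For the list of 7 numbers this picks the 2nd, 4th and 6th values, and for the list of 14 numbers it picks the 4th value as the lower quartile, matching the worked examples in the notes.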
Think how you could find the mark of the 40th student from this year 10 class using just the data rather than reading from the graph. Now try estimating the mark of the 55th student.

Remember, estimating doesn't mean guessing. It involves exact calculations, but the result is unlikely to be the true mark as the students are unlikely to be spaced evenly throughout the class. We cannot know the exact mark without the raw data, and we are not given this with grouped data, hence we estimate.

Quantiles

We can divide the data into as many equal parts as we like. Quartiles divide the data into four, deciles into ten and percentiles into 100.

The formula below is known as interpolating and estimates a quantile by assuming the data collected in each class is spread evenly. It should be very similar to the formula you created earlier to find quartiles without drawing a cumulative frequency curve.

$\text{Quantile} = b + \frac{(Qn - f)}{f_m} \times w$

Where
$f_m$ is the frequency of the class the quantile falls in
f is the cumulative frequency up to the start of the class the quantile falls in
w is the class width of the class the quantile falls in
Q is the quantile you are finding, expressed as a fraction
n is the number of data values
b is the LOWER bound of the class the quantile falls in

Page 34 Exercise 3A Q3 and 5
Page 37 Exercise 3B Q1, 3 and 5
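The interpolation formula above can be written as a one-line function. The sketch below assumes the 40th and 55th students both fall in the 40 to 50 mark band from the cumulative frequency notes (30 students below 40 marks, 60 students below 50 marks), so b = 40, w = 10, f = 30 and fm = 30. The function name and the `position` argument (standing for Q x n) are made up for illustration.

```python
def interpolate(b, w, f, fm, position):
    """Quantile = b + (Qn - f) / fm * w, where position stands for Q * n.

    b        lower bound of the class the quantile falls in
    w        width of that class
    f        cumulative frequency up to the start of that class
    fm       frequency of that class
    position which value we want, counted from the bottom (Q * n)
    """
    return b + (position - f) / fm * w


# Band 40-50: 30 students below 40 marks, 60 students below 50 marks.
mark_40th = interpolate(b=40, w=10, f=30, fm=30, position=40)
mark_55th = interpolate(b=40, w=10, f=30, fm=30, position=55)

print(round(mark_40th, 1))  # 43.3
print(round(mark_55th, 1))  # 48.3
```

The 40th student is a third of the way through the 30 students in the band, so the estimate is a third of the way from 40 to 50. These are estimates only, because the formula assumes the marks are spread evenly within the band.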