Match up definitions

1. Hypothesis
2. Mean
3. Mode
4. Median
5. Range bars
6. Chi-squared (χ²)

A. The value that occurs most often in a data set
B. A statistical test used to determine the significance of an association between categories of data
C. The sum of the data divided by the number of data entries (n)
D. A graphical representation of the variability of data
E. The number that occurs in the middle of a set of sorted numbers
F. A tentative explanation of an observation, capable of being tested by experimentation

Evaluative Prep

Mussel shell length

A student investigated the variation in the length of mussel shells at two locations on a rocky shore. The results are shown in the table below.

Location      Mean length (mm)   Median length (mm)   Length range (mm)
Lower shore   9.67               9.00                 8.00 to 12.00
Upper shore   6.00               6.00                 6.00 to 7.00

1) Plot the two mean length values on a bar chart
2) Add range bars to the bar chart
3) Indicate the median value on each range bar with a small cross

Graph skills

Rules for graph drawing PA

• The graph should be of an appropriate size to make good use of the paper.
• There should be an informative title.
• Axes should be scaled appropriately, with an ascending scale and equidistant intervals, and fully labelled with units.
• Bar charts: made up of blocks of equal width, which do not touch.
• Range bars: upper and lower range values connected by a line.
• Median value accurately plotted with a cross.

Question: Using the table and the bar chart, describe one similarity and one difference in the range of mussel lengths for both shores (3 marks)

KARL PEARSON (1857-1936)
British mathematician, ‘father’ of modern statistics and a pioneer of eugenics!

(Pearson’s) Chi-squared (χ²) test

• Chi-squared is used to test if the observed frequency fits the frequency you expected or predicted.
• The theory is used to predict a result. This is called the expected result.
• The experiment is carried out and the actual result is recorded. This is called the observed result.

How do we calculate the expected frequency?

• You might expect the observed frequency of your data to match a specific ratio, e.g. a 3:1 ratio of phenotypes in a genetic cross.
• Or you may predict a homogeneous distribution of individuals in an environment, e.g. numbers of daisies counted in quadrats on a field.

Note: In some cases you might expect the observed frequencies to match the expected; in others you might hope for a difference between them.

What is the null hypothesis (H0)?

To see if the results support the theory you make a hypothesis called the null hypothesis.

H0 = there is no statistically significant difference between the observed frequency and the expected frequency

Your experimental result will always be a bit different from the expected result, but you need to know if the difference is just due to chance or because the theory is wrong.

The χ² test is carried out and the outcome either supports or rejects the null hypothesis.

Calculating χ²

\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]

O = the observed results
E = the expected (or predicted) results

You have been wandering about on a seashore and you have noticed that a small snail (the flat periwinkle) seems to live only on seaweeds of various kinds. You decide to investigate whether the animals prefer certain kinds of seaweed by counting numbers of animals on different species. You end up with the following data:

Seaweed          Observed   Expected   O - E   (O - E)²   (O - E)²/E
serrated wrack   45
bladder wrack    38
egg wrack        10
spiral wrack     5
other algae      2
TOTAL

Write a hypothesis for this investigation.

How do we calculate expected values?
Expected results = (45 + 38 + 10 + 5 + 2) / 5 = 100 / 5 = 20

In the question you may be given a calculation for how to work out your expected values.

Answers

Seaweed          Observed   Expected   O - E   (O - E)²   (O - E)²/E
serrated wrack   45         20         25      625        31.3
bladder wrack    38         20         18      324        16.2
egg wrack        10         20         -10     100        5.0
spiral wrack     5          20         -15     225        11.3
other algae      2          20         -18     324        16.2
TOTAL                                                     80.0

Compare your calculated value of χ² with the critical value in your stats table.

Our value of χ² = 80.0
Degrees of freedom = no. of categories - 1 = 5 - 1 = 4

D.F.   Critical value (P = 0.05)
1      3.84
2      5.99
3      7.82
4      9.49
5      11.07

Our value for χ² (80.0) exceeds the critical value of 9.49 for 4 degrees of freedom at the 5% (p = 0.05) probability level, so we can reject the null hypothesis. There is a significant difference between our expected and observed results at the 5% level of probability. In doing this we are saying that the snails are not scattered evenly across the various sorts of seaweed but seem to prefer living on certain species.

P Values

• If the P value is less than 0.05 (an arbitrary, but well accepted threshold), the results are deemed to be statistically significant.
• All it means is that, by chance alone, the difference (or association or correlation) you observed (or one even larger) would happen less than 5% of the time.
• It’s an estimate of how likely we are to observe our result by chance.
• If p < 0.05, there is a less than 5% probability of making our observation purely by chance.
• This means we can reject the null hypothesis.

Plenary

SA statistic sheet
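
Checking the periwinkle calculation

The seaweed example can be checked with a short script. This is only a minimal sketch: the function name chi_squared_equal_expected is made up for illustration, and it assumes the null hypothesis of equal numbers of snails on each of the five seaweed categories. It keeps unrounded values throughout, so it reports χ² = 79.9; the worksheet's 80.0 comes from adding up the column entries after each has been rounded to one decimal place. The conclusion is the same either way.

```python
# Flat periwinkle counts from the worksheet (observed frequencies).
observed = {
    "serrated wrack": 45,
    "bladder wrack": 38,
    "egg wrack": 10,
    "spiral wrack": 5,
    "other algae": 2,
}

# Critical values at p = 0.05, copied from the worksheet's table (d.f. 1 to 5).
CRITICAL_005 = {1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07}


def chi_squared_equal_expected(counts):
    """Return (chi2, degrees_of_freedom, expected) assuming equal expected counts.

    Hypothetical helper for illustration; it is not part of the worksheet.
    """
    total = sum(counts.values())
    expected = total / len(counts)  # null hypothesis: no preference between seaweeds
    chi2 = sum((o - expected) ** 2 / expected for o in counts.values())
    return chi2, len(counts) - 1, expected


chi2, dof, expected = chi_squared_equal_expected(observed)
reject_null = chi2 > CRITICAL_005[dof]

print(f"expected per category = {expected:.0f}")
print(f"chi-squared = {chi2:.1f}, d.f. = {dof}")
print("reject H0" if reject_null else "do not reject H0")
```

If the expected values came from a ratio (such as 3:1) instead of an even spread, the same sum would apply, but each category would need its own expected count.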