Animal Behavior (Biol 395)

ANALYTICS

Once you have designed your experiment to test your hypothesis and run that experiment, you will produce data. We must find a way to take that data and determine whether it supports the hypothesis (matches our predictions) or rejects it. This is where statistical analyses come in: they give us a way to convert our data from measured values into answers to our questions.

Some common statistical terms

There are two types of data, qualitative and quantitative. Qualitative data are descriptive, such as "red" or "brighter." Quantitative data are numerical values, such as length or temperature. Many qualitative traits can be expressed quantitatively, such as expressing a shade of red as the ratio of its red:green:blue components; for example, the red line under a misspelled word in Excel is 254:21:42. Brightness can also be expressed as a number: with the correct data collection you can quantify brightness in foot-candles or lux, both measurable units. It is very hard to run statistics on qualitative data, so we must transform data into quantifiable results, OR make sure we are measuring quantifiable information to begin with.

The types of quantitative data that we can collect vary. Nominal or categorical data are values that fit into discrete groups, such as red and green. We can then get frequencies (# of individuals) that meet each parameter, but there is no way to compare parameters except to say they are different. Ordinal data are values that fit into a rank scale, such as the pain scale, where a higher pain level is represented by a larger number. Each ordinal value fits into a category, and we can say one category is greater or less than another (i.e. more painful). Interval data are values measured along a scale where there are equal units between values.
For example, temperature is an interval scale: the difference between 1 and 50°C is 49 units, and the same physical change expressed in Fahrenheit (about 34 to 122°F) is roughly 88 units. The change is the same, but the units differ because each scale has an arbitrary zero point. Interval scales can be compared if they measure a common variable. Ratio data are like interval data but with a true 0 value; 0 on an interval scale is arbitrary (e.g. 0°F and 0°C are different, and neither is the absence of any temperature), whereas length is a ratio scale, where 0 in. and 0 cm are the same thing and both represent an absence of length.

Suppose that we are measuring the size of cells, the height of trees, the biomass of microbial cultures, the number of eggs in nests, or anything else. The thing that we are measuring or recording (e.g. cell size, plant height, etc.) is called a variable. Independent variables, or factors, are controlled in an experiment and are used as predictors for changes in other variables. Dependent variables, or response variables, are the ones we expect to show a change with experimental modification, i.e. with changes in the independent variable. Each measurement that we record (e.g. the size of each cell) is a value or observation. We obtain a number of values (e.g. 100 for cells), and this is our sample. The sample (e.g. 100 cells) is part of a larger population. In this case the population (in biological terms) is all the cells in the culture (or all the trees in a forest, etc.). Theoretically, we could measure every cell or tree to get a precise measure of that population. But often we want to be able to say more than this, something of general significance, based on our sample. For example, that if anyone were to measure the cells of that organism, then they would find a certain average value and a certain range of variation. Here are a couple of things that you might want to say.
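The interval-scale comparison above can be checked with a couple of lines of arithmetic. The sketch below just applies the standard Celsius-to-Fahrenheit conversion to show that the same physical change spans 49 Celsius units but about 88 Fahrenheit units.

```python
def c_to_f(c):
    # Standard Celsius-to-Fahrenheit conversion.
    return c * 9 / 5 + 32

low, high = 1, 50                      # degrees Celsius
span_c = high - low                    # 49 Celsius units
span_f = c_to_f(high) - c_to_f(low)    # the same change in Fahrenheit units

print(span_c, round(span_f, 1))
```

The absolute readings differ between scales (1°C is 33.8°F), but the size of the change is what the interval comparison is about.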
The optimum temperature for growth of the bacterium Escherichia coli is 37°C, whereas the optimum temperature for Bacillus cereus is 30°C. The average height of adult men in the US is 178 cm, whereas the average height of women is 164 cm. General statements such as these will always be based on a sample, because we could never test every possible strain of E. coli, nor measure every human adult in the US. So, in these and in many other cases the population can be considered infinite. That is the sense in which statisticians use the term "population": all the possible measurements or events (i.e. all the possible values of a variable) of a particular type that there could ever be. In statistics, we use SAMPLES to ESTIMATE the PARAMETERS of a POPULATION.

Example 1: Descriptive statistics

All the students in a college marching band are asked to line up behind signs that represent their height. All the men are given navy blue shirts, and all the women are given white shirts. How can we figure out the average height? What about the distribution of heights and its middle? What is the most common height within the group? These are the mean, median, mode, and range, and statistics like these are called descriptive statistics. The most frequently used is the mean: the statistical average of the group, calculated by adding all the values and then dividing by the number of individuals in the sample (n). The mean should not be confused with the median, which is the middle value when all the observations are placed in order. In this example the heights run from 5'0" to 6'5", and the middle of that range is 5'8.5". The mode is the most frequent value; here it is 5'6". The mean, median, and mode do not have to match, and will only do so with a normal distribution. In a normal distribution most of the values lie closest to the mean, with values further from the mean less frequent, creating a bell curve.
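These descriptive statistics are easy to compute with software. The sketch below uses Python's standard `statistics` module on a made-up sample of band-member heights in inches; the values are illustrative, not taken from the example above.

```python
import math
import statistics

# Hypothetical band-member heights in inches (illustrative values only).
heights = [60, 64, 66, 66, 66, 68, 70, 72, 77]

mean_h = statistics.mean(heights)      # sum of values divided by n
median_h = statistics.median(heights)  # middle value of the sorted list
mode_h = statistics.mode(heights)      # most frequent value

# Sample standard deviation done by hand: square root of the summed
# squared deviations from the mean, divided by n - 1.
n = len(heights)
sd_by_hand = math.sqrt(sum((x - mean_h) ** 2 for x in heights) / (n - 1))

print(mean_h, median_h, mode_h, round(sd_by_hand, 2))
```

Note that with this sample the median and mode coincide but the mean does not, which is what we expect whenever the distribution is not perfectly normal.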
Descriptive data are the values we most typically include in graphs, and it is the inferential statistical tests that support our conclusions about those data. It is not enough to just know the mean, median, mode, and range; we must also be able to determine how far values fall from these measurements. For this we use a measure called variance, another descriptive statistic, which describes how widely values are spread around the mean. The most common measure of spread is the standard deviation (SD). Another is the standard error, but this should only be used in specific cases (it describes how precisely the sample mean estimates the population mean, not the spread of the sample itself). Below, a normal distribution is divided into standard deviations; you can see how many individuals typically fall into each interval above and below the mean (middle). We will come back to standard deviations as we talk about analytical statistics. To calculate the standard deviation we use the formula

s = √[ Σ(xi − x̄)^2 / (n − 1) ]

where x̄ is the sample mean and n is the sample size.

Now we must talk about degrees of freedom (d.f.) and p-values. Degrees of freedom are the parameters of the test that allow you to account for variation. Plainly, the more categories you have, the more variability we expect, and we must allow for this in our statistical test. To figure out the degrees of freedom you take the total number of categories (n) and subtract 1 (d.f. = n − 1). Next is the p-value, which guards against type-1 error. Type-1 error is the incorrect rejection of a true null hypothesis (i.e. a false positive). While we are at it, type-2 error is the failure to reject a false null hypothesis (i.e. failing to detect a real effect). We want to use our statistical test to be sure our data support our hypothesis, and by testing that our results are not due to random chance we can be sure they truly support or reject our hypothesis.
We therefore set our significance threshold at 95% confidence (p = 0.05), which means that when our test statistic passes this threshold there is at most a 5% chance of a type-1 error, and we can reject the null hypothesis with 95% confidence. To put this all in plain perspective (for all us non-statisticians): we use the degrees of freedom and the appropriate p-value to determine whether our results are a valid expression of our data, and we must do this with each statistical test.

Example 2: Chi2

Suppose that we want to test the results of a Mendelian genetic cross. We start with 2 parents of genotype AABB and aabb (where A and a represent the dominant and recessive alleles of one gene, and B and b represent the dominant and recessive alleles of another gene). We know that all the F1 generation (first-generation progeny of these parents) will have genotype AaBb and that their phenotype will display both dominant alleles (e.g. in fruit flies all the F1 generation will have red eyes rather than white eyes, and normal wings rather than stubby wings). This F1 generation will produce 4 types of gamete (AB, Ab, aB and ab), and when we self-cross the F1 generation we will end up with a variety of F2 genotypes (see the table below).

Gametes   AB     Ab     aB     ab
AB        AABB   AABb   AaBB   AaBb
Ab        AABb   AAbb   AaBb   Aabb
aB        AaBB   AaBb   aaBB   aaBb
ab        AaBb   Aabb   aaBb   aabb

All these genotypes fall into 4 phenotypes: double dominant, single dominant A, single dominant B, and double recessive. We know that in classical Mendelian genetics the ratio of these phenotypes is 9:3:3:1. How can we be sure that our results match the expected ratio? A chi-squared test (Chi2 or χ2) compares the observed values to expected ratios, and allows you to determine whether your results (which will naturally vary) match the expected outcome.
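A chi-squared check of a dihybrid cross can be sketched in a few lines. The counts below are made up for illustration; the expected values come from splitting the total in the 9:3:3:1 ratio, and 7.815 is the tabulated 5% critical value for 3 degrees of freedom.

```python
# Hypothetical F2 counts for the four phenotype classes of a dihybrid
# cross (double dominant, single dominant A, single dominant B,
# double recessive); the counts are illustrative, not real data.
observed = [95, 30, 28, 7]
total = sum(observed)                               # 160 individuals
expected = [total * r / 16 for r in (9, 3, 3, 1)]   # 9:3:3:1 split

# Chi2 = sum of (O - E)^2 / E over all categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# d.f. = categories - 1 = 3; the tabulated 5% critical value is 7.815.
# A chi2 below that means the data are consistent with 9:3:3:1.
print(round(chi2, 3), chi2 < 7.815)
```

With these counts the statistic is well below the critical value, so we would not reject the 9:3:3:1 expectation.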
We use a chi-squared test to analyze the results of genetic crosses: we do our experiment, count the number of F2 progeny that fall into the different categories, and test to see if our results agree with an expectation; in this case, a 9:3:3:1 ratio. The formula for a Chi2 is

χ2 = Σ (Oi − Ei)^2 / Ei

where χ2 is Pearson's cumulative test statistic, Oi is an observed frequency, Ei is the expected (theoretical) frequency asserted by the null hypothesis, and the sum runs over all n categories. However, it is important to have a minimum number of individuals; this is called the power. Without enough power your statistical test will not yield reliable results. In this case you would need a minimum of 16 individuals to have the statistical power to run a Chi2 test.

Example 3: T-tests

Suppose we want to compare the biomass produced by plant callus culture in flasks containing different nutrient solutions. We know that we need more than one flask of each nutrient solution (i.e. replicates), but we will need a way to compare the growth rates (which will vary between flasks of each nutrient type). We use a Student's t-test to compare the mean growth in each solution. Basically, a t-test compares the difference between the two means relative to the amount of variation within the treatments. In other words, we get a significant result if the difference between the means is large and/or the variation between replicates is small. So, how many replicates should we use? This is a matter of judgment (and the available resources), but if we look at a t-table we can make some rational decisions. If we use 2 flasks for each treatment (4 flasks in total), we would have 2 degrees of freedom: for a two-sample t-test, each treatment contributes one less than its number of replicates, so with 2 treatments of 2 flasks each we have 2 degrees of freedom.
With 2 treatments of 10 flasks each we have 18 degrees of freedom. When we analyze our results by Student's t-test, we calculate a t-value and compare it with the t-value for a probability of 0.05 in the t-table. Our treatments differ significantly if the calculated t-value is greater than the tabulated value. If you look at the tabulated t-value for 2 degrees of freedom (4.30), it is quite high, and we would only find a significant difference between our treatments if we had quite a large difference between the means and also little variation in our replicates. But if we used 4 replicates of each treatment (6 degrees of freedom) we would have a much better chance of finding a significant difference (tabulated t-value of 2.45) between the same means. However, beyond about 10 degrees of freedom (tabulated t-value 2.23) we would gain very little by using more replicates. We would be in the realm of diminishing returns, gaining very little for all the extra time and resources.

These tests can involve one or two samples. With a single sample measured against a changing independent variable, we can instead fit a regression, which tests for changes in the dependent variable relative to changes in the independent variable. A regression predicts a relationship between the two variables that may be positive, negative, or absent, and the pattern of the relationship can be linear, logarithmic, exponential, or step-wise. The basic formula for a linear regression is the equation of a line (y = mx + b, where the slope is often reported as β). We can also determine whether the regression satisfactorily explains our results by examining how far each data point lies from the regression line; this goodness of fit is usually reported as r or r2 (as is done in Excel). A two-sample t-test compares two groups that differ in a single independent variable (usually experimental and control). Typically we assume the variances are similar, since we are only changing that one variable.
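The two-sample comparison can be sketched directly from the means, standard deviations, and sample sizes. The flask biomass values below are made up for illustration; the calculation pools the two sample variances (the equal-variance assumption mentioned above) and compares the resulting t-value to the tabulated 2.45 for 6 degrees of freedom.

```python
import math

# Hypothetical biomass yields (g) from replicate flasks of two nutrient
# solutions; the values are illustrative, not real measurements.
a = [3.1, 3.4, 2.9, 3.3]
b = [2.4, 2.7, 2.5, 2.6]

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    # Summed squared deviations divided by n - 1.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(a), len(b)
# Pooled variance: a weighted average of the two sample variances,
# which assumes the treatments vary by a similar amount.
sp2 = ((n1 - 1) * sample_var(a) + (n2 - 1) * sample_var(b)) / (n1 + n2 - 2)
t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

df = n1 + n2 - 2  # 6 degrees of freedom; tabulated t at p = 0.05 is 2.45
print(round(t, 3), abs(t) > 2.45)
```

Because the calculated t-value exceeds the tabulated value, these (made-up) treatments would differ significantly.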
We use the means, standard deviations (s), and number of samples in each group, where

t = (x̄1 − x̄2) / √[ sp^2 (1/n1 + 1/n2) ],  with pooled variance sp^2 = [ (n1 − 1)s1^2 + (n2 − 1)s2^2 ] / (n1 + n2 − 2)

Example 4: ANOVAs

We have now seen how to compare observed results to an expected set, and how to compare results using 1 or 2 variables. But what if we have more treatment levels (e.g. conditions 1, 2, and 3), or overlapping variables? Then we need a test that allows us to look at the variance of each variable and compare it to the others. This is called an analysis of variance, or ANOVA (also called an F-test). ANOVAs can compare many related independent variables as well as categorical variables that can interact. The first type is a one-way ANOVA, which allows you to test multiple non-overlapping levels of one variable (e.g. different concentrations of antiseptic on bacterial growth). Since we are using one category of variable (antiseptic) and just changing its value, we can treat each value as a level of the independent variable and look at the dependent variable's result, comparing the means and variation (similar to a t-test). The next type of ANOVA is a multi-way ANOVA. This assumes that you have two different types of independent variables and have set up a full factorial experiment, where the conditions include every combination of the two independent variables. The table below demonstrates the full-factorial design for two categories of independent variables.

                          Independent Variable 2
Independent Variable 1    2A (Treatment)    2B (Control)
1A (Treatment)            1A × 2A           1A × 2B
1B (Control)              1B × 2A           1B × 2B

In a multi-way design we must not only look for the effect of independent variable 1 or 2, but also for the interaction of these variables. The interaction is the result of the two variables together, and the variance of this combined term may be greater than either condition alone. In effect this test takes our hypothesis and creates new sub-hypotheses: the effect of variable 1, the effect of variable 2, and the interaction effect. We will discuss this further in class, along with how we calculate the statistical values for this test.
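The core of a one-way ANOVA is splitting the total variation into a between-group part and a within-group part and taking their ratio. The sketch below does this by hand for three hypothetical antiseptic concentrations; the measurements are illustrative, not real data.

```python
# Hypothetical response measurements at three antiseptic concentrations
# (three replicates each); all values are illustrative.
groups = [
    [10.0, 12.0, 11.0],
    [14.0, 15.0, 16.0],
    [20.0, 19.0, 21.0],
]

k = len(groups)                       # number of treatment levels
n = sum(len(g) for g in groups)       # total observations
grand = sum(sum(g) for g in groups) / n

# Between-group sum of squares: variation of group means around the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
# Within-group sum of squares: variation of observations around their own group mean.
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

ms_between = ss_between / (k - 1)     # mean square between, d.f. = k - 1
ms_within = ss_within / (n - k)       # mean square within, d.f. = n - k
f = ms_between / ms_within            # the F statistic

print(round(f, 2))
```

A large F means the group means differ by far more than the scatter within each group would explain; the calculated F is then compared to a tabulated F value for (k − 1, n − k) degrees of freedom, just as with the t-table.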
Statistical programs

There are several options for calculating statistics using your computer, tablet, or smartphone. Programs such as SPSS, JMP, or Statistica have user-friendly graphical user interfaces (GUIs), but are not free (SPSS access directions are available below). Excel is a little less user-friendly, and its tests are directed toward business, but it can be used with some trial and error, and is available for free to students. Lastly, R is a statistical computing language available for free for both Mac and PC. To use R you will need to download and install the program; there are several websites and tutorials available online.

A comprehensive set of R manuals (http://cran.r-project.org/manuals.html)
A useful set of instructions and tips for running statistical tests in R (http://www.statmethods.net/stats/descriptives.html)
Download R (http://cran.us.r-project.org/sources.html)

SPSS

MCLA students have access to SPSS statistical software through the Virtual Machine. Instructions for installing Virtual Machine access can be found here (http://techhelp.mcla.edu/index.php/Install_VMView_Client). To use the virtual machine please make sure to connect to the mclanet or mclanet-g wifi networks AND NOT Hotspot (unsecured).
Your data saves to your H drive on the school server, and can be accessed from any college computer by logging in with your A#, or by connecting to the H drive on your personal laptop (instructions can be found here: http://techhelp.mcla.edu/index.php/Welcome_to_TechHelp.mcla.edu).

GraphPad
Descriptive Statistics (http://www.graphpad.com/quickcalcs/CImean1/?Format=C)
Chi Square (http://graphpad.com/quickcalcs/chisquared1.cfm)
T-Test (http://www.graphpad.com/quickcalcs/ttest1/?Format=C)
ANOVA

Others
Chi Square (http://www.quantpsy.org/chisq/chisq.htm)
Student T-Test (http://www.physics.csbsju.edu/stats/t-test_bulk_form.html)
One-Way ANOVA (http://vassarstats.net/anova1u.html)
One-Way ANOVA Using Means and SD (http://www.danielsoper.com/statcalc3/calc.aspx?id=43)
Two-Way ANOVA (http://faculty.vassar.edu/lowry/anova2x2.html)