Lecture 5 - Quantification and Statistics
Measurement, Quantification and Analysis
Some Basic Principles

Three Major Issues
• 1) Biological and especially ecological data show high variability in quantitative traits
• 2) We almost never measure everything in field research; rather, we sample from larger populations or data sets
• 3) Sampling leads to uncertainty about conclusions, so we must always estimate our uncertainty

Variability
All natural processes are variable, whether the data are continuous or discrete. Plus, better sampling effort better describes distributions.

In many processes, we observe characteristic distributions:
• Binomial – few interacting factors
• Normal – many interacting factors

2 factors: one way to get AA or aa, 2 ways to get Aa
4 factors: one way to get AAAA or aaaa, 4 ways to get AAAa or aaaA, and 6 ways to get AAaa

Sampling and Estimation
A characteristic of field biology is the attempt to estimate parameters of uncertain "true" value from highly variable populations.

To calculate the average in a sample:
Mean = Sum of all observations / Number of observations

To estimate the variability of the observations:
$$\text{Variance} = \frac{\sum (\text{individual observation} - \text{mean of observations})^2}{\text{number of individual observations} - 1}$$

Or, to express this in the same units as the mean:
Standard deviation = Square root of the variance

Probability
All natural processes are variable, whether continuous or discrete. What happens when we estimate means? Select 5 observations at random. Then 10. Then 25.
1) Better sampled populations yield better distributions
2) Larger sample sizes yield better estimates
3) Means will also be variable, and will have a characteristic distribution

To estimate the variability of the means:
Divide the standard deviation (the square root of the variance) by the square root of the sample size. (Why? The variability of the means depends on sample size.)
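The calculations described above (mean, variance, standard deviation, and the variability of the means) can be sketched in a few lines of Python. This is only an illustration: the function name `summarize`, the simulated population, and its parameters (mean 40, standard deviation 3) are assumptions made for the example, not data from the lecture.

```python
import math
import random


def summarize(sample):
    """Return (mean, variance, standard deviation, standard error) of a sample."""
    n = len(sample)
    mean = sum(sample) / n
    # Sample variance: squared deviations from the mean, divided by n - 1.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    sd = math.sqrt(variance)
    # Variability of the mean: standard deviation over the square root of n.
    se = sd / math.sqrt(n)
    return mean, variance, sd, se


# Hypothetical population of 1,000 trait values (assumed normal, mean 40, SD 3).
random.seed(1)
population = [random.gauss(40.0, 3.0) for _ in range(1000)]

# Select 5 observations at random, then 10, then 25.
for n in (5, 10, 25):
    sample = random.sample(population, n)
    mean, variance, sd, se = summarize(sample)
    print(f"n={n:2d}  mean={mean:6.2f}  variance={variance:6.2f}  "
          f"sd={sd:5.2f}  se={se:5.2f}")
```

Because the square root of the sample size sits in the denominator, the last column tends to shrink as n grows, even though each random sample gives a somewhat different standard deviation.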
Recall: to estimate the variability of the observations:
$$\text{Variance} = \frac{\sum (\text{individual observation} - \text{mean of observations})^2}{\text{number of individual observations} - 1}$$

To estimate the variability of the means:
Divide the standard deviation (the square root of the variance) by the square root of the sample size. The bigger the sample size, the less variable the means. This is the Standard Error, which is used to calculate a Confidence Interval.

Uncertainty
Confidence intervals represent a level of confidence about the true value of the mean. In other words, if you sample repeatedly with a given sample size, a 95 % CI means that in 95 % of the samples you collect, the interval you calculate will contain the true mean.
• No matter how well we sample, we will misestimate the population parameter a certain percentage of the time.
• What level of error are we willing to accept?
• With a 95 % limit, we are wrong 5 % of the time.
• In theory, the tails are limitless, so we must set a criterion.
• Decision rule – 5 % error.
• Minimize this with replication.

Importance of Replication?
Assuming the samples are independent:
One sample: wrong 5 % or 1/20 of the times you sample
Two replicated samples: both wrong 1/20 x 1/20 or 1/400 of the time
Three replicated samples: all wrong 1/20 x 1/20 x 1/20 or 1/8,000 of the time

One thing we do frequently in science is compare things. For example, if one population is bigger than another, which population are we sampling from? What kinds of errors can we make? What confidence do we want? What error will we accept?
[Figure: populations A and B]

Fundamental Principles
• Have clearly defined hypotheses
• Measure carefully
• Sample intensively – large sample sizes reduce Beta-Error
• Replicate – replication reduces Alpha-Error

Samples of Data Sets from Previous Projects that Required Quantification and Statistical Analysis

Forearm (mm)
                 Sum of Squares   df   Mean Square   F         Sig.
Between Groups   4053.985          2   2026.993      569.784   .000
Within Groups     152.971         43      3.557
Total            4206.957         45

Foot (mm)
                 Sum of Squares   df   Mean Square   F         Sig.
Between Groups    254.274          2    127.137       55.693   .000
Within Groups      98.161         43      2.283
Total             352.435         45

[Figure: Principal Component Plot. x-axis: REGR factor score 1 for analysis 1; y-axis: REGR factor score 2 for analysis 1; points labeled 1, 2 and 3.]
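The Forearm (mm) and Foot (mm) figures reported above look like standard one-way analysis of variance output, in which each mean square is a sum of squares divided by its degrees of freedom and F is the ratio of the between-groups to the within-groups mean square. The sketch below recomputes those columns from the reported sums of squares as a consistency check. The helper name `anova_row` is an assumption made for illustration, and the results should match the reported values only to within rounding.

```python
def anova_row(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F) for a one-way ANOVA."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom as reported in the lecture's tables.
tables = {
    "Forearm (mm)": (4053.985, 2, 152.971, 43),
    "Foot (mm)": (254.274, 2, 98.161, 43),
}

for trait, values in tables.items():
    ms_b, ms_w, f = anova_row(*values)
    print(f"{trait}: MS between={ms_b:.3f}, MS within={ms_w:.3f}, F={f:.3f}")
```

A large F means the differences among group means are large relative to the variation within groups, which is why both traits show a significance value reported as .000.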