Imagine an engineer who needs to build a bridge. He or she would collect data, construct a model to test in a lab, and then build the actual bridge. In scientific research we use statistics to develop models to test the outcome or effect of some treatment. We strive to optimize the fit of our model to the population of interest.

The data set below can be fit with either a linear or a non-linear model. We need some method to evaluate how well each model fits the data. The variance and the standard deviation (SD) can be used to quantify the error in the model.

The dotted red lines show the deviance. The deviance is the vertical distance from an actual data point up or down to the mean (blue line). The deviance can be thought of as the error in the model.

Degrees of Freedom
• Degrees of freedom: the number of observations that are free to vary.
• Example: choose any 3 numbers whose sum is 10. The first two can be any numbers, but once they are chosen the third is no longer free; its value is determined by your choice of the first two: 5 + 4 + ___ = 10
• In the example above the degrees of freedom are (n - 1) = 2.

Measures of Variation
• The sum of squared errors (SS) is the total of the squared deviances. It measures how accurately the model fits, but it grows with the number of data points.
• The variance (s²) is the average squared error between the mean and the data points: SS divided by the degrees of freedom (n - 1). The variance is in squared units.
• The standard deviation (s) is the square root of the variance. The standard deviation is in the original units.

Expressing the Mean as a Model
• Everything in statistics essentially boils down to one equation:
• Outcome_i = (model) + error_i
• The data we observe can be predicted from the model we choose to fit the data plus some amount of error.
• Likewise, the variance and standard deviation quantify the goodness of the fit.

If you take several samples from a population, these samples will differ slightly. It is important to know how well a sample represents a population. The standard error is the standard deviation of sample means.
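The measures of variation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `describe_spread` and the five scores are hypothetical, and the mean is used as the model, so each deviance is a score minus the mean.

```python
import math


def describe_spread(scores):
    """Use the mean as the model and quantify its error (hypothetical helper)."""
    n = len(scores)
    mean = sum(scores) / n
    # Deviance: distance from each data point to the model (the mean).
    deviances = [x - mean for x in scores]
    # Sum of squared errors (SS).
    ss = sum(d ** 2 for d in deviances)
    # Variance: SS divided by the degrees of freedom (n - 1).
    variance = ss / (n - 1)
    # Standard deviation: square root of the variance, back in original units.
    sd = math.sqrt(variance)
    return {"mean": mean, "deviances": deviances, "ss": ss,
            "variance": variance, "sd": sd}


# Hypothetical data, invented for illustration only.
scores = [1, 2, 3, 3, 4]
result = describe_spread(scores)
print("mean     =", round(result["mean"], 3))
print("SS       =", round(result["ss"], 3))
print("variance =", round(result["variance"], 3))
print("SD       =", round(result["sd"], 3))
```

The deviances always sum to zero, which is why only n - 1 of them are free to vary: once four of the five are known, the fifth is fixed.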
Standard Error
• The standard error (SE) is the standard deviation of sample means.
• The SE is estimated as the sample standard deviation divided by the square root of n.
• The SE is a measure of how representative a sample is likely to be of the population.
• A large SE indicates that there is large variation between the means of different samples, so the sample might not be representative of the population.
• A small SE indicates that most sample means are likely to be an accurate reflection of the population.

Difference between SD & SE
• The SD is the square root of the variance; it describes how far individual scores spread around the sample mean.
• The SE is the amount of error in the estimate of the population mean based on the sample.

Slightly more than 95% of scores in a normal distribution fall within ± 2 SD of the mean; a z of 1.96 marks exactly 95%.

Computing Confidence Intervals
• The basic idea of a confidence interval is to construct a range of values within which we think the population value falls.
• 95% of z-scores fall between -1.96 and +1.96, so the 95% confidence interval is the sample mean ± 1.96 × SE.
• We are 95% confident that the true mean falls in this range.

In the two samples below the intervals overlap. The population mean is probably between 2 and 16 million. These samples were probably drawn from the same population. Error bars represent 95% CI.

In the two samples below the intervals DO NOT overlap. These samples were probably NOT drawn from the same population.

Inferential Statistics
• In inferential statistics, we take a sample and try to make inferences about the population.
• Fisher describes an experiment in which a woman said she could determine, by tasting a cup of tea, whether the milk or the tea was added to the cup first.
• In the simplest case, if we use only 2 cups of tea (one of each kind), the woman has a 50% chance of getting it right by guessing. How much confidence would you have in her ability?
• If we use 6 cups of tea (3 of each kind), there are 20 possible combinations, so a woman who is just guessing has a 1 in 20 (5%) chance of identifying all of them correctly.
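Two of the calculations above can be sketched in Python: the standard error with its 95% confidence interval, and the chance of guessing every cup correctly in the tea experiment. This is a rough sketch, not part of the original slides. The function names and the sample values are hypothetical, and the interval uses z = 1.96 as the slides do (for a sample this small, a t-based interval would normally be wider).

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, level=0.95):
    """Return (lower, upper, se) for the mean, using the normal (z) value."""
    n = len(sample)
    m = mean(sample)
    # SE: sample standard deviation (n - 1 in the denominator) over sqrt(n).
    se = stdev(sample) / math.sqrt(n)
    # z cutting off the central `level` of the normal curve (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m - z * se, m + z * se, se


def chance_of_guessing_all(cups, milk_first):
    """Probability of picking the right set of milk-first cups by pure guessing."""
    # math.comb counts the ways to choose which cups had milk added first.
    return 1 / math.comb(cups, milk_first)


# Hypothetical sample, invented for illustration only.
sample = [1, 2, 3, 3, 4]
lower, upper, se = confidence_interval(sample)
print("SE =", round(se, 3), " 95% CI =", (round(lower, 2), round(upper, 2)))

print("2 cups:", chance_of_guessing_all(2, 1))
print("6 cups:", chance_of_guessing_all(6, 3))
```

With 2 cups the guesser succeeds half the time; with 6 cups there are 20 equally likely selections, so a perfect score by guessing has probability 1/20.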
If the woman now guesses all 6 correctly, you would feel much more confident about her ability.

Test Statistics
• We can fit statistical models to data that represent the hypotheses we want to test.
• We can use probability to see whether scores are likely to have happened by chance.
• If we combine these ideas we can test whether statistical models significantly fit our data.
• Systematic variation is the variation explained by the model.
• Unsystematic variation is the variation not explained by the model.
• A test statistic (t, F) is the ratio of systematic to unsystematic variance.

Alpha (α) is the probability of making a Type I error. A Type I error is saying that there is a difference when there is none.
Beta (β) is the probability of making a Type II error. A Type II error is saying that there is no difference when there is a difference.
Power (1 - β) is the probability of finding differences when they truly exist.

Effect Size
• An effect size is a standardized (objective) measure of the magnitude of an observed effect.
• The general form of the effect size equation is: effect size = (difference between the means) / (standard deviation)

Ex. 20 subjects were pre-tested for jumping ability. They then reported to the lab 3 times/week for 4 weeks. In each lab session they sat on the floor and imagined performing 3 sets of 10 maximum vertical jumps. After 4 weeks of imagined training all of the subjects jumped higher. How strong a training effect would you expect in this experiment? Will the subjects improve their jumping ability by 20 cm in 4 weeks?

Effect Size
• The effect size quantifies the importance or meaningfulness of the results.
• A significant finding is not necessarily meaningful or important.
• It is now common practice to report both confidence intervals and effect sizes for experiments.
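One common standardized effect size is Cohen's d: the difference between two means divided by a standard deviation. The sketch below assumes two independent groups and a pooled standard deviation, which is only one of several conventions. The slides give no raw data for the jump-training example, so the jump heights are invented purely for illustration, and the function name `cohens_d` is hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled SD (an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    # Standardize the mean difference by the pooled standard deviation.
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Invented jump heights in cm, NOT data from the study described above.
trained = [52, 55, 58, 61, 64]
control = [50, 53, 56, 59, 62]
d = cohens_d(trained, control)
print("Cohen's d =", round(d, 2))
```

Here the groups differ by 2 cm against a pooled SD of about 4.7 cm, so d is about 0.42. Because d is expressed in SD units, it can be compared across studies that measure different outcomes.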
Cohen's d for Effect Sizes

Original    Effect    Percentile    Percent of
Standard    Size      Standing      Nonoverlap
            2.0       97.7          81.1%
            1.9       97.1          79.4%
            1.8       96.4          77.4%
            1.7       95.5          75.4%
            1.6       94.5          73.1%
            1.5       93.3          70.7%
            1.4       91.9          68.1%
            1.3       90            65.3%
            1.2       88            62.2%
            1.1       86            58.9%
            1.0       84            55.4%
            0.9       82            51.6%
Large       0.8       79            47.4%
            0.7       76            43.0%
            0.6       73            38.2%
Medium      0.5       69            33.0%
            0.4       66            27.4%
            0.3       62            21.3%
Small       0.2       58            14.7%
            0.1       54            7.7%
            0.0       50            0%

Adapted from Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Effect Size using r
• r = .10 (small effect) explains 1% of the total variance
• r = .30 (medium effect) explains 9% of the total variance
• r = .50 (large effect) explains 25% of the total variance

Statistical Power
• The power of a test is the probability of finding differences between the means when they truly exist.
• To increase power:
  – Increase n
  – Increase alpha (use 0.1 rather than 0.05)
  – Decrease the variance
• We will use G*Power to do power analyses. It is free; search for gpower3 or go to:
• http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/
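The three ways of increasing power listed above can be checked with a rough calculation. The sketch below is not how G*Power computes power (it uses exact distributions); it assumes a two-sided comparison of two equal-sized independent groups and a normal approximation. The function name `approx_power` and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group test (normal approximation)."""
    z = NormalDist()
    # Critical z for a two-sided test at the chosen alpha.
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect size is d.
    shift = d * math.sqrt(n_per_group / 2)
    # Probability that the statistic lands beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical baseline: mean difference 5, SD 10 (d = 0.5), 20 subjects per group.
baseline = approx_power(5 / 10, 20)
more_n = approx_power(5 / 10, 64)                    # increase n
bigger_alpha = approx_power(5 / 10, 20, alpha=0.10)  # increase alpha
less_variance = approx_power(5 / 8, 20)              # smaller SD gives a larger d

for label, value in [("baseline", baseline), ("more n", more_n),
                     ("alpha = 0.10", bigger_alpha),
                     ("less variance", less_variance)]:
    print(label, round(value, 2))
```

Each change raises the approximate power above the baseline. Raising alpha does so at the cost of more Type I errors, which is why increasing n or reducing the variance is usually the preferred route.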