VARIANCE

Psychologists try to explain and/or predict behavior. They do this by showing that the behavior of interest is related to other factors. For example, suppose that you want to know if aggression in children is related to the number of hours that children watch TV. In this case, the behavior of interest (the dependent variable) is aggression, and you want to know if aggression can be systematically related to hours of TV watching.

To do this study, you would first have to select a group of children and measure each child's aggression and the number of hours each child watches TV. If you took these measurements, you would find a range of values for each factor. Some children watch a lot of TV, others very little. Some children are more aggressive than others. We refer to the different values associated with a measure (e.g., aggression) as a distribution of scores. For example, hours of TV watching would have a distribution of score values.

When you try to establish a relationship between variables, you are trying to show that variability in one set of scores (e.g., aggression) can be systematically related to variability in another set of scores (e.g., hours of TV). For example, you might find that aggression tends to increase as hours of TV watching increase.

How do we show that two variables such as aggression and hours of TV watching are related to one another? There are a number of ways to do this, but each depends on relating one distribution of scores to another distribution of scores. In order to show that distributions are related, we have to describe or measure the amount of variability in each distribution.

There are various ways to describe or measure variability. For example, you can create a visual representation of a distribution of scores. A frequency distribution is a visual representation of variability in a set of scores: it shows how often each score value occurs.
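
As a rough illustration of a frequency distribution, the following Python sketch tallies a small set of TV-watching scores and prints a text histogram. The data and the variable names `tv_hours` and `frequencies` are invented for this example; only the standard library is used.

```python
from collections import Counter

# Hypothetical data: daily hours of TV watched by 12 children (made up).
tv_hours = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 6]

# Tally how many children report each score value.
frequencies = Counter(tv_hours)

# Print one row per score value, with one '*' per child.
for hours in sorted(frequencies):
    print(f"{hours} h | {'*' * frequencies[hours]}")
```

The longer a row, the more children share that score; the spread of the rows across score values is the variability in the distribution.
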
You can also produce visual representations showing how different score distributions are related to one another. Scatterplots are commonly used to illustrate relationships between distributions of scores.

Visual representations are very important statistical tools for evaluating data we have collected in a research project. However, they have a shortcoming: they cannot be evaluated mathematically. We can only "eyeball" them, looking for relationships. Of course, this creates problems, because people have no objective way to interpret the figures. What we need is a way to mathematically measure the amount of variability in a set of scores, and to mathematically show that variability in one set of scores is related to variability in another set of scores.

It turns out that statisticians have developed methods to define mathematically the amount of variability in a set of scores and the extent to which variability in one set of scores can be related to variability in another set of scores. The mathematical term for indexing the amount of variability in a set of scores is VARIANCE. Variance is a number that is based on the extent to which individual scores in a distribution deviate from the mean of that distribution.

For example, suppose that after measuring aggression in our sample of children, we find aggression scores varying from 1 to 30 (I made up these values). We can use these raw score values to compute a mean aggression score (an average). If we subtract the mean from each individual's aggression score, we create a new set of scores called deviation scores. Now each individual has a raw score and a deviation score that represents the extent to which the individual's raw score differs from the mean score. Variance is calculated from these deviation scores. If the deviation scores are large, then the variance will be large, and vice versa. For descriptive purposes, variances are often converted to values called standard deviations.
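
To make the idea of deviation scores concrete, here is a minimal Python sketch. The two sets of aggression scores and the helper name `deviation_scores` are invented for illustration; `mean`, `variance`, and `stdev` come from Python's standard `statistics` module, which computes the sample variance (dividing by N − 1).

```python
from statistics import mean, stdev, variance

# Hypothetical aggression scores for two groups of five children (made up).
# Both groups have the same mean but differ in how spread out they are.
tight = [14, 15, 15, 15, 16]
wide = [5, 10, 15, 20, 25]


def deviation_scores(scores):
    """Subtract the mean from every raw score."""
    m = mean(scores)
    return [x - m for x in scores]


for name, scores in [("tight", tight), ("wide", wide)]:
    devs = deviation_scores(scores)
    print(name, "mean:", mean(scores))
    print(name, "deviation scores:", devs)
    print(name, "sum of deviations:", sum(devs))
    print(name, "variance:", variance(scores))
    print(name, "standard deviation:", stdev(scores))
```

Both sets have a mean of 15, and in each the deviation scores sum to zero, which is why variance is built from squared deviations rather than raw ones. The wider set has much larger deviation scores and therefore a much larger variance (62.5 versus 0.5).
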
A standard deviation is like an average or mean. It can be thought of as roughly the typical size of the deviation scores. For example, if the standard deviation for our aggression scores is 1.4, we can assume that a typical raw score will be about 1.4 score units away from the mean. As an example (I am making this up), suppose that the mean aggression score is 15. Then the majority of aggression raw scores would likely fall between 13.6 and 16.4 (about two-thirds of them, if the scores are roughly normally distributed).

Virtually all statistical procedures used to establish relationships between variables depend on the notion of variance. Remember that variance is one way to measure variability in a set or distribution of scores. It is the mathematical holy grail. Only the mean is a more important statistical concept, and it is so only by virtue of its role in defining variance.

So how do we use variance to establish relationships between variables? For example, how can variance be used to find out if aggression is related to TV watching? Here is one way to think about this problem. Suppose that there is no relationship between aggression and TV watching. If this were the case, then a child's aggression score would have no connection to the child's TV watching score. It follows that a child's aggression deviation score would have no relation to the child's TV watching deviation score. Because aggression variance and TV watching variance are both based on their respective deviation scores, aggression variance would be unrelated to TV watching variance. In this case, we would say that none of the variance associated with aggression can be accounted for by variance in TV watching.

Now suppose that aggression and TV watching are perfectly related. That is, suppose that if we knew how much TV a child watched, we could predict their aggression score with perfect accuracy. In this case, a child's aggression score could be predicted from their TV watching score.
If this were the case, then all of the aggression variance would be accounted for by TV watching variance. In effect, TV watching scores could be substituted for aggression scores.

Of course, in the empirical world there are probably no perfect relationships between variables. You might be able to predict aggression based on knowledge of TV watching, but you couldn't predict with perfect accuracy. Differences in aggression are likely related to a wide variety of factors. It turns out that there are mathematical procedures that allow you to compute how much of the total variance in your dependent variable (aggression) is related to your predictor variable (TV watching), and how much of the aggression variance is unrelated to your predictor variable. The latter source of variance is often referred to as "residual variance" or sometimes "error variance". If two variables are closely linked with one another, there will be very little residual variance, and we would probably conclude that the variables are related to one another in some way.

FORMULA FOR VARIANCE:

$$s_x^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$$

where $X$ is a score, $\bar{X}$ is the mean of the score distribution, and $N$ is the number of scores.

The standard deviation is the square root of the variance.

$$s_x = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$
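
The formulas above, and the split of aggression variability into a part related to TV watching and a residual part, can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are made up, the function names are hypothetical, and a simple least-squares line is used as one common way to do the split (the text does not name a specific procedure).

```python
import math
from statistics import mean

# Hypothetical data for eight children (made up for illustration).
tv_hours = [1, 2, 2, 3, 4, 4, 5, 6]
aggression = [8, 11, 9, 14, 15, 18, 17, 22]


def sum_of_squares(scores):
    """Sum of squared deviation scores: sum of (X - mean)^2."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)


def sample_variance(scores):
    """Variance as in the formula above: sum of squares / (N - 1)."""
    return sum_of_squares(scores) / (len(scores) - 1)


def sample_sd(scores):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(scores))


# Assumption: a least-squares line predicts aggression from TV hours.
mx, my = mean(tv_hours), mean(aggression)
cross_products = sum((x - mx) * (y - my) for x, y in zip(tv_hours, aggression))
slope = cross_products / sum_of_squares(tv_hours)
intercept = my - slope * mx
predicted = [intercept + slope * x for x in tv_hours]

# Split the total variability in aggression into a part accounted for
# by TV watching and a residual ("error") part.
ss_total = sum_of_squares(aggression)
ss_residual = sum((y - p) ** 2 for y, p in zip(aggression, predicted))
ss_explained = ss_total - ss_residual
proportion_explained = ss_explained / ss_total

print("aggression variance:", sample_variance(aggression))
print("aggression standard deviation:", sample_sd(aggression))
print("proportion accounted for by TV watching:", proportion_explained)
print("proportion residual:", ss_residual / ss_total)
```

With no relationship, the proportion accounted for would be near 0; with a perfect linear relationship, it would be 1 and the residual part would vanish. In these made-up data the two variables are closely linked, so most of the aggression variability is accounted for.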
