Additional Problems, Often with Answers Reasoned Out and/or Calculations Shown

A Supplement and Learning Aid for:
Statistics and Data Interpretation for the Helping Professions

James A. Rosenthal
University of Oklahoma, Norman

Copyrighted. May be reproduced for use by students. No charges beyond standard copying costs may be charged.

Given that the problems and questions presented here typically provide the reasoning that underlies answers and also the calculation steps, students may want to complete them prior to completing the problems and questions at each chapter's end. (The chapter-end problems do not provide reasoning and calculations.) Supplemental problems and questions are presented here for all chapters. As some problems reference information presented in the text, you should have the text with you as you proceed.

To save disk space, graphical images are used sparingly. Where you encounter the square root symbol "√", that symbol should be interpreted as "spanning" the number that follows it. For instance, "√100" conveys the square root of 100.

Note: This document uses Courier, font size 11. If it does not format in this font for you, you may want to set it to this.

Chapter 1

1. A researcher conducts in-depth interviews, carefully recording the extended, open-ended, and lengthy responses of delinquent youth regarding their experiences in a wilderness adventure program. (T or F) This study uses qualitative methods rather than quantitative methods.

True. If the researcher had formulated questions that required participants to choose the best response from a list of responses (yes/no, agree/disagree, etc.), the study would have been characterized as quantitative. But as the researcher goes beyond this and encourages extended replies in the respondents' own words, the study is qualitative.

2. Think about the fact that the number "2" always has the same value (that being "2"). (T or F) The number "2" is a variable rather than a constant.

False.
As "2" always has the same value, it (and any other number) is a constant.

3. In a given class, 29 students have brown eyes and 1 has blue eyes. (T or F) In this class, eye color is a variable rather than a constant.

True. As eye color takes on different values (brown and blue), it is a variable. One could say that it is almost a constant, as almost everyone has the same eye color.

4. A market survey asks respondents to choose the magazine they prefer from the following list: Newsweek, Time, U.S. News and World Report.

a. (T or F) Newsweek, Time, and U.S. News and World Report are variables rather than values.
b. (T or F) In this example, type of magazine is a value rather than a variable.

Answers follow:

a. False (These magazines are values.)
b. False (Type of magazine is a variable.)

5. Indicate the level of measurement used for each of the following.

a. The number of books in children's houses is counted.
b. Attendees rate a workshop as: Excellent, Good, Fair, or Poor.
c. At a track meet, the distance that competitors throw the javelin is recorded.
d. Seniors at a high school vote on whether they would prefer a senior trip to: Washington, DC, New York City, or Boston.

Answers follow:

a. interval/ratio
b. ordinal
c. interval/ratio
d. nominal

6. (T or F) As it is usually measured (female or male), the variable sex is a dichotomous variable.

True. A dichotomous variable is one with (exactly) two values.

7. Respondents answer 10 questions about their self-esteem, responding "strongly agree", "agree", "not sure", "disagree", or "strongly disagree" to each question. Each response choice generates points, and points are added for the 10 questions to generate a self-esteem scale score. Respond to the following questions:

a. (T or F) This scale is an example of what the text means when it uses the term "multi-item attitudinal scale" (see page 8 near bottom).
b. (T or F) Strictly and formally, the level of measurement of the scale is interval/ratio rather than ordinal. (Hint: there may not be an easy answer; see page 8)
c. (T or F) The level of measurement comes extremely close to being interval/ratio.
d. (T or F) The text recommends against using statistical procedures designed for interval/ratio level variables with multi-item scales such as that presented in this example.

Answers follow:

a. True.
b. Hard to say if true or false. Scholars disagree. My view is that the measurement here is not strictly interval/ratio in the same sense as are physical quantities such as height, weight, etc. (see page 8)
c. True. And this is the key point. Even if it may be difficult to get scholarly concurrence that, in the strictest sense, multi-item scales reflect interval/ratio level measurement, pragmatically speaking, scholars treat such scales as being at this level as they carry out their everyday work.
d. False. Do use these procedures with multi-item scales such as this.

8. Indicate whether you feel each pair of variables is likely to be related or not to be related. (Technically, as no data is presented, one can't know assuredly whether or not the variables are related; but, hopefully, the examples are intuitive.)

a. Level of education and income.
b. Whether there are children in a family (yes or no) and whether that family owns a minivan (yes or no).
c. How much students like Brussels sprouts and the grade they earn on a statistics test.
d. Shoe size and height.
e. Whether or not there are clouds in the sky and whether or not persons bring rain gear with them (umbrella, raincoat, etc.)

Answers follow:

a. Related. On balance, the higher one's education level, the higher one's income.
b. Related. (I presume that) families with kids are more likely to own minivans than are those without.
c. Can't imagine that there is any relationship here. Some students who like Brussels sprouts will likely do well on the test and others won't. The same is likely true of those who don't like Brussels sprouts. There is no reason to think that those who like Brussels sprouts will tend to do either better or worse on the test than will those who don't like them.
d. Related. Those who are tall tend to have bigger feet than those who are short.
e. Related. People are more likely to bring rain gear when there are clouds than when there are not.

9. Suppose that you read that those who eat Brussels sprouts have lower cancer rates than those who do not. Why should you be somewhat hesitant to conclude that eating Brussels sprouts causes reduced risk for cancer? (By the way, Brussels sprouts are a small green vegetable in the cabbage family.) Hint: consider confounding variables such as those presented in the text at the bottom of page 9.

Perhaps those who eat Brussels sprouts differ from those who do not in other food preferences as well. Perhaps one of these differences, rather than eating Brussels sprouts, explains the reduced cancer risk.

10. For each pair of variables below, indicate which is the independent variable and which is the dependent variable. Be alert for one pair of variables where it is not possible to easily categorize one as independent and the other as dependent. To help in identifying the variables, they are in italics.

a. Hours spent studying for a test and grade earned on that test.
b. Adolescents at risk for being placed out of their homes are randomly assigned to Treatment A or to Treatment B. A research study then tracks whether or not the adolescent actually requires placement (yes vs. no).
c. How high students high jump and how far students long jump.
d. Whether social work graduate students have prior work experience in social work and their degree of satisfaction with their graduate program.

Answers follow:

a. Hours spent studying is the independent variable and test grade is the dependent variable. Presumably studying affects test grades.
b. Type of treatment (A or B) is the independent variable. Placement (yes, placed versus no, not placed) is the dependent variable. Presumably, the effectiveness of the program that serves adolescents (A versus B) affects their risk of subsequent placement.
c. Here it is not clear which variable would be the independent and which would be the dependent. In a situation such as this, one would not bother to attempt to classify the variables as independent or dependent. In essence, neither would be independent and neither would be dependent. These terms don't make any sense in this example, as neither variable would be readily viewed as causing the other.
d. Here prior experience is the independent variable, which (presumably) affects satisfaction, the dependent variable. (Clearly, it is nonsensical to think of satisfaction with school as affecting prior work experience; something that occurs later in time cannot affect something that occurs before.)

11. At a given school of social work, members of the student social work association fill out a questionnaire indicating their willingness to participate in a service project. (Some students at the school are members of the association and others are not. Each student decides on their own whether or not to join the association.)

a. Are the students in the association a sample of students in the school?
b. Are the students in the association a random sample of students in the school?
c. Do you know assuredly that the willingness of students in the student association to participate in the service project differs systematically from (that is, is biased relative to) that of all students at the school?
d. Can you rule out bias, that is, can you be sure that the opinion of those in the association does not differ systematically from that of all students at the school?
e. Do you suspect systematic bias? Stated differently, are you concerned about possible bias?
f. Presuming that you suspect bias, what is the nature of the bias that you suspect? Stated differently, in what direction (more willing versus less) do you suspect that the willingness of student association members to participate in the project differs from that of the full population of students in the school?
g. What type of a sample of students at the school should be taken to eliminate concerns about systematic bias?
h. If you took a random sample of students at the school, would you be assured that the willingness of these students to participate in the project would be precisely the same as that of the full population of students at the school?

Here follow answers:

a. Yes, they are a sample.
b. No, they are not a random sample.
c. Do not know this for sure. (Note: this question takes you a bit beyond information presented in the chapter. Whenever a sample is nonrandom, you are concerned about bias, but you cannot be 100% sure that bias is present.)
d. This is the key idea. While you don't know for sure that bias is present, you are concerned that such is the case.
e. Yes, whenever a sample is not random, one should be concerned about (should suspect) bias.
f. Given that the members of the association did indeed join at least one association (the student association), one suspects that they would be more likely to participate in the service project than would students who did not join the association. So, one suspects that, relative to the full school population, the study sample is biased in the direction of greater willingness to participate.
g. Though not a magic pill (and presuming that everyone selected for the sample did fill out their questionnaire), taking a random sample eliminates the systematic bias that could emerge from a nonrandom sampling method.
h. No. Selecting a random sample eliminates systematic bias. However, due to the luck of the draw, the opinions in the study sample would likely differ at least to some degree from those in the full population.

12.
(T or F) One may appropriately use inferential statistical procedures to draw conclusions about a population even when the study sample has not been randomly selected from that population.

False. The study sample must be a random sample from the population.

13. For each of the following, indicate whether the researcher is engaging in descriptive or inferential statistical procedures. If the researcher is engaging in inferential statistical procedures, indicate whether these are being used appropriately. (Hint: one may only use inferential procedures to draw conclusions about the population from which the study sample was randomly selected.)

a. A researcher reports that 27% of women in her study sample had experienced physical abuse by their partners.
b. A researcher finds that 17% of female social work students at a particular university have experienced physical abuse by their partners. S/he concludes that a similar percentage of all female students at the university have experienced such abuse.
c. A researcher takes a random sample of female students at a given university. Nineteen percent in this sample have experienced physical abuse by their partners. The researcher states that 19% is his/her best estimate of the percentage of all female university students who have experienced such abuse.

Answers follow:

a. This is simply descriptive statistics, describing the study sample.
b. This is an (incredibly sloppy) attempt at inferential statistics. It can be viewed as an attempt at inferential statistics because the researcher is indeed drawing a statistically-based conclusion about a group/population that is broader/larger than her study sample. This is not an appropriate/correct use of inferential statistics because the study sample was not a random sample from the population of students at the university. The conclusion drawn is not a valid one.
c. This is an appropriate use of inferential statistics. (Chapters 12 and 13 will demonstrate this kind of reasoning in much greater depth.)

14. Indicate: 1) whether each study uses random assignment, 2) whether (possible) bias from confounding variables raises questions about the conclusion that is drawn, and 3) if confounding variables are suspected, what these are and the nature of the bias that they may cause. (Two points: 1) if no mention is made of using methods of chance to assign persons to groups, assume that assignment is nonrandom, and 2) one needs to be concerned about confounding variables only in studies that do not use random assignment.)

a. By a random process, a computer assigns patients to Medical Treatment A or Treatment B. Those in A have much better outcomes. The researcher concludes that A is a more effective treatment than B.
b. Via a survey, a researcher finds that youth who watch more violent television are more aggressive in school than are those who watch less. She concludes that violent television causes aggressive behavior in school.
c. A researcher finds that children served in family foster homes are much more likely to reunite with their families of origin than are children served in group homes. The researcher concludes that the beneficial effects of family foster homes (individualized attention, interaction in a family setting, bonding, etc.) are the cause of the greater likelihood of reuniting.

Answers follow:

a. Assignment to groups is random and thus one need not be concerned about bias from confounding variables. Though the researcher's conclusion oversimplifies, it is essentially correct. (When you study statistical significance tests, you will learn more about how chance (the luck of the draw) affects study results, but we will hold that idea for later.)
b. There has been no random assignment to "watch" or "not watch" violent television. Perhaps youth who watch violent television are more likely to live in violent homes than are youth who do not watch such television. Perhaps higher levels of violence in the home, not watching violent television, is the cause of more aggressive behavior at school.
c. Children were not randomly assigned to family foster versus group homes. Perhaps children who were in foster family care tended to have less serious problems than did those who were in group homes. Perhaps these less serious problems, not the benefits of foster care over group care, explain why children served in foster homes are more likely to reunite with their families.

15. What one factor do researchers tend to look at more than any other as they attempt to decide whether study results obtained in one setting will generalize to another setting?

Degree of similarity between the settings. The greater the similarity, the greater the expected generalizability. Or, stated differently, the greater the similarity of the settings, the greater the expected similarity of study results.

Chapter 2

16. There are 20 women and 5 men in a classroom. What proportion are women?

Divide the number of objects with the characteristic by the TOTAL number of objects, and don't forget the decimal point:

20/25 = .80

17. For the prior problem, find the percentage of women.

Multiply the proportion by 100:

.80 × 100 = 80%

18. In Table 2.3 on page 24, what is the cumulative frequency of perpetrators aged 26 or below?

Sum the frequencies for all ages 26 and below:

2 + 1 + 2 + 2 + 1 + 1 = 9

(Alternatively, you could get the answer by looking in the Cumulative Frequency column, which lists cumulative frequencies.)

19. In Table 2.3, what is the cumulative percentage of perpetrators aged 26 and below?

Divide the cumulative frequency by the sample size and multiply by 100:

9/20 = .45, and .45 × 100 = 45%

20. In Table 2.3, what is the percentile rank of perpetrators 26 and younger?

Percentile rank and cumulative percentage are basically the same thing. Hence, the percentile rank was calculated in the prior problem and is 45 or 45%.
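The proportion, percentage, and cumulative-percentage arithmetic in problems 16 through 19 can be checked with a few lines of Python. This is only a sketch: the function names are illustrative (they do not come from the text), and the frequency list holds just the six frequencies quoted above for ages 26 and below, not the whole of Table 2.3.

```python
from itertools import accumulate


def proportion(count, total):
    """Share of cases with the characteristic, on a 0-to-1 scale."""
    return count / total


def percentage(count, total):
    """The same share expressed on a 0-to-100 scale."""
    return 100 * count / total


# Problems 16 and 17: 20 women in a class of 20 + 5 = 25 people.
women, class_size = 20, 25
print(proportion(women, class_size))
print(percentage(women, class_size))

# Problems 18 and 19: frequencies quoted above for ages 26 and below,
# with the sample size of 20 used in problem 19.
freqs_26_and_below = [2, 1, 2, 2, 1, 1]
sample_size = 20

# Running totals, the same idea as a Cumulative Frequency column.
cumulative = list(accumulate(freqs_26_and_below))
cum_freq = cumulative[-1]
cum_pct = percentage(cum_freq, sample_size)
print(cum_freq, cum_pct)
```

Multiplying by 100 before dividing keeps the percentage arithmetic exact for whole-number counts like these.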
21. The stem of a given value (score) is 8 and the leaf is 3. What is that value?

Presuming that the value in question has two digits, the stem is the first digit and the leaf is the second. The value is 83.

22. In Table 2.6 on page 27, how many persons scored 28 on the Cohesion scale?

A score of 28 would have a stem of 2 and a leaf of 8. In the row in which the stem equals 2, the only digits are a 2 (conveying a score of 22) and a 9 (29). Hence, no persons scored 28.

23. In Table 2.6 on page 27, how many scored 39?

Find the row in which the stem equals 3 and count the number of 9's. There are five 9's. Hence, 5 persons scored 39.

24. In Table 2.6 on page 27, what is the frequency of scores of 39?

Frequency is the number of scores of a given value. This was determined in the prior problem. The frequency of 39's is five.

Chapter 3

1. Find the mode of the observations: 2,7,2,2,4,5,6,4

The mode is the most frequently occurring value. The value 2 occurs three times, more than any other. Hence, the mode is 2.

2. Find the median of the observations in the prior problem.

The median is the middle value in the ordering or, where the number of values is even, the mean of the two "middlemost" values. For this problem, values must be ordered and then the mean of the two middlemost values must be calculated.

2,2,2,4,4,5,6,7

The two middlemost values are both 4's. The mean of these two values is also 4:

(4 + 4)/2 = 4

3. What is the mean of the observations in problem 1 above?

To calculate a mean, sum the observations and then divide by the number of observations.

2+7+2+2+4+5+6+4 = 32

There are eight observations (N = 8):

32/8 = 4

4. Find the median of the observations 2,119,2,2,4,5,6,4. (Hint: compare to the observations in problem 1.)

Changing the highest value to a more extreme value will not affect the value of the middle value (or of the middlemost values) and hence will not affect the median. The median, thus, is 4.

5. Find the mean of the values in the prior question.
Changing any value will affect the mean.

(2+119+2+2+4+5+6+4)/8 = 144/8 = 18

6. Which measure of central tendency, the median or the mean, is preferred for the data presented in the prior two problems?

The mean is a misleading measure when there are outliers or extreme values (e.g., 119). The median is preferred.

7. Consider the following frequency table:

Value        Frequency   Percent
poor              5          7
fair             10         14
good             40         57
excellent        15         21

What is the mode? median? mean?

The mode is the value with the greatest frequency. The median in a table is the first value with a cumulative percentage greater than or equal to 50. The mean requires interval/ratio level data.

The mode is "good", which occurs more than any other value.

Cumulative percentages can be derived by summing percentages: the first cumulative percentage greater than 50 is for "good". The cumulative percentage for good is: 7+14+57=78. The median is "good".

The mean cannot be computed, as the data is only at the ordinal level of measurement.

8. At a given mental health clinic, 22 persons are diagnosed as having a depression-related condition, 14 as having a personality disorder-related condition, and 5 as having a schizophrenic condition. What is the mode? median? mean?

To determine whether a given measure of central tendency can be calculated, one should focus on a variable's level of measurement. The mental health condition variable is at the nominal level. Only the mode can be calculated at this level. (A nominal-level variable has no mean and no median.)

The mode is depression-related condition, which occurs most often. There is no mean. There is no median.

Chapter 4

1. Sample 1 consists of 20 women and 80 men. Sample 2 consists of 200 women and 800 men. In which sample is the variability of gender greater?

Percentages rather than frequencies are used to assess variability. In both samples, percentages are identical: 20% women and 80% men. Hence, variability is equal in the two samples.

2. What is the range of the following data: 88,21,33,16,52,99,71

To calculate the range, subtract the lowest score from the highest.

99 - 16 = 83

The following data will be used for the next several problems: 12, 18, 15, 16, 14.

3. Calculate the mean deviation of the just-presented data.

To do so, carry out the steps as presented on page 61:

1) The mean is (12+18+15+16+14)/5 = 15
2) Deviation scores are: 12-15=-3; 18-15=3; 15-15=0; 16-15=1; 14-15=-1
3) Absolute values of deviation scores are: 3, 3, 0, 1, 1
4) Sum of absolute values is: 3+3+0+1+1=8
5) Dividing by N yields the mean deviation: 8/5 = 1.6

4. Via formula 4.5 on page 63, calculate the standard deviation (s) of the just-presented data.

Calculate the mean first; a grid like the one on page 62 then helps. As calculated in the prior problem, the mean is 15.

 X     X̄     X - X̄    (X - X̄)²
12    15      -3         9
18    15       3         9
15    15       0         0
16    15       1         1
14    15      -1         1
                   sum = 20

The next step is to divide by (N - 1):

20/(5 - 1) = 20/4 = 5

Finally, take the square root: the square root of 5 is 2.24, which is the standard deviation.

5. Via formula 4.7 on page 65, calculate the variance of the just-presented data.

The next-to-last step of the standard deviation formula (just prior to taking the square root) is the variance. The variance is 5.00.

6. Here are the scores in Group 1: 77, 56, 93, 84, 37
Here are the scores in Group 2: 75, 54, 91, 82, 35

Is the standard deviation of scores:
a) Greater in Group 1 than in Group 2?
b) Equal in the two Groups?
c) Greater in Group 2 than in Group 1?

Hint: compare the two sets of scores closely.

Adding or subtracting a constant does not affect the standard deviation. All scores in Group 2 are exactly two points lower than their corresponding score in Group 1. One could think of the scores in Group 2 as having been formed by subtracting (the constant) "2" from each score in Group 1. Subtracting the constant makes the mean of Group 2 lower, but does not affect the spread of scores around the mean.
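The spread calculations in problems 2 through 6 can be reproduced with Python's standard `statistics` module. This is a minimal sketch: `value_range` and `mean_deviation` are illustrative helpers written for this example (the standard library has no built-in for either), while `statistics.variance` and `statistics.stdev` divide by N - 1, matching the steps shown above.

```python
import statistics


def value_range(values):
    """Highest score minus lowest score."""
    return max(values) - min(values)


def mean_deviation(values):
    """Average absolute distance of each score from the mean."""
    m = statistics.mean(values)
    return sum(abs(x - m) for x in values) / len(values)


# Problem 2: the range.
range_q2 = value_range([88, 21, 33, 16, 52, 99, 71])

# Problems 3 to 5: mean deviation, standard deviation, variance.
data = [12, 18, 15, 16, 14]
mean_dev = mean_deviation(data)
variance = statistics.variance(data)  # sum of squared deviations / (N - 1)
std_dev = statistics.stdev(data)      # square root of the variance

# Problem 6: Group 2 is Group 1 with the constant 2 subtracted.
group1 = [77, 56, 93, 84, 37]
group2 = [x - 2 for x in group1]
sd1 = statistics.stdev(group1)
sd2 = statistics.stdev(group2)

print(range_q2, mean_dev, variance, round(std_dev, 2))
print(statistics.mean(group1), statistics.mean(group2))  # means differ by 2
print(sd1, sd2)                                          # spreads do not differ
```

Building Group 2 by subtraction, rather than typing its scores in, makes the "subtract a constant" idea explicit in the code.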
In other words, the standard deviations in the two Groups are equal. Choice "b" is correct.

7. Suppose that 97 of 100 (97%) students pass a given test. Why would it be difficult to assess the possible association of gender to passing (versus failing) the test?

Where variability is very limited, it is difficult to study relationships. Almost everyone has passed the test and, thus, it will be difficult to study the possible relationship of gender to passing. (If virtually no one fails, then the variable pass versus fail is, in effect, almost not a variable at all; it is very nearly a constant.)

Chapter 5

1. Name several "real world" variables that precisely follow a normal distribution.

The normal distribution is a mathematical, abstract distribution. No real variables precisely follow a normal distribution.

2. What do you think is the shape of the distribution of calories consumed by you each day last year?

Here is a response that probably applies to most:

Well, it is perhaps pretty close to normal, that is, on most days I eat an intermediate, "middle" amount of calories, and on fewer days I eat less or more. Thinking further, on some days I really pig out, that is, I eat huge numbers of calories. These few high-calorie days are, essentially, outliers that would stretch out the positive tail a long way. As the positive tail is extended more than the negative, I'd say the shape of the distribution is positively skewed, at least to a modest degree.

3. What do you think is the shape of the distribution you would get from throwing a die, say, 1000 times and each time recording the value thrown (a 1, 2, 3, 4, 5, or 6)?

Where each value has (about) the same frequency, a distribution is said to be flat or rectangular. Though luck would cause some variation in the numbers thrown, 1000 throws is a lot of throws, so luck would tend to "even out" and you should throw about the same number (though with some variation) of each.
Hence, the shape of the distribution would be fairly close to rectangular. (It certainly would not be close to normal.)

4. What do you think is the shape of the distribution of the number of movies seen in the past month by students at your college/university?

Most folks watch just a few movies. Some watch none. On the other hand, a few folks are movie junkies and watch tons and tons of movies. Those few who watch tons of movies are positive outliers. They stretch out the positive (right) tail of the distribution. Thus, most likely, the distribution is positively skewed.

5. What percentage of cases in a normal distribution are below the mean?

In a normal distribution, the mean, median, and mode all have the same value. By definition, 50% of cases are below the median. Hence, 50% of cases in a normal distribution are below the mean.

6. In a normal distribution, what percentage of cases have z scores of -1.00 or below?

In a normal distribution: 1) 50% of cases are below the mean and 2) 34% are between the mean and 1 standard deviation below the mean (see figure 5.13 on page 86).

50% - 34% = 16%

7. Ann scored 74 on a spelling test on which the mean was 84 and the standard deviation was 10. The shape of the distribution of scores on the test was negatively skewed. What is her z score?

A z score indicates how many standard deviations a score is above or below the mean. One may calculate a z score for any interval/ratio level variable, even those with skewed distributions. (What one can't do when distributions are not normal is figure out percentile ranks, percentages, etc.)

Ann's score is 10 points below the mean. The standard deviation is 10 points. Therefore Ann scored 1.00 standard deviation below the mean. This is the same thing as saying that her z score is -1.00.

8. On the same spelling test, Shawnda scored 89. What is her z score?

One can simply use the z score formula on page 88.

z = (89 - 84)/10 = 0.50

9. A distribution is negatively skewed. Is a z score of 0.32 above or below the mean?

Even for a nonnormally distributed variable, z scores above 0.00 are above the mean. It is above the mean.

10. What is the percentile rank of a z score of 0.32 in a negatively skewed distribution?

The normal distribution table (page 448) can only be used when variables are normally distributed. As the distribution described has a negative skew, the asked-for percentile rank cannot be determined.

11. Given a normal distribution, what is the percentile rank of a z score of 0.32?

Given that the shape is normal, the percentile rank can be determined. Via the normal distribution table on page 448, 12.6% of cases are between the mean and a z score of 0.32. In a normal distribution, 50% of cases are below the mean:

50 + 12.6 = 62.6 is the percentile rank

12. Lucia scored 83 on her self-esteem inventory. Does she have good self-esteem relative to her peers?

Where variables are abstract, one needs z scores (or means and standard deviations so that z scores can be determined) to interpret a score. Without the just-mentioned data, one cannot respond to the question; one does not know whether she has good self-esteem.

13. The mean of the just-presented self-esteem inventory is 100 with a standard deviation of 10 points. Does Lucia have good or poor self-esteem relative to her peers?

Now one can respond. Lucia's self-esteem is well below the mean:

z = (83 - 100)/10 = -1.7

She has reasonably poor self-esteem.

14. Given the information presented in the prior problem, what is the percentile rank of Lucia's self-esteem score?

To compute percentile ranks, one needs to know that the distribution is normal. As the shape of the distribution is not stated (and thus one does not know that it is normal or nearly so), one cannot determine Lucia's percentile rank.

15. Presuming a normal shape, what is Lucia's percentile rank?

Now this can be computed via the normal distribution table on page 448.
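For readers who want to check table lookups with software, the z scores and normal-curve percentile ranks in this chapter can be sketched in Python. The helper names are illustrative, and `statistics.NormalDist` (Python 3.8 or later) stands in for the printed table; it is not the text's method. As the answers above stress, the percentile step is valid only when the distribution is normal, whereas the z score itself can be computed for any shape.

```python
from statistics import NormalDist


def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd


def percentile_rank(z):
    """Percent of cases below z. Assumes a normal distribution."""
    return 100 * NormalDist().cdf(z)


# Spelling test: mean 84, standard deviation 10.
ann = z_score(74, 84, 10)
shawnda = z_score(89, 84, 10)

# Self-esteem inventory: mean 100, standard deviation 10.
lucia = z_score(83, 100, 10)

print(ann, shawnda, lucia)

# Percentile ranks, meaningful only under the normality assumption.
print(round(percentile_rank(0.32), 1))
print(round(percentile_rank(lucia), 1))
```

Software carries more decimal places than a printed table, so results may differ from table-based answers in the last digit before rounding.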
Via the third column, 4.5% of cases are more extreme than -1.7. All of these more extreme cases have lower values (are more extreme in a negative direction). Hence, the percentile rank is 4.5%.

Chapter 6

1. Indicate whether you believe each pair of variables is related:

a) How loud one partner snores and how much sleep the other gets.

Yes, the louder the snoring, the less the sleep.

b) Birth order and grades in school.

Probably so, though the relationship is likely weak. Some studies (and common wisdom) suggest that first-borns strive harder to achieve and, thus, on average, earn slightly higher grades than their siblings.

c) How late students stay out at a party and level of concentration in class.

Yes, the later one stays out (on average), the less one concentrates in class.

d) Ethnicity of actors in TV shows and whether or not they play a lead role.

My experience is that minority actors may be more likely than white actors to be assigned to a supporting role rather than to a lead one. If so, the variables are related.

e) Left-handedness

Ridiculous question. There is only one variable, so we can't assess whether there is a relationship.

f) How much one likes to dance and how much they like green beans.

Not sure, but presumably whatever relationship there is is quite weak. So, let's just say no, these variables are not related.

Observe the following contingency table:

Table 1
Gender and Choice of Social Work Concentration

                                        Female        Male         Total
Direct Practice (DP)                  100   84%     40   67%     140   78%
Admin. and Community Practice (ACP)    20   16%     20   33%      40   22%
Total                                 120   67%     60   33%     180

2. Respond to the following [answers provided to the right]:

a. What is the frequency of men in ACP? [20]
b. What are the column margin frequencies? [120, 60]
c. Does the table present row or column percentages? [column]
d. Which concentration do men choose most often? [DP]

Note: the fact that men choose DP more often than they choose ACP is irrelevant to the assessment of relationship.

e. Are the column percentages for women and men identical? [no]
f. Are women or men more likely to choose DP? [women]

This is the key to the assessment of relationship: the two groups differ in the pattern of their choices, in that women are more likely than men to choose DP.

g. Which variable is the dependent variable? [concentration]
h. What percentage of women choose DP? Men? [84% versus 67%]

The fact that percentages differ is the key. Because they do, gender and concentration are related.

Comment: Frequencies can differ simply due to numbers. For instance, presume a situation in which: 1) there are twice as many female students as male students and 2) the same percentages of women and men choose DP. In this situation, more women than men choose DP, but this greater number is simply due to the fact that women outnumber men. In this situation, the variables are unrelated.

i. Are gender and concentration choice related? [yes]
j. In your own words, describe that relationship.

Women are more likely than men to choose DP. But that is just one of many ways to describe the relationship (see page 104). For instance, one could also say: Women are less likely than men to choose ACP.

k. Via the column percentages, what is the D%? [84% - 67% = 17%]
l. Via Table 6.4 on page 109, interpret size of association.

The relationship is medium in size.

m. Does the table follow the text's tip (from the bottom of page 104) to make the independent variable the column variable? [yes]
n. Are the presented percentages in accord with the text's recommendation (in middle of page 102) to use column percentages when the independent variable is the column variable? [yes]

3. Eighty-one percent of non-minority college students versus 71% of minority students indicate that they have a personal computer. Are minority versus majority status and computer ownership related? If so, how strong is that association?

To determine whether variables are associated, see whether percentages differ.
To gauge size of association, compute the D% and use Table 6.4 on page 109 as a guide.

Percentages differ, so the variables are related. The D% is 10%, which most would view as indicative of a reasonably weak relationship.

4. Ninety-five percent of men versus 90% of women know the name of at least one player on their college’s football team. What’s wrong with simply dividing 95% by 90% to determine that the ratio of percentages is 1.056? Calculate the R% in a better way.

When computing the R%, one should use percentages closer to 0% rather than those closer to 100%.

The better calculation follows: If 90% of women know a name, then 10% do not. For men, 5% do not know: 10%/5% = 2.00 = R%. Hence, women are twice as likely as men not to know a name.

5. Twenty-four percent of social work students versus 8% of psychology students participate in a volunteer work day. What is the ratio of percentages with social work students considered to be the first group? With psychology students considered to be the first group? Which ratio is easier to communicate about and understand?

To calculate, divide the percentage in the first group by that in the second.

24%/8% = 3.00 with social work students as the first group
8%/24% = 0.33 with psychology students as the first group

Probably the first ratio is easier to communicate. One can say: “Social work students are three times as likely to volunteer as are psychology students.” On the other hand, one could also say: “Psychology students are one-third as likely to volunteer as social work students,” which is perhaps only a bit more difficult to understand.

Chapter 7

For convenience, Table 1 from the problem set for the prior chapter is repeated here:

Table 1  Gender and Choice of Social Work Concentration

                                        Female       Male        Total
Direct Practice (DP)                    100  84%     40  67%     140  78%
Admin. and Community Practice (ACP)      20  16%     20  33%      40  22%
Total                                   120  67%     60  33%     180

1. What are the odds that a woman will choose DP?
To determine odds, divide the number that experiences the event by the number that does not: 100/20 = 5.00

2. What are the odds that a woman will choose ACP? [20/100 = 0.20]

3. What are the odds that a man will choose DP? [40/20 = 2.00]

4. What is the odds ratio with women considered as the first group and DP as the event?

To determine an odds ratio, divide odds for the first group by those for the second.

Odds for both groups have already been calculated: 5.00/2.00 = 2.50

5. Now compute the odds ratio for choosing DP with men considered to be the first group.

One can use the method described in the prior question to answer this question. Alternatively, when group designations change, the resulting odds ratio is a reciprocal. (To calculate a number’s reciprocal, divide 1.00 by that number.)

Via the first method: 2.00/5.00 = 0.40
Calculating the reciprocal odds ratio: 1/(2.50) = 0.40

6. Which odds ratio conveys a stronger relationship, 2.50 or 0.40?

Hopefully, it is intuitive to you that the strength of the relationship does not change simply because we change which group is designated as the first group. Further, odds ratios that are reciprocals convey the same strength of relationship.

The two odds ratios convey the same strength of relationship.

7. Calculate the odds ratio for choosing ACP with women considered to be the first group.

Simply divide the odds for one group by the odds for the other or, alternatively, work directly from the numbers in the contingency table.

Via the method of dividing one group’s odds by the other’s: The odds for women were computed earlier to be 0.20. For men these are: 20/40 = 0.50. Thus, the odds ratio is: 0.20/0.50 = 0.40

Via the method of using the numbers in the table: (20/100)/(20/40) = (20/100) × (40/20) = 800/2000 = 0.40

8. Using the word likely, interpret (communicate about) the odds ratio in the table.

Generally, odds ratios greater than 1.00 are easier to communicate about than are those less than 1.00.
Hence, I will communicate using the odds ratio 2.50 rather than 0.40.

Two good ways to communicate results are:

Men are two and one-half times more likely than women to choose ACP. (Or, to be more formal: the odds that men will choose ACP are two and one-half times greater than are those for women.)

Women are two and one-half times more likely than men to choose DP. (Or, the odds that women will choose DP are two and one-half times greater than are those for men.)

9. Via Table 7.4 on page 118, interpret the size of the association between gender and concentration choice.

An odds ratio of 2.5 conveys a relationship that most would regard as being of medium strength.

10. Which odds ratio conveys the stronger association between variables, 0.1 or 7.0?

Odds ratios that are reciprocals convey the same size of association. So, one way to answer this question is to find the reciprocal of 0.1 and see if that is larger or smaller than 7.0. (To find a number’s reciprocal, divide 1.0 by that number.)

reciprocal of .1 = 1.0/.1 = 10.0

As the reciprocal of 0.1 is 10 and 10 is larger than 7, one may conclude that an odds ratio of 0.1 conveys a larger association than does one of 7.0.

11. (T or F) As the odds ratio gets closer and closer to 1.0, this conveys that the relationship is weakening rather than strengthening.

An odds ratio of 1.0 conveys the absence of association. Hence, as the odds ratio approaches 1.0, this conveys weakening relationship.

The correct answer is True.

12. Presume that an odds ratio is less than 1.0. (T or F) As this odds ratio gets closer and closer to 0.00, this conveys strengthening of rather than weakening of relationship.

Presuming that an odds ratio is less than 1.0, the closer that it gets to 0.00, the stronger the association. For instance, an odds ratio of .1 conveys a stronger association than does one of .2.

The correct answer is True.

13.
Where one or more variables in a nondirectional relationship has more than two categories, this text recommends using what measure of size of association?

In general, it recommends using Cramer’s V.

14. Among those who study less than one hour, 60% pass a test. Among those who study one to two hours, 80% do so. Among those who study more than two hours, 70% pass the test. Is there a relationship between studying and passing? Is this relationship directional?

Whenever there is a difference in percentages, there is a relationship. For the relationship to be directional, the percentages must have a directional (always increasing or always decreasing) pattern.

There is a relationship, but the relationship is not directional.

15. Among those who study less than one hour, 60% pass a test. Among those who study one to two hours, 80% do so. Among those who study more than two hours, 90% pass the test. Is there a relationship between studying and passing? Is this relationship directional?

There is a relationship and the relationship is directional. (The first paragraph in Section 8.13 in Chapter 8 on page 150 is a note on dichotomous variables that is relevant to this problem.)

Chapter 8

1. Find the lowest (vertical) marker in the scatterplot on page 131. How far did this case long jump and how high did it high jump?

Trace left from the marker to determine the score on the vertical axis variable and down to determine that on the horizontal axis variable.

Tracing left, the case high jumped about 3.2 feet. Tracing down, s/he long jumped about 10.0 feet.

2. Here are z scores for five persons for two statistics tests.

Person     z score on test 1     z score on test 2
Sally             .272                  .996
Juan            -1.30                 -1.54
Yi                .795                 -.325
Jeremy           -.774                  .183
Mary             1.00                   .692

What is the correlation of scores on the two tests?

As the formula for r (8.1 on page 133) indicates, to compute r one multiplies each person’s z score on the first variable by that on the second, sums these products, and then divides by N - 1.
Person     z score on test 1     z score on test 2     z on 1 × z on 2
Sally             .272                  .996                 0.27
Juan            -1.30                 -1.54                  2.01
Yi                .795                 -.325                -0.26
Jeremy           -.774                  .183                -0.14
Mary             1.00                   .692                 0.69
                                                    sum =    2.57

r = 2.57/(5 - 1) = 2.57/4 = .643

3. The independent variable is plotted on the ___________ axis and the dependent variable is plotted on the ___________ axis.

Independent = x (horizontal) axis
Dependent = y (vertical) axis

4. The correlation between number of visits of family members and reported happiness (measured on a standardized scale) of nursing home clients is 0.53. Would you characterize this as a strong or a weak association?

Via Table 8.3 on page 137, this relationship is a reasonably strong one.

5. Timothy’s z score in self-esteem is -0.80. The correlation of self-esteem scores and scores on a satisfaction with work scale is .44. What is Timothy’s predicted z score on the work scale?

Formula 8.2 on page 138 directs one to multiply the z score on the first variable by the correlation.

predicted z score on satisfaction = .44(-0.80) = -.352

6. The correlation of X and Y is 0.25. As the z score on X increases by 1.00 unit, what is the change in the predicted z score on Y?

As discussed in Section 8.6.2 on pages 139-140 and also on page 148 (last two paragraphs), r conveys the predicted change in standard deviation units on one variable as the other increases by one standard deviation. An increase of 1.0 (one) z score is the same as an increase of one standard deviation.

As the z score on X increases by 1.00, the predicted z score on Y increases by 0.25.

7. The correlation of score on a family functioning measure and a child’s grades in school is .23. Following calculation of this correlation, the researcher realizes that she mistakenly scored each case 10 points too high on the family functioning measure. What will be the correlation of family functioning scores and grades after she subtracts 10 points from each family functioning score to correct for the error?
As discussed in Section 8.7 on page 140, adding or subtracting a constant from each score (that is, from all scores) does not affect the value of r.

The correlation will be .23.

8. The correlation between two variables is -.70. What is the proportion of explained variance? What is the coefficient of determination, r²? What is the percentage of explained variance?

As discussed on page 141, to determine the proportion of explained variance (this is what r² measures), one squares r:

r² = (-.70) × (-.70) = .49; 49% of the variance is explained.

9. A given regression equation is: Predicted Y = -4 + (-5)X

What is ...

a. The constant [-4]
b. The regression coefficient [-5]
c. The slope of the regression line [-5]
d. The change in predicted Y as X increases by 1.00 [-5]

Comment: the regression coefficient, the slope, and the change in predicted Y as X increases by 1.00 always have the same value; they are, in essence, the same thing. The coefficient B conveys all of these.

e. Predicted Y when X = 3

Carry out the math: -4 + (-5)(3) = -4 + (-15) = -4 - 15 = -19

f. Predicted Y when X = 0

Here the predicted Y is simply the constant, -4: -4 + (-5)(0) = -4 + 0 = -4

g. Predicted Y when X = -3

-4 + (-5)(-3) = -4 + 15 = 11

10. A given scatterplot displays two variables using z scores. The correlation between these variables is 0.33. What is the slope of the regression line?

Referring to page 148, the slope of the regression line for a standardized regression equation (that is, for a regression equation that uses z scores) equals r (the correlation coefficient).

The slope of the regression line is 0.33.

11. A given regression equation examines the relationship between the number of school absences (dependent variable) and family income (independent variable) measured in dollars. Would the standardized regression equation or the unstandardized regression equation be preferred for communicating about this relationship? Or might both be helpful?

To some degree this is a matter of judgment.
As both variables are tangible and in everyday, easy-to-understand units of measure, the unstandardized regression equation would have easily interpreted meaning. It would convey the predicted change in absences as income increased by $1.00. On the other hand, the regression coefficient in the standardized equation is, simply, r, which conveys strength of association.

Hence, both equations would yield useful information. (Where both variables are not tangible/familiar, the unstandardized equation is, generally speaking, less useful.)

12. Consider an upcoming statistics test that covers the past four weeks of work. Suppose that I developed a scatterplot showing the relationship between time spent studying (independent variable, X axis) and grade earned on the test (dependent variable, Y axis). Suppose further that I could measure time studying all the way out to 500 hours (let’s presume, say, that this was an experiment, where for the purposes of science, some persons agreed to study for 500 hours). What do you think would be the shape of the line/curve (or scatterplot dots) that would depict the relationship between time spent studying and grade earned? (Hint: would it be straight?)

Here is my thinking: Presumably for the first 50 hours of study, the more one studied, the better, on average, they would do on the test. After this amount of time, they would reach a point of diminishing returns. Pretty soon, extra study would only make them sleepy, frazzled, and nonfunctional. Hence, after 100 hours or so, I would say that the line would slope downward: the more the hours studied, the worse the grade. Those who studied 500 hours would all be asleep and miss the exam and score 0.

If my thinking is correct, the relationship between study and grade will be a curvilinear one. (In a four-week period there are only 672 hours to begin with, so 500 hours of study leaves little time for anything else, including sleep.)

Chapter 9

1.
The mean number of days of school missed in the school year is 7.0 at School 1 and 14.0 at School 2. The standard deviation is 5 days at both schools. What is the standardized difference between means?

Here the standard deviation of the two groups is assumed to be the same. To find the SDM, divide the difference in means by the standard deviation.

SDM = (7.0 - 14.0)/5 = -7/5 = -1.4

2. Using Table 9.1 on page 158 as a guide, how would you characterize in words the size of the difference in standard deviation units calculated in the prior problem?

This difference would likely be characterized as very large.

3. Students who attend one or more help sessions earn an average score of 92 on a statistics test. Students who do not attend any sessions earn an average score of 87. What is the SDM?

Trick question. Cannot calculate the SDM without the standard deviation.

4. Presume the standard deviation within groups for the prior problem is 8 points. What is the SDM?

Divide the difference in means by the standard deviation.

SDM = (92 - 87)/8 = 5/8 = .625

5. Where the SDM is 1.00, and assuming equal standard deviations and normal distributions, about what percentage of cases in the group with the higher mean are located above the mean of the group with the lower mean?

See if, using your knowledge of the normal curve, you can reason this out in your head. (You may want to look at Figure 9.2 at top of page 161.)

The mean of the lower group is located 1 standard deviation below that of the higher group. Thirty-four percent of cases in the higher group are located between its mean and the mean of the lower group (68% within 1 standard deviation divided by 2 = 34%). Further (by definition) 50% of cases in the higher group are located above its mean.

34% + 50% = 84% of cases in the group with the higher mean are located above the mean of the group with the lower mean.

6.
You observe the following information regarding the number of police contacts for youth participating in two different intervention programs.

Group       Mean     SD      n
Group 1     10.0     6.0     25
Group 2      7.0     3.5     25

Is the SDM an appropriate measure for this data? Why or why not?

For the SDM to be an appropriate measure, the standard deviations of the groups should be approximately equal. The text suggests that the larger should be no more than one-third larger than the smaller.

(6.0 - 3.5)/3.5 = 2.5/3.5 = 0.71; 0.71 × 100 = 71% larger

The SDM is not an appropriate measure as the larger standard deviation is 71% larger than the smaller.

7. Is the SDM an appropriate measure for the following data? If so, what is the estimated SDM? If so, comment on the size of association.

Group       Mean     SD      n
Group 1     10.0     8.0     25
Group 2      7.0     6.0     25

For the measure to be appropriate, the larger standard deviation should not exceed the smaller by more than about 33%. Assuming that the measure is appropriate, one must calculate an average standard deviation and then apply Formula 9.2 on page 163.

(8 - 6)/6 = 2/6 = 1/3 = 33% -> just barely within the 33% guideline

Hence, we may calculate the SDM. The first step is to calculate the average standard deviation within the groups:

(8 + 6)/2 = 14/2 = 7

Now use the average standard deviation to estimate the SDM:

SDM = (10.0 - 7.0)/7 = 3/7 = 0.43

Via Table 9.1 on page 158, the difference is approximately medium in size.

8. Estimate the SDM for the following data and comment on the size of the observed difference/relationship:

Group       Mean     SD      n
Group 1     20.0     12.0    25
Group 2     30.0     15.0    25

First, check to see if the 33% guideline is met. Next, calculate an average SD. Next, carry out Formula 9.2 on page 163.

(15 - 12)/12 = 3/12 = .25; .25 × 100 = 25%: The guideline is met.

The mean standard deviation within the groups is: (15 + 12)/2 = 27/2 = 13.5

SDM = (20.0 - 30.0)/13.5 = -10.0/13.5 = -0.74

Most would regard the size of the difference as large.

9.
An instrument measuring reading skills is administered to first graders who have been exposed to two different reading curricula. Calculate the SDM and comment on the size of association.

Group            Mean     SD      n
Curriculum 1     10.0      8.0    25
Curriculum 2     17.0     10.0    40

First, check to see if the guideline is met. This problem differs from prior ones in that sample sizes are unequal. Hence, Formula 9.3 on page 163 will be used to estimate the standard deviation within groups.

(10 - 8)/8 = 2/8 = .25 = 25% -> the guideline is met

Via Formula 9.3 on page 163, the estimated standard deviation within groups (s_wg) is:

[(25 × 8) + (40 × 10)]/(25 + 40) = (200 + 400)/65 = 600/65 = 9.23

The estimated SDM is: (10 - 17)/(9.23) = -7/(9.23) = -0.76

In the context of social science, this difference is reasonably large.

10. Where eta equals 0.16, what does eta squared equal? In this situation, what is the percentage of explained/shared variation? Would you characterize the size of the observed association as trivial, quite weak, or strong?

To calculate eta squared, one squares eta. One then multiplies by 100 to determine the percentage of explained variation. Eta squared can be a misleading measure, as it “sounds” smaller than the actual size of association would be judged to be.

First calculate eta squared: .16 × .16 = .026

Hence, 2.6% of the variance is explained.

As a rough gauge for interpreting size of association, consult Table 8.3 on page 137. Via this table, the association is weak, but not trivial. (Note that in using Table 8.3, one uses the value of eta (.16), not that of eta squared (.026).)

11. The Kendall’s tau-b that measures the association between support for public welfare programs (1 = very low ... 5 = very high) and liberal versus conservative political views (1 = very conservative ... 5 = very liberal) is .45. Is there a relationship between support for welfare programs and (liberal) political views? If so, does the relationship have direction? If so, is the direction positive or negative?
Comment on the strength of the association.

Only a tau-b of 0.00 would convey the (complete) absence of directional relationship. Tau-b measures directional association. Values of tau-b greater than 0.00 convey positive relationship. Table 8.3 on page 137, for the correlation coefficient, can provide a very rough measure of the size of association indicated by tau-b.

Yes, there is a relationship. Yes, it does have direction. The relationship is positive (as support increases, so also does liberalism). Via Table 8.3, the relationship is reasonably strong.

Chapters 10 and 11 (combined)

Here are some data on type of foster placement for a hypothetical state. Placement with kin versus other placement (i.e., non-kin) for minority and nonminority children is presented.

                         Number of          Number placed       % placed in
                         children placed    in kinship home     kinship home
Minority children        1000               700                 70%
Nonminority children     1000               440                 44%
Total                    2000               1140                57%

1. Use the just-presented table on foster placement to respond to the following questions/statements:

1) Is there an association between minority (versus nonminority) status and type of placement (kinship home versus other)?
2) If there is an association, describe it in your own words.
3) If there is an association, how would you characterize its size?
4) Can you identify any potential confounding variables, ones that might be affecting, for instance, the size of the association?

(Clarification on reading this table: If, for instance, 70% were placed in kinship homes, you may assume that 30% were placed in other homes.)

I will number the thoughts that I have according to the four subquestions.

1. If percentages differ, the variables are related.

2. Section 6.4.3 in Chapter 6 on page 104 shows some different ways to describe association. I will pick one that follows the guidelines for describing association.

3.
Table 6.4 on page 109 provides a starting point for describing the size of association represented by a given difference in percentages.

4. Just try to put on my thinking cap on this one.

Here are the “answers”:

1. Yes, the variables are related.

2. To describe the relationship, here are a couple of possibilities: Minority children are more likely than nonminority children to be placed in kinship homes. Nonminority children are more likely than minority children to be placed in nonkinship (other) homes.

3. The difference in percentages is 26% (70 - 44 = 26). This is a fairly large difference; the association is a reasonably strong one.

4. Perhaps minority children, as a group, are placed at a younger age than are nonminority children. If younger children tend to be placed with kin, then age at placement may be a (partial) explanation for the relationship. What about geographic area, particularly city environment versus other environment (suburbs, rural, etc.)? These are also possible confounding variables. You might think of others.

2. The following table presents the just-discussed data, controlling for environment, city versus other:

                         Children who live in city            Children in other environments
                         Number     Number in    % in         Number     Number in    % in
                         placed     kinship      kinship      placed     kinship      kinship
                                    home         home                    home         home
Minority children         800        600         75%           200        100         50%
Nonminority children      400        240         60%           600        200         33%
Total                    1200        840         70%           800        300         38%

Respond to the following questions:

1) In each subgroup (city environment and other environment) is there an association between minority status and kinship placement?

2) Is the strength of association in each subgroup reasonably similar or quite different?
3) (Given that the strength of association in each subgroup is similar): Is the strength of association when controlling for environment (that is, the strength of association within the subgroups) stronger than, about the same as, or weaker than that prior to control?

4) Do you think that the relationship between minority status and kinship placement is entirely due to type of environment? Is it partially due to this?

My thoughts on each of the four just-posed questions:

1) Just need to look to see whether percentages differ.

2) Just compare the difference in percentages in the two subgroups.

3) Compare the (average) D% in the subgroups to that prior to control.

4) This probes ideas presented in Chapter 11. The basic logic that would guide a response would be: If there is still a relationship within each subgroup following control, the initial relationship cannot be due entirely to the controlled-for variable. Further, if the relationship weakens when the variable is controlled for, then the controlled-for variable is indeed exerting some effect on the size of the initial relationship.

“Answers” to the questions:

1) Yes, in both subgroups, minority children are more likely to be placed in kinship homes than are nonminority children.

2) For city: D% = 75% - 60% = 15%
For other: D% = 50% - 33% = 17%
Clearly the size of association within the subgroups (15% and 17%) is similar.

3) The average difference in the subgroups is: (15% + 17%)/2 = 16%
The D% prior to control for environment was: 70% - 44% = 26%
Hence, following control, the size of association is noticeably smaller. Stated differently, the size of association within the subgroups is smaller than the association observed prior to control.

4) Type of environment is partially responsible, as the association weakens when it is controlled for. On the other hand, as the association persists (does not disappear) within the subgroups following control, it is not fully responsible.
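The comparison in Question 2 can also be checked directly from the counts. The short Python sketch below is only an illustration: the dictionary layout and the names pct and d_percent are my own choices, not anything from the text. It computes the D% within each environment and the D% prior to control by adding the city and other counts back together. Because it works from the unrounded counts, the "other" subgroup comes out at about 16.7%, which rounds to the 17% used above.

```python
# Counts taken from the two foster placement tables:
# (number of children placed, number placed in a kinship home).
# The layout and names here are illustrative assumptions only.
counts = {
    "city": {"minority": (800, 600), "nonminority": (400, 240)},
    "other": {"minority": (200, 100), "nonminority": (600, 200)},
}


def pct(placed, kin):
    """Percentage of placed children who are in kinship homes."""
    return 100.0 * kin / placed


def d_percent(groups):
    """Difference in percentages (D%): minority minus nonminority."""
    return pct(*groups["minority"]) - pct(*groups["nonminority"])


# D% within each environment (that is, controlling for environment).
subgroup_d = {env: d_percent(groups) for env, groups in counts.items()}

# Add the environments together to recover the table prior to control.
overall = {
    status: tuple(sum(counts[env][status][i] for env in counts) for i in range(2))
    for status in ("minority", "nonminority")
}
overall_d = d_percent(overall)

average_subgroup_d = sum(subgroup_d.values()) / len(subgroup_d)

for env, d in subgroup_d.items():
    print(f"D% within {env}: {d:.1f}")
print(f"Average D% within subgroups: {average_subgroup_d:.1f}")
print(f"D% prior to control: {overall_d:.1f}")
```

A smaller average D% within the subgroups than prior to control is the "association weakens" pattern; a D% that stays above zero in both subgroups shows that the association does not disappear.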
3. Regarding the just-presented table in Question 2, would you be willing to conclude that the relationship between minority status and kinship placement is causal? Why or why not?

Section 10.6, beginning on page 179, presents discussion on the role of random assignment and difficulties in drawing causal conclusions in its absence.

Most importantly, other variables not controlled for in the table may be generating the relationship between minority status and placement. For instance, age of child could be a factor if: 1) minority children tend to be placed when younger and 2) younger children are more likely to be placed in kinship care. Even if the relationship persisted when one controlled for several variables, one should be cautious in drawing a causal conclusion. (There could still be other uncontrolled variables that have been neither thought of nor measured.) On the other hand, minority communities have considerable experience with and expertise in raising children in relative (kin) homes. So there are theoretical reasons to suggest that minority kin step forward more often into the parenting role. But the bottom line is that one should be cautious regarding the attribution of causality.

4. Suggest a causal model for the relationship.

Some of the following discussion perhaps takes you a bit beyond the ideas presented in Chapter 10 but, hopefully, is instructive:

The fact that a relationship continues to exist even when type of environment is controlled for opens up the possibility that something about minority status (greater experience with raising relatives outside of the nuclear family?) does indeed contribute to (cause) increased use of kin. We cannot draw a definitive conclusion on this based on the current research design (basically a survey rather than an experiment) because a myriad of possible variables are not controlled for.
In causal models, the absence of direct effect is signaled by the disappearance of an association following control (see Section 11.5.2, pages 198-200). As the relationship between minority status and kinship placement does not disappear with control for environment, we will presume that minority status does indeed have a direct effect on the likelihood of kinship placement. (In other words, we will presume a direct effect even though we recognize that we have not controlled for all possible variables and that it is conceivable that control for the “right” variable could make the effect disappear.)

As has already been demonstrated, the presented table allows us to examine the association of minority status to kinship placement, controlling for city versus other environment. The table also allows us to examine the association of type of environment (city versus other) to kinship placement, controlling for minority status. For instance, among minority children, 75% of those in the city versus 50% of those in other environments are placed in kinship homes, a difference in percentages of 25% (D% = 75% - 50% = 25%). For non-minority children, this difference in percentages is: D% = 60% - 33% = 27%. Hence, even when controlling for minority status, type of environment and kinship placement are associated. As such, it makes sense to think of city environment as having a direct effect on kinship placement.

Finally: Minority status and city environment are associated. This association can be discerned by observing that 67% of children placed in city environments (800/1200 = 67%) versus only 25% (200/800 = 25%) of those placed in other environments are minority children (D% = 67% - 25% = 42%). One could view minority status as causing increased likelihood of being in a city environment. And, perhaps, one could also view city environment as causing increased likelihood of minority status.
I think, however, that it makes most sense to think of minority status and city environment simply as being associated. Where one is not sure of or chooses not to speculate on the cause of an association, that association is depicted via a curved arrow (see, for instance, Figure 10.7 on page 187).

In sum: 1) minority status may have a direct effect on kinship placement, 2) city environment may have a direct effect on kinship placement, and 3) minority status and city environment are associated. So the best model is probably similar to that in Table 10.8 on page 188.

Minority status  ------------>
  (curved arrow)                 Kinship placement
City environment ------------>

(In case the above model does not print/appear correctly, a curved arrow connects minority status and city environment and straight arrows lead from each of these to kinship placement.)

5. Which pattern from page 196 (association persists at about the same strength, association weakens, association disappears, association varies by group) is best illustrated by the table presented in Question 2 above?

Examine the results.

“Association weakens” is the unequivocal choice. The association clearly weakens, ruling out “association persists”. Yet, it clearly does not disappear, ruling out “association disappears”. The association is of similar size in the two subgroups (D% = 15% versus D% = 17%), so this rules out “association varies by group.”

6. Does the table in Question 2 above demonstrate an interaction effect?

When there is an interaction, the pattern of association differs according to the value of a third variable. The fourth pattern, “association varies by group”, is an example of an interaction effect.

Size of association is essentially the same (15% versus 17%) in the two subgroups. In other words, size of association varies hardly at all according to type of environment, city versus other. Thus, there is no interaction effect.

Chapter 12

1.
A social work professor administers an instrument measuring career motivations to the 30 students in her class. She reports on their responses in the school’s newsletter. Is this sample of students a random sample? Is she carrying out inferential or descriptive statistics? Are the motivations of students in this sample likely to be representative of the motivations of students at the college/university?

Random samples are selected via formal methods of chance. If one simply reports data for the sample (with no intent to draw conclusions beyond the sample), this is descriptive statistics. Though one does not know assuredly that a nonrandom sample is biased, the more important point is that one does not know that it is not.

It is a nonrandom sample. As there is no mention made that the professor is attempting to use the data to draw any conclusions beyond the sample, she is using descriptive statistics. One suspects bias in the sample with respect to the college population for at least some career motivations. For instance, social work students may be less motivated by money than the “typical” student.

2. A health researcher in a community is studying asthma in children. Via random methods, he selects a sample of families in the community. Where a family is selected, he includes all children in that family in the study. With respect to children, is the independence of selection criterion met?

For the independence of selection assumption to be met, the selection of each case must have no bearing (effect) on whether another case is selected.

The independence of selection assumption is not met (the children and television example on page 216 is similar to this one).

3. A research study examines a given group work therapeutic intervention. Group members who receive the intervention interact extremely closely with each other. Is the independence of observations assumption met? What do you suggest doing?
As Section 12.5 on pages 219-220 discusses, it can be very difficult to know whether, strictly speaking, this assumption is met in situations where study participants interact considerably, and thus the progress of one participant could affect that of others. Further, even where the assumption is violated, the effect of this violation can be difficult to assess. And, further, there may be no easy remedy.

To respond: We don’t know assuredly that the assumption has been met. Pragmatically, one would likely assume that the assumption was met and go ahead and carry out the appropriate statistical procedures.

4. A large state agency wants to determine the functioning level of clients receiving services at its mental health clinics and, thus, administers a scale to clients at one particular clinic. The mean score at this clinic is 25 with a standard deviation of 10. Can one appropriately conclude that the mean score of clients in the state is 25 and that the standard deviation is 10?

To make valid statistical inferences to a population, one’s sample must have been randomly selected from that population.

No, as the sample is not a random one, the conclusion is not appropriate.

5. Presume for the prior problem that a random sample of 25 clients from across the state was selected. (Such clients would be selected from among those at all clinics across the state rather than from just one.) Can one appropriately conclude that the mean score of clients in the state is 25 and that the standard deviation is 10? Is 25 the best estimate that one can make of the mean score? Is 10 the best estimate of the standard deviation? Is the estimate of the mean unbiased? Is the estimate of the standard deviation unbiased?
(For this question, the reasoning behind and the actual answer are combined into one response, which follows.)

As we are estimating the population parameters, we can’t appropriately conclude the population mean is exactly 25 or that the standard deviation is exactly 10. But given that the sample is a random one, these are our best estimates. The estimate of the mean is unbiased. The estimate of the standard deviation is biased but the degree of bias is negligible. (That bias is so small that we may ignore it.)

6. The mean household income in a large city is $45,000 with a standard deviation of $20,000. The distribution of income is positively skewed. Suppose that one took an infinite number of random samples of size 100 and for each one recorded its mean.

a. What is the special name of the distribution that one would build by the above process?
b. What would be the mean of the distribution?
c. What would be the standard deviation of the distribution?
d. What would be the shape of the distribution?
e. What percentage of cases would be located within one standard deviation of the distribution’s mean?

The central limit theorem as presented in Section 12.8 (pages 223-5) guides reasoning. The mean will be the population mean. The standard deviation will be that in the population divided by the square root of sample size. Given that N ≥ 100, the distribution’s shape will be nearly normal. The percentage of cases within one standard deviation is common knowledge by now.

a) the sampling distribution of the mean
b) $45,000
c) $20,000 / √100 = $20,000 / 10 = $2,000
   Note that in this document the square root symbol (√) is always interpreted as spanning the number(s) that follows it. For instance, √100 conveys the square root of 100.
d) close to normal
e) about 68%

7. All facts the same as in prior problem except that sample size is 400 rather than 100. Respond to the same questions.

a) the sampling distribution of the mean
b) $45,000
c) $20,000 / √400 = $20,000 / 20 = $1,000
d) close to normal
e) about 68%

8. Continuing to think about the prior problem. If the mean of a given sample is $44,000, what is the sampling error?

As presented on page 223, to determine sampling error, subtract the population parameter from the sample statistic.

Sampling error = $44,000 - $45,000 = -$1,000

9. (T or F) In actual research studies, the precise amount of sampling error can be calculated.

False. In actual research, one knows only the value of the sample statistic. As one has taken only a (random) sample, the exact value of the population parameter is unknown. One does know, however, that as samples get larger and larger, the likely amount of sampling error tends to decrease. Even though one knows this, one does not know the precise sampling error in any particular study.

Chapter 13

1. There are 100 students in a social work class. The mean age of these 100 students is 25.0 years with a standard deviation of 6.0 years. What is the 95% confidence interval for the mean age of students in the class?

Trick question: Think about whether a confidence interval is needed here, or, on the other hand, whether you already know more than the confidence interval would tell you. Where one has access to the full population, there is no need for a confidence interval.

We already know the mean age exactly. Hence, there is no need to compute a confidence interval. Indeed, computing one would be illogical, nonsensical, and wrong.

2. Consider again the 100 students in the social work class in the prior problem whose mean age is 25.0 years with a standard deviation of 6.0 years. Presume that they attend a large state university with, say, 60,000 students. What is the confidence interval for the mean age of students at the University? (Think carefully before responding.)
For a confidence interval to be accurate, the sample used to compute it must be a random sample from the population to which the confidence interval applies.

The social work students in the class are not a random sample of students at their University. Hence, it is inappropriate to calculate a confidence interval.

3. In a given state, close to 100,000 persons receive a given welfare benefit. The mean age of 100 such persons in one city in the state is 22.0 with a standard deviation of 8.0 years. What is the 95% confidence interval for the state? (Think.)

Confidence intervals are only valid when the sample is randomly selected from the population for which the interval is formed.

As the persons all come from one city, they are not a random sample of persons from across the state. Hence, a confidence interval may not be formed.

4. Exact same situation as presented in prior problem, except presume that the sample is a random one drawn from the 100,000 persons in the state who receive the benefit. What is the 95% confidence interval for the state?

As the sample is a random one, one may calculate the confidence interval. To do so, use the steps shown on page 237.

S_X̄ = 8/√100 = 8/10 = 0.80 (the estimated standard error of the mean)
95%CI = 22.0 ± (1.96 × 0.80) = 22.0 ± 1.57 = 20.43 to 23.57

5. Calculate a 99% confidence interval for the same data, but this time presume that the distribution is positively skewed.

So long as sample size is 100 or greater, the shape of the sampling distribution will be nearly normal. This is the case even when the distribution from which samples are selected is distinctly nonnormal. Hence, the fact that the distribution is positively skewed does not prevent us from calculating the confidence interval. We simply need to use the formula for the 99% interval as presented on pages 238-9. (The estimate of the standard error of the mean was calculated in the prior problem.)

99%CI = 22.0 ± (2.58 × 0.80) = 22.0 ± 2.06 = 19.94 to 24.06

6.
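The arithmetic in problems 4 and 5 above can be checked with a short script. What follows is a minimal Python sketch rather than anything from the text: the function name mean_confidence_interval is hypothetical, and it assumes a random sample large enough (N of 100 or more) for the normal multipliers 1.96 and 2.58 to apply.

```python
import math


def mean_confidence_interval(mean, sd, n, z):
    """Large-sample confidence interval for a mean.

    Hypothetical helper for illustration: returns the estimated
    standard error of the mean and the lower and upper limits.
    """
    se = sd / math.sqrt(n)      # estimated standard error of the mean
    margin = z * se             # half-width of the interval
    return se, mean - margin, mean + margin


# Problem 4: mean = 22.0, s = 8.0, N = 100, 95% interval (z = 1.96)
se, lo95, hi95 = mean_confidence_interval(22.0, 8.0, 100, 1.96)
# Problem 5: same data, 99% interval (z = 2.58)
_, lo99, hi99 = mean_confidence_interval(22.0, 8.0, 100, 2.58)

print(f"standard error = {se:.2f}")            # 0.80
print(f"95%CI = {lo95:.2f} to {hi95:.2f}")     # 20.43 to 23.57
print(f"99%CI = {lo99:.2f} to {hi99:.2f}")     # 19.94 to 24.06
```

Only the multiplier changes between the two intervals, which is why the 99% interval is the wider one.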
144 participants in a welfare-to-work program are randomly selected from a much larger number who participated in the program. The mean income in the 12-month period following the termination of the program for these 144 participants is $19,500 with a standard deviation of $8,100. What are:

a. The shape of the sampling distribution of the mean?
b. To determine the CI, would the researcher actually construct a sampling distribution of the mean by drawing many random samples?
c. The estimate of the standard error of the mean?
d. The 95%CI?
e. The 99%CI?
f. Which confidence interval is wider?
g. Which confidence interval is better, the 95% or the 99%? (Think carefully.)

Logic for the answers follows:

a. Wherever N ≥ 100, we know the shape of the sampling distribution of the mean is nearly normal (we know this regardless of the shape of the distribution from which samples are selected).
b. This distribution is theoretical and is estimated via drawing only one random sample (the study sample).
c. Simply divide the standard deviation of the sample by the square root of the sample size.
d. Carry out the formula on pages 236-7.
e. Carry out the formula on pages 238-9.
f. We don’t even need to do the calculations to answer this question: other things being equal (i.e., standard deviation and sample size), the 99%CI is always wider than the 95%CI.
g. Neither is “better”: the 95%CI is narrower (more precise) but we have less confidence (95%) that it includes the population mean. The 99%CI is wider (less precise) but we have greater confidence (99%) that it includes the mean.

Here are the answers:

a. Nearly normal
b. No.
c. $8,100 / √144 = $8,100 / 12 = $675
d. $19,500 ± 1.96($675) = $19,500 ± $1,323 = $18,177 to $20,823
e. $19,500 ± 2.58($675) = $19,500 ± $1,742 = $17,758 to $21,242
f. 99% is wider
g. Neither is better (though the 95%CI is certainly much more common)

7. A random sample of 324 residents in a large city responds to a poll on social services.
Fifty-eight percent support plans to build a new community facility to serve those with serious mental illness and 42% oppose these plans.

a. Is the sample size sufficient for calculating a confidence interval for the proportion who support the facility?
b. What is the standard error of the proportion?
c. What is the 95%CI of the proportion?
d. Is the proportion in the city population (the population from which the random sample was selected) located within the range specified by the confidence interval?
e. How much confidence can we have that the true proportion (unknown) in the city population is within the range specified by the 95%CI?
f. What is the 99% confidence interval of the proportion?
g. How much confidence can we have that the proportion in the population is within the range specified by the 99%CI?

Here are the answers:

a. Yes, via Table 13.1 on page 241, where the sample proportion is .58 the minimum needed sample size is 13. (324 greatly exceeds 13.)
b. S_p = √[(.58)(1 - .58) / 324] = √[(.58)(.42) / 324] = √(.24 / 324) = √.00074 = .027
c. 95%CI = .58 ± 1.96(.027) = .58 ± .05 = .53 to .63
d. No way to know for sure, as we have only selected a sample for study
e. 95% (as 95% of 95% CI’s include the population proportion, we have 95% confidence that any given one does so)
f. 99%CI = .58 ± 2.58(.027) = .58 ± .07 = .51 to .65
g. 99%

8. About 100,000 clients participate in a public welfare program. In a random sample of 289 clients, 72% hold jobs one year after their discharge from the program.

a. Is sample size sufficient for computing a confidence interval?
b. What is the estimate of the standard error of the proportion?
c. What is the 95%CI?

Answers follow:

a. Where p = .72 via Table 13.1 on page 241, the minimum sample size is 60, so the size guideline is met.
b. S_p = √[(.72)(1 - .72) / 289] = √[(.72)(.28) / 289] = √(.20 / 289) = √.00070 = .026
c. 95%CI = .72 ± 1.96(.026) = .72 ± .05 = .67 to .77

9. An agency serves 400 clients in a given year.
It selects a random sample of 200 clients and sends them a survey. 100 of these clients respond. One question on the survey asked: “Would you recommend this agency to a friend?” Eighty-three percent replied yes.

a. Is sample size sufficient to calculate a 95%CI?
b. What percentage of the population was included in the study sample? How does this affect whether one may carry out the confidence interval formulae presented in Chapter 13?
c. What is the response rate? Do you think the lower than desired response rate introduces possible bias?
d. Should you compute a confidence interval?

Here are the answers:

a. Where p = .83 the minimum recommended sample size is 180. As the number who responded (N = 100) is less than this, the formulae presented in the chapter are not sufficiently accurate. (Footnote 3 on page 248 refers one to an appropriate formula for this situation.)
b. 50% of the population was sampled (200/400 = 50%). This exceeds the 10% guideline as presented at the beginning of Section 13.4 on page 244. This being the case, the confidence interval formulas presented in the Chapter do not yield accurate results. See note 5 on page 248 for how to deal with this situation.
c. 100/200 = .50, and .50 × 100 = 50%, so the response rate is 50%. The problem with a low response rate is that one can never be sure of whether response bias has been introduced and, if so, of its direction. A logical conjecture in this situation is that those who respond have, on balance, more positive opinions than those who do not. Hence, my best guess is that in the full agency population of 400 clients, fewer than 83% would make a positive recommendation to a friend.

Chapter 14

1. What is the probability of selecting at random from a normal distribution a case with a z score of 0.66 or higher?

For this problem, simply look in the third column of Table A.1 on page 448, which indicates the proportion of cases greater than the given (positive) z score. This proportion is the probability.

The probability is .255.

2.
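The table lookup in Chapter 14, problem 1 can be cross-checked numerically. The following is a minimal Python sketch, not part of the text: the function name upper_tail is hypothetical, and it assumes a standard normal distribution, obtaining the upper-tail proportion from the complementary error function in Python's standard math module.

```python
import math


def upper_tail(z):
    """Proportion of a standard normal distribution lying above z.

    Hypothetical helper for illustration; it plays the role of the
    "proportion greater than z" column of a normal table.
    """
    return 0.5 * math.erfc(z / math.sqrt(2))


p = upper_tail(0.66)
print(round(p, 3))   # 0.255, matching the table value
```

A table rounds to three places, so small differences in the last digit between a table and a computed value are to be expected for other z scores.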
What is the probability of selecting at random from a normal distribution a case with a z score of -0.66 or higher?

To find the proportion of cases greater than a negative z score, one needs first to find the proportion of cases between that score and the mean and then to add on .50 to reflect those cases above the mean. This proportion is the probability.

The second column of Table A.1 on page 448 indicates that the proportion of cases between the mean and a z score of -0.66 is .245. Next: .245 + .500 = .745. Thus, the proportion of cases with z scores greater than -0.66 is .745. Hence, p = .745.

3. The appropriate statistical significance test is used to examine which of two interventions is more effective at reducing depression. (The test compares results for clients randomly assigned to one intervention or the other. Presume also that study participants were randomly selected from a very large population.) The chosen statistical significance level is .05. The probability resulting from the test is .29. As .29 > .05, the null is accepted. May one:

a. Decide/conclude that the null is true?
b. Be sure that the null is true?
c. Be confident that the null is true?
d. If your response to b. and c. above is “No”, what may you conclude?

The answers follow:

a. Yes, one decides/concludes that the null is true. Yet, in making this decision, one recognizes that one does not have access to all of the cases in the population and, thus, holds open the possibility that the decision could be wrong (in other words, one recognizes the possibility that the null could be false).
b. No. Given that only a sample is studied, one can never be 100% sure that a null is true. (And it is also the case that one can never be 100% sure that it is false.)
c. No. Even this language is too strong. One cannot be confident that the null is true.
d. Basically, one may conclude that the null is not inconsistent with study results. In other words, given a true null, the study results are not unusual or unexpected. The fact that the null is consistent with (not inconsistent with) study results does not allow us to conclude that the null is the only hypothesis that is consistent. As such, we cannot be confident that it is true. One more comment: In this example, we may also conclude that sampling error alone is a plausible explanation for the study results. (See first paragraph of Section 14.8 on page 257.)

4. A researcher finds extremely strong support in the literature that Therapy A enhances self-esteem in young adults. She finds also many studies indicating that Therapy B does not do so. She decides to do a study comparing these two therapies. As such, she randomly assigns participants to Therapy A or Therapy B and conducts the appropriate statistical test on self-esteem scores.

a. Would you be inclined to use a directional or a nondirectional hypothesis pair?
b. Formulate a directional pair. (Your pair should pertain to mean self-esteem scores.)

Here are the answers:

a. Though the choice of which type of pair to use is up to the researcher, the compelling findings from prior studies would recommend a directional pair.
b. The “trick” here would be to be sure that the direction of expected study results is stated in the research hypothesis (rather than in the null). The pair might read something like:
   Null: The mean self-esteem score of those who receive Therapy A is less than or equal to that of those who receive Therapy B.
   Research: The mean self-esteem score of those who receive Therapy A is greater than that of those who receive Therapy B.

5. For each of the following, indicate whether one would accept or reject the null given that the .05 statistical significance level has been selected. (p stands for the probability of the study sample result.)

a. p = .008
b. p = .052
c. p = .0500000000000... (exactly .05)
d. p = .23
e. p = .02

Answers: a. reject; b. accept; c. reject; d. accept; e. reject

6. Using the information presented in the prior problem, indicate whether to accept or reject the null using the .01 statistical significance level.

Answers: a. reject; b. accept; c. accept; d. accept; e. accept

7. (T or F) Whenever one accepts (fails to reject) the null, the study result is not statistically significant.

True. These two decisions always go together.

8. (T or F) Whenever one rejects the null, the study result is statistically significant.

True. These two decisions always go together.

9. (T or F) In deciding that the null is true, one is essentially deciding that sampling error alone is a plausible explanation for the difference between the study sample result and the condition stated in the null.

True. One is concluding that the study sample result could simply reflect sampling error. (If the sample result is due to chance alone, there is no real difference in the population and, thus, the null is true.)

Chapter 15

1. (T or F) Where a test is robust to an assumption, violation of that assumption greatly reduces the accuracy of that test.

False. The key idea behind robustness is that even when the assumption is not met, the probability resulting from the test is still highly accurate. Where a test is not robust to an assumption, violation of that assumption can greatly reduce accuracy. In such a situation, the test should not be used.

2. (T or F) Where a test is not robust to an assumption, violation of that assumption can greatly reduce the accuracy of that test.

True.

3. (T or F) Where all of the assumptions of a test are met, that test yields accurate results, that is, accurate probabilities.

True. So, the gist is, one may carry out a given test when: 1) its assumptions are met or 2) its assumptions are not met but the test is robust to violated assumptions. But one should not carry out a test where both of the following are true: 1) an assumption is violated and 2) the test is not robust to that assumption.

4.
(T or F) Hypothesis statements pertain to the sample rather than to the population from which the sample was randomly selected.

False. Hypothesis statements pertain to the population.

5. Presume that the mean of a given population is 10.0 points with a standard deviation of 1.0 point and that the population is negatively skewed. Suppose that one picks an infinite number of random samples of size 100. What is ...

a. The mean of the sampling distribution
b. The standard deviation of the sampling distribution
c. The shape of the sampling distribution
d. The percentage of samples with means greater than 10.1
e. The percentage of samples with means that are either > 10.1 or < 9.9
f. The probability of selecting at random a sample with a mean > 10.1
g. The probability of selecting at random a sample with a mean > 10.1 or < 9.9

Answers follow:

a. 10.0
b. .1 (1.0 / √100 = 1.0 / 10 = 0.1)
c. nearly normal (the key point here is that the sampling distribution has a normal shape even though the population from which samples were selected has a non-normal shape)
d. 16% (this is the percentage of cases in a normal distribution that are more than one standard deviation above the mean; a score of 10.1 is 1 standard deviation above the mean)
e. a score of 9.9 is one standard deviation below the mean; so this question is asking for the percentage of cases that are either more than one standard deviation above the mean or more than one standard deviation below (the combined percentage); this percentage is 32%
f. if 16% of samples have means greater than 10.1, then the probability of randomly selecting such a sample is .16
g. .32 (just double the probability from the prior question)

6. Presume that the probability resulting from a given large sample test equals .08. Presume also that the test is two-tailed, that the .05 significance level is used, and that test assumptions are met.

a. What percentage of means in the sampling distribution differ from the mean stated in the null by more than does the study sample mean?
b. What percentage of sample means in the sampling distribution are more extreme than the study sample mean?
c. What is the probability of selecting at random from the sampling distribution a sample with a mean that differs from the mean stated in the null by more than does the study sample mean?
d. What is the probability that the study sample result is due to chance alone?
e. Given a true null, what is the probability of obtaining the study sample result or an even more extreme result?
f. Is the study sample result likely to be due simply to chance?
g. Is chance alone a plausible explanation for the study sample result?
h. Should the null be rejected?
i. Is the result statistically significant?

Answers follow:

a. 8% (see top of p. 275 for discussion)
b. 8% (see top of p. 275 for discussion)
c. .08 (see top of p. 275 for discussion)
d. .08
e. .08
f. The results of significance testing indicate the probability that the study sample result is due to chance alone. As this probability is only .08 (8 times in 100), we would not conclude that it is likely that the result is due to chance alone. On the other hand, via the criterion of the .05 significance level the result (p = .08) is not sufficiently unlikely (sufficiently rare) for us to conclude with 95% confidence (see page 257) that chance alone is not the explanation. In a nutshell, though chance alone is a reasonably unlikely explanation (8 times in 100), it is not sufficiently unlikely (needs to be 5 times in 100) for us to rule out chance with sufficient (95%) confidence to reject the null.
g. Yes, even if chance alone is a somewhat unlikely explanation, it is a plausible explanation
h. No, the probability given by the test (.08) exceeds that associated with the significance level (.05) and so the null is accepted (we fail to reject the null)
i. No. Whenever one fails to reject the null, the result is not statistically significant.

6. (T or F) When the probability that chance alone could lead to a result that is as extreme as or more extreme than the study sample result is less than alpha, one rejects the null.

True. (As you should know, alpha is the probability associated with the significance level.) When the probability that the result is due to chance alone is less than alpha, one rejects the null.

7. In a large state, the mean family income of persons who have graduated from its public welfare programs is $14,000. A random sample of 225 persons takes part in an intensive program. The mean family income in this sample is $14,900 with a standard deviation of $4,500. The null hypothesis is that the mean of the population from which the study sample was drawn equals $14,000. The research hypothesis is that it does not equal $14,000. Based on the just-presented information, respond to the questions 23a to 23l on page 293 of the text.

Answers for a. through l., in order:

Nondirectional (as equal/not equal logic rather than greater than/less than logic is used).
The mean is simply the value stated in the null: $14,000.
To estimate the standard deviation, divide the sample standard deviation by the square root of the sample size: $4,500 / √225 = $4,500 / 15 = $300 (see p. 280 for an example of calculation).
The shape is extremely close to normal.
z = ($14,900 - $14,000)/$300 = $900/$300 = 3.00
Let’s reword this question to “... a mean as high as or even higher than”. The study sample mean has a z score of 3.00. Via the normal distribution table on pages 448-9, the proportion of cases with z scores > 3.00 is .0013. So, the probability is .0013.
To solve this, simply multiply the proportion from the prior problem by 2: 2 × .0013 = .0026
As the hypothesis pair is nondirectional, the statistical test is two-tailed.
This is the proportion of samples with means that are as extreme as or more extreme than the study sample mean. This proportion is .0026 (as the test is two-tailed, cases in both directions were considered).
.0026 is < .05
Where alpha equals .05, the rejection area in each tail includes 2.5% of cases.
Yes. [In a two-tailed test where alpha equals .05, the lower rejection area consists of cases where z ≤ -1.96 and the upper area consists of cases where z ≥ 1.96. In our example, z = 3.00.]
We reject the null and accept the research hypothesis.
.0026 [The probability that the result is due to chance alone is always the same as, indeed is the same thing as, the probability of obtaining the result given a true null.]

8. (T or F) When a result is not statistically significant, the researcher (correctly) reasons that it may be due to chance alone.

True. This is the nuts and bolts meaning of a result that is not statistically significant. (In this case, one accepts the null.)

9. (T or F) When a result is statistically significant, the researcher (correctly) reasons that it is unlikely to be due to chance alone.

True. This is the nuts and bolts meaning of a result that is statistically significant. (In this case, one rejects the null.)

10. At which level of significance, .05 or .01, does one have greater confidence that the null is indeed false? (In responding presume that the null is rejected.)

One has 99% confidence that the null is false when the .01 level is used, but only 95% confidence when the .05 level is used. So, one has greater confidence at the .01 level.

11. Where the research hypothesis states “greater than,” the rejection region is in the ________ tail. Where it states “less than,” it is in the _______ tail.

Greater than -> upper; less than -> lower

12. Indicate the appropriate decision regarding the null (accept or reject) for each result of a large sample test of X̄ (the sample mean):

a. Two-tailed test, alpha = .05, z = 2.22
b. Two-tailed test, alpha = .01, z = 2.22
c. One-tailed test, alpha = .05, research hypothesis states greater than, z = 2.22
d. One-tailed test, alpha = .01, research hypothesis states greater than, z = 2.22
e. One-tailed test, alpha = .05, research hypothesis states greater than, z = -2.22
f. One-tailed test, alpha = .05, research hypothesis states less than, z = -2.22
g. Two-tailed test, alpha = .05, z = -1.88
h. One-tailed test, alpha = .05, research hypothesis states less than, z = -1.88

Answers follow:

a. reject as absolute value of obtained z ≥ 1.96
b. accept as absolute value of obtained z is < 2.58
c. reject as obtained z ≥ 1.645
d. accept as obtained z < 2.33
e. accept as obtained z < 1.645 (result in this example is in opposite direction to that expected; opposite to that stated in the research hypothesis)
f. reject as obtained z ≤ -1.645 (as research hypothesis states less than, rejection region is in the negative tail)
g. accept as absolute value of obtained z is < 1.96
h. reject as absolute value of obtained z is ≥ 1.645 (Here is an example of where a one-tailed test rejects the null but a two-tailed one (see prior example) does not.)

Chapter 16

13. (T or F) As sample size increases, other things being equal, statistical power increases.

True. As sample size gets larger, the width of the sampling distribution “shrinks” and, thus, smaller differences from the null result in rejection. In essence, it gets easier to reject the null and power increases.

14. (T or F) The greater the likely amount of sampling error, the greater the statistical power.

False. Absolutely false. When a result differs from the condition stated in the null, one never knows assuredly whether this difference is 1) simply due to sampling error or 2) also reflects a real difference (i.e., that the null is false).
The less the likely amount of sampling error, the greater the ability of the statistical test to rule it out as the sole explanation for the sample result. (Viewed differently, the less the likely amount of sampling error, the greater the test’s ability to detect real differences.) Where sampling error can be ruled out with sufficient confidence (at the .05 level, with 95% confidence), one may reject the null.

15. Where the risk of a type II error is .25, what is the statistical power?

The risk of type II error is symbolized by β (beta). As presented on page 297: Power = 1 - β

Power = 1.00 - .25 = .75

16. H0: μ = 50, s = 20, α = .05, two-tailed test. For each of the following sample sizes: 1) estimate the standard error of the mean and 2) indicate (approximately) the range of sample results that result in rejection of the null.

a. n = 100
b. n = 400
c. n = 1600
d. n = 6400

In case you need help in “decoding” symbols, μ (pronounced “mew”) conveys the hypothesized population mean; s is the standard deviation in the study sample; and α is alpha, the probability associated with the significance level. For simplicity, we will assume that a difference of two standard deviations (rather than 1.96) provides sufficient accuracy. Basically, any result that is more than two standard deviations (two times the estimated standard error of the mean) less than or greater than μ will result in rejection. So to answer the problem(s), one needs to: 1) estimate the standard error (s divided by the square root of the sample size), 2) multiply the estimated standard error by 2, 3) add and subtract from the value stated in the null (μ = 50), and 4) write out the approximate range of values that result in rejection.

Here follow the answers (S_X̄ is the estimated standard error of the mean):

a. S_X̄ = 20/√100 = 20/10 = 2; 2 × 2 = 4; ≤ 46 or ≥ 54
b. S_X̄ = 20/√400 = 20/20 = 1; 1 × 2 = 2; ≤ 48 or ≥ 52
c. S_X̄ = 20/√1600 = 20/40 = 0.5; .5 × 2 = 1; ≤ 49 or ≥ 51
d. S_X̄ = 20/√6400 = 20/80 = 0.25; .25 × 2 = .5; ≤ 49.5 or ≥ 50.5

17. Consider results for the prior problem as you respond to the following questions.

a. As sample size increases, does the size of the difference from the condition stated in the null that is necessary for rejection increase or decrease?
b. (Presuming that the null is false), as sample size increases does power increase or decrease?

Some general comments follow. As sample size increases, the standard error of the mean gets increasingly smaller, which tells us that the sampling distribution is getting progressively narrower (see page 300 for visual presentation of this concept). As sample size gets progressively larger, the likely (expected) amount of sampling error is getting smaller. (This is illustrated by the progressively “shrinking” sampling distributions on page 300.) The less the expected sampling error, the easier the job of the statistical test, which is to see the degree to which it can rule out sampling error as a plausible explanation for the difference between the condition stated in the null and the sample result.

Answers follow:

a. The necessary size of the difference decreases.
b. Power increases.

18. Suppose that you want to know which of two interventions A or B is more effective in helping mothers of high risk new-born babies keep their post-natal medical appointments. Suppose further that you have 10 such mothers on your caseload. Presume that you randomly assign 5 to receive each intervention. Comment on:

a. The degree to which confounding variables can be ruled out as explanation for observed differences that you may see.
b. The likely statistical power.
c. Presume that the null is accepted. This being the case, should one conclude that the two interventions are equally effective?

My comments follow:

a.
The fact of random assignment rules out confounding variables. There is no reason to be concerned that systematic differences between the two groups would affect your results. So, if you would see a statistically significant result you could have reasonably good confidence that the explanation for that result is the treatment intervention (greater effectiveness of one intervention than the other) rather than some confounding variable (one group had greater initial motivation to seek medical care than the other, etc.) b. The problem here is poor statistical power. (One can’t be sure the power is poor as, if, for instance, one intervention was anticipated to work much, much, much better than the other, the small sample size here might be sufficient for adequate power (see Section 16.5.4 on page 305). But presuming that one was not expecting a huge difference in effectiveness, power would be poor.) Given the small sample size, chance fluctuations (same thing as sampling error) are likely to be quite large in this situation. Even if the study result reveals a large difference from the condition stated in the null, this difference will, most likely, not be sufficient to rule out sampling error as a plausible explanation. Another way to think about this is to reason that just by the luck of the draw (luck of the random assignment process), the characteristics of the two groups may differ considerably. For instance, with only five in each group, it is not unlikely that those mothers (randomly) assigned to one group may be a much more motivated group of mothers than those assigned to the other. c. Here is where “fail to reject” so much more clearly expresses the conclusion that you should draw than does “accept”. Your conclusion should be that there is insufficient evidence to conclude that one intervention works better than the other. You appropriately conclude that the null (no difference) provides a plausible explanation for the study result. 
In “accepting” the null, you are concluding that sampling error alone is a sufficient explanation for the difference between the study sample result and the condition stated in the null (no difference). The null is the simplest explanation, and, so, if it is sufficiently consistent with study results, we accept it. However, the fact that the null is consistent does not affirm that it is true. Another hypothesis may also be consistent. In the current example, power was low and we never had a realistic opportunity to reject the null. Viewed differently, we never had a realistic opportunity to consider whether the research hypothesis might be true. (See Section 14.10, pages 260-1.)

19. (T or F) Presuming that the study sample result is in the expected direction, the size of the difference that is necessary to reject the null is smaller for a one-tailed test than for a two-tailed one.

True. With a one-tailed test, assuming the result is in the expected direction, a smaller difference will result in rejection. (And, thus, power is greater.)

20. A school of social work wants to see whether there is a statistically significant difference between the mean salaries of its female (n = 80) and male (n = 8) graduates. Comment on the likely statistical power.

Via Table 16.1 on page 308, one is tempted to respond “fair/moderate” and this is not a bad answer. However, there is very little variability in gender (more than 90% of graduates are female). This low variability reduces power (see last sentence in Section 16.7 on page 308), so “poor” or “low” are better responses.

21. You read the results of a very large experiment involving 10,000 women and 10,000 men that tells you that “the risk for men of having ‘Health Problem A’ exceeds that for women by a statistically significant amount.” Assuming that this is the only information reported, why should one not conclude that there is (necessarily) a large difference in risk between men and women?
(If it is a statistically significant difference, it follows that it is a large difference, doesn’t it?)

Where sample size is very large, even trivial-sized differences may be statistically significant. This is so because with large sample sizes, expected sampling error is very small, so even tiny differences between the sample result and the condition stated in the null (in this example, equal risk for men and women) may be sufficient to reject the null. This being the case, you may not, in this situation, conclude that the difference in risk is large. The difference may be large, it may be medium, it may be small, or it may be trivial. From the provided information, you don’t know which of these is the case. (Some related examples are presented on page 303.)

22. When one uses the .05 significance level, does p = .04 convey a statistically significant result? Should one reject the null?

Yes, the result is statistically significant. Yes, one should reject the null. Statistical significance and rejecting the null go hand in hand; they are one and the same. When the result is statistically significant, you always reject the null.

23. Pragmatically speaking, may statistical significance testing be carried out in the absence of random sampling?

Most researchers say yes. This is the case because whether chance is a possible explanation for the study sample result is often an important question even when the sample is nonrandom. In essence, one presumes random sampling (as this is a basic assumption of, in essence, all statistical tests), and then carries out the significance test.

24. (T or F) (Presuming that the null is false), where power equals .60, the probability that a correct decision to reject the null will be made is .60.

True. Power is the probability that a false null will be rejected.

25. (T or F) Where power equals .30, the significance test has an excellent opportunity to reject the null.

False.
In this situation (given that the null is false), the test has only a 30% chance of resulting in the correct decision to reject.

26. (T or F) The statistical power in a given situation is .50. Via the standards of social science research, this is an acceptable level of power.

False. By tradition, social science’s standard for acceptable power is .80.

27. (T or F) When a result is statistically significant, one accepts (fails to reject) the null.

False. When a result is statistically significant, one rejects the null. These two decisions/events go hand-in-hand and are, in essence, the same thing.

28. (T or F) When a result is statistically significant, the difference between the condition stated in the null and the study sample result is likely to be due to chance alone.

False. Absolutely false. When a result is statistically significant, this difference is unlikely to be due to chance alone. It likely reflects a real difference, something more than chance. Stated differently, where a result is statistically significant, sampling error alone is an unlikely explanation for the sample result. As such, one concludes that (in the population from which the study sample was randomly selected) the null is false, that is, one rejects the null.

Chapter 17

1. (T or F) When the shape of the distribution of the population is strongly skewed, one should have a sample size of at least 100 in order to conduct a highly accurate one-sample t test.

False. In such a situation a sample size of 30 suffices (see Table 17.1 on page 324).

2. (T or F) As sample size increases, the shape of the t distribution and that of the normal distribution become increasingly different.

False. As sample size increases, the shape of the t distribution increasingly resembles that of the normal distribution.

3. The shape of the distribution of scores in a study sample has a very strong (extreme) positive skew (N = 11). In this situation, should one calculate a confidence interval? Why or why not?

No.
One uses the shape of the study sample distribution to estimate that of the population from which the sample was (randomly) selected. Where the shape of the population is skewed to an extreme degree, Table 17.1 guidelines recommend a minimum sample size of 60. (Given extreme skew), where the sample size is less than 60, the shape of the sampling distribution does not closely approximate a t distribution. This being the case, confidence intervals that make use of the t distribution are not accurate.

4. The mean score of a random sample of 25 clients on a depression inventory is 50.0 points with a standard deviation of 10.0 points. The study sample’s distribution has a modest negative skew.

a. Is sample size sufficient to calculate a confidence interval?
b. What is the estimated standard error of the mean (s_X̄)?
c. How many degrees of freedom are there?
d. What is the value of t in Table A.2 on page 450?
e. What is the 95%CI?
f. Interpret in your own words the “meaning” of the just-computed 95% confidence interval.
g. What is the 99% confidence interval?
h. Interpret the just-computed 99% confidence interval.

Answers follow:

a. Yes, via Table 17.1 on page 324 a sample of 8 would have been sufficient.
b. s_X̄ = 10.0/√25 = 10.0/5 = 2.00
c. For confidence intervals for sample means using the t distribution: df = N - 1, so df = 25 - 1 = 24
d. Tracing down the 95%CI column to df = 24 yields a t of 2.064
e. 95%CI = 50 ± (2.064)(2.00) = 50 ± 4.128 = 45.872 to 54.128
f. We can be 95% confident that the mean depression score in the population from which the study sample was selected is between 45.872 and 54.128.
g. 99%CI = 50 ± (2.797)(2.00) = 50 ± 5.594 = 44.406 to 55.594
h. We can be 99% confident that the mean depression score in the population from which the study sample was selected is between 44.406 and 55.594.
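The confidence interval arithmetic in the depression inventory problem above can be checked with a short script. This is only a sketch: the function names are made up for illustration, and the t values (2.064 and 2.797 for df = 24) are typed in from the answers above rather than computed, since the text reads them from Table A.2.

```python
import math


def standard_error(sd, n):
    """Estimated standard error of the mean: s divided by the square root of N."""
    return sd / math.sqrt(n)


def t_confidence_interval(mean, sd, n, t_critical):
    """Return (lower, upper) limits: mean plus or minus t times the standard error.

    t_critical is assumed to be looked up in a t table for df = n - 1;
    this sketch does not compute it.
    """
    margin = t_critical * standard_error(sd, n)
    return mean - margin, mean + margin


# Values from the problem: mean = 50.0, s = 10.0, N = 25 (df = 24).
low95, high95 = t_confidence_interval(50.0, 10.0, 25, 2.064)
low99, high99 = t_confidence_interval(50.0, 10.0, 25, 2.797)

print(f"Standard error: {standard_error(10.0, 25):.2f}")
print(f"95%CI: {low95:.3f} to {high95:.3f}")
print(f"99%CI: {low99:.3f} to {high99:.3f}")
```

Note that the 99% interval is wider than the 95% interval: greater confidence requires a larger value of t and, thus, a wider interval.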
5. The mean behavior problems score on a given scale of 64 children who were adopted at or near infancy is 80.0 points with a standard deviation of 24.0 points. The mean score of a large representative sample of “typical” children is 75.0 points. Your task is to determine whether the mean score of the sample of adopted children differs to a statistically significant degree from that of the representative sample of “typical” children. Your hypothesis pair should be directional and the research hypothesis should state a higher mean for the adoptees. Use the .05 significance level. Presume that the study sample has a moderate negative skew. Use the just-provided information to respond to questions a. through m. for question 21 on page 335.

a. Null: the mean of the population from which the sample of adopted children was randomly selected is less than or equal to 75.0.
Research: the mean of the population from which the sample of adopted children was randomly selected is greater than 75.0.
Or to state the same in shorter format:
Null: the mean for the adopted children is less than or equal to 75.0
Research: the mean for the adopted children is greater than 75.0
b. The one-sample t test should be used. You would not use the large sample test from Chapter 15 as the sample size is less than 100.
c. Yes, as 64 is greater than the recommended minimum of 15 presented in Table 17.1 on page 324.
d. You were instructed to use a directional pair. Whenever a directional hypothesis is stated, the appropriate test is a one-tailed one.
e. df = N - 1 = 64 - 1 = 63
f. Via Table A.2 on page 450: Locating the column for a one-tailed test where α = .05, the critical value of t for 60 degrees of freedom is 1.671 (the table does not list 63 degrees of freedom; as such, one references the next lower listed degrees of freedom, which is 60). Note that in this example the research hypothesis stated “greater than”; had it stated “less than” the critical t would have been -1.671 (see decision rules on pages 324-5 in Section 17.6.5).
g. s_X̄ = 24.0/√64 = 24.0/8 = 3.00
h. Carrying out formula 17.2 from page 319 and as illustrated on page 330: t = (80.0 - 75.0)/3.00 = 5.0/3.00 = 1.67
i. In making the decision, we reference the decision rule for a directional hypothesis where the research hypothesis states “greater than” as presented on page 328.
j. In our example, the obtained t of 1.67 (1.667 before rounding) is less than the critical t of 1.671. Hence, via the decision rule, the null is accepted (we fail to reject the null).
k. Accepting the null does not allow us to conclude that it is likely that the null is true. When we reject the null, we can be confident that the null is false (at the .05 level, we are 95% confident that the null is false). But when we accept it, the best that we can do is conclude that it is plausible that the null is true. In the present example, the t test just missed rejecting the null, so we can be almost 95% confident that the null is false. But via the rules of significance testing, being almost 95% confident is not sufficient for rejection, so we accept the null. (See Section 14.10 in Chapter 14 on pages 260-1 for more on this.)

Chapter 18

1. Suppose that you have a first population with the following characteristics: μ = 10.0, σ = 10.0 and a second where μ = 10.0, σ = 7.0. What is the mean, standard deviation, and shape of the sampling distribution of the difference between means if N = 100 for the first population and N = 49 for the second? Presume that both population distributions have moderate positive skews.

Comments: Section 18.2.2 on pages 337-9 presents the logic for responding here.
The question is asking for the characteristics that would result from: 1) selecting a sample of size 100 from the first population, one of size 49 from the second, 2) calculating the mean of each sample, 3) subtracting the mean of the second from that of the first, and 4) plotting this result. One would repeat steps 1 through 4 for an infinite number of pairs of samples.

Answers: The mean of the sampling distribution is obtained by subtracting μ2 from μ1: 10 - 10 = 0. Formula 18.1 on page 338 is used to determine the standard deviation of the sampling distribution (note that variances rather than standard deviations are used in the formula):

√(100/100 + 49/49) = √(1 + 1) = √2 = 1.41

The shape of the sampling distribution is very close to normal.

2. Twenty-five married couples participate in a study to determine whether wives or husbands report greater marital satisfaction. Consider wives to be Sample 1 and husbands to be Sample 2. Respond to the following questions:

a. Are the samples independent or dependent?
b. Are the scores in Sample 1 independent from those in Sample 2?
c. Which test, the dependent samples t test or the independent samples t test, should be used in this situation?

Answers follow:

a. Wives and husbands “go together” as pairs, so the samples are dependent.
b. No, the scores in the two samples would be presumed to be related. Stated differently, one would expect that scores of wives and their husbands are associated. More specifically, one would expect a positive correlation between these scores. (For instance, where wives have high satisfaction one would expect (at least to some degree) the same for husbands. Where wives are unsatisfied one expects (at least to some degree) the same for their husbands.)
c. Given that the samples are dependent, the dependent samples t test should be used.

3. Sally’s score is 30.0 in Sample 1 and 40.0 in Sample 2. What is her difference score?

To determine a case’s difference score, subtract the case’s score in the second sample from its score in the first. Sally’s difference score is: 30.0 - 40.0 = -10.0

4. A researcher prepares to conduct a dependent samples t test. The mean of the first sample is 25.0. That of the second sample is 21.0. What is the mean of the difference scores?

As discussed in Section 18.6.3 on page 352, the mean of the difference scores is equal to the difference between the sample means. The mean of the difference scores is: 25.0 - 21.0 = 4.0

5. (T or F) The equal and unequal variances formulae for the independent samples t test always yield identical values of t.

False. They typically yield different values. They yield the same value only when sample sizes are equal.

6. (T or F) When sample sizes are equal, degrees of freedom for equal and unequal variances formulae are always equal.

False. Though the value of t will be identical if sample sizes are equal, the degrees of freedom will typically differ. Hence, even when sample sizes are equal, it is important to know which formula should be used, so that the appropriate degrees of freedom can be referenced.

7. At a given point in time, a child guidance clinic serves 72 children. Thirty-six children participate in an intensive play therapy intervention. The children who participate were selected by their therapists. The other 36 children do not participate (presumably because their therapists did not think that play therapy would be beneficial for them). Following the conclusion of the play therapy workshop, mothers of children in both groups fill out a scale that is designed to measure their children’s overall mood. The higher the score, the better the mood. For children who received play therapy, the mean score on the scale is 68.0 points with a standard deviation of 12.0 points. For children who did not receive the intervention, the mean score is 50.0 points with a standard deviation of 12.0 points.
Scores in both samples have a moderate positive skew. Presume a nondirectional hypothesis pair. Refer to Question 19 on pages 355-6 and respond to a. through w. [Note: Question g. should be changed to read “the assignment process was not random.”]

a. The two samples are independent, so the independent samples t test should be used. (The samples are independent because the two groups of children were not linked or paired in any way.)
b. Null: Means are equal in the populations from which the samples were randomly selected.
c. Yes. The guideline given for the independent samples t test is that both samples should meet the guidelines presented in Table 17.1 on page 324. Where the degree of skew is moderate, Table 17.1 recommends a minimum sample size of 15. As both samples meet this guideline, sample size is adequate.
d. In a sense, the 72 children are neither a random nor a nonrandom sample. They are not a sample at all, but instead represent the full population of children at the clinic. They are certainly not a random sample. But as discussed in Section 16.8 on pages 308-10, we will assume the sample to be random and, by so doing, meet the assumption of the statistical test that this is the case.
e. Clearly, the population is abstract/hypothetical/imaginary. (We have presumed the population so that a significance test may be conducted.)
f. True.
g. We altered Question g. so that it states “the assignment process was not random”. Now to respond: Given that the assignment process was nonrandom, confounding variables make it difficult to reach a causal conclusion. If we find that the difference between means (Mean = 68.0 in therapy group vs. 50.0 for other group) is statistically significant, we can be confident (though not certain) that it is not due to chance (sampling error) alone.
However, we will not know whether the difference is due to: 1) the positive impact of the play therapy or 2) differences between the play therapy kids and the other kids that existed prior to the intervention. (These would be confounding variables. For instance, maybe the kids in the play therapy group had closer relationships with their parents at the time that the study began than did kids in the other group and this greater closeness, not the effects of play therapy, explains their higher score on the mood scale.) So, if the difference is statistically significant, we can rule out chance alone as a plausible cause but cannot conclude whether the real factor causing the difference is play therapy (the intervention) or a preexisting difference (a confounding variable(s)). Had the study used random assignment and obtained the same result, we could have: 1) ruled out chance alone as a plausible cause and 2) concluded that the real factor causing the difference was indeed the intervention rather than a confounding variable. (The fact of random assignment would have eliminated systematic bias due to confounding variables.)
h. SDM = (68.0 - 50.0)/12.0 = 1.5
i. Given that the three conditions described on page 342 (small sample sizes, unequal sample sizes, greatly differing variances) are not all met, the present situation is not an exception to the general rule that one uses the equal variances formula when the equality of variances test does not reject the null. So, use the equal variances formula.
j. The formula to estimate this is:

√(144/36 + 144/36) = √(288/36) = √8 = 2.83

Note that even though the equal variances formula is appropriate, we are taking advantage of the fact that the unequal variances formula is computationally simpler, and using it (this because the two formulas yield the same value of t when sample sizes are equal).
Note also that variances rather than standard deviations are used; to determine the variance simply square the standard deviation: 12² = 144.
k. t = (68.0 - 50.0)/2.83 = 6.36
l. We have established that we are using the equal variances formula, so degrees of freedom for this formula apply: df = 36 + 36 - 2 = 70 (this formula is in Section 18.3.4 on page 343).
m. Consulting Table A.2 on page 450: df = 70 is not listed, so we reference the next lower listed df, which is 60. Tracing down the two-tailed column where α (alpha) = .05 yields a t of 2.000.
n. As this is a two-tailed test, there are two critical values, 2.000 and -2.000.
o. Values of t ≤ -2.000 or ≥ 2.000 result in rejection.
p. The absolute value of the obtained t, 6.36, is greater than 2.000, so the null is rejected and the research hypothesis is accepted.
q. Rejection of the null signals that chance alone is an unlikely explanation.
r. True. Where one rejects using the .05 level, s/he has (at least) 95% confidence that the null is false. (See the last paragraph of Section 14.8 on page 259.)
s. True. (Even though in reality there is no such population, if we assume that such a population exists, the statement is true.)
t. As there is no such population, this conclusion is useful for carrying out the hypothesis testing model but doesn’t really have real-world applicability.
u. True. Rejecting the null rules out sampling error as a likely explanation.
v. Yes, more so than for question t. above. This conclusion pertains to the study sample, which is comprised of real people; so, it does have real-world applicability.
w. Neither true nor false. The key point is that due to the absence of random assignment we don’t know whether the treatment intervention or a confounding variable(s) is the explanation for the difference between the means of the two samples. We just don’t know.

8. The mean high school GPA of 16 first-born siblings (Sample 1) is 3.25.
The mean GPA of the (16) second-born siblings (Sample 2) in these same families is 3.00. The standard deviation of difference scores is 0.64. The difference scores have a moderate positive skew. Presume a directional hypothesis stating that first-borns have higher GPA’s. Set alpha equal to .05. State the null and research hypotheses and after having done so, respond to a. to m. for Question 36 on page 357.

Null: The mean difference score is less than or equal to zero.
Research: The mean difference score is greater than zero.

Alternatively, one could state:

Null: The mean GPA of first-borns is less than or equal to that of second-borns.
Research: The mean GPA of first-borns is greater than that of second-borns.

(See Section 18.6.1 on page 349 and Section 18.6.2 on page 351 for discussion.)

a. The cases of the two samples form linked pairs, each pair consisting of a first-born and a second-born from the same family. As the cases form linked pairs, the samples are dependent (paired), and, hence, the dependent samples t test is the appropriate test. (Section 18.5 on pages 348-9 discusses dependent samples.)
b. For the dependent samples t test, the shape of the distribution of difference scores is examined in order to see whether the sample size guideline is met. Sample size refers to the number of pairs, which in this case is 16. Via the guidelines in Table 17.1 on page 324, the guideline is met, the number of pairs being 16 and the recommended minimum for a moderate skew being 15.
c. 3.25 - 3.00 = 0.25
d. The mean of the difference scores is always equal to the difference between the sample means, so this is also 0.25.
e. 0.64 ÷ √16 = 0.64 ÷ 4 = 0.16
f. t = 0.25 ÷ 0.16 = 1.56
g. df = N - 1 = 16 - 1 = 15
h. As the hypothesis pair is directional, the appropriate t test is one-tailed with a rejection region in the upper tail. (The rejection region is in the upper tail because the research hypothesis states “greater than”.)
i. As the research hypothesis states “greater than,” we will reject the null if the obtained t is greater than or equal to the t in Table A.2 on page 450.
j. The obtained t of 1.56 is less than the t in the table (for a one-tailed test with 15 degrees of freedom and alpha equal to .05, this t, the critical t, is 1.753). (The applicable decision rule is the rule listed at the bottom of page 328.)
k. Fail to reject the null. Reject the research.
l. We do not want to conclude that it is likely that the difference in means is due to chance alone, but instead that it is plausible (not unlikely) that this is the case.
m. We cannot be 95% confident that the difference is not simply due to chance alone.

9. Suppose that an independent samples t test with 18 degrees of freedom yields an obtained t of 2.32. Presume a two-tailed test at the .05 level. What is the critical value(s) of t? What decision should one make regarding the null?

In Table A.2 on page 451, for 18 degrees of freedom, a two-tailed test, and alpha equal to .05, the listed t is 2.101. As presented in Section 17.6.5 on page 328, the decision rule for a two-tailed test calls for rejection when the absolute value of the obtained t equals or exceeds the listed t. Therefore, obtained t’s ≤ the negative of the listed t or ≥ the listed t result in rejection. Hence, in this example, the critical values of t are -2.101 and 2.101. The absolute value of the obtained t (|2.32| = 2.32) exceeds the listed t (2.101). Therefore, the null is rejected.

Chapter 19

1. A mother surmises that her young son wanders into her bedroom to sleep on 50% of nights. She wants to decrease how often this occurs. Working with a counselor, she develops a “demonstration” plan designed to decrease this behavior. In the next month, the son wanders into the mother’s room on 7 of 30 (23.3%) days.
Your task is to determine whether the percentage of times that the son wandered in during the month following intervention represents a statistically significant decrease from 50%. Your hypothesis should be directional with the research hypothesis stating that the percentage following intervention is less than 50%. Set alpha equal to .05. To respond, refer to questions a. to i. in question 3 on page 372, but use information appropriate to the current problem. For instance, in this problem, p = .233 and π = .50. (p is the proportion in the sample and π is that in the population (the hypothesized proportion in the null).) For question b. state a directional hypothesis pair rather than a nondirectional one.

Prior to doing the problem, read the following comment: Where data are gathered sequentially over time (in this example, the 30 observations are gathered over 30 consecutive days), the independence of observations assumption sometimes does not hold. This (most often) occurs when observations close together in time resemble one another more than do those spaced further apart. For instance, presume that all of the seven days where the child wandered into the bedroom occurred very close together in time, say, for instance, within the first 10 days of the month. This presumed to be the case, whether a child walks in on a given night can be used to predict whether he will do so on the next night. For instance, if a child walks in on a given night, we could predict that he would do so on the next, this because all of the incidents of walking in are clustered together closely in time. Similarly, if he did not walk in on a given night, we could predict that he would not walk in on the next, this because incidents of “not walking in” are also clustered together (in our example, in the later part of the month). Given the pattern described, “nearby” (close together in time) observations are not independent (unassociated) but instead are associated or dependent.
Such a pattern of dependency, that is dependency according to time, is termed serial dependency. Where serial dependency is present the independence of observations assumption is not met and the statistical tests presented in this text should not be used. Where observations are serially dependent, specialized forms of time-series analysis are carried out. (See the Bloom, Fisher, and Orme reference listed in text on page 507 for time-series methods.) We are assuming that the data for this problem are not serially dependent.

a. One-sample test of p
b. Null: The proportion of days wandering in is greater than or equal to .50.
Research: The proportion of days wandering in is less than .50.
c. As presented in first paragraph of Section 10.6.1 on page 367, one uses π, which is .50.
d. Where π = .50, the recommended minimum sample size is 10. The study sample size is 30, which is sufficient.
e. As demonstrated on page 362, using the study sample size of 30: σ_p = √[(.5)(.5)/30] = √(.25/30) = √(.00833) = .0913
f. z = (.233 - .50)/.0913 = -.267/.0913 = -2.92
g. Reject the null; accept the alternative/research, these decisions made because the obtained z (-2.92) is less than the critical value of z (-1.645) stated in the decision rule. The decision rule where alpha equals .05 and the research hypothesis states “less than” is stated near the bottom of page 290. (As stated in Section 19.2.2 near the bottom of page 361, the decision rules for the one-sample test of p are the same as those for the large sample test of the mean.)

2. A state agency holds a workshop for adoptive families who adopted children with special needs. Of the 40 families who decide to attend, 10 are from urban communities, 20 are from suburban ones, and 10 are from rural ones. According to the statewide registry of all families who adopted children with special needs, 35% of these families live in urban communities, 35% live in suburban ones, and 30% live in rural ones.
Conduct a statistical significance test to test whether the actual proportions attending the workshop differ significantly from the statewide proportions. Set alpha to .05. As you conduct the test, respond to the following questions. Answers are listed as you proceed: a. b. c. d. What is the hypothesis pair? Null: The proportions attending are .35 urban, .35 suburban, and .30 rural. Research: At least one proportion differs from the corresponding proportion stated in the null. Is the study sample actually a random sample? No. The workshop attendees decided on their own to attend. Given that the sample is not random, may a statistical test be conducted? Pragmatically speaking, yes (see Section 16.8 beginning on page 308). What are the observed proportions? Urban: .25(10/40); Suburban: .50(20/40); Rural: .25(10/40) 69 e. f. g. h. i. What are the expected proportions? Urban and suburban: .35; rural: .30 Are observed proportions and expected proportions used in the preferred calculation formula (19.4 on page 368)? No. What are the observed frequencies? Suburban: 20; Rural and urban:10 What are the expected frequencies? To determine these, multiply the expected proportion by the sample size. Do so for each group: Urban and suburban: 40  .35 = 14; Rural: 40  .30 = 12 2 What is the value of  ? Best to make a grid similar to table 19.1 on page 369 Category fo fe fo - fe (fo  fe)2 (fo - fe)2/fe urban 10 14 -4 16 16/14 = 1.143 suburban 20 14 6 36 36/14 = 2.571 rural 10 12 -2 4 4/12 = 0.333  = ()2 = 4.05 j. k. l. m. n. 3. How many degrees of freedom? df = 3 - 1 = 2 What is the critical value of 2? Via Table A.3 on page 451, 5.99 What is your decision regarding the null? The research? Fail to reject the null. Reject the research. These decisions made because the obtained 2 of 4.05 is less than the critical value of 2 (5.99). Is the result statistically significant? No. 
n. Is it plausible that the differences between the proportions in the study sample and those in the null are simply due to chance? Yes.

3. For each of the following chi-square test results, indicate first the critical value of χ² at the .05 level and second the decision that should be made regarding the null (at the .05 level).

a. χ² = 5.20
b. χ² = 22.65, df = 14
c. χ² = 7.35, df = 2
d. χ² = 7.35, df = 5
e. χ² = 11.85, df = 5

Answers follow:

a. Is a “trick” question. One can’t know whether this value of χ² is significant without knowing the degrees of freedom.
b. 23.68, accept the null
c. 5.99, reject the null
d. 11.07, accept the null
e. 11.07, reject the null

Chapter 20

1. One hundred freshman students are identified as having an increased risk of failing to complete their first year of college. Fifty of these students are randomly assigned to a mentoring program while the other 50 (also randomly assigned) receive no special services. Forty-three of the 50 mentored students (86%) in comparison to 32 of the 50 nonmentored students (64%) successfully complete their freshman year. Respond to a. to o. for Question 6 on page 387. Note that o. has been reworded below.

a. Null: Participation in the mentoring program and successful completion of freshman year are unassociated. Research: Participation in the mentoring program and successful completion of freshman year are associated.
b. Null: The percentage of mentored students who successfully complete their freshman year is equal to the percentage of nonmentored students who do so. Research: The percentages of mentored and nonmentored students who successfully complete their freshman year are not equal.
c. We will do best to create a table of expected frequencies similar to Table 20.2 on page 380. Before doing so: Note that the numbers not completing their freshman year (dropping out) can be determined by subtraction: Mentored students: 50 − 43 = 7; nonmentored: 50 − 32 = 18.
Note further that the 75 in the right margin was derived by summing the successful students in the two groups (43 + 32 = 75) and that the 25 in the right margin was derived by summing the unsuccessful students (7 + 18 = 25). The actual calculation of expected frequencies for the cells uses Formula 20.4 at the bottom of page 379.

               Mentored                Not mentored            Total
Successful     (50 × 75)/100 = 37.5    (50 × 75)/100 = 37.5      75
Dropped out    (50 × 25)/100 = 12.5    (50 × 25)/100 = 12.5      25
Total          50                      50                       100

d. Section 20.3 on page 371 lists requirements. For a 2 × 2 table, the average expected frequency should be at least 6.00 and the minimum expected frequency should be at least 1.00. These guidelines are easily met. (The average expected frequency is 25: 100 students ÷ 4 cells = 25. The minimum expected frequency is 12.5; both cells in the “dropped out” row have expected frequencies of 12.5.)
e. [Note: sum to 1.00 within categories of mentoring, that is, down the columns.] Among mentored students, the proportion who are successful is .86 (43/50 = .86); the proportion who drop out is .14 [(50 − 43)/50 = 7/50 = .14]. Among nonmentored students, the proportion successful equals 32/50 = .64; the proportion dropping out equals (50 − 32)/50 = 18/50 = .36.
f. Associated, as the percentage who succeed differs in the two groups.
g. [Note: sum to 100 within categories of mentoring, that is, down the columns.] Formula 20.2 on page 379 is used here. For both mentored and nonmentored, the expected proportion of successful students is: 75/100 = .75, where 75 is the row margin total and 100 is the total sample size. For both groups the expected proportion of nonsuccessful students is: 25/100 = .25.
h. No, they differ (compare the proportions in e. and g. above).
i. The best way to compute χ² will be to build a grid similar to Table 20.4 on page 382. (Hopefully, you see how to build observed frequencies for those who dropped out via subtraction, as is demonstrated in calculating observed proportions in c. and e. above.)
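As a cross-check on the expected frequencies in c. and on the χ² grid for i., the following is a minimal Python sketch, not part of the text. The function name chi_square_independence is a hypothetical helper, and the erfc expression for the probability is an added shortcut that applies only when df = 1.

```python
import math


def chi_square_independence(table):
    """Chi-square test of independence for a table given as a list of rows.

    Returns (chi2, df, expected), where each expected frequency is
    (row total x column total) / N.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Rows: successful, dropped out. Columns: mentored, not mentored.
observed = [[43, 32],
            [7, 18]]

chi2, df, expected = chi_square_independence(observed)

# Assumption: this shortcut for the upper-tail probability holds for df = 1 only
p_value = math.erfc(math.sqrt(chi2 / 2))

print("expected:", expected)            # [[37.5, 37.5], [12.5, 12.5]]
print("chi-square:", round(chi2, 2))
print("df:", df)
print("p:", round(p_value, 3))
```

The same function accepts larger tables, but the probability shortcut would then need to be replaced by a table lookup or a statistics library.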
Cell                 fo    fe      fo − fe    (fo − fe)²    (fo − fe)²/fe
mentor/success       43    37.5      5.5        30.25       30.25/37.5 = 0.807
mentor/fail           7    12.5     −5.5        30.25       30.25/12.5 = 2.420
nonmentor/success    32    37.5     −5.5        30.25       30.25/37.5 = 0.807
nonmentor/fail       18    12.5      5.5        30.25       30.25/12.5 = 2.420

χ² = Σ[(fo − fe)²/fe] = 6.45

j. df for the chi-square test of independence = (number of rows minus 1) × (number of columns minus 1): df = (2 − 1)(2 − 1) = 1 × 1 = 1
k. Via Table A.3 on page 451, the critical value is 3.84.
l. χ²(1, N = 100) = 6.45, p < .05
m. As the obtained χ² (6.45) is greater than the critical value of χ² in Table A.3 (3.84), the decision is to reject the null and accept the research hypothesis.
n. Yes. Whenever one rejects the null at the .05 level they have 95% confidence that: 1) the null is false and 2) the observed difference is not due solely to chance. (If you have 95% confidence that the null is false then, by definition, you have 95% confidence that the study result is not due to chance alone.)
o. Question o. is rephrased to: Given that the null is rejected, should you conclude that the mentor program caused greater success? Given that the null is rejected, we conclude (with 95% confidence) that the study result is not due to chance alone. As we can be (95%) confident that the result is not due to chance, we can be (95%) confident that it is due to something real. So the question becomes: what is the something real that is causing the study result (that being higher success rates for those in the mentor program)? (Though it perhaps oversimplifies), the fact that students were randomly assigned to mentor vs. nonmentor allows us to conclude that the something real is not a confounding variable on which the two groups differ (for instance, something like greater initial motivation to succeed in the mentoring group) but instead is the mentoring program. In short, we may conclude that the mentoring program causes greater success.
We don’t know exactly what facet of the mentoring program is responsible (perhaps it is something as simple as charismatic leaders, and the program would have minimal positive impact with other leaders), but we are confident that something(s) connected with mentoring explains the study result. (For discussion of random assignment see Section 1.13.2 on pages 13-4, Section 10.6 on pages 179-81, Section 15.13 on pages 282-3, and the last paragraph of Section 18.4.2 on pages 346-7.)

2. A given contingency table has 5 rows and 3 columns. How many degrees of freedom are there for a χ² test of independence?

Carry out Formula 20.1 on page 378: (5 − 1)(3 − 1) = (4)(2) = 8

3. (T or F) Presuming the same number of degrees of freedom, the critical value of χ² where α = .01 is greater than that where α = .05.

True. One may examine Table A.3 on page 451 to see that for any given degrees of freedom, the critical value is always higher where α = .01 than where α = .05. This should make intuitive sense. As presented in Sections 15.14 and 15.15 on pages 283-6, choice of the .01 level makes it more difficult to reject the null. Hence, to reject at the .01 level, a higher value of χ² is needed.

4. (T or F) The chi-square test of independence is the most common statistical test for examining differences between group means.

False. It is the most common test for examining differences between proportions.

5. A χ² test of independence yields an obtained χ² of 12.25 with 10 degrees of freedom. At the .01 significance level, what is the critical value of χ²? What decision should one make regarding the null?

Via Table A.3 on page 451, the critical value is 23.21. As the obtained χ² is less than this, the null is accepted.

6. (T or F) Where a χ² test of independence is statistically significant, one concludes that the association between variables in the contingency table is likely due to chance alone.

False.
The statistically significant finding rules out chance as a likely cause of the association. So, presumably the association is real, that is, the variables indeed are associated in the population from which the study sample was randomly selected.

Chapter 21

1. To give you a feel for hand-calculation of ANOVA, follow the steps presented on page 397 to calculate ANOVA for the data that follow. (Note that sample sizes here are too small for one to be confident that ANOVA would yield accurate probabilities. Probabilities resulting from ANOVA where sample sizes are this small are accurate only when the shape of the populations from which samples have been (randomly) selected is very close to normal. Given the small sample size here, it is difficult to use the shape of the distribution in the samples to estimate that in the populations. See discussion in Section 17.6.2 on page 326. So, one would be hesitant to use ANOVA in this situation. The situation is provided to give practice at hand calculation.)

Let’s presume that the scores presented here represent self-esteem scores in three different treatment groups:

Group 1: 7, 8, 9
Group 2: 3, 4, 5
Group 3: 4, 6, 8

Calculation steps follow:

1) The grand mean is: 7 + 8 + 9 + 3 + 4 + 5 + 4 + 6 + 8 = 54, 54/9 = 6
2) The group means are:
   Group 1: 7 + 8 + 9 = 24, 24/3 = 8
   Group 2: 3 + 4 + 5 = 12, 12/3 = 4
   Group 3: 4 + 6 + 8 = 18, 18/3 = 6
3) The SSw is:
   Group 1: (7 − 8)² + (8 − 8)² + (9 − 8)² = 1 + 0 + 1 = 2
   Group 2: (3 − 4)² + (4 − 4)² + (5 − 4)² = 1 + 0 + 1 = 2
   Group 3: (4 − 6)² + (6 − 6)² + (8 − 6)² = 4 + 0 + 4 = 8
   Summing: SSw = 2 + 2 + 8 = 12
4) MSw = SSw/(N − J) = 12/(9 − 3) = 12/6 = 2.0
5) SSb = 3(8 − 6)² + 3(4 − 6)² + 3(6 − 6)² = 3(4) + 3(4) + 3(0) = 24
6) MSb = SSb/(J − 1) = 24/(3 − 1) = 24/2 = 12.0
7) F = MSb/MSw = 12.0/2.0 = 6.0

2. Given that F for the prior problem is 6.0, what is the critical F and what decision should be reached regarding the null? Presume that alpha equals .05.

To determine the critical F, one needs to know the degrees of freedom for both the MSb and the MSw.
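Steps 1) through 7) of problem 1, together with the two degrees of freedom needed here, can be cross-checked with a short script. The following is a minimal Python sketch, not part of the text; the function name one_way_anova and the dictionary keys it returns are illustrative assumptions.

```python
def one_way_anova(groups):
    """Hand-calculation steps for one-way ANOVA on a list of score lists."""
    scores = [x for group in groups for x in group]
    n_total = len(scores)          # N
    n_groups = len(groups)         # J

    grand_mean = sum(scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Sum of squares within: squared deviations from each group's own mean
    ss_w = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Sum of squares between: group size times squared deviation of
    # the group mean from the grand mean
    ss_b = sum(len(g) * (m - grand_mean) ** 2
               for g, m in zip(groups, group_means))

    df_b = n_groups - 1            # J - 1
    df_w = n_total - n_groups      # N - J
    ms_b = ss_b / df_b
    ms_w = ss_w / df_w

    return {
        "grand_mean": grand_mean, "group_means": group_means,
        "ss_w": ss_w, "ss_b": ss_b,
        "df_b": df_b, "df_w": df_w,
        "ms_w": ms_w, "ms_b": ms_b,
        "F": ms_b / ms_w,
    }


result = one_way_anova([[7, 8, 9], [3, 4, 5], [4, 6, 8]])
for name, value in result.items():
    print(name, value)
```

The sketch stops at the F ratio; the critical value still comes from the F table.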
Degrees of freedom for the MSb = J − 1 = 3 − 1 = 2. Degrees of freedom for the MSw = N − J = 9 − 3 = 6. Via Table A.4 on page 452, the critical value for an F distribution with 2 degrees of freedom for the MSb and 6 for the MSw where alpha equals .05 is 5.14. As the obtained F exceeds the critical F, the null is rejected.

3. Indicate the degrees of freedom first for the MSb and second for the MSw for each of the following situations.

a. N = 64, J = 8
b. N = 25, J = 4
c. N = 90, J = 6

For each of these, one simply carries out the degrees of freedom formulae: df for the MSb = J − 1 and df for the MSw = N − J.

a. 7 and 56
b. 3 and 21
c. 5 and 84

4. Using Table A.4 on page 452, indicate the critical value of F with alpha set to .05 for the three situations presented in the prior problem. As you do these problems, note that, as stated in Section 21.4.4 on page 396, where the exact number of degrees of freedom is not listed: use the closest larger number for the MSb and use the closest smaller number for the MSw.

a. 2.20 (as 56 degrees was not listed for the MSw, 50 was used)
b. 3.07
c. 2.22 (as 84 degrees was not listed for the MSw, 75 was used)

5. (T or F) Where the obtained F is greater than or equal to the critical F, one rejects the null.

True. (And where the obtained F is less than the critical F, one accepts the null.)

6. In a given situation, the MSw equals 5.00 and the MSb equals 22.00. What is the value of F?

To compute F, divide the MSb by the MSw: 22.0/5.0 = 4.4

7. Is the F ratio computed in the prior problem statistically significant at the .05 level?

Trick question. Can’t know this unless we know the degrees of freedom, and these were not given.

8. Presume that degrees of freedom for the MSb were 2 and those for the MSw were 12. Given the F ratio from problem 6 (F = 4.4), what decision should be made regarding the null? (Use the .05 level.)

Given these degrees of freedom, the critical value of F is 3.89. As the obtained F exceeds the critical F, the null is rejected.

9.
In a given situation, the SSw equals 100 and the degrees of freedom for the MSw equal 20. What is the value of the MSw?

As indicated by Formula 21.3 on page 395, to compute the MSw, divide the sum of squares within by the degrees of freedom for the MSw: MSw = 100/20 = 5.0

10. In a given situation, the SSb equals 40 and the degrees of freedom for the MSb equal 10. What is the value of the MSb?

As indicated by Formula 21.5 on page 396, to compute the MSb, divide the sum of squares between by the degrees of freedom for the MSb: MSb = 40/10 = 4.0

11. In a given situation the value of F is 0.70. Even without the degrees of freedom or knowledge of the significance level, how can one know that one should accept the null?

Values of F less than 1.00 indicate even less sampling error than one would expect (on average) just from chance alone. Where F is less than 1.00, the null will always be accepted.

12. (T or F) When the null is true, one knows assuredly that the statistical test will result in a (correct) decision to accept the null.

False. Most of the time (95% of the time at the .05 level and 99% at the .01 level) the test will result in the correct decision. However, occasionally the study sample result (by the luck of the draw) will be sufficiently extreme to result in the (incorrect) decision to reject the null. This will happen in 5% of (random) samples at the .05 level and 1% of (random) samples at the .01 level.

13. Assuming that the null is true, what is the probability that the (appropriate) statistical test will result in an (incorrect) decision to reject the null at the .01 level? At the .05 level?

At the .01 level, the probability of an incorrect decision to reject the null (that is, the probability of a Type I error) is .01. At the .05 level, this probability is .05.

14. (T or F) As the number of significance tests that are conducted increases, so also does the likelihood of making a Type I error on at least one test.

True.
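To see how quickly this likelihood grows, consider the simplified case in which every null is true and the tests are independent of one another. Both conditions are assumptions added here for illustration and are not stated in the text. Under them, the probability of at least one Type I error across k tests is 1 − (1 − α)^k. A minimal Python sketch, with a hypothetical function name:

```python
def familywise_error_rate(alpha, k):
    """Probability of at least one Type I error across k tests.

    Assumes every null is true and the k tests are independent.
    """
    return 1 - (1 - alpha) ** k


for k in (1, 5, 10, 20):
    rate = familywise_error_rate(0.05, k)
    print(f"{k:2d} tests at the .05 level: {rate:.3f}")
```

Under these assumptions the probability is .05 for a single test and rises to roughly .64 for 20 tests.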
So, when many tests are reported in an article, be somewhat skeptical of study results. There may well be one or two instances where a statistically significant finding (and therefore a decision to reject the null) represents Type I error.

Chapter 22

1. (T or F) Other things being equal, as sample size increases so also does the statistical power of a significance test of r.

True. (This is true of all statistical tests, not just the test of r.)

2. (T or F) One may formulate either a directional or a nondirectional hypothesis test for a significance test of r.

True.

3. (T or F) Where the hypothesis pair is directional the test of r is two-tailed.

False. It is one-tailed.

4. Given a directional hypothesis where the research hypothesis states “less than” and sample size is 52, state the hypothesis pair. What values of r result in rejection of the null at the .05 level? At the .01 level?

Null: In the population from which the study sample was randomly selected, ρ ≥ 0.00. Research: In the population from which the study sample was randomly selected, ρ < 0.00. (Hopefully you recall that ρ (rho, pronounced “row”) is the symbol for the correlation coefficient in a population.)

To see what values result in rejection, reference Table A.5 on page 454. Where the hypothesis is directional, the test is one-tailed. Where the research hypothesis states “less than”, the rejection region is in the lower tail. The applicable decision rule is the last rule listed in Section 22.2.6 on page 410. The value of r in the table is .231. Applying the decision rule, at the .05 level, one would reject the null for all values of r ≤ −.231. At the .01 level one would reject the null for all values of r ≤ −.322.

5. In the correlation matrix on page 412, what is the correlation between family income and the parents’ perception of the impact of the adoption on the family? Is the correlation negative or positive? Is the correlation statistically significant (.01 level, two-tailed test)?
Should the null be rejected? Is the relationship strong or weak? Interpret/describe the correlation in your own words.

r = −.170. Negative. Yes, it is statistically significant. Yes, the null should be rejected. Via Table 8.3 on page 137, the relationship between income and impact is reasonably weak. To interpret: As income level increases, the impact of adoption on the family, as reported by the parent, decreases (becomes more negative). (Sections 8.3.1, 8.3.2, and 8.3.3 on pages 130-3 provide guidance on interpreting/describing correlation.)

6. (T or F) The one-sample test of z is a parametric test.

True.

7. (T or F) A nondirectional test of Spearman’s r examines whether the correlation between two rank orderings differs significantly from 0.00.

True.

8. (T or F) Tests of the significance of tau-b and tau-c are typically conducted with categorical variables that are at the nominal level of measurement.

False. These tests test for the possible significance of directional association between categorical variables. For a variable to have a directional association, its level of measurement must be at least ordinal. So, these tests are used with ordinal-level variables, not nominal-level ones.

9. (T or F) The Mann-Whitney U test is a parametric test.

False. It is nonparametric.

10. A social work professor asks 24 students which class they enjoy more, research methods or practice. Eighteen state practice, two state research, and four state that they enjoy each the same. What test would be a good one to use to see whether the difference in preferences is statistically significant?

The sign test (see Section 22.8 on page 416).

11. What test is regarded as the nonparametric alternative to the independent samples t test?

The Mann-Whitney U test.
(Presuming that the dependent variable is measured at the interval/ratio level), this test is preferable to an independent samples t test when both of the following are true: 1) sample size is very small and 2) the degree of skew is very strong or extreme. (See page 415.) In this situation the independent samples t test may yield inaccurate probabilities. (Of course, the Mann-Whitney test is often preferred over the t test when one’s level of measurement is ordinal.)

12. What test is a good nonparametric alternative to ANOVA?

Kruskal-Wallis test. (Presuming that the dependent variable is at the interval/ratio level), this test is preferred to ANOVA when both of the following conditions hold: 1) sample sizes are small and 2) skewness is extreme. In such a situation, ANOVA may yield inaccurate probabilities. (See page 415.) (Of course, the Kruskal-Wallis test is also often preferable to ANOVA with ordinal-level data.)

13. Where one has information on the size of differences as well as on the direction, is the sign test or the Wilcoxon signed ranks test preferred?

Most would prefer the Wilcoxon test in this situation (see Section 22.8 on pages 416-7).

14. (T or F) In multiple regression, the β’s (betas) are unstandardized coefficients and, thus, convey change in the units in which variables were originally measured rather than in standardized units.

False. β’s are standardized and convey change in terms of standardized units (that is, in terms of z scores). B’s are unstandardized and convey change in terms of the original units of measure (in terms of raw scores). (See page 419 in Section 22.11.)

15. To respond to the next series of questions refer to Table 22.2 on page 420. (Answers are given after each question.)

a. As parents’ mean educational level increases by one (1.00) unit (in its original unit of measure), what is the predicted change in parent/child relationship score (in its original unit of measure)? The B coefficient provides this information. It is −.0530.
This conveys that as educational level increases by 1.00 point, predicted relationship score decreases by .0530 points (by about 1/20th of a point).

b. As parents’ mean educational level increases by 1.00 standard deviation, what is the predicted change in standard deviations for parent/child relationship score? The β coefficient provides this information. It is −.087. Thus, as education level increases by 1.00 standard deviation, the parent-child relationship score is predicted to decrease by .087 standard deviations.

c. Controlling for other predictors in the regression equation, about how strong is the relationship between parent educational level and parent-child relationship score? Is the association (controlling for other predictors) a directional one? If so, what direction? Strength of relationship controlling for the other predictors is conveyed by the β coefficient of −.087. Via the descriptors for r in Table 8.3 on page 137, the association is a weak one. The association is directional. The negative sign conveys that the direction of association is negative. As education level increases (controlling for other predictors), predicted relationship score decreases.

d. How can it be that the association between education level and relationship score is significant (p = .011, see last column of Table 22.2) when this relationship is weak? Though the sample size was inadvertently omitted in Table 22.2, it is very large, about 700. When sample size is large, even weak associations may achieve statistical significance. (See page 303 for discussion on this point and also Table 16.1 on page 308.)

e. As parental income increases by 1.00 standard deviation, what is the predicted change in standard deviations for parent-child relationship score? The β of −.122 conveys that the predicted score decreases by .122 standard deviations.

Chapter 23

1. The answers given here expand on questions a. to m. in Question 22 on page 443.
Answers to these questions are presented on page 497 of the text.

a. See page 497.
b. The total sample size is 400. Referencing Table 16.1 on page 308, power is “good/fairly high”. Note that the variability of rehospitalization is somewhat restricted (low), as only 15% in one group and 25% in the other require hospitalization, but this restriction is not sufficient to substantially reduce power. If only, say, 5% in one group and 10% in the other had been rehospitalized, power would have been reduced somewhat. The fact that group sizes were equal (200 in each) enhances power to some degree. (See Section 16.5.5 on pages 305-6 for discussion of the effects of variability on power.)
c. Yes, at the .05 level; the “p < .05” symbol in the question description conveys this information. See the second paragraph on page 311 for discussion of the p symbol.
d. Less than or equal to .05. Where the result is statistically significant, it is not likely to be due to chance alone.
e. Less than or equal to .05. (Given a true null, the probability of obtaining this or a more extreme result is quite low; given a true null, a result this extreme would be rare, unexpected, uncommon, and “atypical”; it would be a result in a tail of the sampling distribution.)
f. Can be at least 95% confident. (See the last paragraph of Section 14.8 on page 259, the last paragraph of Section 14.10 on page 261, and Table 16.2 on page 313, third grouping from the bottom of the table.)
g. Statistical significance and rejection of the null go hand in hand; one rejects the null.
h. Here one would calculate the difference in percentages and then assess size making use of Table 6.4 on page 109: D% = 25% − 15% = 10%. Via Table 6.4 the relationship is small/weak.
i. The fact that the difference in rehospitalization rates achieves statistical significance rules out chance as a plausible explanation. However, no mention is made of random assignment.
(Presumably clients went to center A because they lived in its catchment area or to B because they lived in its area.) In the absence of random assignment, one should be suspicious that some differing characteristic(s) of clients served in the two centers, rather than differential effectiveness of the two treatment approaches (halfway house versus own residence), is the explanation for the study result. (This is the same as saying that one should be suspicious that a confounding variable is the cause of the result.) So, one would not want to conclude that the relationship between type of treatment and rehospitalization is (necessarily) a causal one. Random assignment would have greatly strengthened the ability to draw a causal conclusion. See Sections 10.5 and 10.6 on pages 178-81 for discussion of random assignment and related issues.
j. No, as the study sample was not randomly sampled from any larger (real world) population. See Section 16.8 on pages 308-10.
k. See answer on page 497; see discussion in Section 23.2.2 on page 431.
l. See page 497.
m. See page 497 and Section 23.3 on pages 433-5.
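For readers who want to see how the figures in b., c., and h. fit together, the following is a minimal Python sketch, not part of the text. It rests on assumptions: the counts of 30 and 50 are derived from the stated 15% and 25% of 200 clients per group, the choice of a chi-square test of independence is a guess at the kind of test behind “p < .05”, and the erfc shortcut for the probability applies only when df = 1.

```python
import math

n_per_group = 200
# Assumption: counts derived from the stated percentages (15% and 25%)
rehospitalized = [round(0.15 * n_per_group), round(0.25 * n_per_group)]
not_rehospitalized = [n_per_group - r for r in rehospitalized]

# Difference in percentages (D%), as in item h.
pct = [100 * r / n_per_group for r in rehospitalized]
d_pct = pct[1] - pct[0]

# Chi-square test of independence on the 2 x 2 table
table = [rehospitalized, not_rehospitalized]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)
chi2 = 0.0
for i, row in enumerate(table):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n   # expected frequency
        chi2 += (fo - fe) ** 2 / fe

# Assumption: this shortcut for the upper-tail probability holds for df = 1 only
p_value = math.erfc(math.sqrt(chi2 / 2))

print("D%:", round(d_pct, 1))
print("chi-square:", round(chi2, 2))
print("p:", round(p_value, 3))
```

Under these assumptions the probability falls below .05, in line with item c., while the 10-point difference in percentages remains small in size, as in item h.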
