Chapter 4: z-scores and Probability

**This chapter corresponds to chapter 8 (“Are Your Curves Normal?”) of your book.

What it is: z-scores (also called “standard scores”) are raw scores that have been adjusted for the mean and standard deviation of the distribution from which the raw scores came. z-scores are expressed in standard deviation units and represent the number of standard deviations above or below the mean that a given raw score is (e.g., a z-score of 1.0 is one standard deviation above the mean). We can also use our knowledge of the normal curve to assign a probability to the occurrence of any given z-score. Because z-scores use a standardized metric (i.e., standard deviation units), you can directly compare the magnitude and probability of z-scores from different distributions of scores, even if those distributions have radically different means and standard deviations.

When to use it: You should use z-scores when you have continuous data that you wish to express in a standardized metric (i.e., standard deviation units) and/or when you wish to assign a probability of occurrence to a given score (e.g., what is the probability of receiving a score one standard deviation or more above the mean?). z-scores are also especially useful for comparing the magnitude and/or probability of two raw scores drawn from distributions with different means and/or standard deviations.

Questions asked by z-scores: How many standard deviation units above/below the mean is a given raw score? What is the probability of a given raw score occurring? What are the relative probabilities and/or magnitudes of two raw scores drawn from distributions with different means and/or standard deviations?

Examples of research questions that would use z-scores:
o If the average SAT-Math score is 500 with a standard deviation of 100, what is the probability of receiving a score higher than 600?
o If one person scores 25 on the Stressful Life Events Inventory (which has a mean of 20 and standard deviation of 5) and another person scores 110 on the Major Stressors Questionnaire (which has a mean of 100 and standard deviation of 15), which person has the higher score relative to the mean, making that person more likely to have acute anxiety attacks?

Using SPSS to Calculate z-scores: (dataset: Chapter 4 Example 1.sav)

Albert attends a very competitive prep school where all 40 students take both the SAT and the ACT, and their scores are then posted for everyone to see. Albert’s best subject is English, so he is especially interested in how well he did on the SAT-Verbal and ACT-English sections. When the scores are posted, he sees that he scored a 28 on the ACT-English and a 610 on the SAT-Verbal. Because the two tests use different metrics, Albert is curious about how a score of 28 on the ACT compares to a score of 610 on the SAT. Albert also wonders how well he did on each test in comparison to his classmates (i.e., what was the probability of any of his classmates scoring higher than him on each test?).

Selection of the appropriate statistic(s)

z-scores are the appropriate statistic because SAT and ACT scores represent continuous data, Albert is interested in comparing the magnitude of his scores on two tests that use different metrics (i.e., have very different means and standard deviations), and Albert wishes to assign a probability to obtaining a higher score than his on each test.

Computation of the statistic(s)

We will use SPSS to calculate z-scores for us. Open the dataset “Chapter 4 Example 1.sav”. Take a moment to familiarize yourself with the data. Note how data for this type of analysis should be entered.

1) Each participant (i.e., student at the prep school) has one row in the data.
2) One column is used to indicate each participant’s identification number, which is just a number that is assigned to each participant in the study (this variable is “ID” in the present example).
3) The second column indicates each participant’s score on the first variable (ACT-English Scores) for which we wish to calculate z-scores.
4) The third column indicates each participant’s score on the second variable (SAT-Verbal Scores) for which we wish to calculate z-scores.

The data should look something like this in SPSS (note that Albert’s scores of 28 and 610 are listed at the top; they don’t have to be listed at the top, but for the sake of this example it makes it easier to keep track of Albert’s scores):

If you switch to variable view, you should see that the two variables (“acte” and “satv”) have labels indicating that they represent “ACT-English Scores” and “SAT-Verbal Scores”, respectively. If you do not like those labels, you can change them to whatever you want.

To calculate z-scores in SPSS, click on the “Analyze” drop-down menu, highlight “Descriptive Statistics”, and then click “Descriptives…”, as pictured below.

The following pop-up window will appear:

Note that the variables are listed in the pop-up window by their labels, with their variable names in brackets (e.g., “ACT-English Scores [acte]”). Highlight the variable(s) for which you wish to calculate z-scores (“ACT-English Scores [acte]” and “SAT-Verbal Scores [satv]” in this example) and then click on the arrow to make the variable(s) appear in the “Variable(s):” window, as pictured below.

Now check the box next to “Save standardized values as variables”. This tells SPSS to calculate z-scores for every raw score for each of the variables in the “Variable(s):” window, and to save these z-scores as new variables in the “data view” spreadsheet. Your screen should look like this:

Click “OK” and navigate to “data view” to find your new columns of z-scores.
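For readers who want to see what the “Save standardized values as variables” option computes, the standardization can be sketched in Python. This is a minimal sketch, not SPSS's own code: the function name `z_scores` and the five ACT-English scores are made up for illustration (they are not the values in “Chapter 4 Example 1.sav”), and the sketch assumes the sample standard deviation (n - 1 in the denominator).

```python
from statistics import mean, stdev

def z_scores(raw_scores):
    """Return the z-score of each raw score, using the sample SD (n - 1)."""
    m = mean(raw_scores)
    s = stdev(raw_scores)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in raw_scores]

# Hypothetical ACT-English scores, for illustration only.
acte = [28, 21, 15, 24, 17]
zacte = z_scores(acte)

for raw, z in zip(acte, zacte):
    print(f"raw = {raw:2d}   z = {z:5.2f}")
```

Whatever the raw scores are, the resulting column of z-scores has a mean of 0 and a standard deviation of 1, which is what makes scores from different tests comparable.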
The “data view” screen should now look like this:

SPSS has created two new variables (“Zacte” and “Zsatv”) that are the z-scores for “acte” and “satv”, respectively. For instance, you can see that Albert’s z-score for the ACT is 1.14 and his z-score for the SAT is 1.09 (rounded). Thus, Albert scored 1.14 standard deviation units above the mean on the ACT and 1.09 standard deviation units above the mean on the SAT. Therefore, Albert scored very similarly on the SAT and ACT, although his score on the ACT is perhaps slightly higher compared to the mean score.

Transforming z-scores into Raw Scores, and Vice-versa

Along with the z-scores listed in the “data view”, SPSS has also generated some descriptive statistics for the raw scores of “acte” and “satv” in the output window. If you navigate to the output window you’ll see the following table:

Descriptive Statistics

                      N     Minimum    Maximum    Mean        Std. Deviation
ACT-English Scores    40    5.00       33.00      21.1000     6.05022
SAT-Verbal Scores     40    260.00     750.00     501.6250    99.73290
Valid N (listwise)    40

The “N” column gives the sample size (n = 40) for each variable, the “Minimum” and “Maximum” columns give the minimum and maximum raw scores for each variable, and the “Mean” and “Std. Deviation” columns give the raw score means and SDs for each variable.

The raw score means and standard deviations are of the most interest. Notice that the mean raw score for the ACT-English was 21.1 (SD = 6.05) and the mean raw score for the SAT-Verbal was 501.63 (SD = 99.73). z-scores represent the number of standard deviation units above/below the mean a given raw score is. You can use this information to transform z-scores back to raw scores, and vice-versa. For instance, because the raw score mean and standard deviation of the ACT were 21.1 and 6.05, respectively, it makes sense that Albert’s score of 28 (which is about 7 points higher than the mean) would be a bit more than one standard deviation above the mean, corresponding to a z-score a little higher than 1.0. This is indeed the case, as Albert’s ACT-English z-score is 1.14.
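The means and standard deviations in the Descriptive Statistics table are enough to reproduce both of Albert's z-scores directly from the definition of a z-score. The following is a minimal Python sketch; the helper name `to_z` is hypothetical, and the means and standard deviations are copied from the table.

```python
def to_z(raw, mean, sd):
    """Number of standard deviations that `raw` lies above (+) or below (-) the mean."""
    return (raw - mean) / sd

# Means and standard deviations copied from the Descriptive Statistics table.
z_act = to_z(28, mean=21.1, sd=6.05022)
z_sat = to_z(610, mean=501.625, sd=99.73290)

print(f"ACT-English z = {z_act:.2f}")  # 1.14
print(f"SAT-Verbal  z = {z_sat:.2f}")  # 1.09
```

Both values match the “Zacte” and “Zsatv” entries that SPSS saved for Albert.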
Similarly, knowing that Albert’s z-score is slightly above 1.0, and knowing that the raw score mean and standard deviation are 21.1 and 6.05, one would expect that Albert’s raw score on the ACT was a little higher than 27.

A more concrete and objective way to do this is to use the following formula for transforming a z-score back into a raw score, where $x$ is the raw score, $z$ is the z-score, $s$ is the standard deviation, and $\bar{x}$ is the mean:

$$x = z(s) + \bar{x}$$
$$x = 1.14(6.05) + 21.1$$
$$x = 6.90 + 21.1$$
$$x = 28$$

Similarly, the following formula is used for transforming a raw score into a z-score:

$$z = \frac{x - \bar{x}}{s}$$
$$z = \frac{28 - 21.1}{6.05}$$
$$z = \frac{6.9}{6.05}$$
$$z = 1.14$$

Assigning a Probability to Scores

Albert also wished to know the probability that his classmates would score higher than him on each test. For each z-score, there is an associated probability of achieving a higher z-score, and these probabilities are listed in Table B.1 (Areas Under the Normal Curve) on pages 329-331 of Salkind (2008). For instance, given the properties of the normal curve, what is the probability that a student would score higher than a 28 on the ACT? A score of 28 corresponds to a z-score of 1.14, so we go to Table B.1 to find the percentage of z-scores under the normal curve that fall above 1.14.

We first find the z-score 1.14 in Table B.1, and note the “area between the mean and the z-score”. 37.29% of scores fall between the mean (which is always zero in a distribution of z-scores) and 1.14. Because 50% of all scores always fall below the mean (zero) in a normal curve, this means that 87.29% (50% + 37.29% = 87.29%) of all scores fall below a z-score of 1.14. This means that only 12.71% (100% - 87.29% = 12.71%) of scores fall above a z-score of 1.14. Therefore, the probability of one of Albert’s classmates scoring higher than him on the ACT-English is only .1271 (or 12.71%). The same process can be used to determine the probability that one of Albert’s classmates will score higher than him on the SAT-Verbal (this probability is 13.79%).
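The table lookup can also be approximated numerically. The sketch below is one way to do so in Python, assuming (as the chapter does) that scores follow a normal curve. The names `prob_above` and `to_raw` are hypothetical helpers, and the area is computed with the standard library's `math.erfc` rather than read from Table B.1, so the last digit may differ slightly from the printed table.

```python
import math

def prob_above(z):
    """Upper-tail area of the standard normal curve: P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def to_raw(z, mean, sd):
    """Convert a z-score back to a raw score: x = z(s) + mean."""
    return z * sd + mean

p_act = prob_above(1.14)  # Albert's ACT-English z-score
p_sat = prob_above(1.09)  # Albert's SAT-Verbal z-score

print(f"P(higher ACT-English score) = {p_act:.4f}")
print(f"P(higher SAT-Verbal score)  = {p_sat:.4f}")
print(f"Raw ACT-English score for z = 1.14: {to_raw(1.14, 21.1, 6.05):.1f}")
```

To the precision of the table, the two probabilities agree with the 12.71% and 13.79% found by hand, and the last line recovers Albert's raw score of 28.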
Interpretation of the Findings

Although the two tests use a different metric, z-scores tell us that Albert scored approximately equally well on both the SAT (z = 1.09) and the ACT (z = 1.14). These scores are both well above average (i.e., more than a standard deviation unit above the mean scores). Thus, the probability that one of his classmates will score higher than Albert is not very high (12.71% and 13.79% for the ACT and SAT, respectively).

Practice Problem #1 for SPSS (answer in Appendix)

Following are the scores for 10 persons on the Stressful Life Events Inventory (SLEI) and the scores for 10 other persons on the Major Stressors Questionnaire (MSQ). For both surveys, higher scores mean that the person has experienced more stressors recently, suggesting a greater risk for stress-related symptoms such as acute anxiety attacks.

Participant    SLEI        Participant    MSQ
1              25          11             110
2              20          12             90
3              14          13             83
4              32          14             115
5              29          15             105
6              26          16             100
7              17          17             77
8              12          18             83
9              28          19             118
10             26          20             120

A. Calculate z-scores for each of the SLEI and MSQ raw scores above.
B. Participants 1 and 11 above received SLEI and MSQ scores of 25 and 110, respectively. Which person’s score is higher compared to the mean SLEI and MSQ scores?
C. What is the probability that someone will score higher than a 25 on the SLEI? What is the probability that someone will score higher than a 110 on the MSQ? What is the probability that someone will score lower than those scores on each test?
D. What is the probability that someone will score higher on the SLEI than Participant 7 (who scored a 17)? What is the probability that someone will score lower than Participant 7?
E. What do you conclude about whether Participant 1 or 11 is at the greatest risk for acute anxiety attacks?

Practice Problem #2 for Hand Calculation (answer in Appendix)

Below are the number of books owned by five different persons. The standard deviation of number of books owned is 32.43.
Participant ID    Books Owned
1                 33
2                 95
3                 12
4                 53
5                 72

$$z = \frac{x - \bar{x}}{s}$$
$$x = z(s) + \bar{x}$$

A. Calculate z-scores for number of books owned for each of the five persons above.
B. Based on the mean and standard deviation of number of books owned for the five persons above, what is the number of books a person would own if they had a z-score of 1.0? What about a z-score of 2.32? How about a z-score of -1.53?
C. What is the probability that someone would own more books than participant 1? Answer the same question for participants 2, 3, 4, and 5.
D. Based on properties of the normal curve, what percentage of persons own more than 12 books, but less than 95 books?

Practice Problem #3 for Hand Calculation and SPSS (answer in Appendix)

Calculate the z-scores for the following two sets of scores. Relative to the mean in each set, did participant 1 score higher in the first set or the second set? What is the probability that someone will score higher than participant 1 on the first set? The second set?

Participant ID    Set 1    Set 2
1                 10       1000
2                 12       1100
3                 19       1257
4                 22       1555
5                 4        872
6                 18       1288
7                 27       999
8                 22       1442
9                 12       1200
10                18       1301