SSAC2006.QE531.LV1.6

Frequency of Large Earthquakes
Introducing Some Elementary Statistical Descriptors

Earthquakes with a magnitude of 7 to 7.9 are considered major earthquakes. Earthquakes with a magnitude of 8 or larger are considered great earthquakes. How frequently do these large earthquakes occur?

Core Quantitative Issue
Data analysis: Exploratory statistical descriptors

Supporting quantitative concepts/skills
• Mean, median, mode
• Variance, standard deviation
• Percentiles, quartiles
• Interpolation
• Normal distribution

Prepared for SSAC by Len Vacher – University of South Florida
© The Washington Center for Improving the Quality of Undergraduate Education. All rights reserved.

Slide 2: Preview

The mission of the U.S. Geological Survey National Earthquake Information Center (USGS NEIC) is “to determine rapidly the location and size of all destructive earthquakes worldwide and to immediately disseminate this information to concerned national and international agencies, scientists, and the general public” (End note 1). The NEIC now locates some 50 earthquakes a day, or about 20,000 per year.

Slide 3 gives some background on earthquake frequency and magnitude. Slide 4 presents the 30 years of data that you will study: the number of large earthquakes per year for 1970-1999.

Slides 5-9 ask you for the mean; the variance and standard deviation; the maximum, minimum and range; the modes; and the median and quartiles of the data.

Slides 10 and 11 look at the quartiles more closely, in terms of percentiles. Linear interpolation becomes relevant in Slide 11.

Slides 12-14 ask you to plot the percentiles. In particular, Slide 14 asks you to consider the distribution of the data. How does the relation between the key percentiles and the standard deviation compare to that of the normal distribution?

Slide 15 wraps up, and Slide 16 gives you the end-of-module assignments, which involve data from 1940-1969.
Slide 3: Background on earthquake magnitude and frequency

As shown in the following table, there were 10-15 major earthquakes and about one great earthquake per year in 2000-2005. Over that period of time, five of the major earthquakes and none of the great earthquakes were in the US. (End note 2)

Magnitude              2000     2001     2002     2003     2004     2005    2006*
8.0 to 9.9                1        1        0        1        2        1        0
7.0 to 7.9               14       15       13       14       14       10        9
6.0 to 6.9              158      126      130      140      141      144      104
5.0 to 5.9             1345     1243     1218     1203     1515     1699     1140
4.0 to 4.9             8045     8084     8584     8462    10888    13917     9736
3.0 to 3.9             4784     6151     7005     7624     7932     9173     7535
2.0 to 2.9             3758     4162     6419     7727     6316     4638     3014
1.0 to 1.9             1026      944     1137     2506     1344       26       14
0.1 to 0.9                5        1       10      134      103        0        2
No Magnitude           3120     2938     2937     3608     2939      867      541
Total                22,256   23,534   27,454   31,419   31,194   30,475   22,095
Deaths (estimated)      231   21,357     1685   33,819  284,010   89,354     6595

Source: U.S. Geological Survey National Earthquake Information Center, 11/6/2006

To appreciate the size of the large earthquakes, remember this: A magnitude-8 earthquake releases about 32× as much energy as a magnitude-7 earthquake, and a magnitude-7 earthquake releases about 32× as much energy as a magnitude-6 earthquake. Therefore a magnitude-8 earthquake is about 1000× as strong as a magnitude-6 earthquake. (End note 3)

Slide 4: Problem

Here are the data. Start your spreadsheet by copying the data in Columns B (Yr) and C (Nbr), starting with Row 3, as shown.

Row  Yr    Nbr      Row  Yr    Nbr      Row  Yr    Nbr
  3  1970   29       13  1980   18       23  1990   13
  4  1971   23       14  1981   14       24  1991   10
  5  1972   20       15  1982   10       25  1992   23
  6  1973   16       16  1983   15       26  1993   16
  7  1974   21       17  1984    8       27  1994   15
  8  1975   21       18  1985   15       28  1995   25
  9  1976   25       19  1986    6       29  1996   22
 10  1977   16       20  1987   11       30  1997   20
 11  1978   18       21  1988    8       31  1998   16
 12  1979   15       22  1989    7       32  1999   11

Data are from the Quantitative Environmental Learning Project (QELP), an NSF-sponsored project to promote quantitative reasoning using real data, by Greg Langkamp and Joe Hull of Seattle Central Community College.
For information about this data set see http://www.seattlecentral.org/qelp/sets/039/039.html

• What are the mean, median, and mode of these data?
• What are the variance and standard deviation?
• What are the quartiles?
• What are the 10th and 90th percentiles?

Slide 5: Finding the Mean

Results (Column F):
Nbr yrs (F6)                 30
Sum Earthquakes (F8)         487
Average Nbr per yr (F10)     16.2
By Excel: Average (F15)      16.2

With pencil and paper, one can count the number of data (Cell F6), sum the total number of earthquakes (F8), and divide the second result by the first (F10). Excel does these steps easily.
• Cell equation for F6: =COUNT(C3:C32)
• Cell equation for F8: =SUM(C3:C32)

Even easier, the cell equation for F15: =AVERAGE(C3:C32)

Delete one of the values in Column C, and replace another one with a letter. Then try these variations.
• =COUNTA(C3:C32)
• =SUMA(C3:C32)
• =AVERAGEA(C3:C32)

Explain what you observe. (End note 4)

Slide 6: Finding the variance and standard deviation

Yr (B)  Nbr (C)  Deviation (E)  Deviation squared (F)
1970      29        12.8            163
1971      23         6.8             46
1972      20         3.8             14
1973      16        -0.2              0
1974      21         4.8             23
1975      21         4.8             23
1976      25         8.8             77
1977      16        -0.2              0
1978      18         1.8              3
1979      15        -1.2              2
1980      18         1.8              3
1981      14        -2.2              5
1982      10        -6.2             39
1983      15        -1.2              2
1984       8        -8.2             68
1985      15        -1.2              2
1986       6       -10.2            105
1987      11        -5.2             27
1988       8        -8.2             68
1989       7        -9.2             85
1990      13        -3.2             10
1991      10        -6.2             39
1992      23         6.8             46
1993      16        -0.2              0
1994      15        -1.2              2
1995      25         8.8             77
1996      22         5.8             33
1997      20         3.8             14
1998      16        -0.2              0
1999      11        -5.2             27

Results (Column I):
average of deviations (I5)             0.0
average of deviations squared (I7)     33.4
sqrt of avg of deviations squared (I9) 5.78

By Excel:
population variance   33.4      sample variance   34.5
population stdev      5.78      sample stdev      5.88

Recreate this spreadsheet.

With pencil and paper, one calculates the deviation of each of the values from the average (Col E) and then squares each of them (Col F). The average of all of the deviations is zero (Cell I5). The average of all of the squared deviations is the population variance (I7). The square root of the population variance is the population standard deviation (I9).

Using Excel’s built-in functions:
=VARP(C3:C32)
=STDEVP(C3:C32)
Without the “P”, VAR and STDEV return the sample variance and sample standard deviation, respectively. (End note 5)

Google research: What’s the difference between sample and population standard deviation?
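As a cross-check outside Excel, the pencil-and-paper steps for the mean, variance and standard deviation can be sketched in Python. This is only an illustrative sketch, not part of the module: it assumes Python 3 with the standard library, and the variable names are made up for this example.

```python
import math
import statistics

# Number of large (M>7) earthquakes per year, 1970-1999 (Column C).
quakes = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

n = len(quakes)           # plays the role of =COUNT(C3:C32)
mean = sum(quakes) / n    # =SUM(C3:C32) divided by the count

# Pencil-and-paper route: deviations, squared deviations, their average.
deviations = [x - mean for x in quakes]
squared = [d * d for d in deviations]
pop_variance = sum(squared) / n          # divide by n (population)
pop_stdev = math.sqrt(pop_variance)

# The sample versions divide by n - 1 instead of n.
sample_variance = sum(squared) / (n - 1)
sample_stdev = math.sqrt(sample_variance)

print(f"mean                = {mean:.1f}")
print(f"population variance = {pop_variance:.1f}")
print(f"population stdev    = {pop_stdev:.2f}")
print(f"sample variance     = {sample_variance:.1f}")
print(f"sample stdev        = {sample_stdev:.2f}")

# The standard library offers the same four descriptors directly.
print(statistics.pvariance(quakes), statistics.pstdev(quakes))
print(statistics.variance(quakes), statistics.stdev(quakes))
```

The only difference between the population and sample results is the divisor (30 versus 29), which is why the two pairs of numbers are close but not equal.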
Slide 7: Finding the range

Sorted data (Column E: Yr; Column F: Ordered, largest at top):

Row  Yr    Ordered      Row  Yr    Ordered      Row  Yr    Ordered
  3  1970   29           13  1978   18           23  1981   14
  4  1976   25           14  1980   18           24  1990   13
  5  1995   25           15  1973   16           25  1987   11
  6  1971   23           16  1977   16           26  1999   11
  7  1992   23           17  1993   16           27  1982   10
  8  1996   22           18  1998   16           28  1991   10
  9  1974   21           19  1979   15           29  1984    8
 10  1975   21           20  1983   15           30  1988    8
 11  1972   20           21  1985   15           31  1989    7
 12  1997   20           22  1994   15           32  1986    6

Rows 3, 4 and 5 hold the maximum, second largest and third largest values; Rows 32, 31 and 30 hold the minimum, second smallest and third smallest values.

With pencil and paper, one sorts the data from highest to lowest and then reads off the largest and smallest values, as well as the second largest and third smallest values, if you wish. The range is the largest value minus the smallest value (29 – 6, in this case).

By Excel:
Max   29     2nd largest    25     3rd largest    25
Min    6     2nd smallest    7     3rd smallest    8
Range 23

To sort in Excel:
• Copy the block to be sorted (B3 to C32) to a blank part of the spreadsheet (E3 to F32).
• Block out E3 to F32.
• Select “Data” on the tool bar.
• Select “Sort”.
• Choose Column F and “Descending”.
• Choose OK.

Excel’s built-in functions:
• For maximum: =MAX(C3:C32)
• For minimum: =MIN(C3:C32) (End note 6)
• For 2nd largest: =LARGE(C3:C32,2)
• For 3rd smallest: =SMALL(C3:C32,3)

Slide 8: Finding modes

When the data are sorted, one can count the number of times that each value occurs. Column G provides these counts. Cell G3, for example, says that there was one year with 29 earthquakes, and G6 says that there were two years with 25 earthquakes. The most frequent values are 16 (4 times) and 15 (4 times). Thus 15 and 16 are modes (a composite mode).

By Excel:
Mode (J8)       16
Nbr of 16s       4
Nbr of 15s       4
Nbr of 19s       0
Nbr >16         12
Nbr >=16        16
Nbr <16         14
Nbr <=16        18
Nbr not 16      26
Nbr 14-18       11
Sum if 16       64
Sum if >=16    329

Recreate this spreadsheet.
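The counting described on this slide can also be sketched in Python as a cross-check. This is a minimal sketch under the assumption of Python 3 with the standard library; the variable names are illustrative, and the comments only point to the Excel formula each line is meant to mirror.

```python
from collections import Counter
import statistics

# Number of large (M>7) earthquakes per year, 1970-1999 (Column C).
quakes = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

# How many years had each value (the idea behind the counting column).
counts = Counter(quakes)

# Every value that reaches the highest count is a mode.
highest = max(counts.values())
modes = sorted(value for value, count in counts.items() if count == highest)

# Conditional counts and sums, in the spirit of COUNTIF and SUMIF.
nbr_over_16 = sum(1 for x in quakes if x > 16)      # like =COUNTIF(C3:C32,">16")
nbr_not_16 = sum(1 for x in quakes if x != 16)      # like =COUNTIF(C3:C32,"<>16")
nbr_14_to_18 = sum(1 for x in quakes if 14 <= x <= 18)
sum_if_16 = sum(x for x in quakes if x == 16)       # like =SUMIF(C3:C32,16)
sum_if_ge_16 = sum(x for x in quakes if x >= 16)    # like =SUMIF(C3:C32,">=16")

print("modes:", modes, "each occurring", highest, "times")
print("years with more than 16:", nbr_over_16)
print("years with 14 to 18:", nbr_14_to_18)
print("sum of the 16s:", sum_if_16, " sum of values >= 16:", sum_if_ge_16)

# statistics.multimode also reports every mode, not just one of them.
print(sorted(statistics.multimode(quakes)))
```

Listing all values that tie for the highest count makes the composite mode visible, whereas a single-valued mode function reports only one of them.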
The SUMIF function works the same way as the COUNTIF function: here it returns 64 for the values equal to 16 and 329 for the values of 16 or larger.

Excel’s built-in functions:
• For mode (J8): =MODE(C3:C32)

For counts:
• For number of times the value is 16: =COUNTIF(C3:C32,16)
• For number of times the value is larger than 16: =COUNTIF(C3:C32,">16")
• For number of times the value is 16 or larger: =COUNTIF(C3:C32,">=16")
• For number of times the value is not 16: =COUNTIF(C3:C32,"<>16")

Slide 9: Finding quartiles

The sorted list (Column F, largest at top) is ranked 1 to 30 in Column G.

When the data are sorted, one can locate the median and quartiles. The median is the value that occurs halfway through the list. For 30 values, the halfway position is 31/2, or in general (COUNT+1)/2 (J6). The first quartile occurs a quarter of the way through from the bottom, at (COUNT+1)/4 (J7), and the third quartile occurs three-quarters of the way through, at (COUNT+1)*3/4 (J8). The horizontal boxes from Column F to H identify the quartiles from this sorting: 11 for the first quartile (Q1), 16 for the second quartile (Q2 or median), and 21 for the third quartile (Q3).

Counts:
Count                 30
mid count (J6)        15.5
1st qtr count (J7)     7.75
3rd qtr count (J8)    23.25

By Excel:
median (J12)    16
Q1 (J13)        11.5
Q2 (J14)        16
Q3 (J15)        20.75

Excel’s built-in functions:
• For median (J12): =MEDIAN(C3:C32)
• For Q1 (J13): =QUARTILE(C3:C32,1)
• For Q2 (J14): =QUARTILE(C3:C32,2)
• For Q3 (J15): =QUARTILE(C3:C32,3)

NOTE: The results (J13, J15) do not agree with our values from counting through our sorted list. What is Excel doing?

Another way that quartiles are determined is:
• Q1 is the median of the smaller half of the list: =MEDIAN(F18:F32)
• Q3 is the median of the larger half of the list: =MEDIAN(F3:F17)

By Excel:
median             16
med(upr half)      21
med(lwr half)      11

NOTE: These results (J21 and J22) don’t agree with Excel’s built-in functions either.

Slide 10: Finding percentiles and comparing them to quartiles

Excel uses percentiles to calculate quartiles. The median is the 50th percentile, meaning that 50% of the data are smaller than the median. Q1 is the 25th percentile, meaning that 25% of the data are smaller than Q1. Q3 is the 75th percentile, meaning that 75% of the data are smaller than Q3.

Row  Ordered (F)  Rank from bottom (G)  Percentile (H)  (I)    Excel (K)
  3     29              30                 1.000
  4     25              29                 0.966
  5     25              28                 0.931
  6     23              27                 0.897
  7     23              26                 0.862
  8     22              25                 0.828
  9     21              24                 0.793
 10     21              23                 0.759         75th   20.75
 11     20              22                 0.724
 12     20              21                 0.690
 13     18              20                 0.655
 14     18              19                 0.621
 15     16              18                 0.586
 16     16              17                 0.552
 17     16              16                 0.517         50th   16
 18     16              15                 0.483
 19     15              14                 0.448
 20     15              13                 0.414
 21     15              12                 0.379
 22     15              11                 0.345
 23     14              10                 0.310
 24     13               9                 0.276         25th   11.5
 25     11               8                 0.241
 26     11               7                 0.207
 27     10               6                 0.172
 28     10               5                 0.138
 29      8               4                 0.103
 30      8               3                 0.069
 31      7               2                 0.034
 32      6               1                 0.000

One can calculate the percentile of each position in the sorted list (Column H). Cell H4 shows that the second highest position contains the 96.6th percentile.
It is calculated as (29-1)/(30-1), or in Excel: =(G4-1)/(COUNT($C$3:$C$32)-1). The $-symbols are included so that the equation can be copied through Column H.

The 75th percentile clearly occurs between 20 and 21, and closer to 21. The 25th percentile clearly occurs between 11 and 13, and closer to 11. Both observations are consistent with the values returned by Excel for Q3 and Q1, respectively. What do we find if we interpolate?

Slide 11: Finding quartiles by interpolating percentiles

Percentile       Interpolate (Col J)   Excel (Col K)
75th (Row 10)         20.75               20.75
50th (Row 17)         16                  16
25th (Row 24)         11.5                11.5

For Excel’s algorithm see: http://support.microsoft.com/?kbid=214072

Cell equation in J10: =F11+(0.75-H11)/(H10-H11)*(F10-F11)
Why? Explain using the word proportion. What cell equations belong in J17 and J24?

It looks like Excel determines quartiles as percentiles by interpolation.
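The interpolation between neighboring percentiles can be sketched in Python to check the three values in the Interpolate column. This is an illustrative sketch only: the function name `percentile_by_interpolation` is hypothetical, and it assumes the percentile of the k-th smallest of n values is (k-1)/(n-1), as in the percentile column of the spreadsheet.

```python
# Number of large (M>7) earthquakes per year, 1970-1999 (Column C).
quakes = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]


def percentile_by_interpolation(values, p):
    """Return the value at percentile p (between 0 and 1).

    Assumes the k-th smallest of n values sits at percentile
    (k - 1) / (n - 1) and interpolates linearly between neighbors.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = p * (n - 1)       # zero-based, possibly fractional
    lower = int(position)        # index of the neighbor just below
    fraction = position - lower  # proportion of the way to the next one
    if lower + 1 >= n:           # p == 1: nothing above the maximum
        return float(ordered[-1])
    step = ordered[lower + 1] - ordered[lower]
    return ordered[lower] + fraction * step


q1 = percentile_by_interpolation(quakes, 0.25)
q2 = percentile_by_interpolation(quakes, 0.50)
q3 = percentile_by_interpolation(quakes, 0.75)
print(q1, q2, q3)
```

For the 75th percentile the fractional position is 21.75, so the result lies three-quarters of the way from 20 to 21; the same proportion appears in the cell equation for J10.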
Slide 12: Graphing the percentiles

In this spreadsheet the sorted years are in Column D, the ordered values (largest at top) in Column E, the rank from the bottom in Column F, and the percentile in Column G.

[Graph: “Earthquakes >M7, 1970-1999”; x-axis: number of EQs in year (5 to 30); y-axis: percentile (0.000 to 1.000)]

To visualize what we have done by interpolation, plot percentile (Col G) against earthquakes per year (Col E) and see where the 25th, 50th and 75th percentiles cross the data curve. Increase the y-scale to see the crossings more clearly.
Slide 13: Graphing the percentiles (2)

Percentile (Col E)   Number per year (Col F)
      1.00                 29
      0.95                 25
      0.90                 23.2
      0.85                 22.65
      0.80                 21.2
      0.75                 20.75
      0.70                 20
      0.65                 18
      0.60                 16.8
      0.55                 16
      0.50                 16
      0.45                 15.05
      0.40                 15
      0.35                 15
      0.30                 13.7
      0.25                 11.5
      0.20                 10.8
      0.15                 10
      0.10                  8
      0.05                  7.45
      0.00                  6

[Graph: “Cumulated Frequency of EQs >M7”; x-axis: number of earthquakes per year (0 to 30); y-axis: percentile (0.00 to 1.00)]

Or, instead of interpolating, use Excel’s built-in function to find whatever percentile you want. For example, Column E lists percentiles incrementally. Column F lists the corresponding numbers of earthquakes per year; the cell equation for F5 is =PERCENTILE($C$3:$C$32,E5). The $-symbols are included so you can copy the formula through the column. Column G repeats Column E (percentile, for the graph) so you can easily plot the graph as Column G vs. Column F (End note 7).

How does the graph on this slide differ from the graph on Slide 12?
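A built-in percentile function is also available outside Excel. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later) with `method="inclusive"`, which places the k-th smallest of n values at percentile (k-1)/(n-1). That it reproduces the PERCENTILE column has been checked only against the values listed on this slide, so treat the correspondence as an observation rather than a guarantee.

```python
import statistics

# Number of large (M>7) earthquakes per year, 1970-1999 (Column C).
quakes = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

# n=20 asks for the 19 cut points at the 5th, 10th, ..., 95th percentiles.
cuts = statistics.quantiles(quakes, n=20, method="inclusive")

# Add the two ends (0th and 100th percentiles) to complete the table.
percentile_table = [(0.0, float(min(quakes)))]
percentile_table += [(round(0.05 * (i + 1), 2), value)
                     for i, value in enumerate(cuts)]
percentile_table.append((1.0, float(max(quakes))))

# Print largest percentile first, matching the layout of the slide.
for p, value in reversed(percentile_table):
    print(f"{p:.2f}  {value:g}")
```

Plotting the first column against the second gives the same kind of cumulative curve as the graph on this slide.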
Slide 14: Key percentiles

Row  Percentile (E)  Number per year (F)  Central fraction  +/- STDEVP units
  4     0.999            28.884             99.7%             1.98 (J4)
  5     0.977            26.332             95.4%             1.70 (I5)
  6     0.841            22.389             68.2%             1.07 (H6)
  7     0.500            16
  8     0.159            10
  9     0.023             6.667
 10     0.001             6.029

[Graph: x-axis: number of earthquakes per year (0 to 30); y-axis: percentile (0.0 to 1.0)]

A normal distribution has the property that 68.2% of the values lie within plus or minus one standard deviation of the mean (i.e., between percentiles of 84.1% and 15.9%). Similarly, 95.4% and 99.7% of the values are within +/- two and three standard deviations, respectively, in a normal distribution. How do these standards compare to the distribution of earthquakes per year?

Recreate this spreadsheet. Column E lists the key percentiles. Column F uses the PERCENTILE function. The cell equation for H6 is =(F6-F8)/STDEVP(C3:C32)/2. The graph plots Column E against Column F.

H6, I5, and J4 would be 1.00, 2.00, and 3.00 respectively if the earthquakes per year were distributed normally. This is good agreement. The distribution of earthquakes per year is nearly normal.

Slide 15: What we have found for 1970-1999

Number of Earthquakes (M>7) per Year, 1970-1999

• On average there were 16.2 large (M>7) earthquakes per year.
(Read this as 162 in ten years if you don’t like the notion of 0.2 earthquakes in a year.)
• The standard deviation was 5.78 earthquakes per year (population standard deviation).
• The median was 16, which was also a mode.
• The distribution was bimodal: there were four years with 16 earthquakes and four with 15.
• The maximum number was 29 (1970), and the minimum was 6 (1986), giving a total range of 23.
• Q3 was 20.75 and Q1 was 11.5, giving an interquartile range (Q3-Q1) of 9.25 earthquakes per year.
• The 90th percentile was 23.2, and the 10th percentile was 8.0, meaning the range of the central 80% of the distribution was 15.2 earthquakes per year.
• The central 68.2% of the distribution occurred within +/- 1.1 standard deviations of the mean, and the central 95.4%, within +/- 1.7 standard deviations of the mean. On this basis the distribution of earthquakes per year was nearly normal.

Slide 16: End of Module Assignments

Here are the data from the preceding thirty years, 1940-1969, from the QELP Data Set 039: http://www.seattlecentral.org/qelp/sets/039/039.html

Row  Year  Nbr      Row  Year  Nbr      Row  Year  Nbr
  3  1940   23       13  1950   29       23  1960   22
  4  1941   24       14  1951   21       24  1961   18
  5  1942   27       15  1952   17       25  1962   15
  6  1943   41       16  1953   22       26  1963   20
  7  1944   31       17  1954   17       27  1964   15
  8  1945   27       18  1955   19       28  1965   22
  9  1946   35       19  1956   15       29  1966   19
 10  1947   26       20  1957   34       30  1967   16
 11  1948   28       21  1958   10       31  1968   30
 12  1949   36       22  1959   15       32  1969   27

1. Rewrite the information of Slide 15, replacing the statistics with numbers appropriate to the 1940-1969 data. Use bullets in the same order so that your answers are easy to grade. If you can, use a PowerPoint slide (simply copy and paste Slide 15 into a new presentation, delete the data set, and change the numbers in the bulleted items appropriately).
2. Hand in spreadsheets for Slides 6, 8, 12 and 13 for the new data set. Include the graphs for the last two.
3. Answer the questions in the green boxes of Slides 5, 11 and 13.
4. Find the mean, median, standard deviation and quartiles for the sixty years of data.
Find the +/- number of standard deviations for the central 68.2%, 95.4% and 99.7% of the sixty years of data (Slide 14).
5. Based on the information in this module:
a. What is the standard deviation and why is it important?
b. What are quartiles and why are they important?
c. What are percentiles and why are they important?

Slide 17: End notes

1. Home page: http://earthquake.usgs.gov/regional/neic/. Return to Slide 2.
2. For earthquake facts and statistics see the USGS NEIC Website: http://neic.usgs.gov/neis/eqlists/eqstats.html. Return to Slide 3.
3. How large is an 8.7-magnitude earthquake compared to a magnitude-5.8 earthquake? See the USGS Website: http://earthquake.usgs.gov/learning/topics/how_much_bigger.php. Return to Slide 3.
4. Experiment also with the function COUNTBLANK(array). Return to Slide 5.
5. As variations on the COUNTA(array) theme, there are also the following built-in functions: VARA(array), VARPA(array), STDEVA(array), STDEVPA(array). Return to Slide 6.
6. There are also MAXA(array) and MINA(array). Return to Slide 7.
7. Or if you want to reverse the axes (i.e., plot x vs. y, instead of y vs. x), here is one way of doing it. Plot y vs. x as usual. Right-click on the gray area of the graph. Select “Source data …” from the popup window. Select the “Series” tab. Click on the small icon to the right of the X-values. The address of your x-series will appear in a long, skinny window and a shimmering outline will appear around the series block in the spreadsheet. Change the address in the skinny window to that of the Y-values. One easy way of doing that is to use the mouse to outline the block of the Y-values on the spreadsheet. Notice how the address in the skinny window changes as you move the mouse down the column. When the block is the size you want, release the mouse button and hit enter. Repeat for the Y-values, replacing their address with the address of the old X-values. Rescale and retitle the axes. Return to Slide 13.