Download earthquakes per year. - SERC

SSAC2006.QE531.LV1.6
Frequency of Large Earthquakes
Introducing Some Elementary Statistical Descriptors
Earthquakes with a magnitude 7 to
7.9 are considered major
earthquakes. Earthquakes with
magnitude larger than 8 are
considered great earthquakes. How
frequently do these large
earthquakes occur?
Core Quantitative Issue
Data analysis: Exploratory
statistical descriptors
Supporting quantitative concepts/skills
Mean, median, mode
Variance, standard deviation
Percentiles, Quartiles
Interpolation
Normal distribution
Prepared for SSAC by
Len Vacher – University of South Florida
© The Washington Center for Improving the Quality of Undergraduate Education. All rights reserved.
1
Preview
The mission of the U.S. Geological Survey National Earthquake Information Center
(USGS NEIC) is “to determine rapidly the location and size of all destructive
earthquakes worldwide and to immediately disseminate this information to concerned
national and international agencies, scientists, and the general public” (End note 1).
The NEIC now locates some 50 earthquakes a day, or about 20,000 per year.
Slide 3 gives some background on earthquake frequency and magnitude.
Slide 4 presents the 30 years of data that you will study: the number of large
earthquakes per year for 1970-1999.
Slides 5-9 ask you for the mean; the variance and standard deviation; the maximum,
minimum and range; the modes; and the median and quartiles of the data.
Slides 10 and 11 look at the quartiles more closely – in terms of percentiles. Linear
interpolation becomes relevant in Slide 11.
Slides 12-14 ask you to plot the percentiles. In particular, Slide 14 asks you to consider
the distribution of the data. How does the relation between the key percentiles and the
standard deviation compare to that of the normal distribution?
Slide 15 wraps up, and Slide 16 gives you the end-of-module assignments, which
involve data from 1940-1969.
2
Background on earthquake magnitude and frequency
As shown in the following table, there were 10-15 major and one great
earthquake per year in 2000-2005. Over that period of time, five of the major
earthquakes and none of the great earthquakes were in the US. (End note 2)
Magnitude              2000     2001     2002     2003     2004     2005    2006*
8.0 to 9.9                1        1        0        1        2        1        0
7.0 to 7.9               14       15       13       14       14       10        9
6.0 to 6.9              158      126      130      140      141      144      104
5.0 to 5.9             1345     1243     1218     1203     1515     1699     1140
4.0 to 4.9             8045     8084     8584     8462    10888    13917     9736
3.0 to 3.9             4784     6151     7005     7624     7932     9173     7535
2.0 to 2.9             3758     4162     6419     7727     6316     4638     3014
1.0 to 1.9             1026      944     1137     2506     1344       26       14
0.1 to 0.9                5        1       10      134      103        0        2
No Magnitude           3120     2938     2937     3608     2939      867      541
Total                22,256   23,534   27,454   31,419   31,194   30,475   22,095
Deaths (estimated)      231   21,357     1685   33,819  284,010   89,354     6595
Source: U.S. Geological Survey National Earthquake Information Center, 11/6/2006
To appreciate the size of the large earthquakes, remember this: A magnitude-8
earthquake releases about 32× as much energy as a magnitude-7 earthquake,
and a magnitude-7 earthquake releases about 32× as much energy as a
magnitude-6 earthquake. Therefore a magnitude-8 earthquake is about 1000×
as strong as a magnitude-6 earthquake. (End note 3)
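The arithmetic behind the "32×" and "1000×" figures can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard relation that radiated energy scales as 10^(1.5 × magnitude); the slide only quotes the rounded factors, and the function name `energy_ratio` is made up for this illustration.

```python
# Assumption: energy scales as 10 ** (1.5 * M); the slide itself only
# quotes the rounded factors "about 32x" and "about 1000x".

def energy_ratio(m_large: float, m_small: float) -> float:
    """Return how many times more energy the larger earthquake releases."""
    return 10 ** (1.5 * (m_large - m_small))

one_step = energy_ratio(8.0, 7.0)   # one magnitude unit apart
two_steps = energy_ratio(8.0, 6.0)  # two magnitude units apart

print(round(one_step, 1))  # about 31.6, i.e. the slide's "about 32x"
print(round(two_steps))    # 1000
```

Two steps of one magnitude unit multiply, which is why 32 × 32 comes out near 1000.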
3
Problem
Row   B: Yr   C: Nbr
3     1970    29
4     1971    23
5     1972    20
6     1973    16
7     1974    21
8     1975    21
9     1976    25
10    1977    16
11    1978    18
12    1979    15
13    1980    18
14    1981    14
15    1982    10
16    1983    15
17    1984    8
18    1985    15
19    1986    6
20    1987    11
21    1988    8
22    1989    7
23    1990    13
24    1991    10
25    1992    23
26    1993    16
27    1994    15
28    1995    25
29    1996    22
30    1997    20
31    1998    16
32    1999    11
Here are the data. Start your
spreadsheet by copying the
data in Columns B and C,
starting with Row 3, as shown.
Data are from the Quantitative Environmental Learning
Project (QELP), an NSF-sponsored project to promote
quantitative reasoning using real data, by Greg
Langkamp and Joe Hull of Seattle Central Community
College. For information about this data set see
http://www.seattlecentral.org/qelp/sets/039/039.html
• What are the mean, median, and mode of these data?
• What are the variance and standard deviation?
• What are the quartiles?
• What are the 10th and 90th percentiles?
4
Finding the Mean
Nbr yrs (F6): 30
Sum Earthquakes (F8): 487
Average Nbr per yr (F10): 16.2
By Excel, Average (F15): 16.2
With pencil and paper, one can count the number of data (Cell F6), sum the total number of earthquakes (F8), and divide the second result by the first (F10).
Excel does these steps easily.
• Cell equation for F6:
=COUNT(C3:C32)
• Cell equation for F8:
=SUM(C3:C32)
• Cell equation for F10:
=F8/F6
Even easier, the cell equation for F15:
=AVERAGE(C3:C32)
Delete one of the values in
Column C, and replace
another one with a letter.
Then try these variations.
• =COUNTA(C3:C32)
• =SUMA(C3:C32)
• =AVERAGEA(C3:C32)
Explain what you observe
(End note 4)
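For readers who want to check the spreadsheet outside Excel, the same three steps can be sketched in Python. This is only an illustration: the list `counts` is typed in from Column C of Slide 4, and the variable names are chosen here, not taken from the module.

```python
# Number of large earthquakes per year, 1970-1999 (Column C, Rows 3-32).
counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

n_years = len(counts)   # plays the role of =COUNT(C3:C32)
total = sum(counts)     # plays the role of =SUM(C3:C32)
mean = total / n_years  # plays the role of =AVERAGE(C3:C32)

print(n_years, total, round(mean, 1))
```

The printed values should match Cells F6, F8 and F10 (30, 487 and 16.2).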
5
Finding the variance and standard deviation
Row   B: Yr   C: Nbr   E: Deviation   F: Deviation squared
3     1970    29       12.8           163
4     1971    23       6.8            46
5     1972    20       3.8            14
6     1973    16       -0.2           0
7     1974    21       4.8            23
8     1975    21       4.8            23
9     1976    25       8.8            77
10    1977    16       -0.2           0
11    1978    18       1.8            3
12    1979    15       -1.2           2
13    1980    18       1.8            3
14    1981    14       -2.2           5
15    1982    10       -6.2           39
16    1983    15       -1.2           2
17    1984    8        -8.2           68
18    1985    15       -1.2           2
19    1986    6        -10.2          105
20    1987    11       -5.2           27
21    1988    8        -8.2           68
22    1989    7        -9.2           85
23    1990    13       -3.2           10
24    1991    10       -6.2           39
25    1992    23       6.8            46
26    1993    16       -0.2           0
27    1994    15       -1.2           2
28    1995    25       8.8            77
29    1996    22       5.8            33
30    1997    20       3.8            14
31    1998    16       -0.2           0
32    1999    11       -5.2           27
average of deviations (I5): 0.0
average of dev'ns-sqred (I7): 33.4
sqrt of avg of dev-sqred (I9): 5.78
By Excel
population variance: 33.4
sample variance: 34.5
population stdev: 5.78
sample stdev: 5.88
Recreate this
spreadsheet.
With pencil and paper, one
calculates the deviation of
each of the values from the
average (Col E) and then
squares each of them (Col
F). The average of all of the
deviations is zero (Cell I5).
The average of all of the
squared deviations is the
population variance (I7).
The square root of the
population variance is the
population standard
deviation (I9). Using Excel’s
built-in functions:
=VARP(C3:C32)
=STDEVP(C3:C32)
Without the “p”, VAR and
STDEV return the sample
variance and sample
standard deviation,
respectively. (End note 5)
Google research: What’s the difference between sample
and population standard deviation?
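The pencil-and-paper steps of this slide can also be sketched in Python. The sketch below is only an illustration with names chosen here; it assumes the usual convention that the population variance divides the sum of squared deviations by n and the sample variance divides by n - 1, which is consistent with the 33.4 and 34.5 shown on the slide.

```python
import math
import statistics

counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

n = len(counts)
mean = sum(counts) / n
deviations = [x - mean for x in counts]   # Column E
squared = [d * d for d in deviations]     # Column F

pop_var = sum(squared) / n                # like =VARP(C3:C32)
pop_sd = math.sqrt(pop_var)               # like =STDEVP(C3:C32)
sample_var = sum(squared) / (n - 1)       # like =VAR(C3:C32)
sample_sd = math.sqrt(sample_var)         # like =STDEV(C3:C32)

# Cross-check against Python's standard library.
assert math.isclose(pop_var, statistics.pvariance(counts))
assert math.isclose(sample_var, statistics.variance(counts))

print(round(pop_var, 1), round(sample_var, 1))
print(round(pop_sd, 2), round(sample_sd, 2))
```

The only difference between the two versions is the divisor, so the gap between them shrinks as the number of data grows.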
6
Finding the range
Row   E: Yr   F: Ordered   Note
3     1970    29           <<< Max
4     1976    25           <<< Second largest
5     1995    25           <<< Third
6     1971    23
7     1992    23
8     1996    22
9     1974    21
10    1975    21
11    1972    20
12    1997    20
13    1978    18
14    1980    18
15    1973    16
16    1977    16
17    1993    16
18    1998    16
19    1979    15
20    1983    15
21    1985    15
22    1994    15
23    1981    14
24    1990    13
25    1987    11
26    1999    11
27    1982    10
28    1991    10
29    1984    8
30    1988    8            <<< Third
31    1989    7            <<< Second smallest
32    1986    6            <<< Min

With pencil and paper, one sorts the data from highest to lowest and then reads off the largest and smallest values, as well as the second largest and third smallest values, if you wish. The range is the largest value minus the smallest value (29 - 6, in this case).

By Excel
Max: 29, 2nd: 25, 3rd: 25
Min: 6, 2nd: 7, 3rd: 8
Range: 23
To sort in Excel:
• Copy the block to be sorted (B3 to C32)
to a blank part of the spreadsheet (E3 to
F32).
• Block out E3 to F32.
• Select “Data” on tool bar
• Select “Sort”
• Choose Column F and “Descending”
• Choose OK
Excel’s built-in functions:
• For Max
=MAX(C3:C32)
• For minimum
=MIN(C3:C32) (End note 6)
• For 2nd largest
=LARGE(C3:C32,2)
• For 3rd smallest
=SMALL(C3:C32,3)
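The sort-and-read-off procedure can be sketched in Python as follows. This is an illustration only, with variable names chosen here; indexing into a sorted list stands in for Excel's LARGE and SMALL.

```python
counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

ordered = sorted(counts, reverse=True)  # largest at top, like Column F

largest = ordered[0]          # like =MAX(C3:C32)
smallest = ordered[-1]        # like =MIN(C3:C32)
second_largest = ordered[1]   # like =LARGE(C3:C32,2)
third_smallest = ordered[-3]  # like =SMALL(C3:C32,3)
data_range = largest - smallest

print(largest, second_largest, third_smallest, smallest, data_range)
```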
7
Finding modes
Column F holds the counts sorted from largest to smallest (as on Slide 7). Column G counts how many times each value occurs.

Value (Col F)   Count (Col G)
29              1
25              2
23              2
22              1
21              2
20              2
18              2
16              4
15              4
14              1
13              1
11              2
10              2
8               2
7               1
6               1

When the data are sorted, one can count the number of times that each value occurs. Column G provides these counts. Cell G3, for example, says that there was one year with 29 earthquakes, and G6 says that there were two years with 25 earthquakes. The most frequent values are 16 (4 times) and 15 (4 times). Thus 15 and 16 are modes (a composite mode).

By Excel
Mode (J8): 16
Nbr of 16s: 4
Nbr of 15s: 4
Nbr of 19s: 0
Nbr >16: 12
Nbr >=16: 16
Nbr <16: 14
Nbr <=16: 18
Nbr not 16: 26
Nbr 14-18: 11
Sumifs
Sum if 16: 64
Sum if >=16: 329

Recreate this spreadsheet.
Excel’s built-in functions:
• For mode (J8):
=MODE(C3:C32)
For counts:
• For number of times the value is 16:
=COUNTIF(C3:C32,16)
• For number of times value is larger than 16:
=COUNTIF(C3:C32,">16")
• For number of times value is 16 or larger:
=COUNTIF(C3:C32,">=16")
• For number of times value is not 16:
=COUNTIF(C3:C32,"<>16")
The SUMIF function works the same way as the COUNTIF function.
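The counting on this slide can be sketched in Python as well. The sketch is an illustration with names chosen here: `Counter` tallies each value, `multimode` (Python 3.8 or later) returns every most-frequent value, and the generator expressions play the role of the COUNTIF and SUMIF cells.

```python
from collections import Counter
from statistics import multimode

counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

freq = Counter(counts)             # value -> number of years, like Column G
modes = sorted(multimode(counts))  # all most-frequent values

count_16 = freq[16]                                # like =COUNTIF(C3:C32,16)
count_gt_16 = sum(1 for x in counts if x > 16)     # like ">16"
count_ge_16 = sum(1 for x in counts if x >= 16)    # like ">=16"
count_ne_16 = sum(1 for x in counts if x != 16)    # like "<>16"
sum_if_16 = sum(x for x in counts if x == 16)      # the "Sum if 16" cell
sum_if_ge_16 = sum(x for x in counts if x >= 16)   # the "Sum if >=16" cell

print(modes, count_16, count_gt_16, count_ge_16, count_ne_16)
print(sum_if_16, sum_if_ge_16)
```

Note that the slide's MODE cell shows a single value (16), whereas this sketch lists both 15 and 16.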
8
Finding quartiles
Columns E and F repeat the sorted list from Slide 7 (largest at top). Column G (Rank) numbers the sorted rows from 1 (Row 3) through 30 (Row 32). Boxes on the sorted list mark the upper quartile (Upr Qtr), the mid-count (Mid-Count), and the lower quartile (Lwr Qtr).

By counting
Count: 30
mid count (J6): 15.5
1st qtr count (J7): 7.75
3rd qtr count (J8): 23.25

By Excel
median (J12): 16
Q1 (J13): 11.5
Q2 (J14): 16
Q3 (J15): 20.75

By Excel, using halves of the list
median: 16
med(upr half): 21
med(lwr half): 11
Another way that quartiles are determined is:
• Q1 is the median of the smaller half of the list: MEDIAN(F18:F32)
• Q3 is the median of the larger half of the list: MEDIAN(F3:F17)
NOTE: These results (J21 and J22) don’t agree with Excel’s built-in functions either.
When the data are sorted, one can locate the median and quartiles. The median is the value that occurs halfway through the list. For 30 values, the halfway position is 31/2, or in general (COUNT+1)/2 (J6). The first quartile occurs a quarter of the way through from the bottom, at (COUNT+1)/4 (J7), and the third quartile occurs three-quarters of the way through, at (COUNT+1)*3/4 (J8).
The horizontal boxes from Column F to H identify the quartiles from this sorting: 11 for the first quartile (Q1), 16 for the second quartile (Q2 or median), and 21 for the third quartile (Q3).
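The counting positions described above can be sketched in Python. This is an illustration under one stated assumption: when a position such as 7.75 falls between two entries, the helper `value_at` (a name made up here) interpolates linearly between them. The slide does not say which rule to use, and for these data the neighboring entries are equal, so the choice does not affect the result.

```python
import math

counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

ascending = sorted(counts)  # position 1 is the smallest value
n = len(ascending)

def value_at(position: float) -> float:
    """Value at a 1-based, possibly fractional, position from the bottom.

    Assumption: interpolate linearly between the two neighboring entries.
    """
    lower = math.floor(position)
    frac = position - lower
    low_val = ascending[lower - 1]
    if frac == 0:
        return float(low_val)
    high_val = ascending[lower]
    return low_val + frac * (high_val - low_val)

q1 = value_at((n + 1) / 4)          # position 7.75  (J7)
median = value_at((n + 1) / 2)      # position 15.5  (J6)
q3 = value_at((n + 1) * 3 / 4)      # position 23.25 (J8)

print(q1, median, q3)
```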
Excel’s built-in functions:
• For median (J12):
=MEDIAN(C3:C32)
• For Q1 (J13):
=QUARTILE(C3:C32,1)
• For Q2 (J14):
=QUARTILE(C3:C32,2)
• For Q3 (J15):
=QUARTILE(C3:C32,3)
NOTE: The results (J13, J15) do not agree with our values from counting through our sorted list. What is EXCEL doing?
9
Finding percentiles and comparing them to quartiles
Columns B and C hold the years and counts from Slide 4. Columns E through K are:

Row   E: Yr   F: Ordered (largest at top)   G: Rank from bottom   H: Percentile   I      K: Excel
3     1970    29                            30                    1.000
4     1976    25                            29                    0.966
5     1995    25                            28                    0.931
6     1971    23                            27                    0.897
7     1992    23                            26                    0.862
8     1996    22                            25                    0.828
9     1974    21                            24                    0.793
10    1975    21                            23                    0.759           75th   vs 20.75
11    1972    20                            22                    0.724
12    1997    20                            21                    0.690
13    1978    18                            20                    0.655
14    1980    18                            19                    0.621
15    1973    16                            18                    0.586
16    1977    16                            17                    0.552
17    1993    16                            16                    0.517           50th   vs 16
18    1998    16                            15                    0.483
19    1979    15                            14                    0.448
20    1983    15                            13                    0.414
21    1985    15                            12                    0.379
22    1994    15                            11                    0.345
23    1981    14                            10                    0.310
24    1990    13                            9                     0.276           25th   vs 11.5
25    1987    11                            8                     0.241
26    1999    11                            7                     0.207
27    1982    10                            6                     0.172
28    1991    10                            5                     0.138
29    1984    8                             4                     0.103
30    1988    8                             3                     0.069
31    1989    7                             2                     0.034
32    1986    6                             1                     0.000
Excel uses percentiles to
calculate quartiles. The
median is the 50th
percentile, meaning that
50% of the data are smaller
than the median. Q1 is the
25th percentile, meaning
that 25% of the data are
smaller than Q1. Q3 is the
75th percentile, meaning
that 75% of the data are
smaller than Q3.
One can calculate the
percentile of each position
in the sorted list (Column
H). Cell H4 shows that the
second highest position
contains the 96.6th
percentile. It is calculated
as (29-1)/(30-1) or in Excel
=(G4-1)/(COUNT($C$3:$C$32)-1)
The $-symbols are included
in order that the equation
can be copied through
Column H.
The 75th percentile clearly occurs between 20 and 21, and closer to 21.
The 25th percentile clearly occurs between 11 and 13, and closer to 11.
Both observations are consistent with the values returned by Excel for Q3 and Q1, respectively.
What do we find if we interpolate?
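The percentile assigned to each position in the sorted list, (rank - 1)/(COUNT - 1), can be sketched in Python. This is only an illustration; the dictionary name `percentile_of_rank` is chosen here, and ranks are counted from the bottom as in Column G.

```python
counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

ascending = sorted(counts)  # rank 1 is the smallest value
n = len(ascending)

# Same arithmetic as =(G4-1)/(COUNT($C$3:$C$32)-1), for every rank.
percentile_of_rank = {rank: (rank - 1) / (n - 1) for rank in range(1, n + 1)}

# Print the table with the largest value at the top, like Columns F-H.
for rank in range(n, 0, -1):
    value = ascending[rank - 1]
    print(f"{value:>3} {rank:>3} {percentile_of_rank[rank]:.3f}")
```

Rank 23 (value 21) lands at 0.759 and rank 22 (value 20) at 0.724, which is why the 75th percentile must lie between 20 and 21.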
10
Finding quartiles by interpolating percentiles
Columns B through H repeat the spreadsheet of Slide 10 (years, counts, sorted list, rank from bottom, and percentile, with the 75th, 50th and 25th percentiles marked at Rows 10, 17 and 24). Columns J and K compare the interpolated values with Excel's:

Row   Percentile   J: Interpolate   K: Excel
10    75th         20.75            20.75
17    50th         16               16
24    25th         11.5             11.5
For Excel’s algorithm see: http://support.microsoft.com/?kbid=214072
Cell equation in J10:
=F11+(.75-H11)/(H10-H11)*(F10-F11)
Why? Explain
using the word
proportion.
What cell
equations
belong in J17
and J24?
It looks like Excel
determines
quartiles as
percentiles by
interpolation.
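The interpolation in Cell J10 can be sketched as a small Python function. This is an illustration, not a statement of Excel's actual algorithm: the function name `interpolate_percentile` is made up here, and it simply applies the same proportion as the J10 formula between the two bracketing rows of the percentile column.

```python
counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

def interpolate_percentile(data, p):
    """Linearly interpolate the value at percentile p (0 <= p <= 1).

    Each sorted position i (0-based) is assigned percentile i / (n - 1),
    matching Column H; p is located between two neighboring positions.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    ascending = sorted(data)
    n = len(ascending)
    for i in range(n - 1):
        lo_p, hi_p = i / (n - 1), (i + 1) / (n - 1)
        if lo_p <= p <= hi_p:
            lo_v, hi_v = ascending[i], ascending[i + 1]
            # Same proportion as =F11+(.75-H11)/(H10-H11)*(F10-F11)
            return lo_v + (p - lo_p) / (hi_p - lo_p) * (hi_v - lo_v)
    return float(ascending[-1])  # reached only through rounding at p = 1

q3 = interpolate_percentile(counts, 0.75)
q2 = interpolate_percentile(counts, 0.50)
q1 = interpolate_percentile(counts, 0.25)
print(round(q3, 2), round(q2, 2), round(q1, 2))
```

The fraction (p - lo_p)/(hi_p - lo_p) is the proportion of the way from the lower row to the upper row, and the same proportion is applied to the values.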
11
Graphing the percentiles
[Spreadsheet: Columns C and D hold the years and counts; Column E holds the counts ordered with the largest at top; Column F holds the rank from bottom; Column G holds the percentile. The values are the same as on Slide 11.]
[Graph: "Earthquakes >M7, 1970-1999". x-axis: number of EQs in year (5 to 30); y-axis: percentile (0.000 to 1.000, gridlines at 0.250, 0.500 and 0.750).]
To visualize what we have done by interpolation, plot percentile (Col G) against
earthquakes per year (Col E) and see where the 25, 50 and 75 percentiles cross
the data curve. Increase the y-scale to see the crossings more clearly.
12
Graphing the percentiles (2)
[Spreadsheet: Columns B and C hold the years and counts from Slide 4.]

E: percentile   F: number per year   G: percentile (for graph)
1.00            29                   1.00
0.95            25                   0.95
0.90            23.2                 0.90
0.85            22.65                0.85
0.80            21.2                 0.80
0.75            20.75                0.75
0.70            20                   0.70
0.65            18                   0.65
0.60            16.8                 0.60
0.55            16                   0.55
0.50            16                   0.50
0.45            15.05                0.45
0.40            15                   0.40
0.35            15                   0.35
0.30            13.7                 0.30
0.25            11.5                 0.25
0.20            10.8                 0.20
0.15            10                   0.15
0.10            8                    0.10
0.05            7.45                 0.05
0.00            6                    0.00

[Graph: "Cumulated Frequency of EQs >M7". x-axis: Number of earthquakes per year (0 to 30); y-axis: Percentile (0.00 to 1.00).]
Or, instead of interpolating, use Excel’s built-in function to find whatever percentile you want. For example, Column E lists percentiles in increments of 0.05. Column F lists the corresponding numbers of earthquakes per year; the cell equation for F5 is =PERCENTILE($C$3:$C$32,E5). The $-symbols are included so you can copy the formula through the column. Column G repeats Column E so you can easily plot the graph as Column G vs. Column F (End note 7).
How does the graph on this slide differ from the graph on Slide 12?
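A table like Columns E and F can be sketched with Python's standard library. This is an illustration under an assumption: `statistics.quantiles` with `method="inclusive"` (Python 3.8 or later) interpolates between sorted positions in the same way as the percentile column of Slide 10, so it is used here as a stand-in for Excel's PERCENTILE. The dictionary `table` and its integer percent keys are choices made for this sketch.

```python
from statistics import quantiles

counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

# n=20 gives 19 cut points: the 5th, 10th, ..., 95th percentiles.
cuts = quantiles(counts, n=20, method="inclusive")

# Keys are whole percents (0, 5, ..., 100) to avoid float-key surprises.
table = {0: float(min(counts)), 100: float(max(counts))}
for i, value in enumerate(cuts, start=1):
    table[5 * i] = value

for pct in sorted(table, reverse=True):
    print(f"{pct / 100:.2f}  {table[pct]:g}")
```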
13
Key percentiles
[Spreadsheet: Columns B and C hold the years and counts from Slide 4.]

Row   E: percentile   F: number per year   H: central 68.2%   I: central 95.4%   J: central 99.7%
4     0.999           28.884                                                      1.98
5     0.977           26.332                                  1.70
6     0.841           22.389               1.07
7     0.500           16
8     0.159           10
9     0.023           6.667
10    0.001           6.029

Columns H, I and J are expressed in +/- STDEVP units.

[Graph: x-axis: Number Earthquakes per Year (0 to 30); y-axis: Percentile (0.0 to 1.0).]
A normal distribution has
the property that 68.2% of
the values lie within plus or
minus one standard
deviation of the mean (i.e.,
between percentiles of
84.1% and 15.9%).
Similarly, 95.4% and 99.7%
of the values are within +/- two and three standard
deviations, respectively, in
a normal distribution. How
do these standards
compare to the distribution
of earthquakes per year?
Recreate this spreadsheet.
Column E lists the key
percentiles. Column F uses
the PERCENTILE function.
The cell equation for H6 is
=(F6-F8)/STDEVP(C3:C32)/2.
The graph plots Column E
against Column F.
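The H6, I5 and J4 calculations can be sketched in Python. This is an illustration under stated assumptions: the helper `percentile` (a name chosen here) interpolates linearly at position p × (n - 1) in the sorted list, which reproduces the Column F values on this slide, and `statistics.pstdev` stands in for STDEVP.

```python
from statistics import pstdev

counts = [29, 23, 20, 16, 21, 21, 25, 16, 18, 15,
          18, 14, 10, 15, 8, 15, 6, 11, 8, 7,
          13, 10, 23, 16, 15, 25, 22, 20, 16, 11]

ascending = sorted(counts)

def percentile(p: float) -> float:
    """Value at percentile p (0..1), interpolating between sorted entries."""
    pos = p * (len(ascending) - 1)
    lower = int(pos)
    if lower + 1 >= len(ascending):
        return float(ascending[-1])
    frac = pos - lower
    return ascending[lower] + frac * (ascending[lower + 1] - ascending[lower])

def half_width_in_sd(upper_p: float, lower_p: float) -> float:
    """Half the spread between two percentiles, in population-stdev units."""
    return (percentile(upper_p) - percentile(lower_p)) / pstdev(counts) / 2

central_68 = half_width_in_sd(0.841, 0.159)  # H6; 1.00 if normal
central_95 = half_width_in_sd(0.977, 0.023)  # I5; 2.00 if normal
central_99 = half_width_in_sd(0.999, 0.001)  # J4; 3.00 if normal

print(round(central_68, 2), round(central_95, 2), round(central_99, 2))
```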
H6, I5, and J4 would be 1.00, 2.00, and 3.00, respectively, if the earthquakes per year were distributed normally. This is good agreement. The distribution of earthquakes per year is nearly normal.
14
What we have found for 1970-1999
Number of Earthquakes
(M>7) per Year, 1970-1999
• On average there were 16.2 large (M>7) earthquakes per year. (Read this as 162 in ten years if you don’t like the notion of 0.2 earthquakes in a year.)
• The standard deviation was 5.8 earthquakes per year.
• The median was 16, which was also a mode.
• The distribution was bimodal: there were four years with 16 earthquakes and four with 15.
• The maximum number was 29 (1970), and the minimum was 6 (1986), giving a total range of 23.
• Q3 was 20.75 and Q1 was 11.5, giving an interquartile range (Q3-Q1) of 9.25 earthquakes per year.
• The 90th percentile was 23.2, and the 10th percentile was 8.0, meaning the range of the central 80% of the distribution was 15.2 earthquakes per year.
• The central 68.2% of the distribution occurred within +/- 1.1 standard deviations of the mean, and the central 95.4%, within +/- 1.7 standard deviations of the mean. On this basis the distribution of earthquakes per year was nearly normal.
15
End of Module Assignments
Row   B: Year   C: Nbr
3     1940      23
4     1941      24
5     1942      27
6     1943      41
7     1944      31
8     1945      27
9     1946      35
10    1947      26
11    1948      28
12    1949      36
13    1950      29
14    1951      21
15    1952      17
16    1953      22
17    1954      17
18    1955      19
19    1956      15
20    1957      34
21    1958      10
22    1959      15
23    1960      22
24    1961      18
25    1962      15
26    1963      20
27    1964      15
28    1965      22
29    1966      19
30    1967      16
31    1968      30
32    1969      27
Here are the data from the preceding thirty years, 1940-1969, from
the QELP Data Set 039:
http://www.seattlecentral.org/qelp/sets/039/039.html
1. Rewrite the information of Slide 15 replacing the statistics with
numbers appropriate to the 1940-1969 data. Use bullets in the
same order so that your answers are easy to grade. If you can,
use a PowerPoint slide (simply copy and paste Slide 15 into a
new presentation, delete the data set, and change the numbers in
the bulleted items appropriately).
2. Hand in spreadsheets for Slides 6, 8, 12 and 13 for the new data
set. Include the graphs for the last two.
3. Answer the questions in the green boxes of Slides 5, 11 and 13.
4. Find the mean, median, standard deviation and quartiles for the
sixty years of data. Find the +/- number of standard deviations
for the central 68.2%, 95.4% and 99.7% of the sixty years of data
(Slide 14).
5. Based on the information in this module:
a. What is the standard deviation and why is it important?
b. What are quartiles and why are they important?
c. What are percentiles and why are they important?
16
End notes
1. Home page: http://earthquake.usgs.gov/regional/neic/. Return to Slide 2.
2. For earthquake facts and statistics see the USGS NEIC Website: http://neic.usgs.gov/neis/eqlists/eqstats.html. Return to Slide 3.
3. How large is an 8.7-magnitude earthquake compared to a magnitude-5.8 earthquake? See the USGS Website: http://earthquake.usgs.gov/learning/topics/how_much_bigger.php. Return to Slide 3.
4. Experiment also with the function COUNTBLANK(array). Return to Slide 5.
5. As variations on the COUNTA(array) theme, there are also the following built-in functions: VARA(array), VARPA(array), STDEVA(array), STDEVPA(array). Return to Slide 6.
6. There are also MAXA(array) and MINA(array). Return to Slide 7.
7. Or if you want to reverse the axes (i.e., plot x vs. y, instead of y vs. x), here is one way of doing it. Plot y vs. x as usual. Right-click on the gray area of the graph. Select “Source data …” from the popup window. Select the “Series” tab. Click on the small icon to the right of the X-values. The address of your x-series will appear in a long, skinny window and a shimmering outline will appear around the series block in the spreadsheet. Change the address in the skinny window to that of the Y-values. One easy way of doing that is to use the mouse to outline the block of the Y-values on the spreadsheet. Notice how the address in the skinny window changes as you move the mouse down the column. When the block is the size you want, release the mouse button and hit enter. Repeat for the Y-values, replacing their address with the address of the old X-values. Rescale and retitle the axes. Return to Slide 13.
17