Common Crime Analysis Formulas

Part 3 ♦ Tools of Crime Analysis
Mean, Median, and Mode. The mean is the
“average” of a group of numbers. Analysts
are often required to provide average
response times for emergency calls, average
number of crimes per month, average length
of time on a call, and so on. The mean is
easily calculated in any spreadsheet
application using built-in functions. However,
to manually calculate the mean, divide the
sum of all numbers by the number of data
points. Figure 18-3 illustrates the mean of
a group of emergency
response times. Notice how the mean is
skewed because one of the emergency
response times appears problematic—a
response time of more than 100 minutes. For
this example, perhaps the median response
time might describe our data better.
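As a rough illustration outside the spreadsheet, the same calculation can be sketched in Python. The response times below are hypothetical values chosen for this sketch (they are not the data behind Figure 18-3); the last one plays the role of the problematic call.

```python
from statistics import mean, median

# Hypothetical response times in minutes (assumed for illustration only);
# the last value is the outlier.
response_times = [4, 5, 6, 7, 8, 120]

# Manual mean: the sum of all numbers divided by the number of data points.
manual_mean = sum(response_times) / len(response_times)

print(manual_mean)              # 25.0
print(median(response_times))   # 6.5
```

With these assumed values, a single long call pulls the mean up to 25 minutes even though five of the six calls took 8 minutes or less, while the median stays at 6.5.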
Crime Analysis Formulas
The spreadsheet provides one of the most
robust and efficient means to perform many
of the common crime analysis-related
calculations. This section introduces many of
those calculations and provides examples of how
to perform them.
Common Crime Analysis Formulas
Percent Change. Percent change is perhaps
one of the most widely used statistics in
crime analysis, and it is easy to
calculate using a spreadsheet application.
Percent change expresses the difference
between two values as a percentage of the
earlier value. This is useful in crime
analysis to indicate whether
crime is going up (positive percentage) or on
the decline (negative percentage). The
formula is =(B1-A1)/A1 where B1 contains
the most recent frequency and A1 contains
the “historical” data. So, using this formula to
calculate the percent change from year 2000
to 2001, where 2001 experienced 100 crimes
and 2000 experienced 50 crimes, would result
in a value in cell A3 of 100% (assuming you
formatted cell A3 as a percent). An example
of this calculation is provided in Figure
18-2.
So, to calculate percent change, simply
subtract the old value from the new value,
and then divide by the old value. Use the
percent button to format the result as a
percentage.
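The same arithmetic can be sketched in Python as a check on the spreadsheet result. The function name percent_change is our own, and the guard against an old value of zero is an added assumption, since the spreadsheet formula would return a division error in that case.

```python
def percent_change(old, new):
    """Return the change from old to new as a fraction of old."""
    if old == 0:
        # Assumption for this sketch: treat a zero baseline as an error.
        raise ValueError("percent change is undefined when the old value is 0")
    return (new - old) / old

# 50 crimes in the earlier year, 100 in the later year: an increase.
print(f"{percent_change(50, 100):.0%}")   # 100%

# 100 crimes in the earlier year, 50 in the later year: a decrease.
print(f"{percent_change(100, 50):.0%}")   # -50%
```

Note that the two results are not mirror images: doubling is a 100% increase, but halving is only a 50% decrease, because the divisor is always the old value.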
Figure 18-3: Calculating the mean
Figure 18-2: Calculating percentage change
Chapter 18 ♦ Spreadsheets
The median is the “middle” number of a
group of sorted cases. For instance, 5 is the
median number for the following group of
cases:
1, 1, 1, 4, 5, 7, 7, 10, 202
The number 5 more accurately describes the
middle for our group of cases and is not as
subject to the pitfalls of outliers such as the value
“202.” When a group of cases contains an
even number of data points, the median is
the average of the two middle data points.
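A minimal Python sketch of the median rule, using the group of cases above. The function name median_of is our own; the even-count example simply drops the last case from the same group.

```python
def median_of(values):
    """Return the middle value of the sorted cases, or the average of the
    two middle values when the number of cases is even."""
    if not values:
        raise ValueError("median of an empty group is undefined")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

cases = [1, 1, 1, 4, 5, 7, 7, 10, 202]
print(median_of(cases))        # 5   (nine cases: the fifth value)
print(median_of(cases[:-1]))   # 4.5 (eight cases: average of 4 and 5)
```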
If no value is repeated, there is no mode. If
more than one value occurs with the same
greatest frequency, each value is a mode.
Data sets containing two modes are called
bimodal. Data sets containing more than two
modes are called multimodal. (See Chapters
10 and 13 for more on central tendency.)
Text Manipulation Formulas
Two of the most often used crime analysis
text manipulation functions in Excel are
concatenation and parsing. Concatenation is
the process of merging the values in
multiple cells into one cell. Parsing is the
process of splitting a single value in one
cell into separate values in multiple cells. An
example of concatenation would be joining
several cells containing components of an
address (street number, direction, street
name, street suffix) into one cell containing
the entire address. An example of parsing
data might be separating a string of words
delimited by commas into several individual
cells containing the data.
Figure 18-4: Calculating the median
The mode is the value that occurs most often
in a series of numbers. The mode for our
previous group of data points would be “1”
because it occurs more often than any other
number—3 times in all.
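The mode can also be sketched in Python. This version follows the convention used in this chapter that a group with no repeated value has no mode; the function name modes is our own, and the second and third groups are made-up values for illustration.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the greatest frequency,
    or an empty list when no value is repeated."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # No value is repeated, so there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 1, 1, 4, 5, 7, 7, 10, 202]))  # [1]
print(modes([2, 2, 9, 9, 5]))                 # [2, 9] (bimodal)
print(modes([3, 6, 8]))                       # []
```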
Parsing. Parsing is the process of taking text
stored in one variable separated by some
delimiter and separating the text into
individual fields based on the delimiter.
Parsing is most often used in crime analysis
to separate a single address field into separate
fields. The following illustration shows a
single variable address and variables
containing its parsed text.
There are various ways to perform parsing in
spreadsheet applications. One of the quickest
is Excel’s “Text to Columns” function. Here is
a brief outline of how to use the “Text
to Columns” function to perform parsing.
Before using this feature, make sure the
column immediately to the right of the
column you wish to parse is empty, since the
parsed values are written to the right.
Figure 18-5: Calculating the mode
Figure 18-6: Parsing a single field into multiple fields
Figure 18-8: An address field after parsing out the apartment
1. Select the data you want to separate (or
select the column).
2. Go to the Data | Text to Columns menu.
3. Check the “Delimited” option button.
Click “Next.”
4. Select “Space” as the delimiter.
5. Click “Finish.”
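For readers who want to see the idea outside Excel, the steps above can be roughly sketched in Python. The sample addresses are made up for illustration. The second half shows the same approach with a different delimiter, the “#” sign, which often marks the start of an apartment or suite number.

```python
# Hypothetical address value, assumed for illustration.
address = "123 E Main St"

# Split on spaces; split() with no argument treats a run of spaces
# as a single delimiter.
fields = address.split()
print(fields)    # ['123', 'E', 'Main', 'St']

# Using "#" as the delimiter separates the street address from the unit.
full_address = "123 E Main St #4B"
street, _, unit = full_address.partition("#")
street = street.strip()   # drop the space left before the "#"
print(street)    # 123 E Main St
print(unit)      # 4B
```

Each element of fields corresponds to one of the new columns that “Text to Columns” would fill.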
Of course, data can be parsed using any of
the other delimiters. You may wish to delimit
your address field by the “#” sign. Often, the
# sign is used to indicate the beginning of an
apartment or suite number. By using this
delimiter, you will separate your address from
the apartment number, resulting in a separate
address field and apartment/suite field.
Figures 18-7 and 18-8 illustrate this example.
Concatenation. Conversely, concatenation
takes several values stored in separate
variables and combines their values to create
a single value. Again, most often in crime
analysis the need to concatenate values is
necessitated by the address variable.
The concatenation process is similar to other
functions performed in Excel or other
spreadsheet applications in that we build a
function to perform this process.
Notice how the concatenation formula
displayed in the formula window contains
double quotes (" "). It is necessary to include
a space between our variable values; otherwise
the resulting value would look like this:
“123EMainSt”. Unfortunately, our data
sometimes contains spaces at the end of a
variable value and sometimes does not. The
resulting concatenation might then have
multiple spaces between values. In this case,
another function may be necessary to “clean”
our final address value. With the use of the
“TRIM” function, we can remove extra
spaces between words, leaving one space as a
separator.
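As a rough Python sketch of the same idea, the function below joins address components with a single space and then cleans up extra spaces in the way Excel's TRIM function does. The name concat_address and the sample values are our own. One difference to keep in mind: Python's split() collapses any whitespace (tabs included), whereas TRIM deals with the space character only.

```python
def concat_address(*parts):
    """Join address components with spaces, then collapse extra spaces."""
    joined = " ".join(parts)
    # TRIM-like cleanup: drop leading and trailing spaces and reduce
    # runs of spaces between words to a single space.
    return " ".join(joined.split())

# Without a separator the components run together.
print("".join(["123", "E", "Main", "St"]))          # 123EMainSt

# Clean components.
print(concat_address("123", "E", "Main", "St"))     # 123 E Main St

# Components carrying stray spaces still produce a clean address.
print(concat_address("123 ", "E", " Main ", "St"))  # 123 E Main St
```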
Figure 18-9: An address field concatenated from four separate fields
Table 18-1 provides additional Microsoft
Excel text manipulation formulas commonly
used by crime analysts.
Figure 18-7: An address field before parsing out the apartment