Drawn to Scale
May 1, 2005 12:00 PM
By Steve Meng
As the population grew and became more mobile, volunteer departments could not attract enough members
to provide adequate service. To expand, paid crews were put in the volunteer stations. Unfortunately, these
stations were built mostly on donated land within the boundaries drawn up by the volunteer departments
before the population growth and, as such, were rarely in the most useful location. As steady growth
continued, what was already a significant problem became worse each year.
Sound familiar? That was the situation facing Hillsborough County (Fla.) Fire Rescue. The chief officers knew
that response time was inadequate and that more stations and firefighters were needed.
Because the department was part of a large county government, requests for expansion were sent to
professional administrators. The fire chief knew that there was a serious problem, but could not simply
assert that expansion was needed. Instead, the chief set out to demonstrate, in concrete terms, how
inadequate response times were and how badly the department needed to invest in stations and personnel
to improve them.
INITIAL CHALLENGES
The first part of defining a performance problem is to have a standard for performance. While there are
staffing guidelines based on population and other criteria, these guidelines are abstract. Telling an elected
official that NFPA guidelines specify that the department should have x number of staff on duty is not
convincing. Requests for expansion based on population growth and increased call load weren't achieving
results either.
Several years prior to the effort to expand, the county commission set performance goals for the
department. These standards included a five-minute response time to 80% of urban incidents, a 10-minute
response time to 90% of rural incidents, and an ALS ambulance on scene within eight minutes on 90% of all
ALS-level medical incidents. Measuring performance against these standards would be the key to obtaining
the commitment to expand services.
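These goals can be expressed as simple data against which observed performance is checked. The sketch below is illustrative only; the structure and names are assumptions, not the county's actual format.

```python
# The commission's goals from the article, expressed as data. This structure
# is an illustrative assumption, not the county's actual format.
goals = {
    "urban": {"minutes": 5, "target": 0.80},   # 5-minute response, 80% of urban incidents
    "rural": {"minutes": 10, "target": 0.90},  # 10-minute response, 90% of rural incidents
    "als":   {"minutes": 8, "target": 0.90},   # ALS ambulance in 8 minutes, 90% of ALS calls
}

def in_compliance(area, observed_rate):
    """Compare an observed compliance rate against the goal for an area."""
    return observed_rate >= goals[area]["target"]

print(in_compliance("urban", 0.28))  # False: far short of the 80% goal
```

Stating performance this way makes the gap between goal and reality explicit, which is exactly the argument the chief needed to make.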
Department leaders knew some performance data through research on individual incidents. While the
department could rely on such information for internal use, it could not be presented to elected officials.
What it eventually took to convince the elected officials was the chief being able to say that the department
met its response time goals only 28% of the time in urban areas and 68% of the time in rural areas. To
make these determinations, the department needed to analyze a large amount of data.
The department received a daily printout of CAD records, which then were copied and distributed to various
offices. The department also had an electronic incident-reporting system, but it only included fire reports.
Further, these were all keyed by reporting officers, leading to accuracy problems. The department also had
paper reports for all patient encounters.
Reports from the CAD system would have been the best source of data for several reasons. First, the CAD
log is a comprehensive record of the department's activity. Second, accuracy is excellent because most of
the data of interest is entered by the system itself in the form of time stamps, and the address comes from
Caller-ID information. Finally, the data resides in a commercial database system, which can easily be
queried for a multitude of statistics.
However, political issues prohibited access to that data electronically. The department's dispatch services
were provided by an outside agency, which was not willing to allow access to the CAD system and would
only provide limited reports to the department. Further, accurate assessment of performance required being
able to manipulate data and screen out errors, which was beyond the ability of the dispatch department.
To solve this problem in the short term, the department built a dispatch-log database. The dispatch agency
agreed to begin exporting the daily run report in an Excel format, which the department retrieved and
imported into the database system. For a long-term solution, the department bought a records-management
system with a CAD link. Numerous issues, however, prevented this system from coming online in time, so
the dispatch-log database was used extensively in analyzing response time, interfacility transport
performance and a host of other issues.
EXPANSION EVIDENCE
Since a statistical analysis was the basis for the case to expand services, it's important to understand some
statistical concepts. For example, we may be interested in an average response time or average compliance
with a goal, such as response within five minutes. In a statistical analysis, the typical method of
accomplishing this is by selecting a sample of incidents, analyzing those, and using that to estimate the true
average for all incidents.
Using a large sample, such as analyzing all the incidents in a two-year period, would require extensive work.
Using a smaller sample results in a less accurate estimate of the true average, but it is far easier to
accomplish. There are statistical formulas that determine what sample size should be used and what the
margin of error would be for using a given sample size.
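One such formula is the margin of error for an estimated proportion (such as a compliance rate) at roughly 95% confidence. The sketch below is illustrative; the function name and sample figures are assumptions, not the study's actual calculations.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error for an estimated proportion p from a sample of n,
    at roughly 95% confidence (z = 1.96)."""
    return z * math.sqrt(p * (1 - p) / n)

# With 1,000 sampled incidents and an observed 29% compliance rate, the
# estimate is accurate to within about +/- 2.8 percentage points.
print(round(margin_of_error(0.29, 1000) * 100, 1))  # 2.8
```

Note how the margin shrinks only with the square root of the sample size, which is why a sample of roughly 1,000 records buys most of the available accuracy.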
Within the sample of data, it's important to understand that averages can be skewed by extraordinarily large
or small values. For this study, such a situation could arise where a unit may neglect to report “on scene”
when it arrives, or the dispatcher may not enter that time. In such cases, it's likely that the “on-scene” time
in the data sample will be inaccurate by a large amount. It's important to “scrub” the data to remove errors
such as these. This study determined a reasonable travel time to each geographical zone being studied, and
discarded times that varied from that amount by more than 50%.
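A minimal sketch of that scrubbing rule, with invented zone names and travel times:

```python
# Expected travel times per geographic zone, in minutes. The zones and
# values here are invented for illustration.
expected = {"Zone 12": 4.0, "Zone 13": 6.0}

def scrub(records):
    """Keep records whose travel time is within 50% of the zone's expected time."""
    return [(zone, minutes) for zone, minutes in records
            if abs(minutes - expected[zone]) <= 0.5 * expected[zone]]

sample = [("Zone 12", 4.5), ("Zone 12", 11.0), ("Zone 13", 5.0)]
# The 11-minute record is discarded as a likely missing-timestamp error.
print(scrub(sample))  # [('Zone 12', 4.5), ('Zone 13', 5.0)]
```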
A further issue is that not all responses are answered with units using lights and sirens. In this jurisdiction,
controlled burns and some other incidents were responded to in non-emergency mode. For ambulance
response time, some incidents were initially dispatched to the private BLS providers, and the department's
rescue units responded after the failure of the private providers or the on-scene company upgraded the
incident. Each of these situations would skew the estimates of performance and had to be removed from the
samples.
The methods employed for this study would not be sufficiently accurate to qualify for the nightly news,
where sample size and margin of error are always reported, nor for a college statistics course. Making a
statistically sound study for these purposes requires significantly more staff hours than were available. For
our purposes, one year's worth of data was selected, then obvious errors were scrubbed. For each of the
response time studies, a sample set was built that included at least 1,000 records. That was a sufficiently
large sample for this study.
DATABASE DESIGN
A CAD system will have a table used for all incidents that will contain fields like incident address, calling
party, alarm time, call type and operator narrative. There usually will be several fields that have an incident
number: the CAD system's identifying number and usually a department's incident number. So all you have
to do is find this table, and you'll have the answer, right? Not quite.
Continuing with the CAD system example, many incidents will have more than one unit responding. The
CAD system stores one record for the incident and a separate record for each unit that responds. If this
were being done in Excel, one might repeat all the basic call information on a row for each responding unit;
however, this would constitute poor database design.
The database will have a table for incidents and a table for units. The units table can have one or more
records for each incident, depending on how many units respond. The units table would typically have the
alert time for each unit, their arrival times, and their available times. The units table will have the CAD
system's identifying number in it so it can be linked to the appropriate incident table record. CAD or
records-management systems have many tables, but the incident and unit tables hold the answers to questions
about performance. Since the data needed for performance analysis resides in at least two different tables,
we must frequently use tools to assemble the data. (See “Common Software Eases Data Crunch,” below.)
This brings up another type of database object, a view. Views, which are similar to Microsoft Access queries,
display table data in particular formats. Views may assemble information from an incident table and a unit
table to present a set of records with information from both tables. A view may be used to extract particular
information, such as the incident type codes from an NFIRS 901 table.
If an appropriate view isn't available, users interested in analyzing data must use joins to extract the
necessary data. In the process of extracting data, users would select the two (or more) tables containing
the data necessary, then specify how the tables are related. The units table, for example, has a record
identifier that identifies which record in the incidents table that record belongs to. If five units respond to
incident number 12345, there will be one record of the incident in the incidents table, and five unit records
with that incident number in the units table. Using the Excel database query tools, it is an easy matter to
specify the tables and their matching records.
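The table, view and join ideas above can be sketched with SQLite in Python. The table and column names are simplified illustrations, not the actual CAD schema.

```python
import sqlite3

# A minimal sketch of the one-to-many incident/unit relationship described
# above. The schema and data are invented illustrations, not the real CAD system.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE incidents (incident_no INTEGER PRIMARY KEY, address TEXT, alarm_time TEXT);
CREATE TABLE units (incident_no INTEGER, unit TEXT, arrival_time TEXT);
-- A view, like an Access query, presents joined data in a convenient format.
CREATE VIEW response_log AS
    SELECT i.incident_no, i.address, i.alarm_time, u.unit, u.arrival_time
    FROM incidents i JOIN units u ON u.incident_no = i.incident_no;
""")
db.execute("INSERT INTO incidents VALUES (12345, '100 Main St', '14:02:10')")
db.executemany("INSERT INTO units VALUES (12345, ?, ?)",
               [("E1", "14:07:01"), ("E2", "14:08:45"), ("L1", "14:09:30"),
                ("R1", "14:06:40"), ("BC1", "14:11:05")])
# One incident record, five unit records; the join yields five rows.
rows = list(db.execute("SELECT unit, arrival_time FROM response_log"))
print(len(rows))  # 5
```

The join repeats the incident data for each unit row only in the output, not in storage, which is exactly why the database keeps the two tables separate.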
GETTING THE DATA
It will be important to filter the records down to a manageable number and remove non-emergency responses. A few steps would be necessary to achieve the desired results. For this department's
study, only the census tracts in the urban area or rural area, depending on the desired study, would be
used. Then the records would be examined, and the records where the response time seems unduly long
(indicating an error as discussed earlier) would be rejected. Once clean data from the desired service area is
all that remains, a few more formulas will yield the desired numerical results. (See “RMS, Excel Aid in
Response Time Study,” below.)
For each row of data, a logical test is needed to determine whether the response time meets the goal. In our
case, 29 of the 100 incidents studied met the response time goal.
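That per-row test and the resulting compliance rate can be sketched in Python, with invented response times:

```python
from datetime import timedelta

# Per-row logical test: does each response meet the five-minute goal?
# The times below are invented for illustration.
goal = timedelta(minutes=5)
times = [timedelta(minutes=4, seconds=30),
         timedelta(minutes=6, seconds=10),
         timedelta(minutes=4, seconds=55)]
met = sum(1 for t in times if t <= goal)
print(f"{met} of {len(times)} met the goal")  # 2 of 3 met the goal
```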
Using similar methods, several different performance dimensions were produced. Rural response time and
Advanced Life Support unit response time also were analyzed for the elected officials. Advanced Life Support
unit arrival for interfacility transport was analyzed for compliance with a regulatory organization's
requirements.
Learning to analyze data in this manner takes some education and some effort. The first step is to have an
advanced knowledge of Microsoft Excel, and at least an intermediate knowledge of Microsoft Access. A good
course in Microsoft Access will cover table design, which will help users understand the structure of the CAD
or RMS database. Advanced courses in Microsoft Excel cover database work in Excel, which is an essential
skill to produce these statistics. The final skill is learning how to use Access and Excel to get data from
the CAD or records-management system. Vendors typically offer classes on this topic. While they are
expensive, the results are significant. Finally, networking with other users of the CAD or records-management systems is a good way to expand skills.
While this study was instrumental in securing the initial commitment to expand services, there will be an
ongoing need to justify the expansion and answer questions about where to place new stations. The
department embarked on a follow-up study, using samples selected by the same method, to determine
performance and need for each square mile of the service area. An outside consulting firm was engaged
because it was already involved in studies of where to place future schools.
To complete the school site recommendations, the firm had analyzed future growth by using building permit
requests, zoning requests and land-use planning data to project future population levels. The consultants
used the fire department data samples to determine response times to most of the square-mile zones in the
service area. (Some areas did not have adequate sample sizes to estimate a response time.) The historical
number of responses to each zone was determined. Historical trends in the number of responses and future
growth projections were used to estimate call loads for most areas up to 20 years into the future.
FINAL RESULTS
The result of this study was a sophisticated model of the service area, from which the department could
determine the best placement of future stations to gain the biggest possible response time improvements.
This will be invaluable in achieving the best fire and medical protection improvements for each dollar spent.
“Bang for the buck” is what elected officials understand. Reducing response times by the largest amount for
the largest number of citizens is the goal of the fire department. Using sophisticated statistical analysis and
modeling techniques outlined herein, Hillsborough County Fire Rescue was able to align these goals and
land a commitment to expand the department by 25% over five years.
Steve Meng currently serves the Arvada (Colo.) Fire Protection District. Prior to that, he was a
battalion chief with Hillsborough County Fire Rescue in Tampa, Fla. In his last four years with
that department, he served as a management systems analyst, where he conducted these
statistical analyses and was the project manager for Aether FireRMS implementation. He has
since formed Front Range Data Management Services to provide fire service consulting and
training in technology issues.
COMMON SOFTWARE EASES DATA CRUNCH
Many CAD systems and most records-management systems are based on Microsoft SQL Server, a database
system designed for large amounts of storage. Systems based on SQL Server provide easy access to the
data by using readily available reporting packages or Microsoft Office applications. Larger CAD systems may
be based on Oracle or other databases, but even these systems typically make the data readily available for
custom reports.
There are several tools available to extract and analyze data. The most common tool sold to create custom
reports from these systems is Seagate Crystal Reports, which can report on large amounts of data and is
well-suited for these databases. Getting Crystal to connect to the database takes a few steps, but then
report design is very easy. Crystal Reports, however, was not a good choice for our study. Incident records
needed to be analyzed and filtered in ways that Crystal Reports can't accomplish.
Microsoft Excel is one tool that was used to extract and analyze data. Excel has several advantages over
Crystal Reports. First, Excel is available on most office computers. Second, Excel has the capability to easily
extract records from SQL Server or other databases for analysis. Third, once records are imported into
Excel, it has significant database capabilities of its own, such as sorting and filtering. Finally, Excel's charting
capabilities are essential to be able to present the results.
The second tool used to analyze performance data is Microsoft Access, a desktop database application that
does on a smaller scale what SQL Server and its competitors do on a larger scale. While Excel is an
important tool and can perform many database functions, there were some analyses that required
capabilities beyond what it could perform. For the analysis of response times, it was necessary to distinguish
between urban and rural service areas. This required matching incidents against a separate list of urban and
rural areas, which was more easily accomplished using Access. Once data was analyzed and manipulated in
Access, however, it was always imported into Excel to create the charts used in presentations.
RMS, EXCEL AID IN RESPONSE TIME STUDY
For this discussion, examples will be shown to retrieve data from Aether FireRMS using Microsoft Excel. The
example presented here would be to determine average response time for the first unit on scene. In this
case, all the data necessary is located in a table called Incident. To retrieve this data, users would open
Excel, create a new, blank worksheet, then choose Data, Import External Data, New Database Query. A
dialog box appears asking the user to choose a data source.
For Aether FireRMS, there is a data source in the list called rms5sql_dsn; other vendors may have different
methods for accessing the data. After selecting the data source and choosing OK, a login dialog box
appears. You will need a valid login name for the SQL Server database. FireRMS provides an account called
FireRms5_reporter, which has a blank password. Leave the password blank and choose OK. A dialog box
then appears for selecting the desired tables and columns.
For this query, select the Incident table, then the IncidentType, AlarmDate, ArrivalDate, and CensusTract
columns. The two date fields, AlarmDate and ArrivalDate, store both date and time. The IncidentType will
be used to select desired records in a subsequent step. The CensusTract is useful to determine whether the
incident was in a rural or urban location. The AlarmDate and ArrivalDate are used to determine response
time. Once the selections have been made, choose Next.
A dialog box now appears to allow filtering records. It will be important to filter the records down to a
manageable number and remove non-emergency responses. An example might be to select incidents from
June 2003, limiting the incident type to NFIRS codes in the 300-399 range (medical calls), to ensure that
only emergency responses are analyzed. First, select IncidentDate and fill in the June 2003 date range;
next, select IncidentType and complete the filter for the 300-399 code range.
Three more dialog boxes appear. The first offers to sort the results. The second offers to return the
data to Microsoft Excel, edit the results in Microsoft Query or save the data as an OLAP cube. The third asks
what cell to return the data to. Choose Next, Finish, then OK to accept the default answers for these
dialogs. Microsoft Excel now will have a worksheet with the specified fields from the incident database.
A simple formula entered in the Column E cells will determine the response time. In cell E2, enter the
formula =C2-B2. Then format the cell: in the Format Cells dialog box, select the Number tab, choose
Custom in the category list, and enter [h]:mm:ss in the Type box. After copying cell E2 down the
column, each row will show its response time.
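The =C2-B2 subtraction is the same arithmetic as subtracting two timestamps; a sketch with invented times:

```python
from datetime import datetime

# Each response time is the arrival timestamp minus the alarm timestamp,
# just like =C2-B2. The timestamps below are invented examples.
alarm = datetime(2003, 6, 14, 14, 2, 10)    # AlarmDate
arrival = datetime(2003, 6, 14, 14, 7, 55)  # ArrivalDate
print(arrival - alarm)  # 0:05:45
```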
A few steps would be necessary to achieve the desired results. For this department's study, only the census
tracts in the urban area or rural area, depending on the desired study, would be used. Then the records
would be examined, and the records where the response time seems unduly long (indicating an error) would
be rejected. Once clean data from the desired service area is all that remains, a few more formulas will yield
the desired numerical results.
In the cells below the study, enter the labels “Total Responses for Study,” “Response time less than 5
minutes,” “Response time over 5 minutes,” and “Test response time.” In column D of the row containing
“Total Responses for Study,” enter the formula to count the number of responses: =COUNTA(A2:Axxx),
where xxx represents the last row of data. The COUNTA function counts the number of non-blank cells in
the specified range. In column D of the row containing the label “Test response time,” enter the value 0:05,
then format the cell as described earlier (or use the Format Painter).
For each row of data, a logical test is needed to determine whether the response time meets the goal. In
cell F2, enter the formula =IF(E2<$D$xxx, "Yes", "No"), where xxx is the row of the target time 0:05
entered earlier. The dollar signs ($) make the reference absolute, so the copied formula always refers to the
test value. Then copy that formula down to the remaining rows of data in the worksheet. Excel will evaluate
each row and enter a “Yes” in column F where the response time is within the limit.
Finally, returning to the bottom of the worksheet, two more formulas will finish the job. In column D of the
row containing the “Response time less than 5 minutes” label, enter the formula =COUNTIF(F2:Fxxx, "Yes"),
where xxx represents the last row of data. (The test values are in column F, where the IF formula placed
them.) The COUNTIF function returns the number of cells that meet the criterion, in this case the cells
containing Yes. In the next cell down, enter a formula that subtracts the number of “Yes” responses from
the total responses, giving the count of responses over five minutes.
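The worksheet summary can be sketched in Python, with invented sample times:

```python
from datetime import timedelta

# COUNTA counts the responses, the IF column flags each row, and COUNTIF
# totals the "Yes" flags. The sample times below are invented.
test_value = timedelta(minutes=5)
times = [timedelta(minutes=4, seconds=12), timedelta(minutes=5, seconds=40),
         timedelta(minutes=3, seconds=58), timedelta(minutes=7, seconds=5)]
flags = ["Yes" if t < test_value else "No" for t in times]  # the column F test
total = len(times)          # =COUNTA(A2:Axxx)
under = flags.count("Yes")  # =COUNTIF(F2:Fxxx, "Yes")
over = total - under        # total minus the "Yes" responses
print(total, under, over)  # 4 2 2
```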
The results are that the department, in the 100 incidents studied, meets the response time goal in 29% of
the incidents. This should be a number that an elected official would react to. Pictures, however, are always
more valuable than words.
A simple pie chart of the results of this study is produced by selecting (in this case) cells A104 through
A105, then holding the CTRL key and selecting D104 through D105, then clicking the Chart icon in the
Excel toolbar. In the first dialog box, select the Pie chart type, then choose Next. Simply choose Next, Next,
and Finish to accept the default responses to the remaining dialog boxes, and Excel produces a pie chart
perfect for a city councilor.