
FirstDIG
First Data Investigation on the Grid
Paul Graham, Terry Sloan, Adam Carter
EPCC
Ian Gregory, Darren Unwin
First South Yorkshire
tel:+44 (0)131 650 5155
email:[email protected]
Description
First plc - UK’s largest public transport operator
Data sources
Huge range – mileage, revenue, fuel, maintenance,
routes …
Collected – manually, ticket machines, GPS …
Disparate DBMS
Acquisitions, historical, OS, physical location, representation …
Issues NOT unique to the bus industry
Fine for day to day operations, but …
Business questions – data from >1 source
Complaints vs Lateness, Revenue vs Lost Miles …
Aggregation – by service, by day, weekdays only …
Introduces challenges for data analysis
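As a minimal sketch of the aggregation options listed above (by service, by day, weekdays only), the following Python groups and sums records. The record layout, the service names, the figures and the `aggregate` helper are all hypothetical, chosen only for illustration; none of it is FirstDIG data or code.

```python
from collections import defaultdict
from datetime import date


def aggregate(records, key, weekdays_only=False):
    """Sum record values grouped by `key`, which is "service" or "day"."""
    index = {"service": 0, "day": 1}[key]
    totals = defaultdict(float)
    for record in records:
        day = record[1]
        # date.weekday() returns 0-4 for Monday-Friday and 5-6 for the weekend.
        if weekdays_only and day.weekday() >= 5:
            continue
        totals[record[index]] += record[2]
    return dict(totals)


# Hypothetical sample rows: (service, day, lost miles).
lost_miles = [
    ("52", date(2002, 4, 1), 12.5),   # Monday
    ("52", date(2002, 4, 6), 4.0),    # Saturday
    ("X78", date(2002, 4, 1), 30.0),  # Monday
    ("X78", date(2002, 4, 2), 7.5),   # Tuesday
]

by_service = aggregate(lost_miles, "service", weekdays_only=True)
by_day = aggregate(lost_miles, "day")
```

The same grouping would have to be applied to each source before two measures such as complaints and lost miles can be compared, which is part of why questions spanning more than one source are harder to answer.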
Description
First South Yorkshire situation
No common interface
No common reporting process
Statistics produced manually when
required
Labour intensive
Not performed often or well
Process to produce what is needed
Expensive
Impractical
Description and Aims
Open Grid Services Architecture: Data Access
and Integration
Assists with the access and integration of data from
separate data sources via Grid Services
Our remit: To evaluate the suitability of OGSA-DAI for
use in a commercial environment. If OGSA-DAI:
Is appropriate, secure, straightforward to deploy and use …
Does what we need!
Provide feedback to OGSA-DAI team
Aims
1. Demonstrate deployment of OGSA-DAI within the First South
Yorkshire bus operational environment and learn from it
2. Short data analysis using OGSA-DAI service enabled data sources
to answer business questions posed by First South Yorkshire
Status: Workpackages
WP 1: Data Source requirements capture (FINISHED)
D1.1 Data Source Requirements Capture & D1.2 Organisation Data
Schema (COMMERCIAL-IN-CONFIDENCE)
WP 2: Development of data interfaces (FINISHED)
OGSA-DAI Deployment
WP 3: Deployment & refinement of OGSA-DAI (FINISHED)
First Data Service Browser User Guide
First Data Service Browser Software
WP 4: Data mining requirements capture (FINISHED)
D4.1 Data Mining Requirements Capture (COMMERCIAL-IN-CONFIDENCE)
WP 5: Initial data mining analysis (FINISHED)
D5.1 Initial Data Mining Report (COMMERCIAL-IN-CONFIDENCE)
WP 6: Data mining detailed analysis (FINISHED)
D6.1 Final Data Mining Report (COMMERCIAL-IN-CONFIDENCE)
Technical Achievements 1
Data Mining
Combined two databases to answer First’s
business questions
The Customer Contact System
Microsoft Access
Information on customer complaints e.g. time, service, nature
The Mileage database
dBASE IV
Information on bus mileage e.g. lost miles
Also investigated Revenue and Schedule
Adherence suitability for data mining
Produced detailed data mining report
Technical Achievements 2
OGSA-DAI deployment at First South Yorkshire
Created Grid Data Services for DBMS previously
unsupported by OGSA-DAI
MS Access – CCS, dBASE IV – Mileage
Investigated GDS for SQL Server and CSV-based
DBMS
Rigorously exercised use of OGSA-DAI in a
commercial setting:
Identified numerous areas for improvement in OGSA-DAI
Identified new requirements for use of OGSA-DAI in
business
Confirmed the relevance and potential of OGSA-DAI for
business
Technical Achievements 3
Data Service Browser
Identified need to aid ‘ease of use’ for OGSA-DAI
Middleware
Developed a generic Grid Data Service Browser
Simple GUI – avoids XML etc
Allows SQL queries and updates to databases
Enables JOIN queries across databases
Will be included in future OGSA-DAI releases
… demo later
Achievements – First’s perspective
Project has proven that:
There is a cost-effective solution that First
South Yorkshire can utilise
First can get to its data and analyse it in a
useful manner
With considerably reduced labour time First
can produce more accurate and more wide-ranging
information for the business management
Achievements
“the results of this exercise will
revolutionise the way we do things in the
bus industry”
Darren Unwin
Divisional IT Manager
Dissemination
Presentations
Ernst & Young, WestInfo Services, Strategy & Performance Associates,
SingTel Optus, Executive Briefing Centre, Curtin Business School,
Curtin University of Technology, Perth Australia, February 24th, 26th,
2004.
Curtin Business School Information Systems Seminar, Curtin
University of Technology, Perth, Australia, February 20th 2004
UK e-Science booth, Supercomputing 2003, Phoenix, USA, November
2003
Flyers
UK e-Science All Hands Conference, Nottingham, UK 2-4 September
2003
Posters
UK e-Science All Hands Conference, Nottingham, UK 2-4 September
2003
Articles
T.M.Sloan, A.Carter, P.J.Graham, D.Unwin, I.Gregory, "First Data
Investigation on the Grid: FirstDIG", Proceedings of the 2nd UK e-Science
All Hands Meeting, 2-4 September, 2003, Nottingham, UK
Exploitation
First Data Service Browser is being used
and extended in the INWA project with
Curtin Business School, Perth, Australia
First are keen to extend their deployment
to other databases
Future Plans
Project is finished, no effort remaining.
Incorporation of First Data Service
Browser into future releases of OGSA-DAI
First South Yorkshire want to build
management reporting applications based
on OGSA-DAI
Demo
Data Service Browser
Accessing three different DBMS
Mileage, CCS, MySQL
A JOIN – similar to the queries required for the
data mining
Easy within one DB, requires intermediary steps for
distributed DB
Without OGSA-DAI would have been impractical
Looking at Lost Miles and Customer Complaints
Run the Demo
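The intermediary steps needed for a JOIN across separate databases can be sketched as follows. This is an illustration under stated assumptions, not the Data Service Browser's implementation and not the OGSA-DAI API: two independent in-memory SQLite databases stand in for the Mileage and CCS sources, and every table name, column name and row is hypothetical.

```python
import sqlite3

# Two independent in-memory SQLite databases stand in for the separate
# Mileage and CCS sources (assumption: the real ones were dBASE IV and
# MS Access; the schemas and rows below are invented for illustration).
mileage_db = sqlite3.connect(":memory:")
ccs_db = sqlite3.connect(":memory:")

mileage_db.execute(
    "CREATE TABLE mileage (service TEXT, day TEXT, lost_miles REAL)"
)
mileage_db.executemany(
    "INSERT INTO mileage VALUES (?, ?, ?)",
    [
        ("52", "2002-04-01", 12.5),
        ("X78", "2002-04-01", 30.0),
        ("52", "2002-04-02", 3.0),
    ],
)

ccs_db.execute("CREATE TABLE complaints (service TEXT, day TEXT, nature TEXT)")
ccs_db.executemany(
    "INSERT INTO complaints VALUES (?, ?, ?)",
    [
        ("52", "2002-04-01", "late"),
        ("52", "2002-04-01", "no show"),
        ("X78", "2002-04-01", "late"),
    ],
)

# Intermediary step 1: pull aggregated complaint counts out of the CCS source.
counts = ccs_db.execute(
    "SELECT service, day, COUNT(*) FROM complaints GROUP BY service, day"
).fetchall()

# Intermediary step 2: stage them in a temporary table beside the mileage data.
mileage_db.execute(
    "CREATE TEMP TABLE complaint_counts (service TEXT, day TEXT, n INTEGER)"
)
mileage_db.executemany("INSERT INTO complaint_counts VALUES (?, ?, ?)", counts)

# Final step: an ordinary single-database JOIN over the staged data.
joined = mileage_db.execute(
    """
    SELECT m.service, m.day, m.lost_miles, COALESCE(c.n, 0)
    FROM mileage AS m
    LEFT JOIN complaint_counts AS c
      ON c.service = m.service AND c.day = m.day
    ORDER BY m.day, m.service
    """
).fetchall()
```

The LEFT JOIN keeps days with lost miles but no complaints, reporting a complaint count of zero for them, so the two series can be plotted against the same dates.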
[Figure: "Lost miles and Number of Complaints". Series: Lost miles, Complaints. Vertical axis 0 to 350; horizontal axis Date, weekly ticks from 01/04/2002 to 29/04/2002.]
In Conclusion
Successfully demonstrated the use of Grid
middleware in a ‘real-world’ environment
OGSA-DAI team:
Gained (in)valuable feedback
Incorporated Data Service Browser
First
Discovered valuable information from their
data which would have otherwise been
practically unobtainable
Keen to extend to other DBMS