TransMob: potential uses of data from the mobility card (Mobiliteitskaart)
prof. Chris Tampère, prof. Pieter Vansteenwegen
dr.ir. Willem Himpe, ir. Ivan Mendoza
KU Leuven
L-Mob Leuven Mobility Research Center
Contribution of KU Leuven
• Exploration of data-analysis opportunities
  o mobility card market: perspective of the transport providers
  o user perspective: multimodal mobility behaviour
• Unique characteristics of the TransMob dataset and analyses
  o multimodal tracking of users
  o combination of data on
    • users
    • service providers
    • additional open data sources
• KU Leuven explored analysis options and tools that exploit these unique characteristics
Data analysis of multimodal mobility behaviour
• behavioural models for operational and strategic planning purposes in multimodal networks
• mathematical modelling of the strategic composition of a mobility portfolio
  o required
    • tracking data for a representative set of trips
    • (personal characteristics)
  o carried out on earlier datasets, before the TransMob data became available
    • BMW project Ghent
    • MOP data Germany
• analysis of TransMob data and linked datasets
Market and operational analysis
• market analysis: per-service information about the customer group
  o what are their travel patterns and origin/destination areas?
  o which services do they use for these trips?
  o what is the position of the service under consideration within the multimodal package of transport services?
  o which routes to/from access points are used?
  o what is the catchment area of each service point?
• strategic planning
  o load patterns: how does data from the mobility card complement existing monitoring?
  o business indicators
  o operational interaction with other services
Developed methods and analysis tools
• Workflow Architecture
• Data Mining Process
  • Data Integration
  • Data Cleaning and Selection
  • Data Transformation
  • Data Mining outputs
  • Patterns Evaluation: examples
Workflow Architecture
[Figure: workflow architecture. Components shown: public open data; operator data; CSV files of trips, waypoints and trajectories; aggregate data; data mining libraries (WEKA [1]); spatial representation; web services; prepared reports; publication (sharing).]
[1] Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H. (2009). The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter, 11(1), 10-18.
Data Mining Process [2]
1. INTEGRATION: heterogeneous sources → aggregate data
2. SELECTION: aggregate data → clean pre-processed data
3. TRANSFORMATION: clean pre-processed data → formatted data
4. MINING: formatted data → patterns
5. EVALUATION: patterns → knowledge
[Figure: data mining process after [2]; the diagram also carries the label "enrich process".]
[2] Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI Magazine, 17(3), 37.
1. Data Integration
Mining process, step 1: INTEGRATION
a. Mobile data
  • Events data (user interface)
  • Tracking data (background process)
b. Open data (public datasets)
Both are combined into the aggregate data.
1. Data Integration (sources)
Mining process, step 1: INTEGRATION
a. Mobile data
  • Events data (train, tram, shared-bike, parking garage, etc.)
  • Tracking data (identified trips + some waypoints, map-matched points + interpolations, mode detection, etc.)
  PRE-PROCESSED DATA (already partially cleaned and transformed to specific formats)
  • transports.trips.CSV
  • transports.waypoints.CSV
  • transports.trajectories.CSV
b. Open data: public datasets
  • VELO stations
  • Bus stations + train stations
  • OSM points of interest
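As a rough illustration of the integration step, the sketch below joins trip records with their waypoints using only the Python standard library. The column names (`trip_id`, `user_id`, `mode`, `seq`, `lat`, `lon`) and the sample rows are assumptions made for this example; the slides do not give the actual schema of the transports.*.CSV files.

```python
import csv
import io

# Hypothetical column layouts and sample rows; the real schemas of
# transports.trips.CSV and transports.waypoints.CSV are not given in the slides.
TRIPS_CSV = """trip_id,user_id,mode,start_time,end_time
t1,u1,bike,2015-03-02T08:05:00,2015-03-02T08:25:00
t2,u2,train,2015-03-02T08:10:00,2015-03-02T08:50:00
"""

WAYPOINTS_CSV = """trip_id,seq,lat,lon
t1,1,50.8812,4.7155
t1,0,50.8790,4.7010
t2,0,50.8815,4.7158
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def integrate(trips, waypoints):
    """Attach each trip's waypoints, ordered by sequence number, to the trip record."""
    by_trip = {}
    for wp in waypoints:
        by_trip.setdefault(wp["trip_id"], []).append(wp)
    merged = []
    for trip in trips:
        points = sorted(by_trip.get(trip["trip_id"], []), key=lambda w: int(w["seq"]))
        coords = [(float(w["lat"]), float(w["lon"])) for w in points]
        merged.append({**trip, "waypoints": coords})
    return merged


aggregate = integrate(read_rows(TRIPS_CSV), read_rows(WAYPOINTS_CSV))
for record in aggregate:
    print(record["trip_id"], record["mode"], len(record["waypoints"]))
```

Open datasets such as station lists could be attached in the same way, keyed on a location or station identifier rather than on `trip_id`.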
2. Data Cleaning & Selection
Mining process, step 2: SELECTION
Input data: open data, GPS, events
FILTERING
• Spatial
• Modal
• Temporal: workweeks, weekends, morning, afternoon
EXAMPLES to remove:
• Records from the test period
• Incomplete, non-recoverable information
• Measurements below minimum accuracy
• Illogical event chains
Result: CLEAN DATA
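The cleaning and temporal selection rules can be sketched as simple predicates. The thresholds, field names and sample records below are assumptions for illustration only; the check for illogical event chains is left out to keep the example short.

```python
from datetime import datetime

# Hypothetical thresholds and field names, chosen only for illustration.
TEST_PERIOD_END = datetime(2015, 3, 1)  # records before this date count as test data
MAX_ACCURACY_M = 50.0                   # reject fixes with a larger error radius

records = [
    {"user": "u1", "time": datetime(2015, 2, 20, 9, 0), "accuracy_m": 10.0, "mode": "bike"},
    {"user": "u1", "time": datetime(2015, 3, 2, 8, 15), "accuracy_m": 12.0, "mode": "bike"},
    {"user": "u2", "time": datetime(2015, 3, 2, 17, 30), "accuracy_m": 120.0, "mode": "car"},
    {"user": "u3", "time": datetime(2015, 3, 7, 14, 0), "accuracy_m": 8.0, "mode": None},
    {"user": "u4", "time": datetime(2015, 3, 7, 15, 0), "accuracy_m": 8.0, "mode": "train"},
]


def is_clean(rec):
    """Apply three of the removal rules listed on the slide."""
    if rec["time"] < TEST_PERIOD_END:       # recorded during the test period
        return False
    if rec["mode"] is None:                 # incomplete, non-recoverable information
        return False
    if rec["accuracy_m"] > MAX_ACCURACY_M:  # measurement below minimum accuracy
        return False
    return True


def temporal_label(ts):
    """Classify a timestamp as workweek/weekend and morning/afternoon."""
    day = "weekend" if ts.weekday() >= 5 else "workweek"
    part = "morning" if ts.hour < 12 else "afternoon"
    return day, part


clean = [r for r in records if is_clean(r)]
selection = [r for r in clean if temporal_label(r["time"]) == ("workweek", "morning")]
print([r["user"] for r in clean], [r["user"] for r in selection])
```

Spatial and modal filters would follow the same pattern, as further predicates combined with `is_clean`.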
3. Data Transformation
Mining process, step 3: TRANSFORMATION
(ONE EXAMPLE) Events to validate trips:
Logged events:
• Train event
• Bus/tram/metro event
• Shared-bike event
• Parking garage (payments)
• Parking street (payments)
Recognized trips:
• Car trip
• Bike trip
• Train / tram
• Walking
Embedded event information:
• Event action: Start / Stop
• Timestamp: Date / time (UTC time)
• Transport mode / event type
• Event location (inferred)
• User profile, etc.
Logged events and recognized trips are combined into enhanced trip information.
POSSIBLE USES:
• Detect undetected trips
• Restore incomplete trip info
• Correct detected mode
• Etc.
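One of the possible uses, correcting the detected mode, might look like the sketch below. The event-to-mode mapping, the 10-minute tolerance and all field names are assumptions for this example and are not taken from the TransMob implementation.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from logged event types to the trip mode they imply.
EVENT_MODE = {
    "train": "train",
    "shared_bike": "bike",
    "parking_garage": "car",
    "parking_street": "car",
}
TOLERANCE = timedelta(minutes=10)  # assumed maximum gap between event and trip start


def validate_trips(trips, events):
    """Return copies of the trips; a matching 'start' event of the same user overrides the mode."""
    result = []
    for trip in trips:
        corrected = dict(trip)
        corrected["validated"] = False
        for ev in events:
            same_user = ev["user"] == trip["user"]
            near_start = abs(ev["time"] - trip["start"]) <= TOLERANCE
            if same_user and near_start and ev["action"] == "start" and ev["type"] in EVENT_MODE:
                corrected["mode"] = EVENT_MODE[ev["type"]]
                corrected["validated"] = True
                break
        result.append(corrected)
    return result


trips = [
    {"user": "u1", "start": datetime(2015, 3, 2, 8, 0), "mode": "car"},
    {"user": "u2", "start": datetime(2015, 3, 2, 9, 0), "mode": "walking"},
]
events = [
    {"user": "u1", "type": "shared_bike", "action": "start", "time": datetime(2015, 3, 2, 8, 3)},
    {"user": "u1", "type": "shared_bike", "action": "stop", "time": datetime(2015, 3, 2, 8, 20)},
]

validated = validate_trips(trips, events)
for trip in validated:
    print(trip["user"], trip["mode"], trip["validated"])
```

Here the shared-bike start event overrides the "car" mode detected for the first trip, while the second trip has no matching event and is left unchanged.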
4. Data Mining Outputs
Mining process, step 4: MINING
PATTERNS:
• Empirical points of interest (POIs)
• POI relevance and interconnections
• Network nodes and links analysis
• Regular trips and tours
• User categorization, etc.
DIMENSIONS:
• Spatial (origins, waypoints / destinations)
• Temporal (arrivals, durations, departures)
• Time of the day, day of the week, seasonal
• Transport modes
• User profiles, etc.
5. Patterns Evaluation: examples
Mining process, step 5: EVALUATION
Offline analysis
Empirical points of interest (POIs)
• Example A: Detection of Empirical Points of Interest
• Example B: Points of Interest Interconnections
Network nodes and links analysis
• Example C: Access Route Maps
• Example D: Route Alternatives
5. Patterns Evaluation (POIs)
Empirical points of interest (POIs) [3]
INPUTS
• Arrival and departure times
• Origins and destinations
• Travel mode, etc.
ENRICHED INPUTS
• Update & clean records
• Consistency checks
PROCESS TASK
To identify visits to frequent destinations made within a timeframe using a specific travel mode.
OUTPUTS
• Frequent destinations (centroids) within a timeframe
• POI relevance (unique visits vs. unique users)
• POI connections
HISTORICAL OUTPUTS
• Re-evaluate POI relevance
• Update known POIs
[3] Zheng, Y., Zhang, L., Xie, X., & Ma, W.-Y. (2009). Mining interesting locations and travel sequences from GPS trajectories. In Proceedings of the 18th International Conference on World Wide Web (pp. 791-800). ACM.
5. Patterns Evaluation (POIs)
A. Empirical points of interest (POIs): Detection
[Figure: cleaning, then density-based clustering [4] in WEKA, then POI matching against public POIs (e.g. shop, school, cinema).]
Cluster size determines the POI relevance.
Problem 1: Choose wisely the minimum number of visits required to create a cluster.
Problem 2: Choose wisely the minimum number of unique users required to create a cluster.
[4] Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996, August). A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD (Vol. 96, No. 34, pp. 226-231).
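To make the two threshold problems concrete, here is a minimal sketch of density-based clustering in the style of DBSCAN [4], written in plain Python as a stand-in for the WEKA clusterer used in the slides. Coordinates are assumed to be projected metres, and `eps`, `min_visits`, `min_users` and the sample visits are hypothetical values.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = -1  # noise for now; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = region_query(points, j, eps)
            if len(more) >= min_pts:  # j is a core point: keep expanding
                queue.extend(more)
    return labels


def detect_pois(visits, eps, min_visits, min_users):
    """Cluster visit locations, then keep clusters visited by enough distinct users."""
    labels = dbscan([(v["x"], v["y"]) for v in visits], eps, min_visits)
    clusters = {}
    for visit, label in zip(visits, labels):
        if label != -1:
            clusters.setdefault(label, []).append(visit)
    pois = []
    for members in clusters.values():
        users = {m["user"] for m in members}
        if len(users) >= min_users:
            cx = sum(m["x"] for m in members) / len(members)
            cy = sum(m["y"] for m in members) / len(members)
            pois.append({"centroid": (cx, cy), "visits": len(members), "users": len(users)})
    return pois


visits = [
    {"user": "u1", "x": 0, "y": 0}, {"user": "u2", "x": 10, "y": 0},
    {"user": "u3", "x": 0, "y": 10}, {"user": "u1", "x": 10, "y": 10},
    {"user": "u4", "x": 1000, "y": 1000}, {"user": "u4", "x": 1005, "y": 1000},
    {"user": "u4", "x": 1000, "y": 1005},
    {"user": "u5", "x": 500, "y": 500},
]
pois = detect_pois(visits, eps=20, min_visits=3, min_users=2)
print(pois)
```

In this sample the second dense group is formed by a single user, so it passes the minimum-visits threshold (Problem 1) but is dropped by the minimum-unique-users threshold (Problem 2).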
5. Patterns Evaluation (POIs)
A. Empirical points of interest (POIs): Detection - Attraction Levels
[Figure panels: A. POI relevance (unique visits); B. POI relevance (unique users).]
POI relevance can be overestimated if the same user makes many visits to the same location.
Cluster size represents POI relevance as a function of an attribute value (visits vs. unique users).
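The difference between the two relevance measures can be shown with a small count. The POI names and visit records below are invented for the example.

```python
from collections import defaultdict

# Hypothetical (POI, user) visit records.
visits = [
    ("shop", "u1"), ("shop", "u1"), ("shop", "u1"), ("shop", "u1"),
    ("school", "u2"), ("school", "u3"), ("school", "u4"),
]


def relevance(visits):
    """Count total visits and distinct users per POI."""
    counts = defaultdict(int)
    users = defaultdict(set)
    for poi, user in visits:
        counts[poi] += 1
        users[poi].add(user)
    return {poi: {"visits": counts[poi], "unique_users": len(users[poi])} for poi in counts}


stats = relevance(visits)
by_visits = max(stats, key=lambda p: stats[p]["visits"])
by_users = max(stats, key=lambda p: stats[p]["unique_users"])
print(stats, by_visits, by_users)
```

Ranked by visits, the shop comes first, although a single user accounts for all of its visits; ranked by unique users, the school comes first.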
5. Patterns Evaluation (POIs)
B. Empirical points of interest (POIs): Connectivity Maps
Connectivity maps:
• Show the interconnections between relevant POIs.
• Represent trip attraction among related POIs within a timeframe.
• A destination can represent a business, a service point, a neighborhood, etc.
Color and thickness indicate how often trips occur between POIs.
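The statistics behind a connectivity map amount to counting trips per origin/destination pair of POIs within a timeframe. The sketch below assumes trips have already been matched to POI ids; the field names and the line-width scaling are illustrative choices, not the presenters' method.

```python
from collections import Counter
from datetime import datetime

# Hypothetical trips, already matched to origin and destination POI ids.
trips = [
    {"origin_poi": 12, "dest_poi": 50, "start": datetime(2015, 3, 2, 8, 0)},
    {"origin_poi": 12, "dest_poi": 50, "start": datetime(2015, 3, 2, 9, 0)},
    {"origin_poi": 50, "dest_poi": 12, "start": datetime(2015, 3, 2, 17, 0)},
    {"origin_poi": 12, "dest_poi": 50, "start": datetime(2015, 3, 5, 8, 0)},
]


def connectivity(trips, start, end):
    """Count trips per (origin POI, destination POI) pair with start <= departure < end."""
    return Counter(
        (t["origin_poi"], t["dest_poi"]) for t in trips if start <= t["start"] < end
    )


def line_width(count, max_count, min_width=1.0, max_width=5.0):
    """Scale a trip count linearly to a drawing width (an arbitrary styling choice)."""
    return min_width + (max_width - min_width) * count / max_count


counts = connectivity(trips, datetime(2015, 3, 2), datetime(2015, 3, 3))
widths = {pair: line_width(n, max(counts.values())) for pair, n in counts.items()}
print(counts, widths)
```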
5. Patterns Evaluation (N/L Analysis)
Network nodes and links analysis
INPUTS
• Points of interest to study
• Modal network
• Map-matched tracks
ENRICHED INPUTS
• Updated tracks
• Consistent network
PROCESS TASK
To identify the nodes or links used to access or leave a studied point of interest, together with usage statistics per node or link.
OUTPUTS
• Proportions of node or link use to a given destination
• Proportions of node or link use from a given origin
• Proportions of node or link use between points of interest
HISTORICAL OUTPUTS
• Considered route alternatives
• Travel time profiles
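A minimal sketch of the first output, link use proportions to a given destination, is shown below. It assumes each map-matched track is available as a destination POI id plus an ordered list of link ids; both the representation and the sample data are hypothetical.

```python
from collections import Counter

# Hypothetical map-matched tracks: destination POI id and traversed link ids.
tracks = [
    {"dest": 50, "links": ["a", "b", "c"]},
    {"dest": 50, "links": ["a", "d", "c"]},
    {"dest": 50, "links": ["e", "c"]},
    {"dest": 12, "links": ["a", "b"]},
]


def link_use_to(tracks, dest):
    """Share of trips to `dest` that traverse each link (each link counted once per trip)."""
    selected = [t for t in tracks if t["dest"] == dest]
    if not selected:
        return {}
    counts = Counter(link for t in selected for link in set(t["links"]))
    return {link: n / len(selected) for link, n in counts.items()}


shares = link_use_to(tracks, 50)
for link in sorted(shares):
    print(link, round(shares[link], 2))
```

The same function applied to an `origin` field would give link use from a given origin; filtering on both gives link use between two points of interest.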
5. Patterns Evaluation (N/L Analysis)
C. Link analysis: Access Route Maps
Access route maps:
• Show which links are used to reach a destination of interest.
• A destination can represent a business, a service point, a neighborhood, etc.
• Links can be categorized by use levels.
Color and thickness provide use statistics of links to a destination of interest.
5. Patterns Evaluation (Links Analysis)
D. Link analysis: Route alternatives
Example shown: trips from POI (12) to POI (50).
Route alternatives:
• Display the links used to reach a destination of interest from a specific origin.
• Exhibit the relevant network link choices between points of interest.
Link use statistics and selected choices can be retrieved for a specific timeframe and other criteria.
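Route alternatives between two POIs can be summarized by grouping trips on their link sequence. The sketch below reuses the POI ids 12 and 50 from the slide; the track representation and link ids are assumptions for the example.

```python
from collections import Counter

# Hypothetical map-matched tracks with origin/destination POI ids and link sequences.
tracks = [
    {"origin": 12, "dest": 50, "links": ["a", "b", "c"]},
    {"origin": 12, "dest": 50, "links": ["a", "b", "c"]},
    {"origin": 12, "dest": 50, "links": ["a", "d", "c"]},
    {"origin": 12, "dest": 50, "links": ["a", "b", "c"]},
    {"origin": 7, "dest": 50, "links": ["e", "c"]},
]


def route_alternatives(tracks, origin, dest):
    """List distinct link sequences between two POIs with their share of trips, most used first."""
    routes = Counter(
        tuple(t["links"]) for t in tracks if t["origin"] == origin and t["dest"] == dest
    )
    total = sum(routes.values())
    return [(route, n / total) for route, n in routes.most_common()]


alternatives = route_alternatives(tracks, 12, 50)
for route, share in alternatives:
    print(" > ".join(route), share)
```

A timeframe or mode criterion could be added as an extra condition in the filter, in line with the slide's remark on retrieving statistics for specific criteria.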
Take away
• a set of flexible, configurable tools and methods is available for
  o analyses from the user perspective (policy perspective)
  o analyses from the operator perspective
• based on filtering, combining and mining multiple data sources
  o combination reduces the uncertainty typical of big data
  o combination allows every processing step to be refined iteratively
Thanks for your attention
contact: [email protected]
KU Leuven Mobility Research Center