TransMob: potential uses of data from the Mobiliteitskaart (mobility card)
prof. Chris Tampère, prof. Pieter Vansteenwegen, dr.ir. Willem Himpe, ir. Ivan Mendoza
KU Leuven, L-Mob Leuven Mobility Research Center

KU Leuven contribution
• Exploring opportunities for data analysis of the mobility card
  o transport providers' perspective: market
  o user's perspective: multimodal mobility behaviour
• Unique features of the TransMob dataset and analyses
  o multimodal tracking of users
  o combination of data from
    • users
    • service providers
    • supplementary open data sources
• KU Leuven explored analysis options and tools that exploit these unique features

Data analysis of multimodal mobility behaviour
• behavioural models for operational and strategic planning purposes in multimodal networks
• mathematical modelling of the strategic composition of a mobility portfolio
  o required
    • tracking data for a representative set of trips
    • (personal characteristics)
  o carried out on previously available datasets, before TransMob data became available
    • BMW project Ghent
    • MOP data Germany
• analysis of TransMob data and linked datasets

Market and operational analysis
• market analysis: per service, information about the customer group
  o what are their travel patterns and origin/destination areas?
  o which services do they use for these trips?
  o what is the position of the service under consideration within the multimodal package of transport services?
  o which routes to/from access points are used?
  o what is the service area of each service point?
• strategic planning
  o load patterns: how does mobility card data complement existing monitoring?
  o business indicators
  o operational interaction with other services

Developed methods and analysis tools
• Workflow Architecture
• Data Mining Process
  o Data Integration
  o Data Cleaning and Selection
  o Data Transformation
  o Data Mining outputs
  o Patterns Evaluation: examples

Workflow Architecture
[Workflow diagram. Labelled components: public open data, operator data, trips, waypoints, trajectories (CSV), aggregate data, datamining libraries (WEKA [1]), spatial representation, web services, prepared reports, publication (sharing).]
[1] Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., & Witten, I. H. (2009). The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter, 11(1), 10-18.

Data Mining Process [2]
[Process diagram. Steps: 1 Integration, 2 Selection, 3 Transformation, 4 Mining, 5 Evaluation. Other labels: heterogeneous sources, aggregate data, clean, pre-processed data, formatted data, patterns, knowledge, enrich process.]
[2] Fayyad, U., Piatetsky-Shapiro, G., & Smyth, P. (1996). From data mining to knowledge discovery in databases. AI Magazine, 17(3), 37.

1. Data Integration
Mining process step 1: INTEGRATION
• Mobile Data
  o (a) Events Data (User Interface)
  o (b) Tracking Data (Background Process)
• Open Data (Public datasets)
• Result: Aggregate Data

Data Integration (sources)
Mining process step 1: INTEGRATION
• Mobile Data
  o (a) Events Data (train, tram, shared-bike, parking garage, etc.)
  o (b) Tracking Data (identified trips + some waypoints, map-matched points + interpolations, mode detection, etc.)
  o PRE-PROCESSED DATA (already partially cleaned and transformed to specific formats)
    • transports.trips.CSV
    • transports.waypoints.CSV
    • transports.trajectories.CSV
• Open Data (Public datasets)
  o VELO stations
  o Bus stations + train stations
  o OSM points-of-interest

2. Data Cleaning & Selection
Mining process step 2: SELECTION
[Diagram labels: open data, GPS; spatial filtering, modal filtering.]
EXAMPLES of records to remove:
• Recorded during the test period
• Incomplete, non-recoverable information
• Measurements below minimum accuracy
• Illogical event chains
[Diagram labels, continued: events; temporal filtering (workweeks, weekends, morning, afternoon); clean data.]

3. Data Transformation
Mining process step 3: TRANSFORMATION (ONE EXAMPLE)
Events to validate trips:
Logged Events:
• Train event
• Bus/tram/metro event
• Shared-bike event
• Parking garage (payments)
• Parking street (payments)
Recognized Trips:
• Car trip
• Bike trip
• Train / tram
• Walking
Embedded event information:
• Event action: Start / Stop
• Timestamp: Date / time (UTC)
• Transport mode / event type
• Event location (inferred)
• User profile, etc.
Enhanced trip information, POSSIBLE USES:
• Detect undetected trips
• Restore incomplete trip info
• Correct detected mode
• Etc.

4. Data Mining Outputs
Mining process step 4: MINING
PATTERNS:
• Empirical points of interest (POIs)
• POI relevance and interconnections
• Network nodes and links analysis
• Regular trips and tours
• User categorization, etc.
DIMENSIONS:
• Spatial (origins, waypoints / destinations)
• Temporal (arrivals, durations, departures)
• Time of the day, day of the week, seasonal
• Transport modes
• User profiles, etc.

5. Patterns Evaluation: examples
Mining process step 5: EVALUATION (offline analysis)
• Empirical points of interest (POIs)
  o Example A: Detection of Empirical Points of Interest
  o Example B: Points of Interest Interconnections
• Network nodes and links analysis
  o Example C: Access Route Maps
  o Example D: Route Alternatives

Patterns Evaluation (POIs)
Empirical points of interest (POIs) [3]
INPUTS:
• Arrival and departure times
• Origins and destinations
• Travel mode, etc.
ENRICHED INPUTS:
• Update & clean records
• Consistency checks
PROCESS TASK: To identify visits to frequent destinations made within a timeframe using a specific travel mode.
OUTPUTS:
• Frequent destinations (centroids) within a timeframe
• POI relevance (unique visits vs. unique users)
• POI connections
HISTORICAL OUTPUTS:
• Re-evaluate POI relevance
• Update known POIs
[3] Zheng, Y., Zhang, L., Xie, X., & Ma, W.-Y. (2009). Mining interesting locations and travel sequences from GPS trajectories. In Proceedings of the 18th International Conference on World Wide Web (pp. 791-800). ACM.

Patterns Evaluation (POIs)
Example A. Empirical points of interest (POIs): Detection
[Diagram labels: cleaning; density-based clustering [4] (WEKA); POI matching against public POIs (e.g. shop, school, cinema).]
Cluster size determines the POI relevance.
Problem 1: Choose wisely the minimum number of visits required to create a cluster.
Problem 2: Choose wisely the minimum number of unique users required to create a cluster.
[4] Ester, M., Kriegel, H. P., Sander, J., & Xu, X. (1996, August). A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD (Vol. 96, No. 34, pp. 226-231).

Patterns Evaluation (POIs)
Empirical points of interest (POIs): Detection - Attraction Levels
[Map panels: A. POI relevance (unique visits); B. POI relevance (unique users).]
POI relevance can be overestimated if the same user makes many visits to the same location.
Cluster size represents POI relevance as a function of an attribute value (visits vs. unique users).

Patterns Evaluation (POIs)
Example B. Empirical points of interest (POIs): Connectivity Maps
Connectivity maps:
• Show the interconnections between relevant POIs.
• Represent trip attraction among related POIs within a timeframe.
• A destination can represent a business, a service point, a neighborhood, etc.
Color and thickness indicate how often trips occur between POIs.
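The POI detection step described above (density-based clustering, with a minimum number of visits and a minimum number of unique users per cluster) can be illustrated with a small sketch. This is not the project's WEKA-based implementation: it is a plain-Python DBSCAN in the spirit of [4], and the function names, the `(user, x, y)` record layout, the use of projected coordinates in metres, and all parameter values are assumptions chosen for illustration.

```python
from math import hypot


def dbscan(points, eps, min_pts):
    """Label each (x, y) point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)            # None = not yet visited

    def neighbours(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)                # includes point i itself
        if len(seeds) < min_pts:
            labels[i] = -1                   # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster          # noise becomes a border point
            if labels[j] is not None:
                continue                     # already assigned, do not expand
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:         # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


def summarise_pois(visits, eps=50.0, min_visits=3, min_users=2):
    """Cluster trip destinations and report both relevance measures per POI.

    visits: list of (user_id, x, y) tuples; x and y are assumed to be
    projected coordinates in metres (hypothetical record layout).
    """
    labels = dbscan([(x, y) for _, x, y in visits], eps, min_visits)
    clusters = {}
    for visit, label in zip(visits, labels):
        if label >= 0:
            clusters.setdefault(label, []).append(visit)
    pois = []
    for members in clusters.values():
        users = {user for user, _, _ in members}
        if len(users) < min_users:           # Problem 2: too few unique users
            continue
        n = len(members)
        pois.append({
            "centroid": (sum(x for _, x, _ in members) / n,
                         sum(y for _, _, y in members) / n),
            "visits": n,                     # relevance by visits
            "unique_users": len(users),      # relevance by unique users
        })
    return pois


# Toy data: one user visiting the same place four times, one place
# visited once by three different users, and one isolated point.
visits = [
    ("u1", 0, 0), ("u1", 10, 0), ("u1", 0, 10), ("u1", 5, 5),
    ("u1", 1000, 1000), ("u2", 1010, 1000), ("u3", 1000, 1010),
    ("u4", 5000, 5000),
]
for poi in summarise_pois(visits):
    print(poi)
```

With these toy values, the location visited four times by a single user forms a cluster but is filtered out when `min_users=2`, which is the overestimation effect the attraction-level slide warns about. Setting `min_users=1` keeps it, so the two thresholds can be tuned separately.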
Patterns Evaluation (N/L Analysis)
Network nodes and links analysis
INPUTS:
• Points of interest to study
• Modal network
• Map-matched tracks
ENRICHED INPUTS:
• Updated tracks
• Consistent network
PROCESS TASK: To identify the nodes or links used to access or leave a studied point of interest, together with usage statistics per node or link.
OUTPUTS:
• Node or link use proportions to a given destination
• Node or link use proportions from a given origin
• Node or link use proportions between points of interest
HISTORICAL OUTPUTS:
• Considered route alternatives
• Travel time profiles

Patterns Evaluation (N/L Analysis)
Example C. Link analysis: Access Route Maps
Access route maps:
• Show which links are used to reach a destination of interest.
• A destination can represent a business, a service point, a neighborhood, etc.
• Links can be categorized by use level.
Color and thickness indicate how heavily each link is used to reach the destination of interest.

Patterns Evaluation (Links Analysis)
Example D. Link analysis: Route alternatives
[Map: trips from POI (12) to POI (50).]
Route alternatives:
• Display the links used to reach a destination of interest from a specific origin.
• Show the relevant network link choices between points of interest.
Link use statistics and selected choices can be retrieved for a specific timeframe and other criteria.

Take away
• a set of flexible, configurable tools and methods is available for
  o analyses from the user perspective (policy perspective)
  o analyses from the operator perspective
• based on filtering, combining and mining multiple data sources
  o combining sources reduces the uncertainty typical of big data
  o combining sources allows each processing step to be refined iteratively

Thanks for your attention
contact: [email protected]
KU Leuven Mobility Research Center
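As a rough illustration of the link analysis outputs listed above (link use proportions to a destination, and route alternatives between two POIs), the sketch below counts link usage over map-matched trips. The trip record layout (`origin`, `dest`, `links`), the link identifiers and the function names are hypothetical; the actual TransMob tooling may represent tracks quite differently.

```python
from collections import Counter


def link_use_shares(trips, destination):
    """Share of trips ending at `destination` that traverse each link."""
    selected = [t for t in trips if t["dest"] == destination]
    if not selected:
        return {}
    counts = Counter()
    for trip in selected:
        counts.update(set(trip["links"]))    # count a link once per trip
    return {link: n / len(selected) for link, n in counts.items()}


def route_alternatives(trips, origin, destination):
    """Count how often each distinct link sequence is used between two POIs."""
    return Counter(tuple(t["links"]) for t in trips
                   if t["origin"] == origin and t["dest"] == destination)


# Toy map-matched trips; POI ids 12 and 50 echo the slide's example,
# the link ids "a".."f" are made up.
trips = [
    {"origin": 12, "dest": 50, "links": ["a", "b", "c"]},
    {"origin": 12, "dest": 50, "links": ["a", "d", "c"]},
    {"origin": 12, "dest": 50, "links": ["a", "b", "c"]},
    {"origin": 7, "dest": 50, "links": ["e", "c"]},
    {"origin": 12, "dest": 9, "links": ["a", "f"]},
]

print(link_use_shares(trips, destination=50))
print(route_alternatives(trips, origin=12, destination=50))
```

The shares returned by `link_use_shares` are the kind of value that an access route map would encode as color and thickness. Filtering `trips` by timestamp or mode before calling either function would give the per-timeframe statistics mentioned on the slide.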