Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Extracting Dense Regions From Hurricane Trajectory Data Praveen Kumar Tripathi Madhuri Debnath Ramez Elmasri Department of Computer Science and Engineering University of Texas at Arlington First International ACM Workshop on Managing and Mining Enriched Geo-spatial Data GeoRich 2014 Extracting Dense Regions From Hurricane Trajectory Data Contents: Spatio temporal data: Examples Applications Data mining tasks Extracting Dense Regions from Hurricane Trajectory data: Proposed concept DBSCAN: core concepts Experiments and Analysis: spatial, and spatial with non spatial. Temporal extension for the analysis: Temporal DBSCAN Proposed relative time attribute Experimental analysis of spatio temporal clustering 2 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Spatio temporal data: Hurricane JIG 1950 Has spatial and temporal attributes as key attributes. Example : 3 'ADV' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 'Year' 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 1950 'Name' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'HurricaneJIG' 'LAT' 24.3 24.2 24.2 24.2 24.2 24.4 24.7 25 25.4 26.1 26.9 27.6 28.4 29.5 30.8 32 33.2 34.2 35.1 35.9 36.8 38.8 40.8 41.9 43 44.1 'LON' -47.2 -48.2 -49.2 -50.2 -51.2 -52.4 -53.7 -54.6 -55.5 -56.7 -57.8 -58.6 -59.3 -60 -60.5 -60.2 -59.2 -58.2 -57.2 -56.2 -55 -51.5 -47.1 -44.5 -42 -39.9 'TIME' '10/11/12Z' '10/11/18Z' '10/12/00Z' '10/12/06Z' '10/12/12Z' '10/12/18Z' '10/13/00Z' '10/13/06Z' '10/13/12Z' '10/13/18Z' '10/14/00Z' '10/14/06Z' '10/14/12Z' '10/14/18Z' '10/15/00Z' '10/15/06Z' '10/15/12Z' '10/15/18Z' '10/16/00Z' '10/16/06Z' '10/16/12Z' '10/16/18Z' '10/17/00Z' '10/17/06Z' '10/17/12Z' '10/17/18Z' 'WIND' 40 40 45 45 50 55 55 60 65 70 75 75 80 80 85 90 100 105 100 95 90 90 85 65 60 60 'PR' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' '-' 'STAT' 'TROPICAL' 'TROPICAL' 'TROPICAL' 'TROPICAL' 'TROPICAL' 'TROPICAL' 'TROPICAL' 'TROPICAL' 'HURRICANE-1' 'HURRICANE-1' 'HURRICANE-1' 'HURRICANE-1' 'HURRICANE-1' 'HURRICANE-1' 'HURRICANE-2' 'HURRICANE-2' 'HURRICANE-3' 'HURRICANE-3' 'HURRICANE-3' 'HURRICANE-2' 'HURRICANE-2' 'HURRICANE-2' 'HURRICANE-2' 'HURRICANE-1' 'EXTRATROPICAL' 'EXTRATROPICAL' 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Hurricane data 4 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Dataset* Attributes : ADV : Tracking ID LAT : Position in latitude LON : Position in longitude Time : Track time point WIND : Winds in knots ( 1 knot = 1.15 mph) PR : Pressure in millibards STAT : State category ( Tropical Storm, Hurricane..) Temporal Coverage : 1950 – 1999 (50 years) (15319 data points, 496 trajectories) Spatial Coverage : North Atlantic Interval : 6 hourly (0000,0600,1200,2400) Data format : txt file 5 *http://weather.unisys.com/hurricane/atlantic 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Spatio temporal data Example Global positioning data (GPS). Cell phone tracking, vehicle traffic data, emergency vehicle data, transportation data Hurricane and storm tracking data. Animal movement data. Environmental data: Land use data. Demographic : population data. Soil moisture data. Temperature data. 6 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Spatio Temporal data: contd.. Applications: Managing vehicle traffic patterns. Monitoring and predicting weather conditions. Examining wild animal behavior. Analyzing spread of diseases. Monitoring land use, climate change. Data Mining Tasks: 7 Clustering. Flocking behavior mining. Convoy detection. Classification. 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Extracting Dense Regions from Hurricane Trajectory data The data set has been considered as point data set. Density Based Spatial Clustering of Application with Noise (DBSCAN) [1] has been used. DBSCAN has been used to identify the dense regions of hurricane activity viz., hot spots. Clustering has been done using only the spatial attributes (latitude and longitude), as well as spatial along with non spatial attribute (latitude, longitude and wind speed) for comparative analysis. Clustering results have been evaluated using clustering evaluation measures. 8 [1] Martin Ester , Hans-peter Kriegel , Jörg S , Xiaowei Xu, A density-based algorithm for discovering clusters in large spatial databases with noise , Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 226 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Contd.. The post processing of the obtained clustering results have been done to obtain: Regions from where the storms are most likely to originate. Regions where the storms are more likely to land. Key regions through which the storms are more likely to traverse. 9 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data DBSCAN: Core Concepts Cluster Definition: A cluster 𝐶 is a subset of objects satisfying the following: • Connected: ∀𝑝, 𝑞 ∈ 𝐶, 𝑝 𝑎𝑛𝑑 𝑞 are density connected • Maximal: ∀𝑝, 𝑞, 𝑖𝑓 𝑝 ∈ 𝐶, 𝑎𝑛𝑑 𝑞 is density reachable from 𝑝, 𝑡ℎ𝑒𝑛 𝑞 ∈ 𝐶. 10 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Experiments and Analysis : Spatial analysis 11 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Experiments and Analysis(1): Spatial and non spatial analysis 12 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data [2] Quality measure = Two distances used for two quality measures : • Spatial distance (latitude and longitude) • Non-spatial distance (wind speed). 13 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data 14 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Temporal extension to DBSCAN The spatio temporal data has time as another key attribute. DBSCAN needs to be modified to incorporate the time domain. In the temporal analysis possibilities: Absolute temporal analysis (vehicle traffic analysis) Periodic temporal analysis (repeating everyday) Relative temporal analysis (hurricane data analysis) Present analysis uses the relative temporal analysis: Hurricanes do not occur at the same time. (absolute temporal analysis not possible). 15 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Relative temporal attribute Let there be two trajectories of length 𝑚 𝑎𝑛𝑑 𝑙 where, 𝑙 𝑎𝑛𝑑 𝑚 may be different: Absolute relative temporal attribute. 𝑇𝑟1 = [< 𝑥11 , 𝑦11 , 1 >, < 𝑥12 , 𝑦12 , 2 > ⋯ 𝑥1𝑙 , 𝑦1𝑙 , 𝑙] 𝑇𝑟2 = [< 𝑥21 , 𝑦21 , 1 >, < 𝑥22 , 𝑦22 , 2 > ⋯ 𝑥2𝑚 , 𝑦2𝑚 , 𝑚] Length normalized relative temporal attribute. 1 𝑇𝑟1 = [< 𝑥11 , 𝑦11 , 0 >, < 𝑥12 , 𝑦12 , > ⋯ 𝑥1𝑙 , 𝑦1𝑙 , 1 ] 𝑙−1 1 𝑇𝑟2 = [< 𝑥21 , 𝑦21 , 0 >, < 𝑥22 , 𝑦22 , > 𝑚−1 ⋯ 16 𝑥2𝑚 , 𝑦2𝑚 , 1] 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Extending DBSCAN In [2], DBSCAN extended to incorporate additional neighborhood criteria: Let 𝐷 = 𝑑1 , 𝑑2 , … 𝑑𝑛 be the data set where 𝑑𝑖 = 𝑥𝑖 , 𝑦𝑖 , 𝑎𝑖 , 𝑏𝑖 . (𝑥1 −𝑥2 )2 + (𝑦1 −𝑦2 )2 - spatial attributes (𝑥𝑖 , 𝑦𝑖 ) = (𝑎1 −𝑎2 )2 + (𝑏1 −𝑏2 )2 - non spatial attributes 𝑑𝑖𝑠𝑡𝑠 = 𝑑𝑖𝑠𝑡𝑛𝑠 (𝑎𝑖 , 𝑏𝑖 ) Let there are two thresholds 𝜖1 and 𝜖2 given by the user for the spatial and non-spatial neighborhood…………………………(1) Let there be now two neighborhood (spatial and non-spatial): 𝑁𝜖 1 𝑑𝑘 = 𝑑𝑗 ∈ 𝐷 𝑑𝑖𝑠𝑡𝑠 𝑑𝑘 , 𝑑𝑗 ≤ 𝜖1 } 𝑁𝜖 2 𝑑𝑘 = 𝑑𝑗 ∈ 𝐷 𝑑𝑖𝑠𝑡𝑛𝑠 𝑑𝑘 , 𝑑𝑗 ≤ 𝜖2 } then the final common neighborhood will be: 𝑁 = 𝑁𝜖 1 𝑑𝑘 ∩ 𝑁𝜖 2 𝑑𝑘 …………………………………(2) 17 [2] Birant, D., and Kut, A. ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data Knowledge Engineering , 2007 pp 208-221 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Temporal extension We use the formulation in (2), where the thresholds 𝜖1 is used for spatial neighborhood and 𝜖2 for the temporal neighborhood. By incorporating the spatial as well as temporal neighborhood in the basic DBSCAN algorithm we now have the spatio-temporal version of the DBSCAN algorithm. 18 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data Spatio temporal clustering results 19 5/22/2017 Extracting Dense Regions From Hurricane Trajectory Data References: 1. 2. 3. 4. 5. 6. 7. 8. Martin Ester , Hans-peter Kriegel , Jörg S , Xiaowei Xu, A density-based algorithm for discovering clusters in large spatial databases with noise , Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), pp. 226–231 Birant, D., and Kut, A. ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data Knowledge Engineering , 2007 pp 208-221 Gil Lee and Han, J. Trajectory Clustering: A partition-and-group framework. In SIGMOD 2007, pp 593-604 Gudmundsson, J, Thom A, Vahrenhold J, Of motifs and goals : mining trajectory data. In Proceedings of the 20th International Conference on Advances in Geographic Information Systems, SIGSPATIAL’12 129-138. Gil Lee, J Han , X Li and H Gonzalez, Traclass : Trajectory Classification using hierarchical regionbased and trajectory-based clustering VLDB 2008. Vieira M., Bakalov P., and Tsotras V., On-Line Discovery of Flock Patterns in Spatio-Temporal data, Proceedings of 17th ACM SIGSPATIAL International conference, GIS’09, pp. 286-295 Jeung H.,Yiu M., Zhou X. Jensen C., and Shen H., Discovery of convoy in Trajectory databases. PVLDB, 2008. Kalnis P., Mamoulis, and Bakiras S., On Discovering Moving Clusters in Spatio-Temporal data, In SSTD 2005, pp. 364-381. Extracting Dense Regions From Hurricane Trajectory Data 21 5/22/2017