Mining Traffic Stream and Vehicle/Pedestrian Networks

Philip S. Yu
Professor & Wexler Chair in Information Technology
Computer Science Department
University of Illinois at Chicago

Problem Statement and Motivation
• With advances in sensor, GPS, and wireless technologies, transportation systems are transforming from data poor to data rich.
• Challenges:
  • Real-time requirement
  • Complexity of the data
  • Spatio-temporal correlation
  • Noisy or uncertain data
  • Privacy preservation

Prediction of congested areas

GPS applications
- database compaction through object simplification
- faster pattern matching

Collision Detection
Collision detection can be made more efficient using segmentation
- approximate object movement

Technical Approach
• Develop real-time stream processing capability to address monitoring-type applications
• Develop new scalable mining techniques to discover traffic and traversal patterns
• Explore graph OLAP techniques to zoom in and out of a huge graph for analysis at different granularities
• Explore learning from heterogeneous sources to address the lack of training examples

Key Achievements and Future Goals
• Real-time data stream mining algorithms that handle concept drift and uncertainty
• Indexing and similarity search methods for trajectories
• Online analytical processing (OLAP) paradigms for information networks
• Privacy preservation techniques
• Learning from heterogeneous examples
• Explore green technology

Publications
• C. Aggarwal, P.S. Yu, "A Framework for Clustering Uncertain Data Streams", IEEE Intl. Conf. on Data Engineering, 2008.
• A. Anagnostopoulos, M. Vlachos, E. Keogh, P.S. Yu, "Global Distance-based Segmentation of Trajectories", ACM KDD 2006.
• C. Aggarwal, P.S. Yu, "Privacy-Preserving Data Mining: Models and Algorithms", Springer, 2008.
• B. Fung, K. Wang, P.S. Yu, "Anonymizing Classification Data for Privacy Preservation", IEEE Trans. Knowledge and Data Eng., Vol. 19, No. 5, May 2007.
• X. Shi, Q. Liu, W. Fan, Q. Yang, P.S. Yu, "Predictive Modeling with Heterogeneous Sources", SIAM Data Mining Conference, 2010.
• C. Chen, X. Yan, F. Zhu, J. Han, P.S. Yu, "Graph OLAP: A Multidimensional Framework for Graph Data Analysis", Knowledge and Information Systems, Vol. 21, No. 1, 2009.
• B. Gedik, L. Liu, P.S. Yu, "ASAP: An Adaptive Sampling Approach to Data Collection in Sensor Networks", IEEE Trans. Parallel Distributed Systems, 2007.
• B. Gedik, K.L. Wu, P.S. Yu, L. Liu, "MobiQual: QoS-aware Load Shedding in Mobile CQ Systems", IEEE Intl. Conf. on Data Engineering, 2008.
• K.L. Wu, S.K. Chen, P.S. Yu, "Incremental Processing of Continual Range Queries over Moving Objects", IEEE Trans. Knowledge and Data Eng., Vol. 18, No. 11, 2006.
• W. Li, W.K. Ng, X.H. Dang, K. Zhang, P.S. Yu, "Density-Based Clustering of Data Streams at Multiple Resolutions", ACM Trans. Knowledge Discovery from Data, Vol. 3, No. 3, 2009.
• X. Gu, S. Papadimitriou, P.S. Yu, S.P. Chang, "Toward Learning-based Failure Management for Distributed Stream Processing Systems", IEEE Intl. Conf. on Distributed Computing Systems, 2008.
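
Illustrative sketch: trajectory simplification

The GPS applications item mentions database compaction through object simplification. The slides do not say which simplification method is used, so the sketch below swaps in the classic Douglas-Peucker line simplification purely as an illustration; it is not the segmentation method from the cited publications. The function names `point_segment_distance` and `simplify`, the `tolerance` parameter, and the sample track are all assumptions made for this example.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of a list of (x, y) points.

    Keeps the endpoints and recursively keeps the point farthest from
    the current chord whenever it lies more than `tolerance` away.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        # Every interior point is close enough to the chord: drop them.
        return [first, last]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    # The split point ends `left` and starts `right`; keep it once.
    return left[:-1] + right


# Hypothetical GPS track: a nearly straight run, a turn, another straight run.
track = [(0, 0), (1, 0.05), (2, -0.04), (3, 0), (4, 3), (5, 3.02), (6, 3)]
compact = simplify(track, tolerance=0.1)
print(len(track), "points reduced to", len(compact))
```

A larger `tolerance` stores fewer points per trajectory at the cost of a coarser approximation, which is the trade-off behind both compaction and faster pattern matching on the simplified objects.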