Dimitrios Gunopulos
Department of Computer Science and Engineering
University of California, Riverside
Bourns College of Engineering
Riverside, CA 92507
Tel: (951) 827-2479
Email: [email protected]
Url: www.cs.ucr.edu/~dg

8526 Alexandria St.
Riverside, CA 92508, USA

Education

1. Ph.D. in Computer Science, November 1995, Princeton University, Princeton, NJ. Thesis title: “Computing the Discrepancy.” Advisor: David P. Dobkin.
2. M.A., October 1992, Princeton University, Princeton, NJ.
3. Diploma, Computer Engineering and Informatics Department, University of Patras, Patras, Greece, 1990.

Work Experience

[8/2002 – current] Associate Professor, Computer Science and Engineering Department, University of California, Riverside.
[11/1998 – 7/2002] Assistant Professor, Computer Science and Engineering Department, University of California, Riverside.
[10/1996 – 11/1998] Research Associate, IBM Almaden Research Center, Quest Data Mining Group.
[9/1996 – 10/1996] Visiting Researcher, Department of Computer Science, University of Helsinki.
[9/1995 – 8/1996] Postdoctoral Fellow, Max-Planck-Institut für Informatik, Saarbruecken, Germany.
[9/1990 – 8/1995] Research Assistant, and for 3 semesters Teaching Assistant, Department of Computer Science, Princeton University.

Publications - Journals

1. “Indexing Multi-Dimensional Time-Series”, M. Vlachos, M. Hadjieleftheriou, E. Keogh, D. Gunopulos, accepted in VLDB Journal.
2. “Elastic Translation Invariant Matching of Trajectories”, M. Vlachos, G. Kollios, D. Gunopulos, accepted in Machine Learning Journal.
3. “Indexing Mobile Objects Using Dual Transformations.” George Kollios, Dimitris Papadopoulos, Dimitrios Gunopulos, Vassilis J. Tsotras, accepted in VLDB Journal.
4. “Selectivity Estimators for Multi-Dimensional Range Queries over Real Attributes”, D. Gunopulos, G. Kollios, V. Tsotras, C. Domeniconi, accepted in VLDB Journal.
5. “Indexing Spatio-temporal Archives.” M. Hadjieleftheriou, G. Kollios, D. Gunopulos, V. J. Tsotras, accepted in VLDB Journal.
6. “Large Margin Nearest Neighbor Classifiers”, C. Domeniconi, D. Gunopulos, J. Peng, accepted with minor revisions, IEEE Transactions on Neural Networks.
7. “Exploiting Locality for Scalable Information Retrieval in Peer-to-Peer Systems”, D. Zeinalipour-Yazti, V. Kalogeraki and D. Gunopulos, Information Systems Journal, Elsevier Publications, in press, to appear in 2004.
8. “An Efficient Density-Based Approach for Data Mining Tasks”, C. Domeniconi, D. Gunopulos, Knowledge and Information Systems, 6(6), pp. 750-770, November 2004.
9. “Information Retrieval Techniques for Peer-to-Peer Networks”, D. Zeinalipour-Yazti, V. Kalogeraki and D. Gunopulos. IEEE CiSE Magazine, Special Issue on Web Engineering, IEEE Publications, pp. 12-20, July/August 2004.
10. “Discovering all most specific sentences”. Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Sanjeev Saluja, Hannu Toivonen, Ram Sewak Sharma. TODS 28(2): 140-174 (2003).
11. “Distributed Deviation Detection in Sensor Networks.” Themistoklis Palpanas, Dimitris Papadopoulos, Vana Kalogeraki, Dimitrios Gunopulos. SIGMOD Record 32(4), Special Issue on Sensor Technology, December 2003.
12. “Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large Datasets”. G. Kollios, D. Gunopulos, N. Koudas, S. Berchtold, IEEE Trans. Knowl. Data Eng. 15(5): 1170-1187 (2003).
13. “Temporal and spatio-temporal aggregations over data streams using multiple time granularities.” Donghui Zhang, Dimitrios Gunopulos, Vassilis J. Tsotras, Bernhard Seeger. Information Systems 28(1-2): 61-84 (2003).
14. “Feature Selection for the Naive Bayesian Classifier Using Decision Trees”. Chotirat (Ann) Ratanamahatana, Dimitrios Gunopulos. Applied Artificial Intelligence 17(5-6): 475-487 (2003).
15. “Indexing Mobile Objects Using Duality Transforms.” Dimitris Papadopoulos, George Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras. IEEE Data Engineering Bulletin 25(2): 18-24 (2002).
16.
“Locally Adaptive Metric Nearest Neighbor Classification”, C. Domeniconi, J. Peng, D. Gunopulos. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 24(9), pp. 1281-1285, September 2002.
17. “Time-Series Similarity Problems and Well-Separated Geometric Sets”. B. Bollobas, G. Das, D. Gunopulos, H. Mannila, Nordic Journal of Computing, 4/2001.
18. “Indexing Animated Objects Using Spatiotemporal Access Methods”. G. Kollios, V. Tsotras, D. Gunopulos, A. Delis, M. Hadjieleftheriou. IEEE Trans. Knowledge and Data Engineering (TKDE), 13(5): 758-777, 2001.
19. “Constraint-Based Rule Mining on Large, Dense Data-Sets”. R. Bayardo, R. Agrawal, D. Gunopulos. Data Mining and Knowledge Discovery Journal, vol. 4(2/3): 217-240, 2000.
20. “Computing the maximum Bichromatic Discrepancy, with applications in Computer Graphics and Machine Learning”. D. Dobkin, D. Gunopulos, W. Maass. Journal of Computer and System Sciences, 52(3): 453-470 (1996). Preliminary version: Electronic Colloquium on Computational Complexity (ECCC) (025), 1994.

Publications - Refereed conferences, symposia, workshops

1. “Rotation invariant distance measures for trajectories.” Michail Vlachos, Dimitrios Gunopulos, Gautam Das: Proc. ACM SIGKDD 2004: 707-712.
2. “Subspace Clustering of High Dimensional Data.” Carlotta Domeniconi, Dimitris Papadopoulos, Dimitrios Gunopulos, Sheng Ma: SIAM Int. Conference in Data Mining (SDM) 2004.
3. “Identifying Similarities, Periodicities and Bursts for Online Search Queries.” Michail Vlachos, Christopher Meek, Zografoula Vagena, Dimitrios Gunopulos: ACM SIGMOD Conference 2004: 131-142.
4. “Indexing Large Human-Motion Databases.” Eamonn J. Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos, Marc Cardle: Proc. VLDB 2004: 780-791.
5. “Online Amnesic Approximation of Streaming Time Series.” Themistoklis Palpanas, Michail Vlachos, Eamonn J. Keogh, Dimitrios Gunopulos, Wagner Truppel: Proc. IEEE ICDE 2004: 338-349.
6.
“Iterative Incremental Clustering of Time Series.” Jessica Lin, Michail Vlachos, Eamonn J. Keogh, Dimitrios Gunopulos: EDBT 2004: 106-122.
7. “Sketching Techniques for Spatio-temporal Density Queries.” Marios Hadjieleftheriou, George Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras: HDMS (Hellenic Data Management Symposium) 2004.
8. “On Constructing Internet-Scale P2P Information Retrieval Systems”. D. Zeinalipour-Yazti, V. Kalogeraki and D. Gunopulos, Intl. Workshop on Databases, Information Systems and P2P Computing, VLDB Workshops 2004, pp. 122-136.
9. “Fast Motion Capture Matching with Replicated Motion Editing”. M. Cardle, M. Vlachos, S. Brooks, E. Keogh, D. Gunopulos. In Proc. of SIGGRAPH 2003, Technical Sketches & Applications, San Diego, CA.
10. “Clustering Gene Expression Data in SQL Using Locally Adaptive Metrics.” Dimitris Papadopoulos, Carlotta Domeniconi, Dimitrios Gunopulos, Sheng Ma. 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, 2003.
11. “Efficient Approximation Of Optimization Queries Under Parametric Aggregation Constraints.” Sudipto Guha, Dimitrios Gunopulos, Nick Koudas, Divesh Srivastava, Michail Vlachos. Proc. VLDB 2003: 778-789.
12. “On-Line Discovery of Dense Areas in Spatio-temporal Databases.” Marios Hadjieleftheriou, George Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras. SSTD 2003: 306-324.
13. “Peer-to-Peer Architectures for Scalable, Efficient and Reliable Media Services.” Vana Kalogeraki, Alex Delis, Dimitrios Gunopulos. Proc. IPDPS 2003, Nice, France.
14. “Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures.” Michail Vlachos, Marios Hadjieleftheriou, Dimitrios Gunopulos, Eamonn Keogh, ACM SIGKDD 2003.
15. “Correlating Synchronous and Asynchronous Data Streams”. Sudipto Guha, Dimitrios Gunopulos, Nick Koudas. ACM SIGKDD 2003.
16. “Robust Similarity Measures for Multidimensional Trajectories”. M. Vlachos, D. Gunopulos, G. Kollios. 5th Int. Workshop Mobility in Databases and Distributed Systems (MDDS 2002), in conjunction with the 13th Int. Conf. on Database and Expert Systems Applications (DEXA 2002), Aix-en-Provence, France.
17. “Indexing Mobile Objects on the Plane”. Dimitris Papadopoulos, George Kollios, Dimitrios Gunopulos, V. Tsotras. 5th Int. Workshop Mobility in Databases and Distributed Systems (MDDS 2002), in conjunction with the 13th Int. Conf. on Database and Expert Systems Applications (DEXA 2002), Aix-en-Provence, France (an extended version appears at HDMS 2002).
18. “Evaluating the Use of Statistical Phrases and Latent Semantic Indexing for Text Classification.” Huiwen Wu, Dimitrios Gunopulos. Poster paper, IEEE International Conference on Data Mining, 2002.
19. “A Local Search Mechanism for Peer-to-Peer Networks”, V. Kalogeraki, D. Gunopulos, D. Zeinalipour-Yazti. 2002 ACM Conference on Information and Knowledge Management (CIKM).
20. “Indexing Mobile Objects on the Plane”. D. Papadopoulos, G. Kollios, D. Gunopulos, V. Tsotras. 1st Hellenic Data Management Symposium (HDMS 2002).
21. “Non-Linear Dimensionality Reduction Techniques for Classification and Visualization”, M. Vlachos, C. Domeniconi, D. Gunopulos, G. Kollios, N. Koudas. 2002 ACM SIGKDD Conference.
22. “Efficient Aggregation over Objects with Extent”. D. Zhang, V. Tsotras, D. Gunopulos. 21st SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems (PODS 2002), Madison, WI.
23. “Handling Multimedia Streams in a Peer-to-Peer Network”. V. Kalogeraki, A. Delis, D. Gunopulos, short paper, 2nd Int. Workshop on Global and Peer-to-Peer Computing on Large Scale Distributed Systems, at IEEE CCGrid 2002.
24. “Efficient Local Flexible Nearest Neighbor Classification”, C. Domeniconi, D. Gunopulos, 2nd SIAM Int. Conf. on Data Mining (SDM) 2002.
25. “Temporal Aggregation over Data Streams using Multiple Granularities”. D. Zhang, D. Gunopulos, V. Tsotras, B. Seeger. 8th Conf. on Extending Database Technologies (EDBT-2002).
26.
“Efficient Indexing of Spatiotemporal Objects”, M. Hadjieleftheriou, G. Kollios, V. Tsotras, D. Gunopulos. 8th Conf. on Extending Database Technologies (EDBT-2002).
27. “Adaptive Nearest Neighbor Classification using Support Vector Machines”. C. Domeniconi, D. Gunopulos. 14th Neural Information Processing Systems Conference, NIPS 2001, Vancouver, Canada.
28. “Discovering similar trajectories”. M. Vlachos, G. Kollios, D. Gunopulos. 18th IEEE International Conf. on Data Engineering (ICDE 2002), San Jose, CA.
29. “Incremental Support Vector Machine Construction”. C. Domeniconi, D. Gunopulos. Poster paper, 1st IEEE International Conference on Data Mining (ICDM), 2001, San Jose, CA.
30. “An Efficient Approach for Approximating Multi-dimensional Range Queries and Nearest Neighbor Classification in Large Datasets”. C. Domeniconi, D. Gunopulos. 18th International Conference on Machine Learning (ICML), June 2001, Williams College, MA.
31. “Efficient Mining of SpatioTemporal Patterns”. E. Tsoukatos, D. Gunopulos. 7th International Symposium on Spatial and Temporal Databases (SSTD), 2001, Redondo Beach, CA.
32. “Efficient Computation of Temporal Aggregates with Range Predicates”. D. Zhang, A. Markowetz, V. Tsotras, D. Gunopulos, B. Seeger. 20th SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems (PODS 2001), Santa Barbara, CA.
33. “Efficient and Tunable Similar Set Retrieval”. Aristides Gionis, D. Gunopulos, Nick Koudas. 2001 ACM SIGMOD Conf. on Management of Data, Santa Barbara, CA.
34. “An Efficient Approximation Scheme for Data Mining Tasks”. G. Kollios, D. Gunopulos, N. Koudas, S. Berchtold. 17th IEEE International Conf. on Data Engineering (ICDE) 2001: 453-462, Heidelberg, Germany.
35. “An Adaptive Metric Machine for Pattern Classification.” C. Domeniconi, J. Peng, D. Gunopulos. 13th Neural Information Processing Systems Conference, NIPS 2000: 458-464, Denver, CO (proceedings published by MIT Press).
36.
“Managing Multimedia Streams in Distributed Environments using CORBA”. V. Kalogeraki, D. Gunopulos. 6th International Workshop on Multimedia Information Systems (MIS 2000), Chicago, IL.
37. “Identifying Prospective Customers”. P. Chou, E. Grossman, D. Gunopulos, P. Kamesam. 6th ACM SIGKDD Int. Conference on Knowledge Discovery and Data Mining: 447-456, 2000, Boston, MA.
38. “Adaptive Metric Nearest Neighbor Classification”. C. Domeniconi, J. Peng, D. Gunopulos. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2000, Hilton Head Island, SC.
39. “Approximating Multi-Dimensional Aggregate Range Queries Over Real Attributes”. D. Gunopulos, G. Kollios, V. Tsotras, C. Domeniconi. 2000 ACM SIGMOD Conf. on Management of Data: 463-474, Dallas, TX.
40. “Indexing Animated Objects”. George Kollios, D. Gunopulos, Vassilis Tsotras. 5th International Workshop on Multimedia Information Systems (MIS) '99, Palm Springs Desert, CA, 1999.
41. “Nearest Neighbor Queries in a Mobile Environment.” G. Kollios, D. Gunopulos, V. Tsotras, Workshop for Spatio-Temporal Database Management (STDBM-99): 119-134, co-located with VLDB-99, Edinburgh, Scotland.
42. “All-Pairs Nearest Neighbors in a Mobile Environment”. D. Gunopulos, G. Kollios, V. Tsotras. 7th Hellenic Conference on Informatics: II.23-II.30, Ioannina, Greece, 1999.
43. “Indexing Mobile Points”. George Kollios, D. Gunopulos, Vassilis Tsotras. 18th ACM Symp. on Principles of Database Systems (PODS-1999): 261-272, Philadelphia, PA.
44. “Constraint-Based Rule Mining on Large, Dense Data-Sets”. R. Bayardo, R. Agrawal, D. Gunopulos. 15th IEEE Int. Conf. on Data Engineering (ICDE 99): 188-197, Sydney, Australia.
45. “Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications”. R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan. Proc. of 1998 ACM SIGMOD Int. Conf. on Management of Data: 94-105, Seattle, WA.
46. “Mining Process Models from Workflow Logs”. R. Agrawal, D. Gunopulos, F. Leymann. 6th Intl. Conf. on Extending Database Technology (EDBT-98): 469-483, Valencia, Spain.
47. “Data mining, Hypergraph Transversals, and Machine Learning”. D. Gunopulos, R. Khardon, H. Mannila, H. Toivonen. 16th ACM Symp. on Principles of Database Systems (PODS-1997): 209-216, Tucson, AZ.
48. “Time-Series Similarity Problems and Well-Separated Geometric Sets”. B. Bollobas, G. Das, D. Gunopulos, H. Mannila. 13th ACM Symp. on Computational Geometry, 1997: 454-456, Nice, France.
49. “Episode Matching”. G. Das, R. Fleischer, L. Gasieniec, D. Gunopulos, J. Karkkainen. 8th Annual Symp. Combinatorial Pattern Matching 1997: 12-27, Aarhus, Denmark (proceedings published in LNCS 1264, Springer, 1997).
50. “Finding Similar Time Series”. G. Das, D. Gunopulos, H. Mannila, in Principles of Data Mining and Knowledge Discovery in Databases (PKDD) 97: 88-100, Trondheim, Norway (proceedings published in Lecture Notes in Artificial Intelligence (LNCS 1263), Springer, 1997).
51. “Discovering all Most Specific Sentences by Randomized Algorithms”. D. Gunopulos, H. Mannila, S. Saluja. 6th International Conference in Database Theory 1997: 215-229, Delphi, Greece (proceedings published in Lecture Notes in Computer Science (LNCS 1186), Springer, 1997).
52. “Some geometry problems in Machine Learning”. D. Dobkin, D. Gunopulos. First ACM Workshop on Applied Computational Geometry, 1996, Philadelphia, PA (proceedings published in Lecture Notes in Computer Science (LNCS 1148), Springer, 1996).
53. “Computing optimal shallow decision trees”. D. Dobkin, D. Gunopulos, S. Kasif. 4th International Symposium on Artificial Intelligence and Mathematics, 1996, Miami, FL.
54. “Concept Learning with geometric hypotheses”. D. Dobkin, D. Gunopulos. 8th Conference on Computational Learning Theory (COLT-95): 329-344, Santa Cruz, CA.
55. “Computing the Rectangle Discrepancy”. D. Dobkin, D. Gunopulos. 3rd Annual Video Review of Computational Geometry (Proc. 10th ACM Symp. on Computational Geometry: 385), Vancouver, Canada, 1994.

Publications - Book Chapters

1. “Indexing Multi-Dimensional Trajectories for Similarity Queries.” M. Vlachos, M. Hadjieleftheriou, E. Keogh, D. Gunopulos. In Spatial Databases: Technologies, Techniques and Trends, edited by Yannis Manolopoulos, Apostolos Papadopoulos (Aristotle University of Thessaloniki) and Michael Vassilakopoulos (Technological Educational Institute of Thessaloniki), Greece. Final manuscript (25 pages) accepted, to appear.
2. “Locally Adaptive Techniques for Pattern Classification”, Carlotta Domeniconi, Dimitrios Gunopulos, in Encyclopedia of Data Warehousing and Mining, Idea Group, Inc., Editor: John Wang, Professor, Montclair State University. Final manuscript (10 pages) accepted, to appear.
3. “Time Series Similarity and Indexing”. Invited chapter in the Handbook on Data Mining. D. Gunopulos, G. Das. Editor Nong Ye, publisher Lawrence Erlbaum Associates, 2003, pp. 279-304.
4. “Indexing Similar Time Series under Conditions of Noise”. Invited chapter in “Data Mining in Time Series Data Bases”, D. Gunopulos, G. Das, M. Vlachos. Editors: Abraham Kandel, Horst Bunke, Mark Last.
5. “All pairs nearest neighbors in a mobile environment.” Gunopulos, D., Kollios, G., Tsotras, V. Advances in Informatics, D.I. Fotiadis, S.D. Nikolopoulos, editors, World Scientific, April 2000, ISBN 981-02-4192-5.

Publications - Books

“Data Mining: Quality Assessment and Uncertainty Handling”. M. Vazirgiannis, M. Halkidi, D. Gunopulos. Springer-Verlag London Limited, 2003.

Publications - Other Publications

1. “Time series similarity Measures”, Dimitrios Gunopulos. Invited contribution to the Encyclopedia of Biostatistics, 2nd Edition, John Wiley Ltd., B. Everitt, Editor, to appear (7 pages).
2. “Summer School Report: DIMACS Summer School Tutorial on New Frontiers in Data Mining, August 2001”. D. Gunopulos, N. Koudas. SIGMOD Record, 10/15/2001.
3.
“Workshop Report: 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery”. D. Gunopulos, R. Rastogi. SIGKDD Explorations 2(1): 83-84 (2000).
4. “Using kernels to approximate multi-dimensional aggregate range queries over real attributes”. D. Gunopulos, G. Kollios, V. Tsotras, C. Domeniconi. Workshop on New Perspectives in Kernel-Based Learning Methods, NIPS 2000.
5. “FIGI: The Architecture of an Internet-Based Financial Information Gathering Infrastructure”. Marios Dikaiakos, D. Gunopulos. Future work paper in First Int. Workshop on Advanced Issues of E-Commerce and Web-based Information Systems (WECWIS) 99: 91-94, Santa Clara, CA (proceedings published by IEEE Computer Society).

Tutorials

1. DIMACS Summer School on New Frontiers in Data Mining, August 13-17, Piscataway, NJ. Organizers: D. Gunopulos, UCR, N. Koudas, ATT Research. url: http://dimacs.rutgers.edu/Workshops/MiningTutorial
   This was a 5-day school, funded under the auspices of the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) 2001-2004 Special Focus on Data Analysis and Mining. Also funded by NSF. There were 16 invited speakers from academia (MIT, Cornell University, UC Riverside, Columbia University, NYU, University of Toronto) and industry (IBM Almaden Research Center, IBM Watson Research Center, Lucent Bell Labs, AT&T Research, HP Labs, Verity, Telcordia, Exelixis).
2. “Mining Spatiotemporal Data”. D. Gunopulos. 7th International Symposium on Spatial and Temporal Databases (SSTD), 2001, Redondo Beach, CA.
3. “Time Series Similarity Measures and Time Series Indexing”. D. Gunopulos, Gautam Das. 2001 ACM SIGMOD Conf. on Management of Data, May 2001, Santa Barbara (Proceedings of the 2001 ACM SIGMOD Conf. on Management of Data, p. 624).
4. “Time series similarity measures”. D. Gunopulos, Gautam Das. 6th ACM SIGKDD Conf. on Knowledge Discovery and Data Mining, 2000, Boston, MA. Slides in KDD-00 Tutorial Notes: Tutorial Notes for ACM SIGKDD 2000, The Sixth International Conference on Knowledge Discovery and Data Mining, August 20-23, 2000, Boston, MA, USA, ACM, pp. 243-307.

Patents

1. “System and Method for Organizing Repositories of Semi-Structured Documents Such as Email”. R. Agrawal, R. Bayardo, D. Gunopulos, H. Ho, S. Sarawagi, J. Shafer, R. Srikant. U.S. Patent US6592627 (7/15/2003).
2. “System and method for constraint-based rule mining in large, dense data-sets”. R. Bayardo, R. Agrawal, D. Gunopulos. U.S. Patent US6278997 (8/21/2001).
3. “Prospective Customer Selection Using Customer and Market Reference Data”. P. Chou, E. Grossman, D. Gunopulos, P. Kamesam. U.S. Patent US06061658 (5/9/2000).
4. “Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications”. R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan. U.S. Patent US6003029 (12/14/1999).
5. “Mining Process Models from Workflow Logs”. R. Agrawal, D. Gunopulos, F. Leymann, D. Roller. U.S. Patent US6038538 (3/14/2000).

Invited Talks, Panels

1. “Data Mining in Geophysical Data.” Invited talk at the Workshop on Spatio-temporal Data Models of Biogeophysical Fields, Ecological Forecasting, April 8-10, 2002, San Diego Supercomputer Center, La Jolla, CA. Organizers: Dr. G. Henerby (UNL), Dr. J. Chomicki (SUNY Buffalo), Dr. T. Fountain (SDSC), Dr. K.J. Ranson (NASA). Sponsored by NSF.
2. “Approximating Range Queries”. Hewlett-Packard Labs, Palo Alto, CA, 5/11/2001.
3. “Approximating multi-dimensional aggregate range queries over real attributes”. Lucent Bell-Labs, Morristown, NJ, 9/2/2000.
4. Panelist: “From Minitel to the World-Wide Web and Beyond: The Ongoing Role of Multimedia Information Systems in Digital Government”. MIS 2000 Sixth International Workshop on Multimedia Information Systems, October 26-28, 2000, Chicago, IL. Moderator: Bill McIver (Brown Univ.).
5.
Panelist: 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2000) panel. Moderator: Umesh Dayal (HP Labs).
6. “Indexing Moving Points”. University of California, Santa Barbara, Santa Barbara, CA, 12/6/1999.
7. “Clustering for Data Mining Applications”. National Technical University of Greece, Athens, 8/30/1999.
8. “Clustering for Data Mining Applications”. Univ. of Pittsburgh, Pittsburgh, PA, 5/26/1999.
9. “Computing Association Rules”. NASA Ames Research Center, Moffett Field, CA, 8/14/1997.
10. “Computing the rectangle discrepancy”. AMS Meeting 892, Special Session in Computational Geometry, NY, NY, 1994.

Research Funding

1. “Efficient Indexing for Spatiotemporal Applications”
   Agency: National Science Foundation
   Principal Investigators: Dr. V. Tsotras, Dr. D. Gunopulos
   Amount: $436,804
   Duration: 10/1999 - 9/2002
2. “Access Methods for Spatio-Temporal Data”
   Agency: U.S. Department of Defense
   Principal Investigator: Dr. V. Tsotras
   Co-Investigator: Dr. D. Gunopulos
   Amount: $100,000
   Duration: 7/1999 - 6/2001
3. “Developing the Next Generation of Virtual Library Internet Finding Tools for the Library Community”
   Agency: Institute of Museum and Library Services
   Principal Investigators: Drs. James Thompson, Dimitrios Gunopulos
   Amount: $498,750
   Duration: 10/99 - 9/01
4. “Data mining techniques for geospatial applications”
   Agency: NSF CAREER Award
   Principal Investigator: Dr. D. Gunopulos
   Amount: $320,000
   Duration: 9/00 - 8/04
5. “Knowledge Discovery in Spatio-Temporal Data”
   Agency: U.S. Department of Defense
   Principal Investigator: Dr. D. Gunopulos
   Co-Investigator: Dr. V. Tsotras
   Amount: $100,000
   Duration: 7/2000 - 7/2002
6. “Knowledge Management over Time-Varying Geospatial Datasets”
   Agency: NSF
   Principal Investigator: Dr. Peggy Aggouris (U. of Maine)
   UCR is a subcontractor: Drs. D. Gunopulos, V. Tsotras
   Amount (UCR): $100,000
   Duration: 9/2000 - 8/2002
7.
“Fertility, Smoking and Early Mammalian Development”
   Funding Agency: Tobacco Related Disease Research Program
   PIs: Dr. Prudence Talbot (Cell Bio. and Neuroscience, UCR), Dr. D. Gunopulos
   Amount: $454,491
   Duration: 7/01 - 6/04
8. AT&T Research Award, $25,000, with M. Faloutsos, C. Ravishankar.
9. UC Regents Fellowship award, $5,000, 1999-2000.
10. “DBGlobe: A data centric approach to global computing”
   Funding Agency: European Commission Research Directorates General IST, Future and Emerging Technologies, Global Computing Initiative
   Participants: Univ. of Ioannina (Leading Institution), Computer Technology Institute, Athens University of Economics and Business, INRIA, Aalborg University, University of Cyprus, Univ. of California, Riverside
   Amount: 932,000 EURO
   Duration: 1/2002 - 12/2003
11. “ITR: Understanding Change in Spatiotemporal Data.”
   Funding Agency: NSF
   PI: Dimitrios Gunopulos. Co-PI: Vassilis Tsotras
   Amount: $140,000
   Duration: 9/2002 - 8/2004
12. “An Adaptive and Scalable Architecture for Dynamic Sensor Networks”
   Funding Agency: NSF
   PI: V. Kalogeraki, UCR. Co-PIs: D. Gunopulos, S. Krishnamurthy, V. Tsotras
   Amount: $600,000
   Duration: 9/2003 - 8/2006

Professional Activities

1. Associate Editor of IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE).
2. Member of the Editorial Advisory Board of the Elsevier Information Systems Journal.
3. Program Co-Chair and General Co-Chair: 15th International Conference on Scientific and Statistical Database Management (SSDBM 2003), Cambridge, MA, July 9-11, 2003.
4. Program Vice-Chair, IEEE International Conference on Data Engineering (ICDE) 2004.
5. Program Co-Chair: 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD), held in cooperation with SIGMOD 2000, Dallas, Texas, May 2000.
6. Proceedings Chair: Eighteenth ACM Symposium on Principles of Database Systems (PODS-99).
7.
Program Committee member: KDD 1998, DMKD 1999, ICDE 2000, SIGMOD 2001, ICML 2001, SSTD 2001, ICDM 2001, SIGKDD 2001, ICDE 2002, PODS 2002, SIGKDD 2003, CIKM 2002, SDM 2003, SIGMOD 2004, SSDBM 2004, SAC 2003, PAKDD 2004, PODS 2005.
8. Referee for proposals: NSF, NASA.
9. Participated in: Computer Science and Telecommunications Board, National Academy of Science, Workshop on the Intersection of Geospatial Information and Information Technology, Washington, D.C., October 1-2, 2001.

Students

Ph.D.:
Carlotta Domeniconi, Ph.D. Summer 2002. Currently Assistant Professor at George Mason University.
Michail Vlachos, Ph.D. Summer 2004. Currently Research Staff Member at IBM T.J. Watson Research Center.

Ph.D. (current):
Dimitrios Papadopoulos (expected graduation Winter 2004)
Demetris Zeinalipour-Yazti (expected graduation Summer 2005)
Sharmila Subramaniam
Song Li
Shalendra Chhabra

Postdoctoral Fellows:
Themis Palpanas, Ph.D. 2003, University of Toronto. Postdoctoral Fellow at UCR: 2/2003 to 8/2004. Currently Research Associate at IBM T.J. Watson Research Center.
Maria Halkidi, Ph.D. 2004, Athens University of Economics and Business. Marie Curie Postdoctoral Fellow at UCR from 5/2004 to 5/2005.

MS:
Ilias Tsoukatos (2001)
Sarika Sahni (2000)
Michail Vlachos (2001)
Atinder Singh (2002)
Sharmila Subramaniam (2003)
Demetris Zeinalipour-Yazti (2003)
Bo Wang (2001)
Jessica Lin (2002)
Huiwen Wu (2002)
Ram Sharma (2002)
Dimitris Papadopoulos (2003)

Committee Member:
George Kollios (Ph.D. 2000, Polytechnic University)
Donghui Zhang (Ph.D. 2002, UCR)
Vaggelis Hristidis (Ph.D. 2004, UC San Diego)
Marios Hadjieleftheriou (Ph.D. 2004, UCR)

Ph.D. Thesis Reader:
Pirjo Moen (Ph.D. 2000, University of Helsinki)

Research Statement

Knowledge discovery in databases is an exciting new field of computer science research, encompassing several diverse techniques for analyzing large datasets.
The goal of data mining and knowledge discovery in databases is to obtain new, interesting and actionable pieces of information. Vast amounts of data are accumulated in diverse application domains, including bioinformatics, epidemiology, business, physical sciences, web applications, and networking. Data mining is fundamentally an interdisciplinary field, borrowing and combining techniques from theory, statistics, databases and machine learning, and ultimately producing new approaches. My research work is motivated by hard real-life problems in analyzing data in those areas. I believe that it is important to work on problems that are not only interesting but also important in practice, and for this reason I have worked extensively on research problems with research labs: AT&T Research (Summer 2000, Spring 2001, Summer 2001), HP Labs (Summer 2001), Microsoft Research, and IBM Almaden (I was in the Quest Group in IBM Almaden from 10/96 to 12/98).

My research work has concentrated on the following topics:

1. Design of efficient algorithms for knowledge discovery in high dimensional data

The most important problem in the field of data mining and knowledge discovery in databases is designing efficient algorithms for data analysis tasks. This problem fundamentally differentiates the field from the related fields of machine learning and statistics, where the main focus is the accurate modeling of the data. In many data mining applications, the datasets are very large, and only efficient algorithms can make the data analysis task feasible. Since most datasets are stored in databases, or are too large to reside in main memory, an important aspect of algorithm design is the minimization of disk accesses. This work was supported by an NSF CAREER award and an NSF ITR grant. I have worked on the following problems in this area:

   1. Multi-dimensional Selectivity Estimation.
   2. Density Estimation for Geospatial Datasets.
   3. Improving the performance of clustering and classification.
   4. Locally Adaptive Nearest-Neighbor Classification.
   5. Subspace Clustering for Data Mining Applications.

2. Design of new knowledge discovery tasks for streaming and sensor data

Another important area of research is designing algorithms and techniques for new data analysis tasks. The accumulation and aggregation of data by many organizations and companies has led to the formulation of new, interesting approaches to analyzing such data. In addition, new sources of data, each with their own properties, are becoming prevalent. Examples of such new sources include spatiotemporal datasets that describe the evolution of physical phenomena or the movement of objects, and stream data that are the output of (mobile or stationary) sensors. It is important to develop specific techniques that are suitable for datasets with specific properties, because the data mining tasks we wish to perform can be quite complex. This work was supported by an NSF CAREER award, an NSF ITR award, an NSF award and a DoD grant. Below I list work that I have done in the design of new data analysis techniques:

   1. Aggregation over Stream Data.
   2. Training an SVM Classifier on Stream Data.
   3. Finding Frequent Spatio-temporal Patterns in Large Geospatial Datasets.
   4. Discovering Similar Trajectories.
   5. Discovering Workflow Graphs.
   6. Computing the Similarity of Time Series.

3. Provide database support for spatio-temporal and high dimensional data

A higher level of integration between data analysis tools and relational databases is very desirable for both database users and developers of database applications. There is a lot of research on extending database functionality, and much of this research is driven by user requirements. It is difficult to integrate batch data mining algorithms (such as clustering or classification) in the database engine because performance suffers. On the other hand, integrating techniques that allow efficient exploration of the data by the user can significantly improve the performance of exploratory data analysis tasks. Below I describe work that I have done on expanding the functionality of database systems to allow the efficient storage and querying of complex objects, such as sets, or objects with extent. This work was supported by two NSF grants and two NIMA grants. Specific problems I worked on include:

   1. Similar Set Retrieval.
   2. Indexing Spatio-temporal Objects.
   3. Indexing Mobile Objects.
   4. Indexing Trajectories.
   5. Efficient Aggregation over Objects with Extent.

4. Theoretical foundations for addressing data mining and learning problems

The Complexity of Finding Maximal Frequent Sets: The problem is related to the problem of computing association rules, a fundamental problem in data mining which has motivated a large body of research (including hundreds of papers and many different approaches). We gave the first algorithm for computing all maximal frequent sets that has a running time that depends only on the number of the maximal frequent sets, and not on the number of all frequent sets (there can be exponentially more frequent sets than maximal frequent sets). In addition we gave theoretical bounds on the performance of related algorithms. We also gave a new algorithm for finding the most interesting association rules; this algorithm has been patented, and it is used in an IBM product.

In my thesis, under the supervision of Prof. David Dobkin, I designed and implemented geometric algorithms to compute the maximal discrepancy of point sets.

5. Analysis of Bioinformatics Data

A great advance in bioinformatics is the increasing availability of gene expression data that describe the expression levels of different genes over time, as a result of different stimuli. I am currently working on the problem of analyzing data from multiple experiments, in order to discover how the expression levels of different genes combine and affect each other in biological processes. This work is supported by a TRDRP grant.

6. The Peer-to-Peer Model of Computing

The Peer-to-Peer model of computing is becoming increasingly popular due to its simplicity, cost effectiveness, robustness, fault tolerance and availability. However, it also has scalability problems, because the networks form in an ad-hoc manner and make inefficient use of the resources. One of the problems that I am currently working on is to apply data mining techniques to develop intelligent peer-to-peer architectures that allow improved scalability. One interesting application of such techniques is data analysis for sensor networks. This work is supported by an NSF grant and an EU grant.