Dimitrios Gunopulos
Department of Computer Science and Engineering
University of California, Riverside
Bourns College of Engineering
Riverside, CA 92507
Tel: (951) 827-2479
Email: [email protected]
Url: www.cs.ucr.edu/~dg
8526 Alexandria St.
Riverside, CA 92508, USA
Education
1. Ph.D. in Computer Science, November 1995, Princeton University, Princeton, NJ.
Thesis title: “Computing the Discrepancy.”
Advisor: David P. Dobkin.
2. M.A., October 1992, Princeton University, Princeton, NJ.
3. Diploma, Computer Engineering and Informatics Department, University of Patras,
Patras, Greece, 1990.
Work Experience
[8/2002 – current]
Associate Professor, Computer Science and Engineering Department,
University of California, Riverside.
[11/1998 – 7/2002]
Assistant Professor, Computer Science and Engineering Department,
University of California, Riverside.
[10/1996 – 11/1998]
Research Associate, IBM Almaden Research Center, Quest Data Mining
Group.
[9/1996 - 10/1996]
Visiting Researcher, Department of Computer Science, University of
Helsinki.
[9/1995 - 8/1996]
Postdoctoral Fellow, Max-Planck-Institut für Informatik, Saarbrücken,
Germany.
[9/1990 – 8/1995]
Research Assistant, and for 3 semesters Teaching Assistant. Department
of Computer Science, Princeton University.
Publications - Journals
1. “Indexing Multi-Dimensional Time-Series”, M. Vlachos, M. Hadjieleftheriou, E. Keogh,
D. Gunopulos, accepted in VLDB Journal.
2. “Elastic Translation Invariant Matching of Trajectories”, M. Vlachos, G. Kollios, D.
Gunopulos, accepted in Machine Learning Journal.
3. “Indexing Mobile Objects Using Dual Transformations.” George Kollios, Dimitris
Papadopoulos, Dimitrios Gunopulos, Vassilis J. Tsotras, accepted in VLDB Journal.
4. “Selectivity Estimators for Multi-Dimensional Range Queries over Real Attributes”, D.
Gunopulos, G. Kollios, V. Tsotras, C. Domeniconi, accepted in VLDB Journal.
5. “Indexing Spatio-temporal Archives.” M. Hadjieleftheriou, G. Kollios, D. Gunopulos, V.
J. Tsotras, accepted in VLDB Journal.
6. “Large Margin Nearest Neighbor Classifiers”, C. Domeniconi, D. Gunopulos, J. Peng,
Accepted with Minor Revisions, IEEE Transactions on Neural Networks.
7. "Exploiting Locality for Scalable Information Retrieval in Peer-to-Peer Systems", D.
Zeinalipour-Yazti, V. Kalogeraki and D. Gunopulos, Information Systems Journal,
Elsevier Publications, In Press, to appear in 2004.
8. "An Efficient Density-Based Approach for Data Mining Tasks", C. Domeniconi, D.
Gunopulos, Knowledge and Information Systems, 6(6), pp. 750-770, November 2004.
9. "Information Retrieval Techniques for Peer-to-Peer Networks", D. Zeinalipour-Yazti, V.
Kalogeraki and D. Gunopulos. IEEE CiSE Magazine, Special Issue on Web Engineering,
IEEE Publications, pp. 12-20, July/August 2004.
10. “Discovering all most specific sentences”. Dimitrios Gunopulos, Roni Khardon, Heikki
Mannila, Sanjeev Saluja, Hannu Toivonen, Ram Sewak Sharma. TODS 28(2): 140-174
(2003).
11. “Distributed Deviation Detection in Sensor Networks.” Themistoklis Palpanas, Dimitris
Papadopoulos, Vana Kalogeraki, Dimitrios Gunopulos. Sigmod Record 32(4), Special
Issue on Sensor Technology, December 2003.
12. “Efficient Biased Sampling for Approximate Clustering and Outlier Detection in Large
Datasets”. G. Kollios, D. Gunopulos, N. Koudas, S. Berchtold, IEEE Trans. Knowl. Data
Eng. 15(5): 1170-1187 (2003).
13. “Temporal and spatio-temporal aggregations over data streams using multiple time
granularities.” Donghui Zhang, Dimitrios Gunopulos, Vassilis J. Tsotras, Bernhard
Seeger. Information Systems 28(1-2): 61-84 (2003).
14. “Feature Selection for the Naive Bayesian Classifier Using Decision Trees”. Chotirat
(Ann) Ratanamahatana, Dimitrios Gunopulos. Applied Artificial Intelligence 17 (5-6):
475-487 (2003).
15. “Indexing Mobile Objects Using Duality Transforms.” Dimitris Papadopoulos, George
Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras. IEEE Data Engineering Bulletin 25(2):
18-24 (2002).
16. “Locally Adaptive Metric Nearest Neighbor Classification”, C. Domeniconi, J. Peng, D.
Gunopulos. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI),
24(9), pp. 1281-1285, September 2002.
17. “Time-Series Similarity Problems and Well-Separated Geometric Sets”. B. Bollobas,
G. Das, D. Gunopulos, H. Mannila, Nordic Journal of Computing, 4/2001.
18. “Indexing Animated Objects Using Spatiotemporal Access Methods”. G. Kollios, V.
Tsotras, D. Gunopulos, A. Delis, M. Hadjieleftheriou. IEEE Trans. Knowledge and Data
Engineering (TKDE), 23(5):758-777, 2001.
19. “Constraint-Based Rule Mining on Large, Dense Data-Sets”. R. Bayardo, R. Agrawal, D.
Gunopulos. Data Mining and Knowledge Discovery Journal, vol. 4(2/3): 217-240, 2000.
20. “Computing the maximum Bichromatic Discrepancy, with applications in Computer
Graphics and Machine Learning”. D. Dobkin, D. Gunopulos, W. Maass. Journal of
Computer and Systems Sciences, 52(3):453-470 (1996). Preliminary version: Electronic
Colloquium on Computational Complexity (ECCC)(025): (1994).
Publications - Refereed conferences, symposia, workshops
1. “Rotation invariant distance measures for trajectories.” Michail Vlachos, Dimitrios
Gunopulos, Gautam Das: Proc. ACM SIGKDD 2004: 707-712.
2. “Subspace Clustering of High Dimensional Data.” Carlotta Domeniconi, Dimitris
Papadopoulos, Dimitrios Gunopulos, Sheng Ma: SIAM Int. Conference in Data Mining
(SDM) 2004.
3. “Identifying Similarities, Periodicities and Bursts for Online Search Queries.” Michail
Vlachos, Christopher Meek, Zografoula Vagena, Dimitrios Gunopulos: ACM SIGMOD
Conference 2004: 131-142.
4. Eamonn J. Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos, Marc
Cardle: Indexing Large Human-Motion Databases. Proc. VLDB 2004: 780-791.
5. “Online Amnesic Approximation of Streaming Time Series.” Themistoklis Palpanas,
Michail Vlachos, Eamonn J. Keogh, Dimitrios Gunopulos, Wagner Truppel: Proc. IEEE
ICDE 2004: 338-349.
6. “Iterative Incremental Clustering of Time Series.” Jessica Lin, Michail Vlachos, Eamonn
J. Keogh, Dimitrios Gunopulos: EDBT 2004: 106-122.
7. “Sketching Techniques for Spatio-temporal Density Queries.” Marios Hadjieleftheriou,
George Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras: HDMS (Hellenic Data
Management Symposium) 2004.
8. "On Constructing Internet-Scale P2P Information Retrieval Systems". D. Zeinalipour-Yazti,
V. Kalogeraki and D. Gunopulos, Intl. Workshop On Databases, Information
Systems and P2P Computing, VLDB Workshops 2004, pp.122-136.
9. “Fast Motion Capture Matching with Replicated Motion Editing”. M. Cardle, M.
Vlachos, S. Brooks, E. Keogh, D. Gunopulos. In Proc. of SIGGRAPH 2003, Technical
Sketches & Applications. San Diego, CA.
10. “Clustering Gene Expression Data in SQL Using Locally Adaptive Metrics.” Dimitris
Papadopoulos, Carlotta Domeniconi, Dimitrios Gunopulos, Sheng Ma. 8th ACM Sigmod
Workshop on Research Issues in Data Mining and Knowledge Discovery, 2003.
11. “Efficient Approximation Of Optimization Queries Under Parametric Aggregation
Constraints.” Sudipto Guha, Dimitrios Gunopulos, Nick Koudas, Divesh Srivastava,
Michail Vlachos. Proc. VLDB 2003: 778-789.
12. “On-Line Discovery of Dense Areas in Spatio-temporal Databases.” Marios
Hadjieleftheriou, George Kollios, Dimitrios Gunopulos, Vassilis J. Tsotras. SSTD 2003:
306-324.
13. “Peer-to-Peer Architectures for Scalable, Efficient and Reliable Media Services.” Vana
Kalogeraki, Alex Delis, Dimitrios Gunopulos. Proc. IPDPS 2003, Nice, France.
14. “Indexing Multi-Dimensional Time-Series with Support for Multiple Distance
Measures.” Michail Vlachos, Marios Hadjieleftheriou, Dimitrios Gunopulos, Eamonn
Keogh, ACM SIGKDD 2003.
15. “Correlating Synchronous and Asynchronous Data Streams”. Sudipto Guha, Dimitrios
Gunopulos, Nick Koudas. ACM SIGKDD 2003.
16. “Robust Similarity Measures for Multidimensional Trajectories”. M. Vlachos, D.
Gunopulos, G. Kollios. 5th Int. Workshop Mobility in Databases and Distributed Systems
(MDDS 2002), in conjunction with the 13th Int. Conf. on Database and Expert Systems
Applications (DEXA 2002), Aix-en-Provence, France.
17. “Indexing Mobile Objects on the Plane”. Dimitris Papadopoulos, George Kollios,
Dimitrios Gunopulos, V. Tsotras. 5th Int. Workshop Mobility in Databases and
Distributed Systems (MDDS 2002), in conjunction with the 13th Int. Conf. on Database
and Expert Systems Applications (DEXA 2002), Aix-en-Provence, France (an extended
version appears at HDMS 2002).
18. “Evaluating the Use of Statistical Phrases and Latent Semantic Indexing for Text
Classification.” Huiwen Wu, Dimitrios Gunopulos. Poster paper, IEEE International
Conference on Data Mining, 2002.
19. “A Local Search Mechanism for Peer-to-Peer Networks”, V. Kalogeraki, D. Gunopulos,
D. Zeinalipour-Yazti. 2002 ACM Conference on Information and Knowledge
Management (CIKM).
20. “Indexing Mobile Objects on the Plane”. D. Papadopoulos, G. Kollios, D. Gunopulos, V.
Tsotras. 1st Hellenic Data Management Symposium (HDMS 2002).
21. “Non-Linear Dimensionality Reduction Techniques for Classification and Visualization”,
M. Vlachos, C. Domeniconi, D. Gunopulos, G. Kollios, N. Koudas. 2002 ACM SIGKDD
Conference.
22. “Efficient Aggregation over Objects with Extent”. D. Zhang, V. Tsotras, D. Gunopulos.
21st SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems (PODS
2002), Madison, WI.
23. “Handling Multimedia Streams in a Peer-to-Peer Network”. V. Kalogeraki, A. Delis, D.
Gunopulos, short paper, 2nd Int. Workshop on Global and Peer-to-Peer Computing on
Large Scale Distributed Systems, at IEEE CCGrid 2002.
24. “Efficient Local Flexible Nearest Neighbor Classification”, C. Domeniconi, D.
Gunopulos, 2nd SIAM Int. Conf. on Data Mining (SDM) 2002.
25. “Temporal Aggregation over Data Streams using Multiple Granularities”. D. Zhang, D.
Gunopulos, V. Tsotras, B. Seeger. 8th Conf. On Extending Database Technologies
(EDBT-2002).
26. “Efficient Indexing of Spatiotemporal Objects”, M. Hadjieleftheriou, G. Kollios, V.
Tsotras, D. Gunopulos. 8th Conf. On Extending Database Technologies (EDBT-2002).
27. “Adaptive Nearest Neighbor Classification using Support Vector Machines”. C.
Domeniconi, D. Gunopulos. 14th Neural Information Processing Systems Conference,
NIPS 2001, Vancouver, Canada.
28. “Discovering similar trajectories”. M. Vlachos, G. Kollios, D. Gunopulos. 18th IEEE
International Conf. on Data Engineering (ICDE 2002), San Jose, CA.
29. “Incremental Support Vector Machine Construction”. C. Domeniconi, D. Gunopulos.
Poster paper, 1st IEEE International Conference on Data Mining (ICDM), 2001, San
Jose, CA.
30. “An Efficient Approach for Approximating Multi-dimensional Range Queries and
Nearest Neighbor Classification in Large Datasets”. C. Domeniconi, D. Gunopulos. 18th
International Conference on Machine Learning (ICML), June 2001, Williams College,
MA.
31. “Efficient Mining of SpatioTemporal Patterns”. E. Tsoukatos, D. Gunopulos. 7th
International Symposium on Spatial and Temporal Databases (SSTD), 2001, Redondo
Beach, CA.
32. “Efficient Computation of Temporal Aggregates with Range Predicates”. D. Zhang, A.
Markowetz, V. Tsotras, D. Gunopulos, B. Seeger. 20th SIGMOD-SIGACT-SIGART
Symp. on Principles of Database Systems (PODS 2001), Santa Barbara, CA.
33. “Efficient and Tunable Similar Set Retrieval”. Aristides Gionis, D. Gunopulos, Nick
Koudas. 2001 ACM SIGMOD Conf. on Management of Data, Santa Barbara, CA.
34. “An Efficient Approximation Scheme for Data Mining Tasks”. G. Kollios, D. Gunopulos,
N. Koudas, S. Berchtold. 17th IEEE International Conf. on Data Engineering (ICDE)
2001: 453-462, Heidelberg, Germany.
35. “An Adaptive Metric Machine for Pattern Classification.” C. Domeniconi, J. Peng, D.
Gunopulos. 13th Neural Information Processing Systems Conference, NIPS 2000:
458-464, Denver, CO (proceedings published by MIT Press).
36. “Managing Multimedia Streams in Distributed Environments using CORBA”. V.
Kalogeraki, D. Gunopulos. 6th International Workshop on Multimedia Information
Systems (MIS 2000), Chicago, IL.
37. “Identifying Prospective Customers”. P. Chou, E. Grossman, D. Gunopulos, P.
Kamesam. 6th ACM SIGKDD Int. Conference on Knowledge Discovery and Data
Mining: 447-456, 2000, Boston, MA.
38. “Adaptive Metric Nearest Neighbor Classification”. C. Domeniconi, J. Peng, D.
Gunopulos. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
2000, Hilton Head Island, SC.
39. “Approximating Multi-Dimensional Aggregate Range Queries Over Real Attributes”. D.
Gunopulos, G. Kollios, V. Tsotras, C. Domeniconi. 2000 ACM SIGMOD Conf. on
Management of Data: 463-474, Dallas, TX.
40. “Indexing Animated Objects”. George Kollios, D. Gunopulos, Vassilis Tsotras. 5th
International Workshop on Multimedia Information Systems (MIS) '99, Palm Springs
Desert, CA, 1999.
41. “Nearest Neighbor Queries in a Mobile Environment.” G. Kollios, D. Gunopulos, V.
Tsotras, Int. Workshop on Spatio-Temporal Database Management (STDBM-99): 119-134,
co-located with VLDB-99, Edinburgh, Scotland.
42. “All-Pairs Nearest Neighbors in a Mobile Environment”. D. Gunopulos, G. Kollios, V.
Tsotras. 7th Hellenic Conference on Informatics: II.23-II.30, Ioannina, Greece, 1999.
43. “Indexing Mobile Points”. George Kollios, D. Gunopulos, Vassilis Tsotras. 18th ACM
Symp. on Principles of Database Systems (PODS-1999): 261-272, Philadelphia, PA.
44. “Constraint-Based Rule Mining on Large, Dense Data-Sets”. R. Bayardo, R. Agrawal, D.
Gunopulos. 15th IEEE Int. Conf. on Data Engineering (ICDE 99): 188-197, Sydney,
Australia.
45. “Automatic Subspace Clustering of High Dimensional Data for Data Mining
Applications”. R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan. Proc. of 1998 ACM
SIGMOD Int. Conf. on Management of Data: 94-105, Seattle, WA.
46. “Mining Process Models from Workflow Logs”. R. Agrawal, D. Gunopulos, F. Leymann.
6th Intl. Conf. on Extending Database Technology (EDBT-98): 469-483, Valencia, Spain.
47. “Data mining, Hypergraph Transversals, and Machine Learning”. D. Gunopulos, R.
Khardon, H. Mannila, H. Toivonen. 16th ACM Symp. on Principles of Database Systems
(PODS-1997): 209-216, Tucson, AZ.
48. “Time-Series Similarity Problems and Well-Separated Geometric Sets”. B. Bollobas, G.
Das, D. Gunopulos, H. Mannila. 13th ACM Symp. on Computational Geometry, 1997:
454-456, Nice, France.
49. “Episode Matching”. G. Das, R. Fleischer, L. Gasieniec, D. Gunopulos, J. Karkkainen. 8th
Annual Symp. Combinatorial Pattern Matching 1997: 12-27, Aarhus, Denmark
(proceedings published in LNCS 1264, Springer, 1997).
50. “Finding Similar Time Series”. G. Das, D. Gunopulos, H. Mannila, in Principles of Data
Mining and Knowledge Discovery in Databases (PKDD) 97: 88-100, (proceedings
published in Lecture Notes in Artificial Intelligence (LNCS 1263), Springer, 1997),
Trondheim, Norway.
51. “Discovering all Most Specific Sentences by Randomized Algorithms”. D. Gunopulos,
H. Mannila, S. Saluja. 6th International Conference in Database Theory 1997: 215-229,
Delphi, Greece. (Proceedings published in Lecture Notes in Computer Science (LNCS
1186), Springer, 1997).
52. “Some geometry problems in Machine Learning”. D. Dobkin, D. Gunopulos. First ACM
Workshop on Applied Computational Geometry, 1996, Philadelphia, PA, (proceedings
published in Lecture Notes in Computer Science (LNCS) 1148, Springer, 1996).
53. “Computing optimal shallow decision trees”. D. Dobkin, D. Gunopulos, S. Kasif. 4th
International Symposium on Artificial Intelligence and Mathematics, 1996, Miami, FL.
54. “Concept Learning with geometric hypotheses”. D. Dobkin, D. Gunopulos. 8th
Conference on Computational Learning Theory (COLT-95): 329-344, Santa Cruz, CA.
55. “Computing the Rectangle Discrepancy”. D. Dobkin, D. Gunopulos. 3rd Annual Video
Review of Computational Geometry (Proc. 10th ACM Symp. on Computational
Geometry: 385), Vancouver, Canada, 1994.
Publications - Book Chapters
1. “Indexing Multi-Dimensional Trajectories for Similarity Queries.” M. Vlachos, M.
Hadjieleftheriou, E. Keogh, D. Gunopulos. In Spatial Databases: Technologies,
Techniques and Trends. A book edited by Yannis Manolopoulos, Apostolos
Papadopoulos (Aristotle University Thessaloniki ) & Michael Vassilakopoulos
(Technological Educational Institute of Thessaloniki), Greece. Final manuscript (25
pages) accepted, to appear.
2. “Locally Adaptive Techniques for Pattern Classification”, Carlotta Domeniconi,
Dimitrios Gunopulos, in Encyclopedia of Data Warehousing and Mining, Idea Group,
Inc., Editor: John Wang, Professor, Montclair State University. Final manuscript (10
pages) accepted, to appear.
3. “Time Series Similarity and Indexing”. Invited chapter in the Handbook on Data Mining.
D. Gunopulos, G. Das. Editor Nong Ye, publisher Lawrence Erlbaum Associates, 2003,
pp 279-304.
4. “Indexing Similar Time Series under Conditions of Noise”. Invited chapter in “Data
Mining in Time Series Data Bases”, D. Gunopulos, G. Das, M. Vlachos. Editors:
Abraham Kamel, Horst Bunke, Mark Last.
5. “All pairs nearest neighbors in a mobile environment.” Gunopulos, D., Kollios, G.,
Tsotras, V. Advances in Informatics, D.I. Fotiadis, S.D. Nikolopoulos, editors, World
Scientific, April 2000, ISBN 981-02-4192-5.
Publications - Books
“Data Mining: Quality Assessment and Uncertainty Handling”. M. Vazirgiannis, M. Halkidi,
D. Gunopulos. Springer-Verlag London Limited, 2003.
Publications - Other Publications
1. “Time series similarity Measures”, Dimitrios Gunopulos. Invited contribution to the
Encyclopedia of Biostatistics, 2nd Edition, John Wiley Ltd., B. Everitt, Editor, to appear
(7 pages).
2. “Summer School Report: Dimacs Summer School Tutorial on New Frontiers in Data
Mining, August 2001”. D. Gunopulos, N. Koudas. SIGMOD Record, 10/15/2001.
3. “Workshop Report: 2000 ACM SIGMOD Workshop on Research Issues in Data Mining
and Knowledge Discovery”. D. Gunopulos, R. Rastogi. SIGKDD Explorations 2(1):
83-84 (2000).
4. “Using kernels to approximate multi-dimensional aggregate range queries over real
attributes”. D. Gunopulos, G. Kollios, V. Tsotras, C. Domeniconi. Workshop on New
Perspectives in Kernel-Based Learning Methods, NIPS 2000.
5. “FIGI: The Architecture of an Internet-Based Financial Information Gathering
Infrastructure”. Marios Dikaiakos, D. Gunopulos. Future work paper in First Int.
Workshop on Advanced Issues of E-Commerce and Web-based Information Systems
(WECWIS) 99: 91-94, Santa Clara, CA (proceedings published by IEEE Computer
Society).
Tutorials
1. DIMACS Summer School on New Frontiers in Data Mining, August 13-17, 2001, Piscataway
NJ.
Organizers: D. Gunopulos, UCR, N. Koudas, ATT Research.
url: http://dimacs.rutgers.edu/Workshops/MiningTutorial
This was a 5-day school, funded under the auspices of the Center for Discrete
Mathematics and Theoretical Computer Science (DIMACS) 2001-2004 Special Focus on
Data Analysis and Mining. Also funded by NSF. There were 16 invited speakers from
academia (MIT, Cornell University, UC Riverside, Columbia University, NYU,
University of Toronto) and industry (IBM Almaden Research Center, IBM Watson
Research Center, Lucent Bell Labs, AT&T Research, HP Labs, Verity, Telcordia,
Exelixis).
2. “Mining Spatiotemporal Data”. D. Gunopulos. 7th International Symposium on Spatial
and Temporal Databases (SSTD), 2001, Redondo Beach, CA.
3. “Time Series Similarity Measures and Time Series Indexing”. D. Gunopulos, Gautam
Das. 2001 ACM SIGMOD Conf. on Management of Data, May 2001, Santa Barbara
(Proceedings of the 2001 ACM SIGMOD Conf. on Management of Data, p. 624).
4. “Time series similarity measures”. D. Gunopulos, Gautam Das. 6th ACM SIGKDD Conf.
on Knowledge Discovery and Data Mining, 2000, Boston, MA. Slides in KDD-00
Tutorial Notes – Tutorial Notes for ACM SIGKDD 2000, The Sixth International
Conference on Knowledge Discovery and Data Mining - August 20-23, 2000, Boston,
MA USA, ACM, p. 243-307.
Patents
1. “System and Method for Organizing Repositories of Semi-Structured Documents Such
as Email”. R. Agrawal, R. Bayardo, D. Gunopulos, H. Ho, S. Sarawagi, J. Shafer, R.
Srikant, U.S. Patent US6592627 (7/15/2003).
2. “System and method for constraint-based rule mining in large, dense data-sets”. R.
Bayardo, R. Agrawal, D. Gunopulos. U.S. Patent US6278997 (8/21/2001).
3. “Prospective Customer Selection Using Customer and Market Reference Data”. P. Chou,
E. Grossman, D. Gunopulos, P. Kamesam. U.S. Patent US06061658 (5/9/2000).
4. “Automatic Subspace Clustering of High Dimensional Data for Data Mining
Applications”. R. Agrawal, J. Gehrke, D. Gunopulos, P. Raghavan. U.S. Patent
US6003029 (12/14/1999).
5. “Mining Process Models from Workflow Logs”. R. Agrawal, D. Gunopulos, F. Leymann,
D. Roller. U.S. Patent US6038538 (3/14/2000).
Invited Talks, Panels
1. “Data Mining in Geophysical Data.” Invited talk at the Workshop on Spatio-temporal
Data Models of Biogeophysical Fields, Ecological Forecasting, April 8-10, 2002, San
Diego Supercomputer Center, La Jolla, CA. Organizers: Dr. G. Henerby (UNL), Dr. J.
Chomicki (SUNY Buffalo), Dr. T. Fountain (SDSC), Dr. K.J. Ranson (NASA).
Sponsored by NSF.
2. “Approximating Range Queries”. Hewlett-Packard Labs, Palo Alto, CA, 5/11/2001.
3. “Approximating multi-dimensional aggregate range queries over real attributes”. Lucent
Bell-Labs, Morristown, NJ, 9/2/2000.
4. Panelist: “From Minitel to the World-Wide Web and Beyond: The Ongoing Role of
Multimedia Information Systems in Digital Government”. MIS 2000 Sixth International
Workshop on Multimedia Information Systems, October 26-28, 2000, Chicago, IL.
Moderator: Bill McIver (Brown Univ.).
5. Panelist: 2000 ACM SIGMOD Workshop on Research Issues in Data Mining and
Knowledge Discovery (DMKD 2000) panel. Moderator: Umesh Dayal (HP Labs).
6. “Indexing Moving Points”. University of California, Santa Barbara, Santa Barbara, CA,
12/6/1999.
7. “Clustering for Data Mining Applications”. National Technical University of Greece,
Athens, 8/30/1999.
8. “Clustering for Data Mining Applications”. Univ. of Pittsburgh, Pittsburgh, PA,
5/26/1999.
9. “Computing Association Rules”. NASA Ames Research Center, Moffett Field, CA,
8/14/1997.
10. “Computing the rectangle discrepancy”. AMS Meeting 892, Special Session in
Computational Geometry, NY, NY, 1994.
Research Funding
1. “Efficient Indexing for Spatiotemporal Applications”
Agency: National Science Foundation
Principal Investigators: Dr. V. Tsotras, Dr. D. Gunopulos
Amount: $ 436,804
Duration: 10/1999 - 9/2002.
2. “Access Methods for Spatio-Temporal Data”
Agency: U.S. Department of Defense
Principal investigator: Dr. V. Tsotras
Co-Investigator: Dr. D. Gunopulos
Amount: $ 100,000
Duration: 7/1999 – 6/2001.
3. “Developing the Next Generation of Virtual Library Internet Finding Tools for the
Library Community”
Agency: Institute of Museum and Library Services
Principal investigators: Drs. James Thompson, Dimitrios Gunopulos
Amount: $ 498,750
Duration: 10/99 – 9/01.
4. “Data mining techniques for geospatial applications”
Agency: NSF CAREER Award
Principal Investigator: Dr. D. Gunopulos
Amount: $ 320,000
Duration: 9/00 – 8/04.
5. “Knowledge Discovery in Spatio-Temporal Data”
Agency: U.S. Department of Defense
Principal investigator: Dr. D. Gunopulos
Co-Investigator: Dr. V. Tsotras
Amount: $ 100,000
Duration: 7/2000 – 7/2002.
6. “Knowledge Management over Time-Varying Geospatial Datasets”
Agency: NSF
Principal Investigator: Dr. Peggy Aggouris (U. of Maine)
UCR is a subcontractor: Drs. D. Gunopulos, V. Tsotras
Amount (UCR): $ 100,000
Duration: 9/2000 - 8/2002
7. “Fertility, smoking and Early Mammalian Development”
Funding agency: Tobacco Related Disease Research Program
PIs: Dr. Prudence Talbot (Cell Bio. and Neuroscience, UCR), Dr. D. Gunopulos
Amount: $454,491
Duration: 7/01-6/04
8. AT&T Research Award, $25,000, with M. Faloutsos, C. Ravishankar.
9. UC Regents Fellowship award, $5000, 1999-2000.
10. “DBGlobe: A data centric approach to global computing”
Funding Agency: European Commission Research Directorates General
IST, Future and Emerging Technologies, Global Computing Initiative
Participants:
Univ. of Ioannina (Leading Institution), Computer Technology Institute,
Athens University of Economics and Business, INRIA, Aalborg University,
University of Cyprus, Univ. of California, Riverside.
Amount: 932,000 EURO
Duration: 1/2002 – 12/2003
11. “ITR: Understanding Change in Spatiotemporal Data.”
Funding Agency: NSF
PI: Dimitrios Gunopulos. Co-PI: Vassilis Tsotras
Amount: $140,000
Duration: 9/2002 – 8/2004
12. “An Adaptive and Scalable Architecture for Dynamic Sensor Networks”
Funding Agency: NSF
PI: V. Kalogeraki, UCR. Co-PIs: D. Gunopulos, S. Krishnamurthy, V. Tsotras
Amount: $600,000
Duration: 9/2003 - 8/2006.
Professional Activities
1. Associate Editor of IEEE Transactions on Knowledge and Data Engineering (IEEE
TKDE).
2. Member in the Editorial Advisory Board of the Elsevier Information Systems Journal.
3. Program Co-Chair and General Co-Chair: 15th International Conference on Scientific and
Statistical Database Management (SSDBM 2003), Cambridge, MA, July 9-11, 2003.
4. Program Vice-Chair, IEEE International Conference on Data Engineering (ICDE) 2004.
5. Program Co-Chair: 2000 ACM SIGMOD Workshop on Research Issues in Data Mining
and Knowledge Discovery (DMKD), held in cooperation with SIGMOD'2000, Dallas,
Texas, May 2000.
6. Proceedings Chair: Eighteenth ACM Symposium on Principles of Database Systems
(PODS-99).
7. Program Committee member: KDD 1998, DMKD 1999, ICDE 2000, SIGMOD 2001,
ICML 2001, SSTD 2001, ICDM 2001, SIGKDD 2001, ICDE 2002, PODS 2002,
SIGKDD 2003, CIKM 2002, SDM 2003, SIGMOD 2004, SSDBM 2004, SAC 2003,
PAKDD 2004, PODS 2005.
8. Referee for proposals: NSF, NASA.
9. Participated in: Computer Science and Telecommunications Board, National Academy of
Science, Workshop on the Intersection of Geospatial Information and Information
Technology, Washington, D.C., October 1-2, 2001.
Students
Ph.D.:
Carlotta Domeniconi: Ph.D. Summer 2002. Currently Assistant Professor at George Mason
University.
Michail Vlachos: Ph.D. Summer 2004. Currently Research Staff Member at IBM T.J.
Watson Research Center.
Ph.D. (current):
Dimitrios Papadopoulos (expected graduation Winter 2004)
Demetris Zeinalipour-Yazti (expected graduation Summer 2005)
Sharmila Subramaniam
Song Li
Shalendra Chhabra
Postdoctoral Fellows:
Themis Palpanas: Ph.D. 2003, University of Toronto. Postdoctoral Fellow at UCR: 2/2003
to 8/2004. Currently Research Associate at IBM T.J. Watson Research Center.
Maria Halkidi: Ph.D. 2004, Athens University of Economics and Business. Marie Curie
Postdoctoral Fellow at UCR from 5/2004 to 5/2005.
MS:
Ilias Tsoukatos (2001)
Sarika Sahni (2000)
Michail Vlachos (2001)
Atinder Singh (2002)
Sharmila Subramanian (2003)
Demetris Zeinalipour-Yazti (2003)
Bo Wang (2001)
Jessica Lin (2002)
Huiwen Wu (2002)
Ram Sharma (2002)
Dimitris Papadopoulos (2003)
Committee Member:
George Kollios (Ph.D. 2000, Polytechnic University)
Donghui Zhang (Ph.D. 2002, UCR)
Vaggelis Hristidis (Ph.D. 2004, UC San Diego)
Marios Hadjieleftheriou (Ph.D. 2004, UCR)
Ph.D. Thesis Reader:
Pirjo Moen (Ph.D. 2000, University of Helsinki)
Research Statement
Knowledge discovery in databases is an exciting new field of computer science research,
encompassing several diverse techniques for analyzing large datasets. The goal of data mining
and knowledge discovery in databases is to obtain new, interesting and actionable pieces of
information. Vast amounts of data are accumulated in diverse application domains, including
bioinformatics, epidemiology, business, physical sciences, web applications, and networking.
Data mining is fundamentally an interdisciplinary field, borrowing and combining techniques
from theory, statistics, databases and machine learning, and ultimately producing new
approaches.
My research work is motivated by hard real-life problems in analyzing data in those
areas. I believe that it is important to work on problems that are not only interesting but also
important in practice, and for this reason I have worked extensively on research problems with
research labs: AT&T Research (Summer 2000, Spring 2001, Summer 2001), HP Labs (Summer
2001), Microsoft Research, and IBM Almaden (I was in the Quest Group in IBM Almaden from
10/96 to 12/98). My research work has concentrated on the following topics:
1. Design of efficient algorithms for knowledge discovery in high dimensional data
The most important problem in the field of data mining and knowledge discovery in
databases is designing efficient algorithms for data analysis tasks. This problem fundamentally
differentiates the field from the related fields of machine learning and statistics, where the main
focus is the accurate modeling of the data. In many data mining applications, the datasets are very
large, and only efficient algorithms can make the data analysis task feasible.
Since most datasets are stored in databases, or they are too large to reside in main memory,
an important aspect of algorithm design is the minimization of disk accesses. This work was
supported by an NSF Career award, and an NSF ITR grant.
I have worked on the following problems in this area:
1. Multi-dimensional Selectivity Estimation.
2. Density Estimation for Geospatial Datasets.
3. Improving the performance of clustering and classification.
4. Locally Adaptive Nearest-Neighbor Classification.
5. Subspace Clustering for Data Mining Applications.
2. Design new knowledge discovery tasks for streaming and sensor data
Another important area of research is designing algorithms and techniques for new data
analysis tasks. The accumulation and aggregation of data by many organizations and companies
has led to the formulation of new interesting approaches to analyzing such data. In addition, new
sources of data, each with their own properties, are becoming prevalent.
Examples of such new sources of data include spatiotemporal datasets that describe the
evolution of physical phenomena or the movement of objects, and stream data that are the output
of (mobile or stationary) sensors. It is important to develop specific techniques that are suitable
for datasets with specific properties because the data mining tasks we wish to perform can be
quite complex. This work was supported by an NSF Career award, an NSF ITR award, an NSF
award and a DoD grant.
Below I list work that I have done on the design of new data analysis techniques:
1. Aggregation over Stream Data.
2. Training an SVM Classifier on Stream Data.
3. Finding Frequent Spatio-temporal Patterns in Large Geospatial Datasets.
4. Discovering Similar Trajectories.
5. Discovering Workflow Graphs.
6. Computing the Similarity of Time Series.
3. Provide database support for spatio-temporal and high dimensional data
A higher level of integration between data analysis tools and relational databases is very
desirable for both database users and developers of database applications. There is a lot of
research on extending database functionality, and much of this research is driven by user
requirements. It is difficult to integrate batch data mining algorithms (such as clustering or
classification) in the database engine because performance suffers. On the other hand, integrating
techniques that allow efficient exploration of the data by the user can significantly improve the
performance of exploratory data analysis tasks. Below I describe work that I have done on
expanding the functionality of database systems to allow the efficient storage and querying of
complex objects, such as sets, or objects with extent. This work was supported by two NSF
grants and two NIMA grants.
Specific problems I worked on include:
1. Similar Set Retrieval.
2. Indexing Spatio-temporal Objects.
3. Indexing Mobile Objects.
4. Indexing Trajectories.
5. Efficient Aggregation over Objects with Extent.
4. Theoretical foundations for addressing data mining and learning problems
The Complexity of Finding Maximal Frequent Sets: The problem is related to the
problem of computing association rules, a fundamental problem in data mining which has
motivated a large body of research (including hundreds of papers and many different approaches).
We presented the first algorithm for computing all maximal frequent sets that has a running time
that depends only on the number of the maximal frequent sets, and not on the number of all
frequent sets (there can be exponentially more frequent sets than maximal frequent sets). In
addition we gave theoretical bounds on the performance of related algorithms. We also gave a
new algorithm for finding the most interesting association rules; this algorithm is patented,
and it is used in an IBM product.
In my thesis, under the supervision of Prof. David Dobkin, I designed and implemented
geometric algorithms to compute the maximal discrepancy of point sets.
5. Analysis of Bioinformatics Data
A great advance in bioinformatics is the increasing availability of gene expression data
that describe the expression levels of different genes over time, as a result of different stimuli. I
am currently working on the problem of analyzing data from multiple experiments, in order to
discover how the expression levels of different genes combine and affect each other in biological
processes. This work is supported by a TRDRP grant.
6. The Peer-to-Peer Model of Computing
The Peer-to-Peer model of computing is becoming increasingly popular due to its
simplicity, cost effectiveness, robustness, fault tolerance and availability. It also has scalability
problems, because the networks form in an ad-hoc manner and make inefficient use of the
resources. One of the problems that I am currently working on is to apply data mining techniques
to develop intelligent peer-to-peer architectures that allow improved scalability. One interesting
application of such techniques is data analysis for sensor networks. This work is
supported by an NSF grant and an EU grant.