Biographical Sketch: Michael J. Cafarella

A. Professional Preparation

Brown University            Computer Science          A.B., 1996
University of Edinburgh     Artificial Intelligence   M.Sc., 1997
University of Washington    Computer Science          M.Sc., 2005
University of Washington    Computer Science          Ph.D., 2009

B. Appointments

Starting December 2009: Assistant Professor, Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI

C. Publications

Relevant Publications:

1) Data Integration for the Relational Web. Michael J. Cafarella, Alon Halevy, Nodira Khoussainova. PVLDB 1(2), 2009.
2) Uncovering the Relational Web. Michael J. Cafarella, Alon Halevy, Yang Zhang, Daisy Zhe Wang, Eugene Wu. Proceedings of the Eleventh International Workshop on the Web and Databases (WebDB), June 2008. Vancouver, Canada.
3) WebTables: Exploring the Power of Tables on the Web. Michael J. Cafarella, Alon Halevy, Yang Zhang, Daisy Zhe Wang, Eugene Wu. Proceedings of VLDB 2008, August 2008. Auckland, New Zealand.
4) Extracting and Querying a Comprehensive Web Database. Michael Cafarella. Proceedings of the Conference on Innovative Data Systems Research (CIDR) 2009. Asilomar, CA.
5) Navigating Extracted Data with Schema Discovery. Michael J. Cafarella, Dan Suciu, Oren Etzioni. Proceedings of the Tenth International Workshop on the Web and Databases (WebDB), June 2007. Beijing, China.

Other Publications:

1) Ontology-driven, Unsupervised Instance Population. Luke K. McDowell and Michael Cafarella. Journal of Web Semantics 6(3): 218-236, 2008.
2) Structured Querying of Web Text: A Technical Challenge. Michael J. Cafarella, Christopher Re, Dan Suciu, Oren Etzioni, Michele Banko. Proceedings of the Conference on Innovative Data Systems Research (CIDR) 2007. Asilomar, CA.
3) Open Information Extraction from the Web. Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, Oren Etzioni. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), January 2007. Hyderabad, India.
4) A Search Engine for Natural Language Applications. Michael J. Cafarella, Oren Etzioni. Proceedings of the 14th International World Wide Web Conference (WWW 2005).
5) Web-Scale Information Extraction in KnowItAll: Preliminary Results. Oren Etzioni, Michael Cafarella, Doug Downey, Stanley Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates. Proceedings of the 13th International World Wide Web Conference (WWW 2004).

D. Synergistic Activities

Michael Cafarella’s main research focus is Web information extraction and integration. His recent WebTables project obtained more than 125M distinct databases by crawling the Web for HTML tables used in a database-like setting; the result was a database corpus more than five orders of magnitude larger than any previous effort. His published Octopus system allows users to easily combine and integrate data from these extracted Web sources. In both cases, his work has solved novel data management problems prompted by the growth of the Web.

Dr. Cafarella is also the co-founder of the Nutch and Hadoop open-source projects. Nutch is an open-source search engine. Hadoop is a suite of cluster-based data management tools, including the most widely used implementation of the MapReduce distributed programming framework. Hadoop is used in both academia and industry, including MIT, Stanford, CMU, Yahoo!, Facebook, and NYTimes.com.

E. Collaborators and Other Affiliations

Non-advisor collaborators: Michele Banko (University of Washington), Nodira Khoussainova (University of Washington), Jayant Madhavan (Google, Inc.), Luke McDowell (US Naval Academy), Christopher Re (University of Wisconsin, Madison), Daisy Zhe Wang (UC Berkeley), Eugene Wu (MIT), Yang Zhang (MIT)

Graduate Advisors:
Oren Etzioni, Computer Science and Engineering, University of Washington
Dan Suciu, Computer Science and Engineering, University of Washington
Alon Halevy, Google Research, Google, Inc.