SE 611723: Advanced DBMS
Shadi Aljawarneh

Course Description
Database management systems are standard tools that enable the storage and retrieval of data within modern information systems. Units introducing database concepts are now an accepted part of most computer science courses. These introductory units tend to concentrate on the use of relational database systems. This advanced module, in contrast, deals with implementation aspects of relational systems and tests candidates' knowledge of current enhancements to relational database systems, as well as object-oriented, RDF, and XML database systems.

Course Aims
This course aims to help you to:
1. Develop your knowledge and understanding of the underlying principles of Relational Database Management Systems.
2. Build up your capacity to learn advanced DBMS features.
3. Develop your competence in enhancing database models using distributed databases.
4. Build up your capacity to implement and maintain an efficient database system using emerging trends.

Course Objectives
Upon completion of the course, you should be able to:
1. Describe the basic concepts of relational database design.
2. Explain database implementation and tools.
3. Describe SQL and the database system catalog.
4. Describe the process of database query processing and evaluation.
5. Discuss the concepts of transaction management.
6. Explain database security and authorization.
7. Describe the design of distributed databases and big data systems.
8. Know how to design with databases, XML, and RDF.
9. Describe the basic concepts of data warehousing and data mining.
10. Discuss emerging database models, technologies, and applications.

Reading material will consist primarily of research papers. All students will have to present a research paper of their choice, either from the list below or another paper subject to the instructor's approval. There will also be two exams (midsem/endsem), assignments, and a course project.
Anyone who does an exceptional course project that has the potential to be a publishable paper is eligible for a straight Excellent grade. Otherwise, the grading breakdown is: midsem 30, endsem 40, project 20, and assignments plus seminar presentation 10 (the split between these last two will depend on whether we have individual or joint seminars, which depends on the final enrollment).

Assignments
To be decided.

Project
The project must be either an implementation-oriented project or a literature survey. (You may still need to do some literature survey to figure out your project, though.) Projects should be done in groups of 2. A basic project will take any of the papers we study in the course, or other related papers, implement the algorithms in the paper, and do a very basic performance study. However, I would expect most projects to improve upon existing techniques. A more advanced project would take a problem specification for which no solution is publicly available, figure out how to solve it, and implement the solution.

Textbook (for background material only)
Database System Concepts, 6th Ed. Avi Silberschatz, Hank Korth, and S. Sudarshan. McGraw Hill, 2010. (book home page)

Database Design and Implementation
1. (Oct 13) Part 1: Relational Databases.
Related papers, not required reading:
Prasan's full thesis
The Volcano Optimizer Generator: Extensibility and Efficient Search. Goetz Graefe, William J. McKenna. ICDE 1993: 209-218
The Cascades Framework for Query Optimization. Goetz Graefe. IEEE Data Eng. Bull. 18(3): 19-29 (1995)
2. (Oct 13) Part 2: Database Design
3. (Oct 20) Data Storage and Querying
Related papers, not required reading:
4. (Oct 27)
Optimizing Nested Queries with Parameter Sort Orders. Ravindra Guravannavar, Ramanujam H.S., S. Sudarshan. Talk (ppt)
Reducing Order Enforcement Cost in Complex Query Plans. Ravindra Guravannavar and S. Sudarshan. ICDE 2007. Tech Report (@arxiv.org). Talk (ppt)
Database Security & Authorization, from the textbook "Advanced Database Management System" by the National Open University of Nigeria
Execution strategies for SQL subqueries. Mostafa Elhemali, Cesar A. Galindo-Legaria, Torsten Grabs, Milind Joshi. SIGMOD Conference 2007: 993-1004 (Talk from SIGMOD 07 (ppt))
Related papers, not required reading:
Query Processing for SQL Updates. Cesar A. Galindo-Legaria, Stefano Stefani, Florian Waas. SIGMOD Conference 2004: 844-849. Talk (ppt)

Adaptive Query Processing
Eddies: Continuously Adaptive Query Processing. Avnur and Hellerstein. SIGMOD 2000. (Talk taken from http://web.cs.wpi.edu/~cs561/s05/talks/eddy-sigmod00cs561.ppt) (Adaptive Query Processing using Eddies (ppt) by Amol Deshpande) (Jan 25, 2011) Talk (ppt)

5. (Nov 3) Massively Parallel Data Management Systems (a.k.a. Big Data Systems)
Background reading: The parallel database chapter and the distributed database chapter from DB Concepts. Slides: Chapter 18: Parallel Databases, and Chapter 19: Distributed Databases (plus 3PC, not available on book site)
6. (Nov 10, 17)
Unit 1: Object-Oriented Database
Unit 2: Database and XML
Unit 3: Introduction to Data Warehousing
Unit 4: Introduction to Data Mining
7. (Nov 24) Bigtable: A Distributed Storage System for Structured Data. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. OSDI 06. Video of talk by Jeff Dean: local mp4 copy OR on video.google.com. Talk (ppt)
Related papers, not required reading:
8. (Nov 24) Megastore: Providing Scalable, Highly Available Storage for Interactive Services. Jason Baker, Chris Bond, James C. Corbett, JJ Furman, Andrey Khorlin, James Larson, Jean-Michel Leon, Yawei Li, Alexander Lloyd, and Vadim Yushprakh. CIDR 2011
You can also read about the Google AppEngine DataStore API, an API in Python.
PNUTS: Yahoo!'s Hosted Data Serving Platform. Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver and Ramana Yerneni. VLDB (industry track) 2008. VLDB Talk by Brian Cooper (ppt)
Related papers, not required reading:
9. (Nov 24) Database implementation on S3 (Brantner et al., SIGMOD 2008)
Asynchronous view maintenance for VLSD databases. Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Raghu Ramakrishnan. SIGMOD 2009. Talk (odp) and (pdf). Old talk from 2010 (ppt)
Related papers, not required reading:
10. (Nov 24) The Megastore paper (see above), to understand how it does asynchronous maintenance of indices.
SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets. Ronnie Chaiken, Bob Jenkins, Per-Åke Larson, Bill Ramsey, Darren Shakib, Simon Weaver, and Jingren Zhou. VLDB 2008
Related papers, not required reading:
Hive - a petabyte scale data warehouse using Hadoop. A. Thusoo, J. S. Sarma, N. Jain, Shao Zheng, P. Chakka, Zhang Ning, S. Antony, Liu Hao, and R. Murthy. ICDE 2010
Pig Latin: A Not-So-Foreign Language for Data Processing. Chris Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar and Andrew Tomkins. SIGMOD 2008. Talk (pptx), Talk (ppt)

Week of 18-23: Midsemester Exam

11. (Dec 1) IR and DB
12 and 13. (Dec 8) Guest lecture on HSearch by Abinasha Karana, Founder and CTO, Bizosys, Bangalore. Talk (pdf)
Keyword Searching and Browsing in Databases using BANKS. Gaurav Bhalotia, Charuta Nakhe, Arvind Hulgeri, Soumen Chakrabarti and S. Sudarshan. ICDE 2002. Talk (ppt)
Related papers, not required reading:
Bidirectional Expansion For Keyword Search on Graph Databases. Varun Kacholia, Shashank Pandit, Soumen Chakrabarti, S. Sudarshan, Rushi Desai and Hrishikesh Karambelkar. VLDB 2005 (Talk: ppt)

Big Data (again)
14. (Dec 15) Spanner: Google's Globally-Distributed Database. James C. Corbett et al. OSDI 2012. Talk (pptx) by Sagar Chordia

Column Stores
15. (Dec 22) Column-stores vs. row-stores: how different are they really? Daniel J. Abadi, Samuel Madden, Nabil Hachem. SIGMOD Conference 2008: 967-980. Talk from 2010 and Talk from 2011 (ppt) by Paresh Modak and Souman Mandal; Talk (pdf) (source files) by Subhro Bhattacharyya and Souvik Pal.
See also the VLDB 09 tutorial on column stores by Harizopoulos, Abadi and Boncz.

Streaming Data
16. (Dec 29) Monitoring Streams - A New Class of Data Management Applications. Donald Carney, Ugur Çetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Greg Seidman, Michael Stonebraker, Nesime Tatbul, Stanley B. Zdonik. VLDB 2002: 215-226. Talk (pptx) by Joydip Datta and Debarghya Majumdar
You must also read this talk: (PODS 2002 talk by Motwani)
Related papers, not required reading:
17. (Dec 29) Big Data yet again (Self Study)
Talk (pptx) by Ajay Gupta, Vinit Deodhar
Aurora: A New Model and Architecture for Data Stream Management. Abadi, D. J., Carney, D., Cetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., and Zdonik, S. The VLDB Journal 12 (2003), 120-139.
The Design of the Borealis Stream Processing Engine. Abadi, D., Ahmad, Y., Balazinska, M., Cetintemel, U., Cherniack, M., Hwang, J.-H., Lindner, W., Maskey, A. S., Rasin, A., Ryvkina, E., Tatbul, N., Xing, Y., and Zdonik, S. In Proceedings of the 2nd Conference on Innovative Data Systems Research (CIDR) (Jan. 2005), pp. 277-289.
Models and issues in data stream systems. Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom. PODS 2002 (PODS 2002 talk by Motwani)
Physically Independent Stream Merging. Badrish Chandramouli, David Maier and Jonathan Goldstein. ICDE 2012. Talk (pdf) by Amol Bhangdiya and Pushkar Khadilkar

Declarative Data Processing (outside of databases)
18. (Jan 5)
19. (Jan 5) Calvin: Fast Distributed Transactions for Partitioned Database Systems. Alexander Thomson, Thaddeus Diamond, Shu-Chun Weng, Kun Ren, Philip Shao, and Daniel J. Abadi. SIGMOD 2012. Talk (pptx) by K. V. Mahesh and Abhishek Gupta
Declarative Networking. Boon Thau Loo, Tyson Condie, Minos Garofalakis, David E. Gay, Joseph M. Hellerstein, Petros Maniatis, Raghu Ramakrishnan, Timothy Roscoe, and Ion Stoica. CACM 52(11), Nov 2009. Talk (pptx) by Harsh Vardhan and Sandeep Joshi
Scalability for Virtual Worlds. Nitin Gupta, Alan J. Demers, Johannes Gehrke, Philipp Unterbrunner, Walker M. White. ICDE 2009. Talk (pptx) by Pratik Patre and Biplab Kar. Talk (ppt) by Siddharth Chinoy and Zibran Shaikh
Related papers, not required reading:
Distributed Databases
SEMMO: A Scalable Engine for Massively Multiplayer Online Games (Demonstration Paper). Nitin Gupta, Alan Demers, and Johannes Gehrke. SIGMOD 2008
Database Research Opportunities in Computer Games. Walker White, Christoph Koch, Nitin Gupta, Johannes Gehrke, and Alan Demers. In SIGMOD Record, September 2007.

RDF Database
20. (Jan 12)
21. (Jan 12) RDF-3X: a RISC-style Engine for RDF. Thomas Neumann, Gerhard Weikum. VLDB 2008
Discussion on future of data management
Talk (pptx) by Pankaj Vanwari