Download Slides - Cs.UCLA.Edu
Transcript
CS240A: Databases and Knowledge Bases
Introduction
Carlo Zaniolo
Department of Computer Science
University of California, Los Angeles

Database Systems
- During the late 60s:
  - IMS and other hierarchical DBMSs
  - Codasyl-compliant DBMSs using the network model
- Relational DBMSs were proposed [by E.F. Codd] in the 70s
- 10+ years of R&D led to relational DBMSs and SQL
- Extraordinary success from both a research and a commercial viewpoint (IBM, Oracle, …)
- Relational DBMSs were covered in CS143
- But starting in the mid 80s, DBMSs have faced major technical and commercial challenges, forcing a major evolution in these systems: this is the topic of CS240A!

DBMS Vendors
- IBM: System R, DB2
- Oracle
- MS SQL Server
- Smaller players: Sybase, Informix, Teradata/NCR

Changes and Challenges
- New applications and data types (e.g., spatio-temporal and multimedia information):
  - Object Oriented databases
  - Datablades and extenders
- Expert systems, rule-based computing, and knowledge management:
  - Deductive databases and recursive queries
  - Active databases and rules
- The WEB and XML:
  - Publishing databases using XML
  - XQuery: the new query language for XML data
- Decision Support, Knowledge Discovery, Big Data, Machine Learning, …, Data Science:
  - OLAP applications
  - Data Mining

Evolution of SQL Standards
- SQL89 and SQL2 (a.k.a. SQL92): strictly relational.
- SQL3: working documents discussing new specs for OR (object-relational) systems, but also for recursion, active rules, OLAP, and OLAP functions.
- SQL:1999 and, with minor changes, SQL:2003.
- But evolution continues: user-defined indexes, user-defined aggregates, XML, etc.
- In this course we investigate how SQL and relational systems are being extended to face the new applications.
- We will often study languages other than SQL as a framework for research.

The Main Problem of SQL: Inadequate Expressive Power
- For instance, SQL cannot support the complex queries and recursion needed in several applications, such as Bill of Materials applications.
- Thus database applications are now developed in procedural languages with embedded SQL statements.
- An impedance mismatch between SQL and the host language (different data types, different programming paradigm) slows down both application development and execution.
- Two approaches to solve the problem:
  - Making the query language more powerful: deductive databases
  - Extending programming languages with DB capabilities: this is the approach taken by OO DBMSs and OR DBMSs

Expressive Power: Relational Completeness
All relational languages suffer from the same expressive-power problems:
1. Relational Algebra,
2. Domain Relational Calculus,
3. Tuple Relational Calculus, and
4. Nonrecursive safe Datalog rules.
These languages are equivalent in terms of expressive power, and programs (i.e., queries) written in one language are easily mapped into programs written in another.
The notion of Relational Completeness (RC) defines the class of queries expressible using relational algebra or, equivalently, using safe relational calculus queries.
RC was proposed in the 70s as a minimum requirement for all database query languages (one not met by most query languages at that time).
But nowadays RC is not enough!

Datalog

SQL's Close Relations
1. QBE (Query by Example): a two-dimensional rendering of domain calculus.
2. QUEL and SQL: inline, keyword-based versions of tuple relational calculus, with extensions such as updates and aggregates.
3. Datalog: a rule-oriented, logic-based refinement of domain calculus.
Datalog is the best candidate for more powerful query languages because:
- Its formal framework is based on first-order logic.
- It supports the rule-based programming paradigm, which is the key to expert systems and knowledge-based systems.
- It is similar to Prolog, which is more procedural.
Big Data has brought a renewed interest in Datalog.
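To make the recursion point concrete, here is a minimal sketch (not taken from the slides) of how a Bill of Materials query can be written as two Datalog-style rules and evaluated bottom-up until no new facts appear. The part names, the SUBPART relation, and the all_subparts function are hypothetical and chosen only for illustration. Python stands in for a Datalog engine, and the loop is the simple naive fixpoint, not an optimized evaluation strategy.

```python
# Hypothetical base relation subpart(Assembly, Component).
SUBPART = {
    ("bike", "wheel"),
    ("bike", "frame"),
    ("wheel", "spoke"),
    ("wheel", "tire"),
    ("tire", "valve"),
}


def all_subparts(subpart):
    """Naive bottom-up evaluation of the two rules

        all_sub(X, Y) <- subpart(X, Y).
        all_sub(X, Z) <- all_sub(X, Y), subpart(Y, Z).

    The loop repeats the recursive rule until it derives nothing new.
    """
    derived = set(subpart)  # facts produced by the non-recursive rule
    while True:
        # One application of the recursive rule: join derived with subpart.
        new = {
            (x, z)
            for (x, y) in derived
            for (y2, z) in subpart
            if y == y2
        }
        if new <= derived:  # fixpoint reached
            return derived
        derived |= new


closure = all_subparts(SUBPART)
# Every part that goes, directly or indirectly, into a bike.
bike_parts = sorted(part for (assembly, part) in closure if assembly == "bike")
print(bike_parts)
```

The number of join rounds depends on the depth of the part hierarchy, which is not known in advance. That is why a fixed nonrecursive relational expression cannot answer the query for arbitrary data, while the recursive rule can.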
The Bigger Picture
- Assemblers, Operating Systems (early 60s …)
- Languages and Compilers (late 60s …)
- Information Management Systems and Data Base Management Systems (DBMS) (70s …)
- GUIs (80s …)
- Networks (60s) and the WEB (90s) and beyond
- Year 2000 and beyond: big data analytics
- 2010 and so on …: Datalog's renaissance

Workplan and Grade Basis
Grade basis for CS240A:
- Midterm: 40%
- Homework and Assignments: 10%
- Final Projects and Reports: 50% (XML 15%, DM 35%)
The take-home final consists of two projects:
- The first project will be about supporting temporal queries in XML and JSON.
- The second project will ask you to write decision support queries in SQL and DeALS.