* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download marked - Kansas State University
Open Database Connectivity wikipedia , lookup
Serializability wikipedia , lookup
Functional Database Model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Relational algebra wikipedia , lookup
Concurrency control wikipedia , lookup
Clusterpoint wikipedia , lookup
Database model wikipedia , lookup
Lecture 01 of 42 Database Architecture: Client-Server Models, Relational DB Definitions Wednesday, 27 August 2008 William H. Hsu Department of Computing and Information Sciences, KSU KSOL course page: http://snipurl.com/va60 Course web site: http://www.kddresearch.org/Courses/Fall-2008/CIS560 Instructor home page: http://www.cis.ksu.edu/~bhsu Reading for Next Class: Sections 2.1 – 2.3, Silberschatz et al., 5th edition Syllabus and Introductory Handouts CIS 560: Database System Concepts Wedneday, 27 Aug 2008 Computing & Information Sciences Kansas State University Lecture Outline  Reading for Next Class: Sections 2.1 – 2.3, Silberschatz et al. 5e  Today and Friday: Basic Relational DB Principles  Relations  Database definitions  Records  Fields  Tables  Relations  Next Week: Relational Algebra and SQL Intro  Relational operators: PROJECT, SELECT, JOIN  Variations on relational joins  Implementation  Examples and Exercises  Look at Chapter 2 examples  Exercises: relational algebra formulas CIS 560: Database System Concepts Monday, 25 Aug 2008 Computing & Information Sciences Kansas State University Course Administration  Official Course Page (KSOL): http://snipurl.com/va60  Class Web Page: http://www.kddresearch.org/Courses/Fall-2008/CIS560  Instructional E-Mail Addresses  [email protected] (always use this to reach instructor)  [email protected] (this goes to everyone)  Instructor: William Hsu, Nichols 213  Office phone: +1 785 532 7905; home phone: +1 785 539 7180  IM: AIM/YIM/MSN hsuwh & rizanabsith, ICQ 28651394 & 191317559  Office hours: after class Mon/Wed/Fri; Tue; other times by appointment  Graduate Teaching Assistant: TBD  Office location: Nichols 124  Office hours: to be announced on class web board  Grading Policy  Hour exams: 15% each (in-class, closed-book); final (open-book): 30%  Machine problems, problem sets (8 of 10): 16%; term project: 17%  Class participation: 7% (3% attendance, 2% questions, 2% answers) CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Review: Relational Databases, Architectures, Platforms  This Course: Relational DBs, Queries, Indexing, Concurrency  Tools: (My)SQL & ORACLE, PHP & JSP  Database Management Systems (DBMS)  Storage and retrieval of information  Accessed using queries     Data Manipulation Languages (DMLs) Data Description Languages (DDLs) Client-Server Architecture Relational Databases  Based on relations (tuples of entities)  Entities: objects organized into sets  Relationships  Links between entities  Correspond to mathematical relations CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Review: Overall System Structure CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Review: Relations Defined • • • • A relation on sets S1, S2, …, Sn is a subset of S1  S2  …  Sn It consists of those tuples (s1, s2, …, sn) that are related Finite example: set of pairs (a, b) in {1, 2, 3) such that a > b Infinite example: >, set of pairs (a, b) of nonnegative integers aka natural numbers such that a > b CIS 560: Database System Concepts 1 1 2 2 3 3 A B Computing & Information Sciences Kansas State University Review: Functions Defined A function is a relation that takes an element from a set and maps it to a unique element in another set f maps R to Z Domain R f Z Co-domain (range) f(4.3) 4 4.3 Pre-image of 4 Image of 4.3 Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Review: Function Example A pre-image of 1 The image of A Domain Co-domain Alice A “a” 1 Bob B “bb“ 2 Chris C “cccc” 3 Dave D “dd” 4 Emma F “e” 5 A class grade function A string length function Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Review: Functions and Non-Function Relations Range a 1 “a” 1 e 2 “bb“ 2 i 3 “cccc” 3 o 4 “dd” 4 u 5 “e” 5 Some function… Not a valid function! Also not a valid function! Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ CIS 560: Database System Concepts Monday, 25 Aug 2008 Computing & Information Sciences Kansas State University Review: One-to-One (Injective) Functions • • • A function is one-to-one if each element in the co-domain (range) has a unique pre-image aka into function Every element y that is mapped into has an inverse f-1(y)… but this is not necessarily every element in the co-domain a 1 a 1 e 2 e 2 i 3 i 3 o 4 o 4 5 5 A one-to-one function A function that is not one-to-one Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ CIS 560: Database System Concepts Monday, 25 Aug 2008 Computing & Information Sciences Kansas State University More on one-to-one  Injective is synonymous with one-to-one  “A function is injective”  A function is an injection if it is one-to-one  Note that there can be un-used elements in the co-domain a 1 e 2 i 3 o 4 5 Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ 11 A one-to-one function Computing & Information Sciences Kansas State University Review: Onto (Surjective) Functions • • A function is onto if each element in the co-domain (range) is an image of some pre-image Every element y with a unique pre-image has an inverse f-1(y)… but this is not necessarily every element in the co-domain a 1 a 1 e 2 e 2 i 3 i 3 o 4 o 4 u 5 An onto function A function that is not onto Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ CIS 560: Database System Concepts Monday, 25 Aug 2008 Computing & Information Sciences Kansas State University More on onto  Surjective is synonymous with onto  “A function is surjective”  A function is an surjection if it is onto  Note that there can be multiply used elements in the co-domain a 1 e 2 i 3 o 4 u Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ 13 An onto function Computing & Information Sciences Kansas State University Onto vs. one-to-one  Are the following functions onto, one-to-one, both, or neither? a 1 a 1 b 2 b 2 c 3 c 3 4 d 4 1-to-1, not onto Both 1-to-1 and onto 1 a 1 b 2 b 2 c 3 c 3 d 4 Onto, not 1-to-1 1 b 2 c 3 4 Not a valid function a d a Neither 1-to-1 nor onto 14 Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ Computing & Information Sciences Kansas State University Review: Bijections Consider a function that is both one-to-one and onto Such a function is a one-to-one correspondence, or a bijection aka permutation aka invertible function a 1 b 2 c 3 d 4 Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ CIS 560: Database System Concepts Monday, 25 Aug 2008 Computing & Information Sciences Kansas State University Inverse functions Let f(x) = 2*x R f R f-1 f(4.3) 8.6 4.3 f-1(8.6) Then f-1(x) = x/2 Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ 16 Computing & Information Sciences Kansas State University More on inverse functions  Can we define the inverse of the following functions? a 1 a 1 b 2 b 2 c 3 c 3 4 d What is f-1(2)? Not onto! What is f-1(2)? Not 1-to-1!  An inverse function can ONLY be defined on a bijection Adapted from slides © 2005 A. Bloomfield, University of Virginia CS202 Discrete Mathematics http://www.cs.virginia.edu/~asb/teaching/cs202-spring07/ 17 Computing & Information Sciences Kansas State University Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University Query Processing (Cont.)  Alternative ways of evaluating a given query  Equivalent expressions  Different algorithms for each operation  Cost difference between a good and a bad way of evaluating a query can be enormous  Need to estimate the cost of operations  Depends critically on statistical information about relations which the database must maintain  Need to estimate statistics for intermediate results to compute cost of complex expressions CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University Transaction Management  A transaction is a collection of operations that performs a single logical function in a database application  Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.  Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database. CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University Database Architecture The architecture of a database systems is greatly influenced by the underlying computer system on which the database is running:  Centralized  Client-server  Parallel (multi-processor)  Distributed CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University History of Database Systems  1950s and early 1960s:  Data processing using magnetic tapes for storage  Tapes provide only sequential access  Punched cards for input  Late 1960s and 1970s:  Hard disks allow direct access to data  Network and hierarchical data models in widespread use  Ted Codd defines the relational data model  Would win the ACM Turing Award for this work  IBM Research begins System R prototype  UC Berkeley begins Ingres prototype  High-performance (for the era) transaction processing CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University History (cont.)  1980s:  Research relational prototypes evolve into commercial systems  SQL becomes industrial standard  Parallel and distributed database systems  Object-oriented database systems  1990s:  Large decision support and data-mining applications  Large multi-terabyte data warehouses  Emergence of Web commerce  2000s:  XML and XQuery standards  Automated database administration CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University Chapter 2: Relational Model       Structure of Relational Databases Fundamental Relational-Algebra-Operations Additional Relational-Algebra-Operations Extended Relational-Algebra-Operations Null Values Modification of the Database CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University Example of a Relation CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University Basic Structure  Formally, given sets D1, D2, …. Dn a relation r is a subset of D1 x D2 x … x Dn Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai  Di  Example: If customer_name = {Jones, Smith, Curry, Lindsay} customer_street = {Main, North, Park} customer_city = {Harrison, Rye, Pittsfield} Then r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield) } is a relation over customer_name x customer_street x customer_city CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Attribute Types  Each attribute of a relation has a name  The set of allowed values for each attribute is called the domain of the attribute  Attribute values are (normally) required to be atomic; that is, indivisible  Note: multivalued attribute values are not atomic  Note: composite attribute values are not atomic  The special value null is a member of every domain  The null value causes complications in the definition of many operations  We shall ignore the effect of null values in our main presentation and consider their effect later CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Relation Instance  The current values (relation instance) of a relation are specified by a table  An element t of r is a tuple, represented by a row in a table attributes (or columns) customer_name customer_street Jones Smith Curry Lindsay Main North North Park customer_city Harrison Rye Rye Pittsfield tuples (or rows) customer CIS 560: Database System Concepts Wednesday, 24 Jan 2008 Computing & Information Sciences Kansas State University Relations are Unordered n Order of tuples is irrelevant (tuples may be stored in an arbitrary order) n Example: account relation with unordered tuples CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Database  A database consists of multiple relations  Information about an enterprise is broken up into parts, with each relation storing one part of the information account : stores information about accounts depositor : stores information about which customer owns which account customer : stores information about customers  Storing all information as a single relation such as bank(account_number, balance, customer_name, ..) results in  repetition of information (e.g., two customers own an account)  the need for null values (e.g., represent a customer without an account)  Normalization theory (Chapter 7) deals with how to design relational schemas CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Summary  RDB Overview  Mathematical Relations     Based on sets (of entities) Relation R: subset of Cartesian product of sets Each member of R: one tuple R specifies “which combinations are related”  Relational Algebra  Based on mathematical relations CIS 560: Database System Concepts Computing & Information Sciences Kansas State University Terminology      Database Management Systems (DBMS) Data Manipulation Languages (DMLs) Data Description Languages (DDLs) Client-Server Architecture Relational Databases  Entity  Relationship  Relations  Subsets of Cartesian product of two or more sets  Functions  Functions  One-to-one (into function, injection)  Onto (surjection)  One-to-one & onto (bijection, permutation, invertible function) CIS 560: Database System Concepts Computing & Information Sciences Kansas State University
 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
									 
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                             
                                            