Survey
Introduction
» Misunderstood topics
  ˃ Normalization
  ˃ Database design
  ˃ Performance
  ˃ SQL
» Advanced topics
  ˃ Time in databases
  ˃ Translucency
  ˃ Performance
» Realistic experience
  ˃ Realistic team size
  ˃ Accountability
  ˃ Emerging requirements
» Current Developments
  ˃ Big data
  ˃ NoSQL
  ˃ Cloud Computing

» Early applications:
  ˃ Programs wrote information into files on disk
  ˃ Programs included lots of information about the files
    + Where they were stored
    + Type of storage
    + Exact format of each record
  ˃ Changing programs is, in general, very hard
    + Programming is exacting work
    + Testing takes lots of time
    + People change jobs
  ˃ Early programs were very hard to change
    + If data moved, programs had to change
    + If data changed, programs had to change
    + Events tend to force changes in data

» It was discovered that many programs fit a paradigm:
  ˃ They stored some data
  ˃ Then later they changed it
  ˃ Although hard problems of changing the structure of the data remained
» Many useful applications could be built on this notion of a “stored data base”
  ˃ Data base systems were developed to help manage the data
  ˃ They provided uniform backup and recovery
  ˃ Later, they even made changing the data easier

» Earlier database systems: hierarchies and networks as data models
  ˃ Data could be moved around easily
  ˃ Relationships represented as physical connections
  ˃ Structure of relationships embedded in applications
  ˃ When the structure changed, programs had to change
» Relational: independent tables as data model
  ˃ Relationships “represented” by equal values of data
  ˃ Structure of relationships invisible to applications
  ˃ Relationships change as data values change
  ˃ Much greater ease of change

» E. F. Codd: inventor of the relational approach
» Received the Turing Award
» Mathematician at IBM Research
» Was looking for a true formalism for data

» Relational Database: a set of relations

» Relation: a set of ordered pairs
» Ordered pair: a pair of values, such that interchanging the two values changes the meaning
  ˃ That is, <a,b> = <b,a> iff a = b
» Specifying a relation by enumeration: R = {<a,b>, <c,d>, <e,f>}
  ˃ This is a relation consisting of three ordered pairs.

» Ordered pairs can model more than two values through nesting:
  ˃ <a, b, c> == <<a,b>, c>
  ˃ <a, b, c, d> == <<<a,b>, c>, d>
  ˃ And so on
» This extends the ordered pair so that it can model a tuple of any length
» Now a relation starts to look like our notion of a file, with each tuple corresponding to our notion of a record

» A relation is a set of ordered pairs (modeling a set of tuples), so:
  1. Exchanging the order of values within a tuple changes the meaning of the tuple
  2. Exchanging the order of tuples within a relation does not change the meaning of the relation
  3. Duplicate tuples are not allowed

» Now we build a database as a collection of independent relations, each describing instances of a single entity type
» For example:
  ˃ Employee (employee#, job, salary, department)
  ˃ Department (department#, departmentname, location)

» We need a way to insert data into the database, retrieve data from the database, and change values that are stored in the database
» We define a data language that can be used from any programming language to do that
» The data language (SQL) has a lot of power and can save a lot of programming work if you understand it

» Now we’ll talk about course mechanics
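» A small sketch of the relational ideas from the earlier slides, written in Python for illustration only. It models a relation as a set of tuples, checks the three properties listed above, builds tuples from nested ordered pairs, and then uses the standard-library sqlite3 module to insert, change, and retrieve rows. The helper names (nest, demo_sql), the column names employee_no and department_no (standing in for employee# and department#), and all sample values are assumptions made up for this example, not part of the course material.

```python
import sqlite3
from functools import reduce

# A relation modeled as a set of tuples (sample values are invented).
employee = {
    (1, "clerk", 30000, 10),
    (2, "analyst", 55000, 10),
}

# 1. Exchanging values within a tuple changes its meaning.
assert (1, "clerk") != ("clerk", 1)

# 2. Exchanging the order of tuples does not change the relation.
assert {(1, "a"), (2, "b")} == {(2, "b"), (1, "a")}

# 3. Duplicate tuples are not allowed: re-adding a tuple changes nothing.
size_before = len(employee)
employee.add((1, "clerk", 30000, 10))
assert len(employee) == size_before


def nest(*values):
    """Model an n-tuple as nested ordered pairs: <a,b,c> == <<a,b>,c>.

    Hypothetical helper; expects at least one value.
    """
    return reduce(lambda pair, value: (pair, value), values)


def demo_sql():
    """Insert, change, and retrieve data in two independent relations."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE department ("
        "department_no INTEGER PRIMARY KEY, departmentname TEXT, location TEXT)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        "employee_no INTEGER PRIMARY KEY, job TEXT, salary INTEGER, "
        "department_no INTEGER)"
    )
    # Insert data.
    conn.execute("INSERT INTO department VALUES (10, 'Accounting', 'Boston')")
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", sorted(employee))
    # Change a stored value.
    conn.execute("UPDATE employee SET salary = salary + 1000 WHERE employee_no = 1")
    # Retrieve data: the relationship between the two tables is expressed
    # only by equal values of department_no, not by a physical connection.
    rows = conn.execute(
        "SELECT e.employee_no, e.salary, d.departmentname "
        "FROM employee e JOIN department d "
        "ON e.department_no = d.department_no "
        "ORDER BY e.employee_no"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(nest("a", "b", "c", "d"))
    print(demo_sql())
```

» Note that the join in demo_sql relies only on matching department_no values, so the application code would not need to change if rows were stored or ordered differently.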