Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Oracle Database wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
Concurrency control wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
ContactPoint wikipedia , lookup
Clusterpoint wikipedia , lookup
1 UU - DIS - UDBL DATABASE SYSTEMS - 10p Course No. ?? A second course on development of database systems Kjell Orsborn Uppsala Database Laboratory Department of Information Science, Uppsala University, Uppsala, Sweden Kjell Orsborn 2017-05-22 2 UU - DIS - UDBL Personell - Spring 2001 • Kjell Orsborn, lecturer • email: [email protected] • phone: 018 - 471 1154 • room no: A324 • Lars Westergren, lab assisstant • email: [email protected] • phone: 018 - 471 ???? • room no: E??? Kjell Orsborn 2017-05-22 3 UU - DIS - UDBL Preliminary course contents • Course intro - overview and rep of db terminologi, extended ER (KO) • Rep relational algebra operators, relationskalkyl, QBE (KO) • Advanced SQL (KO) • Storage and Index Structures (KO) • Query optimization (TR) • OO/OR DBMSs (TR) • AMOS/AMOSQL (TR) • Transactions, Concurrency Control (KO) Kjell Orsborn • Recovery Techniques, Security / Authorization (KO) • Distributed and Multi-DBMSs (TR) • Active DBMSs (TR) • Multimedia DBMSs (TR) • Data warehousing / Data Mining (TR) • Parallell DBMSs (KO) • Project presentation / Summary (KO) 2017-05-22 4 UU - DIS - UDBL Preliminary course contents • Labs InterBase • Lab Project XX – RDBMS – To be decided: InterBase alt. AMOS II • Labs AMOS II – OO/OR DBMS Kjell Orsborn 2017-05-22 5 UU - DIS - UDBL Introduction to Database Terminology Elmasri/Navathe chs 1-2 Lecture 1 Kjell Orsborn Department of Information Science Uppsala University, Uppsala, Sweden Kjell Orsborn 2017-05-22 6 UU - DIS - UDBL Evolution of Database Technology 1960 Hierarchical (IMS) 1970 Network model (CODASYL) 1980 Relational model (e.g. ORACLE) 1990 1st Generation OODB (e.g. Objectivity) 1997 Object Realtional DBMS (e.g. SQL99) Trees Complex data structures Tables OO data structures Object model {} Kjell Orsborn 2017-05-22 7 UU - DIS - UDBL Database? • A database (DB) is a more or less well-organized collection of related data. • The information in a database . . . – represents information within some subarea of “the reality” (i.e. objects, characteristics and relationships between objects) – is logically connected through the intended meaning – has been organized for a specific group of users and applications Kjell Orsborn 2017-05-22 8 UU - DIS - UDBL Outline of a database system DATABASE SYSTEM Applications Users’ procedures/statements interactive queries DBMS Database language tools Data managing tools Database schema Kjell Orsborn Database 2017-05-22 9 UU - DIS - UDBL An example database (Elmasri/Navathe fig. 1.2) Kjell Orsborn 2017-05-22 10 UU - DIS - UDBL Database management system? • A database management system (DBMS) is one (or several) program that provides functionality for users to develop, use, and maintain a database. • Thus, a DBMS is a general software system for defining, populating (creating), and manipulating databases for different types of applications. Kjell Orsborn 2017-05-22 11 UU - DIS - UDBL Back to the example ... • Defining this DB involve – declaration of files, records, fields and data types for each fields. • Population of the DB means – that the files are filled with data about individual students, courses etc. • Manipulation is then carried out by users directly via a query language or indirectly via application programs: – updates – queries to the DB Kjell Orsborn 2017-05-22 12 UU - DIS - UDBL Database System? • A database system consists of . . . – the physical database (instance) – a database management system – one or several database languages (means for communicating with the database) – one or several application program(s) • A database system makes a simple and efficient manipulation of large data sets possible. • The term DB can refer to both the content and to the system. The answer to this ambiguity is governed by the context. Kjell Orsborn 2017-05-22 13 UU - DIS - UDBL Why DB? • DB in comparison to manual paper-based registers: – – – – Better compactness Faster search Simpler maintenance “Greater usability” • DB in comparison to conventional file management: – – – – – – Kjell Orsborn Data model - data abstraction Meta-data Program- and data independence Multiple views of the same data High-level language for managing the database Efficient search/access of large data sets 2017-05-22 Database 14 UU - DIS - UDBL Problems that can be avoided using a database system • • • • • • • Redundancy and inconsistency Slow process to find correct information Different formats for data storage Erroneous or unreasonable values Incomplete updates Concurrent access Unauthorized access Kjell Orsborn 2017-05-22 15 UU - DIS - UDBL Data model? • Every DB has a data model which makes it possible to “hide” the physical representation of data. • A data model is a formalism that defines a notation for describing data on an abstract level together with a set of operations to manipulate data represented using this data model. • Data models are used for data abstraction - making it possible to define and manipulate data on an abstract level. Kjell Orsborn 2017-05-22 16 UU - DIS - UDBL Data model continued . . . • E.g. assume that information about employees in an enterprise exists in a file employees which is a sequence records of the type: record name: char[30]; manager: char[30] end An abstract model of this file is a relation: employees(name, manager) , where employees is the name of the relation and name and manager is the attribute of the relation. Kjell Orsborn 2017-05-22 17 UU - DIS - UDBL Data models - examples • Examples of data models within the database field are the: – Hierarchical (IMS), – Network (IDMS), – Relational (ORACLE, SYBASE, DB2, InterBase), – Object-oriented (O2, ObjectStore) and – Object-relational ( Informix, UniSQL, Iris/AMOS) data model. • Conceptual data model – ER-model (Entity-Relationship model) (not an implementation model since there are no operations defined for the notation) Kjell Orsborn 2017-05-22 18 UU - DIS - UDBL Meta-data, i.e. “data about data” • Information about which information that exists • Information about how/where data is stored – – – – file structures records data types / formats name of files, data types • Information regarding mapping between different schemas • Meta-data is stored in the, so called, system catalog or data dictionary. Kjell Orsborn 2017-05-22 19 UU - DIS - UDBL Meta-data cont. ... • Meta-data is used by the DBMS to answer questions such as: (Users) – Which information exists in the database? – Is the information “x” in the database? – What is the cost to access a specific piece of information. (Database administrator or DBMS) – How much is different parts of the database used? – How long is the response time for different types of queries? – Has any user tried to break the security system of the database? – Is optimization required of the physical organization of the database with regards to memory utilization or response times? Kjell Orsborn 2017-05-22 20 UU - DIS - UDBL Schema and instance To be able to separate data in the database and its description the terms database instance and database schema are used. • The schema is created when a database is defined. A database schema is not changed frequently. • The data in the database constitute an instance. Every change of data creates a new instance of the database. Kjell Orsborn 2017-05-22 21 UU - DIS - UDBL Data independence • Reduces the connection between: – the actual organization of data and – how the users/application programs process data (or “sees” data.) • Why? – Data should be able to change without requiring a corresponding alteration of the application programs. – Different applications/users need different “views” of the same data. Kjell Orsborn 2017-05-22 22 UU - DIS - UDBL Data dependencies Conventional systems have, in general, a very low level of data independence: • Even a small change of the data structure, e.g. the introduction or reduction of a field in a record structure, usually require that one has to make changes in several programs or routines. Programs can be dependent of that: • data is located on a specific storage medium • data has a specific storage format (binary, compressed) • fields have been coded according to certain rules (Man = 1, Woman = 2) • the file records are sorted in a specific manner etc . . . Kjell Orsborn 2017-05-22 23 UU - DIS - UDBL Data independence - how? By introducing a multi-level architecture where each level represents one abstraction level. Three-schema architecture: – In 1978 the following “standard” architecture (ANSI/SPARC architecture ) for databases was introduced. It consists of 3 levels: 1. Internal level 2. Conceptual level 3. External level – Each level introduces one abstraction layer and has a schema that describes how representations should be mapped to the next lower abstraction level. Kjell Orsborn 2017-05-22 24 UU - DIS - UDBL Three-schema architecture End users External level Conceptual level Internal level view1 view2 Conceptual schema Internal schema Database instance Kjell Orsborn 2017-05-22 viewn 25 UU - DIS - UDBL Internal schema • Describes storage structures and access paths for the physical database. – Abstraction level: files, index files etc. • Is usually defined through the data definition language (DDL) of the DBMS. Kjell Orsborn 2017-05-22 26 UU - DIS - UDBL Conceptual schema • An abstract description of the physical database. • Constitute one, for all users, common basic model of the logical content of the database. • This abstraction level corresponds to “the real world”: object, characteristics, relationships between objects etc. • The schema is created in the DDL according to a specific data model. Kjell Orsborn 2017-05-22 27 UU - DIS - UDBL External schemas, or views • A typical DB has several users with varying needs, demands, access privileges etc. • External schemas describes different views of the conceptual database with respect to what different user groups would like to/are allowed to se. • Some DBMS’s have a specific language for view definitions (else the DDL is used). Kjell Orsborn 2017-05-22 28 UU - DIS - UDBL Views - example (Elmasri/Navathe fig 1.4) Kjell Orsborn 2017-05-22 29 UU - DIS - UDBL Views - example in SQL • Assume that we have a relation (table) consisting information about employees in an enterprise: employees(name,dept,salary,address) • and wish to give a user group rights “to see” all information in the table except the SALARY field. • This can be accomplished by the definition of a view called safe-emp: create view safe-emps by select name, dept, address from employees; Kjell Orsborn 2017-05-22 30 UU - DIS - UDBL Data independence in the three-schema architecture 1. Logical data independence – The possibility to change the conceptual schema without influencing the external schemas (views). • e.g. add another field to a conceptual schema. 2. Physical data independence – The possibility to change the internal schema without influencing the conceptual schema.. • the effects of a physical reorganization of the database, such as adding an access path, is eliminated. Kjell Orsborn 2017-05-22 31 UU - DIS - UDBL Database languages • The term database language is a generic term for a class of languages used for defining, communicating with or manipulating a database. • In conventional programming languages, declarations and program sentences is implemented in one and the same language. • A DB system uses several different languages. – – – – Kjell Orsborn Storage Definition Language (SDL) Data Definition Language (DDL) View Definition Language (VDL) Data Manipulation Language (DML) 2017-05-22 32 UU - DIS - UDBL DDL and DML • DDL is used by the database administrator and others to define internal and conceptual schema. • In this manner the database is designed. Subsequent modifications in the design is also made in DDL. • DML is used by DB users and application programs to retrieve, add, remove, or alter the information in the database. The term query language is usually used as synonym to DML. Kjell Orsborn 2017-05-22 33 UU - DIS - UDBL DDL example in SQL create table flights(number: int, Date:char(6), Seats: int, from:char(3), to: char(3)); create index for flights on number; • The first expression defines a relation, its attribute and their types. • The second expression creates an index as part of the internal schema making search faster for flights, given a flight no. (e.g. this can be accomplished by creating a hash table with number as the key). Kjell Orsborn 2017-05-22 34 UU - DIS - UDBL DML example in SQL update flights set seats = seats -4 where number = 123 and date = ‘AUG 31’ “Decrease the no. of seats in flight no.. 123 on August 31 with 4.” insert into flights values(171, ‘AUG 21’, 100, ‘ROM’, ‘JFK’) “Add a new flight with flight no.. 171 and 100 seats, from Rom to New York (JFK) on August 21” Kjell Orsborn 2017-05-22 35 UU - DIS - UDBL Host language • Application programs that work with a DB are mainly written in a normal programming language: C, COBOL, PASCAL, C++, Java, etc. • The part of the program that interact with the DB is normally written in a DML. • The DML commands are embedded in the code that is written in the host language. ##get(b) sum := sum + b; ##store(sum) Kjell Orsborn 2017-05-22 36 UU - DIS - UDBL Classification criteria for DBMSs • Type of data model – hierarchical, network, relational, object-oriented, object-relational • Centralized vs. distributed DBMSs – Homogeneous vs. heterogeneous DDBMSs – Multidatabase systems • Single-user vs. multi-user systems • General-purpose vs. special-purpose DBMSs – specific applications such as airline reservation and phone directory systems. • Cost Kjell Orsborn 2017-05-22 37 UU - DIS - UDBL Components of a DBMS • Query processor – – – – DML compiler Embedded DML precompiler DDL interpreter Query processing unit • Storage manager – – – – Authorization and integrity control Transactions management File management Buffer management • Physical storage – data files, meta-data (data dictionary), index, statistics Kjell Orsborn 2017-05-22 39 UU - DIS - UDBL Components of a DBMS (fig 1.6 Silberschatz et al) Kjell Orsborn 2017-05-22 40 UU - DIS - UDBL Introduction to Database Design Using Entity-Relationship Modeling Elmasri/Navathe chs 3-4 Lecture 1 Kjell Orsborn Department of Information Science Uppsala University, Uppsala, Sweden Kjell Orsborn 2017-05-22 41 UU - DIS - UDBL ER-modeling • Aims at defining a high-level specification of the information content in the database. • History – Chen,P.P.S., “The entity-relationship model: towards a unified view of data”, ACM TODS, 1, 1 1976, p. 9-36. • Why ER-models? – High-level description - easier to understand for non-technicians – More formal than natural language - avoid misconceptions and multiple interpretations – Implementation independent (of DBMS) - less technical details – Documentation – Model transformation to an implementation data model Kjell Orsborn 2017-05-22 42 UU - DIS - UDBL ER-modeling cont. ... • How to do? – Identify: • Entity types and attributes • Relationship types – Normally presented in a graphical ER-diagram – An ER-model can later be transformed to an implementation data model, such as the relational data model Kjell Orsborn 2017-05-22 43 UU - DIS - UDBL Example of an ER-diagram Kjell Orsborn 2017-05-22 44 UU - DIS - UDBL Terminology in ER-modeling • Entity type - Entity – Physical or abstract concepts with some sort of identity. . • Attribute – Characteristics or different aspects that describes an entity. • Relationship type - Relationship – Represents relationships between entities ER-diagram Kjell Orsborn 2017-05-22 45 UU - DIS - UDBL Entity type • An entity type represents a set of entities that have the same set of attributes. – Entity types express the intention, i.e. the meaning of the concept whereas the set of entities represents the extension of that type. – Names of entity types are given in singular form. – The description of an entity type is called its schema. PERSON name, ssn, address, phoneno – Each attribute in an entity type is associated with a domain that indicates the allowed values of that attribute. Kjell Orsborn 2017-05-22 46 UU - DIS - UDBL Attribute • An attribute describe a character or aspect of an entity type. – Every attribute has a domain (or value set). – A domain specifies the set of allowed values each individual attribute can be assigned. – There is (at least) six different types of values for attributes: • • • • • • • simple/ composite single-valued/ multivalued stored/ derived null ER-diagram Kjell Orsborn Simple sex: M or F name: (Ior, Karlsson) name: “Ior Karlsson” friends: {Nasse, Puh,...} birthdate: 980917 age :0 Compos ite 2017-05-22 Multivalued Obs! Derived 47 UU - DIS - UDBL Key • An attribute that has unique values for every instance of an entity type is called a key attribute. • Sometimes several attributes are used together to get a unique key. • An entity type can have more than one key. key Kjell Orsborn hello 2017-05-22 48 UU - DIS - UDBL Relationship type • A relationship type represents a relationship (or relation/connection), between a number of entity types. • A relationship type R is a set of relational instances or tuples. • A relationship type, R, can mathematically be defined as: R E1 E2 … En where each Ej is a entity type. • A tuple (or an instance) t R is written as (e1, e2, ..., en) or <e1, e2, ..., en> where ej Ej. Kjell Orsborn 2017-05-22 49 UU - DIS - UDBL Structural constraints for relationship types • Cardinality ratio constraint specifies the number of relational instances that an entity can take part in. For binary relationship types: – one-to-one (1:1) – one-to-many (1:N) – many-to-many (M:N) Kjell Orsborn 2017-05-22 1 1 1 N M N 50 UU - DIS - UDBL Structural constraints cont. ... Partial Total • Participation constraint – specifies whether the entity existence is dependent of another entity via a relationship type. E.g. can an employee exist without working for a department? – Partial participation: the entity can exist without this relationship – Total participation: the entity requires this relationship in order to exist. employee works_for manages Kjell Orsborn 2017-05-22 department 51 UU - DIS - UDBL Roles of relationship types • A role name specifies what role an entity type plays in a specific relationship • Role names are sometimes used in ER-diagrams to clarify the roles of the participating entity types. Employee manager 1 N Supervision Kjell Orsborn 2017-05-22 worker 52 UU - DIS - UDBL Attributes for relationship types • Also a relationship type can have attributes. E.g. in the case where the weekly number of hours an employee works on a project should be kept, that can be represented for each instance of the relation “works-on”. • If the relation is a 1:1 or 1:N relation, the attribute can be stored at one of the participating entities. • When the relation is of the type M:N one must store the attributes with the instance of the relation. Kjell Orsborn 2017-05-22 53 UU - DIS - UDBL Weak entity types • Weak entity types are those that are meaningless without an owner entity type. • Weak entities are uniquely identified in the extension with their owner’s key attributes together with its own (broken) underlined attribute. • The relationship to the owner is called the identifying relationship. Kjell Orsborn 2017-05-22 54 UU - DIS - UDBL ER-notation Kjell Orsborn (Elmasri/Navathe fig. 3.14) 2017-05-22 55 UU - DIS - UDBL Another ER diagram Kjell Orsborn 2017-05-22 56 UU - DIS - UDBL ER model transformations • Replacing multi-valued attributes by an entity type Name Dept id Location DEPARTMENT Name Location Dept id 1 DEPARTMENT Kjell Orsborn N located at 2017-05-22 LOCATION 57 UU - DIS - UDBL ER model transf. cont. ... • Replacing M-N relationships with an entity type and binary TEACHER relationships. 1 TEACHER lectures M N booked for lectures ROOM N N LECTURES N Time 1 consists of ROOM COURSE 1 COURSE Kjell Orsborn 2017-05-22 Time 58 UU - DIS - UDBL Example ER-modeling – An enterprise consists of a number of departments. Each department has a name, a number, a manager, and a number of employees. The starting date for every department manager should also be registered. A department can have several office rooms. – Every department finances a number of projects. Each project has a name, a number and an office room. – For each employee, the following information is kept: name, social security number, address, salary and sex. An employee works for only one department but can work with several projects that can be related to different departments. Information about the number of hours (per week) that an employee work with a project should be stored. Information about the employees manager should also be stored. Kjell Orsborn 2017-05-22 62 UU - DIS - UDBL Extended Entity-Relationship (EER) modeling • The intention of using an E-R diagram is to use it as a basis for user communication or for getting to a good design specification. – i.e. try to make it simple and avoid to much complexity. • EER (extended or enhanced ER) introduces several notational extensions to deal concepts such as: – Superclass /subclass (supertype/subtype, is-a relationship) • specialization/generalization • constraints – Aggregation (whole/part or part-of relationship) – Union types (category) Kjell Orsborn 2017-05-22 63 UU - DIS - UDBL EER diagram notation for specialization and subclass Kjell Orsborn 2017-05-22 64 UU - DIS - UDBL Subclasses, superclasses & inheritance • Two generic ideas for creating superclass/subclass relationships – Specialization of superclass into subclasses – Generalization of subclasses into a superclass • Constraints and characteristics of spec. & gen. – Constraints • Predicate-defined (condition-defined) sub-classes • Attribute-defined • User-defined – Disjointness • Disjoint • Overlapping – Completeness • Total • Partial Kjell Orsborn 2017-05-22 65 UU - DIS - UDBL Generalization of subclasses Kjell Orsborn 2017-05-22 66 UU - DIS - UDBL Overlapping (nondisjoint) subclasses Kjell Orsborn 2017-05-22 67 UU - DIS - UDBL Representation of aggregation in ER notation Kjell Orsborn 2017-05-22 68 UU - DIS - UDBL A UML conceptual schema Kjell Orsborn 2017-05-22 69 UU - DIS - UDBL Specialization/generalization in UML Kjell Orsborn 2017-05-22 70 UU - DIS - UDBL Union of two entity types Kjell Orsborn 2017-05-22 71 UU - DIS - UDBL Alternative diagrammatic notation for ER/EER Kjell Orsborn 2017-05-22