UU - DIS - UDBL
DATABASE SYSTEMS - 10p
Course No. ??
A second course on development of
database systems
Kjell Orsborn
Uppsala Database Laboratory
Department of Information Science, Uppsala University,
Uppsala, Sweden
Kjell Orsborn
2017-05-22
Personnel - Spring 2001
• Kjell Orsborn, lecturer
– email: [email protected]
– phone: 018 - 471 1154
– room no: A324
• Lars Westergren, lab assistant
– email: [email protected]
– phone: 018 - 471 ????
– room no: E???
Preliminary course contents
• Course intro - overview and review of DB terminology, extended ER (KO)
• Review of relational algebra operators, relational calculus, QBE (KO)
• Advanced SQL (KO)
• Storage and Index Structures (KO)
• Query optimization (TR)
• OO/OR DBMSs (TR)
• AMOS/AMOSQL (TR)
• Transactions, Concurrency Control (KO)
• Recovery Techniques, Security / Authorization (KO)
• Distributed and Multi-DBMSs (TR)
• Active DBMSs (TR)
• Multimedia DBMSs (TR)
• Data warehousing / Data Mining (TR)
• Parallel DBMSs (KO)
• Project presentation / Summary (KO)
Preliminary course contents
• Labs InterBase
– RDBMS
• Labs AMOS II
– OO/OR DBMS
• Lab Project XX
– To be decided: InterBase or AMOS II
Introduction to Database Terminology
Elmasri/Navathe chs 1-2
Lecture 1
Kjell Orsborn
Department of Information Science
Uppsala University, Uppsala, Sweden
Evolution of Database Technology
• 1960: Hierarchical (IMS) - trees
• 1970: Network model (CODASYL) - complex data structures
• 1980: Relational model (e.g. ORACLE) - tables
• 1990: 1st Generation OODB (e.g. Objectivity) - OO data structures
• 1997: Object-Relational DBMS (e.g. SQL99) - object model
Database?
• A database (DB) is a more or less well-organized
collection of related data.
• The information in a database . . .
– represents information within some subarea of “the reality”
(i.e. objects, characteristics and relationships between objects)
– is logically connected through the intended meaning
– has been organized for a specific group of users and applications
Outline of a database system
[Figure: a database system. Users' interactive queries and applications' procedures/statements are passed to the DBMS (database language tools and data managing tools), which operates on the database schema and the database.]
An example database (Elmasri/Navathe fig. 1.2)
Database management system?
• A database management system (DBMS) consists of one or several programs that provide functionality for users to develop, use, and maintain a database.
• Thus, a DBMS is a general software system for defining,
populating (creating), and manipulating databases for different
types of applications.
Back to the example ...
• Defining this DB involves
– declaration of files, records, fields and data types for each field.
• Population of the DB means
– that the files are filled with data about individual students, courses
etc.
• Manipulation is then carried out by users directly via a
query language or indirectly via application programs:
– updates
– queries to the DB
Database System?
• A database system consists of . . .
– the physical database (instance)
– a database management system
– one or several database languages
(means for communicating with the database)
– one or several application program(s)
• A database system makes a simple and efficient manipulation
of large data sets possible.
• The term DB can refer both to the content and to the system. Which meaning is intended is determined by the context.
Why DB?
• DB in comparison to manual paper-based registers:
– Better compactness
– Faster search
– Simpler maintenance
– “Greater usability”
• DB in comparison to conventional file management:
– Data model - data abstraction
– Meta-data
– Program- and data independence
– Multiple views of the same data
– High-level language for managing the database
– Efficient search/access of large data sets
Problems that can be avoided using a
database system
• Redundancy and inconsistency
• Slow process to find correct information
• Different formats for data storage
• Erroneous or unreasonable values
• Incomplete updates
• Concurrent access
• Unauthorized access
Data model?
• Every DB has a data model which makes it possible to
“hide” the physical representation of data.
• A data model is a formalism that defines a notation for
describing data on an abstract level together with a set of
operations to manipulate data represented using this data
model.
• Data models are used for data abstraction - making it
possible to define and manipulate data on an abstract level.
Data model continued . . .
• E.g. assume that information about employees in an enterprise exists in a file employees, which is a sequence of records of the type:
record
name: char[30];
manager: char[30]
end
An abstract model of this file is a relation:
employees(name, manager)
where employees is the name of the relation and name and manager are the attributes of the relation.
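The step from record file to relation can be tried out in a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS (the course labs use InterBase and AMOS II), and the sample names and the helper manager_of are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in DBMS (an assumption of this sketch).
conn = sqlite3.connect(":memory:")

# The relation employees(name, manager); CHAR(30) mirrors the record fields.
conn.execute("CREATE TABLE employees (name CHAR(30), manager CHAR(30))")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO employees (name, manager) VALUES (?, ?)",
    [("Ada", "Grace"), ("Edgar", "Grace"), ("Grace", "Alan")],
)

def manager_of(name):
    """Return the manager of `name`, or None if the employee is unknown."""
    row = conn.execute(
        "SELECT manager FROM employees WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None

print(manager_of("Ada"))
```

The query names only the relation and its attributes; how the records are laid out on disk is left to the DBMS.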
Data models - examples
• Examples of data models within the database field are the:
– Hierarchical (IMS),
– Network (IDMS),
– Relational (ORACLE, SYBASE, DB2, InterBase),
– Object-oriented (O2, ObjectStore) and
– Object-relational (Informix, UniSQL, Iris/AMOS) data model.
• Conceptual data model
– ER-model (Entity-Relationship model)
(not an implementation model since there are no operations defined for
the notation)
Meta-data, i.e. “data about data”
• Information about which information exists
• Information about how/where data is stored
– file structures
– records
– data types / formats
– names of files, data types
• Information regarding mapping between different schemas
• Meta-data is stored in the so-called system catalog or data dictionary.
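As an illustration of a system catalog, the following sketch reads meta-data back from SQLite, which is used here only as a convenient stand-in DBMS. SQLite keeps its catalog in the table sqlite_master and exposes column information through PRAGMA table_info; other DBMSs use different catalog names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name CHAR(30), manager CHAR(30))")

# "Which information exists in the database?" Ask the catalog for table names.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employees)")]

print(tables)
print(columns)
```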
Meta-data cont. ...
• Meta-data is used by the DBMS to answer questions such as:
(Users)
– Which information exists in the database?
– Is the information “x” in the database?
– What is the cost of accessing a specific piece of information?
(Database administrator or DBMS)
– How much are the different parts of the database used?
– How long is the response time for different types of queries?
– Has any user tried to break the security system of the database?
– Is optimization of the physical organization of the database required with regard to memory utilization or response times?
Schema and instance
To distinguish the data in a database from its description, the terms database instance and database schema are used.
• The schema is created when a database is defined. A
database schema is not changed frequently.
• The data in the database constitute an instance. Every
change of data creates a new instance of the database.
Data independence
• Reduces the connection between:
– the actual organization of data and
– how the users/application programs process (or “see”) data.
• Why?
– Data should be able to change without requiring a corresponding
alteration of the application programs.
– Different applications/users need different “views” of the same
data.
Data dependencies
Conventional systems have, in general, a very low level of data
independence:
• Even a small change of the data structure, e.g. the addition or removal of a field in a record structure, usually requires changes in several programs or routines.
Programs can depend on the fact that:
• data is located on a specific storage medium
• data has a specific storage format (binary, compressed)
• fields have been coded according to certain rules (Man = 1, Woman = 2)
• the file records are sorted in a specific manner
etc . . .
Data independence - how?
By introducing a multi-level architecture where each level
represents one abstraction level.
Three-schema architecture:
– In 1978 the following “standard” architecture (the ANSI/SPARC architecture) for databases was introduced.
It consists of 3 levels:
1. Internal level
2. Conceptual level
3. External level
– Each level introduces one abstraction layer and has a schema that
describes how representations should be mapped to the next lower
abstraction level.
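Three-schema architecture
[Figure: end users access views (view 1, view 2, ..., view n) at the external level; the views map to the conceptual schema at the conceptual level, which maps to the internal schema at the internal level, which in turn describes the stored database instance.]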
Internal schema
• Describes storage structures and access paths for the
physical database.
– Abstraction level: files, index files etc.
• Is usually defined through the data definition language
(DDL) of the DBMS.
Conceptual schema
• An abstract description of the physical database.
• Constitutes a basic model of the logical content of the database that is common to all users.
• This abstraction level corresponds to “the real world”: objects, characteristics, relationships between objects etc.
• The schema is created in the DDL according to a specific
data model.
External schemas, or views
• A typical DB has several users with varying needs,
demands, access privileges etc.
• External schemas describe different views of the conceptual database with respect to what different user groups would like to see or are allowed to see.
• Some DBMSs have a specific language for view definitions (otherwise the DDL is used).
Views - example (Elmasri/Navathe fig 1.4)
Views - example in SQL
• Assume that we have a relation (table) containing information about employees in an enterprise:
employees(name,dept,salary,address)
• and wish to give a user group rights “to see” all information in the table except the SALARY field.
• This can be accomplished by the definition of a view called safe_emps:
create view safe_emps as
select name, dept, address
from employees;
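The view can be exercised in a small sketch, assuming SQLite (via Python's sqlite3 module) as a stand-in DBMS and a made-up employee row. The view is named safe_emps here because a hyphen is not allowed in an unquoted SQL identifier. SQLite has no GRANT statement, so the step of giving a user group access to the view only is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER, address TEXT)")
# Hypothetical sample row.
conn.execute("INSERT INTO employees VALUES ('Ada', 'R&D', 50000, 'Main St 1')")

# The view exposes every column except salary.
conn.execute("CREATE VIEW safe_emps AS SELECT name, dept, address FROM employees")

cursor = conn.execute("SELECT * FROM safe_emps")
view_columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(view_columns)
print(rows)
```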
Data independence in the
three-schema architecture
1. Logical data independence
– The possibility to change the conceptual schema without
influencing the external schemas (views).
• e.g. add another field to a conceptual schema.
2. Physical data independence
– The possibility to change the internal schema without influencing the conceptual schema.
• the effects of a physical reorganization of the database, such as adding an access path, are eliminated.
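Both kinds of independence can be observed in a minimal sketch, assuming SQLite as a stand-in DBMS. The table, the view safe_emps, the added column phone and the index name employees_name_idx are illustrative choices, not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER, address TEXT)")
conn.execute("INSERT INTO employees VALUES ('Ada', 'R&D', 50000, 'Main St 1')")
conn.execute("CREATE VIEW safe_emps AS SELECT name, dept, address FROM employees")

def read_view():
    """What an application working against the external schema sees."""
    return conn.execute("SELECT * FROM safe_emps").fetchall()

before = read_view()

# Logical data independence: add a field to the conceptual schema.
conn.execute("ALTER TABLE employees ADD COLUMN phone TEXT")
after_new_column = read_view()

# Physical data independence: add an access path (an index) at the internal level.
conn.execute("CREATE INDEX employees_name_idx ON employees (name)")
after_new_index = read_view()

print(before == after_new_column == after_new_index)
```

The application code in read_view is untouched by either change, which is the point of the two independence levels.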
Database languages
• The term database language is a generic term for a class of
languages used for defining, communicating with or
manipulating a database.
• In conventional programming languages, declarations and program statements are written in one and the same language.
• A DB system uses several different languages.
– Storage Definition Language (SDL)
– Data Definition Language (DDL)
– View Definition Language (VDL)
– Data Manipulation Language (DML)
DDL and DML
• DDL is used by the database administrator and others to define the internal and conceptual schemas.
• In this manner the database is designed. Subsequent modifications of the design are also made in DDL.
• DML is used by DB users and application programs to retrieve, add, remove, or alter the information in the database. The term query language is usually used as a synonym for DML.
DDL example in SQL
create table flights(
    number int,
    date char(6),
    seats int,
    "from" char(3),
    "to" char(3));
create index flights_number_idx on flights(number);
• The first statement defines a relation, its attributes and their types. The attribute names from and to are reserved words in SQL and are therefore written as quoted identifiers.
• The second statement creates an index as part of the internal schema, making search for flights faster given a flight no. (e.g. this can be accomplished by creating a hash table with number as the key). The index name flights_number_idx is chosen here for illustration.
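A runnable variant of the two DDL statements, as a sketch that assumes SQLite as a stand-in DBMS. The index name flights_number_idx is an assumption, and SQLite implements indexes as B-trees rather than the hash table mentioned as one option above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "from" and "to" are SQL keywords, so they are quoted as identifiers here.
conn.execute("""
    CREATE TABLE flights (
        number INTEGER,
        date   CHAR(6),
        seats  INTEGER,
        "from" CHAR(3),
        "to"   CHAR(3)
    )
""")

# Index on the flight number; part of the internal schema.
conn.execute("CREATE INDEX flights_number_idx ON flights (number)")

# PRAGMA index_list yields one row per index; the second field is the index name.
index_names = [row[1] for row in conn.execute("PRAGMA index_list(flights)")]
print(index_names)
```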
DML example in SQL
update flights
set seats = seats - 4
where number = 123 and date = 'AUG 31';
“Decrease the no. of seats in flight no. 123 on August 31 by 4.”
insert into flights
values(171, 'AUG 21', 100, 'ROM', 'JFK');
“Add a new flight with flight no. 171 and 100 seats, from Rome to New York (JFK) on August 21.”
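The two DML statements can be run against a small flights table in the following sketch, which assumes SQLite as a stand-in DBMS. The starting row for flight 123 (150 seats, departing from ARN) is made up so that the update has something to change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE flights (number INTEGER, date CHAR(6), seats INTEGER, '
    '"from" CHAR(3), "to" CHAR(3))')

# Hypothetical starting row.
conn.execute("INSERT INTO flights VALUES (123, 'AUG 31', 150, 'ARN', 'JFK')")

# Decrease the number of seats in flight 123 on August 31 by 4.
conn.execute(
    "UPDATE flights SET seats = seats - 4 WHERE number = 123 AND date = 'AUG 31'")

# Add flight 171 with 100 seats from Rome to New York (JFK) on August 21.
conn.execute("INSERT INTO flights VALUES (171, 'AUG 21', 100, 'ROM', 'JFK')")

seats_by_flight = dict(conn.execute("SELECT number, seats FROM flights"))
print(seats_by_flight)
```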
Host language
• Application programs that work with a DB are mainly written in
a normal programming language: C, COBOL, PASCAL, C++,
Java, etc.
• The part of the program that interacts with the DB is normally written in a DML.
• The DML commands are embedded in the code that is written in
the host language.
##get(b)
sum := sum + b;
##store(sum)
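The same get/compute/store pattern can be sketched with Python as the host language and SQL as the DML. This version uses a call-level interface (the sqlite3 module) instead of a precompiler for embedded statements, and the tables readings and result are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: `readings` holds the b values, `result` receives the sum.
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, b INTEGER)")
conn.execute("CREATE TABLE result (amount INTEGER)")
conn.executemany("INSERT INTO readings (b) VALUES (?)", [(10,), (20,), (12,)])

total = 0
# DML in the host program plays the role of ##get(b).
for (b,) in conn.execute("SELECT b FROM readings ORDER BY id"):
    total = total + b  # ordinary host-language code: sum := sum + b

# DML in the host program plays the role of ##store(sum).
conn.execute("INSERT INTO result (amount) VALUES (?)", (total,))

stored = conn.execute("SELECT amount FROM result").fetchone()[0]
print(stored)
```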
Classification criteria for DBMSs
• Type of data model
– hierarchical, network, relational, object-oriented, object-relational
• Centralized vs. distributed DBMSs
– Homogeneous vs. heterogeneous DDBMSs
– Multidatabase systems
• Single-user vs. multi-user systems
• General-purpose vs. special-purpose DBMSs
– specific applications such as airline reservation and phone directory
systems.
• Cost
Components of a DBMS
• Query processor
– DML compiler
– Embedded DML precompiler
– DDL interpreter
– Query processing unit
• Storage manager
– Authorization and integrity control
– Transaction management
– File management
– Buffer management
• Physical storage
– data files, meta-data (data dictionary), indexes, statistics
Components of a DBMS
(fig 1.6 Silberschatz et al)
Introduction to Database Design Using
Entity-Relationship Modeling
Elmasri/Navathe chs 3-4
Lecture 1
Kjell Orsborn
Department of Information Science
Uppsala University, Uppsala, Sweden
ER-modeling
• Aims at defining a high-level specification of the information
content in the database.
• History
– Chen, P.P.S., “The entity-relationship model: toward a unified view of data”, ACM TODS, 1(1), 1976, pp. 9-36.
• Why ER-models?
– High-level description - easier to understand for non-technicians
– More formal than natural language - avoids misconceptions and multiple interpretations
– Implementation independent (of DBMS) - fewer technical details
– Documentation
– Model transformation to an implementation data model
ER-modeling cont. ...
• How is it done?
– Identify:
• Entity types and attributes
• Relationship types
– Normally presented in a graphical ER-diagram
– An ER-model can later be transformed to an implementation data model,
such as the relational data model
Example of an ER-diagram
Terminology in ER-modeling
• Entity type - Entity
– Physical or abstract concepts with some sort of identity.
• Attribute
– Characteristics or different aspects that describe an entity.
• Relationship type - Relationship
– Represents relationships between entities
Entity type
• An entity type represents a set of entities that have the
same set of attributes.
– Entity types express the intension, i.e. the meaning of the concept, whereas the set of entities represents the extension of that type.
– Names of entity types are given in singular form.
– The description of an entity type is called its schema.
PERSON
name, ssn, address, phoneno
– Each attribute in an entity type is associated with a domain that
indicates the allowed values of that attribute.
Attribute
• An attribute describes a characteristic or aspect of an entity type.
– Every attribute has a domain (or value set).
– A domain specifies the set of allowed values each individual attribute can be assigned.
– There are (at least) six different types of values for attributes:
• simple / composite
• single-valued / multivalued
• stored / derived
• null
– Examples:
• simple: sex: M or F
• composite: name: (Ior, Karlsson), compared with the simple value name: “Ior Karlsson”
• multivalued: friends: {Nasse, Puh,...}
• stored: birthdate: 980917
• derived: age: 0
Key
• An attribute that has unique values for every instance of an
entity type is called a key attribute.
• Sometimes several attributes are used together to get a unique
key.
• An entity type can have more than one key.
Relationship type
• A relationship type represents a relationship (or relation/connection) between a number of entity types.
• A relationship type R is a set of relationship instances or tuples.
• A relationship type, R, can mathematically be defined as:
$R \subseteq E_1 \times E_2 \times \dots \times E_n$
where each $E_j$ is an entity type.
• A tuple (or an instance) $t \in R$ is written as $(e_1, e_2, \dots, e_n)$ or $\langle e_1, e_2, \dots, e_n \rangle$ where $e_j \in E_j$.
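The set-theoretic definition can be checked directly in a few lines of Python. The entity sets, the relationship WORKS_FOR and the helper is_relationship are hypothetical names used only for this sketch.

```python
from itertools import product

# Hypothetical extensions of two entity types.
EMPLOYEE = {"e1", "e2", "e3"}
DEPARTMENT = {"d1", "d2"}

# A relationship type is a set of tuples drawn from the Cartesian product.
WORKS_FOR = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}

def is_relationship(r, *entity_sets):
    """Check that r is a subset of E1 x E2 x ... x En."""
    return set(r) <= set(product(*entity_sets))

print(is_relationship(WORKS_FOR, EMPLOYEE, DEPARTMENT))
print(is_relationship({("e1", "d9")}, EMPLOYEE, DEPARTMENT))
```

The second call fails the check because d9 is not a member of DEPARTMENT.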
Structural constraints for
relationship types
• Cardinality ratio constraint specifies the number of relationship instances that an entity can take part in.
For binary relationship types:
– one-to-one (1:1)
– one-to-many (1:N)
– many-to-many (M:N)
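As a rough illustration, the helper below (a hypothetical function, not part of the course material) reports the tightest cardinality ratio that a given set of binary relationship instances is consistent with. A cardinality ratio is a constraint declared in the schema, so the data can only show which ratios are violated, not which one the designer intended.

```python
from collections import Counter

def cardinality_ratio(pairs):
    """Classify (left, right) relationship instances as '1:1', '1:N', 'N:1' or 'M:N'."""
    pairs = set(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    left_repeats = any(n > 1 for n in left_counts.values())
    right_repeats = any(n > 1 for n in right_counts.values())
    if left_repeats and right_repeats:
        return "M:N"
    if left_repeats:
        return "1:N"  # one left entity is related to many right entities
    if right_repeats:
        return "N:1"  # many left entities are related to one right entity
    return "1:1"

print(cardinality_ratio({("d1", "e1"), ("d1", "e2")}))
print(cardinality_ratio({("e1", "p1"), ("e1", "p2"), ("e2", "p1")}))
```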
Structural constraints cont. ...
• Participation constraint
– specifies whether the existence of an entity depends on its being related to another entity via a relationship type.
E.g. can an employee exist without working for a department?
– Partial participation: the entity can exist without this relationship.
– Total participation: the entity requires this relationship in order to exist.
[Figure: ER-diagram with the entity types employee and department connected by the relationship types works_for and manages, illustrating partial and total participation.]
Roles of relationship types
• A role name specifies what role an entity type plays in a specific relationship.
• Role names are sometimes used in ER-diagrams to clarify the roles of the participating entity types.
[Figure: the entity type Employee participates twice in the relationship type Supervision, once in the role manager (1) and once in the role worker (N).]
Attributes for relationship types
• A relationship type can also have attributes. E.g. if the weekly number of hours an employee works on a project should be kept, it can be represented for each instance of the relationship “works-on”.
• If the relationship is a 1:1 or 1:N relationship, the attribute can be stored at one of the participating entities (for 1:N, the entity on the N side).
• When the relationship is of the type M:N, the attributes must be stored with the instances of the relationship.
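For the M:N case, the usual relational rendering is a separate table for the relationship that carries the attribute. The sketch assumes SQLite as a stand-in DBMS; the table and column names (employee, project, works_on, hours) and the sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE project (pno INTEGER PRIMARY KEY, name TEXT)")

# M:N relationship works_on: hours belongs to each relationship instance,
# i.e. to each (employee, project) pair, not to either entity alone.
conn.execute("""
    CREATE TABLE works_on (
        ssn   TEXT REFERENCES employee (ssn),
        pno   INTEGER REFERENCES project (pno),
        hours REAL,
        PRIMARY KEY (ssn, pno)
    )
""")

conn.execute("INSERT INTO employee VALUES ('111', 'Ada')")
conn.executemany("INSERT INTO project VALUES (?, ?)", [(1, "Alpha"), (2, "Beta")])
conn.executemany(
    "INSERT INTO works_on VALUES (?, ?, ?)", [("111", 1, 30.0), ("111", 2, 10.0)])

hours = dict(conn.execute("SELECT pno, hours FROM works_on WHERE ssn = '111'"))
print(hours)
```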
Weak entity types
• Weak entity types are those that are meaningless without an
owner entity type.
• Weak entities are uniquely identified in the extension by their owner’s key attributes together with their own partial key (the attribute with a broken underline in the ER-diagram).
• The relationship to the owner is called the identifying
relationship.
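A weak entity type is often mapped to a table whose primary key combines the owner's key with the partial key. The sketch assumes SQLite as a stand-in DBMS and uses a hypothetical employee/dependent pair; the ON DELETE CASCADE clause is one possible way to express that a weak entity is meaningless without its owner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the reference
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT)")

# Weak entity type: identified by the owner's key (ssn) plus its partial key (dep_name).
conn.execute("""
    CREATE TABLE dependent (
        ssn      TEXT NOT NULL REFERENCES employee (ssn) ON DELETE CASCADE,
        dep_name TEXT NOT NULL,
        PRIMARY KEY (ssn, dep_name)
    )
""")

conn.executemany("INSERT INTO employee VALUES (?, ?)", [("111", "Ada"), ("222", "Bob")])
# The same partial key value may occur under different owners.
conn.executemany("INSERT INTO dependent VALUES (?, ?)", [("111", "Kim"), ("222", "Kim")])

# Removing an owner removes its weak entities as well.
conn.execute("DELETE FROM employee WHERE ssn = '111'")
remaining = conn.execute("SELECT ssn, dep_name FROM dependent").fetchall()
print(remaining)
```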
ER-notation (Elmasri/Navathe fig. 3.14)
Another ER diagram
ER model transformations
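• Replacing multi-valued attributes by an entity type
[Figure: the entity type DEPARTMENT with the attributes Name, Dept id and the multi-valued attribute Location is transformed into DEPARTMENT (Name, Dept id) connected via the relationship type located at (cardinalities 1 and N) to a new entity type LOCATION.]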
ER model transf. cont. ...
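• Replacing M-N relationships with an entity type and binary relationships.
[Figure: the relationship type lectures (with the attribute Time) between TEACHER, ROOM and COURSE is replaced by a new entity type LECTURES (with the attribute Time), which is connected to TEACHER, ROOM and COURSE through the binary relationship types lectures, booked for and consists of.]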
Example ER-modeling
– An enterprise consists of a number of departments. Each department has
a name, a number, a manager, and a number of employees. The starting
date for every department manager should also be registered. A
department can have several office rooms.
– Every department finances a number of projects. Each project has a
name, a number and an office room.
– For each employee, the following information is kept: name, social security number, address, salary and sex. An employee works for only one department but can work on several projects that can be related to different departments. Information about the number of hours (per week) that an employee works on a project should be stored. Information about the employee's manager should also be stored.
Extended Entity-Relationship
(EER) modeling
• The intention of an E-R diagram is to serve as a basis for user communication or for arriving at a good design specification.
– i.e. try to keep it simple and avoid too much complexity.
• EER (extended or enhanced ER) introduces several notational extensions to deal with concepts such as:
– Superclass /subclass (supertype/subtype, is-a relationship)
• specialization/generalization
• constraints
– Aggregation (whole/part or part-of relationship)
– Union types (category)
EER diagram notation for specialization and subclass
Subclasses, superclasses & inheritance
• Two generic ideas for creating superclass/subclass relationships
– Specialization of superclass into subclasses
– Generalization of subclasses into a superclass
• Constraints and characteristics of spec. & gen.
– Constraints
• Predicate-defined (condition-defined) sub-classes
• Attribute-defined
• User-defined
– Disjointness
• Disjoint
• Overlapping
– Completeness
• Total
• Partial
Generalization of subclasses
Overlapping (nondisjoint) subclasses
Representation of aggregation in ER
notation
A UML conceptual schema
Specialization/generalization in UML
Union of two entity types
Alternative diagrammatic notation for ER/EER