CLARK UNIVERSITY
College of Professional and Continuing Education (COPACE)
Management Information Systems
Lecture 03
Database management system

Plan
• Term “Database”
• Architecture of database
• Data models
• Normal forms
• Operations of relational algebra
• Operations of SQL
• DBMS

Term “Database”
• A huge amount of data is entered into computer systems every day.
• Where does this data go and how is it used?
• How can it help you on the job?

In a broad sense, a database is a collection of facts about real-world objects in some subject area.
A subject area is the part of the real world that we study in order to manage it: a company, a university, etc.

Non-structured data
Folder No. 16493, Smith John, 01/01/1976; folder No. 16593, LeVering Barbara, 03/15/1975; folder No. 16693, McCow Robert, 04/14/1976.

Structured data
Folder No.  Surname   First name  Birth date
16493       Smith     John        01/01/1976
16593       LeVering  Barbara     03/15/1975
16693       McCow     Robert      04/14/1976

Structuring is the introduction of agreements on the ways of presenting data.

Database definition
• A database (DB) is a named collection of structured data related to a particular subject area.
• A database management system (DBMS) is a set of software and language tools needed to create databases, keep them up to date, and search them for the required information.

Classification of databases
By data processing:
• Centralized: stored in a single computer system, which may be a mainframe (accessed via terminals) or a file server on a network.
• Distributed: consists of several parts stored on different computers of a network.
By access to data:
• With local access
• With network access
A centralized database with network access can have one of the following architectures:
• File-server
• Client-server
• Two-level model
• Three-level model

File-server architecture
Client (workstation):
1. Input and display of data
2. Data access and search by criteria
3. Computing functions on the data
Server:
1. Keeping the database file
Advantages:
1. The server does not need very high performance (what matters most is enough disk space).
2. No DBMS runs on the server or needs to be installed there.
Disadvantages:
1. High network traffic
2. No special mechanisms for protecting the database file

Client-server architecture
Client:
1. Input and display of data
2. Computing functions on data sets
Server:
1. Keeping the database file
2. Data access and search by criteria
Advantages:
1. Lower network traffic than in the file-server model
2. The SQL server provides functions that ensure the integrity and security of data
Disadvantages:
1. In some cases the data sets sent to the client can be quite large

Two-level architecture
Client:
1. Input and display of data
Server:
1. Keeping the database file
2. Data access and search by criteria
3. Computing procedures on data sets
Advantages:
1. Significant reduction in network traffic compared to client-server
2. High reliability of data storage and processing
Disadvantages:
1. High demands on the server computer (disk space and speed)

Three-level architecture
The three-level architecture involves the following application components: a client application (a “thin client” or terminal) connected to an application server, which in turn is connected to the database server.

Data models
Stored data has a logical structure described by a data model that the DBMS supports.
A data model determines how data is organized, the constraints on it, and the set of operations allowed on it.
• Hierarchical model
• Network model
• Relational model

Hierarchical model
The hierarchical model was historically the first to be developed.
The first professional DBMS, IMS (IBM), was built on this model in the late 1960s and early 1970s.
Links between data entities are described by a tree-structured graph.
Advantage:
1. Fairly fast execution of operations on data
Disadvantages:
1. Hard for the average user to understand
2. Redundancy is present

Network model
Links between data are described by an arbitrary graph.
Advantages:
1. Minimal redundancy
2. Compared to the hierarchical model, the network model allows much more freedom in forming new links
3. Efficient implementation in terms of memory consumption
Disadvantages:
1. Hard for the average user to understand
2. Weaker control over the correctness of links

Relational model
The relational model was proposed by Edgar Codd in 1970. It is based on the concept of a relation.
A relation is represented graphically as a table.
In a relational database, the user perceives the data as a set of tables.

Folder No.  Surname   First name  Birth date  Group
16493       Smith     John        01/01/1976  111
16593       LeVering  Barbara     03/15/1975  111
16693       McCow     Robert      04/14/1976  112

Advantage:
1. Simple and clear for a wide range of users, which is the reason for its wide adoption.
Disadvantage:
1. Some redundancy is necessary to represent the relationships between tables.

There are alternative terms:
Alternative 1  Alternative 2  Alternative 3
Table          File           Relation
Row            Record         Tuple
Column         Field          Attribute

Primary key
Folder No.  Surname   First name  Birth date  Group
16493       Smith     John        01/01/1976  111
16593       LeVering  Barbara     03/15/1975  112
16693       McCow     Robert      04/14/1976  113

A primary key is an attribute (or set of attributes) of a relation that uniquely identifies each of its records.
Student(FolderNo., Surname, First_name, Birth_date, Group)

Foreign key
We can link tables by foreign keys.
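How a DBMS uses keys can be tried out with a short sketch. It assumes SQLite through Python's standard sqlite3 module, which the lecture does not prescribe. The table and column names follow the Student and Group relations of the lecture; the student "Jane Doe", folder number 16793 and group 999 are hypothetical test values. The primary key rejects a second record with an existing FolderNo, and the foreign key rejects a student whose group does not exist.

```python
import sqlite3


def build_database():
    """Create the Student and Group relations in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this option is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    # "Group" is a reserved word in SQL, so the name is written in double quotes.
    conn.execute('CREATE TABLE "Group" (Number INTEGER PRIMARY KEY)')
    conn.execute(
        "CREATE TABLE Student ("
        "FolderNo INTEGER PRIMARY KEY, "
        "Surname TEXT, First_name TEXT, Birth_date TEXT, "
        '"Group" INTEGER REFERENCES "Group"(Number))'
    )
    conn.executemany('INSERT INTO "Group" VALUES (?)', [(111,), (112,), (113,)])
    conn.executemany(
        "INSERT INTO Student VALUES (?, ?, ?, ?, ?)",
        [
            (16493, "Smith", "John", "01/01/1976", 111),
            (16593, "LeVering", "Barbara", "03/15/1975", 112),
            (16693, "McCow", "Robert", "04/14/1976", 113),
        ],
    )
    conn.commit()
    return conn


def try_insert(conn, row):
    """Return True if the row is stored, False if a key constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = build_database()
    # Hypothetical rows used only to exercise the constraints.
    print(try_insert(conn, (16493, "Doe", "Jane", "02/02/1977", 111)))  # FolderNo already used
    print(try_insert(conn, (16793, "Doe", "Jane", "02/02/1977", 999)))  # group 999 is not in "Group"
    print(try_insert(conn, (16793, "Doe", "Jane", "02/02/1977", 112)))  # both keys are valid
```

The first two inserts are rejected and the third is accepted, so the DBMS itself, not the application, guards uniqueness and the links between tables.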
A foreign key is an attribute (or set of attributes) of a relation that is the key of another relation.
Student(FolderNo., Surname, First_name, Birth_date, Group)
Group(Number, Specialisation, Head_of_group)

Indexes
An index is a means of speeding up the search for records in a table, as well as other operations that rely on search (retrieval, modification, sorting, etc.).

Index (sorted by surname)      Table
Surname    Location            Location  Surname
Afanasiev  4                   1         Alekseev
Alekseev   1                   2         Yakovlev
Kuznecov   1000                3         Mikhailov
Mikhailov  3                   4         Afanasiev
…          …                   …         …
Yakovlev   2                   1000      Kuznecov

Types of indexes
Primary
• The key field is always indexed, so it does not require an additional index.
Secondary
• They are used to speed up searching and query execution.
• There may be several secondary indexes.
• They may include several fields.
• The same field may be part of different indexes.

Data redundancy
Folder No.  Surname    First name  Birth date  Group  Specialization
15345       Ivanov     Ivan        04/15/1989  392    Informatics and Management
15349       Medvedeva  Anna        02/13/1989  392    Informatics and Management
15310       Petrov     Mikhail     11/12/1989  392    Informatics and Management
15259       Sidorov    Nikolay     01/26/1987  591    Informatics and English
15263       Sanin      Alexander   10/20/1987  591    Informatics and English

The problem of update
Folder No.  Surname  First name  Birth date  Group  Specialization  Term  Course                Score
15345       Ivanov   Ivan        04/15/1989  492    IM              2     English               Good
15345       Ivanov   Ivan        04/15/1989  492    IM              2     Theory of algorithms  Excellent
15345       Ivanov   Ivan        04/15/1989  392 ?  IM              2     Chemistry             Good
15310       Petrov   Mikhail     11/12/1989  392    IM              6     English               Satisfactory
15310       Petrov   Mikhail     11/12/1989  392    IM              6     Theory of algorithms  Satisfactory
15310       Petrov   Mikhail     11/12/1989  392    IM              6     Chemistry             Good
15259       Sidorov  Nikolay     01/26/1987  591    IE              10    Architecture of PC    Excellent
15259       Sidorov  Nikolay     01/26/1987  591    IE              10    Computer modeling     Excellent
When a student's group changes, every row for that student has to be changed; a missed row (marked “?”) leaves the data inconsistent.

The problem of inserting new data
Folder No.  Surname   First name  Birth date  Group  Specialization  Term  Course                Score
15345       Ivanov    Ivan        04/15/1989  392    IM              2     English               Good
15345       Ivanov    Ivan        04/15/1989  392    IM              2     Theory of algorithms  Excellent
15345       Ivanov    Ivan        04/15/1989  392    IM              2     Chemistry             Good
15310       Petrov    Mikhail     11/12/1989  392    IM              6     English               Satisfactory
15310       Petrov    Mikhail     11/12/1989  392    IM              6     Theory of algorithms  Satisfactory
15310       Petrov    Mikhail     11/12/1989  392    IM              6     Chemistry             Good
15259       Sidorov   Nikolay     01/26/1987  591    IE              10    Architecture of PC    Excellent
15259       Sidorov   Nikolay     01/26/1987  591    IE              10    Computer modeling     Excellent
15402       Stepanov  Andrew      03/29/1991  191    MSIT            1     -                     -
A new student who has no scores yet can be entered only with empty Course and Score fields.

Normalization
Normalization of relations is a set of rules for forming relations (tables) so as to eliminate duplication and inconsistency of the data stored in the database.
E. Codd defined three normal forms of relations and a procedure for converting any relation to third normal form.

First normal form (1NF)
A table is in 1NF if each of its cells always holds exactly one atomic value and never a set of values.

Folder No.  Surname  First name  Birth date  Group  Specialization  Term  Course                Score
15345       Ivanov   Ivan        04/15/1989  392    IM              2     English               Good
                                                                          Theory of algorithms  Excellent
                                                                          Chemistry             Good
15310       Petrov   Mikhail     11/12/1989  392    IM              6     English               Satisfactory
                                                                          Theory of algorithms  Satisfactory
                                                                          Chemistry             Good
15259       Sidorov  Nikolay     01/26/1987  591    IE              10    Architecture of PC    Excellent
                                                                          Computer modeling     Excellent
The table is not in 1NF.

Folder No.  Surname  First name  Birth date  Group  Specialization  Term  Course                Score
15345       Ivanov   Ivan        04/15/1989  392    IM              2     English               Good
15345       Ivanov   Ivan        04/15/1989  392    IM              2     Theory of algorithms  Excellent
15345       Ivanov   Ivan        04/15/1989  392    IM              2     Chemistry             Good
15310       Petrov   Mikhail     11/12/1989  392    IM              6     English               Satisfactory
15310       Petrov   Mikhail     11/12/1989  392    IM              6     Theory of algorithms  Satisfactory
15310       Petrov   Mikhail     11/12/1989  392    IM              6     Chemistry             Good
15259       Sidorov  Nikolay     01/26/1987  591    IE              10    Architecture of PC    Excellent
15259       Sidorov  Nikolay     01/26/1987  591    IE              10    Computer modeling     Excellent
The table is in 1NF.

[Figure: functional dependency diagram for the “Student” database example]

A table is in 2NF if it is in 1NF and contains no non-key attributes that functionally depend on only part of the key.
A table is in 3NF if it is in 2NF and contains no non-key attributes that depend transitively on the key.

Result of designing the “Student” database
Folder No.  Surname  First name  Birth date  Group
15345       Ivanov   Ivan        04/15/1989  392
15310       Petrov   Mikhail     11/12/1989  392
15259       Sidorov  Nikolay     01/26/1987  591

Group  Specialization
392    IM
591    IE

Folder No.  Term  Course                Score
15345       2     English               Good
15345       2     Theory of algorithms  Excellent
15345       2     Chemistry             Good
15310       6     English               Satisfactory
15310       6     Theory of algorithms  Satisfactory
15310       6     Chemistry             Good
15259       10    Architecture of PC    Excellent
15259       10    Computer modeling     Excellent

SQL
The query language SQL (Structured Query Language) gives users, programs, and computing systems access to the information held in relational databases.

Relational algebra
SQL is based on relational algebra operations.
Relational algebra is a set of operations on relations. It was developed by Codd within the relational model.
Applying relational algebra operations to relations yields new relations.

Relational algebra: union
Students of group 392
No.    Surname    Name       Birth date  Group
15345  Ivanov     Ivan       04/15/1989  392
15349  Medvedeva  Anna       02/13/1989  392
15310  Petrov     Mikhail    11/12/1989  392
Students of group 591
No.    Surname    Name       Birth date  Group
15259  Sidorov    Nikolay    01/26/1987  591
15263  Sanin      Alexander  10/20/1987  591
Result
No.    Surname    Name       Birth date  Group
15345  Ivanov     Ivan       04/15/1989  392
15349  Medvedeva  Anna       02/13/1989  392
15310  Petrov     Mikhail    11/12/1989  392
15259  Sidorov    Nikolay    01/26/1987  591
15263  Sanin      Alexander  10/20/1987  591

Relational algebra: intersection
Operands: students of groups 392 and 591 (the five rows of the union result) and students of group 392.
Result
No.    Surname    Name       Birth date  Group
15345  Ivanov     Ivan       04/15/1989  392
15349  Medvedeva  Anna       02/13/1989  392
15310  Petrov     Mikhail    11/12/1989  392

Relational algebra: subtraction
Operands: students of groups 392 and 591, minus students of group 392.
Result
No.    Surname    Name       Birth date  Group
15259  Sidorov    Nikolay    01/26/1987  591
15263  Sanin      Alexander  10/20/1987  591

Relational algebra: selection
From the students of groups 392 and 591, select the students of group 591.
Result
No.    Surname    Name       Birth date  Group
15259  Sidorov    Nikolay    01/26/1987  591
15263  Sanin      Alexander  10/20/1987  591

SQL
SELECT [DISTINCT] elements
FROM table(s)
[WHERE condition]
[GROUP BY field(s) [HAVING condition]]
[ORDER BY field(s)]

SELECT: choose the listed elements
DISTINCT: eliminate duplicate rows from the output
FROM: the tables to read from
WHERE: keep only the rows for which the condition is true
GROUP BY: group rows by some field(s)
HAVING: keep only the groups that satisfy the condition
ORDER BY: sort the output

The examples below use the table Students:
No.    Surname    Name       Birth date  Group
15345  Ivanov     Ivan       04/15/1989  392
15349  Medvedeva  Anna       02/13/1989  392
15310  Petrov     Mikhail    11/12/1989  392
15259  Sidorov    Nikolay    01/26/1987  591
15263  Sanin      Alexander  10/20/1987  591
Group is a reserved word in SQL and Birth date contains a space, so both names are written in double quotes in the queries.

Select surnames of all students
SELECT Surname FROM Students
Result
Surname
Ivanov
Medvedeva
Petrov
Sidorov
Sanin

Select all information about students of group 591, sorting them by surname
SELECT * FROM Students WHERE "Group" = 591 ORDER BY Surname
Result
No.    Surname    Name       Birth date  Group
15263  Sanin      Alexander  10/20/1987  591
15259  Sidorov    Nikolay    01/26/1987  591

Select birth date of Petrov
SELECT "Birth date" FROM Students WHERE Surname = 'Petrov'
Result
Birth date
11/12/1989

DBMS
A database management system (DBMS) is a set of software and language tools needed to create databases, keep them up to date, and search them for the required information.
The term “database server” is generally used to refer to an entire DBMS built on the client-server architecture, including both its server and client parts.
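As one concrete illustration of a DBMS at work, the SELECT queries from the SQL part of the lecture can be run against a small relational database. This is a minimal sketch assuming SQLite through Python's standard sqlite3 module; the lecture itself does not name a particular system for these queries. The helper names make_students_db and query are hypothetical. The column names Group and Birth date are written in double quotes because the first is a reserved word and the second contains a space.

```python
import sqlite3

# Rows of the Students table from the lecture.
STUDENTS = [
    (15345, "Ivanov", "Ivan", "04/15/1989", 392),
    (15349, "Medvedeva", "Anna", "02/13/1989", 392),
    (15310, "Petrov", "Mikhail", "11/12/1989", 392),
    (15259, "Sidorov", "Nikolay", "01/26/1987", 591),
    (15263, "Sanin", "Alexander", "10/20/1987", 591),
]


def make_students_db():
    """Create an in-memory database holding the Students table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students ("
        '"No" INTEGER PRIMARY KEY, Surname TEXT, Name TEXT, '
        '"Birth date" TEXT, "Group" INTEGER)'
    )
    conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", STUDENTS)
    conn.commit()
    return conn


def query(conn, sql, params=()):
    """Run one SELECT statement and return all result rows as a list of tuples."""
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = make_students_db()
    # Surnames of all students (row order is not fixed without ORDER BY).
    print(query(conn, "SELECT Surname FROM Students"))
    # All information about students of group 591, sorted by surname.
    print(query(conn, 'SELECT * FROM Students WHERE "Group" = ? ORDER BY Surname', (591,)))
    # Birth date of Petrov.
    print(query(conn, 'SELECT "Birth date" FROM Students WHERE Surname = ?', ("Petrov",)))
    # GROUP BY with HAVING: groups that have more than two students.
    print(query(conn, 'SELECT "Group", COUNT(*) FROM Students GROUP BY "Group" HAVING COUNT(*) > 2'))
```

The values 591 and Petrov are passed as parameters (the ? placeholders) rather than pasted into the query text, which is the usual way to hand user input to a DBMS safely.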
Types of DBMS
• Network DBMS (CronosPlus)
• Hierarchical DBMS (IMS)
• Relational DBMS (MS Access, Paradox, Interbase, FireBird, MySQL, Oracle, Ingres)
• Object-oriented and object-relational DBMS (Oracle Database, Microsoft SQL Server)
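Worked example: relational algebra in code
The relational algebra operations covered in this lecture (union, intersection, subtraction, selection) can be sketched without any DBMS by treating a relation as a set of tuples. This is an illustration in plain Python, not how a relational DBMS is implemented; the names Student, GROUP_392, GROUP_591 and the four function names are hypothetical.

```python
from typing import NamedTuple


class Student(NamedTuple):
    no: int
    surname: str
    name: str
    birth_date: str
    group: int


# A relation is modelled as a set of tuples: no duplicates, no row order.
GROUP_392 = frozenset({
    Student(15345, "Ivanov", "Ivan", "04/15/1989", 392),
    Student(15349, "Medvedeva", "Anna", "02/13/1989", 392),
    Student(15310, "Petrov", "Mikhail", "11/12/1989", 392),
})
GROUP_591 = frozenset({
    Student(15259, "Sidorov", "Nikolay", "01/26/1987", 591),
    Student(15263, "Sanin", "Alexander", "10/20/1987", 591),
})


def union(r, s):
    """Tuples that belong to r, to s, or to both."""
    return r | s


def intersection(r, s):
    """Tuples that belong to both r and s."""
    return r & s


def subtraction(r, s):
    """Tuples of r that do not belong to s."""
    return r - s


def selection(r, predicate):
    """Tuples of r for which the predicate is true."""
    return frozenset(t for t in r if predicate(t))


def surnames(r):
    """Sorted surnames, used only to print a relation compactly."""
    return sorted(t.surname for t in r)


if __name__ == "__main__":
    all_students = union(GROUP_392, GROUP_591)
    print(surnames(all_students))
    print(surnames(intersection(all_students, GROUP_392)))
    print(surnames(subtraction(all_students, GROUP_392)))
    print(surnames(selection(all_students, lambda t: t.group == 591)))
```

Each operation takes relations and returns a relation, so results can be fed into further operations, which is the property SQL queries build on.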