Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
DATA TYPES, DATABASES Gergely ZAJZON Dr. Med. DATABASE • Simply: structured collection of data 2 3 4 5 6 DATABASE • Simply: structured collection of data 7 DATABASE • Simply: structured collection of data • Storage: a collection of tables connecting to each other 8 DATABASE • In computer science, a database is a structured collection of records or data that is stored in a computer system so that a computer program or person using a query language can consult it to answer queries. • The records retrieved in answer to queries are information that can be used to make decisions. 9 DATA • Data is visualised form of some kind of knowledge. Representative property of somebody or something. • From IT perspective: data is coded information, stored in a computer as a signal sequence. • Has to be stored in a structured and processable form 10 DATA - examples • PATIENT DATA: - name - birth date - address - „TAJ” – patient code Phone book: - name - address - phone number 11 INFORMATION • The Sesame Street character Cookie Monster describes information as "news or facts about something." • Measure unit of information: bit: value can be 1 or 0 (true / false). • Information is the result of processing, gathering, manipulating and organizing data in a way that adds to the knowledge of the receiver.. 12 INFORMATION Why do we create databases? • Storing and searching of data • Retrieval of new informations from stored data collections 13 DATA TYPES • Basic property of data fields • Defines: – a set of values – structure of data – allowable operations on those values • Makes it possible, that in one column of a table only same type of data can be stored. 14 BASIC DATA TYPES • Can not be divided to further separete parts 15 COUNTABLE DATA TYPE CLASSESS NUMERIC Byte 0-255 8 bit Shortint (-128)-127 8 bit Integer (-32768)-32767 16 bit LOGICAL 1 or 0 1 bit CHARACTER Any characters 1 byte (8 bit) 16 REAL DATA TYPE CLASSESS Numeric real data types Real (-2,9*10-39)-(1,7*1038) 48 bit Single (-1,5*10-45)-(3,4*1038) 32 bit Double (-5*10-324)-(1,7*10308) 64 bit Extended (-3,4*10-4932)(1,1*104932) 80 bit Comp -2-63 + 1..263-1 64 bit 17 DATA TYPE • Choosing the correct data type: – Helps making your database more structured – Eg.: a field containing birth dates should be DATE type. – Helps validating your data: – Eg.: DATE field can not contain a date like 31. february 2008. – You can not enter a text into an INTEGER field 18 19 Data Types • Binary – Database specific binary objects – Pictures, digital signatures, etc. • Boolean – True/False values • Character – Fixed width or variable size • Numeric – Integer, Real (floating decimal point), Money • Temporal – Time, Date, Timestamp 20 SPREADSHEET <-> DATABASE • Spreadsheet (EXCEL table): structured. - search - order - filter • Software: Eg.: MS-Excel 21 SPREADSHEET <-> DATABASE • „Excel database”: table with special layout: • Rows of the table are connected with eachoter, containing the properties of one selected object. Columns: containing the same kind of property for each object. The table has a header in the first row. 22 SPREADSHEET <-> DATABASE TAJ Name Address Birth date.: Phone number: 123456789 Iv Béla Bp. 1014 Palota utca 1235.12.20. 1 06-1-2020030 234567890 Könyves Kálmán IV. kerület Mély utca 3 1308.Jan.1 +36-20-1234567 345678901 Harmadi k András Székesfehérvár 8000 Fő út 4. 1290-04-01 (30) 1234567 456789012 Dobzse László Bp. 1031 Kis utca 3 1492.Március 2 2345678 012345678 Kalapos József Bp. 1220 Nagy utca 1 30/04/1835 06201234321 23 SPREADSHEET <-> DATABASE • Disadvantages of Spreadsheet - The same data set can only be modified by one user at the same time - Storage of data is not safe enough - Structure of data and connections are not defined 24 SPREADSHEET <-> DATABASE • Disadvantages of Spreadsheet - Logical limit: can not use multiple tables easiliy, as connections between tables can not be defined 25 SPREADSHEET <-> DATABASE Name Postal code City Address Doctor Off_P C Office_city Off_address Office_phone Balla Béla 2080 Érd Fő u. 3. Dr. House 1191 Budapest Kék u. 3. 405-2039 Kis Péter 1022 Budapest Kis u. 2. Dr. Green 1221 Budapest Fa u. 5. 373-8022 Nagy Irén 1134 Budapest Új u. 4. Dr. House 1191 Budapest Kék u. 3. 303-1048 Pók Géza 2560 Diósd Tár u. 9. Dr. Kovac 2205 Diósd Zár u. 10. 323-581 INDIRECT PROPERTY => REDUNDANCY, MISTAKES IN DATA INPUT 26 27 28 29 30 31 DATABASE • Data collection organized on the basis of a data model. • Contains: – data – Structure of data – Connection rule descriptions 32 DATA MODEL • A logical structure, defining the layou of the stored data. • Contains no data values. • Only properties of data, structure and possible connections 33 DATA MODEL CONCEPTUAL DATA MODEL: • Analysis of the informations and connections between them. • Independent from software tools PHYSICAL DATA MODEL: • Structured used in the database:definition of tables, data types, connection • Software dependent 34 CONCEPTUAL DATA MODEL MOZIK PK FILMEK PK MOZI_ID MOZINEV M N FILM_ID FILMCIM MOZICIM HOSSZ MOZITEL NYELV MOZIWEB RENDEZO 35 PHYSICAL DATA MODEL MOZI__FILM FILMEK MOZIK PK N PK MOZI_ID MOZINEV MOZICIM MOZITEL 1 MOZIFILM_ID MOZI_FK FILM_FK PK N 1 FILM_ID FILMCIM HOSSZ VETITES NYELV RENDEZO MOZIWEB 36 Parts of a database • Attributes (fields) – An attribute or field is a component of a record that describes something about an item. • Records – A record is the representation of an individual item. • Table – A collection of records • Database – A collection of tables and rules for accessing the tables 37 Parts of a database Record Tables Attribute/Field • Records become “rows” • Attributes/fields become “columns” • Rules determine the relationship between the tables and tie the data together to form a database 38 Referential Integrity • Rules to preserve relationships • Prevents orphan records – Cannot add records on many sides – Cannot delete from one side • Cascade update • Cascade delete 39 NORMAL FORMS • The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is to such inconsistencies and anomalies. Each table has a "highest normal form" (HNF): by definition, a table always meets the requirements of its HNF and of all normal forms lower than its HNF; also by definition, a table fails to meet the requirements of any normal form higher than its HNF. • The normal forms are applicable to individual tables; to say that an entire database is in normal form n is to say that all of its tables are in normal form n. 40 Normalizing • • • • • Dependencies between data are identified Redundant data is minimized Reduces chances of data errors Reduces disk space The data model is flexible and easier to maintain 41 First Normal Form • Eliminate repeating columns in each table • Create a separate table for each set of related data • Identify each set of related data with a primary key Contacts Id Name Company Address Phone ZipCode 1 Joe ABC 123 5532 12345 1 Joe ABC 123 2234 12345 1 Joe ABC 123 3211 12345 2 Jane XYZ 456 3421 14454 3 Chris PDQ 789 2341 14423 3 Chris PDQ 789 6655 14423 Benefits: Now we can have infinite phone numbers or company addresses for each contact. Drawback: Now we have to type in everything over and over again. This leads to inconsistency, redundancy and wasting space. Thus, the second normal form… 42 Second Normal Form • Create separate tables for sets of values that apply to multiple records • Relate these tables with a “foreign key”. People Id Name Company Address Zip 1 Joe ABC 123 12345 2 Jane XYZ 456 14454 3 Chris PDQ 789 14423 PhoneNumbers PhoneID Id Phone 1 1 5532 2 1 2234 3 1 3211 4 2 3421 5 3 2341 6 3 6655 43 Third Normal Form • Eliminate fields that do not depend on the primary key. PhoneNumbers PhoneID Id Phone 1 1 5532 2 1 2234 3 1 3211 4 2 3421 5 3 2341 6 3 6655 People Id Name AddressID 1 Joe 1 2 Jane 2 3 Chris 3 Address AddressID Company Address Zip 1 ABC 123 12345 2 XYZ 456 14454 3 PDQ 789 14423 Is this enough? Codd thought so… What about “many to many”? 44 Kinds of Relationships • “One to One” – One row of a table matches exactly to another • One person, one id number, one address • “One to Many” – One row of a table matches many of another • One person, many phone numbers • “Many to Many” – One row may match many of another or many rows match one row of another 45 TAJ Név Irsz Város Cím Háziorvos Rend_irsz Rend_város Rend_cím RendMh_tel 123456789 Balla Béla 2080 Érd Fő u. 3. Dr. Ház 1191 Budapest Kék u. 3. 405-2039 123456790 Kis Péter 1022 Budapest Kis u. 2. Dr. Zöld 1221 Budapest Fa u. 5. 373-8022 123456791 Nagy Irén 1134 Budapest Új u. 4. Dr. Ház 1191 Budapest Kék u. 3. 303-1048 123456792 Pók Géza 2560 Diósd Tár u. 9. Dr. Kovács 2205 Diósd Zár u. 10. 323-5819 TAJ Név Irsz Város Cím Háziorvo s 1234567 89 Balla Béla 20 80 Érd Fő u. 3. Dr. Ház 1234567 90 Kis Péter 10 22 Budape st Kis u. 2. Dr. Zöld 1234567 91 Nagy Irén 11 34 Budape st Új u. 4. Dr. Ház 1234567 92 Pók Géza 25 60 Diósd Tár u. 9. Dr. Kovács Háziorvos Rend_ir sz Rend_vár os Rend_cí m RendMh_t el Dr. Ház 1191 Budapest Kék u. 3. 405-2039 Dr. Zöld 1221 Budapest Fa u. 5. 373-8022 Dr. Ház 1191 Budapest Kék u. 3. 303-1048 46 BETEGEK PK TAJ NEV IRSZ VAROS CIM HAZIORVOS REND IRSZ REND VAROS REND_CIM RENDMH_TEL BETEGEK PK ORVOSOK PK TAJ NEV M N PECSÉT ORVOS IRSZ IRSZ VAROS VAROS CIM CIM TEL 47 DATA MANAGEMENT LANGUAGE • SQL = Structured Query Language • DDL – Data Definition (CREATE) • DML – Data Manipulation (INSERT, UPDATE, DELETE) • QL – Query Language (SELECT) 48 TABLE TYPES • • • • Main (core) tables Subtables Linking tables (association table) Dictionary tables (look-up table, value list) 49 DATABASE SERVER • A database server is a computer program that provides database services to other computer programs or computers, as defined by the client-server model. The term may also refer to a computer dedicated to running such a program. Database management systems frequently provide database server functionality, and some DBMS's (e.g., MySQL) rely exclusively on the client-server model for database access. • Database Master servers are central and main locations of data while Database Slave servers are synchronized backups of the master acting as proxies. 50 DATABASE SERVER • • • • • • Special DBMS software user interface is not an important aim Allows multi-user change access Controll of user rights Backup of stored data Transaction management 51 DATABASE SERVERS • • • • Oracle MS-SQL My-SQL PostgreSQL 52 Gergely Zajzon Dr. Med. E-mail: [email protected] Web: http://www.gyogyszerugy.hu Telefon: +36-20-9604294 53