* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Normalization
Survey
Document related concepts
Transcript
Normalization (Codd, 1972) Practical Information For Real World Database Design Requirements for Relational DB • Table format • Supports Boolean Algebra – Selects, joins, projects + 5 other operations to define queries • Supports mathematical, relational, and logical operators (And-Or-Not) • Codd’s Twelve Rules (abstracted) – – – – – Null values can be present except in primary key All data represented in tables Must be able to update views Access data with table name, field name, or value Data and programs should be independent – Should enforce integrity and validity constraints Normalization Defined • Normalization – Purpose is to avoid potential update problems called anomalies – Assigned attributes to entities using 1NF, 2NF, 3NF, 4NF, 5NF • Denormalization – Moving back a level to gain better performance in the real-world database; in practice 3NF is most common, however, to gain efficiency and speed, minor changes may need to be made Why Normalization Is Important • If not done, updates are less efficient (larger tables, possibly more than one update per data item change) • If not done, indexing is more cumbersome – impractical to build large databases • If not done, no simple strategies for creating views required by users Design Rules • Determine Business Rules – A company manages many different projects – Each project requires the services of many employees – Employees may be assigned to work on more than one project – Each employee has a job classification – Many employees have the same job classification • Translate business rules to validity constraints and relationships Design Rules, Continued • Analyze documents, interview key users, etc. to develop a field list • Determine entities to be used (see next slide for definition) • Determine relationships between entities • Assign the attributes to the entities • Identify primary and foreign keys • Check for 1NF, 2NF, and 3 NF Definitions • Entity – the subject to be modeled by the database file (table or relation) • Primary Key – the field value that uniquely identifies the entity entry (row, tuple, record); all other attributes are functionally dependent on it; can’t be null • Foreign Key – the field (attribute, column) that relates the table to a pre-existing table • Functional Dependence - determines or depends on, e.g. advisor name depends on advisor ID Definitions, Continued • Views – Selected group of records (select) – Selected group of fields (project) – Selected group of records and fields from two or more tables (join) – A query – A report – A set of labels Definitions, Continued • Determinant – Determines value of another attribute; e.g. primary key • Indexes – Tables that contain record numbers only arranged in an order based on some field value • Entity Integrity – Every table must have a field to uniquely identity each record and there must be a field value for every record • Referential Integrity – If a record has a value in a foreign key field, it must match an exiting value in the original table to which it is linked 1 Normal Form • Table does not contain repeating groups – To put it another way, each record has at least one field that differentiates it from every other record in the file; e.g. a unique primary key Examples: Faculty ID is primary key and the same faculty id is associated with two or more courses Solve by creating a course file Faculty ID is primary key and the same faculty id is associated with two or more offices Solve by redesigning database to include offices as a separate table 2 Normal Form • Table must be in 1 Normal Form • No non-key attribute is dependent on only part of the concatenated key – Concatenated key (two or more fields taken together represent primary key) • In course table, concatenated key is faculty ID and Catalog No – every field in table must be dependent on both faculty ID and catalog number Anomalies Avoided By 2 NF • Only have one data item to change when update is made • Avoids “loose” data when deletes are made – When a part number is deleted, could lose reference to invoice – How do you add a new course when there is no associated faculty ID? – A new office with no assigned faculty? • Avoids inconsistent data 3 Normal Form • The only determinants are candidate keys – Candidate keys in student file are social security number and patron ID (both are unique) • To put it another way, there are no transitive dependencies – If student file contains Dept ID (foreign key) and department name, this is a transitive dependency 4 and 5 Normal Form • 4 Normal Form – There are no multivalued dependencies – is like Boyce Codd • 5 Normal Form – Holds only theoretical interest