Normalization

What is Normalization?
Normalization Levels
– First Normal Form
– Second Normal Form
– Third Normal Form
Referential Integrity

What Is Normalization?
Definition – process of reducing data redundancy in a relational database
Process – by organizing data into tables of various Normal Forms
Benefits
– greater organization of database
– reduction of redundant data
– data consistency in database
– more flexible database design
– better handle on database security

Normal Forms
Definition – way of measuring the extent to which a database has been constrained (to reduce redundancy)
Levels of Normalization
– First Normal Form (1NF)
– Second Normal Form (2NF)
– Third Normal Form (3NF)

First Normal Form
Informal Description
– Each record is unique
– All attribute values are atomic
Objective of 1NF is to divide the database into tables so that each row is unique (with a primary key) and each field is atomic (single item per cell).

What about this? (Employees Table)
lastName: Tom
firstName: Smith
address: 123 Apple St.
city: San Francisco

Company_Database
emp_id, last_name, first_name, middle_name, address, city, state, zip, phone, pager, position, date_hire, pay_rate, date_last_raise, cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_phone, cust_fax, order_num, quantity, order_date, prod_id, prod_descrip, cost
(Examples in this section are adapted from R. Stephens & R. Plew, Teach Yourself SQL in 24 Hours, SAMS, p. 47-49)

Company_Database
emp_id, last_name, first_name, middle_name, address, city, state, zip, phone, pager, position, position_descrip, date_hire, pay_rate, date_last_raise, cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_phone, cust_fax, order_num, quantity, order_date, prod_id, prod_descrip, cost
is divided into:
Employees: emp_id, last_name, first_name, middle_name, address, city, state, zip, phone, pager, position, position_descrip, date_hire, pay_rate, date_last_raise
Customers: cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_phone, cust_fax, order_num, quantity, order_date
Products: prod_id, prod_descrip, cost

Second Normal Form
Informal Description
– All of the strictly informational attributes are attributes of the entities in the table scheme
Objective of 2NF is to take the data items that are only partly dependent on the primary key and form a separate table with them.

What about this? (Customers Table)
custID: 12345
custLName: Chew
custFName: Mary
orderItem: refrigerator
orderQuant: 2

Company_Database: Employees
Employees: emp_id, last_name, first_name, middle_name, address, city, state, zip, phone, pager, position, date_hire, pay_rate, date_last_raise
is divided into:
Employees: emp_id, last_name, first_name, middle_name, address, city, state, zip, phone, pager
Employee_Pays: emp_id, position, position_descrip, date_hire, pay_rate, date_last_raise

Company_Database: Customers
Customers: cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_phone, cust_fax, order_num, quantity, order_date
is divided into:
Customers: cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_phone, cust_fax
Orders: cust_id, order_num, quantity, order_date

Third Normal Form
Informal Description
– Strictly informational attributes depend only on the primary key
Objective of 3NF is to take data items in a table that are not dependent on the primary key and form a separate table with them.

What about this?
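One way to picture the 3NF objective is the minimal Python sketch below. It rests on assumptions that are not in the slides: the rows are invented, the helper name split_positions is hypothetical, and the trimmed Employee_Pays rows get a pos_id column as an assumed link back to Positions. The idea is only that position_descrip is determined by position rather than by emp_id, so it is stored once per position.

```python
# Hypothetical Employee_Pays rows (2NF); all values are invented for illustration.
employee_pays = [
    {"emp_id": 1, "position": "clerk", "position_descrip": "front desk",
     "date_hire": "1996-01-22", "pay_rate": 9.50},
    {"emp_id": 2, "position": "sales", "position_descrip": "outside sales",
     "date_hire": "1995-10-28", "pay_rate": 16.80},
    {"emp_id": 3, "position": "clerk", "position_descrip": "front desk",
     "date_hire": "1997-03-01", "pay_rate": 9.75},
]


def split_positions(rows):
    """Move position_descrip, which is determined by position rather than
    by emp_id, into a separate Positions table."""
    positions = {}  # position name -> row of the new Positions table
    pays = []       # trimmed Employee_Pays rows
    for row in rows:
        name = row["position"]
        if name not in positions:
            positions[name] = {
                "pos_id": len(positions) + 1,
                "position": name,
                "position_descrip": row["position_descrip"],
            }
        pays.append({
            "emp_id": row["emp_id"],
            "pos_id": positions[name]["pos_id"],  # assumed link column
            "date_hire": row["date_hire"],
            "pay_rate": row["pay_rate"],
        })
    return pays, list(positions.values())


pays, positions = split_positions(employee_pays)
for row in positions:
    print(row)
```

After the split, each position description appears once in positions, however many employees hold that position.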
Company_Database: Employee_Pays
Employee_Pays: emp_id, position, date_hire, pay_rate, date_last_raise
is divided into:
Employee_Pays: emp_id, date_hire, pay_rate, date_last_raise
Positions: pos_id, position, position_descrip

Quiz
– Normalization is the process of grouping logically related data into tables to reduce redundancy. (T/F)
– Having no duplicate or redundant data in a database, and having everything in the database normalized, is always the best way to go. (T/F)
– If data is in the third normal form, it is automatically in the first and second normal forms. (T/F)
– What is the major advantage of a denormalized database versus a normalized database?
– What are some major disadvantages of denormalization?

Exercise: What Type of Relationships Do the Tables Have?
Employee_Pays: pay_id, date_hire, pay_rate, date_last_raise
Positions: pos_id, position, position_descrip
Employees: emp_id, last_name, first_name, middle_name, address, city, state, zip, phone, pager
Customers: cust_id, cust_name, cust_address, cust_city, cust_state, cust_zip, cust_phone, cust_fax
Orders: order_num, quantity, order_date

Exercise: Normalize the following data.
Take the following data and normalize it. Keep in mind that, in a real DB, there would be many more items than what is given here.
Employees:
– Angela Smith, secretary, 317-545-65879, RR 1 Box 73, Greensburg, IN, 47890, $9.50/hour, started Jan. 22, 1996, SSN is 323149669
– Jack Lee Nelson, salesman, 3334 N. Main St., Brownsburg, IN, 45687, 317-852-9901, $35,000.00/year, date started 10/28/95, SSN is 312567342
Customers:
– Robert’s Games & Things, 5612 Lafayette Rd., Indianapolis, IN, 46224, 317-291-7888, customer ID is 432A
– Reed’s Dairy Bar, 4556 W 10th St., Indianapolis, IN, 46245, 317-271-9823, customer ID is 117A
CustomerOrders:
– Customer ID is 117A, date of last order is 2/20/1997, product ordered was napkins, and product ID is 661

Solutions:
Employees: Ssn, name, street, city, state, zip, phoneNum, salary, hourlyRate, startDate, position
Customers: customerID, name, street, city, state, zip, phoneNum
Orders: orderID, customerID, productID, productDescrip, dateOrdered

Functional Dependency
Definition
– If A and B are attributes of relation R, then B is functionally dependent on A if and only if each value of A in R has associated with it exactly one value of B in R.
– A → B (A determines B)

Student(stuID, stuName, major, credits, status, SSN)

stuID   stuName         major     credits   status   SSN
S1001   Smith, Tom      History   90        Sen      100429500
S1003   Jones, Mary     Math      95        Sen      010124567
S1006   Lee, Pamela     CIS       15        Fresh    088520876
S1010   Burns, Edward   Art       63        Jun      099320985
S1060   Jones, Mary     CIS       25        Fresh    064624738

stuID → stuName
stuID → stuName, major, credits, status, SSN
SSN → stuID, stuName, major, credits, status, SSN
credits → status
status → credits (Not true)

Classes(course#, stuID, stuName, facID, sched, room, grade)

course#   stuID   stuName         facID   sched   room
ART103A   S1001   Smith, Tom      F101    MWF9    H221
ART103A   S1010   Burns, Edward   F101    MWF9    H221
ART103A   S1006   Lee, Pamela     F101    MWF9    H221
CIS201A   S1003   Jones, Mary     F105    TH10    M110
CIS201A   S1006   Lee, Pamela     F105    TH10    M110
HST205A   S1001   Smith, Tom      F202    MWF11   H221

grade (row alignment not recoverable): A, B, A, C

course#, stuID → stuName, facID, sched, room, grade
course# → facID, sched, room
stuID → stuName

Second Normal Form (2NF)
A relation is 2NF if and only if it is in first normal form and all non-key attributes are fully functionally dependent on the key.
Note: if R is 1NF and the key consists of a single attribute, the relation is automatically 2NF.

Full Functional Dependence
In a relation R, attribute B of R is fully functionally dependent on an attribute or set of attributes A of R if B is functionally dependent on A but not functionally dependent on any proper subset of A.

Classes(course#, stuID, stuName, facID, sched, room, grade)
course#, stuID → stuName, facID, sched, room, grade
course# → facID, sched, room
stuID → stuName
is divided into:
classes2(course#, stuID, grade)
course(course#, facID, sched, room)
students(stuID, stuName)

Third Normal Form (3NF)
A relation R is 3NF if it is 2NF and no non-key attribute is transitively dependent on the key.
Transitive dependence: A → B → C, for example stuID → credits → status

student(stuID, stuName, major, credits, status)
stuID → credits → status
stuID → status
is divided into:
students2(stuID, stuName, major, credits)
stats(credits, status)
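
The functional dependencies listed for the Student relation can be checked against its sample rows with a small Python sketch. The helper name fd_holds and the dictionary representation of rows are assumptions made for illustration; the data is the Student table from the Functional Dependency section. A sample can only refute a dependency: passing the check on five rows does not prove that the dependency holds for every possible row.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values in rows is paired
    with exactly one combination of rhs values (lhs -> rhs on this sample)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for a key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ("stuID", "stuName", "major", "credits", "status", "SSN")
student = [dict(zip(columns, r)) for r in [
    ("S1001", "Smith, Tom", "History", 90, "Sen", "100429500"),
    ("S1003", "Jones, Mary", "Math", 95, "Sen", "010124567"),
    ("S1006", "Lee, Pamela", "CIS", 15, "Fresh", "088520876"),
    ("S1010", "Burns, Edward", "Art", 63, "Jun", "099320985"),
    ("S1060", "Jones, Mary", "CIS", 25, "Fresh", "064624738"),
]]

print(fd_holds(student, ["credits"], ["status"]))  # True
print(fd_holds(student, ["status"], ["credits"]))  # False: Sen appears with 90 and 95
print(fd_holds(student, ["stuID"], ["stuName"]))   # True
print(fd_holds(student, ["stuName"], ["stuID"]))   # False: two students named Jones, Mary
```

The first two results match the slides: credits determines status, which is the transitive dependency that the split into students2 and stats removes, while status does not determine credits.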