Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
CSE 190: Internet E-Commerce Lecture 12: SQL, DB Design SQL: Simple Calculations • SQL allows not only selection from database tables, but also basic calculations over all the rows of those tables • Select count(*) from customers where total_loans < 5500; – Returns the count of customers who have loans of less than 5500 • Select sum(total_loans) from customers; – Returns the sum of all the loans outstanding for all customers • Select avg(total_loans) from customers; – Returns the average of total_loans over all customers • Reference – http://www.w3schools.com/sql/default.asp SQL: GROUP BY Have Query results collated by the value of specified field Syntax: SELECT column, calc(column) FROM table GROUP BY column Example: SELECT company, sum(amount) from sales GROUP by company Sales Table Company Amount ABC 120 XYZ 50 ABC 100 Yields: Company Amount ABC 220 XYZ 50 SQL: Views • Views: A named table that results from applying criteria to a SELECT clause. • Example: – CREATE VIEW YOUNG_PEOPLE AS SELECT * FROM PEOPLE WHERE DOB > DateValue('1.1.1980'); – May then be used in subsequent selects: • SELECT * FROM YOUNG_PEOPLE; DB Design • DB Design involves two activities – Logical design – Physical design • Logical design: describing the schemas, tables, and relationships amongst the tables – Typically done by application developer • Physical design: Deciding what disks tables are located on, how to create indexes, which queries to optimize on DB Design: Entity Relationships • Entity: Logical object captured within a table • Relationships between other entities are of certain category: – 1:1 • Implies entities should be merged – 1:M • Commonly used to represent lists – M:1 • Entity is part of some list of the other entity – M:M • Avoid this kind of relationship • Creating these Entity-Relationship diagrams known as Database modeling; basically similar to object-oriented analysis • Object oriented databases: reduce the effort to map objects to relational tables – However, still immature compared to relational databases DB Design: Normal Forms • Normalization: step-wise refinement of our database schema to eliminate redundancy and reduce the costs of change • Good DB Design usually reaches 3rd Normal Form DB Design: Zero Normal Form • Zero normal form – No normal form rules have been applied yet • Example (Users table) Name Company Company_ Address Url1 Url2 Jack ABC 1 Infinite Loop Abc.com Xyz.com Jill XYZ 1 Microsoft Way Abc.com Xyz.com • Problem: List of URLs, capacity is hard coded to be two. Need to add a new column to support three URLs! DB Design: First Normal Form Rules: 1. 2. 3. Eliminate repeating groups in individual tables Create a new table for each set of related data Identify each set of related data with a primary key UserId Name Company Company Address URL 1 Jack ABC 1 Infinite Loop Abc.com 1 Jack ABC 1 Infinite Loop Xyz.com 2 Jill XYZ 1 Microsoft Way Abc.com 2 Jill XYZ 1 Microsoft Way Xyz.com Problem: Adding new URL means we have to copy the same user, company and company address values. This is redundant and could lead to inconsistent values. DB Design: Second Normal Form Rules: 1. Create separate tables for sets of values that apply to multiple records 2. Relate these tables with a foreign key Users: URLs UserId Name Company Company Address UrlId relUserId URL 1 Jack ABC 1 Infinite Loop 1 1 Abc.com 2 Jill XYZ 1 Microsoft Way 2 1 Xyz.com 3 2 Abc.com 4 2 Xyz.com Problem: Adding new employee means we have to copy address information, leading to redundancy DB Design: Third Normal Form Rules: 1. Eliminate fields which do not depend on the primary key Note: Once you’ve applied 3rd Normal Form, your database schema is in relatively good health UserId Name relCompId UrlId relUserId URL 1 Jack 1 1 1 Abc.com 2 Jill 2 2 1 Xyz.com 3 2 Abc.com 4 2 Xyz.com CompanyId Company Company Address 1 ABC 1 Infinite Loop 2 XYZ 1 Microsoft Way Problem: Note how Abc.com and Xyz.com are copies. If this comes from open data entry, then nothing to do about this. But if it’s from a fixed set of choices, more may be done. DB Design: 4th, 5th Normal Form 4th Normal Form: Independent entities may not be stored in the same table UrlRelationId RelatedUrlId RelatedUrlUserId 1 1 1 2 1 2 3 2 1 4 2 2 5th Normal Form: Original table must be reconstructed from the tables it has been decomposed into. DB Design: Inheritance • What about subclasses of objects, with subsets (or supersets) of properties • Two solutions: – Master table of properties with object id selecting for property ID – One new table for each subclass • Second generally better for relational integrity