Where Are My (Primary) Keys?
Ami Levin Mentor, [email protected]
Think Big. Move Fast.

We Always Did it This Way…

Session Goals
* Revisit one of the fundamental design principles of relational databases: key selection.
* Explore the controversies associated with it from a very practical, hands-on perspective, with a special emphasis on some surprising performance issues that may arise from suboptimal selection of keys...

What is a Primary Key?

The Forefathers

Normalization
“A relation whose domains are all simple can be represented in storage by a two-dimensional column-homogeneous array of the kind discussed above. Some more complicated data structure is necessary for a relation with one or more non-simple domains. For this reason (and others to be cited below) the possibility of eliminating non-simple domains appears worth investigating. There is, in fact, a very simple elimination procedure, which we shall call normalization.”
(E.F. Codd, A Relational Model of Data for Large Shared Data Banks)

1st Normal Form
* There's no top-to-bottom or left-to-right ordering to the rows and columns
* There are no duplicate rows
* Every row-column intersection contains exactly one value
* There are no repeating groups
* All columns are regular

1st Normal Form

SSN          Name             Phone Number
123-45-6789  Muammar Gaddafi  +218-00-9876
987-65-4321  Bashar Assad     +1-202-6543221

SSN          First Name  Last Name  Phone Number
123-45-6789  Muammar     Gaddafi    +218-00-9876, +218-00-8765
987-65-4321  Bashar      Assad      +1-202-6543221

SSN          First Name  Last Name  Phone Number 1  Phone Number 2
123-45-6789  Muammar     Gaddafi    +218-00-9876    +218-00-8765
987-65-4321  Bashar      Assad      +1-202-6543221

2nd Normal Form
* R is in 1NF.
* Given any candidate key K and any attribute A that is not a constituent of a candidate key, A depends upon the whole of K rather than just a part of it.
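The 2NF condition can be checked mechanically against sample data. The following is a minimal sketch, not part of the original deck: the helper names `fd_holds` and `partial_dependencies` are hypothetical, and the rows are the order lines from the deck's 2nd Normal Form example. Note the limitation: sample rows can only refute a functional dependency or fail to refute it; whether a dependency truly holds is a property of the business rules, not of any particular data set.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, val) != val:
            return False
    return True


def partial_dependencies(rows, candidate_key, non_key_attrs):
    """List (subset, attr) pairs where attr depends on a proper subset of the key."""
    found = []
    for size in range(1, len(candidate_key)):
        for subset in combinations(candidate_key, size):
            for attr in non_key_attrs:
                if fd_holds(rows, subset, (attr,)):
                    found.append((subset, attr))
    return found


# Rows taken from the deck's 2NF example; (OrderID, LineNumber) is the key.
order_lines = [
    {"OrderID": "0001", "LineNumber": 1, "Customer": "FISSA"},
    {"OrderID": "0001", "LineNumber": 2, "Customer": "FISSA"},
    {"OrderID": "0002", "LineNumber": 1, "Customer": "PARIS"},
]

violations = partial_dependencies(
    order_lines, ("OrderID", "LineNumber"), ("Customer",)
)
print(violations)  # [(('OrderID',), 'Customer')]
```

In this sample, Customer is determined by OrderID alone, which is only part of the key, so the table is a 2NF violation candidate. Line number 1 maps to both FISSA and PARIS, so LineNumber alone does not determine Customer.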

2nd Normal Form

Order ID  Line Number  Customer
0001      1            FISSA
0001      2            FISSA
0002      1            PARIS

OrderDetailID  OrderID  LineNumber  Customer
AC934245FF00B  0001     1           FISSA
8BA50CC2044AF  0001     2           FISSA
F00B344923AB4  0002     1           PARIS

3rd Normal Form
* R is in second normal form (2NF).
* Every non-prime attribute of R is non-transitively dependent on every candidate key of R.
* A non-prime attribute of R is an attribute that does not belong to any candidate key of R.
* A transitive dependency is a functional dependency in which X → Z (X determines Z) indirectly, by virtue of X → Y and Y → Z.

3rd Normal Form

Order ID  Line Number  Product      Manufacturer
0001      1            Chair        IKEA
0001      2            Gum          Mentos
0002      1            Fighter Jet  Boeing

Keys
* Simple
* Composite
* Candidate
* Primary
* Artificial / Surrogate
* Intelligent
* Natural

The Debate
* To ID or not to ID?
* IDENTITY (1,1) vs. natural key

Pro Artificial (I)
In some cases, no natural key exists and an artificial key is the only option.
* Examples?

Pro Artificial (II)
Natural keys can change. Artificial keys never change.
* How often?
* Cascading referential constraints
* Artificial keys can change

Pro Artificial (III)
Natural keys may be long and complex.
* Become longer with each level
* 900-byte limit in SQL Server
* Multi-column joins

Pro Artificial (IV)
Artificial keys help improve performance.
* Simpler join predicates
* Ever-increasing clustering effect
* Short keys = Smaller DB = Faster

Pro Artificial (V)
Artificial keys reduce clustered index fragmentation.
* Minimize maintenance downtime
* What about deletes?
* What about non-clustered indexes?

Pro Natural (I)
Natural keys have business meaning.
* Artificial keys are never queried for
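Returning to the "natural keys can change" argument under Pro Artificial (II): the cascading referential constraints mentioned there can be sketched as follows. This is an illustration under assumptions, not the deck's demo. It uses SQLite through Python's `sqlite3` module as a stand-in for SQL Server, the table and column names are hypothetical, and the new customer code `FISSA-ES` is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (
    customer_code TEXT PRIMARY KEY            -- natural key
);
CREATE TABLE orders (
    order_id      TEXT PRIMARY KEY,
    customer_code TEXT NOT NULL
        REFERENCES customers(customer_code) ON UPDATE CASCADE
);
INSERT INTO customers VALUES ('FISSA'), ('PARIS');
INSERT INTO orders VALUES ('0001', 'FISSA'), ('0002', 'PARIS');
""")

# The natural key changes; the constraint propagates it to referencing rows.
conn.execute(
    "UPDATE customers SET customer_code = 'FISSA-ES' "
    "WHERE customer_code = 'FISSA'"
)

rows = conn.execute(
    "SELECT order_id, customer_code FROM orders ORDER BY order_id"
).fetchall()
print(rows)  # [('0001', 'FISSA-ES'), ('0002', 'PARIS')]
```

The cascade keeps the data consistent, but every referencing row has to be rewritten, which is why "How often?" is the practical question on that slide.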

Pro Natural (II)
Queries on tables using natural keys require fewer joins.
* The more familiar and meaningful the key, the fewer joins are required
* “Bypass” joins

Pro Natural (III)
Data consistency is maintained explicitly when using natural keys.
* Artificial keys enable logical duplicates

Pro Natural (IV)
Natural keys eliminate potential physical clustering performance issues.
* Contention for clustered regions

Less Mentioned Issues (I)
Artificial keys are the de facto standard.
* ORMs generate artificial keys
* LINQ doesn’t cache composite key rows
* …

Less Mentioned Issues (II)
Data statistics and optimizations.
* Statistics on artificial keys are useless for parameter sniffing
* Estimations on composite key statistics are less accurate

Less Mentioned Issues (III)
Modularity and portability.
* Migration to other platforms
* Merging with other databases

Less Mentioned Issues (IV)
Simplicity and aesthetics.

Demo Spec
A database of web sites.
* URL
* Country and city of owner
* Country ISO code for external app
* Data consistency is crucial

Natural vs. Artificial Keys

Ask Yourself
* Is there a natural key that I can use as a primary key?
* Are there a few natural candidates?
* Which one is the simplest and most familiar?
* How stable is it?
* How will it be used logically?
* What will be the physical access patterns for this table?
* What are the common query types for this table?

For More Information
* A Relational Model of Data for Large Shared Data Banks (E.F. Codd)
* The Relational Model for Database Management: Version 2 (E.F. Codd)
* An Introduction to Database Systems (C.J. Date)
* Database in Depth: Relational Theory for Practitioners (C.J. Date)
* The Database Relational Model: A Retrospective Review and Analysis (C.J. Date)
* Joe Celko's Data and Databases: Concepts in Practice (J. Celko)
* Joe Celko's SQL for Smarties, Fourth Edition: Advanced SQL Programming (J. Celko)
* Database Modeling and Design, Fifth Edition: Logical Design (T.J. Teorey, S.S. Lightstone, T. Nadeau, and H.V. Jagadish)
* Pro SQL Server 2008 Relational Database Design and Implementation (L. Davidson, K. Kline, S. Klein, and K. Windisch)

Where Are My Keys?
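As a closing illustration, the Demo Spec (a database of web sites with a country ISO code) can be sketched in both styles to show two of the Pro Natural arguments: logical duplicates and fewer joins. This is a hedged sketch, not the presenter's demo. It uses SQLite through Python's `sqlite3` module rather than SQL Server, all table names and the `https://example.com` URL are hypothetical, the city column is omitted for brevity, and design A deliberately leaves out a UNIQUE constraint on the ISO code (adding one would also prevent the duplicate).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite

conn.executescript("""
-- Design A: artificial keys only, no uniqueness declared on the ISO code
CREATE TABLE countries_a (
    country_id INTEGER PRIMARY KEY AUTOINCREMENT,
    iso_code   TEXT NOT NULL
);
CREATE TABLE sites_a (
    site_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    url        TEXT NOT NULL,
    country_id INTEGER NOT NULL REFERENCES countries_a(country_id)
);
-- Design B: natural keys
CREATE TABLE countries_b (
    iso_code TEXT PRIMARY KEY
);
CREATE TABLE sites_b (
    url      TEXT PRIMARY KEY,
    iso_code TEXT NOT NULL REFERENCES countries_b(iso_code)
);
""")

# Design A accepts the same country twice: a logical duplicate.
conn.execute("INSERT INTO countries_a (iso_code) VALUES ('US')")
conn.execute("INSERT INTO countries_a (iso_code) VALUES ('US')")
dup_count = conn.execute(
    "SELECT COUNT(*) FROM countries_a WHERE iso_code = 'US'"
).fetchone()[0]

# Design B rejects the second insert through its primary key.
conn.execute("INSERT INTO countries_b VALUES ('US')")
try:
    conn.execute("INSERT INTO countries_b VALUES ('US')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

url = "https://example.com"
conn.execute("INSERT INTO sites_a (url, country_id) VALUES (?, 1)", (url,))
conn.execute("INSERT INTO sites_b VALUES (?, 'US')", (url,))

# Design A needs a join to reach the ISO code; design B reads it directly.
iso_a = conn.execute(
    "SELECT c.iso_code FROM sites_a AS s "
    "JOIN countries_a AS c ON c.country_id = s.country_id "
    "WHERE s.url = ?", (url,)
).fetchone()[0]
iso_b = conn.execute(
    "SELECT iso_code FROM sites_b WHERE url = ?", (url,)
).fetchone()[0]

print(dup_count, duplicate_rejected, iso_a, iso_b)  # 2 True US US
```

Both designs return the same ISO code for the external app, but only design B does so without a join and without relying on an extra constraint to keep countries unique.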