Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Dr Alwyn Barry Dr Joanna Bryson CM20145 Database Design Lecture Plan 1. Basic Concepts 2. Data, Information & Knowledge 3. Data Models (The E-R Model) 4. The Relational Algebra 5. Introduction to SQL 6. Further SQL (Joins, RA Equivalences) 7. Database Design 8. Further DB Design – Normalisation 9. Architectures and Implementations 10. Integrity and Security Lecture Plan `Reading’ (PROJECT) Week 11. Ethics and Professional Conduct 12. Legal Issues 13. Transactions 14. Recovery 15. Concurrency Control 16. Storage and File Structure 17. Indexing and Hashing 18. Query Processing & Optimisation January… Review Session Overview Database Design and Software Engineering. Example from real life. Views and Users. Redundancy and Decomposition. Loosely based on A. Keller, 2002, USCU What Is Relational Algebra For? Fundamental Operators: Unary Binary Fundamental Restriction Projection Cartesian Product Union Difference Useful Operators: Not Primitive Join Intersection Division File Problems DBs Should Solve Data redundancy & inconsistency Duplication of information; Different formats Data isolation Different formats; Different locations Limited access Need new program for each query Cannot support “Business Rules” Consistency Validity Atomicity Failures leave data in inconsistent state; Two users cannot update the database at once Security Needs bespoke security in each application Database Design DB design usually covers formal methods for trying to ensure databases meet all these goals. No formal system is perfect: Formal methods take a long time, may be computationally intractable. Human error: mistakes in proofs. Formally correct databases can still have errors (will be examples.) Design is an art; but science can help. E-R Model – System Design From Elmasri & Navathe, 2003, pg 51 Problem domain DBMS Dependentt DBMS Independent Requirements Collection & Analysis Functional Requirements Data Requirements Functional Analysis Conceptual Design High-level Transaction Spec. Application Program Design Conceptual Schema Logical Design (Data model mapping) Logical Schema Transaction Implementation Physical Design Internal Schema Application Programs Software Development Iterates http://www.extremeprogramming.org/ Waterfall Feigning Iteration This is not agile SE. People don’t want to know, but iteration happens. The question is how well you cope. Software Development Iterates Models change as programmers understand the problem better. Requirements change as users understand possibilities better. More resources become available. So: Save and maintain all modelling and planning tools. Interact with users frequently. Learn rules to recognize common failures. Overview Database Design and Software Engineering. Example from real life. Views and Users. Redundancy and Decomposition. Loosely based on A. Keller, 2002, USCU Where Do Models Come From? Demand – What do the users want to see? Data – Don’t throw anything away. Design – Never store anything twice. Efficiency. Integrity. Clarity. Security. Automating Trading Operations Automating Trading Operations Data from: trading cards, market prices, underlying values. Demand from: Knowing what is owned for determining risk. Knowing what individuals did for determining pay. Entities and attributes Trades Instrument, trade, price, 2 traders, 2 clearing firms. Traders Positions, trades. Trading cards Trades, trader, number, time. Instruments Daily values, volatilities, expiration dates. (draw on board) New Requirement: Reconciling Business process – dual entry: Operations: primary concerns are risk, execution on trading floor, relations with individual traders. Clearing: primary concerns are accounting, banks, law, relations with other clearing firms. The same trading cards resulted in different trade quantities. The same trade has different dates! Models will change! Design – Find out models are inefficient or clumsy to maintain. Data – Discover new categories, salient values. Demand – Users see new potential. Overview Database Design and Software Engineering. Example from real life. Views and Users. Redundancy and Decomposition. Loosely based on A. Keller, 2002, USCU Views: Why they’re important. The same data may be seen by different users in different ways. Shorthand for frequent joins, formulas – may be more efficient. Automate / enforce security – make access to tables and views depend on user’s function. Keep reports logically independent from underlying representation – protect the users! Why Limit / Protect Users? Databases contain all data for a company. Limits on access eliminate suspects for errors, crime. Information overload. Smart, authoritative users still need to find things quickly. Some users really are naïve. Programmers are users. Overview Database Design and Software Engineering. Example from real life. Views and Users. Redundancy and Decomposition. Loosely based on A. Keller, 2002, USCU Software Development Iterates Models change as programmers understand the problem better. Requirements change as users understand possibilities better. More resources become available. So: Save and maintain all modelling and planning tools. Interact with users frequently. Learn rules to recognize common failures. Software Development Iterates Models change as programmers understand the problem better. Requirements change as users understand possibilities better. More resources become available. So: Save and maintain all modelling and planning tools. Interact with users frequently. Learn rules to recognize common failures. Pitfalls in Relational DB Design A bad design may lead to: redundant information, difficulty in representing certain information, or difficulty in checking integrity constraints. Design goals: Avoid redundant data. Ensure that relationships among attributes are represented. Facilitate the checking of updates for violation of integrity constraints. ©Silberschatz, Korth and Sudarshan Modifications & additions by S Bird, Melbourne Example of Bad Design Consider the relation schema: Lending-schema = (branch-name,branch-city,assets, customer-name,loan-number,amount) Redundant Information: Data for branch-name, branch-city, assets are repeated for each loan that a branch makes. Wastes space and complicates updates, introducing possibility of inconsistency of assets value. Difficulty representing certain information: Cannot store info. about a branch if no loans exist. Can use null values, but they are difficult to handle. Solution: Decomposition Break up redundant tables into multiple tables - this operation is called decomposition. E.g. Lending-schema = (branchname, branch-city, assets, customername, loan-number, amount) Branch-schema = (branch-name, branch-city,assets) Loan-info-schema = (customer-name, loan-number, branch-name, amount) Lossless-Join Decomposition: Want to ensure that the original data is recoverable. 1.All attributes of the original schema (R) must appear in the decomposition (R1, R2), i.e. R = R1 R2 2.Decomposition must be a lossless-join decomposition. Definition: R1,R2 is a lossless-join decomposition of R if, for all possible relations r(R) r = R1 (r) ⋈ R2 (r) Bad Decomposition Example A Non Lossless-Join Decomposition R = (A, B) R1 = (A), R2 = (B) A B A B A B 1 2 A(r) B(r) 1 2 1 2 1 2 1 r A (r) ⋈ B (r) Thus, r is different to A (r) ⋈ B (r) So A,B is not a lossless-join decomposition of R. Overview Database Design and Software Engineering. Example from real life. Views and Users. Redundancy and Decomposition. Loosely based on A. Keller, 2002, USCU Summary Database design is an ongoing, iterative process. Requirements come from data, user demands, design issues. Change occurs: Corporations & technologies grow. Programmers & users learn. Views / security. Lossless-join decomposition Next: Science for improving design. Reading & Exercises Reading Connolly & Begg Chapter 9, (13, 14) Silberschatz Chapter 7. Much of 7 will be in the next lecture! Exercises: C&B: 9.2, 9.3/11, 9.9, 9.10 Silberschatz:7.1, 7.2, 7.16 These need functional dependencies, which are covered next lecture.