Database Design: Solving Problems Before they Start!
Ed Pollack
Database Administrator
CommerceHub

Agenda
Preventing future headaches, right now!
• Why does database design matter to us?
• Discussion of tables.
• Examples of good design practices.
• Important decisions when designing a database.
• Notes on important database design topics.
• Normalization
• Conclusion

Why Design Matters
• Match business needs with implementation.
• Ensure acceptable performance.
• Scalability. Needs of tomorrow != Needs of today.
• Good design = easy to understand.
• Decrease maintenance costs.
• Avoid scorn of future DBAs & developers.

Understand the Application
• Good design = knowing the business need.
• Talk to consumers of data.
• Ask lots of questions!
• May be technical or non-technical.
• What is the database used for?

Scalability
• Design for today AND tomorrow.
• Understand cardinality.
• Determine contention and concurrency.
• Build overhead to support the near future.
• Always re-evaluate and ensure scalability for the future.

Hardware Considerations
• CPU (speed, cores, MAXDOP, etc…)
• Memory
• Storage (attached, SAN, SSD, cloud, etc…)
• Network bandwidth
• On-premises vs. cloud
• File & backup locations

Designing Good Tables
• One table = a distinct entity.
• A column represents a unique element.
• A table can be one of three basic types:
  • Entity
  • Lookup
  • Relationship (linking)
• Schemas can organize namespaces, security, and classification.
• Avoid “double-duty” tables!!!

Data Types
• Choose the correct data type.
• Choose the correct data length/precision/size.
• Consider standards when unsure of representation.
  • E.g., ISO 5218 for gender, ISO 4217 for currency, ISO 3166 for country.
• Bit = 1 or 0. Do not use INT, string, etc…
• Consider mathematical operations for numeric types.

Dates & Times
• Be consistent across all tables!
• Do (or will) time zones matter?
  • If so, use DATETIMEOFFSET or UTC time.
• Store in the correctly sized data type.
  • Smalldatetime vs. Datetime2 vs. Date vs. Time
• Do not use strings for dates, times, timestamps, etc...
• Differentiate between a date/time and a duration.

NULL vs. NOT NULL
• NULL = non-existence.
• Do not make up “De-nullifiers”.
  • -1, 1/1/1900, ‘N/A’, ‘ ‘, ‘00:00’
• NOT NULL = required by the application.
• NULL has meaning.
• Define this during the design phase.
• Beware NULL behavior: ORDER BY, GROUP BY, =, <>…

Naming Objects
Always explicitly name ALL objects!
• What is it? Don’t name for location, time, usage, etc…
• A data element name should be unique!
• Column names should be positive, not negative.
  • E.g., Is_Active rather than Is_Not_Active.
• Avoid abbreviations/shorthand. Don’t fear long names.
• Avoid spaces and special characters.
• DOCUMENTATION!!!

Uniqueness
• All rows in a table should be unique.
• PRIMARY KEY determines row identity/uniqueness.
• Natural key vs. surrogate key.
• Clustered key doesn’t have to be the primary key, but often will be.
• Unique constraints can enforce additional important business rules.

Other Schema Elements
• Stored Procedures: Use for efficient/controlled data access.
• Indexes: Speed up important read operations.
• Functions*: Encapsulate often-used logic.
• Triggers*: Enforce auditing, ETL, or business rules.
• Constraints*: Enforce important relationships/rules.
• Views: Provide a convenient/common path to data.
• ***Always ensure elements are needed before creation!

Iteration vs. Set-Based Operations
• SQL Server is built/optimized for set-based operations.
• Use of loops, such as CURSOR, WHILE, or GOTO, is often indicative of poor/suboptimal design.
• Consider alternatives to avoid row-by-row work:
  • OUTPUT
  • Window functions
  • CTE

OLAP vs. OLTP
• The needs of transactional processing are different from the needs of analytics processing.
• Separate reporting from transactional processing, or build infrastructure to support a shared environment.
• Denormalize data for OLAP, not for OLTP.
• Consider different tools for each, if needed.
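
Example (not from the original slides): a minimal sketch of the row-by-row vs. set-based contrast from the Iteration vs. Set-Based Operations slide. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the syntax and performance characteristics are only approximations. The payments table, its columns, and both function names are hypothetical, and the window function assumes the bundled SQLite is version 3.25 or later.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments ("
    "payment_id INTEGER PRIMARY KEY, "
    "amount INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO payments (payment_id, amount) VALUES (?, ?)",
    [(1, 100), (2, 250), (3, 50), (4, 400)],
)


def running_totals_row_by_row(connection):
    """Cursor-style approach: walk the rows one at a time and accumulate in a loop."""
    totals = []
    running = 0
    rows = connection.execute(
        "SELECT payment_id, amount FROM payments ORDER BY payment_id"
    )
    for payment_id, amount in rows:
        running += amount
        totals.append((payment_id, running))
    return totals


def running_totals_set_based(connection):
    """Set-based approach: a single statement; the window function accumulates."""
    return connection.execute(
        """
        SELECT payment_id,
               SUM(amount) OVER (ORDER BY payment_id) AS running_total
        FROM payments
        ORDER BY payment_id
        """
    ).fetchall()


loop_result = running_totals_row_by_row(conn)
set_result = running_totals_set_based(conn)
print(set_result)  # [(1, 100), (2, 350), (3, 400), (4, 800)]
```

Both functions return the same running totals. The set-based version hands the whole problem to the database engine in one statement, which is the pattern the slide recommends over CURSOR or WHILE loops.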

Old Data
• Do we delete, inactivate, partition, archive, etc…?
• Is archiving allowed/desired/required?
• How to handle data associated with archived data?
• Data retention policy
  • Compliance needs.
  • Contractual obligations.
  • Prevent data from getting too large.

Normalization
• Normalization reduces data redundancy by separation into tables.
• Avoids inconsistency via duplicated/missing/bad data.
• Consider 1:1 vs. 1:many vs. many:many vs. optional.
• DEMO: Normal forms.

Formatting
• Databases store and return data.
• Formatting/presentation is best left to the application.
• Formatting can complicate scripts.
• Making output pretty can lead to poor performing queries.
• Be consistent across all objects!

Conclusion
• Good database design saves immense time & effort.
• Generates database schema that is easy to understand.
• Allows for easier upgrades, releases, migrations.
• Poor design = technical debt!
• Future DBAs and developers will see your initials and love you!

Questions???

Contact Info & Links
Ed Pollack
[email protected]
@EdwardPollack
SQL Shack
SQL Server Central
SQL Saturday Albany (2016)

Thank you!!!