Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Database Design Presented to Molecular Biology Bioinformatics Meeting by Norbert Kremer April 29, 2002 Database Features • Efficient storage and retrieval of data relevant to a specific purpose – Data integrity – Security – Performance, indexing • Best design depends on application Database Technologies • Manual Systems – Shoebox full of index cards – Yellow stickies • Mechanical Systems – Hollerith Cards – Indexing holes Database Technologies • Automated Systems – Flat file, fields in fixed or delimited columns – Early database systems: network, hierarchical – Client-server paradigm: relational database management systems – Object databases, e.g. AceDB – Object-relational, e.g. Oracle 9 – XML, eg Neocore and Tamino RDBMS Features 1/2 • Data stored as tables with keys • Sound mathematical foundations in set theory, T.Codd, IBM Research Labs • Relational algebra – closure • Best for highly structured data RDBMS Features 2/2 • SQL is most common relational language • SQL is non-procedural, relationally complete • Robust, mature technology, performance and scalability, security, data integrity Relational Table Structure • Examples showing – One big table – Multiple tables with keys • Normalization Process – Data elements depend on the “key, the whole key, and nothing but the key” Data Modeling • Identify and define data elements • Discover logical relationships between data elements Software Engineering Methodology ANALYSIS Identify and Define data elements DESIGN Logical Relationships BUILD Physical layout DEPLOY Requirements Doc, Functional Spec Data dictionary Technical Spec Database schema Code, Release cycle DB creation scripts, seed data Maintenance SOPs Backup, DB Tuning Data Integrity • Constraints enforced by DBMS • Transaction processing Data Integrity - Constraints • Constraints enforced by DBMS – Central location eases maintenance – Uniform application in all code – Cannot be subverted by malicious or lazy programming • Foreign keys Data Integrity - Transactions • ACID Properties – – – – Atomic Consistent Isolated Durable Data Integrity - Transactions • ACID Properties – Atomic Logical unit of work, complete all or roll back – Consistent All constraints observed – Isolated Transactions are independent – Durable “Commit” is irreversible Software Engineering Methodology ANALYSIS Identify and Define data elements DESIGN Logical Relationships BUILD Physical layout DEPLOY Requirements Doc, Functional Spec Data dictionary Technical Spec Database schema Code, Release cycle DB creation scripts, seed data Maintenance SOPs Backup, DB Tuning PowerDesigner Demo • Graphical communication among analysts, architects, developers, end users • Centralized maintenance of database schema and data dictionary • Forward engineering (SQL script from model) • Reverse engineering (model from existing database) • Medline example, showing modeling process for “associative tables” References • WinSQL graphical database client for any ODBCcompliant database (just about all) Get the free Lite version from http://www.indus-soft.com/winsql/ • ODBC driver for MySQL (also need the usual MySQL client installed) http://www.mysql.com/downloads/api-myodbc-3.51.html • Powerdesigner 9 Trial The trial is fully functional for 45 days. See http://www.sybase.com/products/enterprisemodeling/powerdesigner