* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download data type
Survey
Document related concepts
Microsoft Access wikipedia , lookup
Oracle Database wikipedia , lookup
Open Database Connectivity wikipedia , lookup
Ingres (database) wikipedia , lookup
Microsoft SQL Server wikipedia , lookup
Concurrency control wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Microsoft Jet Database Engine wikipedia , lookup
Extensible Storage Engine wikipedia , lookup
ContactPoint wikipedia , lookup
Clusterpoint wikipedia , lookup
Transcript
Data Model / Database Implementation Jeffery S. Horsburgh Hydroinformatics Fall 2014 This work was funded by National Science Foundation Grants EPS 1135482 and EPS 1208732 Objectives • Understand common database software packages, limitations and capabilities • Where are certain databases commonly used? • Why and when would you use certain databases? • Introduce MySQL functionality • Create a physical implementation of an Observations Data Model (ODM) database within a RDBMS Where We are Going • This week: Database implementation – How do you physically implement a data model as a database and load data • Next week: Using Structured Query Language (SQL) – Using code to slice and dice your data once you have it in a database • Two weeks: Accessing databases from client software – Getting data from a database into Python Steps in Data Model Design 1. Identify entities 2. Identify relationships among entities 3. Determine the cardinality and participation of relationships 4. Designate keys / identifiers for entities 5. List attributes of entities 6. Identify constraints and business rules 7. Map 1-6 to a physical implementation Physical Data Model • The “physical” means a specific implementation of the data model – Choice of hardware and operating system – Choice of relational database management system – Implementation of tables, relationships, constraints, triggers, indices, data types – Database access and security – Performance – Storage Summary of 3 Levels of Data Model Design Feature Conceptual Logical Entity Names X X Entity Relationships X X Physical Attributes X Primary Keys X X Foreign Keys X X Table Names X Column Names X Column Data Types X Views X Stored Procedures X Triggers X Constraints X Relational Database Management Systems • Robust software for managing data • Data stored in tables that may be related • Based on Structured Query Language (SQL) – Adopted by the American National Standards Institute (ANSI) and the International Standards Organization (ISO) as the standard data access language Relational Database Management Systems (RDBMS) • • • • • File vs. server based Free vs. commercial Different data types Potentially different syntax for SQL queries Security models and concurrent users File Versus Server-Based RDBMS • File-based – Everything contained within a single file – Generally good for single user, desktop applications – Examples: SQLite, Microsoft Access • Server-based – A database “server” manages databases and all transactions – Good for multiple, simultaneous connections and transactions – Examples: Oracle, Microsoft SQL Server, MySQL, PostgreSQL The Beauty of Server-Based RDBMS Data Database Authentication and Access Control Yay! I want some Database ServerDo I know you?Do I recognize data! Are you your IP address? With RDBMS authorized? Firewall Simultaneous Data Users Server-Based RDBMS – “Granules” • Database Server • Databases • Tables • Records DB1 DB2 DB3 ... DBn Site Variable Date Value 1 Temperature 8/2/2007 14:00 12.4 1 Temperature 8/2/2007 14:30 12.7 1 Temperature 8/2/2007 15:00 13.1 1 Temperature 8/2/2007 14:00 12.4 Who are the main contenders? • Commercial software – Sybase Adaptive Server Enterprise – IBM DB2 – Oracle – Microsoft SQL Server – Teradata • Free/GPL/Open Source: – MySQL – PostgreSQL MySQL • No limit on # of CPUs, fully featured, support for several OSs • $0 without Support, approx. $600-$6000 for subscription-based support • Highly scalable, provides support for most types of indexes and storage methods Things to Consider When Deciding on a RDBMS • Price • Operating System Support • Most important features – ACID (Atomicity, Consistency, Isolation, Durability) – Referential Integrity support – Transactions support • Scalability and security • Availability of support and software • Potential longevity ACID Properties of Database Transactions • Atomicity – each transaction is “all or nothing” • Consistency – any transaction brings the database from one valid state to another • Isolation – concurrent execution of transactions results in a system state that would be obtained if they executed serially • Durability – once a transaction has been committed, it will remain so Physical Implementation of a Data Model within a RDBMS 1. 2. 3. 4. Create a new database Create tables for entities Define attributes and data types Create relationships and define their properties 5. Define constraints All of this can be scripted/automated using SQL Reduction of an ER Diagram to Tables • Converting an ER diagram to table format is the basis for deriving a relational database – Entities are expressed as tables that contain data – A database is a collection of tables – Tables are assigned the same name as the entity – Each table has columns that correspond to attributes – each column has a unique name – Each column must have a single data type Don’t forget: Forward engineering in MySQL Workbench does this automatically Data Types • Each attribute of an entity (column in a database table) must have a single data type • Data types are enforced by RDBMS software Table: DataValues Attribute Data Type Sample Data ValueID Integer 1 SiteID Integer 5 VariableID Integer 5 DateTime Date/Time 8/15/2013 4:30 PM DataValue Double 4.567 Data Types • Data types can be specific to RDBMS software RDBMS Integer Floating Point Decimal String Date/Time MS SQL Server TINYINT, SMALLINT, INT, BIGINT FLOAT, REAL NUMERIC, DECIMAL, SMALLMONEY, MONEY CHAR, VARCHAR, TEXT, NCHAR, NVARCHAR, NTEXT DATE, DATETIMEOFFSET, DATETIME2, SMALLDATETIME, DATETIME, TIME MySQL TINYINT (8-bit), SMALLINT (16bit), MEDIUMINT (24-bit), INT (32bit), BIGINT (64bit) FLOAT (32-bit), DOUBLE (aka REAL) (64-bit) DECIMAL CHAR, BINARY, VARCHAR, VARBINARY, TEXT, TINYTEXT, MEDIUMTEXT, LONGTEXT DATETIME, DATE, TIMESTAMP, YEAR PostgreSQL SMALLINT (16bit), INTEGER (32bit), BIGINT (64bit) REAL (32-bit), DOUBLE PRECISION (64-bit) DECIMAL, NUMERIC CHAR, VARCHAR, TEXT DATE, TIME (with/without TIMEZONE), TIMESTAMP (with/without TIMEZONE), INTERVAL Quick summary from: http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems Data Types in MySQL http://dev.mysql.com/doc/refman/5.7/en/data-types.html • Numeric – Integer types: bit, tinyint, smallint, mediumint, int, bigint – Fixed-point types: decimal, numeric (exact numeric values to preserve precision) – Floating-point types: float, double (approximate values) • Date and Time – date, datetime, timestamp, time, year • Strings – char(size) – fixed length string, right padded – varchar(size) – variable length string – tinytext, text, mediumtext, longtext - text data block Data Types in MySQL (2) • Binary data – binary(size) – binary byte strings – varbinary(size) – tinyblob, blob, mediumblob, longblob – binary large object • Spatial Data Types – geometry – point, multipoint – linestring, multilinestring – polygon, multipolygon Data Types in MySQL (3) • Nullable and NOT NULL types – All types in SQL Server may or may not allow NULL values • Primary key columns – Define the primary key • Auto-increment columns – Automatically increased values when a new row is inserted (auto-increment values) – Used in combination with primary key Advanced Database Objects • • • • Views Stored procedures Triggers Constraints • Implementation of these objects may depend on your choice of RDBMS software Database Views • A View is equivalent to a table, but is defined by a SQL query • Used to present a set of desired information, independent of the underlying database structure • Can be used to hide complexities of the underlying data model from the user – One way to address the cons of normalization Stored Procedures • A set of structured query language (SQL) statements that are stored and executed on the server • Useful for repetitive tasks • Encapsulate functionality and isolate users from data tables • Can provide a security layer – software applications have no access to the database directly, but can execute stored procedures Triggers • Special kind of stored procedure • Automatically executes on a table or view when an event occurs in the database • Events include: CREATE, ALTER, INSERT, UPDATE, DELETE • Mostly used to maintain the integrity of information in the database Constraints • Common way to enforce data integrity • Examples: – Not NULL – value in a column must not be NULL – Unique – value(s) in specified column(s) must be unique for each row in a table – Primary Key – value(s) in the specified column(s) must be unique for each row in the table and not be NULL – Foreign Key – values(s) in the specified column(s) must reference an existing record in another table via its primary key – Check – an expression that validates data and must not be FALSE MySQL Workbench • MySQL Workbench is a powerful graphical DB management tool – Create and modify ER diagrams – Administrate databases (create, modify, backup / restore DB) – Create/ view / modify table data and other database objects – Write and execute SQL queries 28 Authentication in MySQL • Performed when you connect to the database server • Establishes your identity • Enables you to create accounts for MySQL Server without giving access to the rest of the server • “root” = system administrator access Connecting to a MySQL Server • Connecting to a MySQL Server requires: – The name of the server machine or IP address and port number (3306 is default) – Username – Password 30 DEMO: Connecting to Local and Remote MySQL Servers Creating a New Database • In MySQL Workbench, click the “Create new schema” button • In MySQL, “Schema” = “Database” 32 Creating a New Database (2) • In the "New schema" tab enter the name for your new schema, set the Default Collation, and click “Apply” Character Set: A set of symbols and encodings Collation: A set of rules for comparing characters in a set 33 DEMO: Creating and Deleting Databases in MySQL Creating Tables • Expand database view in the schemas list • Right click on “Tables” and select “Create Table” 35 Creating Tables (2) • Enter table name and define the table columns (name, data type, and properties): Enter the name of Choose the data type of the column here the column here 36 Choose properties here Creating Tables (3) • Defining column properties – PK – the column is the primary key for the table – NN – the column is not nullable – UQ – Unique Index, the values in the column must be unique – BIN – the values in the column are binary – UN – the values in the column are unsigned – ZF – Zero fill, integer values will be padded with leading zeros for display – AI – the values in the column should be set automatically using an auto-increment (cannot be assigned manually) 37 Creating Tables (4) • It is best practice to set the name of the table at the time it is created Enter the name of the table here 38 DEMO: Creating and Deleting Tables 39 Creating Relationships • Alter child table to add foreign key attribute • Alter parent table to add foreign key relationship Create and name the relationship Choose the referenced table Match the primary and foreign key attributes 40 Choose what to do on update and delete DEMO: Creating Relationships 41 Naming Conventions • Tables – Each word is capitalized (Pascal Case) – • But wait - MySQL’s default settings are to lower case database and table names!!! – In English, plural – Examples: Sites, Variables, DataValues • Columns – In English, singular – Each word is capitalized (Pascal Case) – Avoid reserved words and characters (e.g., key, int, date) – Examples: SiteID, SiteCode, SiteName 42 Naming Conventions (2) • Primary key – Use "ID" or name_of_the_table + "ID" – Example: in the Sites table the PK column should be be called ID or SiteID • Foreign key – Use the name of the referenced table + "ID" – Example: in the DataValues table the foreign key column that references the Sites table should be named SiteID 43 Naming Conventions (3) • Relationship names (constraints) – In English, Pascal Case – "fk_" + ParentTable + "_" + ChildTable – Example: fk_Sites_DataValues • View names – "v_" + "name" – Example: v_SitesWithData • Stored procedures names – "sp_" + "name" – Example: sp_UpdateSeriesCatalog() 44 Authorization in MySQL • Assigning specific permissions to specific users for specific database objects DEMO: Creating Database Users and Assigning Permissions in MySQL Summary • Physical database implementation requires choices about hardware, software, security, data types, formats and storage, and other factors • RDBMS provide the ability to implement a database in a multi-user, server environment • MySQL Workbench provides tools for database creation, editing, and administration – including forward and reverse engineering