Brief Introduction to DBMS
For Software Engineering Students

Brief Introduction to DATABASE

1. Some Reasons for Using a DBMS (Database Management System)

- Persistence of data
- Sharing of data between programs
- Handling concurrent requests for data access
- Transactions that can be rolled back
- Report generation

The word "database" is used in several senses.

Sometimes database means a program for managing data.
    Example: Oracle Corporation is a database company. MS Access is a database.
Sometimes database means a collection of data.
    Example: I keep a database of my CD collection on 3 by 5 cards.
Sometimes database means a set of tables, indexes, and views.
    Example: My program needs to connect to the Airline Reservation database, which uses Oracle.

2. Types of Databases

- Relational: data is stored in tables
- Object-Relational: tables can be subclassed, and the programmer can define methods on tables
- Object-Oriented: objects are stored in the database

3. Relational, Object-Relational, O-O Databases and SQL

A database consists of a number of tables. A table is a collection of records. Each column of data has a type.

    Firstname   Lastname   Phone      Code
    Bill        Clinton    234-6677   2000
    George      Bush       235-7721   3000
    John        Kelly      222-7756   5000

Use SQL (Structured Query Language) to access data in the database.

Some Available DBMS Systems
Oracle, DB2, SQL Server, Access, Informix, InterBase, Sybase, FileMaker Pro, FoxPro, Paradox, dBase

Some Open Source DBMS Systems
Ingres, Postgres, MySQL server

4. SQL History

- Dr. E. F. Codd developed the relational database model: early 1970s
- IBM System R relational database: mid 1970s; it contained the original SQL language
- First commercial SQL database: Oracle, 1979
- SQL was aimed at accountants and business people
- SQL92: first commonly followed standard, ANSI X3.135-1992, also called SQL2
- ISO/IEC 9075-1 through 5: newer SQL standard

5. Understanding Database based on MySQL

5.1 MySQL - Connecting to the Database

Can be done with:
- The mysql command line tool
- GUI clients
- Programs

Examples (using the MySQL command line tools)

You can run mysqld as a service or as a standalone daemon.

    mysqld --standalone
    C:\mysql\bin\mysqld --standalone --debug
    mysqladmin -p -u root shutdown
    $ mysql -h localhost -u user -p

MySQL On-line Manual
http://www.mysql.com/doc/en/Reference.html

5.2 GUI Clients

DbVisualizer: Java based, so it runs on many platforms
http://www.dbvis.com/products/dbvis/

5.3 SQL Syntax

Names
Databases, tables, columns and indexes have names.

Legal characters: alphanumeric characters, '_', '$'

Names can start with:
- A letter
- An underscore
- A letter with diacritical marks, and some non-Latin letters

Name length: 64 characters in MySQL

Names are not case sensitive. (In MySQL this always holds for column names; database and table names can be case sensitive when the server runs on a case-sensitive file system.)

Data Types

Numeric Values
- Integer: decimal or hexadecimal
- Floating-point: scientific, real number (e.g., 12.1234)

String Values
MySQL example: 'this is a string' or "this is a string"

Escape Sequences
    \'    Single quote
    \b    Backspace
    \n    Newline
    \r    Carriage return
    \t    Tab
    \\    Backslash

Comments (MySQL example)
    -- this is a comment
    /* this is a comment */
    # this is a comment

Numeric Data Types
Numeric(10, 2) defines a number with a maximum of 10 digits, 2 of which are to the right of the decimal point (e.g., 12345678.91). Decimal and numeric are different names for the same type.
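The digit budget of a type such as Numeric(10, 2) can be illustrated on the Java side with java.math.BigDecimal. The sketch below is only a digit-count check under stated assumptions: the class and method names are hypothetical, and it does not reproduce how MySQL itself rounds or rejects out-of-range values.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class NumericSketch {
    // Returns true if the value, rounded to `scale` decimal places,
    // has at most `precision` digits in total (hypothetical helper).
    static boolean fits(BigDecimal value, int precision, int scale) {
        BigDecimal rounded = value.setScale(scale, RoundingMode.HALF_UP);
        return rounded.precision() <= precision;
    }

    public static void main(String[] args) {
        // 8 digits before the point + 2 after = 10 digits in total
        System.out.println(fits(new BigDecimal("12345678.91"), 10, 2));
        // 9 digits before the point + 2 after = 11 digits in total
        System.out.println(fits(new BigDecimal("123456789.10"), 10, 2));
    }
}
```

The first value is the example from the text and fits; the second needs one digit too many.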
String Types

    Type            Description
    char(n)         Fixed-length, blank padded
    varchar(n)      Variable-length with limit
    text            Variable, unlimited length
    blob (MySQL)    Variable (not specifically limited) length binary string

- char and varchar are the most common string types
- char is fixed-width: shorter strings are padded with blanks
- text can be of any size
- MySQL limits char to 255 characters; varchar was also limited to 255 characters before MySQL 5.0.3

Dates
- DATETIME: 'YYYY-MM-DD HH:MM:SS' format
- DATE: 'YYYY-MM-DD' format
- TIMESTAMP: changed in MySQL 4.1; it now has basically the same format as DATETIME

Common SQL Statements

    SELECT          Retrieves data from table(s)
    INSERT          Adds row(s) to a table
    UPDATE          Changes field(s) in record(s)
    DELETE          Removes row(s) from a table

    Data Definition
    CREATE TABLE    Defines a table and its columns (fields)
    DROP TABLE      Deletes a table
    ALTER TABLE     Adds a new column, adds/drops a primary key
    CREATE INDEX    Creates an index
    DROP INDEX      Deletes an index
    CREATE VIEW     Defines a logical table from other table(s) or view(s)
    DROP VIEW       Deletes a view

CREATE DATABASE

General Form

    CREATE DATABASE [IF NOT EXISTS] db_name
        [create_specification [, create_specification] ...]

    create_specification:
        [DEFAULT] CHARACTER SET charset_name
      | [DEFAULT] COLLATE collation_name

(Example)

    mysql> create database DBMSExamples;
    Query OK, 1 row affected (0.00 sec)

USE
Sets a default database for subsequent queries.

General Form

    USE db_name

(Example)

    mysql> use DBMSExamples;
    Database changed

CREATE TABLE

General Form

    CREATE TABLE table_name (
        col_name col_type [ NOT NULL | PRIMARY KEY ]
        [, col_name col_type [ NOT NULL | PRIMARY KEY ]]*
    )

(Example)

    mysql> use DBMSExamples;
    Database changed
    mysql> CREATE TABLE students
        -> (
        -> firstname CHAR(20) NOT NULL,
        -> lastname CHAR(20),
        -> phone CHAR(10),
        -> code INTEGER
        -> );
    Query OK, 0 rows affected (0.61 sec)

    mysql> CREATE TABLE codes
        -> (
        -> code INTEGER,
        -> name CHAR(20)
        -> );
    Query OK, 0 rows affected (0.63 sec)

    mysql> DESCRIBE students
        -> ;
    +-----------+----------+------+-----+---------+-------+
    | Field     | Type     | Null | Key | Default | Extra |
    +-----------+----------+------+-----+---------+-------+
    | firstname | char(20) | NO   |     |         |       |
    | lastname  | char(20) | YES  |     | NULL    |       |
    | phone     | char(10) | YES  |     | NULL    |       |
    | code      | int(11)  | YES  |     | NULL    |       |
    +-----------+----------+------+-----+---------+-------+
    4 rows in set (0.11 sec)

Each of the following commands produces the same listing:

    mysql> desc students;
    mysql> explain students \g
    mysql> show columns from students;
    mysql> show fields from students \g

SELECT
Gets the data from one or more tables.

General Form

    SELECT [STRAIGHT_JOIN]
           [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
           [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS] [HIGH_PRIORITY]
           [DISTINCT | DISTINCTROW | ALL]
        select_expression,...
        [INTO {OUTFILE | DUMPFILE} 'file_name' export_options]
        [FROM table_references
          [WHERE where_definition]
          [GROUP BY {unsigned_integer | col_name | formula} [ASC | DESC], ...
            [WITH ROLLUP]]
          [HAVING where_definition]
          [ORDER BY {unsigned_integer | col_name | formula} [ASC | DESC] ,...]
          [LIMIT [offset,] row_count | row_count OFFSET offset]
          [PROCEDURE procedure_name(argument_list)]
          [FOR UPDATE | LOCK IN SHARE MODE]]

(Example)

    mysql> SELECT * FROM students;
    Empty set (0.00 sec)

INSERT
Adds data to a table.

General Form

    INSERT [LOW_PRIORITY | DELAYED] [IGNORE]
        [INTO] tbl_name [(col_name,...)]
        VALUES ((expression | DEFAULT),...),(...),...
        [ ON DUPLICATE KEY UPDATE col_name=expression, ... ]

(Example)

    mysql> INSERT
        -> INTO students (firstname, lastname, phone, code)
        -> VALUES ('bill', 'clinton', '123-4567', 2000);
    Query OK, 1 row affected (0.50 sec)

    mysql> INSERT
        -> INTO codes (code, name)
        -> VALUES (2000, 'marginal' );
    Query OK, 1 row affected (0.48 sec)

    mysql> SELECT firstname , phone FROM students;
    +-----------+----------+
    | firstname | phone    |
    +-----------+----------+
    | bill      | 123-4567 |
    +-----------+----------+
    1 row in set (0.00 sec)

    mysql> SELECT lastname, name
        -> FROM students, codes
        -> WHERE students.code = codes.code;
    +----------+----------+
    | lastname | name     |
    +----------+----------+
    | clinton  | marginal |
    +----------+----------+
    1 row in set (0.01 sec)

Qualifying the column names with their table names gives the same result:

    mysql> SELECT students.lastname, codes.name
        -> FROM students, codes
        -> WHERE students.code = codes.code;
    +----------+----------+
    | lastname | name     |
    +----------+----------+
    | clinton  | marginal |
    +----------+----------+
    1 row in set (0.44 sec)

UPDATE
Modifies existing data in a database.

General Form

    UPDATE [LOW_PRIORITY] [IGNORE] tbl_name [, tbl_name ...]
        SET col_name1=expr1 [, col_name2=expr2 ...]
        [WHERE where_definition]

(Example)

    mysql> update students
        -> set firstname='Hilary'
        -> where lastname="Clinton";
    Query OK, 1 row affected (0.52 sec)
    Rows matched: 1  Changed: 1  Warnings: 0

    mysql> select * from students;
    +-----------+----------+----------+------+
    | firstname | lastname | phone    | code |
    +-----------+----------+----------+------+
    | Hilary    | clinton  | 123-4567 | 2000 |
    +-----------+----------+----------+------+
    1 row in set (0.00 sec)

    mysql> ALTER TABLE students ADD column foo CHAR(40);
    Query OK, 1 row affected (0.03 sec)
    Records: 1  Duplicates: 0  Warnings: 0

    mysql> DROP TABLE students;
    Query OK, 0 rows affected (0.01 sec)

    mysql> DROP DATABASE DBMSExamples;
    Query OK, 0 rows affected (0.00 sec)

6. Simple JDBC Tutorial with MySQL DBMS

This tutorial teaches you how to connect to a MySQL database with Java. Connecting to MySQL on localhost is simple, so let's get started. First download MySQL (Windows version) and Connector/J from the MySQL web site. Installing MySQL is very simple; check the MySQL documentation if you need help. This tutorial only explains how to connect to a Windows version of MySQL.

6.1 What is Connector/J? Why do I need it?

Connector/J is a database driver that gives client applications written in Java connectivity through JDBC (Java Database Connectivity). MySQL Connector/J is a JDBC-3.0 "Type 4" driver, which means that it is pure Java, implements version 3.0 of the JDBC specification, and communicates directly with the MySQL server using the MySQL protocol. Without it we would not be able to connect to MySQL from the Java application that we will create later.

Use the following URL for an up-to-date Connector:
http://dev.mysql.com/downloads/connector/j/5.0.html

After you have downloaded Connector/J, place the jar file "mysql-connector-java-5.0.4-bin.jar" (you may have a different version) in this location: "$JAVA_HOME\jre1.5.0_11\lib\ext". Once you have done this you are ready to start creating the database.

6.2 Creating a phonebook database

With every small database example I always use a phonebook. We will be creating a very small database which will have one table with four fields. First open the MySQL console and enter your password to log in. This is the password you created when installing MySQL. MySQL comes with a test database which you can use. We are not going to use this test database, as we will be creating our own database called "MyPhoneBook". Let's create this database now. Enter the following SQL command: "CREATE DATABASE MyPhoneBook;". Remember to enter the semicolon.
Now the database has been created. Because there can be more than one database, we need to tell MySQL which database we want to use. We do this with the USE [Database name] command. Type "USE MyPhoneBook;". You will receive a message which tells you that the database has been changed.

Now that the database has been created we need to create a table. At the moment there are no tables. We can check whether our newly created database has any tables using the command "SHOW TABLES;". Once again, don't forget the semicolon. As you can see, our database is empty.

Let's create a table in our database called "MyContacts". This table will have four fields (Name, Surname, Contact number and E-mail). Enter the following SQL statement:

    CREATE TABLE MyContacts (Name VARCHAR(20), Surname VARCHAR(20), Contact_Num Int(11), E_Mail VARCHAR(50));

This SQL statement is quite long, so make sure that you type it correctly. You can copy and paste it into the console, but for good practice you should try to write it out yourself. Remember the semicolon. If the query was successful, list the tables again using the "SHOW TABLES;" statement. The database was empty before; now you should see a new table called "MyContacts".

Now we know that the table has been created, and we want to see the fields in it. To see what fields are in a table, we use the DESCRIBE [Table Name] SQL statement. Enter "DESCRIBE MyContacts;".

Now that our database is created and we have a table with the required fields, let's enter some records into the table. To enter data into a table we use the "INSERT INTO" SQL command. Enter the following SQL statement into the console:

    INSERT INTO MyContacts VALUES('Bill','Clinton','12354567897','[email protected]');

If you are successful, you will receive a "Query OK" message. Note that this 11-digit contact number is larger than the maximum value of a signed INT column (2147483647), so depending on the SQL mode MySQL either rejects it or stores a clipped value with a warning. A shorter number avoids the problem; phone numbers are often better stored in a VARCHAR column.

Please note that the order in which you enter the values must correspond to the order in which the fields were created. For example, the first field in our table is "Name", so the first value that you enter will be stored in the Name field. Also make sure that you do not try to store data in a field which does not support its data type. For example, the third field in our table is of type Int, which means that only integer values can be stored in this field.

After entering a record it's a good idea to view it to make sure that the data was entered. To view records we use the "SELECT" SQL command. Let's view our first record. Enter "SELECT * FROM MyContacts;". You should see the new record. Keep the MySQL console open; you will need it later.

We have now successfully created a database and a table with four fields. There are various other SQL commands which are not covered in this tutorial. You can add more records by repeating the "INSERT INTO" SQL statement.

6.3 Setting up DriverManager

Now that we have created the database we need to get down to the real work. We first need to obtain a connection to the database. The connection is represented by an object of type java.sql.Connection, which is handed out by the DriverManager. We simply tell the DriverManager what type of driver to use to handle the connections to databases, and from there onwards we ask it to give us a connection to a particular database of that type.

6.4 Connecting to a Database

To create a connection to a database, we use the getConnection() method of java.sql.DriverManager. This method takes as an argument the URL of the database that we want to connect to. It finds the correct driver among those loaded in the JVM and then delegates the work of creating the connection to that driver.
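As a rough illustration of the URL argument, the sketch below assembles a MySQL JDBC URL of the form jdbc:mysql://host:port/database from its parts. The class and method names are hypothetical, the host and port values are assumptions for a local installation (3306 is MySQL's usual default port), and no connection is opened.

```java
class JdbcUrlSketch {
    // Builds a URL of the form jdbc:mysql://host:port/database
    // (hypothetical helper, for illustration only).
    static String mysqlUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Assumed values for a local MySQL server and the tutorial database
        System.out.println(mysqlUrl("localhost", 3306, "MyPhoneBook"));
    }
}
```

The resulting string is what would be passed as the first argument to DriverManager.getConnection(); the tutorial's own example uses the shorter form jdbc:mysql:///MyPhoneBook, which omits the host and port.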
Example Java Code

Save the following code as MySQLMyPhoneBook.java

    import javax.swing.*;
    import java.sql.*;

    class MySQLMyPhoneBook {
        public static void main(String[] args) {
            try {
                // Load the Connector/J driver so that DriverManager can find it
                Class.forName("com.mysql.jdbc.Driver").newInstance();
                // No user name or password yet
                Connection con = DriverManager.getConnection("jdbc:mysql:///MyPhoneBook", "", "");
                Statement s = con.createStatement();
                s.execute("SELECT * FROM MyContacts");
                ResultSet rs = s.getResultSet();
                if (rs != null) {
                    // Show one message box per record
                    while (rs.next()) {
                        JOptionPane.showMessageDialog(null,
                            "Name: " + rs.getString(1) +
                            "\nSurname: " + rs.getString(2) +
                            "\nContact: " + rs.getInt(3) +
                            "\nE-Mail: " + rs.getString(4));
                    }
                }
                // Closing the connection also releases the statement and result set
                con.close();
            } catch (Exception e) {
                JOptionPane.showMessageDialog(null, "An error occurred: " + e);
            }
        }
    }

Compile and run the above program. What you should receive is an "Access Denied" message. This is because you need to create a user for your database and set its privileges. In your MySQL console, type the following SQL statement:

    GRANT ALL PRIVILEGES ON *.* TO 'Username'@'localhost'
        IDENTIFIED BY 'Password' WITH GRANT OPTION;

Replace Username and Password with a user name and password of your choice. Make sure you remember these details, as they must be supplied in your Java code. Now enter the user name and password that you granted into the Java code. Only the getConnection call changes; the rest of the program stays the same:

    Connection con = DriverManager.getConnection("jdbc:mysql:///MyPhoneBook", "Username", "Password");

Compile and run the Java code. You should now be able to access the database you created. You should see a message box which contains a single record. If you have entered more than one record you will receive more message boxes, each showing one record.

7. More on JDBC

Sun's on-line JDBC Tutorial & Documentation
http://java.sun.com/j2se/1.5/docs/guide/jdbc/index.html

Connector/J, MySQL JDBC Driver Documentation
http://www.mysql.com/products/connector/j/

Java Database Connectivity - Connecting to DBMS

7.1. Java supports four types of JDBC drivers

- Type 1, JDBC-ODBC bridge plus ODBC driver: Java code accesses ODBC native binary drivers, and the ODBC driver accesses the database. ODBC drivers must be installed on each client.
- Type 2, Native-API partly-Java driver: Java code accesses database-specific native binary drivers.
- Type 3, JDBC-Net pure Java driver: Java code accesses the database via a DBMS-independent net protocol.
- Type 4, Native-protocol pure Java driver: Java code accesses the database via a DBMS-specific net protocol.

7.2. JDBC URL Structure

    jdbc:<subprotocol>:<subname>

<subprotocol>: name of the driver or database connectivity mechanism
<subname>: depends on the <subprotocol>; can vary with vendor

MySQL

    jdbc:mysql://[host][,failoverhost...][:port]/[database]
        [?propertyName1][=propertyValue1][&propertyName2][=propertyValue2]...

Ways of obtaining a connection:
- java.sql: DriverManager
- javax.sql: DataSource; supports connection pools and distributed transactions; requires JNDI

7.3. Queries

- executeUpdate: use for INSERT, UPDATE, DELETE or SQL that returns nothing
- executeQuery: use for SQL (SELECT) that returns a result set
- execute: use for SQL that returns multiple result sets; uncommon

7.4. ResultSet

ResultSet - Result of a Query
- JDBC returns a ResultSet as the result of a query
- A ResultSet contains all the rows and columns that satisfy the SQL statement
- A cursor is maintained to the current row of the data
- The cursor is valid until the ResultSet object or its Statement object is closed
- The next() method advances the cursor to the next row
- You can access columns of the current row by index or name

ResultSet has getXXX methods, each of which:
- Takes either a column name or a column index as its argument
- Returns the data in that column converted to type XXX

getObject - A replacement for the getXXX methods

Rather than

    ResultSet tableList = getTables.executeQuery("SELECT * FROM name");
    String firstName = tableList.getString(1);

you can use

    ResultSet tableList = getTables.executeQuery("SELECT * FROM name");
    String firstName = (String) tableList.getObject(1);

getObject(int k) returns the object in the k'th column of the current row.
getObject(String columnName) returns the object in the named column.

7.5. Data Conversion

7.6. Some Cautions

Mixing ResultSets
You can't have two active result sets on the same statement.

    Connection rugby;
    rugby = DriverManager.getConnection(dbUrl, user, password);
    Statement getTables = rugby.createStatement();
    ResultSet count = getTables.executeQuery("SELECT COUNT(*) FROM name");
    // Reusing getTables closes the result set held in count
    ResultSet tableList = getTables.executeQuery("SELECT * FROM name");
    while (tableList.next())
        System.out.println("Last Name: " + tableList.getObject(1) + '\t' +
            "First Name: " + tableList.getObject("first_name"));
    // Raises java.sql.SQLException
    count.getObject(1);
    rugby.close();

This can happen when two threads have access to the same statement.

Two Statements on one Connection work

    Connection rugby;
    rugby = DriverManager.getConnection(dbUrl, user, password);
    Statement getTables = rugby.createStatement();
    Statement tableSize = rugby.createStatement();
    ResultSet count = getTables.executeQuery("SELECT COUNT(*) FROM name");
    ResultSet tableList = tableSize.executeQuery("SELECT * FROM name");
    while (tableList.next())
        System.out.println("Last Name: " + tableList.getObject(1) + '\t' +
            "First Name: " + tableList.getObject("first_name"));
    count.next();
    System.out.println("Count: " + count.getObject(1));
    count.close();
    tableList.close();
    rugby.close();

Threads & Connections
- Some JDBC drivers are not thread safe
- If two threads access the same connection, results may get mixed up

The MySQL driver is thread safe
- When two threads make a request on the same connection, the second thread blocks until the first thread gets its results
- You can use more than one connection, but each connection requires a process on the database

8. Some Data Modeling Terms

Entity: a distinct class of things about which something is known

Entity Occurrence: a particular instance of an entity class. In a database, entity occurrences are records in a table.

Attribute: an abstraction belonging to or characteristic of an entity

Primary Key (unique identifier): an attribute (or set of attributes) that uniquely defines an entity

Relationship: an abstraction belonging to or characteristic of two entities or parts together

Foreign Key: a unique identifier in a record representing another record. Relational databases do not support pointers to entities.

8.1. Entity Relationship (ER) Diagram

Entity (Car) with:
- Attributes (color, make, model, serial number)
- Primary key (serial number)

Relationship between the Car and Person entities:
- A car must have one and only one owner
- A person may own zero, one or more cars

Primary Key

A primary key is one that uniquely identifies a row in a table.

A Simple Faculty Table

    name       faculty_id
    Brown      1
    Clinton    2
    Bush       3

    CREATE TABLE faculty (
        name CHAR(20) NOT NULL,
        faculty_id INTEGER AUTO_INCREMENT PRIMARY KEY
    );

Indices
- Indices make access faster
- Primary keys automatically have an index
- The CREATE INDEX command creates indices

    CREATE INDEX faculty_name_key on faculty (name);

Adding Values

    INSERT INTO faculty ( name) VALUES ('White');
    INSERT INTO faculty ( name) VALUES ('Beck');
    INSERT INTO faculty ( name) VALUES ('Anantha');
    INSERT INTO faculty ( name) VALUES ('Vinge');

    select * from faculty;

Result

    name     | faculty_id
    ---------+-----------
    White    | 1
    Beck     | 2
    Anantha  | 3
    Vinge    | 4
    (4 rows)

    CREATE TABLE office_hours (
        start_time TIME NOT NULL,
        end_time TIME NOT NULL,
        day CHAR(3) NOT NULL,
        faculty_id INTEGER REFERENCES faculty,
        office_hour_id INTEGER AUTO_INCREMENT PRIMARY KEY
    );

faculty_id is a foreign key. REFERENCES faculty is meant to ensure that only valid references are made. (MySQL parses but ignores an inline REFERENCES clause like this one; enforcing the reference requires a separate FOREIGN KEY clause on an InnoDB table.)

    start_time   end_time   day   faculty_id   office_hour_id
    10:00        11:00      Wed   1            1
    8:00         12:00      Mon   2            2
    17:00        18:30      Tue   1            3
    9:00         10:30      Tue   3            4
    9:00         10:30      Thu   3            5
    15:00        16:00      Fri   1            6

    INSERT INTO office_hours ( start_time, end_time, day, faculty_id )
        VALUES ( '10:00:00', '11:00:00' , 'Wed', 1 );

The problem is that we need to know the id for the faculty member.

Using Select

    INSERT INTO office_hours (start_time, end_time, day, faculty_id )
        SELECT
            '8:00:00' AS start_time,
            '12:00:00' AS end_time,
            'Fri' AS day,
            faculty_id AS faculty_id
        FROM faculty
        WHERE name = 'Beck';

Getting Office Hours

    mysql> SELECT
        -> name, start_time, end_time, day
        -> FROM
        -> office_hours, faculty
        -> WHERE
        -> faculty.faculty_id = office_hours.faculty_id;
    +---------+------------+----------+-----+
    | name    | start_time | end_time | day |
    +---------+------------+----------+-----+
    | White   | 10:00:00   | 11:00:00 | Wed |
    | Beck    | 08:00:00   | 12:00:00 | Mon |
    | White   | 17:00:00   | 18:30:00 | Tue |
    | Anantha | 09:00:00   | 10:30:00 | Tue |
    | Anantha | 09:00:00   | 10:30:00 | Thu |
    | White   | 15:00:00   | 16:00:00 | Fri |
    +---------+------------+----------+-----+
    6 rows in set (0.00 sec)

Some Selection

    SELECT name, start_time, end_time, day
    FROM office_hours, faculty
    WHERE faculty.faculty_id = office_hours.faculty_id
        AND start_time > '09:00:00'
        AND end_time < '16:30:00'
    ORDER BY name;

    name    start_time   end_time   day
    White   10:00:00     11:00:00   Wed
    White   15:00:00     16:00:00   Fri

Joins

people

    id   first_name   last_name
    1    Roger        White
    2    Leland       Beck
    3    Carl         Eckberg

email_address

    id   user_name   host             person_id
    1    beck        cs.lehman.edu    2
    2    white       cs.lehman.edu    1
    3    white       cs2.lehman.edu   1
    4    foo         cs2.lehman.edu

The tables have a column in common, as email_address.person_id refers to people.id. So we can create a new table by joining the two tables together on that column.

Inner Join (or just Join)
Only uses entries linked in the two tables.

    select first_name, last_name, user_name, host
    from people, email_address
    where people.id = email_address.person_id;

or equivalently

    select first_name, last_name, user_name, host
    from people inner join email_address
    on (people.id = email_address.person_id);

Both statements give the same result:

    +------------+-----------+-----------+----------------+
    | first_name | last_name | user_name | host           |
    +------------+-----------+-----------+----------------+
    | Leland     | Beck      | beck      | cs.lehman.edu  |
    | Roger      | White     | white     | cs.lehman.edu  |
    | Roger      | White     | white     | cs2.lehman.edu |
    +------------+-----------+-----------+----------------+
    3 rows in set (0.00 sec)

Outer Join
Uses all entries from a table.

Left Outer Join
Uses all entries from the left table.

    select first_name, last_name, user_name, host
    from people left outer join email_address
    on (people.id = email_address.person_id);

    first_name   last_name   user_name   host
    Leland       Beck        beck        cs.lehman.edu
    Roger        White       white       cs.lehman.edu
    Roger        White       white       cs2.lehman.edu
    Carl         Eckberg

Right Outer Join
Uses all entries from the right table.

    select first_name, last_name, user_name, host
    from people right outer join email_address
    on (people.id = email_address.person_id);

    first_name   last_name   user_name   host
    Leland       Beck        beck        cs.lehman.edu
    Roger        White       white       cs.lehman.edu
    Roger        White       white       cs2.lehman.edu
                             foo         cs2.lehman.edu

A right outer join B & B left outer join A
The following two statements are equivalent:

    select first_name, last_name, user_name, host
    from people right outer join email_address
    on (people.id = email_address.person_id);

    select first_name, last_name, user_name, host
    from email_address left outer join people
    on (people.id = email_address.person_id);

9. Normalization

The normal forms defined in relational database theory represent guidelines for record design. The guidelines corresponding to first through fifth normal forms are presented here, in terms that do not require an understanding of relational theory. The design guidelines are meaningful even if one is not using a relational database system. We present the guidelines without referring to the concepts of the relational model in order to emphasize their generality, and also to make them easier to understand. Our presentation conveys an intuitive sense of the intended constraints on record design, although in its informality it may be imprecise in some technical details.

The normalization rules are designed to prevent update anomalies and data inconsistencies. With respect to performance tradeoffs, these guidelines are biased toward the assumption that all non-key fields will be updated frequently. They tend to penalize retrieval, since data which may have been retrievable from one record in an unnormalized design may have to be retrieved from several records in the normalized form.
There is no obligation to fully normalize all records when actual performance requirements are taken into account.

9.1. First Normal Form

First normal form deals with the "shape" of a record type. Under first normal form, all occurrences of a record type must contain the same number of fields. First normal form excludes variable repeating fields and groups. This is not so much a design guideline as a matter of definition. Relational database theory doesn't deal with records having a variable number of fields.

Example:

Bad

    Name        Address
    Ben Goren   Post Office Box 964, Tempe, Arizona 852800964
    Jane Doe    1234 Main Street, Mesa, Arizona 85345

Good

    First_Name   Last_Name   Street_Address        City    State     ZIP
    Ben          Goren       Post Office Box 964   Tempe   Arizona   852800964
    Jane         Doe         1234 Main Street      Mesa    Arizona   85345

Second and Third Normal Forms

Second and third normal forms deal with the relationship between non-key and key fields. Under second and third normal forms, a non-key field must provide a fact about the key, the whole key, and nothing but the key. In addition, the record must satisfy first normal form.

We deal now only with "single-valued" facts. The fact could be a one-to-many relationship, such as the department of an employee, or a one-to-one relationship, such as the spouse of an employee. Thus the phrase "Y is a fact about X" signifies a one-to-one or one-to-many relationship between Y and X. In the general case, Y might consist of one or more fields, and so might X. In the following example, QUANTITY is a fact about the combination of PART and WAREHOUSE.

9.2 Second Normal Form

Second normal form is violated when a non-key field is a fact about a subset of a key. It is only relevant when the key is composite, i.e., consists of several fields.
Consider the following inventory record:

---------------------------------------------------
| PART | WAREHOUSE | QUANTITY | WAREHOUSE-ADDRESS |
====================-------------------------------

The key here consists of the PART and WAREHOUSE fields together, but WAREHOUSE-ADDRESS is a fact about the WAREHOUSE alone. The basic problems with this design are:

- The warehouse address is repeated in every record that refers to a part stored in that warehouse.
- If the address of the warehouse changes, every record referring to a part stored in that warehouse must be updated.
- Because of the redundancy, the data might become inconsistent, with different records showing different addresses for the same warehouse.
- If at some point in time there are no parts stored in the warehouse, there may be no record in which to keep the warehouse's address.

To satisfy second normal form, the record shown above should be decomposed into (replaced by) the two records:

-------------------------------  ---------------------------------
| PART | WAREHOUSE | QUANTITY |  | WAREHOUSE | WAREHOUSE-ADDRESS |
====================-----------  =============-------------------

When a data design is changed in this way, replacing unnormalized records with normalized records, the process is referred to as normalization. The term "normalization" is sometimes used relative to a particular normal form. Thus a set of records may be normalized with respect to second normal form but not with respect to third.

The normalized design enhances the integrity of the data, by minimizing redundancy and inconsistency, but at some possible performance cost for certain retrieval applications. Consider an application that wants the addresses of all warehouses stocking a certain part. In the unnormalized form, the application searches one record type. With the normalized design, the application has to search two record types, and connect the appropriate pairs.
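
The second-normal-form decomposition and the retrieval that "connects the appropriate pairs" can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions: the table names (inventory, warehouse), the column names, and the sample rows are hypothetical and do not come from the notes.

```python
import sqlite3


def build_inventory_db():
    """Create an in-memory database holding the two normalized record types."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- WAREHOUSE-ADDRESS is a fact about WAREHOUSE alone, so it gets its own table.
        CREATE TABLE warehouse (
            warehouse         TEXT PRIMARY KEY,
            warehouse_address TEXT NOT NULL
        );
        -- QUANTITY is a fact about the whole composite key (PART, WAREHOUSE).
        CREATE TABLE inventory (
            part      TEXT NOT NULL,
            warehouse TEXT NOT NULL REFERENCES warehouse(warehouse),
            quantity  INTEGER NOT NULL,
            PRIMARY KEY (part, warehouse)
        );
        -- Hypothetical sample data; W3 currently stocks no parts.
        INSERT INTO warehouse VALUES
            ('W1', '10 Dock Road'), ('W2', '22 Pier Street'), ('W3', '5 Canal Way');
        INSERT INTO inventory VALUES
            ('bolt', 'W1', 100), ('bolt', 'W2', 40), ('nut', 'W1', 75);
    """)
    return conn


def warehouse_addresses_for_part(conn, part):
    """Return the addresses of all warehouses stocking `part`, sorted by address."""
    rows = conn.execute(
        "SELECT w.warehouse_address "
        "FROM inventory AS i JOIN warehouse AS w ON (i.warehouse = w.warehouse) "
        "WHERE i.part = ? "
        "ORDER BY w.warehouse_address",
        (part,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_inventory_db()
print(warehouse_addresses_for_part(conn, "bolt"))
```

In this sketch each address is stored once, so changing it is a single-row update, and the address of W3 is kept even though no inventory record refers to it. The price is the join in the retrieval query.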
9.3 Third Normal Form

Third normal form is violated when a non-key field is a fact about another non-key field, as in

------------------------------------
| EMPLOYEE | DEPARTMENT | LOCATION |
============------------------------

The EMPLOYEE field is the key. If each department is located in one place, then the LOCATION field is a fact about the DEPARTMENT -- in addition to being a fact about the EMPLOYEE. The problems with this design are the same as those caused by violations of second normal form:

- The department's location is repeated in the record of every employee assigned to that department.
- If the location of the department changes, every such record must be updated.
- Because of the redundancy, the data might become inconsistent, with different records showing different locations for the same department.
- If a department has no employees, there may be no record in which to keep the department's location.

To satisfy third normal form, the record shown above should be decomposed into the two records:

-------------------------  -------------------------
| EMPLOYEE | DEPARTMENT |  | DEPARTMENT | LOCATION |
============-------------  ==============-----------

To summarize, a record is in second and third normal forms if every field is either part of the key or provides a (single-valued) fact about exactly the whole key and nothing else.

9.4. Functional Dependencies

In relational database theory, second and third normal forms are defined in terms of functional dependencies, which correspond approximately to our single-valued facts. A field Y is "functionally dependent" on a field (or fields) X if it is invalid to have two records with the same X-value but different Y-values. That is, a given X-value must always occur with the same Y-value. When X is a key, then all fields are by definition functionally dependent on X in a trivial way, since there can't be two records having the same X value.
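
The definition of functional dependency can be turned into a small check over a set of records. The following is a minimal sketch, not part of any DBMS: the function name fd_violations, the field names, and the sample employee records are all hypothetical. Note that it can only show that a given set of records is consistent with a dependency; whether the dependency must hold is a matter of design, not of the data at hand.

```python
from collections import defaultdict


def fd_violations(records, x_fields, y_fields):
    """Return the X-values that occur with more than one Y-value.

    An empty result means the records are consistent with Y being
    functionally dependent on X.
    """
    y_values_by_x = defaultdict(set)
    for record in records:
        x_value = tuple(record[field] for field in x_fields)
        y_value = tuple(record[field] for field in y_fields)
        y_values_by_x[x_value].add(y_value)
    return sorted(x for x, ys in y_values_by_x.items() if len(ys) > 1)


# Hypothetical records in the EMPLOYEE / DEPARTMENT / LOCATION layout.
employees = [
    {"employee": "Ann", "department": "Sales", "location": "Boston"},
    {"employee": "Bob", "department": "Sales", "location": "Boston"},
    {"employee": "Cy", "department": "R&D", "location": "Austin"},
]

# A record that disagrees about where the Sales department is located.
conflicting = {"employee": "Dee", "department": "Sales", "location": "Denver"}

print(fd_violations(employees, ["department"], ["location"]))                  # []
print(fd_violations(employees + [conflicting], ["department"], ["location"]))  # [('Sales',)]
```

Because X and Y are passed as lists of field names, the same check covers the general case in which either side consists of several fields.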
There is a slight technical difference between functional dependencies and single-valued facts as we have presented them. Functional dependencies only exist when the things involved have unique and singular identifiers (representations). For example, suppose a person's address is a single-valued fact, i.e., a person has only one address. If we don't provide unique identifiers for people, then there will not be a functional dependency in the data:

----------------------------------------------
| PERSON     | ADDRESS                       |
-------------+--------------------------------
| John Smith | 123 Main St., New York        |
| John Smith | 321 Center St., San Francisco |
----------------------------------------------

Although each person has a unique address, a given name can appear with several different addresses. Hence we do not have a functional dependency corresponding to our single-valued fact.

Similarly, the address has to be spelled identically in each occurrence in order to have a functional dependency. In the following case the same person appears to be living at two different addresses, again precluding a functional dependency.

---------------------------------------
| PERSON     | ADDRESS                |
-------------+-------------------------
| John Smith | 123 Main St., New York |
| John Smith | 123 Main Street, NYC   |
---------------------------------------

We are not defending the use of non-unique or non-singular representations. Such practices often lead to data maintenance problems of their own. We do wish to point out, however, that functional dependencies and the various normal forms are really only defined for situations in which there are unique and singular identifiers. Thus the design guidelines as we present them are a bit stronger than those implied by the formal definitions of the normal forms.

For instance, we as designers know that in the following example there is a single-valued fact about a non-key field, and hence the design is susceptible to all the update anomalies mentioned earlier.
----------------------------------------------------------
| EMPLOYEE  | FATHER     | FATHER'S-ADDRESS              |
|============------------+-------------------------------|
| Art Smith | John Smith | 123 Main St., New York        |
| Bob Smith | John Smith | 123 Main Street, NYC          |
| Cal Smith | John Smith | 321 Center St., San Francisco |
----------------------------------------------------------

However, in formal terms, there is no functional dependency here between FATHER'S-ADDRESS and FATHER, and hence no violation of third normal form.