* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download week5
Survey
Document related concepts
Transcript
Indexing and Joins Indexes • Increase database performance • must be explicitly defined • once defined, are transparent to the user • once created, the system maintains it • more than one can exist on a given table Creating an Index • Syntax CREATE [UNIQUE] INDEX index_name ON table_name (column_name) • Example create index auind on authors (au_id) Composite Index •Used when columns have a logical relationship and would be searched as a unit •Example create index au_name_ind on authors (au_lname, au_fname) •order not important, but performance is better when primary search col is first 2 Kinds of Indexes • Unique Index • Clustered Index Unique Index • • • • No 2 rows are permitted to have the same value system checks data upon creation and data addition rejects duplicates and returns an error should only be created on a column that requires uniqueness eg. ssn, acct code • can be created as a composite or single column • helps in maintaining data integrity • boosts search performance Clustered Index •System sorts rows on an ongoing basis so that the physical order is the same as the indexed order •only 1 can exist per table •should only be created for a column that is most often retrieved in order •greatly increases performance when searching for contiguous key values… especially a range •slows down data updates due to the sorting involved Things to Consider • Indexes greatly increase query response time • every index requires system resources to store and maintain • indexes can actually slow down the performance of UPDATES, INSERTS, and DELETES due to index maintenance So… don’t over index What Should We Index? • Any column frequently used in retrieval • primary key columns • columns that are often queried in a sorted order • columns that are used in joins • columns that are often searched for ranges We Should NOT Index… • Columns rarely used in queries • columns with 2 or 3 possible values eg. Male or Female • small tables SQL-92 Create Table Constraints • PRIMARY KEY – rejects duplicates and nulls • UNIQUE – rejects duplicates, allows nulls • DEFAULT – inserts the default value when no value is entered • CHECK – validates data format • FOREIGN KEY and REFERENCES – ties foreign key to the primary key it references Put it on paper! Column Datatype Null? Key title_id title char(6) varchar(80) not null not null primary, unique unique type pub_id price advance char(12) char(4) money money Default Reference 2 letter then 2 dig unclass null null Check business, mod_cook, trad_cook publishers, pub_id Then write your SQL create table title (title_id char(6) not null constraint tididx primary key constraint tidcheck check (title_id like ‘[A-Z] [A-Z] [0-9] [0-9]…’), title varchar(80) not null constraint titleidx unique, type char(12) default ‘unclassified’ null constraint typechk check (type in(‘business’, ‘mod_cook’, ‘trad_cook’)), pub_id char(4) null reference publishers (pub_id), price money null, advance money null) Changing a Table • Syntax – ALTER table table_name add column_name datatype null|not null Removing Objects • Database – DROP DATABASE db_name – deletes ALL tables and data within it!! • Table – DROP TABLE table_name – deletes table and its contents • Index – DROP INDEX table_name.index_name – deletes named index on named table Joins In order to maintain normalization in the database design it is necessary to break up data into separate tables. The data can then be re-associated through the use of a join. Joins • • • • • • What columns do I need? What tables have these columns? Are all the tables related in some way? If not, are there other tables that can relate them? How are they all related? Link them together by setting their common fields equal in the WHERE clause. • Restrict the WHERE clause to the record(s) of interest. What to join? • Key columns are the best since these were created for the purpose of existing as a reference. • Should have similar data • Should be the same datatype • nulls will not join since their value is not known. Syntax • Usually best to put the join conditions first in the WHERE clause • Use of aliases greatly simplifies the statement. • Any logical operator can be used. • A self-join can be performed on the same table by qualifying it twice. Self Join Which authors in Oakland have the same zip code? Select distinct au1.au_fname , au1.au_lname, au1.zip from authors au1, authors au2 where au1.city = “Oakland” and au1.zip = au2.zip and au1.au_id != au2.au_id How a Join is Processed • First the system obtains the Cartesian Product of all tables in join Cartesian Product - the matrix of all possible combinations that could satisfy the join • The select list is used to restrict the columns returned • The WHERE clause is then used to restrict the rows return that satisfy the query 2 ways of looking at a Join • Looking at all the tables, linking them together and treating them like one big table. • Setting the main search criteria and then linking the common fields to the data that is of interest.