SQL Unit 10
Creating Tables, Adding Constraints,
and Defining Indexes
Kirk Scott
• 10.1 The Keyword CREATE
• 10.2 Background on General Constraints
• 10.3 Syntax for General Constraints
• 10.4 What are Indexes?
• 10.5 Syntax for Indexes
10.1 The Keyword CREATE
10.1.1 Creating Tables
• Microsoft Access has a graphical user interface
for defining database tables.
• This interface allows you to name the fields in
a table, specify their types and sizes, designate
a key, and also add various kinds of
formatting, data integrity, and referential
integrity constraints.
• Assuming you already have an understanding
of relational databases, this is probably the
preferred way of creating a table definition.
• It will be covered in Unit 12, when explaining
the final project assignment.
• In the meantime, it is still useful to see how
these things are done in SQL.
• Dealing with SQL syntax is not as convenient
as dealing with a graphical user interface, but
using SQL has the advantage that everything is
made explicit.
• If you have mastered the SQL, then it's safe to
say that you will be able to manage the kinds
of things you'll find in the graphical user
interface.
• The converse is less likely to be the case.
• Working with the interface may not be the
easiest way to get a clear, organized picture of
what table creation is all about.
• Certain critical points may be hidden in the
interface and easily overlooked.
• On the overhead following the next one, an
example is given of using the CREATE TABLE
command in SQL, where one of the tables of
the sample database is created.
• Everything that is shown is correct, and it is
sufficient to create the table.
• In this command the primary key is not
designated.
• In the long run, this is not ideal, but it is
acceptable.
• Under the covers, the system still keeps track of
each record internally, but without a designated
primary key nothing prevents duplicate records.
• The syntax for specifying primary and foreign
key fields will be given later in this unit.
• Notice that the SQL syntax parallels the
schema notation for a table.
CREATE TABLE Car
(vin          TEXT(5),
 make         TEXT(18),
 model        TEXT(18),
 year         TEXT(4),
 stickerprice CURRENCY,
 dealercost   CURRENCY)
• Although strictly speaking the command
above is not a query, it can be entered as a
query in Microsoft Access using the SQL
editor.
• Clicking the execute button will cause the
table to come into existence.
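• The effect of the command can also be tried outside of Access. The following is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory SQLite database standing in for Access. SQLite accepts the same type names, but it is a different system, and the PRAGMA used to list the fields is SQLite-specific.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Microsoft Access.
conn = sqlite3.connect(":memory:")

# The same CREATE TABLE command as on the overhead, with no primary key.
conn.execute(
    """
    CREATE TABLE Car
    (vin          TEXT(5),
     make         TEXT(18),
     model        TEXT(18),
     year         TEXT(4),
     stickerprice CURRENCY,
     dealercost   CURRENCY)
    """
)

# Ask the system which fields the new table has (SQLite-specific PRAGMA).
car_fields = [row[1] for row in conn.execute("PRAGMA table_info(Car)")]
print(car_fields)
```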
10.2 Background on General Constraints
• 1. A simple example of creating a table using
SQL is given on the next overhead.
• The fields and their types and sizes are
defined.
• The primary key field is not defined.
• This example will be used to illustrate how
other characteristics, or constraints, can be
added to the table definition.
CREATE TABLE Person
(SSN      TEXT(9),
 lastname TEXT(12),
 dob      DATE)
• In a complete table definition it would be
desirable to specify the primary key.
• Remember that the primary key value has to be a
unique identifier for each record in the table and
no part of the primary key can be null.
• These requirements together are formally known
as entity integrity.
• Defining a primary key field is a kind of constraint,
which will be shown shortly.
• 2. It may be desirable to require that other
fields in a table besides the primary key be
unique or not null.
• It is possible to have a situation like this:
• The Person table is redefined so that it has a
personid field which is different from the SSN,
but the SSN is still included.
• This is shown on the next overhead.
CREATE TABLE Person
(personid TEXT(12),
 lastname TEXT(12),
 dob      DATE,
 SSN      TEXT(9))
• In a situation like this, the personid should be
unique and not null, because it's the key.
• It would also be desirable for the SSN to be
unique and probably not null.
• Enforcing this would be another kind of
constraint that could be added to the table
definition.
• There can also be a situation where a field is
allowed to have duplicate values in different
records, but you don't want to allow null values.
• For example, in the Person table, you may not
wish to allow entries for people who do not
have names.
• This is yet another example where a constraint
would be used.
• 3. Referential integrity defines the requirements for a
primary key to foreign key relationship between two
tables or between a table and itself.
• Consider this alternative definition of the Person table:
CREATE TABLE Person
(SSN       TEXT(9),
 lastname  TEXT(12),
 motherSSN TEXT(9),
 dob       DATE)
• Let there be a Mother table which also has a
primary key named SSN.
• The motherSSN field in the Person table is known
as a foreign key and it refers to the SSN field in
the Mother table.
• Referential integrity states that the motherSSN
field in the Person table cannot contain values
which do not exist in the SSN field in the Mother
table.
• Enforcing referential integrity is another kind of
constraint.
• 4. Additional specifications or conditions included
in a table definition are known generally as
constraints.
• In general, it is also possible to add constraints to
table definitions after the tables have been
created.
• This unit will cover including constraints in the
original definition.
• The next unit will cover adding or dropping
constraints after a table has been created.
• If constraints are named, this makes it
possible to refer to them later on, in
particular, so that they can be removed from
the table.
• There are various forms of the syntax for
constraints.
• Not all of the forms will be shown below, just
a consistent set of forms that should be
relatively easy to remember.
10.3 Syntax for General Constraints
• 1. This example shows the syntax for specifying a
primary key in a table definition:
CREATE TABLE Person
(SSN      TEXT(9),
 lastname TEXT(12),
 dob      DATE,
 CONSTRAINT personpkSSN PRIMARY KEY(SSN))
• As usual, the keywords are capitalized.
• The field name SSN happens to be capitalized
too in this example, but that is a coincidence.
• It is a good idea to give the constraint a
descriptive name.
• The name can't have spaces in it.
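• To see the primary key constraint at work, here is a minimal sketch, again assuming Python's sqlite3 module with SQLite standing in for Access. The sample records are made up for illustration. The second insert reuses an existing SSN, so the system refuses it.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person
    (SSN      TEXT(9),
     lastname TEXT(12),
     dob      DATE,
     CONSTRAINT personpkSSN PRIMARY KEY(SSN))
    """
)
conn.execute("INSERT INTO Person VALUES ('123456789', 'Smith', '1990-01-01')")

# A second record with the same SSN violates the primary key constraint.
try:
    conn.execute("INSERT INTO Person VALUES ('123456789', 'Jones', '1985-05-05')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(duplicate_rejected, person_count)
```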
• 2. Recall that it is possible to have a table with
a concatenated key field.
• This means that the unique identifier for a
record in the table is the combination of the
values of two different fields in the table.
• This can happen when there is a many-to-many
relationship, and the primary keys of both of the
tables in the many-to-many relationship are
embedded as foreign keys in a table in the
middle.
• Assuming that there was a Chimpanzee table
with chimpid as its primary key, the relationships
between chimps could be captured by the table
design given in the next example.
• The table's primary key would be the
concatenation of chimpid1 and chimpid2.
• You could specify the primary key by including
the line shown at the end of the table
definition.
• All you have to do is list the concatenated key
fields inside the parentheses, separated by
commas.
CREATE TABLE Chimprelationships
(chimpid1      TEXT(6),
 chimpid2      TEXT(6),
 beginningdate DATE,
 enddate       DATE,
 CONSTRAINT chimppk PRIMARY KEY(chimpid1, chimpid2))
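• With a concatenated key, it is the combination of values that has to be unique, not each field by itself. A minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access, and using made-up chimp ids and dates:

```python
import sqlite3

# Assumption: SQLite stands in for Access; ids and dates are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Chimprelationships
    (chimpid1      TEXT(6),
     chimpid2      TEXT(6),
     beginningdate DATE,
     enddate       DATE,
     CONSTRAINT chimppk PRIMARY KEY(chimpid1, chimpid2))
    """
)
insert = "INSERT INTO Chimprelationships VALUES (?, ?, ?, ?)"
conn.execute(insert, ("C001", "C002", "2001-01-01", None))
# chimpid1 repeats here, but the pair (C001, C003) is new, so it is allowed.
conn.execute(insert, ("C001", "C003", "2002-01-01", None))

# The pair (C001, C002) already exists, so this record is refused.
try:
    conn.execute(insert, ("C001", "C002", "2003-01-01", None))
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

relationship_count = conn.execute(
    "SELECT COUNT(*) FROM Chimprelationships"
).fetchone()[0]
print(pair_rejected, relationship_count)
```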
• 3. The next example shows the syntax for
specifying the primary key and also for
specifying that another field in the table be
unique.
CREATE TABLE Person
(personid TEXT(12),
 lastname TEXT(12),
 dob      DATE,
 SSN      TEXT(9),
 CONSTRAINT personpkpersonid PRIMARY KEY(personid),
 CONSTRAINT SSNunique UNIQUE(SSN))
• The personid field will be constrained to be
unique because it's the primary key.
• The SSN field will be constrained to be unique
by the separate uniqueness constraint on it.
• As before, the keywords are shown
capitalized.
• It's a coincidence that the field SSN is also
capitalized.
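• The separate uniqueness constraint can be tried out in the same way. This is a minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access. The two people are made up, and they have different personid values but the same SSN.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person
    (personid TEXT(12),
     lastname TEXT(12),
     dob      DATE,
     SSN      TEXT(9),
     CONSTRAINT personpkpersonid PRIMARY KEY(personid),
     CONSTRAINT SSNunique UNIQUE(SSN))
    """
)
conn.execute("INSERT INTO Person VALUES ('P1', 'Smith', '1990-01-01', '123456789')")

# The personid is new, but the SSN is already in use, so SSNunique is violated.
try:
    conn.execute("INSERT INTO Person VALUES ('P2', 'Jones', '1985-05-05', '123456789')")
    ssn_duplicate_rejected = False
except sqlite3.IntegrityError:
    ssn_duplicate_rejected = True

person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(ssn_duplicate_rejected, person_count)
```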
• 4. Specifying NOT NULL as a constraint on a
table is slightly different from the other
constraints, because it is not named.
• All you have to do is put the constraint after
the relevant field in the table definition.
• This is shown on the next overhead.
CREATE TABLE Person
(SSN      TEXT(9),
 lastname TEXT(12) NOT NULL,
 dob      DATE)
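• NOT NULL rules out missing values without requiring uniqueness. A minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access and made-up sample records: two people may share a last name, but a record with no last name is refused.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person
    (SSN      TEXT(9),
     lastname TEXT(12) NOT NULL,
     dob      DATE)
    """
)
insert = "INSERT INTO Person VALUES (?, ?, ?)"
conn.execute(insert, ("111111111", "Smith", "1990-01-01"))
conn.execute(insert, ("222222222", "Smith", "1992-02-02"))  # duplicate name is fine

# None is passed to the database as NULL, which the constraint forbids.
try:
    conn.execute(insert, ("333333333", None, "1994-03-03"))
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(null_rejected, person_count)
```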
• 5. When putting a referential integrity
constraint into a database design, the
constraint goes into the foreign key table, not
the primary key table.
• Let there be a table named Mother with a
primary key field defined as shown on the
next overhead.
CREATE TABLE Mother
(SSN TEXT(9),
 …,
 CONSTRAINT motherpkSSN PRIMARY KEY(SSN))
• Then a foreign key constraint in the Person
table would be as shown on the next
overhead.
CREATE TABLE Person
(SSN       TEXT(9),
 lastname  TEXT(12),
 motherSSN TEXT(9),
 dob       DATE,
 CONSTRAINT personpkSSN PRIMARY KEY(SSN),
 CONSTRAINT personfkmother FOREIGN KEY(motherSSN) REFERENCES Mother(SSN))
• Notice that there are two sides to the foreign key
constraint.
• It is not possible to enter new values into the
foreign key table, or update values in the foreign
key table to ones that don't exist in the primary
key table.
• It would also violate referential integrity if there
were changes in the primary key table that left
values in the foreign key table without matches in
the primary key table.
• Referential integrity is so important that the
system also protects the database contents from
changes in the primary key table.
• There are two possibilities:
• 1) If a primary key record with corresponding
foreign key records were deleted, those foreign key
records would be orphaned.
• It is most common in this case to disallow such
deletions.
• This is known as "ON DELETE RESTRICT".
• 2) If a primary key value with matches in the
foreign key table were updated, the matching
foreign key records would be orphaned.
• It is most common in this case to specify that
the corresponding foreign key records be
updated to match.
• This is known as "ON UPDATE CASCADE".
• The next overhead shows how the foreign key
constraint example would look with DELETE
and UPDATE restrictions/cascades explicitly
specified:
CREATE TABLE Person
(SSN       TEXT(9),
 lastname  TEXT(12),
 motherSSN TEXT(9),
 dob       DATE,
 CONSTRAINT personpkSSN PRIMARY KEY(SSN),
 CONSTRAINT personfkmother FOREIGN KEY(motherSSN)
   REFERENCES Mother(SSN)
   ON DELETE RESTRICT
   ON UPDATE CASCADE)
• Notice that with these options set, the system
is doing a lot of work on behalf of the user,
protecting the integrity of the data in the
related tables.
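• The work the system does can be observed directly. The following is a minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access. SQLite only enforces foreign keys after the PRAGMA shown, the Mother table is cut down to its key field for the illustration, and the helper is_rejected and the sample values are made up.

```python
import sqlite3

# Assumption: SQLite stands in for Access; sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

# Mother is reduced to its key field for this illustration.
conn.execute(
    """
    CREATE TABLE Mother
    (SSN TEXT(9),
     CONSTRAINT motherpkSSN PRIMARY KEY(SSN))
    """
)
conn.execute(
    """
    CREATE TABLE Person
    (SSN       TEXT(9),
     lastname  TEXT(12),
     motherSSN TEXT(9),
     dob       DATE,
     CONSTRAINT personpkSSN PRIMARY KEY(SSN),
     CONSTRAINT personfkmother FOREIGN KEY(motherSSN)
       REFERENCES Mother(SSN)
       ON DELETE RESTRICT
       ON UPDATE CASCADE)
    """
)
conn.execute("INSERT INTO Mother VALUES ('111111111')")
conn.execute(
    "INSERT INTO Person VALUES ('333333333', 'Smith', '111111111', '2000-01-01')"
)


def is_rejected(sql):
    """Hypothetical helper: return True if the system refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Foreign key side: a motherSSN with no match in Mother is refused.
orphan_rejected = is_rejected(
    "INSERT INTO Person VALUES ('444444444', 'Jones', '999999999', '2001-01-01')"
)
# Primary key side: deleting a mother who has a child is refused (RESTRICT).
delete_rejected = is_rejected("DELETE FROM Mother WHERE SSN = '111111111'")

# Primary key side: changing the mother's SSN carries over to the child (CASCADE).
conn.execute("UPDATE Mother SET SSN = '222222222' WHERE SSN = '111111111'")
child_mother = conn.execute(
    "SELECT motherSSN FROM Person WHERE SSN = '333333333'"
).fetchone()[0]
print(orphan_rejected, delete_rejected, child_mother)
```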
10.4 What are Indexes?
• 1. An index can be described as a construct
that supports two-column lookup.
• Suppose you're interested in words and their
locations in a book.
• You look up the word in the index, and what
you find is its page number.
• This is a somewhat more detailed description
of the situation:
• A) The words in a book don't occur in sorted
order.
• They appear in sentences and paragraphs in
an order that is determined by the topic under
discussion and the rules of grammar.
• B) The index of a book consists of the
important words in the book sorted in
alphabetical order, followed by the page
numbers where those words appear.
• This is your two-column lookup.
• You look up the word, and what you find is the
page where it occurs.
• 2. Being able to look things up is critical to the
internal operation of a database management
system and the execution of queries.
• Remember that technically tables are like sets:
• Their contents do not have to be kept in any
particular order.
• If you want to see the contents of tables in
sorted order, you know that you can put the
keywords ORDER BY in a query, but this
doesn't change the order in which the records
are stored.
• You may have noticed that if you don't specify
ORDER BY in a query, the results tend to come
out sorted in primary key order.
• This still doesn't signify that the contents of
the table are maintained in that order.
• It just means that that order may be the
default order for results in some cases.
• It is generally the case that the records in a
table are simply stored in the same order that
they were entered into the table.
• Shown below is a very simple example of a
situation like this.
• The records are not sorted on any of the data
fields.
• In order to keep track of what's going on, an extra
field is shown which simply indicates which is
first, which is second, and so on.
• The technical term for this extra field is the
relative record number, and it is abbreviated RRN:
Simple Person Table
RRN  idno  name  age
1    2     Pam   20
2    4     Lee   18
3    1     Ned   21
4    3     Kim   22
• It should be clear that if you know the RRN for
a given record, then it is easy to find that
record and all of the data values that go with
it.
• The RRN is analogous to a page number in the
book index example.
• It is important to emphasize that the RRN is
invisible to the database user.
• You never see this as part of a table.
• However, it is conceptually useful to talk about
RRNs when trying to explain what a database
index is and how it works.
• 3. An index in a database management
system is a data structure separate from a
table that makes it possible for the system to
easily find specified information.
• Suppose you frequently wanted to look up
records for the people in the example table by
idno.
• You could specify that the system maintain an
index on idno.
• The little example table given above is shown
again on the next overhead.
• Along with it is shown a conceptual
representation of an index on the idno field.
• The index is not actually maintained as a two-column
lookup table and the index is invisible
to the user, but the example illustrates how it
can aid in data retrieval.
Simple Person Table
RRN  idno  name  age
1    2     Pam   20
2    4     Lee   18
3    1     Ned   21
4    3     Kim   22

Index
idno  RRN
1     3
2     1
3     4
4     2
• Doing the lookup consists of these two steps:
First look up the idno in the index, where you find
the RRN.
• Then use the RRN to find the corresponding
record in the table.
• For the first step, it is not hard to find one out of
a set of values when they’re maintained in sorted
order.
• For the second step, it is straightforward to find
the corresponding record when given its RRN.
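• The two steps can be imitated in a few lines of code. This is only a conceptual sketch in Python, not how a database management system actually stores an index. The names table, idno_index, and lookup_by_idno are made up for the illustration, and a binary search stands in for "finding a value in a sorted set".

```python
from bisect import bisect_left

# The Simple Person table: RRN -> (idno, name, age), in order of entry.
table = {
    1: (2, "Pam", 20),
    2: (4, "Lee", 18),
    3: (1, "Ned", 21),
    4: (3, "Kim", 22),
}

# Conceptual index on idno: (idno, RRN) pairs kept in sorted order.
idno_index = sorted((record[0], rrn) for rrn, record in table.items())


def lookup_by_idno(idno):
    """Hypothetical helper: find a record by idno using the index."""
    keys = [key for key, _ in idno_index]
    # Step 1: binary search the sorted idno values to find the RRN.
    pos = bisect_left(keys, idno)
    if pos == len(keys) or keys[pos] != idno:
        return None  # no record has this idno
    rrn = idno_index[pos][1]
    # Step 2: use the RRN to go straight to the record in the table.
    return table[rrn]


print(idno_index)         # [(1, 3), (2, 1), (3, 4), (4, 2)]
print(lookup_by_idno(3))  # (3, 'Kim', 22)
```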
• The result is that if an index on the desired
field exists, a query that uses the index will
run more quickly than a query that doesn’t.
• Just to emphasize what an index is and how it
gives improved access to the contents of a
table, here are lookup table representations of
indexes on the other two fields of the Simple
Person table:
Index
name  RRN
Kim   4
Lee   2
Ned   3
Pam   1

Index
age  RRN
18   2
20   1
21   3
22   4
• 4. There is syntax in SQL that allows you to
specify that a given table have an index on a
given field.
• The index is created when the command is given,
and from that point on the system will maintain
the index.
• A table is not indexed on all fields by default,
which is why the user has to specifically
create an index.
• There is also syntax to remove an index if it is no
longer wanted.
• The benefits and uses of indexes can be more
completely outlined as follows:
• The benefits of indexes come at query
execution time.
• For example, if you write a query that
requests that the results be ordered on a
given field, that will be speeded up if there is
an index on that field.
• If you write a query that has a WHERE clause on a
given field, finding the records that match will be
speeded up if there is an index on that field.
• If you write a join query on a field in a table, the
join query will be speeded up if there is an index
on that field.
• If you have requested that an index be
maintained on a certain field, the system will
automatically use the index for those queries
where it helps.
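• It is possible to ask a system how it intends to run a query. The following is a minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access. EXPLAIN QUERY PLAN is an SQLite feature, and query_plan is a made-up helper. The CREATE INDEX command used here is explained in section 10.5. The same WHERE query is planned before and after the index exists.

```python
import sqlite3

# Assumption: SQLite stands in for Access; EXPLAIN QUERY PLAN is SQLite-specific.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (SSN TEXT(9), lastname TEXT(12), dob DATE)")


def query_plan(sql):
    """Hypothetical helper: return SQLite's description of how it would run sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[3] for row in rows)  # the fourth column is the description


query = "SELECT * FROM Person WHERE lastname = 'Smith'"

# Without an index the system has to scan the whole table.
plan_without_index = query_plan(query)

# Once the index exists, the system picks it up without being told to.
conn.execute("CREATE INDEX lastnameindexasc ON Person(lastname)")
plan_with_index = query_plan(query)

print(plan_without_index)
print(plan_with_index)
```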
• Whether a query needs to access all of the
records in sorted order or whether it needs to
be able to find specific records that match a
condition, an index can help.
• For the small tables in the example database,
the difference in speed will not be noticeable.
• However, if a table had thousands or tens of
thousands of records, the difference would be
significant.
• Maintaining indexes, although not visible to the
user, involves a cost.
• If a table has an index on it, every time the user
inserts a new record into a table, deletes a record
from the table, or updates the value of the field
the index is on, not only does the table have to be
changed, but the index also has to be changed.
• This causes insert, update, and delete operations
to run somewhat more slowly than they
otherwise would.
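• The extra work can be seen in a conceptual sketch. This is plain Python, not a real database. The names table, idno_index, and insert_record, and the new record for "Amy", are made up for the illustration. Every insert has to touch two structures instead of one.

```python
from bisect import insort

# The Simple Person table in order of entry; list position + 1 plays the RRN.
table = [(2, "Pam", 20), (4, "Lee", 18), (1, "Ned", 21), (3, "Kim", 22)]

# Conceptual index on idno: (idno, RRN) pairs kept in sorted order.
idno_index = sorted((record[0], rrn) for rrn, record in enumerate(table, start=1))


def insert_record(record):
    """Hypothetical helper: add a record and keep the index up to date."""
    table.append(record)  # first change: the table itself
    rrn = len(table)
    insort(idno_index, (record[0], rrn))  # second change: the sorted index
    return rrn


new_rrn = insert_record((0, "Amy", 19))
print(new_rrn)     # 5
print(idno_index)  # [(0, 5), (1, 3), (2, 1), (3, 4), (4, 2)]
```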
• Indexes also take up space.
• Any one index alone would not be too costly
to maintain, but it is not practical to
automatically index on every field of every
table.
• The user has to choose which indexes would
be worth having, and create only those.
• There is one last thing to consider:
• If a table definition includes the specification
of a primary key field, the system will
automatically create and maintain an index on
that field.
• It is the index that is used in order to enforce
uniqueness and the not null requirement on
the field.
• Because the values in the index are maintained in
sorted order, if a user tries to enter a duplicate
value, it is immediately detected when the index
is updated, and the duplicate will be rejected.
• Having an index on the primary key is useful
because the primary key field is the field most
frequently used to access a table.
• The existence of the index is the reason that
query results tend to be shown in primary key
order, even if no ORDER BY was included in them.
10.5 Syntax for Indexes
• 1. Creating indexes is not exactly the same as
creating tables or specifying constraints, but
indexes are related to both ideas.
• Indexes are created using the CREATE keyword,
like creating a table.
• They are not created by adding a line to a table
definition, even though, as explained above, some
constraints are actually implemented by means of
indexes.
• When creating an index you have to give it a
name, tell which table it’s on, and specify which
field (or fields) it’s on.
• The default sort order for the index field is
ascending, but it is also possible to specify a sort
order, either ASC or DESC.
• If you frequently wrote queries on a table where
you requested the results in descending order,
then DESC would be a desirable option.
• Here is a simple example of the creation of an
index:
CREATE INDEX lastnameindexasc
ON Person(lastname)
• Here is an example with a different sort order:
CREATE INDEX lastnameindexdesc
ON Person(lastname DESC)
• You can create the index before any data are
in the table.
• You can also create it after data have already
been entered.
• If you create it afterwards and there are lots of
existing records, it may take the system a
noticeable amount of time to create the
index.
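• Both orders of events can be tried out. This is a minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access. The sample records are made up, and PRAGMA index_list is an SQLite-specific way of listing a table's indexes. One index is created while the table is empty and the other after data have been entered.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (SSN TEXT(9), lastname TEXT(12), dob DATE)")

# Created before any data are in the table.
conn.execute("CREATE INDEX lastnameindexasc ON Person(lastname)")

conn.executemany(
    "INSERT INTO Person VALUES (?, ?, ?)",
    [
        ("111111111", "Smith", "1990-01-01"),
        ("222222222", "Jones", "1985-05-05"),
        ("333333333", "Adams", "1978-03-03"),
    ],
)

# Created after data have already been entered.
conn.execute("CREATE INDEX lastnameindexdesc ON Person(lastname DESC)")

# List the indexes the system now maintains on the table.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list(Person)"))
print(index_names)
```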
• Just as it’s possible to have concatenated key
fields, or do ORDER BY on more than one field
at a time, it is possible to have an index on
more than one field.
• If you had a city and state field in the Person
table and frequently ran queries containing
ORDER BY state, city, this might be a useful
index to have:
CREATE INDEX statecityindex
ON Person(state, city)
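• Whether such an index would be used for the ORDER BY can be checked. The following is a minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access. The city and state fields and their sizes are assumptions, since the Person table shown earlier does not have them, and EXPLAIN QUERY PLAN is an SQLite feature.

```python
import sqlite3

# Assumption: SQLite stands in for Access; city and state are hypothetical fields.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Person
    (SSN      TEXT(9),
     lastname TEXT(12),
     city     TEXT(20),
     state    TEXT(2))
    """
)
conn.execute("CREATE INDEX statecityindex ON Person(state, city)")

# Ask SQLite how it would produce the rows in state, city order.
rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT state, city FROM Person ORDER BY state, city"
).fetchall()
order_by_plan = " ".join(row[3] for row in rows)
print(order_by_plan)
```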
• 2. As already pointed out, the system
automatically creates an index for the primary
key field because the index can be used to
enforce uniqueness and the not null requirement.
• When specifying the uniqueness and not null
constraints separately, this also causes the
creation of an index.
• If the user is creating an index anyway, it is also
possible to specify these constraints as part of
the index rather than specifying them separately.
• For example, you could create an index on a
person’s lastname and enforce uniqueness at
the same time in this way:
CREATE UNIQUE INDEX uniquelastnameindex
ON Person(lastname)
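• A unique index behaves like a uniqueness constraint. This is a minimal sketch, assuming Python's sqlite3 module with SQLite standing in for Access, with made-up sample records: once the index exists, a second person with the same last name is refused.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (SSN TEXT(9), lastname TEXT(12), dob DATE)")
conn.execute("CREATE UNIQUE INDEX uniquelastnameindex ON Person(lastname)")

conn.execute("INSERT INTO Person VALUES ('111111111', 'Smith', '1990-01-01')")

# The index already holds 'Smith', so this record is refused.
try:
    conn.execute("INSERT INTO Person VALUES ('222222222', 'Smith', '1985-05-05')")
    lastname_duplicate_rejected = False
except sqlite3.IntegrityError:
    lastname_duplicate_rejected = True

person_count = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]
print(lastname_duplicate_rejected, person_count)
```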
• If you wanted to prevent null values for
lastname, you could do this:
CREATE INDEX notnulllastnameindex
ON Person(lastname)
WITH DISALLOW NULL
• Not all of this syntax is guaranteed to work in
Microsoft Access SQL, but the general
specification of an index can be shown using
special notation.
• The symbols in the notation are read in this way:
• [] means something that’s optional.
• | means “or”, that is something where you can
choose from a list.
• {} encloses a list to choose from.
• As you can see, the complete syntax allows you to
specify an index that may or may not enforce
uniqueness and the not null requirement, and it
can also specify that the field being indexed on is
the primary key or not.
CREATE [UNIQUE] INDEX index_name
ON table_name(field_name(s) [ASC|DESC])
[WITH {DISALLOW NULL | IGNORE NULL | PRIMARY}]
The End