Download Database Tables

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Microsoft Jet Database Engine wikipedia , lookup

Relational algebra wikipedia , lookup

Database wikipedia , lookup

Clusterpoint wikipedia , lookup

Ingres (database) wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Relational model wikipedia , lookup

Database model wikipedia , lookup

Transcript
Database Tables
Review of Definitions
• Database: A collection of interrelated data
items that are managed as a single unit.
• Database Management System: The software
application that organizes and retrieves data.
Common DBMS’s are Oracle, Microsoft SQL
Server, DB2, MySQL, and Access.
• Relational Database: A database consisting of
tables which follow the rules specified by E.F.
Codd.
More Definitions
• Record: A horizontal row in a table. “Record”
and “row” are used interchangeably.
• Field: A vertical column in a table. “Field” and
“column” are used interchangeably.
Table: A technical Definition
• The term “table” is loosely used to refer to any
spreadsheet-like grid of data.
• In a relational database, however, “table” refers
to the place in memory where the data is actually
stored.
• All DBMS’s offer different ways to look at the data
stored in tables, and these frequently look like
tables.
• But officially, a table is the place where the data
is actually stored—not just a view of the data.
Table Example
• Here’s a table of the teams in this year’s Tour de France.
• It contains all of the teams, and four fields of information about them.
View Example
• A view looks like a table, but it is not the place
where the data is stored:
• In this view, we are only showing three of the
22 teams, and only two of the four fields from
the Teams table.
• It would be wasteful and confusing to store
this information directly in the database along
with the table it came from.
To Repeat
• In a relational database, a table is where the
data is actually stored.
• Other displays of data may look like tables, but
if they are not actually the data containers,
they are views (or queries), not tables.
Entities
• Definition: An entity is a person, place, thing,
event, or concept about which data is
collected.
• Examples:
– Person: Student, Employee, Voter
– Place: City, County, State, Country
– Thing: Vehicle, Part, Computer
– Event: Game, Concert, Class
– Concept: Political System, Team,
Attributes
• Definition: An attribute is a unit fact that
characterizes or describes an entity.
• Suggest attributes for some of the entities
listed earlier.
Entities, Attributes, Tables, Fields
• In good database design, entities become
tables, and attributes become the fields
(columns) in those tables.
• However, deciding whether something is an
entity or an attribute isn’t always easy.
Entity or attribute?
• Suppose a particular company assigns a
company car to each employee.
• That car might be considered an attribute of
the employee.
• However, a car is itself a thing about which
data is collected: Make, model, license
number, etc. This makes it an entity.
• So: is car an attribute or an entity?
It depends
• If you are just recording one identifier about
the cars, such as license plate number, you can
consider it an attribute.
• If your database will include other information
about the cars, you should treat “car” as an
entity.
• In general, entities have attributes; attributes
do not.
Review Definitions
• Entity: Person, place, thing, event, or concept
about which data is collected.
• Attribute: A unit fact that characterizes or
describes an entity.
Primary Keys
• I said before that a table in a database is
where the data is actually stored.
• However, it takes more than just storing data
to make a proper database table.
It’s not easy being a table
• Just containing data isn’t sufficient to be a table.
• Consider the grid below. What might be wrong?
• There is no way to tell the first three rows apart.
• In a proper database table, each row (record) is
different from all others.
• The rows do not need to be unique in every column;
rows can have identical attributes.
Good tables have primary keys
• Definition: A primary key is a column or set of
columns that uniquely identifies each row in a
table.
• A key is a way to identify a particular record in
a table. The primary key is a field, or collection
of fields, that allows for quick access to a
record.
Candidate Keys
• For some entities, there may be more than
one field or collection of fields that can serve
as the primary key.
• These are called candidate keys.
Candidate Keys
• All candidate keys can open the lock;
• that is, they can uniquely identify a record in a
table.
• Consider a table containing U of M students.
• What are some possible candidate keys?
Compound Keys
• The primary key doesn’t have to be a single column. In many
cases, the combination of two or more fields can serve as the
primary key. For example, consider a table of all of the courses
offered at the University.
• The course number would not be sufficient to identify a single
course; there are 101’s in many departments, and probably plenty
of 373’s as well.
• Department won’t work either; there are many courses in IOE, as
well as Psychology, German, and Economics.
• But when you combine department and course number, you have
uniquely identified a course.
• Therefore, those two fields together can serve as the primary key
for the Course table.
• Definition--Compound Key: A primary key consisting of two or
more fields.
• Definition—Simple Key: A primary key consisting of one field.
Choosing a Key
• Picking/assigning a primary key can be tricky, and
is a matter of some controversy.
• Some books recommend that you always create a
surrogate or artificial key; by default, Access does
this for (or to?) you.
• The Databases Demystified/Goodsell approach:
1. if the table already has a natural primary key use it;
2. if not, then use a combination of fields that
guarantees uniqueness;
3. only if you can’t find a natural simple or compound
key should you resort to creating a surrogate key.
“Natural” Keys
• Many entities already have widely accepted and enforced
keys that you can use in your database—these are termed
“natural” keys.
• Definition: A “natural key” is a pre-existing or ready-made
field which can serve as the primary key for a table.
• For employees, social security number is widely used. You
can have two Tim Joneses working for you, but only one
123-45-6789.
• For vehicles, you can use Vehicle Identification Number
(VIN), a unique 17-character code that every vehicle
manufactured is required to have.
• Most “natural” keys aren’t really natural; they are simply
artificial or “surrogate” keys that someone else created and
that are now widely recognized.
Election Day
• Databases Demystified spells out how to choose a primary
key:
1.
2.
3.
4.
If there is only one candidate, choose it.
Choose the candidate least likely to have its value change.
Changing primary key values once we store the data in tables
is a complicated matter because the primary key can appear
as a foreign key in many other tables. Incidentally, surrogate
keys are almost always less likely to change compared with
natural keys.
Choose the simplest candidate. The one that is composed of
the fewest number of attributes is considered the simplest.
Choose the shortest candidate. This is purely an efficiency
consideration. However, when a primary key can appear in
many tables as a foreign key, it is often worth it to save some
space with each one.
Surrogate Keys
• Occasionally, you will come across a table that has NO
candidate keys.
• In this case, you must resort to a surrogate key (Databases
Demystified refers to this as an “act of desperation”)
• Definition: A “surrogate key” is a field (usually numeric)
added to a table as the primary key when no natural keys
are available.
• When you have to resort to a surrogate key, don’t try to put
any meaning into it.
• The values in surrogate key fields frequently become
important identification numbers: Social Security Numbers,
Student IDs, VINs, SKUs (in stores), etc. They are frequently
associated with some sort of ID card or tag.
Surrogate Key Example
• Suppose you run a chicken farm with lots of
chickens.
• You want to be able to track the chickens—when
they were hatched, weights at certain ages, how
many eggs they lay, etc.
• The chickens don’t have names, social security
numbers, uniqnames, or any other simple way to
tell them apart.
• Therefore, you put a numbered tag on each
chicken. This number becomes the surrogate key
in your Chickens table.
Surrogate Keys
It is tempting to make the primary key be a sort
of code: a grocery store might say “let’s make a
product key out of department, brand, name,
and size,” yielding a key value something like
“CEREAL-KELL-FROO-15”.
These are keys that manage to violate
normalization rules in one data cell! If Post buys
Kelloggs, do you change the key for Froot
Loops?
If you must have a surrogate key, make it an
integer with no extra meaning attached.
Natural vs. Surrogate Keys
• Using a surrogate key when a natural key is available can defeat the
whole purpose of having a primary key. This example shows how
using an AutoNumber primary key allows duplicate entries to be
made (StaffID is the primary key):

If SocialSecurityNumber were the primary key, this duplicate entry
would not be allowed by the DBMS.
Formatting Conventions
Midterm
Material!
• Tables should generally be named as plural nouns, with the
first letter capitalized: Customers, Orders, Products, etc.
• Table names should not have any spaces or punctuation in
them. The following are bad table names: Bob’s Customers,
Back Orders, hours/day. (Details later.)
• When designing a table, the primary key field or fields should
be at the far left of the table.
• It is common to see the primary key field(s) highlighted in
some way—underlined, bold, with an asterisk, etc.
• In online quiz 2, we highlight the primary key in orange, like
this:
Rules for Table and Field Names
• No Spaces! Access will allow you to name a
table “My Company’s Employees”, but this
causes many problems down the road,
especially when interfacing with VB.
• “Employees” would be a much better name
• The whole database probably relates to “My
Company”, so there’s no need to include that
in the table name. Even if there were—NO
SPACES!!!!!
Rules for Table and Field Names
• No punctuation: Most DBMS’s (except for
Access) won’t allow any punctuation in the
name of a field or table except the underscore
(_).
• I’m not a fan of underscores, either—they
tend to be obscured by hyperlinks and such.
• For this class, if your table or field needs a
multi-word name, use InteriorCapitals.
Access Keywords
• Access has many keywords with special
meaning that cannot be used as field or table
names.
• A complete list is here.
Basic Table Design
• “Designing a table” means defining the fields
(columns) in the table.
• Adding rows (data) to a table isn’t designing
it—the proper term for adding rows is
“populating” the table.
Defining Fields
• In Excel, designing an column is easy—you just
type a word or two in the cell at the top of a
column.
• For a true relational database, designing
columns takes more work. You must give a
column a legal name (which varies somewhat
between DBMS’s), a data type, and sometimes
some constraints as well.
Designing a “Uniqname” Column
• If you were creating a table of UM
students in Excel, you might just type the
word “Uniqname” in cell A1 to get
started.
Relational Database Column Design
• In a true relational database, you would define
the column this way:
– The name, “Uniqname”, is valid in every DBMS I know
about, so you can use that. (Valid field names are
discussed later.)
– The data type would be Text in Access (some DBMS’s
might call it something else—String or Varchar, for
example)
– Constraints could include:
• Minimum of 1 character
• Maximum of 8 characters
• Letters only; no numbers, spaces, or punctuation
Practice
• Which one of the following is a good table name?
–
–
–
–
IOE 373 Students
University
Stores
Miles/Gallon
• Which one of the following is NOT a good table
name?
–
–
–
–
Women
Order Details
OrderDetails
Locations
Foreign Keys
• Definition: A foreign key is a field (or fields) in
a table that is not the primary key in that
table, but IS the primary key in another table.
• Foreign keys are used for creating
relationships (linking tables together),
something we’ll look at in the next lecture.
Don’t Forget!
• Good relational database design is about
optimizing how the data is STORED, not how it is
DISPLAYED.
• Most “tables” you have seen—in books, in
lectures, on the web—were probably optimized
for display, not for storage.
• Relational database tables are designed for
consistency and to reduce redundancy. They are
not designed for appearance.
• When we learn SQL and Visual Basic, we will look
at various ways to display the data stored in
relational database tables.