CSUN Information Systems
Systems Analysis & Design
http://www.csun.edu/~dn58412/IS431/IS431_F15.htm
Database Design
IS 431: Lecture 9
Database Design in SDLC
Database Design
• Elements of a Database
• Evolution of Database Systems
• DBMS Architecture
• Relational Data Model and Relational Database Systems
Data Fields
A field is the physical implementation of a data
attribute. Fields are the smallest unit of meaningful
data.
A key (identifier) is a field having unique values to
identify one and only one record in a file.
A secondary key is an alternate identifier for a record.
A descriptive field is any other (non-key) field that
stores business data.
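The three field roles above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the lecture: the table employee and its columns emp_id, ssn, and name are hypothetical names, with emp_id as the key, ssn as a secondary key, and name as a descriptive field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,   -- key: identifies exactly one record
        ssn    TEXT NOT NULL UNIQUE,  -- secondary key: alternate identifier
        name   TEXT                   -- descriptive (non-key) field
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, '111-22-3333', 'Ann')")


def try_insert(row):
    """Return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_key = try_insert((1, "444-55-6666", "Bob"))        # same emp_id
duplicate_secondary = try_insert((2, "111-22-3333", "Cy"))   # same ssn
same_name = try_insert((3, "777-88-9999", "Ann"))            # descriptive field may repeat
```

Only the two key fields reject duplicates; the descriptive field accepts a repeated value.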
Records
A record is a collection of fields arranged in a
predefined format.
– Fixed-length record structures
– Variable-length record structures
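The difference between the two record structures can be sketched with Python's struct module. The layout here (a 4-byte item number, a 20-byte name, and a 2-byte length prefix for the variable form) is an assumption chosen for illustration only.

```python
import struct

# Fixed-length layout: 4-byte integer id followed by a 20-byte name field.
FIXED = struct.Struct("<i20s")


def pack_fixed(item_no, name):
    # struct pads the name with zero bytes to exactly 20 bytes.
    return FIXED.pack(item_no, name.encode("ascii"))


def pack_variable(item_no, name):
    # Variable-length layout: id, a 2-byte length prefix, then the name bytes.
    data = name.encode("ascii")
    return struct.pack("<iH", item_no, len(data)) + data


fixed_a = pack_fixed(1, "Bolt")
fixed_b = pack_fixed(2, "Adjustable wrench")
var_a = pack_variable(1, "Bolt")
var_b = pack_variable(2, "Adjustable wrench")

# The n-th fixed-length record starts at n * FIXED.size, so it can be read directly.
file_image = fixed_a + fixed_b
item_no, raw_name = FIXED.unpack_from(file_image, 1 * FIXED.size)
second_name = raw_name.rstrip(b"\x00").decode("ascii")
```

Fixed-length records waste space on short values but allow direct positioning; variable-length records save space but must be scanned or indexed to find the n-th record.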
Files
A file is the set of all occurrences of a given record
structure.
– Types:
  • Master files
  • Transaction files
  • Document files
  • Archival files
  • Table lookup files
  • Audit files
– File organization: indexed, sequential …
– File access: direct, sequential …
Conventional Files vs. the Database
File – a collection of similar records.
– Files are unrelated to each other except in the code of an
application program.
– Data storage is built around the applications that use the
files.
Database – a collection of interrelated files.
– Records in one file (or table) are physically related to
records in another file (or table).
– Applications are built around the integrated database.
Files vs. Database
Pros and Cons of Conventional Files
• Pros:
– Easy to design because of their single-application focus
– Excellent performance due to optimized organization for a single application
• Cons:
– Harder to adapt to sharing across applications
– Harder to adapt to new requirements
– Need to duplicate attributes in several files
Pros and Cons of Databases
• Pros:
– Data independence from applications increases adaptability and flexibility
– Superior scalability
– Ability to share data across applications
– Less and/or controlled redundancy (total non-redundancy is not achievable)
• Cons:
– More complex than file technology
– Somewhat slower performance
– Investment in DBMS and database experts
– Need to adhere to design principles to realize benefits
– Increased vulnerability due to consolidating data in a centralized database
Logical Structure of a Database
[Figure: three levels of a database.
External level: View A, View B, View C.
Conceptual level: Inventory, Sales, Customer, Cash Receipt.
Internal level: INVENTORY RECORD with fields Item_No INT(5), non-null, index=itemx; Description CHAR(20); Cost CURRENCY (6,2).]
Evolution of Database Systems
• File Management (Flat File) Systems
• Hierarchical Databases
• Network Databases
• Relational Databases
• Object-Oriented Databases
• Data Warehouse
File Management Systems
[Figure: Employee Update Program, Employee Report Program, and Check-Writing Program, each containing its own FD entries, shown with the Employee Master File and the Timecard File.]
Hierarchical Databases
[Figure: hierarchy rooted at Car, with components Engine, Body, Chassis, Left Door, Right Door, Hood, Roof, Handle, Window, and Lock.]
Network Databases
[Figure: network of linked records under CUSTOMERS (Acme Mfg., First Corp.), PRODUCTS (Size 4 Widget, 4D Bolt), and ORDERS, with record numbers #11231 to #11235.]
Relational Databases
[Figure: CUSTOMERS (CUST ID) 1:M ORDERS (ORDER #, CUST ID, PRODUCT ID, QUANTITY) M:1 PRODUCTS (PRODUCT ID).]
Object-Oriented Databases
[Figure: three objects, each combining data and methods, with 1-to-many (*) links from CUSTOMERS and PRODUCTS to ORDERS.
CUSTOMERS: data CUST ID, CUST NAME, ADDRESS; methods Add Customer, Drop Customer, Change Customer.
PRODUCTS: data PRODUCT ID, PRICE, QTY-ON HAND; methods Buy Product, Sell Product, New Product.
ORDERS: data ORDER #, CUST ID, PRODUCT ID, QUANTITY; methods Take Order, Drop Order.]
Data Warehouse
Difficulties of Non-Relational Tables
• Update anomaly: a data item stored in many places is
changed in some occurrences but not in all of them.
• Insert anomaly: adding data forces an invalid (null)
record into the database.
• Delete anomaly: information about a deleted record, held
in many places, is not removed everywhere.
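The update and insert anomalies can be reproduced in a small sketch using Python's sqlite3 module. The flat table flat_order and its sample rows are hypothetical; the point is only that the customer's city is stored once per order row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One flat table: the customer's city is repeated on every order row.
conn.execute(
    "CREATE TABLE flat_order (order_no INTEGER, cust_name TEXT, cust_city TEXT)"
)
conn.executemany(
    "INSERT INTO flat_order VALUES (?, ?, ?)",
    [(1, "Acme", "Reno"), (2, "Acme", "Reno")],
)

# Update anomaly: the change reaches only one of the two copies.
conn.execute("UPDATE flat_order SET cust_city = 'Tulsa' WHERE order_no = 1")
cities = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT cust_city FROM flat_order "
        "WHERE cust_name = 'Acme' ORDER BY cust_city"
    )
]

# Insert anomaly: a customer with no order yet needs a row with a null order number.
conn.execute("INSERT INTO flat_order VALUES (NULL, 'Zenith', 'Boise')")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM flat_order WHERE order_no IS NULL"
).fetchone()[0]
```

After the partial update the same customer appears with two different cities, which is exactly the inconsistency a separate customer table would prevent.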
Relational Data Model
• Table (Relation)
– Row: specific occurrence of an entity
– Column: characteristic of an entity; must be single-valued
and of the same data type for all occurrences
• Primary Key: unique identifier of an occurrence of an entity
(entity integrity)
• Foreign Key: links tables together; must be null or
correspond to the value of a primary key in another table
(referential integrity)
Requirements of Relational Data Model
• The primary key must be unique.
• A foreign key must either be null or correspond to an
existing value of the primary key of another table (the
table it links to).
• Each column describes a characteristic of the object
identified by the primary key.
• Each column is single-valued (e.g., not “$100, $250”).
• Values in the rows of a given column must be of the same
data type.
• Column order and row order are unimportant.
Database Integrity
• Entity integrity: an error exists when identifiers
(primary keys) are not unique and so cannot identify specific
instances of the entity (two customers have the same
identification).
• Referential integrity: an error exists when a foreign key
value in one table has no matching primary key
value in the related table (a sale to a non-existent
customer).
• Domain integrity: an error exists when a field value is
outside the allowed range (a pencil costs $100,000!).
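Each of the three integrity rules can be declared as a constraint so that the database rejects the error itself. The following is a sketch using Python's sqlite3 module; the tables customer and sale, and the price limit of 1000, are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY                        -- entity integrity
    );
    CREATE TABLE sale (
        sale_id INTEGER PRIMARY KEY,
        cust_id INTEGER REFERENCES customer (cust_id),     -- referential integrity
        price   REAL CHECK (price > 0 AND price < 1000)    -- domain integrity
    );
    INSERT INTO customer VALUES (1);
    """
)


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


entity_error = violates("INSERT INTO customer VALUES (1)")            # duplicate identifier
referential_error = violates("INSERT INTO sale VALUES (1, 99, 5.0)")  # non-existent customer
domain_error = violates("INSERT INTO sale VALUES (2, 1, 100000.0)")   # price out of range
valid_sale = not violates("INSERT INTO sale VALUES (3, 1, 0.25)")
null_fk_allowed = not violates("INSERT INTO sale VALUES (4, NULL, 0.25)")
```

The last line shows that a null foreign key is accepted, in line with the rule that a foreign key must be null or match an existing primary key.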
Data Architecture
A business’s data architecture defines how that business will
develop and use files and databases to store all of the
organization’s data; the file and database technology to be
used; and the administrative structure set up to manage the
data resource.
Data is stored in some combination of:
– Conventional files
– Operational databases (also called transactional databases)
– Data warehouses
– Personal databases
– Work group databases
A Modern Data Architecture
[Figure: legacy file-based information systems (built in-house and purchased) using conventional files; information systems (built in-house and purchased) using operational databases; a data warehouse accessed through end-user tools; and personal and work-group databases with end-user applications. Users and programmers interact with each system.]
Database Architecture
Database architecture refers to the database technology
including the database engine, database utilities, CASE tools,
and database development tools.
A database management system (DBMS) is specialized
software that is used to create, access, control, and manage the
database. The core of the DBMS is a database engine.
– A data definition language (DDL) is that part of the engine used to
physically define tables, fields, and structural relationships:
(CREATE TABLE; ALTER TABLE …)
– A data manipulation language (DML) is that part of the engine used
to create, read, update, and delete records in the database, and
navigate between different files (tables) in the database: (INSERT;
SELECT… FROM …WHERE…)
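The split between DDL and DML can be seen in a short sketch using Python's sqlite3 module. The inventory table reuses the Item_No, Description, and Cost fields from the earlier slide; the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and then alter the structure of a table.
conn.execute("CREATE TABLE inventory (item_no INTEGER PRIMARY KEY, description TEXT)")
conn.execute("ALTER TABLE inventory ADD COLUMN cost REAL")

# DML: create, read, update, and delete records.
conn.execute("INSERT INTO inventory VALUES (1, 'Widget', 2.50)")
conn.execute("INSERT INTO inventory VALUES (2, 'Bolt', 0.10)")
conn.execute("UPDATE inventory SET cost = 3.00 WHERE item_no = 1")
conn.execute("DELETE FROM inventory WHERE item_no = 2")
rows = conn.execute(
    "SELECT item_no, description, cost FROM inventory WHERE cost > 1"
).fetchall()
```

The DDL statements change what the table looks like; the DML statements change and read what it contains.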
Typical DBMS Architecture
Goals of Database Design
• A database should provide for efficient storage,
update, and retrieval of data.
• A database should be reliable: the stored data should
have high integrity and promote user trust in that data.
• A database should be adaptable and scalable to new
and unforeseen requirements and applications.
Normalization Revisited
• First normal form (1NF):
– No repeating groups of the same attribute.
– If not: create a new entity for the repeating group.
• Second normal form (2NF):
– Attributes depend on the whole (composite) key, not on part of it.
– If not: create a new entity for the partially dependent
attributes.
• Third normal form (3NF):
– Attributes depend on the (primary) key only, not on each
other.
– If not: create a new entity for the attributes that depend
on other non-key attributes.
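The three steps can be walked through on a tiny data set. This is a sketch in plain Python with made-up order data; the names orders_raw, order_lines, products, and customers are hypothetical and each dictionary stands in for one normalized entity.

```python
# Unnormalized order records: 'items' is a repeating group.
orders_raw = [
    {"order_no": 1, "cust_no": 7, "cust_name": "Acme",
     "items": [("P1", "Widget", 3), ("P2", "Bolt", 10)]},
    {"order_no": 2, "cust_no": 7, "cust_name": "Acme",
     "items": [("P1", "Widget", 1)]},
]

orders = {}       # order_no -> cust_no
order_lines = {}  # (order_no, product_no) -> quantity
products = {}     # product_no -> description
customers = {}    # cust_no -> cust_name

for rec in orders_raw:
    # 3NF: cust_name depends on cust_no, not on order_no, so it moves to CUSTOMER.
    customers[rec["cust_no"]] = rec["cust_name"]
    orders[rec["order_no"]] = rec["cust_no"]
    for product_no, description, qty in rec["items"]:
        # 1NF: each occurrence of the repeating group becomes a row of a new entity.
        order_lines[(rec["order_no"], product_no)] = qty
        # 2NF: description depends on product_no alone (part of the composite key).
        products[product_no] = description
```

After the split, each fact (a customer's name, a product's description) is stored exactly once.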
Database Models
An Entity Relationship Diagram (ERD) is the
logical model of the data requirements (the data to be
stored in the system and their relationships).
A Database Schema (physical data model) is the
blueprint of the planned implementation of the logical
model.
One can also use a relational data model as a
blueprint for relational database design.
Physical Data Types
Logical Data Type (to be stored in field) | MS Access | Microsoft SQL Server | Oracle
Fixed length character data (use for fields with relatively fixed length character data) | TEXT | CHAR (size) or character (size) | CHAR (size)
Variable length character data (use for fields that require character data but for which size varies greatly, such as ADDRESS) | TEXT | VARCHAR (max size) or character varying (max size) | VARCHAR (max size)
Very long character data (use for long descriptions and notes; usually no more than one such field per record) | MEMO | TEXT | LONG VARCHAR or LONG VARCHAR2
Integer number | NUMBER | INT (size) or integer or smallinteger or tinyinteger | INTEGER (size) or NUMBER (size)
Decimal number | NUMBER | DECIMAL (size, decimal places) or NUMERIC (size, decimal places) | DECIMAL (size, decimal places) or NUMERIC (size, decimal places) or NUMBER
Financial number | CURRENCY | MONEY | see decimal number
Date (with time) | DATE/TIME | DATETIME or SMALLDATETIME, depending on precision needed | DATE
Current time (use to store the date and time from the computer’s system clock) | not supported | TIMESTAMP | not supported
Yes or No; or True or False | YES/NO | BIT | use CHAR(1) and set a yes or no domain
Image | OLE OBJECT | IMAGE | LONGRAW
Hyperlink | HYPERLINK | VARBINARY | RAW
Can designer define new data types? | NO | YES | YES
Data Distribution
Data distribution analysis establishes which business
locations need access to which logical data entities and
attributes.
• Centralization
• Horizontal distribution (data partitioning by rows or records)
• Vertical distribution (data partitioning by columns or attributes)
• Replication
Data Distribution Options
• Store all data on a single server (Centralization)
• Store specific tables on different servers
• Store subsets of a specific table on different servers
– Subsets of columns (Vertical Partitioning)
– Subsets of rows (Horizontal Partitioning)
• Replicate (duplicate) specific tables or subsets on
different servers for backup (Replication)
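The partitioning options can be illustrated on an in-memory table. This is a sketch in plain Python, not a distributed database: the customers rows, the region and credit_limit columns, and the helper function names are all hypothetical.

```python
customers = [
    {"cust_id": 1, "name": "Acme", "region": "West", "credit_limit": 5000},
    {"cust_id": 2, "name": "First Corp.", "region": "East", "credit_limit": 9000},
    {"cust_id": 3, "name": "Zenith", "region": "West", "credit_limit": 1200},
]


def horizontal_partition(rows, column):
    """Split rows into subsets of whole records, one subset per value of `column`."""
    parts = {}
    for row in rows:
        parts.setdefault(row[column], []).append(row)
    return parts


def vertical_partition(rows, key, columns):
    """Keep only `columns` of every record; the key is repeated so parts can be rejoined."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


by_region = horizontal_partition(customers, "region")
sales_part = vertical_partition(customers, "cust_id", ["name", "region"])
credit_part = vertical_partition(customers, "cust_id", ["credit_limit"])

# Replication: a full copy of the table kept on another server.
replica = [dict(row) for row in customers]
```

Horizontal partitions hold complete rows for one location each, while vertical partitions hold different columns of the same rows and share the key so they can be joined back together.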
Database Distribution
[Figure: vertical partitioning]
[Figure: horizontal partitioning]
Relational Databases
Relational databases implement stored data in a
series of two-dimensional tables that are “related” to
one another via foreign keys.
– The physical data model is called a schema.
– The DDL and DML for a relational database are both provided by
SQL (Structured Query Language).
– Triggers are programs attached to a table that are
automatically invoked when its rows are inserted, updated, or
deleted; they often update another table in response.
– Stored procedures are programs stored in the database
that can be called from an application program.
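A trigger can be sketched with Python's sqlite3 module. The tables product and order_line and the trigger name reduce_stock are hypothetical. SQLite supports triggers but has no stored procedures, so only the trigger is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, qty_on_hand INTEGER);
    CREATE TABLE order_line (order_no INTEGER, product_id INTEGER, quantity INTEGER);

    -- Trigger: runs automatically whenever a row is inserted into order_line.
    CREATE TRIGGER reduce_stock AFTER INSERT ON order_line
    BEGIN
        UPDATE product
           SET qty_on_hand = qty_on_hand - NEW.quantity
         WHERE product_id = NEW.product_id;
    END;

    INSERT INTO product VALUES (1, 50);
    INSERT INTO order_line VALUES (100, 1, 8);
    """
)
qty_left = conn.execute(
    "SELECT qty_on_hand FROM product WHERE product_id = 1"
).fetchone()[0]
```

The application only inserts the order line; the stock adjustment happens inside the database without any extra application code.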
A Method for Database Design
1. Review the logical data model.
2. Create a table for each entity.
3. Create fields for each attribute.
4. Create an index for each primary and secondary key.
5. Create an index for each subsetting criterion.
6. Designate foreign keys for relationships.
7. Define data types, sizes, null settings, domains, and defaults
for each attribute.
8. Create or combine tables to implement supertype/subtype
structures.
9. Evaluate and specify referential integrity constraints.
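Steps 2 to 5 and 7 can be sketched for a single entity using Python's sqlite3 module. The member table, its columns, and the index names are hypothetical, and the choice of city as a subsetting criterion is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Steps 2, 3, 7: a table for the entity, a field per attribute, with data
    -- types, null settings, a domain (CHECK), and a default.
    CREATE TABLE member (
        member_no INTEGER PRIMARY KEY,
        email     TEXT NOT NULL,
        status    TEXT NOT NULL DEFAULT 'active'
                  CHECK (status IN ('active', 'inactive')),
        city      TEXT
    );
    -- Step 4: index for a secondary key (SQLite already provides fast
    -- lookup on an INTEGER PRIMARY KEY without a separate index).
    CREATE UNIQUE INDEX member_email_idx ON member (email);
    -- Step 5: index for a subsetting criterion (queries that select by city).
    CREATE INDEX member_city_idx ON member (city);

    INSERT INTO member (member_no, email, city) VALUES (1, 'a@example.com', 'Reno');
    """
)
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('member')"))
default_status = conn.execute(
    "SELECT status FROM member WHERE member_no = 1"
).fetchone()[0]
```

The row was inserted without a status, so the declared default fills it in.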
From Logical Data Model
Customer (Cust No) (1,1) -- place -- (1,M) Order (Order No) (1,M) -- contain -- (1,N) Product (Product No)
CUSTOMER (Cust No, ….)
ORDER (Order No, Cust No, ….)
PRODUCT (Product No,…)
ORDER-PRODUCT (OrderNo, ProductNo, …)
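The four relations above can be turned into tables as follows. This is a sketch using Python's sqlite3 module; only the key columns shown on the slide are created, the elided attributes are left out, and the integer types and sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE customer (cust_no INTEGER PRIMARY KEY);
    CREATE TABLE product  (product_no INTEGER PRIMARY KEY);
    -- ORDER is a reserved word in SQL, so the table name is quoted.
    CREATE TABLE "order" (
        order_no INTEGER PRIMARY KEY,
        cust_no  INTEGER NOT NULL REFERENCES customer (cust_no)
    );
    -- ORDER-PRODUCT resolves the many-to-many "contain" relationship.
    CREATE TABLE order_product (
        order_no   INTEGER REFERENCES "order" (order_no),
        product_no INTEGER REFERENCES product (product_no),
        PRIMARY KEY (order_no, product_no)
    );
    INSERT INTO customer VALUES (1);
    INSERT INTO product VALUES (10), (20);
    INSERT INTO "order" VALUES (500, 1);
    INSERT INTO order_product VALUES (500, 10), (500, 20);
    """
)
products_of_customer = [
    row[0]
    for row in conn.execute(
        """
        SELECT op.product_no
          FROM "order" AS o
          JOIN order_product AS op ON op.order_no = o.order_no
         WHERE o.cust_no = 1
         ORDER BY op.product_no
        """
    )
]
```

The foreign keys carry the relationships: Cust No in ORDER implements "place", and the composite key of ORDER-PRODUCT implements "contain".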
to Physical Data Model (Relational Schema) …
… to “MS Access” Implementation.
Database Components
[Figure: Form Builder, Report Writer, Interactive Query Tool, and Application Program make up the database front-end; they work through the Database Engine with the Database; a Database Gateway connects to other computer systems and to other DBMS brands.]
The Role of SQL
[Figure: Form Builder, Report Writer, Interactive Query Tool, Application Program, Database Engine, Database, and Database Gateway, with SQL labeled on every link between the front-end tools, the engine, the gateway, other computer systems, and other DBMS brands.]
Administrators
A data administrator is responsible for the data
planning, definition, architecture, and management in
the organization. (CIO)
Database administrators are responsible for the
database technology, database design and
construction, security, backup and recovery, and
performance tuning. (Supervisors)