Download Course Review

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

IMDb wikipedia , lookup

Tandem Computers wikipedia , lookup

Serializability wikipedia , lookup

Relational algebra wikipedia , lookup

Microsoft Access wikipedia , lookup

Oracle Database wikipedia , lookup

Ingres (database) wikipedia , lookup

Concurrency control wikipedia , lookup

Functional Database Model wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Database wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Clusterpoint wikipedia , lookup

SQL wikipedia , lookup

Database model wikipedia , lookup

PL/SQL wikipedia , lookup

Relational model wikipedia , lookup

Transcript
Course Review
We’ve made it to Lecture 10
This is the last lecture in the Course CO24
Sem 1 2002 10 / 1
Lecture 10 - Review of Subject Material
This is the final lecture for Semester 1, 2002
I will introduce the major components of a Database
Management System as a means of reviewing much of the
material
Then we will move onto a slightly wider subject review
Sem 1 2002 10 / 2
Sem 1 2002 10 / 3
Some final thoughts on SQL
IBM’s DB2 is overtaking Oracle’s DBMS 8.0.n and 8i as the
DBMS of the period.
An SQL analyst who also is a member of X3H2, a committee
set up to determine amendments and extensions to SQL and
which then become ANSI standard (if they are approved
etc.), was asked why this migration and customer redirection
was occurring. His name is Joe Celko.
Sem 1 2002 10 / 4
Some final thoughts on SQL
His comments ?
DB2 products are now more uniform across their platforms.
This means that the same database occurs at all levels in a
company
Oracle runs on many platforms but is retrofitted and not
designed from the start for that niche.
DB2 has a better Optimiser and other internals (read indexing
and join functions)
Sem 1 2002 10 / 5
Course Review
• We will look briefly at the material covered in the preceding
lectures
• Introduce some more material on SQL
• and have a few thoughts about the examination
Sem 1 2002 10 / 6
Introduction to SQL, Database Design
• Data, Information, Need for Information
• Database, Database Management Systems
• Commercial Models - Hierarchical, Network, Relational
• Business Activities, Functions, Processes
• Need for Information to Manage Resources
Sem 1 2002 10 / 7
Introduction to SQL, Database Design
• Need for Information to be
– Accurate, Representative, Timely, Adequately Based,
– Protected against Loss, Corruption, Unauthorised Access
• Communication capabilities
Sem 1 2002 10 / 8
Introduction to SQL, Database Design
• Designing a database
–
–
–
–
–
User requirements - current and future
Information outlines
Sources of data
Processes to be applied
Restrictions on data occurrences, either singly or
when associated with other data
Sem 1 2002 10 / 9
Introduction to SQL, Database Design
• Development of LOGICAL model
– External Schema - User views of data
– Conceptual Schema - Composite of all user views and
requirements
• Development of Entity Relationship Diagram
– Shows the ‘natural grouping’ of data
– Shows the relationships between the ‘natural groups’
– Shows the modality and cardinality (1:M, optional,
mandatory, number of sets which are related)
Sem 1 2002 10 / 10
Introduction to SQL, Database Design
Designing a database (cont’d)
– User Rules for existence and operations on data
– Dependencies of data
– Life of data
– Value or worth of data
Sem 1 2002 10 / 11
Introduction to SQL, Database Design
Determining what ‘data sets’ exist (entities)
Determining the relationships between these sets
– Optional
– Mandatory
– 1 to 1 (1:1) , 1 to Many (1:M)
or Many to Many (M:N)
– Unary, binary or ternary relationships
– Special conditions for existence of data
Sem 1 2002 10 / 12
Introduction to SQL, Database Design
Expanding the Entities to Attributes
– Conditions of existence for attributes
– Dependencies of attributes on a determinant
– Primary Key (and its characteristics)
Primary - Foreign Key occurrences (and referred to the
Entity Relationships)
Performance of the relationships on Insert, Update,
Modification of the Primary and Foreign keys
Sem 1 2002 10 / 13
Introduction to SQL, Database Design
What are the attributes of each set ?
Data Structure Diagram
Primary Key(s), Foreign key(s)
Nullable, non nullable attributes
Suspicion of the effect of update, insert, delete operations
Sem 1 2002 10 / 14
Introduction to SQL, Database Design
Normalisation - dependencies of attributes on the Primary
Key
Special conditions which could restrict some data
occurrences
Self checking features ?
Long term storage requirements
Sem 1 2002 10 / 15
Introduction to SQL, Database Design
Access considerations - File Organisation and
Access Methods
Indexing - Changing Structure - Secondary Indexing
Performance Requirements
Logical Path Access evaluation - Data Base and Equipment
Loading
Growth, Volatility of Database - Scalability Aspects
Sem 1 2002 10 / 16
Introduction to SQL, Database Design
Transaction Processing - Transaction Manager, Scheduler,
Recovery Manager
Stages of access to the physical database
Effect of power, software and hardware interruptions
Checkpoint, Logging facilities
Restart / Recovery / Backup outline
Sem 1 2002 10 / 17
Introduction to SQL, Database Design
Concurrency Aspects - multi user (transaction) control
Time stamping, Locks
Adverse effects of no or poor concurrency (optimistic and
pessimistic)
Integrity Considerations - Entity, Referential, Domain,
Security of
- hardware, communications, data
access limitations, authorisations
Sem 1 2002 10 / 18
Introduction to SQL, Database Design
Use of Data Dictionary information about data
Optimisation
- query execution plan
Data Base Administration
Distributed database
Data Warehousing
Data Mining
Features of a database management system
Sem 1 2002 10 / 19
Introduction to SQL, Database Design
SQL - what is it ? What is its purpose ?
The commands : Create (object), delete, drop,
insert, update,
Querying with SQL
– Select (list of attributes)
– From (named tables)
– Where (conditions for selection, including nested queries)
– Group By (selection of aggregations
– Having (condition for Grouping)
– Order by (viewing sequence)
Sem 1 2002 10 / 20
Introduction to SQL, Database Design
Operators :
> < = !=
in, between
exists, not exists
any, not any
AND OR
NOT
Avg, max, min, std, sum - set functions
Correlation - multiple table query
Sem 1 2002 10 / 21
An Introduction to SQL - Some Terms
In the next few overheads, there will be some terms and
explanations which should help you to make the transition
from the methods of data storage and file processing to that
of the relational database style of storage and processing of
data.
Sem 1 2002 10 / 22
More on Tables
Attributes which make up a row in a table are always scalar
or atomic data types
They are complete in themselves and cannot be broken into
smaller parts
Attributes can have constraints placed on them when
– the table is created
– or when the table is altered
These constraints are part of the database
Unlike spreadsheet cells, an attribute set of values can only
hold data.
Sem 1 2002 10 / 23
Schemas and Tables
A Schema : is the data, the operators and the rules of the
defined database
Tables can be permanent (base tables) or virtual (views)
Any operation in SQL will return a result table
Sem 1 2002 10 / 24
More Terms
Are you ready for a few more terms ?
Try these :
Scalar Values : This is a value which is not an array - one
dimension. 1 is a scalar value, 1,4 is not -it’s an array.
Literals : This is a constant which is referred to by its value.
Literal 3 is ‘the integer data type having the value 3’
Column Expression : This is a calculation which will return a
value for any data item in that attribute - also known as a
‘derived value’.
Sem 1 2002 10 / 25
A Transaction
A transaction is a sequence of SQL statements which Oracle
treats as a single unit
The set of changes is made permanent with the Commit
statement
Part or all of a transaction can be undone with the Rollback
statement
A transaction starts with the execution of the first SQL
statement in the transaction and ends with either the Commit
or Rollback statement
Sem 1 2002 10 / 26
A Transaction
Oracle guarantees that a transaction has statement-level
read consistency (the data stays the same while Oracle is
gathering and returning it)
If a transaction has multiple queries, then each query is
consistent, but not with each other
Transaction-level read consistency can be achieved with the
Set Transaction Read Only - (queries only)
Sem 1 2002 10 / 27
Some Guidelines
1. The name allocated to a table, known as the table name,
must be unique in the schema.
2. The attribute names used in a table must be unique in that
table
3. It is good practice NOT to have a table name repeated as an
attribute name in the same table (e.g. a table named
Personnel could have an attribute ‘employee’ - not an
attribute named ‘personnel’.)
4. It is good practice to use the same name for an attribute
which occurs in other tables.
5. It is good practice to have related attributes in natural order e.g. Person’s name, title, address, salary rather than
address, person’s name, title, salary.
Sem 1 2002 10 / 28
Some Thoughts about Keys
A KEY is an attribute (or a set of attributes) in a table which
uniquely identifies each row in that table
There are several types of keys :
1. A candidate key - this is one of the ‘possible’ keys in a
table. A an example, a motor car has a Registered Number,
and also a Vehicle identification Number (VIN).
2. A surrogate key is a key with no meaning in itself. They
are normally generated by the database system using some
form of number generator. Users rarely see these values except in Microsoft Access
Sem 1 2002 10 / 29
Some Thoughts about Keys
Intelligent Keys : have a meaning. Longitude and Latitude of
an object (say a building) uniquely identifies a building and
also nominates where it is.
Super Key : This is normally a key with too many attributes in
it. Some of the columns could be removed and a key would
still exist. A good example is a key which contains ‘old
value/new value’ such as ‘old telephone number/new
telephone number’ when numbers are upgrades or altered.
A Primary Key : The unique, not null expression of the
access to each row in a database table - normally
constrained by a domain set.
Sem 1 2002 10 / 30
Embedded SQL
In commercial processing, SQL normally is between the
database and a host programming language, passing data in
either a forward or backward direction ( accepting new data,
or returning results of queries.)
The host program communicates with the ‘outside world’
There are also PC tools which are designed to use SQL PowerBuilder, Designer2000, Developer2000, Delphi are
examples.
Sem 1 2002 10 / 31
Embedded SQL
There are 3 problems
1. SQL has a rich or extended set of datatypes which might not
match those of the host program
2. The ‘Null’ may not exist in the host program language
3. SQL uses a set model of data - not a file model.
A file is processed (read, updated, deleted) one record at a
time
A set in SQL is read, updated, deleted ‘all at once’
Embedded SQL handles these problems.
Sem 1 2002 10 / 32
Embedded SQL
• Embedded SQL provides for SQL statements to be
recognised by the keywords ‘exec sql’ preceding the SQL
commands.
• A precompiler converts the SQL statements into host
language procedure calls.
• The database calls are part of an Application Program
Interface library
Sem 1 2002 10 / 33
New SQL Commands and Features
The OLAP functions proposed for SQL-99 are
ceiling
percentile_cont
regr_slope
corr
percent_rabk
regr_sxx
covar_pop power
regr_sxy
covar_samp range
regr_syy
cume_dist rank
row_number
dense_rank regr_avg
sqrt
exp
regr-avgx
stddev_pop
floor
regr_agvy
stddev_samp
ln
regr_count
car_pop
moving_avg regr_intercept
var_samp
moving_sum regr_r2
Sem 1 2002 10 / 34
New SQL Commands and Features
The principal statistical functions are ;
correlation
cumulative distribution
percentile
standard deviation
covariance
ranking
linear regression
variance
Sem 1 2002 10 / 35
Business Intelligence
Analysis is a fundamental to Business Intelligence
For example :Is there a correlation between the frequency of Web site visits
(hits) and customer profitability
Who would be the ‘interested parties’ to the results of this
analysis ?
Sem 1 2002 10 / 36
Introduction to SQL, Database Design
Some other features:
Not all DBMS’s offer the same facilities
Not all SQL software supplier versions are the same
Some interesting differences : data types, (money, currency,
number - with or without ‘decimal’ provision)
Nulls and their processing capabilities
Aggregate functions especially in the handling of nulls
Sem 1 2002 10 / 37
Some Commands You Have Met
• Create : table, view, index, sequence (also create as select
..)
• Drop : table, view, index, sequence
• Insert : Adds a single row to an existing table
• Delete : Deletes a row or rows or all contents from an
existing table
• Update : Updates a row or rows
• SqlLoad : Loads bulk data from an external file to the current
database table (sqlldr80 for Personal Oracle)
Sem 1 2002 10 / 38
Some Commands You Have Met
• Alter : With care, provides alteration to some of the table
attributes and properties
• Copy : Copies data from 1 database to a named database
• Export : Copies data from a named table in a database to the
Operating system level
• Import : Copies data from the Operating system level to a
named database
• Union : Combines 2 tables into 1
Sem 1 2002 10 / 39
The Query Form
• The form of a Query :
Select
(list of attributes)
From
(table, or list of table names)
Where
(conditions of selection)
Group By (used for aggregate functions)
Having (conditions of selection of the Group By)
Order By (determines the order of content displayed from
the Result table - Ascend, Descend on a named
attribute)
Sem 1 2002 10 / 40
Operators and Predicates
• Arithmetic :
+ - / * **
• Logical
And
:
Or
• Comparison Predicates :
Not
= > < <=
>= <> != ^=
• Like Predicate : Pattern matching or Not matching
• Between Predicate : Numeric valued constraints
• In Predicate : String or numeric lists
Sem 1 2002 10 / 41
More Predicates
• Any
• All
• Exists
Sem 1 2002 10 / 42
Constraints
• Null
(and Not null)
• Unique
• Check (attribute property limit - In, Between )
Aggregate Functions
Count, Sum, Average, Minimum, Maximum
and don’t forget the ‘additional’ functions in SQL99
Sem 1 2002 10 / 43
Data Types
• Varchar2(number) e.g. varchar2(15)
• Number(p,s) e.g. number (10,2) 8 whole number values, 2
decimal values)
• Date
• Blob - Binary large object
• Clob - Compressed binary large object
• Other datatype supported by Microsoft - currency, long and
short integer, text, various temporal options
Sem 1 2002 10 / 44
The Exam Paper
A few words about the exam :
The paper consists of 2 parts
You should read the paper completely before you attempt to
answer anything
Use the text and the overheads and notes which you received
during the course
Sem 1 2002 10 / 45
The Examination
In answering questions, use
• diagrams
• text
• note
• point form - whichever you feel best suits
• examples
(Not Part A)
Sem 1 2002 10 / 46
The Examination
Exam in
Part A 20 multiple choice questions
Part B
(40)
6 short essay type questions
Answer any 3
(60)
- includes some SQL scripts
- short thesis type questions
Only answer the number of questions required.
Sem 1 2002 10 / 47
The Examination
• Make sure your Name is on all materials handed in
• Send in via email or post or deliver all required materials
• Responses returned by Friday 21st June, 2002
Sem 1 2002 10 / 48
Introduction to SQL, Database Design
Sem 1 2002 10 / 49