Download COSC344 Database Theory and Applications Lecture 1: Introduction

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Serializability wikipedia , lookup

SQL wikipedia , lookup

Microsoft Access wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

IMDb wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Oracle Database wikipedia , lookup

Ingres (database) wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Concurrency control wikipedia , lookup

Database wikipedia , lookup

Relational model wikipedia , lookup

Clusterpoint wikipedia , lookup

ContactPoint wikipedia , lookup

Database model wikipedia , lookup

Transcript
COSC344
Database Theory and Applications
Lecture 1: Introduction
COSC344
Lecture 1
1
Welcome to
COSC 344
Database Theory and Applications
COSC344
Lecture 1
2
Course Goals
•  Introduce the fundamental concepts, principles, and
problems in database.
•  Introduce the principles of designing a database and
using a database management system
•  Learn how to design, implement, optimize the
underlying database management system
What will you get from this course?
Necessary knowledge and fundamental
skills you’ll need in your future career
Theory + Practice
COSC344
Lecture 1
3
Teaching Team
•  Lecturers
Haibo Zhang (Course Coordinator)
2.47, Owheo
479-8534
[email protected]
Yawen Chen
2.46, Owheo
479-5740
[email protected]
•  Oracle Database Administrator
Cathy Chandra
1.21, Owheo
479-58580
[email protected]
COSC344
Lecture 1
4
Course Details
•  Recommended Textbooks
–  Elmasri, R. & Navathe, S., Fundamentals of Database
Systems, (6th edition), 2013
–  Elmasri, R. & Navathe, S., Database Systems: Models,
Languages, Design, and Application Programming, (6th
edition), 2013
•  Course web page
–  http://www.cs.otago.ac.nz/cosc344
•  Consultation
–  Send email first to book a time slot
•  Almost one lab per week (no lab in week 1, 10, and 13)
–  Two streams, only need to attend one
COSC344
Lecture 1
5
Outline of Lectures
Introduction to databases, models
Database design, ER modeling
Relational model
Relational algebra & operator
Functional dependencies
Normalization
View & NULL
SQL, Java & SQL,C & SQL, PHP &SQL
PL/SQL, Triggers
DBMS Architecture & System catalog
Database file and storage
Database indexing (1 & 2)
Database security & auditing
Transactions
Concurrency control
Query optimisation
NoSQL data modelling and databases
COSC344
Lecture 1
Database fundamentals
and design
Database programming
DBMS architecture, design
and optimization
Advanced data models
6
Assessment
•  Three assignments - 40%
–  Friday, 29/7/16 - 10%
–  Friday, 26/8/16 - 15%
–  Monday, 26/9/16 - 15%
Assignments due at 4 PM
Late assignments incur a 10% penalty (of marks) per working day late.
•  Exam - 60%
–  3 hours
–  A mark of at least 50% is required to pass the course
COSC344
Lecture 1
7
A Useful Study Tip
Do homework first
Attend lectures
Lab practice
COSC344
Lecture 1
8
Overview
•  This Lecture
–  Introduction to databases
–  Data models
–  Schemas
–  ANSI/SPARC architecture
–  Database languages
–  Source: Chapters 1, Chapter 2.1 – 2.4
•  Next Lecture
–  Database design
–  ER modelling
–  Source: Chapter 7.1 – 7.8
COSC344
Lecture 1
9
Try It
•  List some ways you encounter databases in everyday
activities.
COSC344
Lecture 1
10
A Motivating Example
•  Suppose we are building a system to store the information
about:
–  students
–  courses
–  lecturers
–  who takes what, who teaches what
•  Allow users to query/update:
–  who teaches “COSC344”, enroll “Mary” in “COSC344”
•  Allow several (100s) users to access data simultaneously
•  Store the data for a long period of time
–  Protect against crashes
–  Protect against unauthorized use
COSC344
Lecture 1
11
Introduction to Databases (1)
•  What is a database?
–  A database is a collection of related data.
•  A database generally has the following implicit properties:
–  Represents some aspect of the real-word, called miniworld
–  Is a coherent collection of data with inherent meaning
–  Is designed, built, and populated with data for a specific
purpose.
•  A database can be of any size and complexity
–  Small database: Address book, student database in CS
–  Large database: Inland Revenue Department database,
Amazon.com,
COSC344
Lecture 1
12
Large Databases
COSC344
Lecture 1
13
Introduction to Databases (2)
•  What is a database management system (DBMS)?
–  A collection of programs that enable to define, construct,
manipulate, and share databases among various users and
applications.
–  Commercial DBMSs: DB2, Oracle, SQL Server …
•  Application program
–  Accesses the database by sending queries and requests for
data to the DBMS
COSC344
Lecture 1
14
A Simplified Database System Environment
Users/Programmers
Database
System
DBMS
Software
Application Programs/Queries
Software to Process
Queries/Programs
Software to Access
Stored Data
Database definition
(Meta-data)
COSC344
Stored Database
Lecture 1
15
An Example Database
COSC344
Lecture 1
16
Characteristics of the Database Approach (1)
•  Contrasting database and file systems
A File System
Registration
Office
CS
Department
Accounting
Office
Students
Courses
Accounts
Registration
Office
CS
Department
A Database System
Students
DBMS
Accounts
Courses
Accounting
Office
COSC344
Lecture 1
17
Characteristics of the Database Approach (2)
•  Self-describing nature of a database system
–  Database + Meta-data (a complete definition of data structure
and constraints stored in DBMS catalog)
•  Insulation between programs and data
–  Program-data independence
–  Program-operation independence
data abstraction
•  Support of multiple views of data
–  A subset of the database
–  Virtual data derived from the database files
•  Sharing of data
–  Concurrency control
–  Online transactions processing
COSC344
Lecture 1
18
Actors on the Scene
•  Database Administrators (DBA)
–  Authorizing access to the database
–  Coordinating and monitoring its use
–  Maintain software and hardware resources
•  Database Designers
–  Identify the data to be stored and choose the appropriate
structure to represent and store the data
•  End Users
•  System Analysts and Application Programmers.
COSC344
Lecture 1
19
Why use a DBMS?
•  Controlling redundancy
•  Providing storage structures for efficient query
processing
•  Providing backup and recovery
•  Improving data security and privacy
•  Providing multiple user interface
•  Representing complex relationships among data
•  Enforcing integrity constraints
COSC344
Lecture 1
20
UML Modeling
User name and password are
stored in an encrypted file.
UserPass private Username Private Password Implement a class to read
user name and password
from the file.
What about
database design?
public UserPass () public String getPassWord() public String getUserName() public class UserPass {
private String password;
private String username;
// Constructor - Also reads the username and password
//
from the file.
public UserPass () {
String line = null;
String passwordFile = "pass.dat";
try {
…
Data Modeling and Data Models
•  Data modeling: the first step in designing a database
–  the process of creating a specific data model for a
determined problem domain
•  A data model is a collection of concepts that can be
used to describe the structure of a database such as
data types, relationships, and constraints that should
hold for the data.
–  a high level of data abstraction by hiding details of the data
storage not needed by most users
•  Definition of a data model - try it
COSC344
Lecture 1
22
Categories of Data Models
•  High-level or conceptual data models
– 
– 
– 
– 
close to the way most users perceive data
entity
attribute
relationship
•  Low-level or physical data models
–  storage details
•  Representational or implementation data models
– 
– 
– 
– 
COSC344
bridge the gap
relational
network
hierarchical
Lecture 1
23
Other Terms
•  Database Schema: the description of the database in a
data model
–  Specified during design
–  Changes infrequently
–  Displayed using schema diagram
•  Meta-data
–  Stores the description of the schema and constraints
–  Stored in the DBMS catalog
•  Occurrence or Instance
–  Database state/Snapshot: the data in the database at a
particular moment in time
COSC344
Lecture 1
24
Example Schema
COSC344
Lecture 1
25
Three-Schema Architecture (1)
•  Goal: to separate the user applications from the
physical database
End Users
External Level
External
View
…
External
View
External/Conceptual Mapping
Conceptual Level
Conceptual Schema
Conceptual/Internal Mapping
Internal Level
COSC344
Also known as the
ANSI/SPARC architecture
Internal Schema
Stored Database
Lecture 1
26
Three-Schema Architecture (2)
•  Internal Schema
–  Uses a physical data model
–  Describes the physical storage structure of the database
•  Conceptual Schema
–  Hides the details of the physical storage structures
–  Concentrates on describing entities, data types, relationships,
user operations, and constraints
•  External Schema
–  Describes the part of the database that a particular user group
is interested in
–  Hides the rest of the database from that user group
•  Mapping
–  Transform requests and results between adjacent levels
COSC344
Lecture 1
27
Data Independence
Defined as the ability to change the schema at one level
of a database system without having to change the
schema at the next higher level
•  Logical data independence
–  The capacity to change the conceptual schema without
having to change external schemas
–  e.g. add/remove record type, change constraints
•  Physical data independence
–  The capacity to change the internal schema without having to
change the conceptual schemas
–  e.g. creating additional access structures
COSC344
Lecture 1
28
DBMS Languages
•  Data definition language (DDL)
–  Used to define the conceptual schema
–  A DDL compiler is used to process DDL statements
•  Storage definition language (SDL)
–  Used to define the internal schema
–  Most relational DBMS do not have specific SDL languages
•  View definition language (VDL)
–  To specify user views and their mapping to conceptual schema
–  Most DBMSs use DDL for both conceptual and external schema
•  Data manipulation language (DML)
–  High-level or nonprocedural DML (set-at-a-time)
–  Low-level or procedural DML (record-at-a-time)
COSC344
Lecture 1
29
Question to Ponder
•  Given some mini-world, how do we design a
database?
–  What do you put into it?
–  How are the pieces interrelated?
COSC344
Lecture 1
30
Assignment Grouping
•  Each group should have 5 or 6 students.
•  Each group must have one group leader.
•  Group leaders send the list of group members to
me via [email protected] asap.
COSC344
Lecture 1
31