Download (Student#).

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Big data wikipedia , lookup

Data Protection Act, 2012 wikipedia , lookup

Concurrency control wikipedia , lookup

Versant Object Database wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Data center wikipedia , lookup

Data model wikipedia , lookup

Data analysis wikipedia , lookup

SAP IQ wikipedia , lookup

Forecasting wikipedia , lookup

Database wikipedia , lookup

3D optical data storage wikipedia , lookup

Relational model wikipedia , lookup

Information privacy law wikipedia , lookup

Data vault modeling wikipedia , lookup

Clusterpoint wikipedia , lookup

Business intelligence wikipedia , lookup

Database model wikipedia , lookup

Transcript
An Introduction to Database
and Database Designing
R C Goyal
Principal Scientist
IASRI, New Delhi
Databases Before the Use of
Computers
Data kept in books, ledgers, card files,
folders, and file cabinets
Long response time
Labor-intensive
Often incomplete or inaccurate
Data & Information
Data known facts that can be
recorded and that has implicit
meaning
Information processed data
Meaningless data becomes
information when it is processed
and presented to the decision
maker in a meaningful way
Database
 A set of data that is required for a specific
purpose or is fundamental to a system,
project, enterprise, or business. A formally
structured collection of data.
 A database may consist of one or more data
banks and be geographically distributed
among several repositories.
 In automated information systems, the
database is manipulated using a database
management system.
File Mgmt vs. Database Mgmt
File Management
Each data entity is in a separate file
system for creating, retrieving and
manipulating files.
Database Management
Same file but data elements are integrated
and shared among different files
Program controls the structure of a
database and access to data
Database Management for
Strategic Advantage
We live in the Information Age
Information used to make organizations
more productive and competitive
Databases used to support business
operations
Databases used across a range of
applictions
Personal, department, enterprise, web
Technical Advantages of
Database Management
Reduced data redundancy
Improved data integrity
More program independence
Increased user productivity
Increased security
Disadvantages of DBMS
Cost issues
Implementation & maintenance issues
Security issues
Privacy issues
Features of DBMS
Database Engine
heart of DBMS
stores, retrieves and updates data
enforces business rules
Data dictionary
holds definitions of all of the data tables
Describes the data type, allows DBMS to
keep track of data, helps user find data
they need
Features of DBMS…Cont.
Query processor
Enables users to store and retrieve data
Use English command that use such words
as SELECT, DELETE, MODIFY
Report generator
Formats and prints reports after user uses
query processor
Features of DBMS…Cont.
Forms generator
Application generator
Access security
Setup access privileges to protect data
from unauthorized access and sabotage
System recovery
Database Development Cycle











Database planning
System Definition
Requirements collection and analysis
Database design
DBMS selection
Application design
Prototyping
Implementation
Data Conversion and loading
Testing
Operational Maintenance
Database Life Cycle
Database Planning
Systems Definition
Requirements Collection
and analysis
DBMS
Selection
Database Design
Application
Design
Implementation
Data Conversion and loading
Prototyping
Testing
Evaluation & Maintenance
Source: http://www.cs/ucf.edu/courses/cgs2545/CH02/index.htm
Database Planning
Current systems evaluation
Development of standards
Technological feasibility
Operational feasibility
Economical feasibility
Systems definition
Data dictionary
Metadata
Requirements Collection and Analysis
 identifying management information
requirements,
 determining information requirements by
functional area,
 and establishing hardware and software
requirements
Database Design
Conceptual design
Logical design
Physical design
Database design methodology
A structured approach that uses
procedures, techniques, tools, and
documentation aids to support and
facilitate the process of design.
* Conceptual database design
* Logical database design
* Physical database design
Critical Success Factors in
Database Design
 Work interactively with the users as much as
possible.
 Follow a structured methodology throughout the
data modelling process.
 Incorporate structural and integrity
considerations into the data models.
 Combine conceptualisation, normalisation, and
transaction validation techniques into the data
modelling methodology.
Design Considerations - Tips
 Planning
 Tables
 Queries
 Forms
 Reports
 Macros
 Modules
 General
Tips - Planning
 Before you start up the computer to build a
database, PLAN ON PAPER!!!
 Gather all of the paper forms you use to collect
your data
 Interview the people who will be using the
database, from those who conduct the field work
to those who analyze the data
 Think about what you want to get back out of
your database, not just about how to get data in
 Steal - see if it’s been done before
Tips - Tables
 Normalize, normalize, normalize
 Use field properties to help maintain data
integrity
 When possible limit the possible entries in a
field to a lookup list (domain)
 Choose your key fields carefully
 Index fields that you will search often
 Many-to-many relationships in Access
 Steal - link to existing tables
Normalization
The process of breaking down large
tables into smaller ones by removing all
unnecessary or duplicate fields,
eliminating redundant data, and making
sure that each table represents only
one thing.
Why Normalize?
If you don’t normalize, you may run into
anomalies (unexpected results)
Deletion anomaly - deleting a record
unexpectedly removes a value we wanted to
keep
Insertion anomaly - we can’t add a record
because we don’t know the value of all of the
required fields
Change anomaly - one change must be
applied to many records in a single table
Non-normalized Table
Typical flat-file table
Student# StName
1022 Rajat
4123 Patel
AdvName AdvRoom Class1 Class2 Class3
Goyal
412 101-07 143-01 159-02
Gupta
216 202-01 211-02 214-01
 Contains all data in one record.
 Contains repeating fields.
 Contains information on more than one thing.
First Normal Form
Eliminate repeating fields within tables.
Create a separate table for each set of
related data.
Identify each set of related data with a
primary key.
First Normal Form
Student#
1022
1022
1022
4123
4123

StName
Rajat
Rajat
Rajat
Patel
Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
No Repeating Fields: Class# is now a single field instead of repeating for each class a
student takes.

Now multiple records for each student because >1 record is needed to accommodate all
classes for each student
Non-normalized Table
Student# StName
1022 Rajat
4123 Patel
AdvName AdvRoom Class1 Class2 Class3
Goyal
412 101-07 143-01 159-02
Gupta
216 202-01 211-02 214-01
Second Normal Form
Create separate tables for values that
apply to multiple records.
Relate these tables with a foreign key.
Second Normal Form
Students:
Student# StName
1022 Rajat
4123 Patel
Student#
Classes:
1022
1022
1022
4123
4123
4123
AdvName AdvRoom
Goyal
412
Gupta
216
Class#
101-07
143-01
159-02
210-01
211-02
214-01
Eliminate Redundant Data: no longer repeating
student name, advisor, and adv-room for each class
Third Normal Form
Eliminate fields that do not depend on
the key.
Student# StName
Students:
1022 Rajat
4123 Patel
AdvName AdvRoom
Goyal
412
Gupta
216
 AdvRoom is independent of the key field
(Student#).
 It does not depend on the student, it depends on
the advisor.
Third Normal Form
Students:
Student# StName
1022 Rajat
4123 Patel
AdvID
1
2
Advisors:
AdvID AdvName AdvRoom
1 Goyal
412
2 Gupta
216
 Eliminate Data Not Dependent on Key: remove
AdvName from the Students table and create a
separate table describing the advisors with its own
primary key (AdvID).
 Relate Advisors to Students using AdvID.
Deletion Anomalies
Student# StName
1022 Rajat
1022 Rajat
1022 Rajat
4123 Patel
4123 Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
If we delete student Rajat, we lose the
only records for advisor Goyal
Insertion Anomalies
Student# StName
1022 Rajat
1022 Rajat
1022 Rajat
4123 Patel
4123 Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
If we want to add a new student we
have to know what class they are in (it’s
part of the key)
Change Anomalies
Student# StName
1022 Rajat
1022 Rajat
1022 Rajat
4123 Patel
4123 Patel
AdvName AdvRoom Class#
Goyal
412 101-07
Goyal
412 143-01
Goyal
412 159-02
Gupta
216 210-01
Gupta
216 211-02
If student Rajat switches to advisor
Gupta, we have to change three records
instead of just one
Field Properties
 Data Type - text, integer, double, boolean, etc.
 Field Size - for text fields
 Input Mask - a pattern for entering data
 Default Values - auto-entered for new records
 Validation Rule - limits values entered
 Required? - force entry of data
Relationships
One to One
One to Many
Many to One
Many to Many
Many-to-Many Relationship
Exist between two tables when:
for one record in the first table, there
can be many corresponding records in
the second table and…
for one record in the second table,
there can be many corresponding
records in the first table
Many-to-Many Relationship
One student can take many classes, and one class can
be taken by many students.
Students
StudentID
Name
Address
City
State
Zip
WorkPhone
HomePhone
Classes

 ClassNumber
StudentID
Subject
InstructorID
Days
Time
Comments
Tips - Queries
 Take advantage of action queries to handle
batch record operations
 Use queries to present calculated values
rather than storing the calculated values in
your tables
 Remember that null never equals another null
 Joining two tables on a field that may contain a
null value may not give you the results you expect
 Searching for duplicate values will not return two
records that have a null
 Realize that you can link a table to itself
DBMS Selection
* Costs
* Features and Tools
* Underlying model
* Portability
* DBMS hardware requirements
* Organisational requirements
Implementation
The physical realisation of the database
and application designs
the detailed model is converted to the
appropriate implementation model, the
data dictionary is built, the database is
populated, application programs are
developed and users are trained
Data Conversion and Loading
& Testing
Transferring any existing or new data
into the new database and converting
any existing applications to run on the
new database
Finding errors
Database Evaluation
Interviewing and polling users to
determine whether any data needs are
unmet.
Operational maintenance
preventive maintenance (backup)
corrective maintenance (recovery)1
adaptive maintenance
regular monitoring & periodical check
up
Recent Developments Affecting
Database Design and Use
 Data Mining (On-Line Analytical Processing)
 Drill down from summary data to detailed data
 Data Warehouses/Data Marts
 Integrates many large databases into one repository
 Linking Web Site Applications to Organizational
Databases
 Users have Web view to organizational database
 Improves customer contact and service
 Adds security as a concern
Data Warehouses
Data warehouses are used to organize
information for quick and effective
queries
Data Warehouses and
Database
 In the data warehouse, data are organized
around major subjects
 Data in the warehouse are stored as
summarized rather than detailed raw data
 Data in the data warehouse cover a much
longer time frame than in a traditional
transaction-oriented database
 Data warehouses are organized for fast
queries
 Data warehouses are usually optimized for
answering complex queries, known as OLAP
Data Warehouses and
Database
 Data warehouses allow for easy access via
data-mining software called software
 Data warehouses include multiple databases
that have been processed so that data are
uniformly defined, containing what is referred
to as “clean” data
 Data warehouses usually contain data from
outside sources
Data Mining
Statistical analysis
Decision trees
Neural networks
Fuzzy logic
Clustering
Data Mining Patterns
Data mining patterns that decision
makers try to identify include
Associations, patterns that occur together
Sequences, patterns of actions that take
place over a period of time
Clustering, patterns that develop among
groups of people
Trends, the patterns that are noticed over
a period of time
Web Based Databases and
XML
Web-based databases are used for
sharing data
Extensible markup language (XML) is
used to define data used primarily for
business data exchange over the Web
An XML document contains only data and
the nature of the data
 Continue………………………………..