Download System Data Design

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Database wikipedia , lookup

Big data wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Relational model wikipedia , lookup

Clusterpoint wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Functional Database Model wikipedia , lookup

Database model wikipedia , lookup

Transcript
Database Design
• Part of the design process is deciding
how data will be stored in the system
– Conventional files (sequential, indexed,..)
– Databases (database management system)
• Relational database
• Object oriented database
• Hybrid database
– Data warehouse
Objectives of Database Design
• Make sure that data can be shared among
users
• Keep accurate and consistent data
• Make sure that all data needed for current and
future applications is stored (anticipate needs)
• Allow database to grow with user needs
• Ability to construct new views of database
suitable for user needs
Relational Database
• The most common type of DBs is relations DBs
represented by the ER (Entity-Relationship) diagram.
• Entity: An object or event about which someone chooses
to collect data.
• Relationship: associations between entities (1:1, 1:M,
M:N)
• Attribute: some characteristics of an entity. Attributes are
sometimes called data items or fields.
• Record: A collection of data items that have something in
common with the entity described. A record has a
primary key and may have many attributes.
Relational Database
• Key: one of the data items in a record that is used to
identify a record.
– Two types of keys:
• Primary key: a key that uniquely identifies a record
• secondary key:
– when it’s impossible to identify a record uniquely by using one of
the data items found in a record, a key can be constructed by
choosing two or more data items and combining them.
• Metadata: data about the data in a file or a DB. Metadata
describe the name, data type and length of each data
item.
Database Design from DFD & Data
Dictionary
Sales Info
4
SalesMan
Record Sale
Sales Info = SalesPerson No + SalesPerson Name + Sales Area + Customer No.
+ Customer Name + Warehouse No. + Sales Amount
Sales Info
Salesperson
Number
Salesperson
Name
Sales
Area
Customer
Number
Customer
Name
Detailed Field Information (Meta data) come from
Data Element Data Dictionary
Warehouse Sales
Number
Amount
Normalization
• Normalization is the transformation of
complex user views and data stores to a set
of smaller, stable data structures.
– Reduce data redundancy
– Increase relations between tables (inefficiency)
Normalization
• The problem
Salesperson
Number
Salesperson
Name
Sales
Area
Customer
Number
Customer
Name
Warehouse
Number
Sales
Amount
10
Ali
Abdoun
150
Samer
2
20
10
Ali
Abdoun
300
Rula
3
16
16
Sameh
Abdoun
300
Rula
2
4
16
Sameh
Abdoun
240
Rashed
2
6
11
Abdo
Jbeiha
177
Lamia
1
4
Normalization
• The solution
Salesperson
Number
Salesperson
Name
Customer
Number
Sales
Area
Customer
Name
Customer
Number
Customer
Name
Salesperson
Number
Salesperson Customer
Number
Number
Warehouse
Number
Salesperson
Name
Sales
Amount
Sales
Area
Warehouse Sales
Number
Amount
Data Warehouse
• A data warehouse is a centralized database that
captures information from various parts of an
organization's business processes.
• This information can later be analyzed to
determine predictive relationships through the
use of data mining techniques.
• Data are gathered from different departments in
the business and stored in a huge expensive
data warehouse that costs millions of dollars
Data Warehouse
Decision Support System
Operational environment
Internal
database
Transformation
&
Integration
Data
Warehouse
Internal
files
Marketing Planning
External
sources
Data Warehouse
• Types of data gathered could be:
– Customers behaviors and orders
– Product prices
– Timely business transactions (inventory, order
processing, distribution, chain management)
– Timely marketing activities
Data Warehouse
• Characteristics of data stored in a data warehouse:
– Not necessarily normalized (for performance reasons)
– Stored in sections aligned with business structure entities
(customers, products, orders, distributors,..)
– Nonvolatile data (loaded from operational environment
but not changed)
– Unify data terms, attribute definitions and values from
different sources
– Stored for a long time
Data Warehouse
• Purposes of creating data warehouses in
businesses
– Assist managers in the decision making process
– Post-decision monitoring of decision effects
– Answer frequent queries from managers, using
Online Analytic Processing (OLAP)
• Data Marts: a data warehouse that is limited in
scope.
– Contains selected information from a centralized data
warehouse that supports certain group of end users.
Data Mining
• Data mining software is software that allows you
to analyze large volumes of raw data from
business systems, applications, databases, web
sites, and text based mediums.
• Extracts and manipulates information that is
hidden in the raw data. (patterns)
Data Mining Example
• Collect data from different shops & credit
card companies, such as:
– Credit card values and payment
– Items bought and shops frequently used
– Membership applications
– Response of free offers
– When and how frequent traveled
• Used for marketing purposes to predict customer
behavior, give free coupons & advertisement
Data Mining Benefits
• Identify prospects and then retain them as
customers
• Unfold troubled business processes
• Predict cross-sell opportunities and make
recommendations
• Learn parameters influencing trends in
sales
• Segment markets and personalize
communications
Data Patterns
• Four types of patterns decision makers try to identify
include:
1. Associations: patterns that occur together at the same
time: A person who buys cereal usually buys milk too.
2. Sequences: patterns of action that take place over a
period of time: if a family buys a house this year, they will
most probably buy refrigerator, washer, oven, … next
year.
3. Clustering: pattern that develops amongst a group of
people: people who live in a particular area buy a
particular car.
4. Trends: patterns that are noticed over a period of time:
customer move from buying generic goods to buying
customized products.
Online Analytic Processing
• Introduced in 1993
• Helps decision makers who want to look at
data from different ways
• Data are best stored in a way that makes
most frequent queries more efficient…
hence de-normalization
• Data are summarized, calculated and
collected to answer mangers’ queries
Data Warehouse / Mining
• Disadvantages
– High cost
– Ethical issues (Privacy)
– Security of the data