* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download System Data Design
Survey
Document related concepts
Transcript
Database Design • Part of the design process is deciding how data will be stored in the system – Conventional files (sequential, indexed,..) – Databases (database management system) • Relational database • Object oriented database • Hybrid database – Data warehouse Objectives of Database Design • Make sure that data can be shared among users • Keep accurate and consistent data • Make sure that all data needed for current and future applications is stored (anticipate needs) • Allow database to grow with user needs • Ability to construct new views of database suitable for user needs Relational Database • The most common type of DBs is relations DBs represented by the ER (Entity-Relationship) diagram. • Entity: An object or event about which someone chooses to collect data. • Relationship: associations between entities (1:1, 1:M, M:N) • Attribute: some characteristics of an entity. Attributes are sometimes called data items or fields. • Record: A collection of data items that have something in common with the entity described. A record has a primary key and may have many attributes. Relational Database • Key: one of the data items in a record that is used to identify a record. – Two types of keys: • Primary key: a key that uniquely identifies a record • secondary key: – when it’s impossible to identify a record uniquely by using one of the data items found in a record, a key can be constructed by choosing two or more data items and combining them. • Metadata: data about the data in a file or a DB. Metadata describe the name, data type and length of each data item. Database Design from DFD & Data Dictionary Sales Info 4 SalesMan Record Sale Sales Info = SalesPerson No + SalesPerson Name + Sales Area + Customer No. + Customer Name + Warehouse No. + Sales Amount Sales Info Salesperson Number Salesperson Name Sales Area Customer Number Customer Name Detailed Field Information (Meta data) come from Data Element Data Dictionary Warehouse Sales Number Amount Normalization • Normalization is the transformation of complex user views and data stores to a set of smaller, stable data structures. – Reduce data redundancy – Increase relations between tables (inefficiency) Normalization • The problem Salesperson Number Salesperson Name Sales Area Customer Number Customer Name Warehouse Number Sales Amount 10 Ali Abdoun 150 Samer 2 20 10 Ali Abdoun 300 Rula 3 16 16 Sameh Abdoun 300 Rula 2 4 16 Sameh Abdoun 240 Rashed 2 6 11 Abdo Jbeiha 177 Lamia 1 4 Normalization • The solution Salesperson Number Salesperson Name Customer Number Sales Area Customer Name Customer Number Customer Name Salesperson Number Salesperson Customer Number Number Warehouse Number Salesperson Name Sales Amount Sales Area Warehouse Sales Number Amount Data Warehouse • A data warehouse is a centralized database that captures information from various parts of an organization's business processes. • This information can later be analyzed to determine predictive relationships through the use of data mining techniques. • Data are gathered from different departments in the business and stored in a huge expensive data warehouse that costs millions of dollars Data Warehouse Decision Support System Operational environment Internal database Transformation & Integration Data Warehouse Internal files Marketing Planning External sources Data Warehouse • Types of data gathered could be: – Customers behaviors and orders – Product prices – Timely business transactions (inventory, order processing, distribution, chain management) – Timely marketing activities Data Warehouse • Characteristics of data stored in a data warehouse: – Not necessarily normalized (for performance reasons) – Stored in sections aligned with business structure entities (customers, products, orders, distributors,..) – Nonvolatile data (loaded from operational environment but not changed) – Unify data terms, attribute definitions and values from different sources – Stored for a long time Data Warehouse • Purposes of creating data warehouses in businesses – Assist managers in the decision making process – Post-decision monitoring of decision effects – Answer frequent queries from managers, using Online Analytic Processing (OLAP) • Data Marts: a data warehouse that is limited in scope. – Contains selected information from a centralized data warehouse that supports certain group of end users. Data Mining • Data mining software is software that allows you to analyze large volumes of raw data from business systems, applications, databases, web sites, and text based mediums. • Extracts and manipulates information that is hidden in the raw data. (patterns) Data Mining Example • Collect data from different shops & credit card companies, such as: – Credit card values and payment – Items bought and shops frequently used – Membership applications – Response of free offers – When and how frequent traveled • Used for marketing purposes to predict customer behavior, give free coupons & advertisement Data Mining Benefits • Identify prospects and then retain them as customers • Unfold troubled business processes • Predict cross-sell opportunities and make recommendations • Learn parameters influencing trends in sales • Segment markets and personalize communications Data Patterns • Four types of patterns decision makers try to identify include: 1. Associations: patterns that occur together at the same time: A person who buys cereal usually buys milk too. 2. Sequences: patterns of action that take place over a period of time: if a family buys a house this year, they will most probably buy refrigerator, washer, oven, … next year. 3. Clustering: pattern that develops amongst a group of people: people who live in a particular area buy a particular car. 4. Trends: patterns that are noticed over a period of time: customer move from buying generic goods to buying customized products. Online Analytic Processing • Introduced in 1993 • Helps decision makers who want to look at data from different ways • Data are best stored in a way that makes most frequent queries more efficient… hence de-normalization • Data are summarized, calculated and collected to answer mangers’ queries Data Warehouse / Mining • Disadvantages – High cost – Ethical issues (Privacy) – Security of the data