Ralph M. Stair | George W. Reynolds
Chapter 3
Database Systems and Applications
© 2016 Cengage Learning®. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Principles and Learning Objectives: Data Management and Modeling
• Data management and modeling are key aspects of organizing data and information
  – Define general data management concepts and terms, highlighting the advantages of the database approach to data management
  – Describe logical and physical database design considerations, the function of data centers, and the relational database model

Principles and Learning Objectives: Database Support in Decision Making
• A well-designed and well-managed database is an extremely valuable tool in supporting decision making
  – Identify the common functions performed by all database management systems, and identify popular database management systems

Principles and Learning Objectives: Evolving Database Applications
• The number and types of database applications will continue to evolve and yield real business benefits
  – Identify and briefly discuss business intelligence, data mining, and other database applications

Why Learn About Database Systems and Applications?
• A huge amount of data is captured for processing by computers every day
• Learning about database systems and applications can help you make the most effective use of information
• Databases and applications to extract and analyze valuable information can help you succeed in your career

Introduction
• Database: an organized collection of data
• A database management system (DBMS) is a group of programs that:
  – Manipulate the database
  – Provide an interface between the database and its users and other application programs

Data Management
• Without data and the ability to process it:
  – An organization could not successfully complete most business activities
• Data consists of raw facts
• Data must be organized in a meaningful way to transform it into useful information

The Hierarchy of Data
• A bit (binary digit) represents a circuit that is either on or off
• A byte is made up of eight bits
  – Each byte represents a character
• Field: a name, number, or combination of characters that describes an aspect of a business object or activity
• Record: a collection of related data fields
• File: a collection of related records
• Database: a collection of integrated and related files
• Hierarchy of data: bits, characters, fields, records, files, and databases
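The hierarchy above can be walked from the bottom up in a few lines. This is only a sketch: the employee names, numbers, and dictionary layout are invented for illustration, and the one-byte-per-character assumption holds for ASCII text only.

```python
# Hypothetical illustration of the data hierarchy; all names and values are invented.

def char_to_bits(ch: str) -> str:
    """Return the eight bits of the single byte that encodes an ASCII character."""
    return format(ord(ch), "08b")

# Field: a combination of characters describing one aspect of a business object.
last_name_field = "Fiske"

# Record: a collection of related fields (here, one employee).
employee_record = {"employee_id": 5, "last_name": last_name_field, "dept": 257}

# File: a collection of related records.
personnel_file = [
    employee_record,
    {"employee_id": 98, "last_name": "Buckley", "dept": 650},
]

# Database: a collection of integrated and related files.
database = {
    "personnel": personnel_file,
    "departments": [{"dept": 257, "name": "Accounting"}],
}

bits_for_f = char_to_bits("F")  # the byte behind the first character of the field
```

Each level simply groups items from the level below it, which is why the same character can be traced from a bit pattern up to the database that contains it.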
Figure: An Example of Hierarchy of Data

Data Entities, Attributes, and Keys
• Entity: a person, place, or thing for which data is collected, stored, and maintained
• Attribute: a characteristic of an entity
• Data item: the specific value of an attribute
• Primary key: a field or set of fields that uniquely identifies the record

Figure: Keys and Attributes

The Database Approach
• Traditional approach to data management
  – Each distinct operational system used data files dedicated to that system
• Database approach to data management
  – Information systems share a pool of related data

Figure: Database Approach to Data Management

Data Centers, Data Modeling and Database Characteristics
• Considerations when building a database
  – Content: what data should be collected and at what cost?
  – Access: what data should be provided to which users and when?
  – Logical structure: how should data be arranged so that it makes sense?
  – Physical organization: where should data be physically located?
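To make the earlier definitions of entity, attribute, data item, and primary key concrete, the sketch below stores one entity type in SQLite through Python's built-in sqlite3 module. The employee table, its columns, and the sample values are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Entity: employee. Attributes: employee_id, last_name, hire_date.
# employee_id is declared as the primary key, so it must identify each record uniquely.
conn.execute(
    "CREATE TABLE employee ("
    " employee_id INTEGER PRIMARY KEY,"
    " last_name TEXT NOT NULL,"
    " hire_date TEXT)"
)

# Data items: the specific values 5, 'Fiske', and '2014-03-01'.
conn.execute("INSERT INTO employee VALUES (5, 'Fiske', '2014-03-01')")

def insert_duplicate_key() -> bool:
    """Try to reuse primary key 5; return True if the DBMS rejects the record."""
    try:
        conn.execute("INSERT INTO employee VALUES (5, 'Buckley', '2015-07-15')")
    except sqlite3.IntegrityError:
        return True
    return False

rejected = insert_duplicate_key()
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The second insert fails because the key value 5 is already taken, which is the uniqueness guarantee a primary key is meant to provide.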
Data Modeling
• Data model: a diagram of data entities and their relationships
• Enterprise data modeling: data modeling done at the level of the entire enterprise
• Entity-relationship (ER) diagrams: data models that use basic graphical symbols to show the organization of and relationships between data

Figure: Entity-Relationship (ER) Diagram for a Customer Order Database

Relational Database Model
• Relational model: a simple but highly useful way to organize data into collections of two-dimensional tables called relations
• Relational model databases include:
  – Oracle, IBM DB2, Microsoft SQL Server, Microsoft Access, MySQL, and Sybase
• Domain: range of allowable values for a data attribute

Manipulating Data
• Selecting: eliminating rows according to certain criteria
• Projecting: eliminating columns in a table
• Joining: combining two or more tables
• Linking: combining two or more tables through common data attributes to form a new table with only the unique data attributes

Figure: Simplified ER Diagram
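The selecting, projecting, and joining operations listed under Manipulating Data map directly onto SQL. The following is a minimal sketch using SQLite; the department and project tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_no INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE project (project_no INTEGER PRIMARY KEY, description TEXT, dept_no INTEGER);
INSERT INTO department VALUES (257, 'Accounting'), (632, 'Manufacturing');
INSERT INTO project VALUES
  (155, 'Payroll', 257), (498, 'Widgets', 632), (226, 'Sales manual', 632);
""")

# Selecting: keep only the rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM project WHERE dept_no = 632 ORDER BY project_no"
).fetchall()

# Projecting: keep only some of the columns.
projected = conn.execute(
    "SELECT description FROM project ORDER BY project_no"
).fetchall()

# Joining: combine two tables through a common attribute (dept_no).
joined = conn.execute(
    "SELECT p.description, d.dept_name FROM project AS p "
    "JOIN department AS d ON p.dept_no = d.dept_no ORDER BY p.project_no"
).fetchall()
```

The join works only because both tables carry dept_no, which is the kind of common data attribute the linking bullet refers to.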
Figure: Linking Data Tables to Answer an Inquiry

Data Cleansing
• Also called data cleaning or data scrubbing
• The process of detecting and then correcting or deleting incomplete, incorrect, inaccurate, or irrelevant records that reside in a database
• The cost of performing data cleansing can be quite high

Database Management Systems
• Creating and implementing the right database system ensures that the database will support both business activities and goals
• Capabilities and types of database systems vary considerably

Overview of Database Types
• Single-user DBMS
  – Installed on a personal computer and meant for a single user
  – Examples: Microsoft Access and InfoPath, Lotus Approach, and Personal Oracle
• Multiple-user DBMS
  – Allows dozens or hundreds of people to access the same system at the same time
  – Vendors: Oracle, Microsoft, Sybase, and IBM
• Flat file
  – Simplest database program
  – The records have no relationship to one another
  – Store and manipulate a single table or file
  – Examples: OneNote and Evernote
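The detect-then-correct-or-delete process described under Data Cleansing can be sketched on a handful of records. The field names and the validity rules (a non-empty name, an "@" in the email address) are assumptions chosen for illustration; real cleansing rules depend on the data.

```python
# Hypothetical customer records; field names and validity rules are assumptions.
records = [
    {"id": 1, "name": "Ada Park", "email": "ada@example.com"},
    {"id": 2, "name": "", "email": "noname@example.com"},          # incomplete
    {"id": 3, "name": "Li Wei", "email": "not-an-email"},           # incorrect
    {"id": 4, "name": "  Sam Ortiz ", "email": "SAM@EXAMPLE.COM"},  # correctable
]

def cleanse(rows):
    """Return (clean, rejected): normalise correctable rows, set aside the rest."""
    clean, rejected = [], []
    for row in rows:
        name = row["name"].strip()
        email = row["email"].strip().lower()
        if not name or "@" not in email:
            rejected.append(row)  # detected as incomplete or incorrect
        else:
            clean.append({"id": row["id"], "name": name, "email": email})
    return clean, rejected

clean_rows, rejected_rows = cleanse(records)
```

Rejected rows are kept rather than silently dropped, so someone can decide whether to correct or delete them.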
SQL Databases
• SQL: a special-purpose programming language for accessing and manipulating data stored in a relational database
• SQL databases conform to ACID properties:
  – Atomicity, consistency, isolation, and durability
• 1986: SQL was adopted by ANSI as the standard query language for relational databases

Table 3.1 Examples of SQL Commands

SQL command: SELECT ClientName, Debt FROM Client WHERE Debt > 1000
Description: This query displays all clients (ClientName) and the amount they owe the company (Debt) from a database table called Client for clients who owe the company more than $1,000 (WHERE Debt > 1000).

SQL command: SELECT ClientName, ClientNum, OrderNum FROM Client, Order WHERE Client.ClientNum=Order.ClientNum
Description: This command is an example of a join command that combines data from two tables: the Client table and the Order table (FROM Client, Order). The command creates a new table with the client name, client number, and order number (SELECT ClientName, ClientNum, OrderNum). Both tables include the client number, which allows them to be joined. This ability is indicated in the WHERE clause, which states that the client number in the Client table is the same as (equal to) the client number in the Order table (WHERE Client.ClientNum=Order.ClientNum).

SQL command: GRANT INSERT ON Client to Guthrie
Description: This command is an example of a security command. It allows Bob Guthrie to insert new values or rows into the Client table.
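The first two commands in Table 3.1 can be tried against SQLite with Python's sqlite3 module. This is a sketch under stated assumptions: the column types and sample rows are invented, and two adjustments are made for SQLite. Order is a reserved word and is therefore quoted, and ClientNum is qualified with its table name because it appears in both tables. The GRANT command is left out because SQLite has no user accounts or privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Client (ClientNum INTEGER PRIMARY KEY, ClientName TEXT, Debt REAL);
CREATE TABLE "Order" (OrderNum INTEGER PRIMARY KEY, ClientNum INTEGER);
INSERT INTO Client VALUES (1, 'Acme', 2500.0), (2, 'Birch', 400.0);
INSERT INTO "Order" VALUES (10, 1), (11, 2), (12, 1);
""")

# First command: clients who owe more than 1,000.
big_debtors = conn.execute(
    "SELECT ClientName, Debt FROM Client WHERE Debt > 1000"
).fetchall()

# Second command: join Client and Order on the shared client number.
client_orders = conn.execute(
    'SELECT ClientName, Client.ClientNum, OrderNum FROM Client, "Order" '
    'WHERE Client.ClientNum = "Order".ClientNum ORDER BY OrderNum'
).fetchall()
```

Other database systems may quote reserved words differently, so the quoting shown here should be checked against the DBMS actually in use.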
NoSQL Databases
• A database designed to store and retrieve data in a manner that does not rigidly enforce the ACID conditions associated with the relational database model
  – Provides faster performance and greater scalability
• Examples
  – Cassandra used by Facebook
  – DynamoDB used by Amazon

Visual, Audio, and Other Database Systems
• Visual databases store images of charge slips, X-rays, and vital records
  – Images can be stored in some object-relational databases or special-purpose database systems
• Spatial databases provide location-based services
  – Maps are embedded into a Web site’s Web applications and operational systems

Database Activities
• Providing a user view of the database
• Adding and modifying data
• Storing and retrieving data
• Manipulating the data and generating reports

Providing a User View
• Schema: a description of the entire database
• A schema can be part of the database or a separate schema file
• The DBMS can reference a schema to find where to access the requested data in relation to another piece of data
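A schema and a user view can both be seen in a small SQLite database. In this sketch the employee table, the employee_directory view, and the sample rows are hypothetical; sqlite_master is the table in which SQLite itself records the schema, which makes it an instance of a schema stored as part of the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, last_name TEXT, salary REAL);
INSERT INTO employee VALUES (5, 'Fiske', 52000.0), (98, 'Buckley', 61000.0);
-- A user view that leaves the salary column out for general users.
CREATE VIEW employee_directory AS SELECT employee_id, last_name FROM employee;
""")

# The schema: SQLite lists every table and view it knows about in sqlite_master.
schema_objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# The user view: the same data, restricted to what this group of users should see.
directory = conn.execute(
    "SELECT * FROM employee_directory ORDER BY employee_id"
).fetchall()
```

Querying the view never exposes the salary column, even though it is stored in the underlying table.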
Creating and Modifying the Database
• Data definition language (DDL)
  – A collection of instructions and commands used to define and describe data and relationships in a specific database
  – Allows the database’s creator to describe data and relationships that are to be contained in the schema
• Data dictionary: a detailed description of all the data used in the database

Figure: Data Definition Language

Figure: Data Dictionary Entry

Storing and Retrieving Data
• When an application program needs data, it requests the data through the DBMS
• Concurrency control deals with the situation in which two or more users or applications need to access the same record at the same time

Figure: Logical and Physical Access Paths

Manipulating Data and Generating Reports
• Query by Example (QBE) is a visual approach to developing database queries or requests
• Data manipulation language (DML): a specific language, provided with a DBMS
  – Allows users to access and modify the data, to make queries, and to generate reports
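The difference between DDL, a data dictionary, and DML can be shown side by side. This is a sketch in SQLite with an invented part table; PRAGMA table_info is SQLite-specific and stands in here for a data dictionary lookup, which other systems expose in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the data and its structure.
conn.execute(
    "CREATE TABLE part ("
    " part_no INTEGER PRIMARY KEY,"
    " description TEXT NOT NULL,"
    " qty INTEGER DEFAULT 0)"
)

# Data dictionary: a description of each data item, read back from the DBMS.
# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
data_dictionary = [
    {"name": name, "type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(part)"
    )
]

# DML: add, modify, and query the data.
conn.execute("INSERT INTO part (part_no, description, qty) VALUES (1, 'Gasket', 40)")
conn.execute("UPDATE part SET qty = qty - 15 WHERE part_no = 1")
remaining = conn.execute("SELECT qty FROM part WHERE part_no = 1").fetchone()[0]
```

The DDL statement runs once when the database is designed, while DML statements such as INSERT, UPDATE, and SELECT run every time an application works with the data.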
• Once a database has been set up and loaded with data, it can produce reports, documents, and other outputs
• A DBMS can produce a wide variety of documents, reports, and other output that can help organizations achieve their goals

Database Administration
• DBA: a skilled and trained IS professional
  – Works with users to define their data needs
  – Applies database programming languages to craft a set of databases to meet those needs
  – Tests and evaluates databases
  – Implements changes to improve their databases’ performance
  – Assures that data is secure from unauthorized access
• Data administrator: a nontechnical position responsible for defining and implementing consistent principles for a variety of data issues
• The data administrator can be a high-level position reporting to top-level managers

Table 3.2 Popular Database Management Systems
• Open-Source Relational DBMS: MySQL, PostgreSQL, MariaDB, SQLite
• Relational DBMS for Individuals and Workgroups: Microsoft Access, IBM Lotus Approach, Google Base, OpenOffice Base
• Relational DBMS for Workgroups and Enterprise: Oracle, IBM DB2, Sybase, Teradata, Microsoft SQL Server, Progress OpenEdge
• NoSQL DBMS: MongoDB, Cassandra, Redis, CouchDB
Popular Database Management Systems
• Database as a Service (DaaS)
  – The database is stored on a service provider’s servers
  – The database is accessed by the client over a network, typically the Internet
  – Database administration is handled by the service provider
• Example of DaaS: Amazon Relational Database Service (Amazon RDS)

Using Databases with Other Software
• DBMSs can act as front-end or back-end applications
  – Front-end applications interact directly with people
  – Back-end applications interact with other programs or applications
• Spin-off database applications include:
  – Big data, data warehouses and data marts, and business intelligence

Big Data
• Extremely large and complex data collections
  – Traditional data management software, hardware, and analysis processes are incapable of dealing with them
• Three characteristics of big data
  – Volume
  – Velocity
  – Variety

Table 3.3 Big Data Generators (source: magnitude of data generated)
• Large Hadron particle accelerator at CERN: 40 terabytes of data per second
• Commercial aircraft engines: more than 1 petabyte per day of sensor data
• Cell phones: more than 5 billion people worldwide are making cell phone calls, exchanging text messages, and accessing Web sites
• YouTube: 48 hours of video uploaded per minute
• Facebook: 100 terabytes uploaded per day
• Twitter: 500 million tweets per day
• RFID tags: 1,000 times the volume of data generated by bar codes
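The rates in Table 3.3 use different time units, so a little arithmetic helps when comparing them. The sketch below converts three of the table's figures to other units; it assumes decimal units (1 petabyte = 1,000 terabytes) and takes the table's numbers at face value.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

# CERN: 40 terabytes per second, expressed as petabytes per day.
cern_tb_per_second = 40
cern_pb_per_day = cern_tb_per_second * SECONDS_PER_DAY / 1000  # 1 PB = 1,000 TB

# YouTube: 48 hours of video per minute, expressed as hours of video per day.
youtube_hours_per_minute = 48
youtube_hours_per_day = youtube_hours_per_minute * MINUTES_PER_DAY

# Twitter: 500 million tweets per day, expressed as tweets per second.
tweets_per_day = 500_000_000
tweets_per_second = tweets_per_day / SECONDS_PER_DAY
```

On these assumptions the CERN figure works out to 3,456 petabytes per day, the YouTube figure to 69,120 hours of video per day, and the Twitter figure to roughly 5,800 tweets per second.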
Challenges of Big Data
• How to choose what subset of the data to store
• Where and how to store the data
• How to find the nuggets of data that are relevant to the decision making at hand
• How to derive value from the relevant data

In-Memory Databases
• A database management system that stores the entire database in random access memory (RAM)
• Enable the analysis of big data and other challenging data-processing applications
• Perform best on multiple multicore CPUs

Data Warehouses and Data Marts
• Data warehouse: a large database that collects business information from many sources in the enterprise in support of management decision making
• ETL process
  – Extract
  – Transform
  – Load
• Data mart: a subset of a data warehouse that is used by small- and medium-sized businesses and departments within large companies to support decision making
• A specific area in the data mart might contain greater detailed data than the data warehouse

Figure: Elements of a Data Warehouse
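The extract, transform, and load steps named above can be sketched end to end. Everything specific here is an assumption: the two source systems, their date formats, and the sales_fact table are invented, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Extract: rows pulled from two hypothetical operational sources.
sales_system = [("2016-01-05", "widget", "19.99"), ("2016-01-06", "gadget", "5.00")]
web_orders = [("01/07/2016", "WIDGET ", "19.99")]  # US-style dates, untidy names

def transform(row, date_is_us_format=False):
    """Normalise the date to ISO format, tidy the product name, parse the amount."""
    date, product, amount = row
    if date_is_us_format:
        month, day, year = date.split("/")
        date = f"{year}-{month}-{day}"
    return (date, product.strip().lower(), float(amount))

# Load: write the transformed rows into the warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_fact (sale_date TEXT, product TEXT, amount REAL)")
rows = [transform(r) for r in sales_system] + [transform(r, True) for r in web_orders]
warehouse.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)

# Once loaded, data from both sources can be analysed together.
widget_total = warehouse.execute(
    "SELECT ROUND(SUM(amount), 2) FROM sales_fact WHERE product = 'widget'"
).fetchone()[0]
```

The transform step is where data from different sources is made consistent; without it, 'WIDGET ' and 'widget' would be counted as different products.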
Business Intelligence
• A broad range of technologies and applications
  – Enabling an organization to transform mostly structured data obtained from information systems to perform analysis, generate information, and improve the decision making of the organization
• Technologies include:
  – Data mining
  – Online analytical processing
  – Predictive analytics
  – Data visualization
  – Competitive intelligence

Data Mining
• An information-analysis tool that involves the automated discovery of patterns and relationships in a data warehouse
• Provides bottom-up, discovery-driven analysis

Online Analytical Processing (OLAP)
• A form of analysis that allows users to explore data from a number of perspectives, enabling a style of analysis known as “slicing and dicing”
• Provides top-down, query-driven data analysis

Table 3.6 Comparison of OLAP and Data Mining
• Purpose
  – OLAP: supports data analysis and decision making
  – Data mining: supports data analysis and decision making
• Type of analysis supported
  – OLAP: top-down, query-driven data analysis
  – Data mining: bottom-up, discovery-driven data analysis
• Skills required of user
  – OLAP: must be very knowledgeable of the data and its business context
  – Data mining: must trust in data-mining tools to uncover valid and worthwhile hypotheses
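The top-down, query-driven style attributed to OLAP can be illustrated with ordinary SQL aggregation. This is a rough sketch rather than a real OLAP engine: the sales table and its figures are invented, and GROUP BY queries in SQLite stand in for looking at the same facts from different perspectives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount INTEGER);
INSERT INTO sales VALUES
  ('East', 'widget', 'Q1', 100), ('East', 'gadget', 'Q1', 40),
  ('West', 'widget', 'Q1', 70),  ('West', 'widget', 'Q2', 90),
  ('East', 'gadget', 'Q2', 60);
""")

# One perspective: fix the quarter (Q1) and summarise sales by region.
q1_by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales WHERE quarter = 'Q1' "
    "GROUP BY region ORDER BY region"
).fetchall()

# Another perspective on the same facts: sales by product and quarter.
by_product_quarter = conn.execute(
    "SELECT product, quarter, SUM(amount) FROM sales "
    "GROUP BY product, quarter ORDER BY product, quarter"
).fetchall()
```

In both queries the analyst decides in advance which dimensions to examine, which is what makes the analysis query-driven; a data-mining tool would instead search the data for patterns the analyst did not specify.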
Predictive Analysis
• Also called predictive analytics
• A form of data mining that combines historical data with assumptions about future conditions to predict outcomes of events, e.g., future product sales or the probability that a customer will default on a loan

Data Visualization
• In analyzing data, charts and graphs make it easier to:
  – See trends and patterns
  – Identify opportunities for further analysis
• Software examples
  – Excel and SAS Visual Analytics

Data Visualization: Social Graph Analysis
• A data visualization technique in which data is represented as networks
  – Vertices are the individual data points (social network users)
  – Edges are the connections among the vertices

Data Visualization: Key Performance Indicators (KPIs) and Dashboards
• KPIs: quantifiable measurements that assess progress toward organizational goals and reflect the critical success factors of an organization
• Dashboard: a data visualization tool that displays the current status of the key performance indicators (KPIs) for an organization
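A dashboard ultimately compares each KPI's current value with a target. The sketch below shows that comparison only; the two KPIs, their values, and their targets are hypothetical, and a real dashboard would add charts and live data.

```python
# Hypothetical KPIs: (name, current value, target, whether higher is better).
kpis = [
    ("On-time delivery rate (%)", 96.5, 95.0, True),
    ("Average days to close a ticket", 4.2, 3.0, False),
]

def status(current, target, higher_is_better):
    """Return 'on target' or 'off target' for one KPI."""
    met = current >= target if higher_is_better else current <= target
    return "on target" if met else "off target"

# The dashboard's content: the current status of every KPI.
dashboard = {name: status(cur, tgt, hib) for name, cur, tgt, hib in kpis}
```

The higher-is-better flag matters because some measurements, such as time to close a ticket, improve as they fall.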
Competitive Intelligence
• Competitive intelligence encompasses information about competitors and the ways that knowledge affects strategy, tactics, and operations
• Counterintelligence describes the steps an organization takes to protect information sought by “hostile” intelligence gatherers

Summary – Principle 1
• Data is one of the most valuable resources that a firm possesses
• An entity is an object for which data is collected, stored, and maintained
• Database considerations: content, access, logical structure, and physical organization
• The relational model places data in two-dimensional tables

Summary – Principle 2
• A DBMS is a group of programs used as an interface between a database and its users and other application programs
• DBMS basic functions include:
  – Providing user views
  – Creating and modifying the database
  – Storing and retrieving data
  – Manipulating data and generating reports
• After a DBMS has been installed, the database can be accessed, modified, and queried via a data manipulation language
• A database administrator (DBA) plans, designs, creates, operates, secures, monitors, and maintains databases
• Database as a Service (DaaS) is a new form of database service
Summary – Principle 3
• “Big data” is the term used to describe enormous and complex data collections
• Data warehouses are relational DBMSs specifically designed to support management decision making
• Data mining allows the automated discovery of patterns and relationships in a data warehouse
• Counterintelligence describes the steps an organization takes to protect information sought by “hostile” intelligence gatherers