Chapter 5
Data Resource Management
McGraw-Hill/Irwin
Copyright © 2008, The McGraw-Hill Companies, Inc. All rights reserved.
Learning Objectives
• Explain the business value of implementing
data resource management processes and
technologies in an organization
• Outline the advantages of a database
management approach to managing the data
resources of a business, compared to a file
processing approach
• Explain how database management software
helps business professionals and supports the
operations and management of a business
5-2
Learning Objectives
• Provide examples to illustrate the following
concepts
• Major types of databases
• Data warehouses and data mining
• Logical data elements
• Fundamental database structures
• Database development
5-3
Case 1: Sharing Business Databases
• Amazon’s data vault
• Product descriptions
• Prices
• Sales rankings
• Customer reviews
• Inventory figures
• Countless other layers of content
• Took 10 years and a billion dollars to build
5-4
Case 1: Sharing Business Databases
• Amazon opened its data vault in 2002
• 65,000 developers, businesses, and entrepreneurs
have tapped into it
• Many have become ambitious business partners
• eBay opened its $3 billion databases in 2003
• 15,000 developers and others have registered
to use it and to access software features
• 1,000 new applications have appeared
• 41 percent of eBay’s listings are uploaded to
the site using these resources
5-5
Case 1: Sharing Business Databases
• Google recently unlocked access to its desktop
and paid-search products
• Dozens of Google-driven services cropped up
• Developers can grab 1,000 search results a day
for free; anything more requires permission
• In 2005, the AdWords paid-search service
was opened to outside applications
5-6
Case Study Questions
• What are the business benefits to Amazon and
eBay of opening up some of their databases to
developers and entrepreneurs?
• Do you agree with this strategy?
• What business factors are causing Google to
move slowly in opening up its databases?
• Do you agree with its go-slow strategy?
5-7
Case Study Questions
• Should other companies follow Amazon and
eBay’s lead and open up some of their databases
to developers and others?
• Defend your position with an example of the risks
and benefits to an actual company
5-8
Logical Data Elements
5-9
Logical Data Elements
• Character
• A single alphabetic, numeric, or other symbol
• Field or data item
• Represents an attribute (characteristic or quality)
of some entity (object, person, place, event)
• Examples: salary, job title
• Record
• Grouping of all the fields used to describe the
attributes of an entity
• Example: payroll record with name, SSN, pay rate
5-10
Logical Data Elements
• File or table
• A group of related records
• Database
• An integrated collection of logically related
data elements
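The hierarchy of logical data elements can be sketched in a few lines of Python. This is only an illustration: the class name `PayrollRecord`, the sample employees, and the `database` dictionary are assumptions made up for the example, not part of the slides.

```python
from dataclasses import dataclass


@dataclass
class PayrollRecord:
    """A record: all the fields that describe one entity (an employee)."""
    name: str        # field
    ssn: str         # field
    pay_rate: float  # field


# A file (table) is a group of related records.
payroll_file = [
    PayrollRecord("Ann Lee", "000-00-0001", 31.50),
    PayrollRecord("Raj Patel", "000-00-0002", 27.00),
]

# A database is an integrated collection of logically related files.
database = {"payroll": payroll_file, "benefits": []}

# A character is a single symbol inside a field value.
first_character = payroll_file[0].name[0]
```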
5-11
Electric Utility Database
5-12
Database Structures
• Common database structures…
• Hierarchical
• Network
• Relational
• Object-oriented
• Multi-dimensional
5-13
Hierarchical Structure
• Early DBMS structure
• Records arranged in tree-like structure
• Relationships are one-to-many
5-14
Network Structure
• Used in some mainframe DBMS packages
• Many-to-many relationships
5-15
Relational Structure
• Most widely used structure
• Data elements are stored in tables
• Row represents a record; column is a field
• Can relate data in one file with data in another,
if both files share a common data element
5-16
Relational Operations
• Select
• Create a subset of records that meet a stated
criterion
• Example: employees earning more than $30,000
• Join
• Combine two or more tables temporarily
• Looks like one big table
• Project
• Create a subset of columns in a table
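The three relational operations can be tried out with Python's built-in `sqlite3` module. The sketch below is a minimal example under assumed data: the `employee` and `department` tables and their rows are hypothetical, and SQLite simply stands in for whatever relational DBMS is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, dept_id INTEGER
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
    INSERT INTO employee VALUES
        (10, 'Ann', 42000, 1), (11, 'Raj', 28000, 2), (12, 'Mia', 35000, 2);
""")

# Select: the subset of records (rows) that meet a stated criterion.
selected = conn.execute(
    "SELECT * FROM employee WHERE salary > 30000 ORDER BY emp_id"
).fetchall()

# Project: a subset of the columns in a table.
projected = conn.execute(
    "SELECT name, salary FROM employee ORDER BY emp_id"
).fetchall()

# Join: combine two tables through the data element they share (dept_id).
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
```

The join result exists only for the duration of the query, which matches the "combine temporarily" wording on the slide.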
5-17
Multidimensional Structure
• Variation of relational model
• Uses multidimensional structures to
organize data
• Data elements are viewed as being in cubes
• Popular for analytical databases that support
Online Analytical Processing (OLAP)
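One way to picture a data cube is as values addressed by several dimensions at once. The sketch below assumes three hypothetical dimensions (product, region, quarter) and made-up sales figures; `roll_up` is an illustrative helper, not an OLAP product feature.

```python
from collections import defaultdict

# Each fact is addressed by three dimensions: (product, region, quarter).
sales_facts = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 150),
    ("Gadget", "East", "Q1", 80),
    ("Widget", "East", "Q2", 120),
]

cube = defaultdict(int)
for product, region, quarter, amount in sales_facts:
    cube[(product, region, quarter)] += amount


def roll_up(cube, keep):
    """Sum the cube, keeping only the dimensions at the given positions."""
    totals = defaultdict(int)
    for key, amount in cube.items():
        totals[tuple(key[i] for i in keep)] += amount
    return dict(totals)


by_product = roll_up(cube, keep=[0])            # collapse region and quarter
by_region_quarter = roll_up(cube, keep=[1, 2])  # collapse product
```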
5-18
Multidimensional Model
5-19
Object-Oriented Structure
• An object consists of
• Data values describing the attributes of an entity
• Operations that can be performed on the data
• Encapsulation
• Combine data and operations
• Inheritance
• New objects can be created by replicating some
or all of the characteristics of parent objects
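Encapsulation and inheritance can be shown with two small Python classes. The `Account` and `SavingsAccount` names, the balances, and the interest rate are hypothetical and chosen only for illustration.

```python
class Account:
    """Encapsulation: data values and the operations on them live together."""

    def __init__(self, owner, balance=0.0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount
        return self.balance


class SavingsAccount(Account):
    """Inheritance: reuses Account's data and operations, then adds its own."""

    def __init__(self, owner, balance=0.0, rate=0.02):
        super().__init__(owner, balance)
        self.rate = rate

    def add_interest(self):
        # Reuses the inherited deposit operation.
        return self.deposit(self.balance * self.rate)


savings = SavingsAccount("Ann", balance=1000.0, rate=0.05)
savings.deposit(200.0)                # operation inherited from Account
new_balance = savings.add_interest()  # operation added by the child class
```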
5-20
Object-Oriented Structure
Source: Adapted from Ivar Jacobson, Maria Ericsson, and Agneta Jacobson, The Object Advantage: Business Process
Reengineering with Object Technology (New York: ACM Press, 1995), p. 65.
Copyright © 1995, Association for Computing Machinery. By permission.
5-21
Object-Oriented Structure
• Used in object-oriented database management
systems (OODBMS)
• Supports complex data types more efficiently
than relational databases
• Examples: graphic images, video clips,
web pages
5-22
Evaluation of Database Structures
• Hierarchical
• Works for structured, routine transactions
• Can’t handle many-to-many relationships
• Network
• More flexible than hierarchical
• Unable to handle ad hoc requests
• Relational
• Easily responds to ad hoc requests
• Easier to work with and maintain
• Not as efficient/quick as hierarchical or network
5-23
Database Development
• Database Administrator (DBA)
• In charge of enterprise database development
• Improves the integrity and security of
organizational databases
• Uses Data Definition Language (DDL) to develop
and specify data contents, relationships, and
structure
• Stores these specifications in a data dictionary
or a metadata repository
5-24
Data Dictionary
• A data dictionary
• Contains data about data (metadata)
• Relies on specialized software component to
manage a database of data definitions
• It contains information on…
• The names and descriptions of all types of data
records and their interrelationships
• Requirements for end users’ access and use of
application programs
• Database maintenance
• Security
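A DDL statement and the metadata a DBMS keeps about it can be seen in SQLite, used here as a stand-in for an enterprise DBMS. The `customer` table is a hypothetical example. `sqlite_master` and `PRAGMA table_info` are SQLite's own catalog facilities; other products expose their data dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the structure of the data, not the data itself.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT
    )
""")

# The catalog stores the definition: this is data about data (metadata).
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

# table_info yields one row per column: (cid, name, type, notnull, default, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]
```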
5-25
Database Development
5-26
Data Planning Process
• Database development is a top-down process
• Develop an enterprise model that defines the
basic business process of the enterprise
• Define the information needs of end users in
a business process
• Identify the key data elements that are needed
to perform specific business activities
(entity relationship diagrams)
5-27
Entity Relationship Diagram
5-28
Database Design Process
• Data relationships are represented in a data
model that supports a business process
• This model is the schema or subschema on
which to base…
• The physical design of the database
• The development of application programs to
support business processes
5-29
Database Design Process
• Logical Design
• Schema - overall logical view of relationships
• Subschema - logical view for specific end users
• Data models for DBMS
• Physical Design
• How data are to be physically stored and
accessed on storage devices
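In a relational DBMS a subschema is often implemented as a view over the full schema. The sketch below assumes a hypothetical `employee` table and an `employee_directory` view that hides salaries from directory users. How the rows are physically stored is left to SQLite, which is the point of separating logical from physical design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Schema: the overall logical view.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER
    );
    INSERT INTO employee VALUES (1, 'Ann', 'Sales', 42000), (2, 'Raj', 'Finance', 28000);

    -- Subschema: the logical view for one group of end users (no salary).
    CREATE VIEW employee_directory AS
        SELECT emp_id, name, dept FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
directory_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()
```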
5-30
Logical and Physical Database Views
5-31
Data Resource Management
• Data resource management is a managerial
activity
• Uses data management, data warehousing,
and other IS technologies
• Manages data resources to meet the information
needs of business stakeholders
5-32
Case 2: Emerson & Sanofi, Data Stewards
• Data stewards
• Dedicated to establishing and maintaining the
quality of data
• Need business, technology, and diplomatic skills
• Focus on data content
• Judgment is a big part of the job
5-33
Case Study Questions
• Why is the role of a data steward considered to
be innovative?
• What are the business benefits associated with
the data steward program at Emerson?
• How does effective data resource management
contribute to the strategic goals of an
organization?
5-34
Types of Databases
5-35
Operational Databases
• Store detailed data needed to support business
processes and operations
• Also called subject area databases (SADB),
transaction databases, and production
databases
• Database examples: customer, human resource,
inventory
5-36
Distributed Databases
• Distributed databases are copies or parts of
databases stored on servers at multiple locations
• Improve database performance at worksites
• Advantages
• Protection of valuable data
• Data can be distributed into smaller databases
• Each location has control of its local data
• All locations can access any data, anywhere
• Disadvantages
• Maintaining data accuracy
5-37
Distributed Databases
• Replication
• Look at each distributed database and find
changes
• Apply changes to each distributed database
• Very complex
• Duplication
• One database is master
• Duplicate the master after hours, in all locations
• Easier to accomplish
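The difference between the two update strategies can be sketched with plain dictionaries. This is a deliberately simplified model. The site names, the change-log layout, and the assumption that two sites never change the same key are all illustrative. Real replication must also resolve conflicts, which is a large part of why the slide calls it very complex.

```python
import copy


def duplicate(master, sites):
    """Duplication: overwrite every site with a full copy of the master."""
    for name in sites:
        sites[name] = copy.deepcopy(master)


def replicate(sites):
    """Replication: find each site's changes, then apply them to every site.

    Each site is {"data": {...}, "changes": [(key, value), ...]}.
    Assumption: changes made at different sites never touch the same key.
    """
    pending = []
    for site in sites.values():
        pending.extend(site["changes"])
        site["changes"] = []
    for site in sites.values():
        for key, value in pending:
            site["data"][key] = value


# Each site has already applied its own change locally and logged it.
sites = {
    "east": {"data": {"A1": 8}, "changes": [("A1", 8)]},
    "west": {"data": {"A1": 10, "B7": 3}, "changes": [("B7", 3)]},
}
replicate(sites)

# Duplication ignores local state and pushes the master everywhere.
branch_copies = {"east": {}, "west": {"stale": 1}}
duplicate({"A1": 8, "B7": 3}, branch_copies)
```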
5-38
External Databases
• Databases available for a fee from commercial
online services, or free from the Web
• Examples: hypermedia databases, statistical
databases, bibliographic and full text databases
• Search engines like Google or Yahoo are
external databases
5-39
Hypermedia Databases
• A hypermedia database contains
• Hyperlinked pages of multimedia
• Interrelated hypermedia page elements,
rather than interrelated data records
5-40
Components of Web-Based System
5-41
Data Warehouses
• Store static data that has been extracted from
other databases in an organization
• Central source of data that has been cleaned,
transformed, and cataloged
• Data is used for data mining, analytical
processing, analysis, research, decision support
• Data warehouses may be divided into data marts
• Subsets of data that focus on specific aspects
of a company (department or business process)
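The extract, clean, and transform steps, followed by carving out a data mart, might look like the following in miniature. The `orders` rows, the cleaning rules, and the helper names `clean` and `data_mart` are assumptions for illustration only.

```python
# Rows extracted from an operational database, with inconsistent formatting.
orders = [
    {"customer": " ann ", "dept": "Sales", "amount": "120.50"},
    {"customer": "RAJ", "dept": "sales", "amount": "80"},
    {"customer": "Mia", "dept": "Service", "amount": "40.25"},
]


def clean(row):
    """Clean and transform one row into a consistent format."""
    return {
        "customer": row["customer"].strip().title(),
        "dept": row["dept"].strip().title(),
        "amount": float(row["amount"]),
    }


# The warehouse holds the cleaned, transformed data in one central place.
warehouse = [clean(row) for row in orders]


def data_mart(warehouse, dept):
    """A data mart: the subset of the warehouse for one department."""
    return [row for row in warehouse if row["dept"] == dept]


sales_mart = data_mart(warehouse, "Sales")
sales_total = sum(row["amount"] for row in sales_mart)
```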
5-42
Data Warehouse Components
5-43
Applications and Data Marts
5-44
Data Mining
• Data in data warehouses are analyzed to reveal
hidden patterns and trends
• Market-basket analysis to identify new
product bundles
• Find root cause of quality or manufacturing
problems
• Prevent customer attrition
• Acquire new customers
• Cross-sell to existing customers
• Profile customers with more accuracy
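Market-basket analysis, at its simplest, counts which items are bought together. The sketch below uses five made-up baskets and reports the support (share of baskets) for each item pair. Commercial data mining tools use more scalable algorithms, so treat this as a toy illustration of the idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support: the fraction of all baskets that contain the pair.
support = {pair: count / len(baskets) for pair, count in pair_counts.items()}

# The most frequent pair is a candidate for a product bundle.
top_pair, top_count = pair_counts.most_common(1)[0]
```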
5-45
Traditional File Processing
• Data are organized, stored, and processed in
independent files
• Each business application designed to use
specialized data files containing specific
types of data records
• Problems
• Data redundancy
• Lack of data integration
• Data dependence (files, storage devices, software)
• Lack of data integrity or standardization
5-46
Traditional File Processing
5-47
Database Management Approach
• The foundation of modern methods of managing
organizational data
• Consolidates data records formerly in separate
files into databases
• Data can be accessed by many different
application programs
• A database management system (DBMS) is the
software interface between users and databases
5-48
Database Management Approach
5-49
Database Management System
• In mainframe and server computer systems, a
software package that is used to…
• Create new databases and database applications
• Maintain the quality of the data in an
organization’s databases
• Use the databases of an organization to provide
the information needed by end users
5-50
Common DBMS Software Components
• Database definition
• Language and graphical tools to define entities,
relationships, integrity constraints, and
authorization rights
• Nonprocedural access
• Language and graphical tools to access data
without complicated coding
• Application development
• Graphical tools to develop menus, data entry
forms, and reports
5-51
Common DBMS Software Components
• Procedural language interface
• Language that combines nonprocedural access
with full capabilities of a programming language
• Transaction processing
• Control mechanism prevents interference from
simultaneous users and recovers lost data after
a failure
• Database tuning
• Tools to monitor, improve database performance
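The transaction processing component can be illustrated with an all-or-nothing funds transfer in SQLite. The `account` table and the `transfer` helper are hypothetical. The sketch shows only the atomicity and rollback side of transaction control, not the locking that keeps simultaneous users from interfering with each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)        # succeeds
failed = transfer(conn, 1, 2, 500)   # would overdraw account 1: rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

In the failed transfer the credit to account 2 runs first and is then undone by the rollback, so no partial update survives.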
5-52
Database Management System
• Database Development
• Defining and organizing the content,
relationships, and structure of the data needed
to build a database
• Database Application Development
• Using DBMS to create prototypes of queries,
forms, reports, Web pages
• Database Maintenance
• Using transaction processing systems and other
tools to add, delete, update, and correct data
5-53
DBMS Major Functions
5-54
Database Interrogation
• End users use a DBMS query feature or report
generator
• Response is video display or printed report
• No programming is required
• Query language
• Immediate response to ad hoc data requests
• Report generator
• Quickly specify a format for information you
want to present as a report
5-55
Database Interrogation
• SQL Queries
• Structured, international standard query language
found in many DBMS packages
• Query form is SELECT…FROM…WHERE…
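A minimal ad hoc query in the SELECT…FROM…WHERE form, run through Python's `sqlite3` module. The `customer` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("Ann", "Boston", 1200), ("Raj", "Denver", 300), ("Mia", "Boston", 90)],
)

# SELECT names the columns, FROM names the table, WHERE states the condition.
query = "SELECT name, balance FROM customer WHERE city = ? AND balance > ?"
rows = conn.execute(query, ("Boston", 100)).fetchall()
```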
5-56
Database Interrogation
• Boolean Logic
• Developed by George Boole in the mid-1800s
• Used to refine searches to specific information
• Has three logical operators: AND, OR, NOT
• Example
• Cats OR felines AND NOT dogs OR Broadway
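Without parentheses, the result of the example depends on operator precedence. In Python, as in SQL, NOT binds tightest, then AND, then OR. Search tools may apply other rules, so that ordering is an assumption here. The sketch compares the ungrouped expression with one plausible grouped reading. The three sample documents and the `matches` helper are hypothetical.

```python
def matches(doc, grouped):
    """Evaluate the slide's search expression against one document."""
    words = set(doc.lower().split())
    cats, felines = "cats" in words, "felines" in words
    dogs, broadway = "dogs" in words, "broadway" in words
    if grouped:
        # One explicit reading: (cats OR felines) AND NOT (dogs OR Broadway)
        return (cats or felines) and not (dogs or broadway)
    # Default precedence: cats OR (felines AND (NOT dogs)) OR Broadway
    return cats or felines and not dogs or broadway


docs = ["Cats the Broadway musical", "felines and dogs", "caring for cats"]
ungrouped_hits = [d for d in docs if matches(d, grouped=False)]
grouped_hits = [d for d in docs if matches(d, grouped=True)]
```

The ungrouped form still returns the Broadway musical, which suggests that the intent behind the example needs explicit parentheses.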
5-57
Database Interrogation
• Graphical and Natural Queries
• It is difficult to correctly phrase SQL and other
database language search queries
• Most DBMS packages offer easier-to-use,
point-and-click methods
• Translates queries into SQL commands
• Natural language query statements are similar
to conversational English
5-58
Graphical Query Wizard
5-59
Database Maintenance
• Accomplished by transaction processing systems
and other applications, with the support of the
DBMS
• Done to reflect new business transactions and
other events
• Updating and correcting data, such as customer
addresses
5-60
Application Development
• Use DBMS software development tools to
develop custom application programs
• Not necessary to develop detailed data-handling
procedures using conventional programming
languages
• Can include data manipulation language (DML)
statements that call on the DBMS to perform
necessary data handling
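An application program can hand its data handling to the DBMS by embedding DML statements. The sketch below wraps SQL INSERT, UPDATE, and DELETE statements in small Python functions. The `customer` table and the function names are hypothetical, and SQLite stands in for the DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)


def add_customer(conn, name, address):
    """DML INSERT: the DBMS decides where and how the row is stored."""
    cur = conn.execute(
        "INSERT INTO customer (name, address) VALUES (?, ?)", (name, address)
    )
    return cur.lastrowid


def change_address(conn, customer_id, address):
    """DML UPDATE: correct an existing record."""
    conn.execute(
        "UPDATE customer SET address = ? WHERE id = ?", (address, customer_id)
    )


def remove_customer(conn, customer_id):
    """DML DELETE: remove a record."""
    conn.execute("DELETE FROM customer WHERE id = ?", (customer_id,))


ann = add_customer(conn, "Ann", "1 Elm St")
raj = add_customer(conn, "Raj", "9 Oak Ave")
change_address(conn, ann, "22 Pine Rd")
remove_customer(conn, raj)
remaining = conn.execute("SELECT id, name, address FROM customer").fetchall()
```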
5-61
Major DBMS Software
• MS Access
• MS SQL Server
• IBM DB2
• Oracle 9i
• MySQL (Open source DBMS)
5-62
MySQL DBMS Application
5-63
Case 3: Acxiom Corp. Data
• Acxiom does three things really well…
• Manages large volumes of data
• Cleans, transforms, and enhances that data
• Distills business intelligence from that data to
drive smart decisions
• Refined data is sold to customers
• Developing telemarketing lists
• Identifying prospects for credit card offers
• Screening prospective employees
• Detecting fraudulent financial transactions
5-64
Case 3: Acxiom Corp. Data
• Primary business activities
• Building its data library
• Selling data
• Managing other companies’ data and data centers
5-65
Case Study Questions
• Acxiom is in a unique type of business. How
would you describe the business of Acxiom?
• Are they a service- or product-oriented business?
• It is easy to see that Acxiom has focused on a
wide variety of data from different sources.
• How does Acxiom decide which data to collect,
and for whom?
• Acxiom’s business raises many issues related
to privacy.
• Are the data collected by Acxiom really private?
5-66
Case 4: Protecting the Data Jewels
• Harrah’s Entertainment and other casino
companies closely guard customer data
• Both hard copy and electronic files
• Concerns
• Broader access to CRM systems
• More frequent job switching
5-67
Case 4: Protecting the Data Jewels
• Protection methods
• Nondisclosure, non-compete, and nonsolicitation
agreements that specify customer lists
• Trade-secret laws and legal action
• Limiting access to sensitive information
• Physical security
• Strong password protection
• Reinforcement of signed agreements during
exit interviews
• Monitoring electronic communication
5-68
Case Study Questions
• Why have developments in IT helped to
increase the value of the data resources of
many companies?
• How have these capabilities increased the
security challenges associated with protecting
a company’s data resources?
• How can companies use IT to meet the
challenges of data resource security?
5-69