Chapter 3
Content Management
Information Technology & Management
Thompson Cats-Baril
The McGraw-Hill Companies, Inc. 2002
Chapter Objectives
• To understand how digital content is represented.
• To have an appreciation for how transactions are
recorded and processed.
• To understand the role of a database management
system (DBMS) in creating and using databases.
• To appreciate the different types of DBMSs
available and understand the trends in DBMSs.
• To appreciate the potential for using data mining
tools to derive insights from data stored in
databases and data warehouses.
Data Representation
• A byte is typically 8 bits.
• A bit is the smallest unit of data that information technology
can process; it is either a 1 or a 0.
• A field or data element is the smallest unit of data
that has meaning to humans.
– Examples include EmployeeNumber, EmployeeName,
Department, and StartDate.
• Field is normally used to describe the field name.
• Data element is used to describe the contents of
the field.
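To make the bit and byte definitions concrete, the following minimal Python sketch prints the 8-bit pattern behind each character of a short string. The helper name to_bits and the choice of UTF-8 encoding are assumptions made for illustration; they are not part of the original slides.

```python
def to_bits(text):
    """Return the 8-bit pattern of every byte in the UTF-8 encoding of text."""
    return [format(byte, "08b") for byte in text.encode("utf-8")]


# The letter "A" is stored as one byte, i.e. eight bits.
print(to_bits("A"))   # ['01000001']
print(to_bits("Hi"))  # ['01001000', '01101001']
```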
Data Representation
• A record is a collection of fields that contain information
concerning a specific thing or event.
– An example of an employee record would include the
four previous fields:
• EmployeeNumber=“10121”
• EmployeeName=“Greenwood, Marie-Louise”
• Department=“Customer Service”
• StartDate=“05/01/2002”
• A collection of records is called a file.
• Records are usually identified by a key field, or primary
key.
• A group of related files would be referred to as a database.
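The hierarchy of field, record, file, and database can be sketched in Python as follows. The class name EmployeeRecord, the helper find_by_key, and the second employee are hypothetical; only the first record comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """A record: a collection of fields about one employee."""
    employee_number: str  # key field (primary key)
    employee_name: str
    department: str
    start_date: str


# A file is a collection of records.
employee_file = [
    EmployeeRecord("10121", "Greenwood, Marie-Louise", "Customer Service", "05/01/2002"),
    EmployeeRecord("10122", "Example, Pat", "Accounting", "06/15/2002"),  # hypothetical
]

# A database is a group of related files; here, a dict of named files.
database = {"employees": employee_file}


def find_by_key(file, key):
    """Return the record whose primary key matches, or None if absent."""
    for record in file:
        if record.employee_number == key:
            return record
    return None


print(find_by_key(database["employees"], "10121").department)  # Customer Service
```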
File Access
• Sequential Access
– A specific record is located by starting at the beginning
of the file and scanning each record until the desired
record is located.
• Direct Access
– A specific record is located by going directly to the correct
storage location or close to it.
– One popular technique is hashing, based on a
mathematical algorithm.
• The hashing algorithm is applied to the primary key field to
generate a storage location on a physical storage device.
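The difference between the two access methods can be sketched in a few lines of Python. This is an illustration only: the modulo-based hash_key function, the slot count NUM_SLOTS, and all records except employee 10121 are assumptions, and a real DBMS maps keys to physical storage addresses rather than to list positions.

```python
NUM_SLOTS = 7  # hypothetical number of storage locations


def hash_key(primary_key):
    """Toy hashing algorithm: numeric key modulo the number of slots."""
    return int(primary_key) % NUM_SLOTS


def sequential_find(records, key):
    """Scan from the start of the file; return (record, records_read)."""
    for reads, record in enumerate(records, start=1):
        if record[0] == key:
            return record, reads
    return None, len(records)


def build_hashed_file(records):
    """Place each record in the slot chosen by hashing its primary key."""
    slots = [[] for _ in range(NUM_SLOTS)]
    for record in records:
        slots[hash_key(record[0])].append(record)
    return slots


def direct_find(slots, key):
    """Go straight to the hashed slot, then check the few records stored there."""
    for record in slots[hash_key(key)]:
        if record[0] == key:
            return record
    return None


records = [
    ("10119", "Hypothetical, A"),
    ("10120", "Hypothetical, B"),
    ("10121", "Greenwood, Marie-Louise"),
    ("10122", "Hypothetical, D"),
]
print(sequential_find(records, "10121"))                # found after reading 3 records
print(direct_find(build_hashed_file(records), "10121"))  # found in slot 6 directly
```

Keys that hash to the same slot share it, which is why direct access lands on the correct location "or close to it".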
File Access
• ISAM or indexed sequential access method
– In between sequential and direct access
– An index is maintained that points to sections
of records in the file.
– When a specific record is requested, the
database software uses the index to go to the first
record of the relevant section.
– It then reads the records in that section
sequentially until the correct record is located.
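The ISAM lookup described above can be sketched as follows. The section size, the record contents, and the function names build_index and isam_find are assumptions chosen for illustration; the sketch keeps the file sorted by key and indexes only the first key of each section.

```python
import bisect

SECTION_SIZE = 3  # hypothetical number of records per section


def build_index(sorted_records):
    """Index entry = (first key in the section, position where the section starts)."""
    return [
        (sorted_records[i][0], i)
        for i in range(0, len(sorted_records), SECTION_SIZE)
    ]


def isam_find(sorted_records, index, key):
    """Use the index to pick a section, then read that section sequentially."""
    first_keys = [first_key for first_key, _ in index]
    section = bisect.bisect_right(first_keys, key) - 1
    if section < 0:
        return None  # key is smaller than every key in the file
    start = index[section][1]
    for record in sorted_records[start:start + SECTION_SIZE]:
        if record[0] == key:
            return record
    return None


# Ten hypothetical records, sorted by their numeric key.
records = [(n, f"record-{n}") for n in range(10115, 10125)]
index = build_index(records)
print(index)                             # four index entries for ten records
print(isam_find(records, index, 10122))  # (10122, 'record-10122')
```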
Transaction Processing
• A transaction is the record of an event.
• Transaction processing involves the use of
human procedures and/or computer
programs to store, retrieve, and manipulate
records of events.
• Master File
• Transaction File – information relevant to the
most recent transactions.
Transaction Processing
• Master File
• Transaction File
• File Processing System (used to store,
retrieve, and manipulate records within
files)
• Sequential File Organization
• Data Redundancy
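As a rough illustration of how a file processing system uses these two files, the sketch below applies a transaction file to a master file of account balances. The account numbers, amounts, and the function name apply_transactions are hypothetical; transactions are sorted by key first, as they would be for a sequentially organized master file.

```python
def apply_transactions(master_file, transaction_file):
    """Return an updated copy of the master file after applying every transaction."""
    updated = dict(master_file)
    # Process transactions in key order, mirroring sequential file organization.
    for account, amount in sorted(transaction_file):
        if account not in updated:
            raise KeyError(f"no master record for account {account}")
        updated[account] += amount
    return updated


master_file = {"A100": 500.0, "A200": 120.0}                           # hypothetical balances
transaction_file = [("A200", -20.0), ("A100", 50.0), ("A100", -25.0)]  # hypothetical events

new_master = apply_transactions(master_file, transaction_file)
print(new_master)  # {'A100': 525.0, 'A200': 100.0}
```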
Data Processing
• DBMS – Database Management System
– Data Definition Language (used in conjunction with
the data dictionary)
– Data Dictionary
– Data Manipulation Language (program for
retrieving and manipulating data)
– Application Generators (easy-to-use queries for
retrieving files)
– Data Administration (e.g., backing up data)
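The roles of the data definition language, the data manipulation language, and the data dictionary can be seen in a small sketch that uses Python's built-in sqlite3 module. SQLite is used here only as a convenient stand-in for a DBMS, and the Employee table is an assumption modeled on the employee record shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition language (DDL): describe the structure of the data.
conn.execute(
    "CREATE TABLE Employee ("
    " EmployeeNumber TEXT PRIMARY KEY,"
    " EmployeeName TEXT NOT NULL,"
    " Department TEXT,"
    " StartDate TEXT)"
)

# Data manipulation language (DML): store and retrieve the data.
conn.execute(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    ("10121", "Greenwood, Marie-Louise", "Customer Service", "05/01/2002"),
)
rows = conn.execute(
    "SELECT EmployeeName FROM Employee WHERE Department = ?",
    ("Customer Service",),
).fetchall()

# Data dictionary: the catalog remembers the definitions created by the DDL.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Employee)")]

print(rows)     # [('Greenwood, Marie-Louise',)]
print(columns)  # ['EmployeeNumber', 'EmployeeName', 'Department', 'StartDate']
```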
Data Capture and Processing
• Batch Processing
– Transactions are temporarily stored and then
processed all at once.
• Real Time Processing
– Each transaction is processed as it occurs.
• OLTP – Online Transaction Processing
– Combination of online data capture and real-time processing.
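The contrast between batch and real-time processing can be sketched with a toy inventory. The Inventory class, its method names, and the quantities are hypothetical and serve only to show when the stored data actually changes.

```python
class Inventory:
    """Toy inventory that supports both processing styles."""

    def __init__(self, on_hand):
        self.on_hand = dict(on_hand)
        self.pending = []  # transactions stored temporarily for a later batch run

    def process(self, item, quantity_sold):
        """Real-time processing: update the data as the transaction occurs."""
        self.on_hand[item] -= quantity_sold

    def record_for_batch(self, item, quantity_sold):
        """Batch processing, step 1: only store the transaction."""
        self.pending.append((item, quantity_sold))

    def run_batch(self):
        """Batch processing, step 2: process all stored transactions at once."""
        for item, quantity_sold in self.pending:
            self.process(item, quantity_sold)
        processed = len(self.pending)
        self.pending.clear()
        return processed


batch = Inventory({"widget": 10})
batch.record_for_batch("widget", 3)
batch.record_for_batch("widget", 2)
print(batch.on_hand["widget"])  # still 10: nothing has been processed yet
batch.run_batch()
print(batch.on_hand["widget"])  # 5 after the batch run

realtime = Inventory({"widget": 10})
realtime.process("widget", 3)
print(realtime.on_hand["widget"])  # 7 immediately
```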
Relational Database Model
• Relational Database Model
– Relations or tables
• Two-dimensional
– Keys
• Primary Key
– Uniquely identifies each record.
• Foreign Key(s)
– The primary key of one table is placed in a second table to
maintain a relationship between the two.
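A minimal sketch of primary and foreign keys, using Python's built-in sqlite3 module as a stand-in DBMS. The Customer and CustomerOrder tables and their contents are hypothetical. SQLite enforces foreign keys only when the foreign_keys pragma is switched on, which the sketch does first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

conn.execute(
    "CREATE TABLE Customer ("
    " CustomerNumber INTEGER PRIMARY KEY,"
    " CustomerName TEXT)"
)
conn.execute(
    "CREATE TABLE CustomerOrder ("
    " OrderNumber INTEGER PRIMARY KEY,"
    " OrderTotal REAL,"
    # Foreign key: the primary key of Customer placed in a second table.
    " CustomerNumber INTEGER REFERENCES Customer(CustomerNumber))"
)
conn.execute("INSERT INTO Customer VALUES (10, 'Hypothetical Customer')")
conn.execute("INSERT INTO CustomerOrder VALUES (1, 99.5, 10)")

# The shared key lets the two tables be combined.
joined = conn.execute(
    "SELECT c.CustomerName, o.OrderTotal"
    " FROM CustomerOrder AS o"
    " JOIN Customer AS c ON c.CustomerNumber = o.CustomerNumber"
).fetchall()

# An order for a customer that does not exist breaks the relationship.
try:
    conn.execute("INSERT INTO CustomerOrder VALUES (2, 10.0, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(joined)           # [('Hypothetical Customer', 99.5)]
print(orphan_rejected)  # True
```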
Retrieving Data
• SQL—Structured Query Language
– SQL is a data manipulation language incorporated in the DBMS.
– SQL is a set of concise and powerful data management
commands.
– SELECT ORDER.OrderDate, ORDER.OrderTotal
FROM ORDER
WHERE ORDER.CustomerNumber=10
– SQL can be embedded in a programming language
(embedded SQL).
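In the spirit of embedded SQL, the sketch below runs the slide's query from inside a Python program through the built-in sqlite3 module. Several details are assumptions: the sample rows are invented, the ORDER BY clause is added only to make the output order predictable, and the table name is written as "ORDER" in double quotes because ORDER is a reserved word in SQLite. Classic embedded SQL relies on a precompiler, whereas this example passes the SQL text to the DBMS through a library call.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "ORDER" (OrderDate TEXT, OrderTotal REAL, CustomerNumber INTEGER)'
)
conn.executemany(
    'INSERT INTO "ORDER" VALUES (?, ?, ?)',
    [  # hypothetical sample rows
        ("2002-05-01", 120.0, 10),
        ("2002-05-02", 35.5, 11),
        ("2002-05-03", 60.0, 10),
    ],
)


def orders_for_customer(customer_number):
    """Run the slide's SELECT for one customer, supplied as a program variable."""
    query = (
        'SELECT "ORDER".OrderDate, "ORDER".OrderTotal '
        'FROM "ORDER" '
        'WHERE "ORDER".CustomerNumber = ? '
        'ORDER BY "ORDER".OrderDate'
    )
    return conn.execute(query, (customer_number,)).fetchall()


print(orders_for_customer(10))  # [('2002-05-01', 120.0), ('2002-05-03', 60.0)]
```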
Presenting Information
• A report generator is a group of programs
designed to facilitate the creation of
standard, formatted output that is referred to
as a report.
– Paper
– Computer monitor
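What a report generator does can be hinted at with a very small sketch that turns rows of data into aligned, titled text output. The function name generate_report and the sample order data are hypothetical; commercial report generators add totals, grouping, and page layout on top of this idea.

```python
def generate_report(title, headers, rows):
    """Format rows as a simple fixed-width text report."""
    # Each column is as wide as its longest value, header included.
    widths = [max(len(str(value)) for value in column) for column in zip(headers, *rows)]

    def format_line(values):
        cells = (str(value).ljust(width) for value, width in zip(values, widths))
        return "  ".join(cells).rstrip()

    lines = [title, format_line(headers), format_line(["-" * w for w in widths])]
    lines += [format_line(row) for row in rows]
    return "\n".join(lines)


report = generate_report(
    "Orders for customer 10",
    ["Date", "Total"],
    [("2002-05-01", 120.0), ("2002-05-03", 60.0)],  # hypothetical data
)
print(report)
```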
DBMS Vendors
• IBM
– Largest share of DBMSs running on a
mainframe.
• Oracle
– Leader in DBMSs running on servers.
• Microsoft
Performance Criteria of DBMSs
• Cost
– Includes software license fees
– Service and maintenance fees
– Consulting fees for installation
• Compatibility
– Ability to support necessary applications without major
modification
• Capacity
– Number of simultaneous users
– Volume of transactions
Object Oriented Database Model
• An Object-Oriented Database Management
System is based on a model that integrates
object-oriented concepts with the database
system.
• Object Oriented Databases
• CAD – Computer Aided Design
• Object Query Language (OQL)
Data Warehouses
• A data warehouse is a special type of
database that is designed to support decision
making rather than transaction processing.
– OLAP or Online Analytical Processing: online
systems that access databases and data
warehouses and then process data to support
decision making
– Data Mart: a smaller subset of a data warehouse
– Multidimensional Database (not just 2D)
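The idea of a multidimensional database can be sketched with sales facts that carry three dimensions instead of the two of a flat table. The dimensions, the figures, and the function name roll_up are hypothetical; the sketch shows the kind of summarizing along chosen dimensions that OLAP tools perform.

```python
from collections import defaultdict

DIMENSIONS = ("region", "product", "quarter")

# Hypothetical facts: (region, product, quarter, sales)
facts = [
    ("East", "Widget", "Q1", 100),
    ("East", "Widget", "Q2", 150),
    ("East", "Gadget", "Q1", 80),
    ("West", "Widget", "Q1", 120),
    ("West", "Gadget", "Q2", 60),
]


def roll_up(facts, *dims):
    """Total sales for every combination of the chosen dimensions."""
    positions = [DIMENSIONS.index(dim) for dim in dims]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[position] for position in positions)
        totals[key] += fact[3]
    return dict(totals)


print(roll_up(facts, "region"))              # {('East',): 330, ('West',): 180}
print(roll_up(facts, "product", "quarter"))  # totals per product and quarter
print(roll_up(facts))                        # {(): 510}, the grand total
```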
Data Mining: getting the most out of
the data that have been collected
• Customer Relationship Management or
CRM
• Query and Reporting
• Neural Network Tools (use raw data points
as inputs and attempt to identify patterns)
• Ad targeting and direct marketing
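As a heavily simplified illustration of how a neural network tool learns a pattern from raw data points, the sketch below trains a single artificial neuron (a perceptron). The customer data is invented: each input is (opened_email, visited_site) and the label says whether the customer bought. Real data mining tools use far larger networks and data sets.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, rate=1):
    """Adjust weights a little after every wrong prediction."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Hypothetical pattern: customers bought only if they did both things.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(samples)

for inputs, target in samples:
    print(inputs, "predicted:", predict(weights, bias, inputs), "actual:", target)
```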
Distributed Databases
• A distributed database is one in which the database
is duplicated, allowing users at different
locations to access exact replicas of the
database.
• Issues with Distributed Databases
– Identical Copies
– Backups
• Security (Imarbank lost its entire database for
weeks)
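The "identical copies" issue can be sketched with a toy replicated database in which every write is applied to all locations. The class name ReplicatedDatabase, the location names, and the stored values are hypothetical, and real systems must also cope with network failures and concurrent updates, which this sketch ignores.

```python
class ReplicatedDatabase:
    """Toy database that keeps one full copy per location."""

    def __init__(self, locations):
        self.replicas = {location: {} for location in locations}

    def write(self, key, value):
        """Apply the write to every replica so the copies stay identical."""
        for replica in self.replicas.values():
            replica[key] = value

    def read(self, location, key):
        """Users read from the copy at their own location."""
        return self.replicas[location].get(key)

    def copies_identical(self):
        copies = list(self.replicas.values())
        return all(copy == copies[0] for copy in copies)


db = ReplicatedDatabase(["Boston", "Paris"])  # hypothetical locations
db.write("10121", "Customer Service")
print(db.read("Paris", "10121"))  # Customer Service
print(db.copies_identical())      # True

# A change made to only one copy makes the replicas diverge.
db.replicas["Paris"]["10121"] = "Accounting"
print(db.copies_identical())      # False
```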