Chapter 9
Database Systems
Courtesy of Chris Pascucci, Shelly/Vermaat,
Joanne Nichols
Database Basics
 Database
– Collection of data on a specific topic or purpose that is stored for
future use.
– Data is organized so you can access, retrieve, sort, and edit data.
 Database Management System (DBMS)
– Software used to create, use, and manage a database.
– Create forms, reports, and queries.
 Database System
– Consists of a database, a DBMS, and applications.
– Applications such as e-commerce and scheduling.
– University example: registration applications, financial applications,
etc…
Database Basics
 Data
– Unprocessed items like raw facts, numbers, text, etc…
 Information
– Data that has been processed in an organized and meaningful way.
 A major function of a computer is to process data into
information.
Database Integrity
 Data integrity is maintained when data is accurate and up-to-date.
 Garbage In, Garbage Out (GIGO)
– Computer phrase that means you cannot create correct information from
incorrect data.
– When garbage goes in, garbage comes out and data integrity is lost.
Characteristics of Information
Accurate
Verifiable
Timely
Organized
Accessible
Useful
Cost-effective
Data dictionary
A data dictionary contains data about each file in the database and each
field in those files.
Validating Data
Alphabetic/Numeric check
Range check
Consistency check
Completeness check
Check digit
Other checks
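The checks listed above can be sketched in Python. This is only an illustration under stated assumptions, not a method taken from the slides: the function names and sample values are hypothetical, and the Luhn algorithm is used as one common example of a check digit scheme.

```python
def alphabetic_check(value):
    """Alphabetic check: the field contains only letters (spaces are ignored)."""
    return value.replace(" ", "").isalpha()


def numeric_check(value):
    """Numeric check: the field contains only digits."""
    return value.isdigit()


def range_check(value, low, high):
    """Range check: the number falls between low and high, inclusive."""
    return low <= value <= high


def consistency_check(start_date, end_date):
    """Consistency check: two related fields must agree with each other.
    ISO dates (YYYY-MM-DD) can be compared as plain strings."""
    return start_date <= end_date


def completeness_check(record, required):
    """Completeness check: every required field is present and not blank."""
    return all(str(record.get(name, "")).strip() for name in required)


def check_digit_ok(number):
    """Check digit (Luhn, assumed here): the weighted digit sum ends in 0."""
    total = 0
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0
```

A data entry form would typically run such checks before a record is saved, which is how a DBMS helps keep garbage from getting in.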
Database Systems
 Multiple users can interact with the same database
The Hierarchy of Data
 A database contains files, files contain records, and records
contain fields.
 Database
– A collection of integrated and related files.
 Files
– A collection of related records.
 Records
– A collection of related fields.
 Fields
– A collection of characters that describe some aspect of an object.
– A single piece of information like a name, number, city, state, etc…
Data File Example
Member ID  First Name  Last Name  Address             City         State
2295       Milton      Brewer     54 Lucy Court       Shelbyville  IN
2928       Shannon     Murray     33099 Clark Street  Montgomery   AL
3876       Louella     Drake      33 Timmons Place    Cincinnati   OH
3928       Adelbert    Ruiz       99 Tenth Street     Carmel       IN
4872       Elena       Gupta      2 East Penn Drive   Pittsboro    IN
Each row is a record and each column is a field; Member ID is the key field.
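The hierarchy of file, record, and field can be sketched in Python using the member records above. This is a minimal illustration: the variable and function names are hypothetical, and it assumes Member ID is the key field, as the figure labels suggest.

```python
# Field names for one record (the columns of the data file).
FIELDS = ("member_id", "first_name", "last_name", "address", "city", "state")

# The file: one tuple of field values per record.
ROWS = [
    (2295, "Milton", "Brewer", "54 Lucy Court", "Shelbyville", "IN"),
    (2928, "Shannon", "Murray", "33099 Clark Street", "Montgomery", "AL"),
    (3876, "Louella", "Drake", "33 Timmons Place", "Cincinnati", "OH"),
    (3928, "Adelbert", "Ruiz", "99 Tenth Street", "Carmel", "IN"),
    (4872, "Elena", "Gupta", "2 East Penn Drive", "Pittsboro", "IN"),
]

# Each record becomes a dict that maps field name to field value.
records = [dict(zip(FIELDS, row)) for row in ROWS]

# The key field uniquely identifies a record, so it can serve as an index.
by_member_id = {record["member_id"]: record for record in records}


def find_member(member_id):
    """Return the record with this key field value, or None if absent."""
    return by_member_id.get(member_id)
```

Because every key field value is unique, `find_member(3876)` can return at most one record, which is what makes a key field suitable for locating records quickly.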
Database Management Systems
File Maintenance
Modifying Records
Adding Records
Deleting Records
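The three file maintenance operations can be sketched with Python's built-in sqlite3 module. This is an illustration only: the table name, the columns, and the change of city are assumptions made for the example, not part of the slides.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member ("
    "member_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, "
    "city TEXT, state TEXT)"
)

# Adding records
conn.executemany(
    "INSERT INTO member VALUES (?, ?, ?, ?, ?)",
    [
        (2295, "Milton", "Brewer", "Shelbyville", "IN"),
        (2928, "Shannon", "Murray", "Montgomery", "AL"),
    ],
)

# Modifying a record (hypothetical change: member 2928 moves to Carmel, IN)
conn.execute(
    "UPDATE member SET city = ?, state = ? WHERE member_id = ?",
    ("Carmel", "IN", 2928),
)

# Deleting a record
conn.execute("DELETE FROM member WHERE member_id = ?", (2295,))

remaining = conn.execute(
    "SELECT member_id, city, state FROM member"
).fetchall()
```

Each statement locates its record by the key field, so a change affects exactly one member.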
Benefits of Using a DBMS
 Enter data quickly and easily.
 Organize records in different and useful ways.
 Locate records quickly.
 Eliminate redundant data.
 Create queries for specific data.
 Create reports.
Database Approach to Data Management
 Database Approach
– Many applications and users can share data in a database.
– Secures data so only authorized users can access it.
• Access privileges (none, read-only, full-update)
• Principle of least privilege
– Provides a means to back up data.
– Requires a DBMS.
 File Processing System Approach
– Each department/area within an organization has its own set of files.
– Data redundancy - same data stored in multiple files.
– Isolated data - data stored in files at various physical locations is difficult to access.
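The access privileges named under the database approach (none, read-only, full-update) and the principle of least privilege can be sketched as follows. The user names and the grants table are hypothetical, and a real DBMS enforces privileges internally rather than in application code like this.

```python
from enum import IntEnum


class Privilege(IntEnum):
    """The three access levels from the slide, ordered from least to most."""
    NONE = 0
    READ_ONLY = 1
    FULL_UPDATE = 2


# Hypothetical grants: each user gets only the access the job requires.
GRANTS = {
    "registrar": Privilege.FULL_UPDATE,
    "advisor": Privilege.READ_ONLY,
}


def privilege_of(user):
    """Least privilege: anyone without an explicit grant gets no access."""
    return GRANTS.get(user, Privilege.NONE)


def can_read(user):
    return privilege_of(user) >= Privilege.READ_ONLY


def can_update(user):
    return privilege_of(user) >= Privilege.FULL_UPDATE
```

Defaulting to `Privilege.NONE` is the design choice that matters here: access has to be granted deliberately instead of being removed after the fact.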
Benefits of Using a Database Approach
File Processing vs DBMS
Database Management
 Creating and implementing the right database system
involves:
– Determining how data is stored and retrieved.
– How people will see and use the database.
– How the database will be created and maintained.
– How reports and documents will be generated.
Types of Databases
 Relational Databases (most commonly used)
 Object-Oriented Databases
 Multi-Dimensional Databases (used for data warehouses)
 Others…
Relational Databases
 A relational database stores data in tables that consist of
rows and columns.
 Most common type of database, used for payroll, inventory,
ordering, and other business-related functions.
 Also stores data relationships, which are connections within
the data.
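A data relationship between two tables can be sketched with Python's built-in sqlite3 module. The table names, columns, and rows below are hypothetical and chosen only to show a shared column connecting the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    -- purchase.customer_id refers to customer.customer_id: the relationship
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        item        TEXT
    );
    INSERT INTO customer VALUES (1, 'Brewer'), (2, 'Murray');
    INSERT INTO purchase VALUES
        (10, 1, 'Keyboard'), (11, 1, 'Monitor'), (12, 2, 'Mouse');
    """
)

# A join follows the relationship to combine rows from both tables.
rows = conn.execute(
    """
    SELECT customer.name, purchase.item
    FROM purchase
    JOIN customer ON purchase.customer_id = customer.customer_id
    ORDER BY purchase.purchase_id
    """
).fetchall()
```

The customer's name is stored once and referenced by ID from each purchase, which is how the relational model avoids redundant data.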
Object-Oriented Databases
 An object-oriented database stores data in objects.
 An object is an item that contains data, as well as actions
that read and process the data.
 Mainly used for multimedia databases (video, audio,
graphics), CAD (computer-aided design), and Web
databases.
Multi-Dimensional Databases
 A multi-dimensional database stores data in dimensions.
 Multiple dimensions, also called a hypercube, allow users to
analyze any view of the data.
 Can consolidate data much faster than a relational database.
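Viewing the same data along different dimensions can be sketched in Python. The dimensions (product, region, quarter) and the sales figures are hypothetical, and a real multi-dimensional database precomputes and indexes such totals instead of summing a dictionary on demand.

```python
from collections import defaultdict

DIMENSIONS = ("product", "region", "quarter")

# Each cell of the cube is addressed by one value per dimension.
cells = {
    ("Laptop", "East", "Q1"): 120,
    ("Laptop", "West", "Q1"): 80,
    ("Laptop", "East", "Q2"): 150,
    ("Tablet", "East", "Q1"): 60,
    ("Tablet", "West", "Q2"): 90,
}


def roll_up(cube, *keep):
    """Consolidate the cube: sum over every dimension not named in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[i] for i in positions)] += value
    return dict(totals)


by_product = roll_up(cells, "product")
by_region_quarter = roll_up(cells, "region", "quarter")
grand_total = roll_up(cells)
```

Each call to `roll_up` is a different view of the same cells, which is the sense in which users can analyze any view of the data.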
Large-Scale Databases
 Data Warehouse
– Huge database that stores and manages massive amounts of data.
– Holds important information from a variety of sources.
– Usually holds a subset of the data from multiple databases.
 Data Mart
– Smaller version of a data warehouse.
– Often developed for a specific purpose.
• Examples: sales department, inventory and shipping department, finance
department, upper-level management, and so on. Regional operating centers
might each have their own data mart that contributes to the master data
warehouse.
Large-Scale Databases
 Data Mining
– A technique used to extract information from a data warehouse or a
data mart.
– Sort through huge amounts of data to find patterns and establish
relationships among the data.
 Business Intelligence
– Business use of data mining can help increase efficiency, reduce
costs, or increase profits.
– Identifies trends.
– Identifies patterns in customer behaviors.
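Finding patterns and relationships in data, as data mining is described above, can be sketched in Python by counting which products are bought together. The transactions are hypothetical, and simple pair counting is used here as one basic technique; production data mining tools use far more elaborate methods.

```python
from collections import Counter
from itertools import combinations

# Hypothetical point-of-sale transactions: the items in each basket.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "milk"},
    {"diapers", "juice"},
    {"bread", "milk", "juice"},
]


def frequent_pairs(baskets, min_count):
    """Return item pairs that appear together in at least min_count baskets."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. (bread, milk).
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


patterns = frequent_pairs(transactions, min_count=2)
```

A pattern such as bread and milk selling together is the kind of customer behavior that business intelligence turns into stocking or merchandising decisions.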
Example of Data Mining
Wal-Mart captures point-of-sale transactions from over 2,900
stores in 6 countries and continuously transmits this data to its
massive 500+ terabyte data warehouse (1 terabyte = 1 trillion characters, or bytes).
Can determine what products are selling well or poorly in which
regions.
Database is refreshed every hour.
Wal-Mart allows more than 3,500 suppliers to access data on their
products and perform data analyses.
These suppliers use this data to identify customer buying patterns
at the store display level. They use this information to manage
local store inventory and identify new merchandising opportunities.
Data Mining
 Some concerns regarding data mining
– DARPA (Defense Advanced Research Projects Agency) developed the
TIA (Terrorism Information Awareness) project.
• Main goal of TIA is to preemptively uncover and disrupt terrorist attacks.
• TIA helps the U.S. government monitor daily transactions such as credit card
transactions, airline tickets, rental cars, passports, driver’s licenses, etc…
– Medical Information
• Prescription reminders sent from a pharmacy require access to certain personal
information.
• Profiling patients based on factors such as age, gender, disease, etc…
– Clinicians must make patients aware of how their information may be used.
– Limitations:
• Data mining tools are not self-sufficient applications and require trained specialists
to analyze the information generated by these tools.
• Patterns and connections that are found depend on “real world” circumstances
that may be coincidental and not necessarily a threat.
Databases
 How are databases important to us?
– Shop for products or services
– Buy or sell stocks
– Search for a job
– Make airline reservations
– Register for college classes
– Check semester grades
Databases in Action
NCIC – National Crime Information Center
 FBI’s huge database created in 1967 under J. Edgar Hoover.
 Over 15 million active records in 19 files.
 Makes available a variety of records for law enforcement and security
purposes.
 Information in this database assists in:
– Apprehending fugitives
– Locating missing persons
– Locating and returning stolen property
 http://www.fbi.gov/about-us/cjis/ncic
Databases in Action
National DNA Database
 Originally intended for sex offenders – has since been extended to
include almost any criminal offender.
 FBI uses this database to store missing persons’ DNA.
 Stores DNA samples from crime scenes.
 Used to identify unidentified human remains.
 US has over 9 million records in CODIS (Combined DNA Index System),
the largest DNA database in the world!
Databases in Action
National Security Agency (NSA) Database
 Described as the “largest database ever assembled in the world” by an unnamed
source in the NSA. Contains hundreds of billions of records of telephone calls.
 Existence of this database and the NSA program that compiled it was
unknown to the general public until USA TODAY broke the story in May
2006.
 Records and saves all phone calls ever made and all telecommunications
via a “black room” called “Room 641A”.
 Supercomputers analyze all data in the database to find certain flags.
– Terrorist chatter