Chapter 10: Databases: Controlling the Information
Every week, we have the capacity to double
our current store of information.
The Computer Continuum
10-1
Databases:
Controlling the Information

In this chapter:
• Why has the last third of the 20th century been dubbed “The
Information Age”?
• What computer readable media are used to collect data?
• How are statistical methods used to transform data into
information?
• What are the elements of a database?
• How are Database Management Systems used to organize
data?
• What are some advantages and disadvantages of databases?
• What are some advantages of Web-based databases?
Introduction:
Information Overload

The last third of the 20th century was dubbed “The
Information Age”.
• We are inundated with information: facts, figures, opinions,
stories, pictures, records, predictions, etc.
• Internet users generate a terabyte of information daily, the equivalent
of 70 million 300-page books per month.
• We no longer store the bulk of it on paper:
– Computer technology provides many types of storage
media that are economical, longer-lasting, and easier to
access.
Introduction:
Information Overload

We will examine the role computers play in:
• Collecting and manipulating large amounts of data.
• Accessing that data in a timely fashion.
• Analyzing and formatting it for easy understanding.

The terms data and information are often used
interchangeably. Here, however, each has a specific
meaning:
• Data: A given thing or fact. (Computers process data.)
• Information: Data repackaged into a meaningful form that
we can understand and use.
Introduction:
Information Overload

In order to include whatever data might be needed, a
database must be carefully designed.
• Database programmers and the users of the databases must
answer the following questions:
– How can we efficiently collect large amounts of relevant
data?
– How can we reliably store that data for later use?
– Who will use the data?
– How often will they need to access the data?
– What format will they need the data in (text, numerical,
visual)?
– How will they use the data?
The Technology of
Data Collection

Early computers provided no means to collect data or
to store it from one program execution to the next.
• Each piece of data needed for a calculation had to be input
during the actual execution of the program.
• Earliest form of computer data was collected and stored on
paper.
– Hollerith card: Used for external storage of data.
• Each card consisted of 80 columns of digits from 0 to 9
(each column represented one character).
• Data was entered using a keypunch machine, and
read with a card reader.
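
The column layout described above can be sketched in Python. This is a simplified model, not a faithful encoding: it assumes only the ten digit rows (real cards also carried zone rows used for letters and symbols), and the names `punch_card` and `read_card` are hypothetical helpers introduced for illustration.

```python
from typing import List, Optional

COLUMNS = 80  # one character per column, as described above
DIGITS = "0123456789"


def punch_card(text: str) -> List[Optional[int]]:
    """Return an 80-column card; each entry is the punched digit row, or None if blank."""
    if len(text) > COLUMNS or any(ch not in DIGITS for ch in text):
        raise ValueError("expected at most 80 characters, digits only")
    card: List[Optional[int]] = [None] * COLUMNS
    for column, ch in enumerate(text):
        card[column] = int(ch)  # punch the row that matches the digit
    return card


def read_card(card: List[Optional[int]]) -> str:
    """Play the card reader: turn punched columns back into text."""
    return "".join(str(row) for row in card if row is not None)


card = punch_card("1890")
print(read_card(card))
```

The keypunch step corresponds to `punch_card` and the card reader to `read_card`; unused columns simply stay blank.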
The Technology of
Data Collection

Data collected on paper in a form that the
computer could read directly:
• Mark-Sensor Data Collection
Sheets: A sheet of paper used to
collect responses to multiple
choice questions.
– Using a graphite pencil, the
responder fills in small
spaces indicating answers.
– A computer scans forms to
read marks.
The Technology of
Data Collection

Remote electronic data sensing:
• Uses a remote sensing device
such as a satellite to collect
information.
• Satellites collect millions of times
more data in one day than an
entire 10 year census.
• Hubble Space Telescope collects
85 million bytes per second.
• Right: Satellite image of
Washington D.C.
The Technology of
Data Collection

Bar codes:
• Used for data collection.
• Uses a combination of thick and
thin lines to identify a specific
item in an inventory.
The Technology of
Data Collection

Data probe tools:
• A key-like instrument which, when inserted into a meter, will
electronically read and record the meter location and usage
numbers into a computer.

Voice recognition data entry:
• The collector of information speaks into a microphone of a
portable terminal. The receiving computer accepts the voice
transmission, and transcribes it into a form the computer can
store.
– A laptop can be used as a remote terminal.
The Technology of
Data Collection

Online Interactive Data Entry:
• New data can be entered into a database from a Web browser
screen.
• A company provides a user with an on-screen form in a secure
environment.
– Secure environment: A Web site that protects the security
of the data being entered or displayed, and the privacy of
the user to whom the data belongs.
– Most common use is in electronic commerce.
• Order taken, payment made, shipment made,
confirmation sent.
Retrieving Data

Effective data retrieval is affected by how well the files
were organized.
• FBI Fingerprint Processing - A Study in Data Collection,
Storage, and Retrieval.
– FBI maintains a database of over 270 million sets of
fingerprints.
– The FBI fingerprint collection process:
• Fingertips are inked and the prints are collected on a card.
• Completed cards, including personal information, are
sent to the FBI.
• 800+ technicians determine each print’s Henry
classification. (classifies according to ridge patterns)
Retrieving Data
• The FBI fingerprint storage process:
– The information and images on the cards are scanned into
the computer’s memory.
• The computer operator:
» Adjusts for location and orientation.
» Computes the center points.
» The computer then scans the print and determines its
classification.
– Until the mid-1970s, all fingerprint data was stored on cards
in file cabinets.
Retrieving Data
• The FBI fingerprint retrieval process:
– After the information and images on the cards have been scanned
into the computer’s memory and classified:
• The set is compared to others of the same classification.
• If sufficient points of identification are found, a match is
declared.
• A final check is performed by a human technician to verify
the match.
• Before computers: 1400 fingerprint technicians processed
24,000 requests per day.
• After computers: Over 30,000 sets are processed with fewer
than half the technicians.
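
The retrieval steps above (compare only within the same classification, then count points of identification) can be sketched as follows. This is a toy illustration, not the FBI's method: the classification labels, the coordinate pairs standing in for identification points, and the threshold `MIN_MATCHING_POINTS` are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical threshold: how many shared points count as "sufficient".
MIN_MATCHING_POINTS = 3


def build_index(records):
    """Group stored sets by classification so a search stays within one group."""
    index = defaultdict(list)
    for record in records:
        index[record["classification"]].append(record)
    return index


def find_candidates(index, query):
    """Return IDs of stored sets that share enough points with the query set."""
    candidates = []
    for record in index.get(query["classification"], []):
        shared = record["points"] & query["points"]
        if len(shared) >= MIN_MATCHING_POINTS:
            candidates.append(record["id"])
    return candidates  # a human technician would still verify each candidate


# Made-up records for illustration only.
stored = [
    {"id": "A1", "classification": "loop", "points": {(1, 2), (3, 4), (5, 6), (7, 8)}},
    {"id": "B2", "classification": "loop", "points": {(9, 9), (3, 4)}},
    {"id": "C3", "classification": "whorl", "points": {(1, 2), (3, 4), (5, 6)}},
]
index = build_index(stored)
query = {"classification": "loop", "points": {(1, 2), (3, 4), (5, 6)}}
print(find_candidates(index, query))
```

Grouping by classification first is what keeps the search manageable: the query is never compared against record C3, even though its points happen to coincide.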
Retrieving Data

Visualization of Information
• Uses many different
techniques to transform any
form of data into a visual
image.
Figure: Population densities in 1979.
Retrieving Data

Ultrasound Imaging: A medical diagnostic technique that
provides visual images constructed from the sounds that reflect
off various organs of the body.
The Role of Statistics:
Transforming Data/Information

The retrieval process involves the examination,
summarization, and manipulation of data into
information.
• Most commonly used method of transforming data is
statistical analysis.
– Some important statistical concepts include:
• Percents
• Probability
• Selecting data for statistical analysis (Sampling)
• Normal distribution
• Correlation
The Role of Statistics:
Transforming Data/Information

Percents: A special type of fraction whose denominator is 100. It is
the number of parts in question out of a total of 100 parts.
• 25% of 100 people would be 25 people.
• 20% of 10 dogs would be 2 dogs (20/100 of 10).
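
The two examples above follow the same arithmetic, (percent / 100) × total. A minimal sketch, where `percent_of` is a hypothetical helper name:

```python
def percent_of(percent: float, total: float) -> float:
    """Return `percent` percent of `total`, i.e. (percent / 100) * total."""
    return percent / 100 * total


print(percent_of(25, 100))  # 25% of 100 people
print(percent_of(20, 10))   # 20% of 10 dogs
```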

Probability: Deals with our ability to predict whether
certain events will occur or not.
• “50% chance of showers this evening”: This prediction is
based on the probability that certain patterns of observed
weather will continue.
The Role of Statistics:
Transforming Data/Information

Gathering data on a particular topic:
• Sample: A small group used to represent the much larger
group.
• Sampling: The technique of predicting a total situation using
comparatively few isolated representatives.
** A sample must be carefully selected to avoid
predetermined statistical results. **
• Skewed sample: A sample in which the selection of the group
participating in a survey favors some predetermined outcome.
– If a sample is randomly chosen, we can expect some
semblance of a normal distribution.
• Normal distribution: About 68% of all data values fall within
one standard deviation of the mean, a limited range near the
center of the distribution.
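
The 68% figure can be checked with a small simulation, taking the "limited range near the center" to be one standard deviation either side of the mean (the usual convention). The sketch below draws random values from a normal distribution using Python's standard library; the function name, sample size, and seed are arbitrary choices for illustration.

```python
import random


def fraction_within_one_sd(n: int, mean: float = 0.0, sd: float = 1.0, seed: int = 0) -> float:
    """Draw n normally distributed values and return the share within one sd of the mean."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sample
    values = [rng.gauss(mean, sd) for _ in range(n)]
    inside = sum(1 for v in values if abs(v - mean) <= sd)
    return inside / n


# With a large random sample the result should land close to 0.68.
print(fraction_within_one_sd(100_000))
```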
The Role of Statistics:
Transforming Data/Information

Performing Statistical Analyses on data:
• Correlation: A connection or relation linking two or more
pieces of information.
• Example: There is a correlation between index finger length
and whether a person is male or female.
– Index finger measurements of 100 males and 100 females:
Gender    Mean Index Finger Length
Male      72 millimeters
Female    68 millimeters
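
One common way to put a number on such a connection is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is an illustration only: the slide does not say which measure was used, and the eight measurements are made up rather than taken from the study above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up measurements for illustration only (1 = male, 0 = female).
gender = [1, 1, 1, 1, 0, 0, 0, 0]
length = [74, 71, 73, 70, 69, 67, 68, 66]
r = pearson_r(gender, length)
print(round(r, 2))  # a positive value: longer fingers go with the "male" code
```

A value near 0 would mean no linear relationship; a value near 1 or -1 would mean a strong one. Correlation alone says nothing about cause and effect.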
The Role of Statistics:
Transforming Data/Information

Performing Statistical Analyses on data:
• False Correlation or False Relevance: Involves the creation
of a cause and effect relationship between two facts that seem
to be related but are not. (The facts might be true, but the
relationship is not!)
– Fact 1: All human beings breathe oxygen.
– Fact 2: All human beings must die sometime.
– False relevance: Oxygen must be toxic, because 100% of
those breathing oxygen today will die in the future.
Creating a Custom Database

Database: An organized collection of information.
• The arrangement of a typical database:
– Field: A location which contains one specific piece of
information.
– Record: A collection of related data items.
– File: A group of records, all of the same type.
Example record fields: Last, First, ID, Phone#, DOB.
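
The field / record / file arrangement maps naturally onto a small data structure. A minimal sketch, assuming the five fields shown above; the class name `PersonRecord` and the sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PersonRecord:
    """One record: a collection of related fields."""
    last: str
    first: str
    id_number: int
    phone: str
    dob: str  # kept as text here; a date type would also work


# A "file" is a group of records, all of the same type.
people_file = [
    PersonRecord("Doe", "Jane", 1, "555-0100", "1980-01-15"),
    PersonRecord("Roe", "Richard", 2, "555-0101", "1975-06-30"),
]

# Reading one field from one record.
print(people_file[0].last)
```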
Creating a Custom Database

DBMS (Database Management System): A software
application that allows you to store, organize and
retrieve data from one or more databases.
• Combines (into a complete package):
– Structural elements of a database (fields, records, files).
– A query language.
– Programs for data modification.
– Programs for statistical analysis.
– Report writing.
Creating a Custom Database

Using a Relational Database System
• Relational Database: A commonly used type of database based on
the relational model.
– Uses two-dimensional tables, called relations, to store
data.
• Relations are linked to each other by common fields.
– The relational model has two important features:
• Its structure is simple and direct. (data is stored in tables)
• Its structure is well suited to the client/server environment.
» Involves two computers connected by a network.
The database resides on one (the server), and the software
needed to access the data resides on the other (the client).
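
The idea of relations linked by a common field can be sketched with Python's built-in `sqlite3` module. SQLite is an embedded database rather than a client/server one; it is used here only so the example is self-contained. The table and column names are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Jane Doe'), (2, 'Richard Roe');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 1, 'mouse'), (12, 2, 'monitor');
""")

# The two tables (relations) are linked by their common field, customer_id.
rows = conn.execute("""
    SELECT customers.name, orders.item
    FROM customers JOIN orders ON customers.customer_id = orders.customer_id
    ORDER BY orders.order_id
""").fetchall()
print(rows)
```

Each customer's name is stored once in `customers`; the join pairs it with every matching row in `orders`.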
Creating a Custom Database

Steps to create a new database using a DBMS:
1. Decide what information you might need about the subject.
2. Define the structure of your database. (Setting up the fields)
3. Enter the information about each item.
4. Select exactly the information you wish to extract.
5. Update the database.
6. Print out all or any part of the database in a format of your
choice.
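
The six steps above can be walked through in a few lines, again using Python's built-in `sqlite3` module as a stand-in for whichever DBMS is chosen. The `books` table and its contents are hypothetical.

```python
import sqlite3

# Steps 1-2: decide what information is needed and define the fields.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT, year INTEGER)")

# Step 3: enter the information about each item.
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("Book A", "Author X", 1999), ("Book B", "Author Y", 2001)],
)

# Step 4: select exactly the information you wish to extract.
recent = conn.execute("SELECT title FROM books WHERE year > ?", (2000,)).fetchall()

# Step 5: update the database.
conn.execute("UPDATE books SET year = ? WHERE title = ?", (2000, "Book A"))

# Step 6: print all or part of the database in a format of your choice.
for title, author, year in conn.execute(
    "SELECT title, author, year FROM books ORDER BY title"
):
    print(f"{title:<10} {author:<10} {year}")
```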
Creating a Custom Database

Using information to enhance targeted marketing in the
business world.
• Data warehouses or data marts: The collection and
consolidation of data from many individual sources into
centralized “warehouses”.
– Can apply the data to:
• Provide better customer service.
• Do better marketing analysis.
• Spot problems or opportunities.
• Data mining: Searching collections of databases to discover
relationships and global patterns that exist among them, and
applying these patterns to assist in management decisions.
Creating a Custom Database

Database Advantages
• Space saver. - Only need one copy.
• Increase accuracy. - Less chance of human error.
• Multiple use of data.
• Data integrity. - Securing data is easier.
• Time saver due to search abilities.
• Easier to use the data. - Different questions can be asked of
the data.
Creating a Custom Database

Ethical Hazards of Database Systems:
• Misrepresentation of data. - Can you “trust” the source of the
statistical analyses?
• Invasion of privacy
– Large databases hold personal information about each of
us.
– UID (Universal Identifier): The collection of all citizens’
data obtained from a single source.
Web-Database Connectivity

Dynamically generated Web sites
• Web-database connectivity: The interaction between one or
more web pages and the contents of a specific database.
– The Web pages are designed as templates.
• Image areas and text boxes to be filled in with data
from the database.
• Separates the design and layout of the Web page
from the content to be displayed on the screen.
• Query-based programming: Uses a fourth-generation
query language to select Web content from a database
and display it on a dynamically generated Web page.
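
The template idea can be sketched with Python's standard library: a page layout with placeholders, filled in from a database query at request time. The `products` table, the template text, and the `render_product` helper are hypothetical names chosen for illustration.

```python
import sqlite3
from html import escape
from string import Template

# The layout lives in the template; the content lives in the database.
PAGE_TEMPLATE = Template("<h1>$name</h1>\n<p>Price: $price USD</p>")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO products VALUES (1, 'Widget', 9.5)")


def render_product(product_id: int) -> str:
    """Select one product with SQL and fill the template's placeholders with it."""
    row = conn.execute(
        "SELECT name, price FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is None:
        return "<h1>Not found</h1>"
    name, price = row
    return PAGE_TEMPLATE.substitute(name=escape(name), price=f"{price:.2f}")


print(render_product(1))
```

Changing the template changes every generated page, while changing a row in the table changes only the content.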
Web-Database Connectivity

Key Advantages of Web-Database Connectivity
• Database information is universally available. - Access
requires a Web browser and Internet connection.
• Web site design and development are easier and faster.
• Web site maintenance is easier and more efficient. - A change
to the template updates all pages.
• Adjustments can be made by any authorized person rather
than a Web developer.
• Web site can display updated calculations and multimedia
information.
Web-Database Connectivity

Getting Started: Connecting a database to the Web
• You need middleware: Software that acts as an intermediary
between a Web server and a database (e.g., ColdFusion, Java).
– Middleware accesses most databases by using ODBC.
• ODBC (Open Database Connectivity): A set of standards
allowing information to be passed from a database to a
dynamic Web page.
• The use of a query language: The language that the database
can understand.
– Example: SQL (Structured Query Language) supported by most
database software.
– Query Wizard: Like other wizards, this tool helps frame
requests for retrieving specific data from a database.
• And, an appropriate database program.
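
The role of middleware can be sketched as a small function that sits between a Web server and a database: it reads the request, issues an SQL query, and returns a dynamically built page. The sketch below uses Python's WSGI calling convention and talks to SQLite directly through the built-in `sqlite3` module instead of ODBC, purely to stay self-contained. The `members` table and the `city` request parameter are assumptions for illustration.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs


def make_app(conn):
    """Build a WSGI application that answers requests from the given database."""
    def app(environ, start_response):
        # Read the ?city=... parameter from the request.
        params = parse_qs(environ.get("QUERY_STRING", ""))
        city = params.get("city", [""])[0]
        # Ask the database in the language it understands (SQL).
        rows = conn.execute(
            "SELECT name FROM members WHERE city = ? ORDER BY name", (city,)
        ).fetchall()
        # Turn the result into a dynamically generated page.
        items = "".join(f"<li>{escape(name)}</li>" for (name,) in rows)
        body = f"<ul>{items}</ul>".encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]
    return app


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [("Ann", "Boston"), ("Bob", "Denver"), ("Cy", "Boston")],
)
app = make_app(conn)
```

To try it in a browser, `app` could be handed to `wsgiref.simple_server.make_server("", 8000, app)` from the standard library. Passing the city as a `?` parameter, rather than pasting it into the SQL text, keeps user input from being interpreted as part of the query.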
Web-Database Connectivity

Oracle WebDB: An Integrated Solution
• A complete, integrated software solution (high-end database
program) for building, loading and monitoring Web database
applications and content-driven Web sites.
– Can help:
• Create and manage database objects.
• Develop HTML components.
• Build and maintain content-driven Web sites.
• Track Web site and database connectivity
performance.
• Manage database security.