INTRODUCTION TO INFORMATION TECHNOLOGY
IS01
Dr. Anita Seth

Managing Organizational Data
• Today’s business enterprises cannot survive without quality data about their internal operations and external environment.
• Data can be anything: numbers, images, or other raw facts.
• Information is data that has been processed and converted into a meaningful and useful form.

Foundation Data Concepts
• Bit – smallest unit of data; a binary digit (0 or 1).
• Byte – group of bits that represents a single character.
• Character – single alphabetic, numeric, or other symbol.
• Field – group of related characters, e.g. a student’s name.

Foundation Data Concepts
• Record – logical grouping of related fields.
• File – group of related records.
• Entity – person, place, thing, or event about which information is maintained.
• Attribute – characteristic that describes a particular entity.

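The hierarchy from bit up to file can be sketched in a few lines of Python. This is only an illustration: the student names, ids, and field names are hypothetical and do not come from the slides.

```python
# Hypothetical student data, used only to illustrate the data hierarchy.
character = "A"                        # a single symbol
byte = character.encode("ascii")       # one byte represents that character
bits = format(byte[0], "08b")          # the eight bits stored in the byte

# A field is a group of related characters; a record groups related fields.
record = {"student_id": 101, "name": "Asha Rao", "course": "IS01"}

# A file is a group of related records about the same kind of entity.
student_file = [
    record,
    {"student_id": 102, "name": "Ravi Kumar", "course": "IS01"},
]

# Each record describes one entity (a student); each key names an attribute.
attributes = list(record.keys())

print(bits)        # 01000001
print(attributes)  # ['student_id', 'name', 'course']
```
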
Foundation Data Concepts
• Key field – identifier field used to retrieve, update, or sort a record.
• Primary key – key field that uniquely identifies a record so that the record can be retrieved and updated.
• Foreign key – field in one file that holds the primary key of a record in another file, linking the two files.

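A minimal sketch of primary and foreign keys, using Python's built-in sqlite3 module as a stand-in for a DBMS. The table names, column names, and rows are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# course_id is the primary key of course and a foreign key in student.
conn.execute("CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " course_id TEXT REFERENCES course(course_id))"
)
conn.execute("INSERT INTO course VALUES ('IS01', 'Introduction to Information Technology')")
conn.execute("INSERT INTO student VALUES (101, 'Asha Rao', 'IS01')")

# A primary key must be unique, so a second record with id 101 is rejected.
try:
    conn.execute("INSERT INTO student VALUES (101, 'Duplicate', 'IS01')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A foreign key must match an existing primary key in the other table.
try:
    conn.execute("INSERT INTO student VALUES (102, 'Ravi Kumar', 'XX99')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(duplicate_rejected, orphan_rejected)
```
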
Data Access Methods
• Sequential access – data records are retrieved in the same physical sequence in which they are stored, e.g. magnetic tape.
• Direct access – records can be retrieved in any sequence, e.g. floppy disk.
- One approach employs a transform algorithm to translate the key field into the record’s storage location on disk.
• Indexed sequential access – uses an index of key fields to locate the physical address of a record.

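The difference between reading in stored order and going straight to a record can be sketched as follows. The list stands in for a storage medium and the dictionary stands in for whatever mechanism maps a key to a storage location; both, and the sample records, are assumptions for the illustration.

```python
# Records in their physical storage order (hypothetical data).
records = [
    {"key": 30, "name": "C"},
    {"key": 10, "name": "A"},
    {"key": 20, "name": "B"},
]

def sequential_read(target_key):
    """Read records in stored order until the key is found; return (record, reads)."""
    for reads, rec in enumerate(records, start=1):
        if rec["key"] == target_key:
            return rec, reads
    return None, len(records)

# Key-to-location table: maps each key field to the record's position.
position_of = {rec["key"]: pos for pos, rec in enumerate(records)}

def direct_read(target_key):
    """Go straight to the stored position without reading earlier records."""
    pos = position_of.get(target_key)
    return None if pos is None else records[pos]

found, reads = sequential_read(20)
print(found["name"], reads)       # B 3
print(direct_read(20)["name"])    # B
```
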
Types of Data Processing
Batch processing
Changes to the data file are accumulated and stored, and processing is done periodically, e.g. generation of students’ mark sheets.
Online processing
Transactions are entered directly into the computer and processed immediately.
- In real-time applications, data is captured and processed as the event occurs, e.g. an airline reservation system.

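The two processing styles can be contrasted with a small sketch. The account balances and transaction format are hypothetical; a real system would read and write files or a database rather than an in-memory dictionary.

```python
balances = {"A1": 100, "B2": 50}   # hypothetical data file
pending = []                        # transactions waiting for the next batch run

def apply_transaction(txn):
    """Update the data file with one (account, amount) transaction."""
    account, amount = txn
    balances[account] += amount

def process_online(txn):
    """Online processing: the transaction is applied as soon as it is entered."""
    apply_transaction(txn)

def queue_for_batch(txn):
    """Batch processing, step 1: accumulate the change without applying it."""
    pending.append(txn)

def run_batch():
    """Batch processing, step 2: apply all accumulated changes in one periodic run."""
    processed = len(pending)
    for txn in pending:
        apply_transaction(txn)
    pending.clear()
    return processed

process_online(("A1", 25))        # balance changes immediately
queue_for_batch(("B2", -20))      # balance is unchanged until the batch runs
before_batch = balances["B2"]
processed = run_batch()
print(balances, before_batch, processed)
```
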
Traditional File Processing
• Data are organized and stored in independent files, each organized in a different way.
• Each file is organized to be used by a different application program.
• It is difficult to get the required information when it is spread across several files.

Problems of File Processing
• Data redundancy – independent data files include a lot of duplicated data, and every copy has to be updated.
• Data inconsistency – the various copies of the data may not agree.
• Lack of data integrity – data values may not be accurate across multiple data files.
• Lack of data security – new applications may be added to the system on an ad hoc basis, so more people access the data.

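Redundancy leading to inconsistency can be shown with two independent files that hold the same student's address. The file names and contents are hypothetical.

```python
# Two independent "files" kept by different applications (hypothetical data).
billing_file = {101: {"name": "Asha Rao", "address": "12 Lake Road"}}
library_file = {101: {"name": "Asha Rao", "address": "12 Lake Road"}}

# The billing application records a change of address; the library one does not.
billing_file[101]["address"] = "7 Hill Street"

def inconsistent_ids(file_a, file_b, field):
    """Return the ids whose value for `field` differs between the two files."""
    return [k for k in file_a if k in file_b and file_a[k][field] != file_b[k][field]]

print(inconsistent_ids(billing_file, library_file, "address"))  # [101]
```
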
Database: Modern Approach
• Logically organized collection of similar or related data.
• Serves as a base from which the desired information can be retrieved and further processing or reorganizing can be done.
• Eliminates problems associated with the traditional file approach.

Types of Databases
• Operational – contains the data that support the business processes and operations of a company, e.g. a customer database.
• Centralized database
- All the related files are in one physical location.
- When the centralized database computer fails, all users are affected.

Types of Databases
• Distributed – copies of the database, complete or partial, are kept in more than one physical location.
- Two types: replicated and partitioned.
- A replicated database has a complete copy of the entire database in many locations; keeping the copies in step creates a lot of overhead.
- In a partitioned database, the data is subdivided so that each location holds a portion; data can be entered quickly, but widespread access to sensitive company data increases security problems.

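A rough sketch of the two distribution styles. The site names, customer data, and the rule used to assign records to sites are all assumptions chosen for the illustration.

```python
customers = {1: "Asha", 2: "Ravi", 3: "Meera", 4: "John"}   # hypothetical data
sites = ["delhi", "mumbai"]                                  # hypothetical locations

# Replicated: every site holds a complete copy of the database.
replicated = {site: dict(customers) for site in sites}

# Partitioned: each site holds only a portion (here split by key, an assumed rule).
partitioned = {site: {} for site in sites}
for key, name in customers.items():
    partitioned[sites[key % len(sites)]][key] = name

def update_replicated(key, name):
    """An update must be written to every copy; returns the number of writes."""
    for copy in replicated.values():
        copy[key] = name
    return len(replicated)

def update_partitioned(key, name):
    """An update touches only the site that owns the record; returns the number of writes."""
    partitioned[sites[key % len(sites)]][key] = name
    return 1

print(update_replicated(1, "Asha R."), update_partitioned(1, "Asha R."))  # 2 1
```

The extra writes in the replicated case are the overhead the slide refers to.
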
Database Management System
• A collection of programs that enables users to store, modify, and extract information from a database.
• A few examples:
- computerized library
- flight reservation system
- computerized inventory system

Data Abstraction
Process of distilling the data down to its essentials.
• Physical view specifies how the data is actually stored.
• Logical view describes what relationships exist among the various data.

Database Structures
• Hierarchical – relationships between records form a hierarchy or treelike structure, characterized by one-to-many relationships.
• Network – data can be accessed by one of several paths because any data element or record can be related to any number of other data elements.
- Depicts data logically as many-to-many relationships.

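The two structures can be contrasted with plain Python dictionaries. The department, employee, and project names are hypothetical, and a real hierarchical or network DBMS stores these links very differently.

```python
# Hierarchical: each child record has exactly one parent (one-to-many).
parent_of = {
    "Sales": "Company",
    "Finance": "Company",
    "Asha": "Sales",
    "Ravi": "Finance",
}

def path_to_root(record):
    """Follow the single parent link upward; there is only one access path."""
    path = [record]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

# Network: a record can be related to any number of others (many-to-many).
works_on = {
    "Asha": {"Project X", "Project Y"},
    "Ravi": {"Project Y"},
}

def members_of(project):
    """Reach the same data from the project side instead of the employee side."""
    return sorted(emp for emp, projects in works_on.items() if project in projects)

print(path_to_root("Asha"))     # ['Asha', 'Sales', 'Company']
print(members_of("Project Y"))  # ['Asha', 'Ravi']
```
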
Hierarchical and Network DBMS
Disadvantages
• Time-consuming and difficult to install.
• Less flexible than an RDBMS.
• Lack support for ad hoc and English-like queries.

Relational Database Structure
• All data elements within the database are viewed as being stored in two-dimensional tables called relations.
• Relates data across tables based on a common data element.
• Examples: DB2, Oracle, MS SQL Server.

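Relating two tables through a common data element can be sketched with Python's built-in sqlite3 module. SQLite is used only because it ships with Python; the tables and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER);
INSERT INTO department VALUES (1, 'Sales'), (2, 'Finance');
INSERT INTO employee VALUES (10, 'Asha', 1), (11, 'Ravi', 2), (12, 'Meera', 1);
""")

# dept_id is the common data element that relates the two tables.
rows = conn.execute(
    "SELECT e.emp_name, d.dept_name "
    "FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id "
    "ORDER BY e.emp_id"
).fetchall()

print(rows)  # [('Asha', 'Sales'), ('Ravi', 'Finance'), ('Meera', 'Sales')]
```
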
Object-Oriented Database Structure
• Multi-dimensional database structure.
• Can accommodate more complex data types, including graphics, pictures, voice, and text.
• Inheritance – new objects can be created automatically by replicating some or all of the characteristics of one or more existing objects.

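Inheritance can be illustrated with ordinary Python classes. The class names and attributes are hypothetical; an object-oriented DBMS would additionally store such objects persistently.

```python
class MediaItem:
    """Existing object type whose characteristics new types can inherit."""

    def __init__(self, title):
        self.title = title

    def describe(self):
        return f"{type(self).__name__}: {self.title}"


class Picture(MediaItem):
    """Inherits title and describe(), and adds image-specific attributes."""

    def __init__(self, title, width, height):
        super().__init__(title)
        self.width = width
        self.height = height


class VoiceClip(MediaItem):
    """Inherits title and describe(), and adds an audio-specific attribute."""

    def __init__(self, title, seconds):
        super().__init__(title)
        self.seconds = seconds


items = [Picture("Campus map", 800, 600), VoiceClip("Welcome message", 12)]
descriptions = [item.describe() for item in items]
print(descriptions)  # ['Picture: Campus map', 'VoiceClip: Welcome message']
```
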
Evaluation of Database Structures
• The hierarchical data structure is best for structured, routine types of transaction processing.
• The network data structure is best when many-to-many relationships are needed.
• The relational data structure is best when ad hoc reporting is required.

Database Management Approach
• Consolidates data records into one database that can be accessed by many different application programs.
• A DBMS serves as the software interface between users and databases.
• Data definitions are stored once, separately from application programs.

Database Interrogation
Capability of a DBMS to report information from the database in response to end users’ requests.
• Query language – allows easy, immediate responses to ad hoc data requests.
• Report generator – allows quick, easy specification of a report format for information users have requested.

Database Language
Used to create or manipulate a database.
• Data definition language (DDL)
- Defines the types of information in the database and how they will be structured.
- Provides the link between the logical and physical views of the database.
- Defines the physical characteristics of each record: the fields within a record and each field’s logical name, data type, and character length.

Database Language
• Data manipulation language (DML)
- Used to query, retrieve, store, update, delete, or display the contents of the database.
- Query languages such as SQL (Structured Query Language) are an important component of a DBMS.
- SQL combines both DML and DDL features.
- Can perform complicated searches with simple statements.

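The split between DDL and DML can be seen in a short sketch that runs SQL through Python's built-in sqlite3 module. The book table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure (field names, data types, lengths, defaults).
conn.execute(
    "CREATE TABLE book ("
    " book_id INTEGER PRIMARY KEY,"
    " title VARCHAR(80) NOT NULL,"
    " copies INTEGER DEFAULT 1)"
)

# DML: store, update, and delete the contents.
conn.execute("INSERT INTO book (book_id, title) VALUES (1, 'Database Basics')")
conn.execute("INSERT INTO book (book_id, title, copies) VALUES (2, 'Networks', 3)")
conn.execute("UPDATE book SET copies = copies + 1 WHERE book_id = 1")
conn.execute("DELETE FROM book WHERE book_id = 2")

# DML: retrieve what is left.
rows = conn.execute("SELECT book_id, title, copies FROM book").fetchall()
print(rows)  # [(1, 'Database Basics', 2)]
```
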
Structured Query Language
• Uses keywords such as
SELECT (specifies the desired attributes)
FROM (specifies the table to be used)
WHERE (specifies the conditions to apply)
• Example: to find, from a university database, all students graduating with honors and belonging to the general category, the SQL statement would be
SELECT student_name FROM student WHERE category = 'G' AND grade_point_average >= 5

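The honors query can be tried out with Python's built-in sqlite3 module. The table name, column names, and sample students are assumptions; the slide gives only the shape of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_name TEXT, category TEXT, grade_point_average REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Asha", "G", 6.2), ("Ravi", "G", 4.1), ("Meera", "S", 7.0)],
)

# SELECT names the attribute, FROM names the table, WHERE gives the conditions.
honors = conn.execute(
    "SELECT student_name FROM student "
    "WHERE category = ? AND grade_point_average >= ?",
    ("G", 5),
).fetchall()

print(honors)  # [('Asha',)]
```
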
Data Dictionary
• In a relational database, information is organized and accessed according to its logical structure.
• When a relational database is created, a data dictionary is prepared.
• The data dictionary contains the logical properties of field values, e.g.
- Field name
- Type (alphabetic, numeric, etc.)
- Default value, etc.

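Many DBMS products expose their data dictionary for querying. As one concrete case, SQLite reports field names, types, and default values through `PRAGMA table_info`; the student table below is hypothetical, and other products use different catalog commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " category TEXT DEFAULT 'G')"
)

# PRAGMA table_info returns one row per field:
# (cid, name, type, notnull, dflt_value, pk)
dictionary = [
    {"field": name, "type": col_type, "default": default}
    for _cid, name, col_type, _notnull, default, _pk
    in conn.execute("PRAGMA table_info(student)")
]

for entry in dictionary:
    print(entry)
```
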
On-line Transaction Processing (OLTP)
• Involves gathering information, processing it, and updating the stored data.
• DBMS and databases support OLTP.

On-line Analytical Processing (OLAP)
• Multidimensional data analysis.
• Supports manipulation and analysis of large volumes of data from multiple dimensions/perspectives.

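Looking at the same data from several dimensions can be sketched in plain Python. The sales facts and the three dimensions (region, product, quarter) are hypothetical, and real OLAP tools work on far larger volumes with dedicated storage.

```python
from collections import defaultdict

# Each fact: (region, product, quarter, sales amount). Hypothetical figures.
sales = [
    ("North", "Books", "Q1", 120),
    ("North", "Music", "Q1", 80),
    ("South", "Books", "Q1", 60),
    ("North", "Books", "Q2", 150),
    ("South", "Music", "Q2", 90),
]
DIMENSIONS = {"region": 0, "product": 1, "quarter": 2}

def roll_up(dimension):
    """Total the sales measure along one chosen dimension."""
    totals = defaultdict(int)
    idx = DIMENSIONS[dimension]
    for fact in sales:
        totals[fact[idx]] += fact[3]
    return dict(totals)

print(roll_up("region"))   # {'North': 350, 'South': 150}
print(roll_up("product"))  # {'Books': 330, 'Music': 170}
print(roll_up("quarter"))  # {'Q1': 260, 'Q2': 240}
```
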
Data Warehouse
• Large database that stores data that have been extracted from the various operational, external, and other databases of an organization.
• Supports reporting and query tools.
• Stores current and historical data.
• Consolidates data for management analysis and decision making.

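Extracting data from several sources into one consolidated store can be sketched as below. The two source "databases", their field names, and the figures are hypothetical; the point is only that differently shaped sources end up in one common layout that holds both current and historical rows.

```python
from datetime import date

# Two operational sources with different layouts (hypothetical data).
orders_db = [{"cust": "C1", "amount": 250, "day": date(2024, 1, 5)}]
web_shop_db = [{"customer_id": "C2", "total": 90, "date": date(2023, 12, 30)}]

warehouse = []

def load(rows, source, cust_key, amount_key, date_key):
    """Extract rows from one source, convert them to a common layout, and store them."""
    for row in rows:
        warehouse.append({
            "source": source,
            "customer": row[cust_key],
            "amount": row[amount_key],
            "date": row[date_key],
        })

load(orders_db, "orders", "cust", "amount", "day")
load(web_shop_db, "web_shop", "customer_id", "total", "date")

# A simple management query over the consolidated data.
total_by_year = {}
for fact in warehouse:
    year = fact["date"].year
    total_by_year[year] = total_by_year.get(year, 0) + fact["amount"]

print(total_by_year)  # {2024: 250, 2023: 90}
```
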
Meta Data
• Data about data.
• Describes what data is available, what its sources are, where it is, and how to access it.
• Technical metadata – where the data comes from, how it was changed, how it is organized, how it is stored, who owns it, etc.
• Business metadata – what data is available, how to access it, how current it is, and what it means.

Data Warehouse System
Data Mart
• Scaled-down version of a data warehouse that holds a subset of the data from a data warehouse.
• Focuses on a specific aspect of a company, such as a department or a business process.

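In the simplest terms, a data mart is a filtered copy of warehouse data. A minimal sketch, assuming hypothetical warehouse rows tagged by department:

```python
# Hypothetical warehouse rows covering several departments.
warehouse = [
    {"department": "Sales", "measure": "revenue", "value": 500},
    {"department": "Sales", "measure": "orders", "value": 42},
    {"department": "HR", "measure": "headcount", "value": 17},
]

def build_data_mart(rows, department):
    """Copy only the rows relevant to one department out of the warehouse."""
    return [dict(row) for row in rows if row["department"] == department]

sales_mart = build_data_mart(warehouse, "Sales")
print(len(sales_mart))  # 2
```
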
Data Warehouse & Data Marts

Data Mining
•
Analyzing the data in a data warehouse to
reveal hidden patterns and trends.
•
Data mining tools include sophisticated,
automated algorithms to identify hidden
patterns, correlations and relationships.
Data Mining
• Predict trends and behavior to make proactive decisions.
• E.g. forecasting bankruptcy; detecting fraudulent credit card transactions; discovering patterns in retail sales data for products that are often purchased together.

Data Mining Uses
• Perform “market-basket analysis” to identify new product bundles.
• Find root causes of quality or manufacturing problems.
• Prevent customer attrition and acquire new customers.
• Profile customers with more accuracy.

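A very small version of market-basket analysis counts how often pairs of products appear in the same purchase. The baskets below are hypothetical, and commercial data mining tools use more sophisticated algorithms than simple pair counting.

```python
from collections import Counter
from itertools import combinations

# Each set is one customer's purchase (hypothetical data).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "tea"},
    {"bread", "butter", "tea"},
]

# Count every pair of products that occurs together in a basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)  # ('bread', 'butter') 3
```

A pair that turns up far more often than the others, as bread and butter do here, is a candidate for a product bundle.
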
Database Schema
• Graphical representation of the whole database.
• A database system may have different schemas:
- Physical schema: describes the database design at the physical level.
- Logical schema: describes the database design at the logical level.

Case #1: Database Business Value
• Successful sellers of books, music, and other entertainment on the Internet owe their success to the company Muze.
• Muze aggregates and classifies millions of products from thousands of publishers.
• Muze stores this massive amount of information in a relational database and licenses its database at a fraction of what it would cost sellers to compile their own information.

Case #1: Database Business Value
• Information provided by Muze enables retail customers to get in-depth information about books, CDs, and videotapes without having the product in hand.
• Muze also provides classification data that helps retailers’ search engines to operate more efficiently.

Important Considerations
• Data warehouses and data mining tools are expensive.
• Organizations need to devote considerable time to creating a data warehouse.
• Training to use data mining tools is also expensive.
• Some organizations may not need a data warehouse; the information needed to support decision making can be obtained from operational databases.

Summary
• Managing organizational data requires IT and software tools.
• The database management approach consolidates data needed by different applications.
• DBMS are software packages that simplify the creation, use, and maintenance of databases.
• Several types of databases are used by business organizations, including operational, distributed, and external databases.

Summary
•
Data warehouses are a central source of data
from other databases that have been
transformed and cataloged for business
analysis and decision support applications.