Download Data Warehouse

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Nonlinear dimensionality reduction wikipedia , lookup

Transcript
Data Warehouse- A Concept
UPENDRA SINGH RATHORE
DBA, SSG VLC,
O/o the Accountant General(A&E),
Rajasthan, Jaipur
Email:[email protected]
Data warehouse is a concept I am trying to put it before you as I thought one
day we would use this concept because our data is increasing day by day and any
organization may be Government/Non Government can use this data for decisionmaking.
The ‘data warehouse in the client/server environment is the repository of
data for decision support system.’
W.H. Inman
In todays computing environment we expect that data is to be doubled at
least once a year because modernization/computerization creates data. Basically OLTP
(Online Transaction Processing) creating all this data and in Our VLC system is a mix of
OLTP and Batch processing.
The client/server application falls into two categories. Decision support
system (DSS) and Online Transaction Processing (OLTP). These two client/server
categories serve different type of Business solution.
In general all the data collected by an OLTP system is of direct use to the
application. The people creating this data, understands and know how to use it to solve
their immediate day-to-day problem
But what happen if somebody outside the OLTP group wants this data?
How do they know what data is available? How do they access? What about security?
Still there are some more questions to see. Outsiders are the people those who need data
for analysis that can help them make better decision. Business people making strategic
decisions – pricing, market analysis that depends on availability of timely and accurate
data. If they can’t get to our data and we can’t get to theirs, then everybody loses.
DSS(Decision Support System)
Decision support system is used to analyze data and create reports. A
successful DSS must provide the user with flexible access to data and the tools to
manipulate and present that data in all kinds of reports formats. DSS are not generally
time critical and can tolerate slower response time.
Data Warehouse – A Concept
(1)
The Data Warehouse
Bill Inman is father of data warehouse concept introduced in 1981. He
writes that “ one cornerstone of client/server applications is the notion of the difference
between and separation of operational and decision support processing.”
Bill Inman defines a warehouse as a separate database for decision support,
which typically contains vast amount of information. Richard Hackthorn defines a
warehouse as “ a collection of data objects that have been intentioned for distribution to a
business community.” Data warehouse gather data from multiple sources under a unified
schema at a single site. In general warehouse is an intelligent store of data that can
manage and aggregate information from many sources, distribute it as and when
necessary.
Elements of data warehousing
1
The Data replication manager
Manages the copying and distribution of data across databases as defined by
the information users. The users defines the data that needs to be copied, the source and
destination platforms. Update and data transforms. Refresh involves copying over the
entire data source; update only generates the changes.
2.
The informational database
Is Database that organizes and stores copies of data from multiple data
sources? We can assume a decision support server that transform, aggregates and add
values to data from various sources. It also stores metadata, System level and semantic
level metadata.
3.
The information directory
It is an amalgam of functions of a technical directory business directory and
information navigator. Its main function is to help the information users to find out
what data is available on the different databases. What format it is in and how to
access it. It also helps the DBAs to manage the data warehouse. The information
directory gets its metadata by discovering which databases are on the network and the
querying their metadata repositories. DBA use the information directory to access
system level metadata keep track of data sources, data targets, cleanup rules,
transformation rules and details about predefined rules and reports.
Data Warehouse – A Concept
(2)
4.
Dos tool support
Is provided via SQL most vendors support ODBC and some other
protocol. In summary DBA must be able to assemble data from different sources,
replicate it, clean it , store it, catalog it and the make it available to DSS tools. Data
mining is one of them that refer loosely to finding relevant information from a large
volume of data. Data mining attempts to discover pre defined/user defined rules & pattern
automatically from data.
Architecture of a typical data warehouse
Data source 1
Query
and
analysis
tools
Data
loaders
Data source 2
DBMS
Data source 3
Though it seems easy to implement but there is the issues to be addressed
in building a warehouse are the following:a)
When and how to gather data for storing data.
The data transmit information either daily or periodically etc for new data
to sources.
b)
What schema to use
Data sources may likely have different schemas. The task of a warehouse
is to perform schema integration and to convert data to the integrated schema.
c)
How to propagate updates
Though updates at the source propagates to destination visually not a
problem for DSS but we have to keep in mind when building a warehouse.
Data Warehouse – A Concept
(3)
d)
What to summarize
As we know raw data may be too large to store for a warehouse however
we can answer many queries by managing just summary data obtained by aggregation on
a relation, rather them maintaining the entire relation.
Last words -though the topic is very vast and it is quite impossible to
summarize in one short topic. If you have some comments about this topic please write in
thread/Paper so the debate may be continue.
Upendra Singh Rathore
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Data Warehouse – A Concept
(4)