Proceedings of the 10th WSEAS Int. Conference on MATHEMATICS and COMPUTERS in BUSINESS and ECONOMICS
A Multidimensional Data Model and OLAP Analysis
for Agricultural Production
CONSTANTA ZOIE RADULESCU
National Institute for Research and Development in Informatics,
8-10 Averescu Avenue, 011455, Bucharest 1, ROMANIA,
[email protected]
MARIUS RADULESCU
Institute of Mathematical Statistics and Applied Mathematics, Casa Academiei Române,
Calea 13 Septembrie nr.13, Bucharest 5, RO-050711, ROMANIA,
[email protected]
ADRIAN TUREK RAHOVEANU,
Research Institute for Agricultural Economics and Rural Development - ICEADR, Bd.
Măraşti nr. 61, Bucharest, ROMANIA,
[email protected]
Abstract: This paper builds a multidimensional data model called CULTEH, in which the
dimensions, the hierarchies and the facts are defined. Based on this model an OLAP cube called
CUBECTH is built. The OLAP cube accepts queries on several dimensions and hierarchies.
OLAP operations are used to analyze some economic features of agricultural production
over a period of 5 years and 12 levels of agricultural production. The economic features
include cropping systems, crops, types of farmers and fertilizer consumption.
Key-Words: On-Line Analytical Processing (OLAP), multidimensional data model, OLAP operations, data cube,
agricultural production
1. Introduction
The multidimensional database is a new database concept dedicated to meeting the demands of decision support systems. To understand what the data is really saying, managers usually need to investigate data from different perspectives and change the navigation according to previous observations. For this purpose, data from various operational sources are reconciled and stored in a repository database using a multidimensional data model [1]. Data warehouses and multidimensional data analysis both use multidimensional data models.
These multidimensional data models allow analysts to navigate easily in data structures and to understand and exploit all the data. They improve the analysts' capacity to visualize abstract queries.
Multidimensional modeling is a conceptual modeling technique used by OLAP applications. Statistical, geographical and temporal databases are strongly connected to multidimensional data modeling.
On-Line Analytical Processing (OLAP) is a trend in database technology, based on the multidimensional view of data, and it is an indispensable component of the so-called business intelligence technology [1]-[3], [10][13]. Technologies based on data warehouses and OLAP allow the rapid analysis and sharing of data and information for critical decision activities. In recent years a great number of commercial products based on OLAP technologies have been used, mainly for the analysis of business activities. However, there is a recent trend toward using these technologies in other domains such as industry [8], agriculture [4]-[7], [14]-[16], environmental protection [7]-[9] and transport.
At present the management of agricultural production is a very important problem for the development of a sustainable economy. Agricultural databases contain a great number of agricultural parameters: data on agricultural inputs such as seeds and fertilizers, statistics on the consumption of fertilizers, area, yield of crops, market prices, land quality, livestock statistics, land-use pattern statistics etc. Monitoring all these parameters is a very difficult task. Consequently the management of agricultural production raises many problems of increasing complexity and diversity that cannot be solved without computerized tools.
Business Intelligence technologies such as OLAP, data warehousing, data mining and decision support tools have proven very useful for the management of agricultural production, see [ ]. For example, the DSSAT4 package based on Business Intelligence technologies, developed through financial support of USAID during the 80's and 90's, has allowed rapid assessment of several agricultural production systems around the world to facilitate decision-making at the farm and policy levels. There are, however, many constraints to the successful adoption of DSS in agriculture.
The management of information requires that data should be shared and globally accessible by all the heterogeneous products found in today's information technology environment. Current day OLAP tools are suitable for this task since they assume the availability of the data in a centralized data warehouse. However, the inherently distributed nature of data collection and the huge amount of data extracted at each collection point make it impractical to gather all data at a centralized site. One solution is to maintain a distributed data warehouse, consisting of local data warehouses at each collection point and a coordinator site, with most of the processing being performed at the local sites.
The paper presents an approach based on OLAP technology for solving some decision problems about agricultural production. A multidimensional data model, called CULTEH, is constructed in section 2. Based on this model a data cube called CUBECTH is built. OLAP operations are used in CUBECTH to analyze some economic features of agricultural production over a period of 5 years and 12 levels of agricultural production.
2. A multidimensional data model
The multidimensional data model allows the user to visualize data in multiple dimensions. It is defined by tables called dimensions and facts. “Dimensions” contain the description of data that gives a meaning to the numbers contained in the table “Facts”. Usually “Dimensions” contain alphanumerical values. “Facts” contain numerical values that the user wants to analyze. The tables “Dimensions” and “Facts” are connected to each other by structures such as the star schema, the snowflake schema or the fact constellation.
In the present section we build a multidimensional data model called CULTEH.
The data used to build our multidimensional data model were provided by several databases from the Research Institute for Agricultural Economics and Rural Development - ICEADR. The multidimensional data model has five dimensions: Crops, Production level, Producer, Cropping systems and Time.
The dimension “Crops” contains the agricultural crops. For the CULTEH model 6 agricultural crops were considered: wheat, corn, barley, sunflower, rapeseed and soybean. For the dimension “Production level” 12 levels of production were considered. The dimension “Producer” refers to the type of producer. For the CULTEH model 3 types of producers were considered: individual households, family associations, and commercial societies. The dimension “Cropping systems” refers to the types of cropping systems. For the CULTEH model the irrigated type and the non-irrigated type were considered. The dimension “Time” contains a period measured in years for which data on agricultural production exist.
In our model “Facts” are represented by the economic features of the agricultural production. These are presented in Table 1.

No. Crt.  Economic feature                              Units of measurement
1         Main production                               Kg
2         Total production costs                        Lei
3         Production cost                               Lei/Kg
4         Market price                                  Lei/Kg
5         Subsidies                                     Lei
6         Return rate in the presence of subsidies      %
7         N fertilizer dose                             Kg active subst.
8         P fertilizer dose                             Kg active subst.
9         K fertilizer dose                             Kg active subst.
Table 1. Economic features of the agricultural production considered in the CULTEH model.

For these features the considered measures are: min, max, avg.
More details regarding the multidimensional data model and OLAP analysis can be found in [5]. The corresponding structure of our multidimensional data model is a star schema (see figure 1). The tables “Dimensions” and “Facts” are linked through link-codes shared by the corresponding tables of the star schema. The cube that implements the multidimensional data model is called CUBECTH. Data from this cube can be analyzed using OLAP operations.
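As a rough illustration of the size of the model described in this section, the number of cells in the cube follows from the sizes of the five dimensions. The sketch below assumes that the dimension “Time” covers the 5-year period mentioned in the abstract; the variable names are illustrative and are not taken from the CULTEH implementation.

```python
from math import prod

# Dimension sizes as stated in the text. The size of "Time" is an assumption
# based on the 5-year period mentioned in the abstract.
dimension_sizes = {
    "Crops": 6,
    "Production level": 12,
    "Producer": 3,
    "Cropping systems": 2,
    "Time": 5,
}

# Upper bound on the number of cube cells, reached only if every
# combination of dimension members actually has data.
cell_count = prod(dimension_sizes.values())

# Nine economic features (Table 1), each with the measures min, max and avg.
measure_count = 9 * 3

print(cell_count)     # 2160
print(measure_count)  # 27
```

Under these assumptions the cube is small enough for every aggregate to be precomputed, which is consistent with the interactive style of analysis described in the next section.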
dbo.Facts: codec, codey, codecs, codep, coden, Production, Total production costs, Production cost, Market price, Subsides, Rate of return with subsidies, N dose, P dose, K dose
dbo.Crops: codec, cropname, description
dbo.Time: codey, type
dbo.Cropping System: codecs, type
dbo.Producer: codep, type
dbo.Level: coden, name
Figure 1. The star schema of the multidimensional data model CULTEH.
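The star schema of Figure 1 can be sketched with Python's built-in sqlite3 module. This is a minimal sketch and not the authors' implementation: the table and key names follow Figure 1, but the column types, the shortened measure names (production, n_dose) and all sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables: key names follow Figure 1, column types are assumed.
# The fact table holds one link-code per dimension plus numeric measures
# (only two of the nine measures are kept here for brevity).
cur.executescript("""
CREATE TABLE Crops            (codec  INTEGER PRIMARY KEY, cropname TEXT, description TEXT);
CREATE TABLE "Time"           (codey  INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE "CroppingSystem" (codecs INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE Producer         (codep  INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE "Level"          (coden  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Facts (
    codec  INTEGER REFERENCES Crops(codec),
    codey  INTEGER REFERENCES "Time"(codey),
    codecs INTEGER REFERENCES "CroppingSystem"(codecs),
    codep  INTEGER REFERENCES Producer(codep),
    coden  INTEGER REFERENCES "Level"(coden),
    production REAL,
    n_dose REAL
);
""")

# Hypothetical dimension members and fact rows (not data from the paper).
cur.executemany("INSERT INTO Crops VALUES (?, ?, ?)",
                [(1, "wheat", ""), (2, "corn", "")])
cur.executemany("INSERT INTO Producer VALUES (?, ?)",
                [(1, "individual households"), (2, "commercial societies")])
cur.execute('INSERT INTO "Time" VALUES (1, ?)', ("year",))
cur.execute('INSERT INTO "CroppingSystem" VALUES (1, ?)', ("irrigated",))
cur.execute('INSERT INTO "Level" VALUES (1, ?)', ("level 1",))
cur.executemany("INSERT INTO Facts VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 1, 1, 3000.0, 80.0),
    (1, 1, 1, 2, 1, 4200.0, 120.0),
    (2, 1, 1, 2, 1, 5100.0, 150.0),
])

# Join the fact table to a dimension through its link-code and compute
# the measures min, max and avg of production for each crop.
rows = cur.execute("""
SELECT c.cropname, MIN(f.production), MAX(f.production), AVG(f.production)
FROM Facts AS f JOIN Crops AS c ON c.codec = f.codec
GROUP BY c.cropname
ORDER BY c.cropname
""").fetchall()
print(rows)
```

Each dimension is joined to the fact table only through its link-code, which is what distinguishes a star schema from a snowflake schema, where dimension tables are further normalized.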
3. OLAP analysis of data cube CUBECTH
Typical operations for the analysis of a data cube are roll-up, drill-down, slice and dice, and pivoting. With these operations one can obtain immediate answers to queries by navigating dynamically in the multidimensional structure and working at various hierarchical levels (summary or detail). The roll-up operation summarizes data. The summary is obtained either by moving from a lower level to a higher level in the hierarchy of a dimension or by reducing the number of dimensions. The drill-down operation is the inverse of the roll-up operation. It moves from a higher level of summary to a lower level. The level of detail can also be increased by adding new dimensions. Slice and dice operations for cubes involve:
• The selection of a partition for each dimension of a multidimensional data model (this is realized by queries with the “group by” clause)
• Dicing from a special partition along one or several dimensions (corresponding to the “where” clause)
The pivoting operation reorients the data cube (3D) for visualization in 2D planes.
Several examples follow that illustrate the application of OLAP operations from Analysis Manager to the cube CUBECTH.
The first example deals with the application of the drill-down operation for all crops and all types of producers. This example is illustrated in figure 2.
The second example illustrates the application of the drill-down operation over the dimensions “Time” and “Crops” and the application of the slice and dice operation for the producer “Commercial Society”. This example is illustrated in figure 3.
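The OLAP operations described in this section can be illustrated on a toy data set. The sketch below is not the Analysis Manager workflow used for CUBECTH; it is plain Python over a handful of made-up fact records, and the names FACTS, DIMS and aggregate are hypothetical.

```python
from collections import defaultdict

# Made-up fact records: (year, crop, producer, production in Kg).
FACTS = [
    (1, "wheat", "family associations", 100.0),
    (1, "wheat", "commercial societies", 300.0),
    (1, "corn", "commercial societies", 200.0),
    (2, "wheat", "commercial societies", 400.0),
    (2, "corn", "family associations", 150.0),
]
DIMS = {"year": 0, "crop": 1, "producer": 2}


def aggregate(facts, group_by):
    """Sum production grouped by the given dimensions (a "group by" clause)."""
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[DIMS[d]] for d in group_by)
        totals[key] += row[3]
    return dict(totals)


# Drill-down: add the producer dimension to see more detail per crop.
by_crop_producer = aggregate(FACTS, ["crop", "producer"])

# Roll-up: drop the producer dimension to get a higher-level summary.
by_crop = aggregate(FACTS, ["crop"])

# Slice: fix one member of one dimension (a "where" clause).
commercial = [r for r in FACTS if r[2] == "commercial societies"]

# Dice: restrict a further dimension on top of the slice.
diced = [r for r in commercial if r[0] == 1]

# Pivot: present the slice as a 2D view, crops as rows and years as columns.
by_crop_year = aggregate(commercial, ["crop", "year"])
years = sorted({r[0] for r in commercial})
pivot = {
    crop: [by_crop_year.get((crop, y), 0.0) for y in years]
    for crop in sorted({r[1] for r in commercial})
}

print(by_crop)
print(pivot)
```

In this toy setting roll-up and drill-down differ only in the list of grouping dimensions, while slice and dice differ only in how many dimensions are filtered.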
4. Conclusions
The great majority of multidimensional data models are business oriented. The paper presents a
multidimensional data model that is oriented towards the management of agricultural production.
Based on the multidimensional data model we build an OLAP cube called CUBECTH. The “cube”
accepts queries on several dimensions and hierarchies. Several examples illustrate how OLAP
operations are used for the analysis of the data cube. The examples prove that OLAP is a
flexible tool that is suitable for complicated analyses of multidimensional data. The analyses
are done in a screen-efficient way.
Acknowledgements
The work described in this paper was supported by the
CEEX – National Research and Development Program
of the Ministry of Education and Research – Contract
28/2005.
[Figure 2: view of the cube CUBECTH. Dimensions: Production level, Cropping Systems, Time, Crops, Producer Type. Measures shown: Production, Production costs, N dose, P dose. Rows: each crop (Sun Flower, Wheat, Barley, Corn, Rapeseed, Soybean) broken down by producer type (Family, Individual Households, Commercial Societies).]
Figure 2. Drill down operation, dimension “Crop” and “Producer type”
[Figure 3: view of the cube CUBECTH. Dimensions: Production Level, Crop System, Producer, Year, Crops. Producer selected: Commercial Societies. Measures shown: Production Min, Production Max, Total Production Costs, N dose. Row labels, repeated in four groups: Sun Flower, Wheat, Barley, Crop, Rapeseed, Soybean.]
Figure 3. Application of the drill down operation over the dimension “Time” and “Crop” and application of the
slice and dice operations for the producer “Commercial Society”
References:
[1] S. Chaudhuri, U. Dayal, An overview of data warehouse and OLAP technology, ACM SIGMOD Record 26(1), 1997, pp. 65–74.
[2] J. Han, M. Kamber, Data mining, Concepts and Techniques, Elsevier, 2006.
[3] R. Kimball, The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses, John Wiley & Sons, Inc., 1996.
[4] S. Nilakanta, K. Scheibe, A. Rai, Dimensional issues in agricultural data warehouse designs, Computers and Electronics in Agriculture, v.60 no.2, 2008, pp. 263-278.
[5] A. Rai, V. Dubey, K. K. Chaturvedi, P. K. Malhotra, Design and development of data mart for animal resources, Computers and Electronics in Agriculture, Volume 64, Issue 2, 2008, pp. 111-119.
[6] C. Z. Rădulescu, D. Enachescu, M. Rădulescu, V. Vlad, D. Veverca, I. Antohe, Set of models, techniques and methods for decision making in sustainable agriculture, Research report, CEEX project 28/2005, ICI, Bucharest, 2006 (in Romanian).
[7] C-Z. Rădulescu, Decision making for sustainable agriculture, Revista Română de Informatică şi Automatică, vol. 16, nr. 4, 2006, pp. 129-138 (in Romanian).
[8] C.Z. Radulescu, M. Radulescu, A multidimensional data model for environment protection, Proc. 12th WSEAS International Conference on COMPUTERS, Heraklion, Greece, 2008, pp. 1101-1106, WSEAS Press.
[9] C.Z. Radulescu, M. Radulescu, V. Vlad, D.M. Motelica, A Multidimensional Data Model and OLAP Analysis for Soil Physical Characteristics, Proc. 9th WSEAS Int. Conf. on Mathematics and Computers in Business and Economics (MCBE'08), Bucharest, 2008, pp. 25-29, WSEAS Press.
[10] C. Răuţă, M. Dumitru, C. Ciobanu, V. Blănaru, St. Cârstea, L. Latiş, D. M. Motelică, R. Lăcătuşu, E. Dumitru, R. Enache, Monitoring of the Romanian soil quality, National Research and Development Institute for Soil Science, Agrochemistry and Environment Protection, Publistar SRL, Bucharest, vol. I and II, 1998, 414 pp. (in Romanian).
[11] Z. Tang, J. MacLennan, Data Mining with SQL Server 2005, Wiley, 2005.
[12] E. Thomsen, OLAP Solutions, Building Multidimensional Information Systems, John Wiley & Sons, 2002.
[13] E. Turban, J. E. Aronson, T. P. Liang, R. Sharda, Decision Support and Business Intelligence Systems, Prentice Hall, 2007.
[14] V. Vlad, E. Târhoacă, D. Popa, V. Albu, R. Iancu, M. Băluţă, M. Tapalagă, A. Canarache, I. Munteanu, N. Florea, A. Rîşnoveanu, L. Vlad, M. Nache, Database of soil profiles (PROFISOL) - Structure and functions, Stiinta Solului / Soil Science, Bucharest, XXXII, nr. 2, 1997, pp. 93-118 (in Romanian).
[15] M. Yost, Data warehousing and decision support at the National Agricultural Statistics Service, Social Science Computer Review, v.18 no.4, 2000, pp. 434-441.
[16] Yu Zhao, Q. I. Guoqiang, Application of data warehouse in decision support system of rice cultivation management, Journal of Northeast Agricultural University, Vol. 37, No. 4, 2006, pp. 557-562.
ISSN: 1790-5109
ISBN: 978-960-474-063-5