Proceedings of the 10th WSEAS Int. Conference on MATHEMATICS and COMPUTERS in BUSINESS and ECONOMICS
ISSN: 1790-5109, ISBN: 978-960-474-063-5

A Multidimensional Data Model and OLAP Analysis for Agricultural Production

CONSTANTA ZOIE RADULESCU
National Institute for Research and Development in Informatics, 8-10 Averescu Avenue, 011455, Bucharest 1, ROMANIA, [email protected]

MARIUS RADULESCU
Institute of Mathematical Statistics and Applied Mathematics, Casa Academiei Române, Calea 13 Septembrie nr.13, Bucharest 5, RO-050711, ROMANIA, [email protected]

ADRIAN TUREK RAHOVEANU
Research Institute for Agricultural Economics and Rural Development - ICEADR, Bd. Măraşti nr. 61, Bucharest, ROMANIA, [email protected]

Abstract: This paper builds a multidimensional data model called CULTEH and defines its dimensions, hierarchies and facts. Based on this model an OLAP cube called CUBECTH is built. The OLAP cube accepts queries on several dimensions and hierarchies. OLAP operations are used to analyze some economic features of agricultural production for a period of 5 years and 12 levels of agricultural production. The economic features include cropping systems, crops, types of farmers and fertilizer consumption.

Key-Words: On-Line Analytical Processing (OLAP), multidimensional data model, OLAP operations, data cube, agricultural production

1. Introduction
The multidimensional database is a database concept dedicated to meeting the demands of decision support systems. To understand what the data is really saying, managers usually need to investigate data from different perspectives and to change the navigation according to previous observations. For this purpose, data from various operational sources are reconciled and stored in a repository database using a multidimensional data model [1].

Data warehouses and multidimensional data analysis use multidimensional data models. These models allow analysts to navigate easily through data structures and to understand and exploit all the data. They improve the analysts' capacity to visualize abstract queries. Multidimensional modeling is a conceptual modeling technique used by OLAP applications. Statistical, geographical and temporal databases are strongly connected to multidimensional data modeling.

On-Line Analytical Processing (OLAP) is a trend in database technology, based on the multidimensional view of data, and it is an indispensable component of the so-called business intelligence technology [1]-[3], [10][13]. Technologies based on data warehouses and OLAP allow rapid analysis and sharing of data and information for critical decision activities. In recent years a great number of commercial products based on OLAP technologies have been used, mainly for the analysis of business activities. However, there is a recent trend toward using these technologies in other domains such as industry [8], agriculture [4]-[7], [14]-[16], environmental protection [7]-[9] and transport.

At present the management of agricultural production is a very important problem for the development of a sustainable economy. Agricultural databases contain a great number of agricultural parameters: data on agricultural inputs such as seeds and fertilizers, statistics on the consumption of fertilizers, area, yield of crops, market prices, land quality, livestock statistics, land-use pattern statistics etc. Monitoring all these parameters is a very difficult task. Consequently the management of agricultural production raises many problems of increasing complexity and diversity that cannot be solved without computerized tools. Business Intelligence technologies such as OLAP, data warehousing, data mining and decision support tools have proven very useful for the management of agricultural production, see [ ]. For example, the DSSAT4 package based on Business Intelligence technologies, developed through the financial support of USAID during the 80's and 90's, has allowed rapid assessment of several agricultural production systems around the world to facilitate decision-making at the farm and policy levels. There are, however, many constraints to the successful adoption of DSS in agriculture.

The management of information requires that data be shared and globally accessible by all the heterogeneous products found in today's information technology environment. Current OLAP tools are suitable for this task, since they assume that the data are available in a centralized data warehouse. However, the inherently distributed nature of data collection and the huge amount of data extracted at each collection point make it impractical to gather all data at a centralized site. One solution is to maintain a distributed data warehouse, consisting of local data warehouses at each collection point and a coordinator site, with most of the processing being performed at the local sites.

The paper presents an approach based on OLAP technology for solving some decision problems about agricultural production. A multidimensional data model, called CULTEH, is constructed in section 2. Based on this model a data cube called CUBECTH is built. OLAP operations are used in CUBECTH to analyze some economic features of agricultural production for a period of 5 years and 12 levels of agricultural production.

2. A multidimensional data model
The multidimensional data model allows the user to visualize data along multiple dimensions. It is defined by tables called dimensions and facts. “Dimensions” contain the description of data that gives a meaning to the numbers contained in the table “Facts”. Usually “Dimensions” contain alphanumerical values. “Facts” contain the numerical values that the user wants to analyze. The tables “Dimensions” and “Facts” are connected to each other by structures such as a star schema, a snowflake schema or a fact constellation.

In the present section we build a multidimensional data model called CULTEH. The data used in building our multidimensional data model were provided by several databases from the Research Institute for Agricultural Economics and Rural Development - ICEADR. The multidimensional data model has five dimensions: Crops, Production level, Producer, Cropping systems and Time.

The dimension “Crops” contains the agricultural crops. For the CULTEH model 6 agricultural crops were considered: wheat, corn, barley, sunflower, rapeseed and soybean. For the dimension “Production level” 12 levels of production were considered. The dimension “Producer” refers to the type of producer. For the CULTEH model 3 types of producers were considered: individual households, family associations, and commercial societies. The dimension “Cropping systems” refers to the types of cropping systems. For the CULTEH model the irrigated and non-irrigated types were considered. The dimension “Time” contains a period, measured in years, for which data on agricultural production exist.

In our model “Facts” are represented by the economic features of the agricultural production. These are presented in Table 1.

No.   Economic feature                              Unit of measurement
1     Main production                               Kg
2     Total production costs                        Lei
3     Production cost                               Lei/Kg
4     Market price                                  Lei/Kg
5     Subsidies                                     Lei
6     Return rate in the presence of subsidies      %
7     N fertilizer dose                             Kg active subst.
8     P fertilizer dose                             Kg active subst.
9     K fertilizer dose                             Kg active subst.

Table 1. Economic features of the agricultural production considered in the CULTEH model.

The measures considered for these features are the minimum (min), the maximum (max) and the average (avg).
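The model described above (dimension tables linked to a fact table, with min, max and avg as measures) can be sketched with Python's built-in sqlite3 module. This is only a minimal sketch under stated assumptions, not the authors' implementation: the table and column names are simplified and hypothetical, only two of the nine features from Table 1 are stored, and every numeric value is invented for illustration.

```python
import sqlite3


def build_star_schema():
    """Create an in-memory star schema: five dimension tables and one fact table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified names; they are not the CULTEH table names.
    conn.executescript(
        """
        CREATE TABLE crop (crop_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE production_level (level_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE producer (producer_id INTEGER PRIMARY KEY, type TEXT);
        CREATE TABLE cropping_system (system_id INTEGER PRIMARY KEY, type TEXT);
        CREATE TABLE time_dim (year_id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE facts (
            crop_id INTEGER REFERENCES crop(crop_id),
            level_id INTEGER REFERENCES production_level(level_id),
            producer_id INTEGER REFERENCES producer(producer_id),
            system_id INTEGER REFERENCES cropping_system(system_id),
            year_id INTEGER REFERENCES time_dim(year_id),
            main_production_kg REAL,
            total_cost_lei REAL
        );
        """
    )
    conn.executemany("INSERT INTO crop VALUES (?, ?)", [(1, "wheat"), (2, "corn")])
    conn.executemany("INSERT INTO production_level VALUES (?, ?)", [(1, "level 1")])
    conn.executemany(
        "INSERT INTO producer VALUES (?, ?)",
        [(1, "individual household"), (2, "commercial society")],
    )
    conn.executemany(
        "INSERT INTO cropping_system VALUES (?, ?)",
        [(1, "irrigated"), (2, "non-irrigated")],
    )
    conn.executemany("INSERT INTO time_dim VALUES (?, ?)", [(1, "year 1"), (2, "year 2")])
    # Fact rows: (crop, level, producer, system, year, production, cost).
    # All values are invented for illustration only.
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 1, 2, 1, 3000.0, 1500.0),
            (1, 1, 2, 1, 2, 4000.0, 1800.0),
            (2, 1, 1, 2, 1, 5000.0, 2000.0),
            (2, 1, 2, 1, 2, 7000.0, 2600.0),
        ],
    )
    return conn


def production_measures_by_crop(conn):
    """Return (crop, min, max, avg) of main production for each crop."""
    query = """
        SELECT c.name,
               MIN(f.main_production_kg),
               MAX(f.main_production_kg),
               AVG(f.main_production_kg)
        FROM facts AS f
        JOIN crop AS c ON c.crop_id = f.crop_id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


conn = build_star_schema()
for row in production_measures_by_crop(conn):
    print(row)
```

Each fact row carries one key per dimension, so any dimension table can be joined to the facts in the same way as the crop table is joined here.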
More details regarding the multidimensional data model and OLAP analysis can be found in [5]. The structure of our multidimensional data model is a star schema (see Figure 1). The tables “Dimensions” and “Facts” are linked through link codes shared between the corresponding tables of the star schema. The cube that implements the multidimensional data model is called CUBECTH. Data from this cube can be analyzed using OLAP operations.

[Figure 1. The star schema of the multidimensional data model CULTEH. Tables and columns shown:
dbo.Facts: codec, codey, codecs, codep, coden, Production, Total production costs, Production cost, Market price, Subsides, Rate of return with subsidies, N dose, P dose, K dose
dbo.Crops: codec, cropname, description
dbo.Time: codey, type
dbo.Cropping System: codecs, type
dbo.Producer: codep, type
dbo.Level: coden, name]

3. OLAP analysis of data cube CUBECTH
Typical operations for the analysis of a data cube are roll-up, drill-down, slice and dice, and pivoting. With these operations one can obtain immediate answers to queries by navigating dynamically through the multidimensional structure and working at various hierarchical levels (summarized or detailed).

The roll-up operation summarizes the data. The summarization is achieved either by moving from a lower level to a higher level in a hierarchy of a dimension or by reducing the number of dimensions.

The drill-down operation is the inverse of the roll-up operation. It is the transition from a higher level of summarization to a lower level. The level of detail can also be increased by adding new dimensions. The first example applies the drill-down operation to all crops and all types of producers. This example is illustrated in Figure 2.
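The roll-up and drill-down operations described above can be sketched in a few lines of plain Python. This is a hedged illustration, not the Analysis Manager implementation: the fact records, the years and the helper name `aggregate` are all hypothetical, and the sum is used as the measure only to keep the example short (the model itself uses min, max and avg).

```python
from collections import defaultdict

# Invented illustrative facts; none of these values come from the paper.
FACTS = [
    {"crop": "wheat", "producer": "individual household", "year": 2004, "production_kg": 3000},
    {"crop": "wheat", "producer": "commercial society", "year": 2004, "production_kg": 4000},
    {"crop": "corn", "producer": "individual household", "year": 2004, "production_kg": 5000},
    {"crop": "corn", "producer": "commercial society", "year": 2005, "production_kg": 7000},
]


def aggregate(facts, dims, measure="production_kg"):
    """Sum `measure` over the facts, grouped by the dimensions listed in `dims`."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[d] for d in dims)
        totals[key] += fact[measure]
    return dict(totals)


# Drill-down: adding the "producer" dimension gives a more detailed view.
by_crop_and_producer = aggregate(FACTS, ["crop", "producer"])

# Roll-up: dropping "producer" summarizes the same data per crop.
by_crop = aggregate(FACTS, ["crop"])

# Rolling up over every dimension leaves a single grand total.
grand_total = aggregate(FACTS, [])

print(by_crop_and_producer)
print(by_crop)
print(grand_total)
```

Moving between the three views only changes the list of grouping dimensions, which is the sense in which roll-up reduces dimensions and drill-down adds them.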
Slice and dice operations on cubes involve:
• The selection of a partition for each dimension of a multidimensional data model (this is realized by queries with the “group by” clause)
• Dicing from a particular partition along one or several dimensions (corresponding to the “where” clause)

The pivoting operation reorients the three-dimensional (3D) data cube for visualization in two-dimensional (2D) planes.

The examples in this section illustrate the application of OLAP operations from Analysis Manager to the cube CUBECTH.

The second example applies the drill-down operation over the dimensions “Time” and “Crops” and the slice and dice operation for the producer “Commercial Society”. This example is illustrated in Figure 3.

4. Conclusions
The great majority of multidimensional data models are business oriented. The paper presents a multidimensional data model that is oriented towards the management of agricultural production. Based on the multidimensional data model we build an OLAP cube called CUBECTH. The cube accepts queries on several dimensions and hierarchies. Several examples illustrate how OLAP operations are used for the analysis of the data cube. The examples show that OLAP is a flexible tool that is suitable for complicated analyses of multidimensional data. The analyses are done in a screen-efficient way.

Acknowledgements
The work described in this paper was supported by the CEEX – National Research and Development Program of the Ministry of Education and Research – Contract 28/2005.
[Figure 2. Drill down operation, dimensions “Crop” and “Producer type”. Screenshot of the cube CUBECTH: the rows list each crop (sunflower, wheat, barley, corn, rapeseed, soybean) broken down by producer type (family associations, individual households, commercial societies); the columns show production, production costs, N dose and P dose.]

[Figure 3. Application of the drill down operation over the dimensions “Time” and “Crop” and application of the slice and dice operations for the producer “Commercial Society”. Screenshot of the cube CUBECTH: the rows list the crops for each year; the columns show minimum production, maximum production, total production costs and N dose.]

References:
[1] S. Chaudhuri, U. Dayal, An overview of data warehouse and OLAP technology, ACM SIGMOD Record 26(1), 1997, pp. 65-74.
[2] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Elsevier, 2006.
[3] R. Kimball, The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses, John Wiley & Sons, Inc., 1996.
[4] S. Nilakanta, K. Scheibe, A. Rai, Dimensional issues in agricultural data warehouse designs, Computers and Electronics in Agriculture, v.60 no.2, 2008, pp. 263-278.
[5] A. Rai, V. Dubey, K. K. Chaturvedi, P. K. Malhotra, Design and development of data mart for animal resources, Computers and Electronics in Agriculture, Volume 64, Issue 2, 2008, pp. 111-119.
[6] C. Z. Rădulescu, D. Enachescu, M. Rădulescu, V. Vlad, D. Veverca, I. Antohe, Set of models, techniques and methods for decision making in sustainable agriculture, Research report, CEEX project 28/2005, ICI, Bucharest, 2006 (in Romanian).
[7] C-Z. Rădulescu, Decision making for sustainable agriculture, Revista Română de Informatică şi Automatică, vol. 16, nr. 4, 2006, pp. 129-138 (in Romanian).
[8] C.Z. Radulescu, M. Radulescu, A multidimensional data model for environment protection, Proc. 12th WSEAS International Conference on COMPUTERS, Heraklion, Greece, 2008, pp. 1101-1106, WSEAS Press.
[9] C.Z. Radulescu, M. Radulescu, V. Vlad, D.M. Motelica, A Multidimensional Data Model and OLAP Analysis for Soil Physical Characteristics, Proc. 9th WSEAS Int. Conf. on Mathematics and Computers in Business and Economics (MCBE'08), Bucharest, 2008, pp. 25-29, WSEAS Press.
[10] C. Răuţă, M. Dumitru, C. Ciobanu, V. Blănaru, St. Cârstea, L. Latiş, D. M. Motelică, R. Lăcătuşu, E. Dumitru, R. Enache, Monitoring of the Romanian soil quality, National Research and Development Institute for Soil Science, Agrochemistry and Environment Protection, Publistar SRL, Bucharest, vol. I and II, 1998, 414 pp. (in Romanian).
[11] Z. Tang, J. MacLennan, Data Mining with SQL Server 2005, Wiley, 2005.
[12] E. Thomsen, OLAP Solutions: Building Multidimensional Information Systems, John Wiley & Sons, 2002.
[13] E. Turban, J. E. Aronson, T. P. Liang, R. Sharda, Decision Support and Business Intelligence Systems, Prentice Hall, 2007.
[14] V. Vlad, E. Târhoacă, D. Popa, V. Albu, R. Iancu, M. Băluţă, M. Tapalagă, A. Canarache, I. Munteanu, N. Florea, A. Rîşnoveanu, L. Vlad, M. Nache, Database of soil profiles (PROFISOL) - Structure and functions, Stiinta Solului / Soil Science, Bucharest, XXXII, nr. 2, 1997, pp. 93-118 (in Romanian).
[15] M. Yost, Data warehousing and decision support at the National Agricultural Statistics Service, Social Science Computer Review, v.18 no.4, 2000, pp. 434-441.
[16] Yu Zhao, Q. I. Guoqiang, Application of data warehouse in decision support system of rice cultivation management, Journal of Northeast Agricultural University, Vol. 37, No. 4, 2006, pp. 557-562.