INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN ENGINEERING, TECHNOLOGY AND SCIENCES (IJ-CA-ETS)

AN INVESTIGATION AND EVALUATION ON PRÉCISED DECISION FOR SCIENTIFIC DATA USING NEW APPROACHES IN DMDW

1 Dr. Sudarson Jena, 2 Santhosh Pasuladi, 3 Karthik Kovuri, 4 G. L. Anand Babu
1 GITAM University, Rudraram, Hyderabad, A.P., India
2 S.V. College of Engineering and Technology, Moinabad (M), R.R. Dist., A.P., India
3 Trinity College of Engineering & Technology, Peddapally (M), Karimnagar Dist. 505172
4 CVSR College of Engineering, Venkatapur (V), Ghatkesar, R.R. Dist. 501301
[email protected], [email protected], [email protected], [email protected]

ABSTRACT. This paper provides an overview of scientific data warehousing and on-line analytical processing (OLAP) technologies, with an emphasis on their data warehousing requirements. The methods that we use include the efficient computation of data cubes by integrating MOLAP and ROLAP techniques, the integration of data cube methods with dimension relevance analysis and data dispersion analysis for concept description, and data cube based multi-level association, classification, prediction and clustering techniques.

Keywords: Scientific Data Warehouses, OLAP, Data Mining, On-Line Analytical Mining (OLAM), DBM, Data Cubes.

I. Introduction

Nowadays we find ourselves in a decade dominated by the expansion of multimedia data. The growing interest in the storage of, and knowledge discovery from, data in heterogeneous forms (text, images, video, relational views, etc.), which we shall call complex data, drives research communities toward new architectures and processing tools that are better adapted. Complex data, besides being heterogeneous, often encloses several classic data items. For instance, an image can be described by several characteristics/descriptors, each of which constitutes data to be analyzed. How, then, can we represent these data? Complex data warehousing needs innovation in each of its phases in order to answer this question. The ETL phase must be adapted to take the specific nature of complex data into account. Furthermore, the multidimensional modeling is not obvious: it needs to consider all possible information concerning the complex data. For example, some of this information can be determined through data mining techniques. In this context, we propose a new approach to the complex data warehousing process, focusing on the data integration and multidimensional modeling phases.

Research and development produce a very large amount of scientific and technical data. The analysis and interpretation of these data is crucial for the proper understanding of scientific and technical phenomena and the discovery of new concepts. Data warehousing and on-line analytical processing (OLAP) are essential elements of decision support, which has increasingly become a focus of the database industry. Many commercial products and services are now available, and all of the principal database management system vendors now have offerings in these areas. Decision support places rather different requirements on database technology compared to traditional on-line transaction processing applications. Data warehousing (DW) and OLAP systems based on a dimensional view of data are being used increasingly in traditional business applications as well as in applications such as health care and biochemistry for the purpose of analyzing very large amounts of data.
The use of DW and OLAP systems for scientific purposes raises several new challenges for the traditional technology. Efficient implementation and fast response are the major challenges in realizing on-line analytical mining over large databases and scientific data warehouses. Therefore, this study focuses on the efficient implementation of the on-line analytical mining mechanism. The methods that we use include the efficient computation of data cubes by integrating MOLAP and ROLAP techniques, the integration of data cube methods with dimension relevance analysis and data dispersion analysis for concept description, and data cube based multi-level association, classification, prediction and clustering techniques.

II. Data Mining Methods

A. Statistics

Several statistical methods used in data mining projects are widely applied in science and industry and provide excellent features for describing and visualizing large chunks of data. Methods commonly used include regression analysis, correlation, CHAID analysis, hypothesis testing, and discriminant analysis.

Pros
Statistical analysis is often a good "first step" in understanding data. These methods deal well with numerical data for which the underlying probability distributions are known. They are not as good with nominal data such as "good", "better", "best" or "Europe", "North America", "Asia", "South America".

Cons
Statistical methods require statistical expertise, or a project member well versed in statistics who is heavily involved. Such methods rest on statistical assumptions that are difficult to verify, and they do not deal well with non-numerical data. They also suffer from the "black box aversion syndrome": non-technical decision makers, those who will either accept or reject the results of the study, are often unwilling to make important decisions based on a technology that gives them answers but does not explain how it arrived at those answers. Telling a non-statistician CEO that he or she must make a crucial business decision because of a favorable R statistic is not usually well received (with a tool such as Nuggets®, by contrast, the user can be told exactly how the conclusion was reached). Another problem is that statistical methods are valid only if certain assumptions about the data are met, such as linear relationships between pairs of variables, absence of multicollinearity, normal probability distributions, and independence of samples.
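To make the statistical "first step" concrete, the short sketch below computes a Pearson correlation and a one-variable least-squares fit in plain Python. It is only an illustration under assumed data: the advertising-spend and sales figures are invented, and no particular statistics package from the text is implied.

```python
# Hedged illustration: Pearson correlation and a simple least-squares fit
# over a small, made-up numeric sample (advertising spend vs. sales).

from math import sqrt

ad_spend = [10.0, 14.0, 19.0, 23.0, 30.0]   # hypothetical figures
sales    = [25.0, 31.0, 40.0, 47.0, 60.0]

n = len(ad_spend)
mean_x = sum(ad_spend) / n
mean_y = sum(sales) / n

# Covariance and variances (population form is fine for illustration).
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales)) / n
var_x  = sum((x - mean_x) ** 2 for x in ad_spend) / n
var_y  = sum((y - mean_y) ** 2 for y in sales) / n

pearson_r = cov_xy / sqrt(var_x * var_y)

# One-variable least-squares regression: sales ≈ slope * ad_spend + intercept.
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

print(f"r = {pearson_r:.3f}, sales ≈ {slope:.2f} * ad_spend + {intercept:.2f}")
```

Note that such a calculation only makes sense for numeric attributes; for nominal values like "good"/"better"/"best" it has no direct counterpart, which is exactly the weakness noted above.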
B. Neural Nets

This is a popular technology, particularly in the financial community. The method was originally developed in the 1940s to model biological nervous systems in an attempt to mimic thought processes.

Pros
The end result of a neural net project is a mathematical model of the process. It deals primarily with numerical attributes, but not as well with nominal data.

Cons
There is still much controversy regarding the efficacy of neural nets. One major objection to the method is that the development of a neural net model is partly an art and partly a science, in that the results often depend on the individual who built the model. That is, the model form (called the network topology), and hence the results, may differ from one researcher to another for the same data. There is also the frequent problem of "over-fitting", which results in good prediction of the data used to build the model but poor results on new data. The "black box syndrome" applies here as well.

C. Decision Trees

Decision tree methods are techniques for partitioning a training file into a tree representation. The starting node is called the root node. Depending upon the results of a test, this node is partitioned into two or more subsets. Each node is then further partitioned until a tree is built, and the tree can be mapped into a set of rules (a minimal sketch follows below).

Pros
Decision trees are fairly fast, and the results can be presented as rules.

Cons
By far the most important negative for decision trees is that they are forced to make decisions along the way based on limited information, which implicitly leaves the vast majority of potential rules in the training file out of consideration. This approach may leave valuable rules undiscovered, since decisions made early in the process preclude some good rules from being discovered later.
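As a rough sketch of the partitioning idea (not the specific algorithm of any product mentioned in this paper), the following Python fragment recursively splits a tiny training file on the attribute with the highest information gain and prints each root-to-leaf path as a rule. The attribute names and records are invented for illustration.

```python
# A minimal sketch of decision-tree partitioning: recursively split a tiny
# training set on the attribute with the highest information gain and print
# each root-to-leaf path as a rule.  Data and attribute names are made up.

from collections import Counter
from math import log2

records = [
    {"income": "high", "student": "no",  "buys": "no"},
    {"income": "high", "student": "yes", "buys": "yes"},
    {"income": "low",  "student": "yes", "buys": "yes"},
    {"income": "low",  "student": "no",  "buys": "no"},
    {"income": "high", "student": "no",  "buys": "no"},
    {"income": "low",  "student": "yes", "buys": "yes"},
]
TARGET = "buys"

def entropy(rows):
    counts = Counter(r[TARGET] for r in rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def best_attribute(rows, attrs):
    def split_entropy(attr):
        values = {r[attr] for r in rows}
        return sum(
            (len(sub) / len(rows)) * entropy(sub)
            for v in values
            for sub in [[r for r in rows if r[attr] == v]]
        )
    return min(attrs, key=split_entropy)  # lowest remaining entropy = highest gain

def build_rules(rows, attrs, path):
    classes = {r[TARGET] for r in rows}
    if len(classes) == 1 or not attrs:        # pure node, or no attributes left
        label = Counter(r[TARGET] for r in rows).most_common(1)[0][0]
        print("IF " + " AND ".join(path or ["TRUE"]) + f" THEN {TARGET} = {label}")
        return
    attr = best_attribute(rows, attrs)
    for value in sorted({r[attr] for r in rows}):
        subset = [r for r in rows if r[attr] == value]
        build_rules(subset, [a for a in attrs if a != attr], path + [f"{attr} = {value}"])

build_rules(records, ["income", "student"], [])
```

The printed "IF ... THEN ..." lines are exactly the rule form referred to above; the greedy choice of one split attribute at a time is also what causes the limitation described under Cons.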
III. OLAP + Data Mining: On-Line Analytical Mining

On-line analytical processing (OLAP) is a powerful method for multidimensional analysis of data warehouses. Motivated by the popularity of OLAP technology, we use an on-line analytical mining (OLAM) mechanism for multi-dimensional data mining in large databases and scientific data warehouses. We believe this is a promising direction to pursue for scientific data warehouses, based on the following observations.

1. Most data mining tools need to work on integrated, consistent, and cleaned data, which requires costly data cleaning, data transformation and data integration as pre-processing steps. A data warehouse constructed by such pre-processing serves as a valuable source of cleaned and integrated data for OLAP as well as for data mining.
2. Effective data mining needs exploratory data analysis. A user often likes to traverse flexibly through a database, select any portion of relevant data, analyze data at different granularities, and present knowledge/results in different forms. On-line analytical mining provides facilities for exploratory data mining on different subsets of data and at different levels of abstraction.
3. It is often difficult for a user to predict beforehand what kinds of knowledge should be mined. By integrating OLAP with multiple data mining functions, on-line analytical mining gives users the flexibility to select desired data mining functions and to swap data mining tasks dynamically.

However, data mining functions usually cost more than simple OLAP operations. Efficient implementation and fast response are the major challenges in realizing on-line analytical mining over large databases and scientific data warehouses. Therefore, our study has focused on the efficient implementation of the on-line analytical mining mechanism. The methods that we use include the efficient computation of data cubes by integrating MOLAP and ROLAP techniques, the integration of data cube methods with dimension relevance analysis and data dispersion analysis for concept description, and data cube based multi-level association, classification, prediction and clustering techniques. These methods are discussed in detail in the following subsections.

Figure 1. An integrated OLAM and OLAP architecture: a user GUI and APIs over an OLAM engine and an OLAP engine, both working against a data cube constructed from the database and data warehouse through filtering, data cleaning and data integration, with shared metadata.

A. Architecture for On-Line Analytical Mining

An OLAM engine performs analytical mining on data cubes in a similar manner as an OLAP engine performs on-line analytical processing. It is therefore natural to have an integrated OLAM and OLAP architecture, as shown in Figure 1, where the OLAM and OLAP engines both accept users' on-line queries (instructions) and work with the data cube during analysis. Furthermore, an OLAM engine may perform multiple data mining tasks, such as concept description, association, classification, prediction, clustering, time-series analysis, etc. Therefore, an OLAM engine is more sophisticated than an OLAP engine, since it usually consists of multiple mining modules which may interact with each other for effective mining in a scientific data warehouse. Since some requirements in OLAM, such as the construction of numerical dimensions, may not be readily available in commercial OLAP products, we have chosen to construct our own data cubes and build the mining modules on top of them. With many OLAP products available on the market, it is important to develop on-line analytical mining mechanisms directly on top of the constructed data cubes and OLAP engines. Based on our analysis, there is no fundamental difference between the data cube required for OLAP and that required for OLAM, although OLAM analysis may often involve a larger number of dimensions with finer granularities, and thus require more powerful data cube construction and accessing tools than OLAP analysis. Since OLAM engines are constructed either on customized data cubes, which often work with relational database systems, or on top of the data cubes provided by OLAP products, it is advisable to build on-line analytical mining systems on top of existing OLAP and relational database systems rather than from the ground up.

B. Data Cube Construction

Data cube technology is essential for efficient on-line analytical mining. There have been many studies on the efficient computation of and access to multidimensional databases, and these lead us to use data cubes for scientific data warehouses. The attribute-oriented induction method adopts two generalization techniques: (1) attribute removal, which removes attributes that represent low-level data in a hierarchy, and (2) attribute generalization, which generalizes attribute values to their corresponding higher-level ones. Such generalization leads to a new, compressed generalized relation with count and/or other aggregate values accumulated. This is similar to the relational OLAP (ROLAP) implementation of the roll-up operation.
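The group-by/roll-up behaviour described above can be pictured in a few lines of Python. The sketch below is a toy ROLAP-style computation, not the chunk-based array algorithm discussed next: it materializes every cuboid of a three-dimensional cube over a handful of invented fact rows, accumulating a count measure per cell in the same way the generalized relation accumulates count values.

```python
# A toy data-cube computation: for a tiny fact table with three dimensions,
# materialize every cuboid (every subset of dimensions) and accumulate a
# count measure per cell.  Dimension names and rows are invented.

from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("region", "product", "year")

facts = [
    ("Europe", "sensor", 2011),
    ("Europe", "sensor", 2012),
    ("Asia",   "sensor", 2012),
    ("Asia",   "probe",  2011),
    ("Europe", "probe",  2012),
]

cube = {}
for k in range(len(DIMENSIONS) + 1):
    for dims in combinations(range(len(DIMENSIONS)), k):
        cells = defaultdict(int)
        for row in facts:
            key = tuple(row[d] for d in dims)   # project onto the chosen dimensions
            cells[key] += 1                     # roll everything else up into a count
        cube[tuple(DIMENSIONS[d] for d in dims)] = dict(cells)

# The empty grouping is the apex cuboid (grand total); the full grouping is the base cuboid.
for dims, cells in cube.items():
    print(dims or ("ALL",), cells)
```

Even in this toy form it is visible that the number of cuboids doubles with each added dimension, which is the sparsity and materialization problem addressed by the dual data-structure technique below.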
For fast response in OLAP and data mining, the later implementation adopted data cube technology as follows. When the data cube contains a small number of dimensions, or when it is generalized to a high level, the cube is structured as a compressed sparse array but is still stored in a relational database (to reduce the cost of constructing and indexing different data structures), and it is pre-computed using a chunk-based multi-way array aggregation technique. However, when the cube has a large number of dimensions, it becomes very sparse, with a huge number of chunks, and in this case a relational structure is adopted to store and compute the data cube, similar to the ROLAP implementation. We believe such a dual data-structure technique represents a balance between multidimensional OLAP (MOLAP) and relational OLAP (ROLAP) implementations: it ensures fast response time when handling medium-sized cubes/cuboids and high scalability when handling large databases with high dimensionality. Notice that even when adopting the ROLAP technique, it is still unrealistic to materialize all possible cuboids for large databases with high dimensionality; because of the huge number of cuboids, it is wise to materialize more of the generalized, low-dimensionality cuboids, besides considering other factors such as access patterns and the sharing among different cuboids. A 3-D data cube/cuboid can be selected from a high-dimensional data cube and browsed conveniently using the DBMiner 3-D cube browser, where the size of a cell (displayed as a tiny cube) represents the entry count in the corresponding cell and the brightness of the cell represents another measure of the cell. Pivoting, drilling, and slicing/dicing operations can be performed on the data cube browser with mouse clicks.

IV. OLAP++ System Architecture

The overall architecture of the OLAP++ system is shown in Figure 2. The object part of the system is based on the OPM tools, which implement the Object Data Management Group (ODMG) object data model and the Object Query Language (OQL) on top of a relational DBMS, in this case the Oracle RDBMS. The OLAP part of the system is based on Microsoft's SQL Server OLAP Services, using the MultiDimensional Expressions (MDX) query language. When a SumQL++ query is received by the Federation Coordinator (FC), it is first parsed to identify the measures, categories, links, classes and attributes referenced in the query. Based on this, the FC queries the metadata to determine in which databases the object data and the OLAP data reside and which categories are linked to which classes. Based on the object parts of the query, the FC then sends OQL queries to the object databases to retrieve the data for which the particular conditions hold true. This data is then put into a "pure" SumQL statement (i.e., one without object references) as a list of category values. This SumQL statement is sent to the OLAP database layer to retrieve the desired measures, grouped by the requested categories. The SumQL statement is translated into MDX by a separate layer, the "SumQL-to-MDX translator", and the data returned from OLAP Services is returned to the FC. The reason for using the intermediate SumQL statements is to isolate the implementation of the OLAP data from the FC. As an alternative, we have also implemented a translator into SQL statements against a "star schema" relational database design. The system is able to support good query performance even for large databases, while making it possible to integrate existing OLAP data with external data in object databases in a flexible way that can adapt quickly to changing query needs.

Figure 2. OLAP++ architecture: a graphical user interface issues SumQL++ queries to the Federation Coordinator, which sends OQL (via OQL-to-SQL translators) to the object and link data stored in Oracle, and SumQL (via the SumQL-to-MDX translator) to the summary data stored in MS OLAP Services, guided by link metadata.

A. Back End Tools and Utilities

Data warehousing systems use a variety of data extraction and cleaning tools, and load and refresh utilities, for populating warehouses. Data extraction from "foreign" sources is usually implemented via gateways and standard interfaces (such as Information Builders' EDA/SQL, ODBC, Oracle Open Connect, Sybase Enterprise Connect, and Informix Enterprise Gateway).
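As a stand-in for such a gateway, the sketch below uses Python's built-in sqlite3 module to pull rows from an "operational" source table and stage them in a warehouse table. The table names and columns are hypothetical; a real deployment would go through ODBC or one of the vendor gateways listed above rather than in-memory SQLite.

```python
# A minimal extract-and-stage sketch, using sqlite3 as a stand-in for a
# gateway/standard interface.  Table names and columns are hypothetical.

import sqlite3

source = sqlite3.connect(":memory:")       # pretend operational system
warehouse = sqlite3.connect(":memory:")    # pretend warehouse staging area

source.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "Europe", 120.0), (2, "Asia", 75.5), (3, "Europe", 210.0)],
)

warehouse.execute("CREATE TABLE stage_orders (id INTEGER, region TEXT, amount REAL)")

# Extract from the source and load the staging table in one pass.
rows = source.execute("SELECT id, region, amount FROM orders").fetchall()
warehouse.executemany("INSERT INTO stage_orders VALUES (?, ?, ?)", rows)
warehouse.commit()

print(warehouse.execute("SELECT COUNT(*), SUM(amount) FROM stage_orders").fetchone())
```

The cleaning and transformation steps described next would normally sit between the extract and the load of such a pipeline.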
Data Cleaning: Since a data warehouse is used for decision making, it is important that the data in the warehouse be correct. However, since large volumes of data from multiple sources are involved, there is a high probability of errors and anomalies in the data. Therefore, tools that help to detect data anomalies and correct them can have a high payoff. Some examples where data cleaning becomes necessary are inconsistent field lengths, inconsistent descriptions, inconsistent value assignments, missing entries, and violations of integrity constraints. Not surprisingly, optional fields in data entry forms are significant sources of inconsistent data. There are three related, but somewhat different, classes of data cleaning tools. Data migration tools allow simple transformation rules to be specified, e.g., "replace the string gender by sex"; Warehouse Manager from Prism is an example of a popular tool of this kind. Data scrubbing tools use domain-specific knowledge (e.g., postal addresses) to do the scrubbing of data; they often exploit parsing and fuzzy matching techniques to accomplish cleaning from multiple sources, and some make it possible to specify the "relative cleanliness" of sources. Tools such as Integrity and Trillium fall in this category. Data auditing tools make it possible to discover rules and relationships (or to signal violations of stated rules) by scanning data; such tools may thus be considered variants of data mining tools. For example, such a tool may discover a suspicious pattern (based on statistical analysis), such as a certain car dealer having never received any complaints.

Load: After extracting, cleaning and transforming, data must be loaded into the warehouse. Additional preprocessing may still be required: checking integrity constraints; sorting; summarization, aggregation and other computation to build the derived tables stored in the warehouse; building indices and other access paths; and partitioning to multiple target storage areas. The load utilities for data warehouses have to deal with much larger data volumes than those for operational databases, and there is only a small time window (usually at night) when the warehouse can be taken offline to refresh it. Sequential loads can take a very long time; loading a terabyte of data, for example, can take weeks or months. Hence, pipelined and partitioned parallelisms are typically exploited [6]. Doing a full load has the advantage that it can be treated as a long batch transaction that builds up a new database: while it is in progress, the current database can still support queries, and when the load transaction commits, the current database is replaced with the new one. Using periodic checkpoints ensures that if a failure occurs during the load, the process can restart from the last checkpoint. However, even using parallelism, a full load may still take too long. Most commercial utilities (e.g., the RedBrick Table Management Utility) therefore use incremental loading during refresh to reduce the volume of data that has to be incorporated into the warehouse: only the updated tuples are inserted. However, the load process is then harder to manage. The incremental load conflicts with ongoing queries, so it is treated as a sequence of shorter transactions (which commit periodically, e.g., after every 1,000 records or every few seconds), and this sequence of transactions has to be coordinated to ensure consistency of derived data and indices with the base data.
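The "sequence of shorter transactions" idea can be illustrated in Python with sqlite3: updated tuples are applied in batches and a commit is issued after every 1,000 records, so a failure only loses the in-flight batch. The batch size, table and tuple feed are illustrative stand-ins, and coordination with derived tables and indices is left out of the sketch.

```python
# Incremental refresh as a sequence of short transactions: commit after every
# BATCH_SIZE updated tuples so a failure only loses the in-flight batch.
# The table and the feed of updated tuples are illustrative stand-ins.

import sqlite3

BATCH_SIZE = 1000

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

def updated_tuples():
    """Pretend feed of tuples that changed in the sources since the last refresh."""
    for i in range(2500):
        yield (i, "Europe" if i % 2 else "Asia", float(i))

batch = []
for row in updated_tuples():
    batch.append(row)
    if len(batch) >= BATCH_SIZE:
        warehouse.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", batch)
        warehouse.commit()               # checkpoint: this batch is now durable
        batch.clear()

if batch:                                # flush the final partial batch
    warehouse.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", batch)
    warehouse.commit()

print(warehouse.execute("SELECT COUNT(*) FROM sales").fetchone()[0])
```

Each commit plays the role of a checkpoint, which is why a failed refresh can restart from the last committed batch rather than from the beginning.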
B. Conceptual Model and Front End Tools

A popular conceptual model that influences the front-end tools, database design, and the query engines for OLAP is the multidimensional view of data in the warehouse. In a multidimensional data model, there is a set of numeric measures that are the objects of analysis. Examples of such measures are sales, budget, revenue, inventory, and ROI (return on investment). Each of the numeric measures depends on a set of dimensions, which provide the context for the measure. For example, the dimensions associated with a sale amount can be the city, the product name, and the date when the sale was made. The dimensions together are assumed to uniquely determine the measure; thus, the multidimensional model views a measure as a value in the multidimensional space of dimensions. Each dimension is described by a set of attributes. For example, the Product dimension may consist of four attributes: the category and the industry of the product, the year of its introduction, and the average profit margin. For instance, the soda Surge belongs to the category beverage and the food industry, was introduced in 1996, and may have an average profit margin of 80%. The attributes of a dimension may be related via a hierarchy of relationships; in the above example, the product name is related to its category and industry attributes through such a hierarchical relationship.

C. Front End Tools

The multidimensional data model grew out of the view of business data popularized by PC spreadsheet programs, which were extensively used by business analysts. The spreadsheet is still the most compelling front-end application for OLAP. The challenge in supporting a query environment for OLAP can be crudely summarized as that of supporting spreadsheet operations efficiently over large, multi-gigabyte databases. Indeed, the Essbase product of Arbor Corporation uses Microsoft Excel as the front-end tool for its multidimensional engine.
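The multidimensional view just described can be phrased in a few lines of Python: a sales measure keyed by its dimensions, a Product dimension table carrying the descriptive attributes, and a roll-up of sales from product to category along the dimension hierarchy. Only the Surge description follows the text; the cities, dates and amounts are invented for illustration.

```python
# A small multidimensional model: a 'sales' measure determined by
# (city, product, date), a Product dimension with descriptive attributes,
# and a roll-up from product to category along the dimension hierarchy.
# All figures except the Surge description are invented.

from collections import defaultdict

# Fact cells: (city, product, date) -> sales amount.
sales = {
    ("Hyderabad", "Surge", "1996-07-01"): 1200.0,
    ("Hyderabad", "Surge", "1996-08-01"):  950.0,
    ("Mumbai",    "Cola",  "1996-07-01"): 2100.0,
}

# Product dimension: attributes, including the hierarchy product -> category -> industry.
product_dim = {
    "Surge": {"category": "beverage", "industry": "food", "introduced": 1996, "margin": 0.80},
    "Cola":  {"category": "beverage", "industry": "food", "introduced": 1990, "margin": 0.55},
}

# Roll the sales measure up from the product level to the category level.
sales_by_category = defaultdict(float)
for (city, product, date), amount in sales.items():
    sales_by_category[product_dim[product]["category"]] += amount

print(dict(sales_by_category))   # e.g. {'beverage': 4250.0}
```

The dictionary key (city, product, date) is exactly the set of dimensions assumed to uniquely determine the measure, and the lookup through product_dim is the hierarchical relationship used for roll-up.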
V. Advantages

On-line analytical processing (OLAP) is a powerful method for multi-dimensional analysis of data warehouses. An OLAM engine may perform multiple data mining tasks, such as concept description, association, classification, prediction, clustering, time-series analysis, etc.; it is therefore more sophisticated than an OLAP engine, since it usually consists of multiple mining modules which may interact with each other for effective mining. Based on our analysis, there is no fundamental difference between the data cube required for OLAP and that required for OLAM, although OLAM analysis may often involve a larger number of dimensions with finer granularities, and thus require more powerful data cube construction and accessing tools than OLAP analysis.

The attribute-oriented induction method adopts two generalization techniques: (1) attribute removal, which removes attributes that represent low-level data in a hierarchy, and (2) attribute generalization, which generalizes attribute values to their corresponding higher-level ones. Such generalization leads to a new, compressed generalized relation with count and/or other aggregate values accumulated. Data warehousing systems use a variety of data extraction and cleaning tools, and load and refresh utilities, for populating warehouses; data extraction from "foreign" sources is usually implemented via gateways and standard interfaces. Data cleaning, load, refresh and after-row operations can be performed more efficiently. Data cleaning is a problem reminiscent of heterogeneous data integration, a problem that has been studied for many years, but here the emphasis is on data inconsistencies instead of schema inconsistencies. Data cleaning, as indicated above, is also closely related to data mining, with the objective of suggesting possible inconsistencies.

This architecture gives the user a multidimensional view of data and provides easy drill-down, rotation and ad hoc analysis of data. It can also support an iterative discovery process and provide unique descriptions across all levels of data. The OLAP layer in this architecture can empower end users to do their own scientific analysis, with ease of use and an easy drill-down facility, and it requires virtually no knowledge of the underlying tables on the part of the users. The architecture can also improve exception analysis and variance analysis. It provides high query performance, keeps local processing at the sources unaffected, can operate when sources are unavailable, and can query data not stored in a DBMS through extra information kept at the warehouse.

The use of DW and OLAP systems for scientific purposes raises several new challenges for the traditional technology. The methods that we use include the efficient computation of data cubes by integrating MOLAP and ROLAP techniques, the integration of data cube methods with dimension relevance analysis and data dispersion analysis for concept description, and data cube based multi-level association, classification, prediction and clustering techniques. We have described back end tools for extracting, cleaning and loading data into a scientific data warehouse, multidimensional data models typical of OLAP, front end client tools for querying and data analysis, and tools for metadata management and for managing the warehouse.

CONCLUSION

Data warehouses for scientific purposes pose several great challenges to existing data warehouse technology. This paper has provided an overview of scientific data warehousing and OLAP technologies, with an emphasis on their data warehousing requirements. Data warehousing using a multidimensional view and on-line analytical processing have become very popular in both business and science in recent years and are essential elements of decision support, analysis and interpretation of data.
The methods used include the efficient computation of data cubes by integrating MOLAP and ROLAP techniques, the integration of data cube methods with dimension relevance analysis and data dispersion analysis for concept description, and data cube based multi-level association, classification, prediction and clustering techniques. We have also described back end tools for extracting, cleaning and loading data into a scientific data warehouse, and multidimensional data models typical of OLAP.

ACKNOWLEDGMENT

Part of the work presented here resulted from work done by a research scholar of JNTU, Hyderabad. Special thanks to Dr. Madhan Kumar Srinivas, member of Research and Development, Infosys, Mysore, for his interest in this work. Thanks to Shri Manohar Reddy, Chairman of the Trinity Group of Institutions, for his constant encouragement in the preparation of this work. Shri Rajeshwar Reddy, Chairman, CVSR College of Engineering, Venkatapur (V), Ghatkesar, also deserves acknowledgment for funding and for extending the existing facilities for execution of the proposed problem.

REFERENCES

[1] Microsoft Corporation. OLE DB for OLAP Version 1.0 Specification. Microsoft Technical Document, 1998.
[2] The OLAP Report. Database Explosion. www.olapreport.com/DatabaseExplosion.htm, February 18, 2000.
[3] T. B. Pedersen and C. S. Jensen. Research Issues in Clinical Data Warehousing. In Proceedings of the Tenth International Conference on Statistical and Scientific Database Management, pp. 43–52, 1998.
[4] T. B. Pedersen, C. S. Jensen, and C. E. Dyreson. Supporting Imprecision in Multidimensional Databases Using Granularities. In Proceedings of the Eleventh International Conference on Statistical and Scientific Database Management, pp. 90–101, 1999.
[5] T. B. Pedersen, C. S. Jensen, and C. E. Dyreson. Extending Practical Pre-Aggregation in On-Line Analytical Processing. In Proceedings of the Twenty-fifth International Conference on Very Large Data Bases, pp. 663–674, 1999.
[6] T. B. Pedersen and C. S. Jensen. Multidimensional Data Modeling for Complex Data. In Proceedings of the Fifteenth International Conference on Data Engineering, 1999. Extended version available as TimeCenter Technical Report TR-37.
[7] The OLAP Council, http://www.olapcouncil.org
[8] E. F. Codd, S. B. Codd, and C. T. Salley. Providing OLAP (On-Line Analytical Processing) to User-Analysts: An IT Mandate. Available from Arbor Software's web site, http://www.arborsoft.com/OLAP.html.
[9] R. Kimball. The Data Warehouse Toolkit. John Wiley, 1996.
[10] T. Barclay, R. Barnes, J. Gray, and P. Sundaresan. Loading Databases Using Dataflow Parallelism. SIGMOD Record, 23(4), December 1994.
[11] P. O'Neil and D. Quass. Improved Query Performance with Variant Indices. In Proceedings of the SIGMOD Conference, 1997.
[12] V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing Data Cubes Efficiently. In Proceedings of the SIGMOD Conference, 1996.
[13] S. Chaudhuri, R. Krishnamurthy, S. Potamianos, and K. Shim. Optimizing Queries with Materialized Views. In Proceedings of the International Conference on Data Engineering, 1995.
[14] J. Widom. Research Problems in Data Warehousing. In Proceedings of the Fourth International CIKM Conference, 1995.
[15] R. G. G. Cattell et al. (Eds.). The Object Database Standard: ODMG 2.0. Morgan Kaufmann, 1997.
[16] E. Thomsen. OLAP Solutions. Wiley, 1997.
[17] R. Winter. Databases: Back in the OLAP Game. Intelligent Enterprise Magazine, 1(4):60–64, 1998.
[18] M.-C. Wu and A. P. Buchmann. Research Issues in Data Warehousing. Submitted for publication.
[19] A. Levy, A. Mendelzon, and Y. Sagiv. Answering Queries Using Views. In Proceedings of PODS, 1995.
[20] P. Seshadri, H. Pirahesh, and T. Leung. Complex Query Decorrelation. In Proceedings of the International Conference on Data Engineering, 1996.
[21] A. Gupta, V. Harinarayan, and D. Quass. Aggregate-Query Processing in Data Warehouse Environments. In Proceedings of VLDB, 1995.