DATA WAREHOUSING AND DATA MINING APPLICATIONS FOR ATMOSPHERIC STUDIES

1 VENKATA SHESHANNA KONGARA, 2 D. PUNYASESUDU

1 Research Scholar, Department of Computer Science & Technology, Sri Krishnadevaraya University, Anantapur, Andhra Pradesh, India
2 Professor, Department of Physics, Rayalaseema University, Kurnool, Andhra Pradesh, India

Abstract: Meteorology is an important area of practice and research in atmospheric science that focuses on weather conditions. In the current global scientific environment, atmospheric data is one of the most valuable assets for scientists and researchers, who use it to uncover hidden patterns for analytical reporting and atmospheric forecasting. Information systems offer exceptional opportunities to analyze these datasets and extract useful information that indicates future directions in the atmospheric disciplines. Data warehousing and data mining applications are emerging technologies that allow information to be accessed easily and efficiently while building and deploying data-driven analytics that support sound decision making. However, little work has been done to relate meteorology to data warehousing and data mining, either in general or in practice. This paper therefore presents the importance of various data warehousing and data mining applications, including a star schema based data model and an efficient architectural framework and methodology for processing atmospheric data. It also discusses approaches to analyzing, integrating and managing large volumes of atmospheric data with query and information analysis techniques for effective scientific decision support and predictive analysis.
The proposal aims to provide reliable solutions and to improve productivity and decision-making efficiency in the meteorological domain.

Keywords: Atmospheric data, Data warehousing, Data mining, Decision supporting, Hidden patterns, Information systems, Predictive analysis

Proceedings of 5th IACEECE-2013, 22nd September 2013, Hyderabad, India. ISBN: 978-93-82702-30-6

I. INTRODUCTION

Meteorology is an essential area of practice and research concerning the atmosphere, from the earth's surface up to the higher levels toward space. However, the approach to studying it has changed as researchers look for facts and trends that improve the scope of forecasting and evaluate the effects of dynamically changing atmospheric conditions. Different statistical and scientific methods exist to process meteorological datasets and measure correlations within them. However, researchers face technical challenges in storing, retrieving, managing and exploring these structured and unstructured data, which are very large. As a result, data maintenance and conversion are time consuming, expensive and complex, which hampers obtaining accurate results. Information systems offer a number of exceptional opportunities to analyze these datasets and extract useful information that indicates future directions in the atmospheric disciplines. Data warehousing and data mining applications are emerging technologies with powerful data management concepts. They allow information to be accessed easily and efficiently while building and deploying data-driven analytics that support sound decision making.

The data warehouse was presented by W. H. Inmon [1] in the book “Building the Data Warehouse” (1996). He gave the first definition of a data warehouse as “A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process”. Initially, data warehouses were used in commercial business to support managers' decisions. In recent years they have been progressively adopted in wider fields, including many scientific fields. A data warehouse is a massive storage area that serves as a centralized repository of all the data collected from the various departments or processing units of a large organization, managed systematically to yield meaningful information and analysis for effective decision support. The terms of the definition are illustrated as follows:

Subject-oriented: all relevant datasets are organized into specific subject areas with summarized information.
Integrated: datasets come from heterogeneous sources and have to be integrated and made consistent with globally accepted standards and measurements.
Time-variant: stored data may not be current but represents a long time window, such as 5 to 10 years.
Non-volatile: stored data is never overwritten or deleted; once committed, the data is static and read-only, and is retained for future analysis.

In order to understand data warehouses, it is important to learn some essential theories, concepts, domains, techniques and models for data management and analysis. Many data warehousing approaches and research and development activities have been initiated over the past few decades in various fields.
However, given the many innovations in hardware and software, there is a need to acknowledge the new data challenges arising from innovative technology trends and the value of real-time insight and analysis. Hence Inmon [1], regarded as the father of data warehousing, revisited the existing data warehousing framework and functionality and upgraded it as DW 2.0 [2]. Advances in architecture, technology and information systems have been carried into this next generation of data warehousing, which includes many features and functions that were previously not recognized as part of a data warehouse. With its many integrated features supporting present technology trends, it will be strategic for most organizations.

In addition, Dan Linstedt [3] developed a new data model, the Data Vault™, described as a patent-pending technique and the next evolution in data modeling for enterprise data warehousing. The Data Vault is a detail-oriented, historical-tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise. It is a data model architected specifically to meet the needs of today’s enterprise data warehouses. The Data Vault model follows all the characteristics defined by Bill Inmon [1] except the subject-oriented feature of the data warehouse definition: in the Data Vault model, subject-oriented structures are replaced by functionally based structures. The business keys are horizontal in nature and provide visibility across lines of business.
Bill Inmon [1] also acknowledged the Data Vault as the optimal choice for modeling the enterprise data warehouse in the DW 2.0 proposal [2] [3].

The rest of the paper is organized as follows. Section II reviews related work for atmospheric studies. Section III describes data warehousing applications for atmospheric data. Section IV presents data mining applications for atmospheric data. Section V proposes an architectural framework and methodology for atmospheric data, and is followed by the conclusion.

II. RELATED WORK

Jayanthi T et al [4] presented methods for developing large scientific and technical databases (i.e. data warehouses), stressing the importance of creating and reconciling data objects and establishing metadata standards. Defining metadata semantics, creating discipline-specific data dictionaries, information models for organizing metadata, and data models for describing dataset structure also need to be worked out. They discussed the main data centres available at IGCAR, the design and development of a scientific and technical data warehouse, its importance, and visualization techniques leading to knowledge discovery.

Ramon Lawrence [5] proposed an architecture for archiving and analyzing real-time scientific data that isolates researchers from the complexities of data storage and retrieval. Data access transparency is achieved by using a database to store metadata on the raw data, and retrieving data subsets of interest using SQL queries on the metadata.

Wang Zhijuan et al [6] presented a UML-based data warehouse design method that spans the three design phases (conceptual, logical and physical). Their method comprises a set of meta models used at each phase, as well as a set of transformations that can be semi-automated.

Ruilian Hou [7] introduced the concept and development of the data warehouse and the database, examined the relationship between them, and studied the differences between their technologies. The work also discusses the combination and application of database and data warehouse technology.

Gerasimos Marketos et al [8] discussed the architecture of a so-called seismic data management and mining system (SDMMS) for quick and easy data collection, processing and visualization. The SDMMS architecture includes, among others, a seismological database for efficient and effective querying and a seismological data warehouse for OLAP analysis and data mining.

Xiaoguang Tan [9] described the data warehouse as a new kind of Artificial Intelligence (AI) system that combines database and meteorological graphics technology. It helps forecasters accumulate, manage and use their knowledge in operational forecasting, and is a new generation of decision support system (DSS). A data warehouse will not become the whole of a forecaster’s workbench, because the operational forecasting mission is very complex, but it is a system that helps forecasters accumulate, manage and use their knowledge.

Aditya Kumar Gupta et al [10] proposed a multidimensional data warehouse for agriculture that provides solutions for farmers and responds to their ad-hoc queries. The multidimensional schema builds on the star schema and snowflake schema that are commonly used to design data warehouses. Normalization is also applied when storing the data in the star schema and duplicate values are removed, so that space and time complexity are minimized.

Keshav Dev Gupta et al [11] discussed that special attention should be paid at the dimensional modeling phase in order to reflect the user’s requirements accurately in an error-free, easy to understand and easily extendable data warehouse schema.
They also present a set of user modeling requirements.

Vuda Sreenivasarao et al [12] gave an overview of scientific data warehouse and OLAP technologies, with an emphasis on their data warehousing requirements. The methods used include the efficient computation of data cubes by integrating MOLAP and ROLAP techniques, the integration of data cube methods with dimension relevance analysis and data dispersion analysis for concept description, and data cube based multi-level association, classification, prediction and clustering techniques.

Folorunsho Olaiya et al [13] investigated the use of data mining techniques in forecasting maximum temperature, rainfall, evaporation and wind speed. This was carried out using Artificial Neural Network and Decision Tree algorithms and meteorological data. A data model for the meteorological data was developed and used to train the classifier algorithms. The performance of these algorithms was compared using standard performance metrics, and the algorithm that gave the best results was used to generate classification rules for the mean weather variables. A predictive Neural Network model was also developed for the weather prediction program and its results were compared with actual weather data for the predicted periods.

Meghali A. Kalyankar et al [14] discussed a data mining process for studying weather data using techniques such as clustering. With this technique one can acquire weather data and find the hidden patterns inside a large dataset, so as to turn the retrieved information into usable knowledge for classification and prediction of climatic conditions.

Gaurav J. Sawale et al [15] discussed how to use data mining techniques to analyze meteorological data such as weather data. A variety of data mining tools and techniques are available in the industry, but they have been used in a very limited way for meteorological data.
They also presented a neural network based algorithm for predicting atmospheric conditions for a given location and future time.

III. DATA WAREHOUSING APPLICATIONS FOR ATMOSPHERIC DATA

3.1. Advances in Relational databases and ER Modeling

Entity-Relationship (ER) modeling is one of the best data modeling techniques. It represents the relationships between the atmospheric entities and the objects under any type of processing rules, so that data management is organized efficiently, as shown in Figure 1. However, because of limitations in data integrity constraints and moderate schema design approaches for large historical data, practice has moved toward dimensional modeling. There is still room to connect relational databases directly to currently available business intelligence applications, querying them and expanding knowledge for effective decision support.

Figure 1: Relational data model used for atmospheric data in business intelligence reporting and data mining.

3.2. Dimensional Modeling with Star schema

Dimensional modeling (DM) differs from entity-relationship (ER) modeling. Its schema is defined with a set of methods and concepts used in data warehouse design: each entity provides context as a dimension table and qualifies a measurable quantity held in a fact table. Each fact table sits at the centre of the schema, surrounded by multiple dimension tables like a star, as shown in Figure 2, for rapid query processing. According to data warehousing expert Ralph Kimball [16], dimensional modeling does not necessarily involve a relational database. The same modeling approach, at the logical level, can be used for any physical form, such as a multidimensional database or even flat files.

Figure 2: Dimensional modeling with star schema.

Ms. Alpa R. Patel et al [17] discussed that conceptual Entity-Relationship (ER) modeling is extensively used for database design in relational database environments, which emphasize day-to-day operations. Multidimensional (MD) data modeling, on the other hand, is crucial in data warehouse design, which targets managerial decision support. It supports decision making by allowing users to drill down for more detailed information, roll up to view summarized information, slice and dice a dimension to select a specific item of interest, and pivot to re-orient the view of MD data. They also explored how the multidimensional model can be used as the yardstick of data warehouse design instead of the ER model.

3.3. Data Extraction, Transformation and Loading (ETL)

Extraction, Transformation and Loading (ETL) systems are the backbone of data warehouse development. The ETL prototype controls data availability in three phases, as shown in Figure 3. In the extraction phase, data is drawn from sources such as text files, spreadsheets and legacy systems holding day-to-day atmospheric operational data. In the transformation phase, the data is brought into a standard structure with consistent values. In the loading phase, the semi-transformed and fully transformed dimension data and its metadata are loaded.

Figure 3: Extraction, Transformation and Loading system prototype.

Shaker H. Ali El-Sappagh et al [18] discussed that Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, its cleansing, customization, reformatting, integration, and insertion into a data warehouse. Building the ETL process is potentially one of the biggest tasks of building a warehouse; it is complex and time consuming, and consumes most of the data warehouse project’s implementation effort, cost and resources. Building a data warehouse requires focusing closely on three main areas: the source area, the destination area, and the mapping area (ETL processes). The source area has standard models such as the entity-relationship diagram, and the destination area has standard models such as the star schema, but the mapping area has no standard model. Hence they discussed a model for the conceptual design of ETL processes and proposed a method, with some enhancements over existing ones, to support the missing mappings of the ETL system.

3.4. Data Organizing with Data Marts

A data warehouse holds a large volume of enterprise data, whereas a data mart is a subject-oriented functional data storage area holding the data of a specific group or department, drawn from the data warehouse as shown in Figure 4. Usually the data is extracted from the data warehouse and organized in data marts, with de-normalization and indexes applied to support end-user analysis.

Figure 4: Data Mart with subject area

Paulraj M et al [19] discussed that data warehousing collects data at different levels (i.e., departmental, operational, functional) and stores it as a collective data repository with better storage efficiency. Various data warehousing models concentrate on storing the data more efficiently and quickly. In addition, accessing data from the warehouse requires a better understanding of the structure in which the data layers are stored in the repository. However, the functional requirements of users are not easily captured by the data warehouse model, so an efficient decision support system is needed to extract the data users demand from the data warehouse. They built a Functional Layer Interfaced Data Mart Architecture (FLIDMA) to provide a better decision support system for larger corporate and enterprise data applications.

3.5. Online analytical processing (OLAP) and Multidimensional Analysis

Online analytical processing and multidimensional analysis techniques are the key data processing and presentation techniques in the business intelligence application layer. They cover a broad category of applications and methods, such as drill-down, roll-up, drill-through, and slice and dice analysis, for gathering, sorting, analyzing and presenting data to help end users make better business decisions. As shown in Figure 5, business intelligence applications include the activities of business value drivers, decision support systems, query and reporting, online analytical processing, statistical analysis and forecasting.

Figure 5: Data warehousing and business intelligence OLAP Cube.

IV. DATA MINING APPLICATIONS FOR ATMOSPHERIC DATA

Data mining is the process of extracting knowledge from various atmospheric datasets, finding unknown patterns or rules for information analysis in order to predict future trends and behaviors for effective decision support. In the current global scientific environment, data mining has become a powerful information technology tool for uncovering hidden patterns for atmospheric forecasting. J. Han and M. Kamber [20] stated that data mining is a multidisciplinary field whose techniques draw on work from areas including database technology, machine learning, statistics, pattern recognition, information retrieval, neural networks, knowledge-based systems, artificial intelligence, high-performance computing, and data visualization for knowledge discovery. Various data mining techniques such as association rules, classification, regression, clustering, outlier analysis and neural network based applications are broadly used in atmospheric studies.

4.1. Association Rules

Association rule mining finds relationships between data items in large datasets; its goal is to find interesting patterns in various fields. The two measures of interestingness used most often in association rule mining are support and confidence. The support of a rule is the percentage of transactions that the rule satisfies: it is the probability that both X and Y, where X is the antecedent and Y is the consequent of the rule, are contained in a transaction in the data being mined. The confidence of a rule is the conditional probability that a transaction containing X also contains Y. Association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold. The Apriori algorithm is the one most often used to find frequent patterns. Many types of association rules are available, including quantitative, multidimensional, multilevel and Boolean association rules.

Nilam K. Nakod et al [21] presented an overall survey of mining multidimensional association rules as well as conditional hybrid-dimensional association rules. Their comparative study found the Boolean matrix based approach best suited for mining both multidimensional and conditional hybrid-dimensional association rules. A Boolean matrix based approach is used to find the frequent itemsets, and the items forming a rule come from different dimensions.

4.2. Classification and Decision Trees based analysis

Classification is a supervised machine learning technique in which a model is built and then applied to unseen data to predict class labels. Building accurate and efficient classifiers for large databases is one of the central tasks of data mining and machine learning research. Many classification techniques are available, including Decision Trees, Naive Bayesian methods, Neural Networks, Logistic Regression, Support Vector Machines (SVM) and k-nearest neighbours (KNN). Decision trees are widely used for classification because they are easy to use in practice.

Sharma, N et al [22] discussed that tropospheric temperature measurements at high temporal, spatial, and vertical resolutions are required for many meteorological studies. Radiosonde and Global Positioning System radio occultation (GPSRO) observations have very high vertical resolution but poor spatial and temporal coverage. Although the sounders on geostationary satellites can provide high temporal and spatial resolution, their vertical resolution is poor. Hence they proposed a method, based on an artificial neural network (ANN), to increase the vertical resolution of tropospheric temperature profiles obtained from geostationary satellite observations, so that high-resolution temperature profiles are available in all four dimensions.

4.3. Cluster based analysis

Clustering is an unsupervised machine learning technique in which class labels are not known in advance. It divides the data into clusters such that items within a cluster are similar to each other and dissimilar to items in other clusters. Many clustering techniques are available, including partitioning, hierarchical and density based clustering. Among these, the k-means algorithm is the most widely used across applications. A. Santhi Latha et al [23] discussed how cluster analysis can be helpful for mining spatial data. Cluster analysis divides data into meaningful or useful groups (clusters). If meaningful clusters are the goal, then the resulting clusters should capture the “natural” structure of the data.
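As a minimal sketch of the k-means partitioning mentioned in Section 4.3, the following Python example groups a few (air temperature, relative humidity) readings into two clusters. The readings, the function name `kmeans` and the fixed starting centroids are illustrative assumptions and are not data or code from this paper.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on tuples of floats.

    `centroids` holds the starting centres. Passing them in explicitly keeps
    this sketch deterministic; real applications usually pick them at random.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the centre of an empty cluster
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical readings: (air temperature in deg C, relative humidity in %).
readings = [(30.0, 20.0), (31.0, 22.0), (29.0, 18.0),
            (12.0, 85.0), (11.0, 90.0), (13.0, 80.0)]
labels, centres = kmeans(readings, centroids=[readings[0], readings[3]])
print(labels)   # cluster index assigned to each reading
print(centres)  # final cluster centres
```

In this toy input the warm, dry readings and the cool, humid readings fall into separate clusters. With real atmospheric parameters measured on different scales, the features would normally be standardized before distances are computed.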
V. ARCHITECTURAL FRAMEWORK AND METHODOLOGY FOR ATMOSPHERIC DATA

5.1. Architectural framework

As acknowledged in the Introduction, data warehousing and data mining applications are well suited to atmospheric studies. However, it is essential to improve the architectural framework and methodology to keep pace with fast-growing technology trends and data processing needs. The goal of the framework is to simplify the design, implementation and management of data for efficient solutions. In the survey conducted for this paper, different authors discussed approaches to framework design. M. Laxmaiah et al [24] discussed a conceptual metadata framework for a spatial data warehouse. Ginjala Srikanth Reddy et al [25] discussed the importance of data warehousing and data mining concepts and suggested an ad-hoc, association rule based data mining framework that is tightly integrated with data warehousing technology. They stated that their framework has several advantages over the use of separate data mining tools. Nenad Jukic et al [26] addressed the failure of data warehousing projects due to an inadequate requirements collection and definition process. They described a framework that can help accomplish the objective of developing a business-driven, actionable set of requirements. The framework consists of a series of steps that facilitate the collection and definition of requirements in data warehousing projects.

The comprehensive survey of frameworks identified a set of limitations in practices and implementation factors in data warehousing and data mining application design. On this basis, a new prototype framework is proposed, as shown in Figure 6. It gives the developers and other resources involved in the project a reconciled overall view. It also helps reduce costs and makes it easier to see how changes affect the whole implementation.

Figure 6: Framework prototype for data warehousing and data mining applications development.

Architecture: The data warehouse architecture typically consists of several components that consolidate raw data from several scientific operational and legacy systems to support a variety of data presentation and data mining analytical tools in the front end; the data collection and pre-processing components in the back end are considered the data warehouse.

Figure 7: Data warehousing and mining architecture.

5.2. Atmospheric dimensional model

Dimensional modeling plays an important role in data warehouse design. As an implementation case study, a large volume of meteorological radiosonde data covering 15 years was collected from the British Atmospheric Data Centre (BADC), used for water vapour studies, and built into a dimensional model. The radiosonde data contain station details, air pressure, air temperature, dew point temperature, wind speed, wind direction and various other parameters. These were processed, together with calculated measures, into a star schema based dimensional model as shown in Figure 8.

Figure 8: Star Schema based dimensional model for atmospheric data.

5.3. Atmospheric data warehouse building methodology

Data warehouse development is complex, and the end result is expensive and time consuming to achieve. Several methodologies are currently available in the data warehousing market; Saroop, S et al [27] conducted a comparative study of the different approaches used in data warehouse design.
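To make the star schema based dimensional model of Section 5.2 more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`dim_station`, `dim_time`, `fact_sounding`) and the sample rows are hypothetical, since the paper does not list the actual schema of Figure 8. The sketch only shows how a central fact table of measures joins to its surrounding dimension tables for a roll-up query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two dimension tables surround one fact table (hypothetical names).
conn.executescript("""
CREATE TABLE dim_station (
    station_key  INTEGER PRIMARY KEY,
    station_name TEXT
);
CREATE TABLE dim_time (
    time_key INTEGER PRIMARY KEY,
    year     INTEGER,
    month    INTEGER
);
CREATE TABLE fact_sounding (
    station_key       INTEGER REFERENCES dim_station(station_key),
    time_key          INTEGER REFERENCES dim_time(time_key),
    air_pressure_hpa  REAL,
    air_temperature_c REAL,
    dew_point_c       REAL
);
""")

# Made-up sample rows, for illustration only.
conn.executemany("INSERT INTO dim_station VALUES (?, ?)",
                 [(1, "Station A"), (2, "Station B")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(1, 2012, 1), (2, 2012, 2)])
conn.executemany("INSERT INTO fact_sounding VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1000.0, 20.0, 15.0),
    (1, 1, 850.0, 10.0, 5.0),
    (1, 2, 1000.0, 22.0, 16.0),
    (2, 1, 1000.0, 5.0, 1.0),
])

# Roll-up: average air temperature per station and month.
rows = conn.execute("""
SELECT s.station_name, t.year, t.month, AVG(f.air_temperature_c)
FROM fact_sounding AS f
JOIN dim_station AS s ON s.station_key = f.station_key
JOIN dim_time    AS t ON t.time_key = f.time_key
GROUP BY s.station_name, t.year, t.month
ORDER BY s.station_name, t.year, t.month
""").fetchall()

for row in rows:
    print(row)
```

Dropping `t.month` from the `SELECT`, `GROUP BY` and `ORDER BY` clauses would roll the same measure up to yearly averages, which is the kind of summarization the dimension tables are meant to support.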
Different authors have proposed different techniques at different levels. However, data processing approaches differ between data warehousing and data mining frameworks. This paper therefore proposes the integrated IDCARD methodology for end-to-end data warehousing and data mining application development, in line with the prototype framework shown in Figure 6. The comprehensive IDCARD methodology covers the entire process of meeting end users' data warehousing and data mining requirements. It consists of six phases, Initiate (I), Design (D), Construct (C), Arrange (A), Review (R) and Decision (D), executed as a sequential step-by-step process with a set of standards and guidelines. The proposal establishes a link between the methodology and the requirement domain to improve the effectiveness of project implementation for effective decision support.

Initiate (I) Phase: The main purpose of the Initiate phase is to identify the scope, goals, objectives and technical requirements of the DW and DM applications. The key activities in this phase are listed below, and the process flow is shown in Figure 9.

Describe the project scope with an understanding of the requirements domain
Initiate meetings and requirements gathering
Consolidate any gaps between the requirements and the technical implications
Initiate the architecture and infrastructure plans and software selection
Outline the project risks for the ETL, reports and data mining process
Define the quality assurance and test planning

Figure 9: Initiate process flow diagram.

Design (D) Phase: The main purpose of the Design phase includes logical design and building the architectural components of a project with high-level and low-level design specifications. The key activities in this phase are listed below, and the process flow is shown in Figure 10.

Logical design/physical design for the system
Data model design for the entire system
ETL mappings design with data assessment
OLAP design with presentation engine
Design of the DW/DM architecture
Data visualizations design
DWH and DM process design
Integration testing design
Metadata repository design for DWH and DM tools usage

Figure 10: Design process flow diagram.

Construct (C) Phase: The main purpose of the Construct phase is to construct the entire system using the integration services, complete the technical documentation, and execute the test cases. The key activities in this phase are listed below, and the process flow is shown in Figure 11.

Construct the integration systems
Develop the ETL programs
Build the DW/Data Mart and BI/DM applications or reports
Develop ETL/BI/DM unit tests
Build the metadata repository
Construct the User Acceptance Test (UAT) cases

Figure 11: Construct process flow diagram

Arrange (A) and Review (R) Phase: The purpose of the Arrange and Review phase is to ensure that the system integration meets the requirements documented in the Initiate phase. The system is then reviewed against quality standards through the quality assurance plan and user acceptance testing, and is then deployed into production. The key activities in this phase are listed below, and the process flow is shown in Figure 12.

Arrange the system integration tests and review
User acceptance test and review
Performance tuning and capacity planning and review
Production rollout and data review for data mining

Figure 12: Arrange (A) and Review (R) process flow diagram.

Decision (D) Phase: The purpose of the Decision phase is to assess the data for data mining and to identify the practically applicable algorithms for efficient training and result interpretation. The key activities in this phase are listed below, and the process flow is shown in Figure 13.

Data reconciliation for data mining
Data sampling analysis
Data mining model prototype/algorithm evaluation
Information analysis
Knowledge deployment
Decision implementation

Figure 13: Decision implementation process flow diagram.

CONCLUSION

Data warehousing and data mining applications are emerging technologies that allow information to be accessed easily and efficiently while building and deploying data-driven analytics that support sound decision making. Various data warehousing and data mining techniques have therefore been discussed. In addition, a comprehensive survey of data warehousing and data mining frameworks identified a set of limitations in practices and implementation factors in data warehousing and data mining application design. On this basis, a new prototype framework has been proposed for data warehousing and data mining application design for large volumes of atmospheric data. A further survey of the use of data mining techniques in the field of atmospheric studies found neural networks to be the most reliable approach for classifying atmospheric data and for predictive analysis.

REFERENCES

[1] Inmon, W., "Building the Data Warehouse", John Wiley and Sons, New York, 1996.
[2] Inmon, W. H., "DW 2.0 Architecture For The Next Generation of Data Warehousing", Morgan Kaufmann, 2008.
[3] Dan Linstedt, "Data Vault Series 1 - Data Vault Overview", Data Vault Series, The Data Administration Newsletter, Retrieved 12 September 2011.
[4] Jayanthi T, Ananthanarayanan S. and Rajeswari S, "Scientific Data Warehouse and Visualization Techniques", Computer Division, IGCAR, Kalpakkam.
[5] Ramon Lawrence, "An Architecture for Real-Time Warehousing of Scientific Data", Department of Computer Science, University, USA, http://www.unidata.ucar.edu/projects/idd/overview/idd.html
[6] Wang Zhejiang, Wei Hongchang and Wu Xuefang, "A Data Warehouse Design Method", International Conference on Computer Science & Service System (CSSS), 2012, Page(s): 2063-2066
[7] Ruilian Hou, "Analysis and research on the difference between data warehouse and database", International Conference on Computer Science and Network Technology (ICCSNT), 2011, Volume: 4, Page(s): 2636-2639
[8] Gerasimos Marketos, Yannis Theodoridis and Ioannis S. Kalogeras, "Seismological Data Warehousing and Mining", University of Piraeus, Greece
[9] Xiaoguang Tan, "Data Ware housing and its potential using in weather forecast", Institute of Urban Meteorology, CMA, Beijing, China
[10] Aditya Kumar Gupta, Bireshwar Dass Mazumdar, "Multidimensional Schema for Agriculture Data Warehouse", (IJRET), 2013, Volume: 2, Page(s): 245-253
[11] Keshav Dev Gupta, Jyoti Gupta and Prakati Prasoon, "Novel Architecture with Dimensional Approach of Data Warehouse", (IJARCSSE), 2013, Volume: 3, Page(s): 301-303
[12] Vuda Sreenivasarao, Venata Subbareddy Pallamreddy, "Advanced Data Warehousing Techniques for Analysis, Interpretation and Decision Support for Scientific Data", (CCIS 198), 2011, Page(s): 162-174
[13] Folorunsho Olaiya, Adesesan Barnabas Adeyemo, "Application of Data Mining Techniques in Weather Prediction and Climate Change Studies", (MECS), 2012, Page(s): 51-59
[14] Meghali A. Kalyankar, Prof. S. J. Alaspurkar, "Data Mining Technique to Analyse the Metrological Data", (IJARCSSE), 2013, Volume: 3, Page(s): 114-118
[15] Gaurav J. Sawale, Dr. Sunil R. Gupta, "Use of Artificial Neural Network in Data Mining For Weather Forecasting", (IJCSA), 2013, Volume: 6, Page(s): 383-387
[16] Ralph Kimball, Margy Ross, Warren Thornthwaite and Joy Mundy, "The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses", Second ed., Wiley, January 10, 2008. ISBN 978-0-470-14977-5.
[17] Ms. Alpa R. Patel and Prof. (Dr.) Jayesh M. Patel, "Data Modeling Techniques for Data Warehouse", International Journal of Multidisciplinary Research, 2012, Vol. 2, Issue 2, Page(s): 240-246
[18] Shaker H. Ali El-Sappagh, Abdeltawab M. Ahmed Hendawi and Ali Hamed El Bastawissy, "A proposed model for data warehouse ETL processes", Journal of King Saud University – Computer and Information Sciences, 2011, Volume: 23, Page(s): 91-104
[19] Paulraj M and Sivaprakasam P, "Functional Behavior Pattern for DataMart based on Attribute Relativity", (IJCSI), 2012, Vol. 9, Issue 4, Page(s): 278-283
[20] J. Han and M. Kamber, "Data Mining: Concepts and Techniques" (The Morgan Kaufmann Series in Data Management Systems), 2nd ed., San Mateo, CA: Morgan Kaufmann, 2006.
[21] Nilam K. Nakod, M. B. Vaidya, "Survey on Multidimensional and Conditional Hybrid Dimensional Association Rule Mining", (IJESE), 2013, Volume-1, Issue-4, Page(s): 63-66
[22] Sharma, N.; Ali, M. M., "A Neural Network Approach to Improve the Vertical Resolution of Atmospheric Temperature Profiles From Geostationary Satellites", IEEE Geoscience and Remote Sensing Letters, vol. 10, no. 1, pp. 34-37, Jan. 2013, doi: 10.1109/LGRS.2012.2191763
[23] A. Santhi Latha, J. Swapna Priya, Sk. Abdul Kareem and M. Pavani Devi, "Spatial Data Mining Through Cluster Analysis", (IJECCE), 2012, Volume: 3, Issue No: 2, Page(s): 372-375
[24] M. Laxmaiah and A. Govardhan, "A Conceptual Metadata Framework for Spatial Data Warehouse", International Journal of Data Mining & Knowledge Management Process (IJDKP), 2013, Vol. 3, No. 3, Page(s): 63-73
[25] Ginjala Srikanth Reddy, Khasim Pasha Sd and Sadalaxmi Morthala, "Ad-hoc Data-Mining Framework for Data Warehouse Technology", International Journal of Advanced Technology & Engineering Research (IJATER), 2012, Volume 2, Issue 4, Page(s): 58-67
[26] Nenad Jukic and John Nicholas, "A Framework for Collecting and Defining Requirements for Data Warehousing Projects", Journal of Computing and Information Technology - CIT, 2010, Volume: 18, Issue: 4, Page(s): 377-384
[27] Saroop, S.; Kumar, M., "Comparison of Data Warehouse Design Approaches from User Requirement to Conceptual Model: A Survey", 2011 International Conference on Communication Systems and Network Technologies (CSNT), pp. 308-312, 3-5 June 2011, doi: 10.1109/CSNT.2011.161