Database Management Systems
Politecnico di Torino – School of Information Engineering
Master of Science in Computer Engineering

Practice 5
SQL Server 2005
Duration: 2 weeks

1. Introduction

The purpose of this practice is to design a data warehouse that addresses the issues described below.
● Use Microsoft SQL Server 2005 and Visual Studio 2005 to create the needed tables and cubes.
● Name the tables, cubes and projects using your own student ID number.
● The SQL Server 2005 hostname is CCLIX3.
● The source database name is Stab1 on CCLIX3.

2. Problem specifications

The management of a chain of Italian seaside resorts wants to analyze the number of rented items in their seaside resorts. Currently, there is a separate operational database for each seaside resort, so the management is not able to compare the number of rented items and the net income across the seaside resorts. Each operational database only stores the information about the number of rented items (beach umbrellas, sun beds, etc.) for one seaside resort and the price paid by clients to rent the items. The operational database of one seaside resort is Stab1 and its schema is reported in Figure 1.

The management of the company needs a data warehouse to analyze the net income and the number of rented items for every seaside resort and compare them.

The management frequently performs the following analyses.
● Monthly net income for each province and region of the seaside resorts.
● Quarterly net income for each province and region of the seaside resorts.
● Yearly net income for each province and region of the seaside resorts.
● Monthly, quarterly, four-monthly and yearly net income for each category of rented items.
● Monthly, quarterly, four-monthly and yearly net income for each category of rented items, detailed by region and province of the seaside resorts.
● Number of rented items per month, quarter, four-month period and year for each category of rented items.
● Number of rented items per month, quarter, four-month period and year for each province and region of the seaside resorts.

The database Stab1, stored on the CCLIX3 server, is the operational database of one seaside resort.

During the practice the following steps should be executed:
● Analyze the schema in order to understand which kind of data is stored in the operational database Stab1 of the seaside resort (the source of the data warehouse that should be created).
● Design a data warehouse to manage the issues described above.
● Write scripts to create the tables of the data warehouse. The name of the data warehouse should consist of the prefix DW followed by the student ID (e.g., student ID: s12345, data warehouse name: DWs12345).

3. Data loading (ETL)

Execute the ETL process to load data from the Stab1 source database into the data warehouse you designed in Point 2. The key steps of the ETL process are the following:
● Creation of a temporary staging table and loading data
o Create a temporary staging table in the data warehouse database.
o Load the data from Stab1 into the staging table using proper aggregations, so that the data of the staging table can then be used to feed the data warehouse dimensions and facts without needing the original Stab1 database.
● Data loading into dimension tables
o Load the data into the dimension tables. The source data are retrieved by querying the temporary staging table defined in the previous step.
● Data loading into fact table
o After the data loading into the dimension tables, load the data into the fact table.

During the practice the following steps should be executed:
● Use the Business Intelligence – Integration Services project of Visual Studio 2005 to design a proper workflow to execute the ETL process.

4. Queries and cubes

Use the data warehouse created and fed in Points 2 and 3 as the source for the creation of the cubes. Some of the frequently executed queries are listed below:
1. Analyze the net income for each pair (region, quarter), the total net income for each region and the total net income for each quarter.
2. Analyze the net income for each pair (item category, month), the net income for each item category (independently of the month) and the net income for each month (for all the item categories).
3. Analyze the total number of rented items for each pair (item category, month). Moreover, analyze the number of rented items for each category (independently of the month) and the number of rented items for each month (for all the item categories).

During the practice the following steps should be executed:
● Use Microsoft Visual Studio 2005 (Analysis Services project) to create the cubes needed to answer the proposed queries efficiently.
● Browse the cube data and structure by means of the graphical interface provided by Microsoft Visual Studio (Analysis Services project).

5. Integration into Excel

During the practice the following steps should be executed:
● Use Microsoft Excel to browse the content of the cubes by means of a pivot table.

Figure 1 – ER Model of the source database (Stab1)
[ER diagram not reproduced. Labels recoverable from the diagram: entities Item, Rental, Client, Type of item; relationships rented, rents, type; attributes ItemKey, Description, date, startHour, endHour, cost, paymentType, ClientKey, firstname, surname, TypeKey, Description, Type, Category; cardinalities (0,N), (1,1), (1,N).]

Figure 2 – Schema of the source database (Stab1)
….
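
The design and ETL steps of Points 2 to 4 can be illustrated with a small sketch. It is not the SQL Server 2005 / Integration Services solution requested by the practice: it uses Python's built-in sqlite3 module only so that it runs anywhere. Every table name, column name and data value (src_rental, staging, dim_time, dim_resort, dim_category, fact_rental, ProvinceX, RegionX) is a hypothetical assumption, and net income is assumed to be the sum of the rental cost. The province and region of the resort are supplied by hand because they are not among the labels recoverable from Figure 1.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.executescript("""
-- Hypothetical stand-in for the Stab1 source: one row per rental.
CREATE TABLE src_rental (rental_date TEXT, category TEXT, cost REAL);
INSERT INTO src_rental VALUES
  ('2005-06-10', 'beach umbrella', 10.0),
  ('2005-06-21', 'beach umbrella', 12.0),
  ('2005-07-02', 'sun bed',         5.0),
  ('2005-07-15', 'beach umbrella', 10.0);

-- Star schema (all names are illustrative assumptions).
CREATE TABLE dim_time (
  time_id INTEGER PRIMARY KEY,
  month INTEGER, quarter INTEGER, four_month INTEGER, year INTEGER,
  UNIQUE (month, year));
CREATE TABLE dim_resort (
  resort_id INTEGER PRIMARY KEY,
  resort TEXT UNIQUE, province TEXT, region TEXT);
CREATE TABLE dim_category (
  category_id INTEGER PRIMARY KEY, category TEXT UNIQUE);
CREATE TABLE fact_rental (
  time_id INTEGER REFERENCES dim_time,
  resort_id INTEGER REFERENCES dim_resort,
  category_id INTEGER REFERENCES dim_category,
  net_income REAL, rented_items INTEGER,
  PRIMARY KEY (time_id, resort_id, category_id));

-- ETL step 1: staging table, pre-aggregated by month and category.
CREATE TABLE staging AS
SELECT 'Stab1' AS resort,
       CAST(strftime('%Y', rental_date) AS INTEGER) AS year,
       CAST(strftime('%m', rental_date) AS INTEGER) AS month,
       category,
       SUM(cost) AS net_income,
       COUNT(*)  AS rented_items
FROM src_rental
GROUP BY year, month, category;

-- ETL step 2: dimensions, fed from the staging table only.
INSERT INTO dim_time (month, quarter, four_month, year)
SELECT DISTINCT month, (month + 2) / 3, (month + 3) / 4, year FROM staging;
INSERT INTO dim_category (category)
SELECT DISTINCT category FROM staging;
-- Province and region are placeholders entered by hand (assumption).
INSERT INTO dim_resort (resort, province, region)
VALUES ('Stab1', 'ProvinceX', 'RegionX');

-- ETL step 3: fact table, loaded after the dimensions.
INSERT INTO fact_rental
SELECT t.time_id, r.resort_id, c.category_id, s.net_income, s.rented_items
FROM staging s
JOIN dim_time t     ON t.month = s.month AND t.year = s.year
JOIN dim_resort r   ON r.resort = s.resort
JOIN dim_category c ON c.category = s.category;
""")


def net_income_by(*cols):
    """Return net income summed over the given dimension columns."""
    group = ", ".join(cols)
    sql = f"""
        SELECT {group}, SUM(f.net_income)
        FROM fact_rental f
        JOIN dim_time t   ON t.time_id = f.time_id
        JOIN dim_resort r ON r.resort_id = f.resort_id
        GROUP BY {group}
        ORDER BY {group}"""
    return cur.execute(sql).fetchall()


# Query 1 of Point 4: each (region, quarter) pair plus the two marginal totals.
by_region_quarter = net_income_by("r.region", "t.quarter")
by_region = net_income_by("r.region")
by_quarter = net_income_by("t.quarter")

print(by_region_quarter)
print(by_region)
print(by_quarter)
```

The three separate GROUP BY queries stand in for what a cube computes in one pass; in SQL Server the same pair-plus-totals result could be requested with GROUP BY ... WITH CUBE. Storing quarter and four_month in the time dimension, rather than deriving them at query time, keeps every time granularity listed in Point 2 available as a plain column.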