Stars, Snowflakes, and Fact Constellations: Schemas for Multidimensional Databases

The E-R data model is commonly used in the design of relational databases, where a database schema consists of a set of entities and the relationships between them. Such a data model is appropriate for OLTP. A data warehouse, however, requires a subject-oriented schema that facilitates OLAP. Three types of schema exist for data warehouses: the star schema, the snowflake schema, and the fact constellation schema.

Star schema

The most common schema is the star schema. In a star schema, the data warehouse contains:
(1) a large central table (the fact table) containing the bulk of the data, with no redundancy;
(2) a set of smaller dimension tables, one for each dimension.

The schema graph looks like a star, with the dimension tables arranged around the central fact table.

An example is a star schema for All Electronics sales. Sales are considered along four dimensions, namely time, item, branch, and location. The schema contains a central fact table for sales that holds keys to each of the four dimensions, along with two measures: dollars_sold and units_sold. Each dimension is represented by only one table, and each table contains a set of attributes.
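The star schema described above can be sketched in a few lines of Python using the standard-library sqlite3 module. This is only an illustrative sketch: the choice of SQLite, the _dim and _fact table-name suffixes, the column types, and every sample row are assumptions made for the example, while the column names follow the sales example.

```python
import sqlite3

# In-memory database. The table layout follows the sales star schema;
# the _dim/_fact suffixes and all rows below are made-up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, day INTEGER,
                       day_of_the_week TEXT, month INTEGER,
                       quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT,
                       brand TEXT, type TEXT, supplier_type TEXT);
CREATE TABLE branch_dim (branch_key INTEGER PRIMARY KEY, branch_name TEXT,
                         branch_type TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, street TEXT,
                           city TEXT, province_or_state TEXT, country TEXT);
-- Central fact table: one key per dimension plus the two measures.
-- The REFERENCES clauses document the links; SQLite does not enforce
-- them unless PRAGMA foreign_keys is switched on.
CREATE TABLE sales_fact (
    time_key     INTEGER REFERENCES time_dim(time_key),
    item_key     INTEGER REFERENCES item_dim(item_key),
    branch_key   INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    dollars_sold REAL,
    units_sold   INTEGER
);
""")

conn.executemany("INSERT INTO time_dim VALUES (?,?,?,?,?,?)",
                 [(1, 5, "Mon", 1, "Q1", 2024), (2, 9, "Tue", 4, "Q2", 2024)])
conn.executemany("INSERT INTO item_dim VALUES (?,?,?,?,?)",
                 [(1, "TV-A", "BrandX", "television", "wholesale"),
                  (2, "Phone-B", "BrandY", "phone", "retail")])
conn.execute("INSERT INTO branch_dim VALUES (1, 'B1', 'store')")
conn.execute("INSERT INTO location_dim VALUES "
             "(1, '1 Main St', 'Vancouver', 'BC', 'Canada')")
conn.executemany("INSERT INTO sales_fact VALUES (?,?,?,?,?,?)",
                 [(1, 1, 1, 1, 1000.0, 2),
                  (1, 2, 1, 1, 500.0, 1),
                  (2, 1, 1, 1, 1500.0, 3)])

# A typical OLAP-style query: each dimension is exactly one join away
# from the fact table, so totals per quarter and item type need two joins.
rows = conn.execute("""
    SELECT t.quarter, i.type, SUM(f.dollars_sold), SUM(f.units_sold)
    FROM sales_fact f
    JOIN time_dim t ON t.time_key = f.time_key
    JOIN item_dim i ON i.item_key = f.item_key
    GROUP BY t.quarter, i.type
    ORDER BY t.quarter, i.type
""").fetchall()

for row in rows:
    print(row)
```

Because every dimension is a single table, a query never needs more than one join per dimension it touches.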
In the figure, the location dimension table contains the attribute set {location_key, street, city, province_or_state, country}.

Example of Star Schema (figure)

Sales fact table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold
Dimension table time: time_key, day, day_of_the_week, month, quarter, year
Dimension table item: item_key, item_name, brand, type, supplier_type
Dimension table branch: branch_key, branch_name, branch_type
Dimension table location: location_key, street, city, state_or_province, country

Snowflake schema

The snowflake schema is a variant of the star schema model in which some dimension tables are normalized, splitting the data into additional tables. The resulting schema graph forms a shape similar to a snowflake. The major difference between the snowflake and star schema models is that the dimension tables of the snowflake model may be kept in normalized form to reduce redundancies.

Such a table is easy to maintain and saves storage space. But this space saving is negligible in comparison to the size of the fact table. The snowflake structure can also reduce the effectiveness of browsing, since more joins are needed to execute a query. As a result, system performance may be adversely impacted. Hence, the snowflake schema is not as popular as the star schema in data warehouse design.

Example of Snowflake schema

The main difference between the two schemas is in the definition of the dimension tables. The single dimension table for item in the star schema is normalized in the snowflake schema, resulting in new item and supplier tables. For example, the item dimension table now contains the attributes item_key, item_name, brand, type, and supplier_key.
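The normalized item dimension, and the extra join it costs, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses Python's sqlite3 module, assumes that supplier_type moves into a separate supplier table reachable through supplier_key, leaves out the other dimensions for brevity, and uses invented sample rows.

```python
import sqlite3

# Snowflaked item dimension: supplier_type no longer lives in the item
# table but in a separate supplier table (assumed layout, made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_key INTEGER PRIMARY KEY, supplier_type TEXT);
CREATE TABLE item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT,
                   type TEXT,
                   supplier_key INTEGER REFERENCES supplier(supplier_key));
-- Fact table reduced to the item key and the measures for this sketch.
CREATE TABLE sales_fact (item_key INTEGER REFERENCES item(item_key),
                         dollars_sold REAL, units_sold INTEGER);
""")

conn.executemany("INSERT INTO supplier VALUES (?,?)",
                 [(1, "wholesale"), (2, "retail")])
conn.executemany("INSERT INTO item VALUES (?,?,?,?,?)",
                 [(1, "TV-A", "BrandX", "television", 1),
                  (2, "Phone-B", "BrandY", "phone", 2),
                  (3, "TV-C", "BrandX", "television", 1)])
conn.executemany("INSERT INTO sales_fact VALUES (?,?,?)",
                 [(1, 1000.0, 2), (2, 500.0, 1), (3, 250.0, 1)])

# Grouping by supplier_type now takes two joins (fact -> item -> supplier);
# in the star schema the same question needs only fact -> item.
rows = conn.execute("""
    SELECT s.supplier_type, SUM(f.dollars_sold)
    FROM sales_fact f
    JOIN item i     ON i.item_key = f.item_key
    JOIN supplier s ON s.supplier_key = i.supplier_key
    GROUP BY s.supplier_type
    ORDER BY s.supplier_type
""").fetchall()

for row in rows:
    print(row)
```

The string "wholesale" is stored once in supplier rather than once per item, which is the redundancy that normalization removes, at the price of the additional join.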
The supplier_key is linked to the supplier dimension table, which contains supplier_key and supplier_type.

Example of Snowflake Schema (figure)

Sales fact table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
time: time_key, day, day_of_the_week, month, quarter, year
item: item_key, item_name, brand, type, supplier_key
supplier: supplier_key, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city_key
city: city_key, city, state_or_province, country

Similarly, the single dimension table for location in the star schema can be normalized into two new tables: location and city. The city_key in the location table links to the city dimension. Notice that further normalization can be performed on province_or_state and country in the snowflake schema, when desirable.

Fact constellation

Sophisticated applications may require multiple fact tables to share dimension tables. This kind of schema can be viewed as a collection of stars, and hence is called a galaxy schema or a fact constellation.

Example of a fact constellation

This schema specifies two fact tables, sales and shipping. The sales table definition is identical to that of the star schema. The shipping table has five dimensions (or keys): item_key, time_key, shipper_key, from_location, and to_location, and two measures: dollars_cost and units_shipped. A fact constellation schema allows dimension tables to be shared between fact tables. For example, the dimension tables for time, item, and location are shared between the sales and shipping fact tables.
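The sharing of dimension tables between the sales and shipping fact tables can be sketched as below, again with Python's sqlite3 module. The table-name suffixes, the trimmed-down dimension columns, the shipper_name attribute (the shipper dimension's contents are not given here), and all sample rows are assumptions made for illustration.

```python
import sqlite3

# Two fact tables (sales, shipping) that share the time, item, and
# location dimensions. Dimension columns are trimmed and rows are made-up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, city TEXT);
-- shipper_name is a hypothetical attribute used only in this sketch.
CREATE TABLE shipper_dim (shipper_key INTEGER PRIMARY KEY, shipper_name TEXT);
CREATE TABLE sales_fact (time_key INTEGER, item_key INTEGER,
                         branch_key INTEGER, location_key INTEGER,
                         dollars_sold REAL, units_sold INTEGER);
CREATE TABLE shipping_fact (item_key INTEGER, time_key INTEGER,
                            shipper_key INTEGER,
                            from_location INTEGER, to_location INTEGER,
                            dollars_cost REAL, units_shipped INTEGER);
""")

conn.execute("INSERT INTO time_dim VALUES (1, 'Q1', 2024)")
conn.executemany("INSERT INTO item_dim VALUES (?,?)",
                 [(1, "TV-A"), (2, "Phone-B")])
conn.executemany("INSERT INTO location_dim VALUES (?,?)",
                 [(1, "Vancouver"), (2, "Toronto")])
conn.execute("INSERT INTO shipper_dim VALUES (1, 'ShipCo')")
conn.executemany("INSERT INTO sales_fact VALUES (?,?,?,?,?,?)",
                 [(1, 1, 1, 1, 1000.0, 2),
                  (1, 1, 1, 2, 500.0, 1),
                  (1, 2, 1, 1, 300.0, 1)])
conn.executemany("INSERT INTO shipping_fact VALUES (?,?,?,?,?,?,?)",
                 [(1, 1, 1, 1, 2, 40.0, 3),
                  (2, 1, 1, 2, 1, 15.0, 1)])

# Compare both subjects through the shared item dimension. Each fact table
# is aggregated on its own first, so joining them cannot multiply rows.
rows = conn.execute("""
    SELECT i.item_name, s.units_sold, sh.units_shipped
    FROM item_dim i
    JOIN (SELECT item_key, SUM(units_sold) AS units_sold
          FROM sales_fact GROUP BY item_key) s ON s.item_key = i.item_key
    JOIN (SELECT item_key, SUM(units_shipped) AS units_shipped
          FROM shipping_fact GROUP BY item_key) sh ON sh.item_key = i.item_key
    ORDER BY i.item_name
""").fetchall()

# The shipping fact table uses the shared location dimension twice,
# once for from_location and once for to_location.
routes = conn.execute("""
    SELECT f.city, t.city, sh.units_shipped
    FROM shipping_fact sh
    JOIN location_dim f ON f.location_key = sh.from_location
    JOIN location_dim t ON t.location_key = sh.to_location
    ORDER BY sh.item_key
""").fetchall()

print(rows)
print(routes)
```

Aggregating each fact table before joining is a design choice of this sketch: joining the two fact tables row by row on item_key would repeat sales rows once per matching shipment and inflate the sums.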
Diagram showing the fact constellation schema (figure)

Distinction between Data Warehouse and Data Mart

A data warehouse collects information about subjects that span the entire organization, such as customers, items, and sales; its scope is enterprise-wide. In data warehouses, the fact constellation schema is commonly used, since it can model multiple, interrelated subjects.

A data mart is a department subset of the data warehouse that focuses on selected subjects, and its scope is department-wide. In data marts, the star or snowflake schema is commonly used, since both are geared toward modeling single subjects.