DATA WAREHOUSE

1. What is Universe? What are the steps involved in creating a universe?

A universe is a semantic layer between the database and the user interface. It is a mapping of the data structures found in databases: tables, columns, joins, etc. A universe, which is made up of classes, objects, and conditions, can represent any specific application, system, or group of users.

Steps involved in creating a universe:
- Identify the tables
- Understand the requirement
- Insert the tables
- Create objects
- Create joins
- Create calculated measures
- Check universe integrity

2. What is SCD? What are the different types of SCD?

SCD stands for slowly changing dimensions. Slowly changing dimensions are of three types:

SCD1: Maintains only updated values. Ex: when a customer's address is modified, we update the existing record with the new address.

SCD2: Maintains historical information and current information by using:
- Effective dates
- Versions
- Flags
- Or a combination of these

SCD3: Maintains historical information and current information by adding new columns to the target table.

3. What is Normalization? What is Denormalization?

Normalization: The process of organizing data into tables so that redundancy is minimized and each fact is stored in one place. This process results in deep insights into the information used in the business and an understanding of how the various elements of information are related to each other.

Denormalization: The reverse process of normalization, in which you are deliberately not strict about being normalized. A typical scenario is data warehousing where, in order to generate reports and run other SELECT queries, we may relax the table structure by including redundant columns in a table so that big, complex queries are avoided.

4. What is normal form? What are the different types of normal forms?

A normal form is a level of normalization that a table satisfies. Normalization is the process of organizing data so that redundancy is minimized; it results in deep insights into the information used in the business and an understanding of how the various elements of information are related to each other.
The three most commonly used normal forms are:

First Normal Form: The first step is to put the data into first normal form. This can be done by moving the data into separate tables where the data is of a similar type in each table. Each table is given a primary key. This eliminates repeating groups of data.

Second Normal Form: This step takes out data that depends on only a part of the key.

Third Normal Form: This step gets rid of anything that does not depend entirely on the primary key, that is, columns that depend on other non-key columns.

The normal forms are cumulative: a table in third normal form also satisfies first and second normal form.

5. What is a loop?

A loop is a situation that occurs when more than one path exists from one table to another. Loops result in ambiguity in the design of a universe. Designer enables you to identify loops in one of two ways:
1. You can run the Check Integrity function, which indicates the existence of any loops.
2. You can select the Detect Loops command from the Tools menu.
If there are loops, the Loop Detection viewer appears; it indicates the joins causing a loop.

6. What is Context?

A context is a method by which Designer can decide which path to choose when more than one path is possible from one table to another in the universe. It helps in resolving the loops created by various joins between the universe tables. You can create contexts manually, or have Designer detect them.

7. What is an alias?

An alias is a logical pointer to an alternate table name. The purpose of an alias is to resolve loops in the paths of joins. In some cases, more than one alias may be necessary for a given table.

8. What is Fan Trap?

A fan trap occurs when a one-to-many join links a table that is in turn linked by another one-to-many join. This fanning out of one-to-many joins can lead to a join path problem called a fan trap. The fanning-out effect of the one-to-many joins can cause incorrect results to be returned when a query includes objects based on both tables.
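The inflated result that a fan trap produces can be reproduced outside Designer. The following is a minimal sketch using Python's built-in sqlite3 module; the tables customer, orders, and order_line, their columns, and the helper function names are hypothetical and chosen only for illustration. One customer has one order worth 100, and that order has two order lines.

```python
import sqlite3

# Hypothetical schema: customer 1-to-many orders 1-to-many order_line.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, order_total INTEGER);
CREATE TABLE order_line (line_id INTEGER PRIMARY KEY, order_id INTEGER, qty INTEGER);
INSERT INTO customer VALUES (1, 'Acme');
INSERT INTO orders VALUES (10, 1, 100);
INSERT INTO order_line VALUES (100, 10, 2), (101, 10, 3);
""")


def fanned_totals(conn):
    """Aggregate measures from both 'many' tables in a single query (fan trap)."""
    return conn.execute("""
        SELECT c.name, SUM(o.order_total), SUM(l.qty)
        FROM customer c
        JOIN orders o ON o.cust_id = c.cust_id
        JOIN order_line l ON l.order_id = o.order_id
        GROUP BY c.name
    """).fetchone()


def separate_totals(conn):
    """Aggregate each measure at its own level, in two separate queries."""
    order_total = conn.execute(
        "SELECT SUM(order_total) FROM orders WHERE cust_id = 1"
    ).fetchone()[0]
    qty = conn.execute("""
        SELECT SUM(l.qty)
        FROM orders o
        JOIN order_line l ON l.order_id = o.order_id
        WHERE o.cust_id = 1
    """).fetchone()[0]
    return order_total, qty


# The order row is repeated once per order line, so its total is counted twice.
print(fanned_totals(conn))    # ('Acme', 200, 5)
print(separate_totals(conn))  # (100, 5)
```

The single query reports 200 because the order row is duplicated for each of its two lines before the sum is taken. Computing each measure in its own query, as in separate_totals, is one way to see the correct figure of 100.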
This fan trap is resolved by using an alias.

9. What is Chasm Trap?

A chasm trap occurs when many-to-one joins from two fact tables converge on a single lookup table. This type of join convergence can lead to a join path problem called a chasm trap. A chasm trap is resolved by using a context.

10. What is a Join? What are the different types of Joins?

Join: A relational operation that causes two tables with a common column to be combined into a single table. Designer supports equi-joins, theta joins, outer joins, shortcut joins, and isolated joins.

Equi-join: A join based on the equality between the values in the column of one table and the values in the column of another. Because the same column is present in both tables, the join synchronizes the two tables.

Outer join: A join that links two tables, one of which has rows that do not match those in the common column of the other table.

Theta join: A join that links tables based on a relationship other than equality between two columns.

Shortcut join: A join that links two tables by bypassing one or more other tables in the universe.

Isolated join: A join that has not been included in any context in your universe.

11. What is an Object? What are the different types of Objects?

Object: A component that maps to data or a derivation of data in the database. For the purposes of multidimensional analysis, an object can be qualified as a dimension, detail, or measure. Objects are grouped into classes.

There are 3 types of objects:
- Dimension: Parameters for analysis. Dimensions typically relate to a hierarchy such as geography, product, or time.
- Detail: Provides a description of a dimension, but is not the focus for analysis. For example: phone number.
- Measure: Conveys numeric information, which is used to quantify a dimension object. For example: sales revenue.

12. What are object, class, and subclass?

Object: A component that maps to data or a derivation of data in the database.
For the purposes of multidimensional analysis, an object can be qualified as a dimension, detail, or measure. Objects are grouped into classes.

Class: A logical grouping of objects and conditions within a universe. In general, the name of a class reflects a business concept that conveys the category or type of objects.

Subclass: A component within a class that groups objects. A subclass can itself contain other subclasses or objects.

13. What is Check Integrity?

Check Integrity checks the validity of the active universe, including its structure, joins, cardinalities, objects, contexts, and conditions. It can also detect whether there are any loops. You can check the entire universe or only certain of its components.

14. What is a surrogate key? Why are we going for surrogate keys?

Surrogate keys are keys that are maintained within the data warehouse instead of keys taken from the source data systems. They are used because:
1. Data tables in various source systems may use different keys for the same entity.
2. Keys may change or be reused in the source data systems.
3. Changes in organizational structures may move the keys in the hierarchy.

15. What are the different types of data warehouse tools available in market?

- Business Objects
- Cognos
- Hyperion Essbase
- MicroStrategy
- Microsoft Reporting Services
- Crystal Reports

16. What are cardinalities?

Cardinality expresses the minimum and maximum number of instances of an entity B that can be associated with an instance of an entity A. The minimum and maximum number of instances can be equal to 0, 1, or N. There are two main methods for detecting or editing cardinalities:
1. The Detect Cardinalities command
2. The Edit Join dialog box

17. What is a strategy? How many types are there?

Strategies are scripts that automatically extract structural information about tables, columns, joins, or cardinalities from a database. Designer provides default strategies, but a designer can also create strategies; these are referred to as external strategies. There are two types of strategies:
1. Built-in strategy
2. External strategy

Built-in strategies extract the joins along with the tables and detect cardinalities in joins.

18. What is List of Values (LOV)?

A list of values contains the data values associated with an object. These data values can originate from a corporate database, or from a flat file such as a text file or an Excel file. In Designer you create a list of values by running a query from the Query Panel.

19. What is aggregate awareness?

Aggregate awareness is a feature that makes use of predefined aggregate tables to enhance the performance of SQL queries. It is used to improve the speed at which aggregates are calculated in the database.

20. What is multidimensional analysis?

Multidimensional analysis is a technique for manipulating data in order to view it from different perspectives and at different levels of detail. In Business Objects, multidimensional analysis involves drill mode and slice-and-dice mode, and is enabled by the Analyzer and Explorer components of the User module.

21. What is the Business Objects repository?

The Business Objects repository is a set of relational data structures stored in a database. It enables Business Objects users to share resources in a controlled and secured environment. The repository is made up of three domains:

The security domain: A set of data structures in the Business Objects repository. The security domain contains information on the other domains (universe and document) and on the identification of Business Objects users. It also contains information relating to the management of the different products. The security domain is created with the wizard the first time Supervisor is launched. A repository contains only one security domain.

The universe domain: The area of the repository that holds exported universes. The universe domain makes it possible to store, distribute, and administer universes. There may be any number of universe domains in a repository.
Document domain: The area of the repository that stores documents, templates, scripts, and lists of values. There may be any number of document domains in a repository.

The repository is created by the general supervisor with the setup wizard during the first-time use of the product. You can create and use more than one repository, typically to manage multiple sites.

22. What is Supervisor?

Supervisor is the product for the secured deployment of Business Objects products. It provides a powerful and easy-to-use solution for user administration. Supervisor can run only in client/server mode. Its use requires a connection to a relational database. Any operation you perform with Supervisor is written to the repository.

23. What are OLAP, MOLAP, ROLAP, DOLAP, and HOLAP? Examples?

OLAP - On-Line Analytical Processing: Designates a category of applications and technologies that allow the collection, storage, manipulation, and reproduction of multidimensional data, with the goal of analysis.

MOLAP - Multidimensional OLAP: This term designates, more specifically, a Cartesian data structure. In effect, MOLAP contrasts with ROLAP. In the former, the joins between tables are already established, which enhances performance. In the latter, joins are computed at query time. MOLAP is targeted at groups of users because it is a shared environment. Data is stored in an exclusive server-based format. It performs more complex analysis of data.

DOLAP - Desktop OLAP: Small OLAP products for local multidimensional analysis. There can be a mini multidimensional database (using Personal Express), or extraction of a data cube (using Business Objects). Designed for the low-end, single, departmental user. Data is stored in cubes on the desktop. It is like having your own spreadsheet. Since the data is local, end users do not have to worry about performance hits against the server.

ROLAP - Relational OLAP: Designates one or several star schemas stored in relational databases.
This technology permits multidimensional analysis with data stored in relational databases. It is used for large departments or groups because it supports large amounts of data and large numbers of users.

HOLAP - Hybrid OLAP: A hybridization of OLAP, which can include any of the above.

24. What are the various modules in Business Object Products?

- Business Objects Reporter - Reporting and analysis tool
- Designer - Universe creation, database interaction and connectivity
- Supervisor - For administrative purposes
- Web Intelligence - Access to report data through the Internet
- Broadcast Agent - For scheduling reports
- Data Integrator - The ETL tool of Business Objects, designed to handle huge amounts of data

25. What is BAS? What is the function?

It is called Broadcast Agent Server. Its function is to run scheduled jobs or reports, which can be monitored using the Broadcast Agent Console.

26. What is OLAP?

OLAP (Online Analytical Processing) spans multiple versions of a database schema, owing to the evolution of an organization, and integrates data from many organizational locations and data stores. It is a category of software tools that provides analysis of data stored in a database. OLAP tools enable users to analyze different dimensions of multidimensional data. For example, they provide time series and trend analysis views. The chief component of OLAP is the OLAP server, which sits between a client and a database management system (DBMS). The OLAP server understands how data is organized in the database and has special functions for analyzing the data.

27. What is Data warehousing?

It is subject-oriented, nonvolatile, industrial data specifically structured for querying and reporting.

28. What is data mart?

It is a subset of a data warehouse that focuses on one or more specific subject areas. Data marts are used at the business division / department level.

29. Difference between data marts and data warehousing?

Data marts are used at the business division / department level, whereas a data warehouse is used at the enterprise level.
A data mart contains only the required subject-specific data for local analysis.

30. What is Data model?

A data model is a roadmap to the data in the database. It describes how an OLTP database maps to an OLAP database, so that predictions and decisions can be made for new data.

31. What is the difference between OLTP and OLAP?

- Focus: OLTP focuses on day-to-day transactions; OLAP focuses on future predictions and decisions.
- Data stability: OLTP data is dynamic; OLAP data is static until refreshed.
- Data structure: OLTP data is highly normalized; OLAP data is denormalized and replicated.
- Access frequency: OLTP is high; OLAP is medium to low.

32. What is metadata?

Metadata is data about the data. It contains the location and description of data warehouse system components such as names, definitions, and end-user views.

33. What is star schema? Why design it that way?

A star schema has a single fact table containing a compound primary key, with one segment for each dimension, and additional columns of additive numeric facts. It is designed that way because:
- It allows for the highest level of flexibility of metadata.
- It requires low maintenance as the data warehouse matures.
- It gives the best possible performance.

34. What is a Snowflake Schema?

In a snowflake schema, each dimension has a primary dimension table, to which one or more additional dimension tables can join. The primary dimension table is the only table that can join to the fact table.

35. When should you use a star schema and when a snowflake schema?

A star schema is the simplest data warehouse schema. A snowflake schema is similar to the star schema, but it normalizes the dimension tables to save data storage space. It can be used to represent hierarchies of information.

36. What is a fact table?

This is the central table in the star architecture that holds all of the key performance indicators (also called data points, attributes, or the informational dimension). Technically, the fact table is an intersection entity whose primary key is a composite key. The domain of each component of the key consists of the union of the domains of the different dimension levels.

37. What is Repository?
A centralized set of relational data structures stored in a database. It enables Business Objects users to share resources in a controlled and secured environment. The repository is made up of three domains: the security domain, the universe domain, and the document domain.

38. What are linked universes?

Linked universes are universes that share common components such as parameters, classes, objects, or joins. Among linked universes, one universe is said to be the kernel or master while the others are the derived universes. A kernel or master universe represents a reusable library of components. Derived universes may contain some or all of the components of the kernel or master universe, in addition to any components that have been added to them.

You can use one of three approaches when linking universes:
1. The kernel approach
2. The master approach
3. The component approach

Some of the benefits inherent in the use of linked universes are as follows:
- A dynamic link may considerably reduce development and maintenance time. When you modify a component in the kernel universe, Designer propagates the change to the same component in all the derived universes.
- Instead of re-creating common components each time you create a new universe, you can centralize such components in a kernel universe, and then include them in all new universes.
- Linked universes promote workgroup design. Common components can be shared among several designers.

39. What is Broadcast Agent?

Broadcast Agent is a Business Objects server product that enables you to automatically process documents via the Business Objects repository, an intranet, or the World Wide Web. You can process both Business Objects documents and Web Intelligence documents. Broadcast Agent can automate tasks using either add-ins or macros contained in Business Objects documents. In order to do so, you deploy Business Objects SDK add-ins or documents containing macros on Broadcast Agent.

40. How do you distribute universes?
You can distribute a universe to end users or another designer by:
- Moving it as a file through the file server
- Exporting it to the repository

If you distribute a universe as a file through the file server, any designer or end user can open it unless you have set a password on it.

41. What are the benefits of data warehouse?

- Immediate information delivery.
- Data integration from across, and even outside, the organization.
- Future vision from historical trends.
- Tools for looking at data in new ways.
- Enhanced customer service.

42. What is a Designer?

It is the Business Objects product that is used to develop universes. A universe is the semantic layer over the database structure that isolates end users from technical issues.

43. What is a dimension table?

It contains data used to reference the data stored in the fact table. Its characteristics are:
- Fewer rows
- Primarily character data
- One primary key
- Updatable data

44. What is Drill up, drill down, drill through, drill by?

Drill mode allows you to analyze data from different angles and at different levels of detail. Drill down displays the next level of detail in the hierarchy. Drill up goes back up through the hierarchy to display data at less detailed levels. By using the drill by option you can move to another hierarchy to analyze data that belongs to a different hierarchy.

45. Which columns go to the fact table and which columns go the dimension table?

The primary key columns of the source tables (entities) go to the dimension tables as foreign keys. The primary key columns of the dimension tables go to the fact table as foreign keys.

46. Differences between star and snowflake schemas?

Star schema - all dimensions are linked directly to a fact table.
Snowflake schema - dimensions may be interlinked or may have one-to-many relationships with other tables.

47. What is performance tuning?

It is used to reduce complex filters. It is used for operations like AND, OR, NOT.

48. What are conformed dimensions?
Conformed dimensions mean exactly the same thing in every fact table to which they are joined. Ex: a Date dimension is connected to all facts, such as sales facts, inventory facts, etc.

49. What is ODS?

ODS means operational data store. It is a collection of operational or base data that is extracted from operational databases and standardized, cleansed, consolidated, transformed, and loaded into the enterprise data architecture. An ODS is used to support data mining of operational data or as the store for the base data that is summarized in a data warehouse.

50. What are the different types of reports?

There are two types of reports:
1. Standard reports
2. Template reports

Standard reports:
- Simple reports
- Grouped reports
- Pivot tables (cross tab)
- Charts

51. Why are OLTP database designs not generally a good idea for a DWH?

OLTP tables are normalized, so response is slow for the end user, and OLTP systems do not contain years of data, so the data cannot be analyzed historically.

52. What is Business Object Auditor?

It is a web-based product that allows you to monitor and analyze user and system activity for 3-tier deployments of Business Objects.

53. What are the job responsibilities of a general supervisor?

A general supervisor:
- Creates repositories.
- Creates any type of user, including other general supervisors.
- Creates user groups.
- Administers user accounts and privileges for repository users.
- Imports and exports universes to and from the repository.
- Uses any feature of all Business Objects products.
- Defines a Broadcast Agent for a group.
- Launches a Broadcast Agent from Broadcast Agent Administrator.

54. What is Business Intelligence?

It is a broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions.

55. What resources does Supervisor manage?

Supervisor lets you manage the following types of resources:
- Products
- Universes
- Stored procedures
- Documents
- Domains
- Categories
- Channels
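
As a closing illustration of the SCD1 and SCD2 behaviour described in question 2, here is a minimal Python sketch. The row layout (customer_id, address, version, start_date, end_date, is_current, surrogate_key) and the function names are assumptions made for this example, not part of any particular ETL tool. The SCD2 rows combine an effective date, a version, and a flag, and each new version receives its own warehouse-maintained surrogate key, as in question 14.

```python
from datetime import date


def apply_scd1(dim_rows, customer_id, new_address):
    """SCD1: overwrite the existing record, so only the updated value is kept."""
    for row in dim_rows:
        if row["customer_id"] == customer_id:
            row["address"] = new_address
    return dim_rows


def apply_scd2(dim_rows, customer_id, new_address, change_date):
    """SCD2: expire the current row and append a new version of the record."""
    next_version = 1
    for row in dim_rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["is_current"] = False      # flag marks the row as historical
            row["end_date"] = change_date  # effective date range is closed
            next_version = row["version"] + 1
    dim_rows.append({
        "surrogate_key": len(dim_rows) + 1,  # hypothetical key generator
        "customer_id": customer_id,
        "address": new_address,
        "version": next_version,
        "start_date": change_date,
        "end_date": None,
        "is_current": True,
    })
    return dim_rows


# SCD1: the old address is lost.
scd1_rows = [{"customer_id": 7, "address": "1 Old Street"}]
apply_scd1(scd1_rows, 7, "9 New Avenue")

# SCD2: the old address is kept as a closed, non-current row.
scd2_rows = [{
    "surrogate_key": 1, "customer_id": 7, "address": "1 Old Street",
    "version": 1, "start_date": date(2020, 1, 1), "end_date": None,
    "is_current": True,
}]
apply_scd2(scd2_rows, 7, "9 New Avenue", date(2024, 5, 1))

for row in scd2_rows:
    print(row["surrogate_key"], row["address"], row["version"], row["is_current"])
```

After the SCD1 call the dimension still has one row holding only the new address. After the SCD2 call it has two rows for the same customer, so reports can show either the current address or the address that was valid on a given date.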