DATA WAREHOUSE
1. What is Universe? What are the steps involved in creating a universe?
A universe is a semantic layer between the database and the user
interface. It maps the data structures found in databases (tables,
columns, joins, etc.) to business terms. A universe, which is made up of
classes, objects, and conditions, can represent any specific application,
system, or group of users.
Steps involved in creating Universe:
Identifying tables
Understand the requirement
Insert tables
Then create objects
Create joins
Create calculated measures
Check universe integrity
2. What is SCD? What are the different types of SCD?
SCD stands for slowly changing dimensions. Slowly changing dimensions
are of three types. They are:
SCD1: It maintains only updated values; the old value is overwritten.
Ex: When a customer's address is modified, we update the existing record
with the new address.
SCD2: It maintains historical information and current information by
adding a new row for each change, tracked by using:
Effective dates
Versions
Flags
Or a combination of these
SCD3: It maintains historical information and current information by
adding new columns to the target table (for example, a previous-value
column alongside the current-value column).
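The three SCD types can be illustrated with a minimal Python sketch. The CustomerRow record and the apply_scd1, apply_scd2, and apply_scd3 helpers are hypothetical names chosen for this example; they are not part of any ETL tool.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerRow:
    """Hypothetical dimension row; field names are assumptions."""
    customer_id: int                         # natural key from the source
    address: str
    previous_address: Optional[str] = None   # used only by SCD3
    effective_date: Optional[date] = None    # used only by SCD2
    version: int = 1                         # used only by SCD2
    is_current: bool = True                  # used only by SCD2


def apply_scd1(rows: List[CustomerRow], customer_id: int,
               new_address: str) -> List[CustomerRow]:
    """SCD1: overwrite the value in place; no history is kept."""
    return [replace(r, address=new_address) if r.customer_id == customer_id else r
            for r in rows]


def apply_scd2(rows: List[CustomerRow], customer_id: int,
               new_address: str, changed_on: date) -> List[CustomerRow]:
    """SCD2: flag the current row as old and append a new versioned row."""
    out: List[CustomerRow] = []
    new_row = None
    for r in rows:
        if r.customer_id == customer_id and r.is_current:
            out.append(replace(r, is_current=False))
            new_row = replace(r, address=new_address,
                              effective_date=changed_on,
                              version=r.version + 1, is_current=True)
        else:
            out.append(r)
    if new_row is not None:
        out.append(new_row)
    return out


def apply_scd3(rows: List[CustomerRow], customer_id: int,
               new_address: str) -> List[CustomerRow]:
    """SCD3: keep the old value in an extra column of the same row."""
    return [replace(r, previous_address=r.address, address=new_address)
            if r.customer_id == customer_id else r
            for r in rows]
```

Note that SCD2 grows the table by one row per change, while SCD1 and SCD3 keep one row per customer; SCD3 as sketched here remembers only the most recent previous value.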
3. What is Normalization? What is Denormalization?
Normalization: It is the process of organizing data into tables so that
each piece of information is stored only once and every column depends on
the key of its table. This reduces redundancy and update anomalies. The
process also results in deep insights into the information used in
business and the understanding of how various elements of information
are related to each other.
Denormalization: It is the reverse process of normalization, whereby you
deliberately relax the normalization rules. A related scenario is data
warehousing where, in order to speed up reports and other SELECT
queries, we may relax the table structure by including redundant columns
in a table so that big, complex queries are avoided.
4. What is normal form? What are the different types of normal forms?
A normal form is a rule that a table must satisfy to be considered
normalized to a given level. Each normal form builds on the previous one.
The three most commonly used normal forms are:
First Normal Form: The first step is to put the data into first normal form.
This can be done by moving the data into separate tables where the data
is of similar type in each table. Each table is given a primary key. This will
eliminate repeating groups of data.
Second Normal Form: This step takes out data that is only dependent on a
part of the key and moves it to a separate table.
Third Normal Form: This step gets rid of anything that does not depend
entirely on the primary key. A table in third normal form is, by
definition, also in first and second normal form.
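As a small illustration of why non-key data is moved out, the following sketch uses Python's built-in sqlite3 module. The customer and orders tables and their columns are assumptions made for this example. Because the city depends on the customer and not on the order, it is stored once in the customer table, and a single UPDATE corrects it for every order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: city depends only on the customer, so it lives
-- in the customer table instead of being repeated on every order row.
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT,
    city        TEXT
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    amount      INTEGER
);
INSERT INTO customer VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Leeds');
INSERT INTO orders VALUES (10, 1, 50), (11, 1, 25), (12, 2, 40);
""")

# One UPDATE changes the city everywhere it is reported.
conn.execute("UPDATE customer SET city = 'Mumbai' WHERE customer_id = 1")

# A report joins the tables back together when it needs both.
rows = conn.execute("""
    SELECT o.order_id, c.name, c.city
    FROM orders o
    JOIN customer c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

A denormalized reporting table would instead copy name and city onto each order row, trading redundancy for simpler queries.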
5. What is a loop?
A loop is a situation that occurs when more than one path exists from one
table to another. Loops result in ambiguity in the design of a universe.
Designer enables you to identify loops in one of two ways:
1. You can run the Check Integrity function, which indicates the existence
of any loops.
2. You can select the Detect Loops command from the Tools menu. If
there are loops, the Loop Detection viewer appears; it indicates the joins
causing a loop.
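The idea of a loop can be sketched outside Designer by treating tables as nodes and joins as undirected edges: a join closes a loop when its two tables are already connected by earlier joins. The find_loop_joins function below is a hypothetical helper based on a union-find structure; it is an illustration of the concept, not the algorithm that Designer uses.

```python
from typing import Dict, List, Tuple


def find_loop_joins(joins: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the joins that close a loop, in the order they are given."""
    parent: Dict[str, str] = {}

    def find(table: str) -> str:
        # Follow parent links to the representative of the table's group.
        parent.setdefault(table, table)
        while parent[table] != table:
            parent[table] = parent[parent[table]]  # shorten the path
            table = parent[table]
        return table

    closing = []
    for left, right in joins:
        root_left, root_right = find(left), find(right)
        if root_left == root_right:
            # The two tables were already connected: a second path exists.
            closing.append((left, right))
        else:
            parent[root_left] = root_right
    return closing


# Hypothetical join list: customer reaches product by two different paths.
joins = [
    ("customer", "orders"),
    ("orders", "product"),
    ("customer", "region"),
    ("region", "product"),
]
print(find_loop_joins(joins))
```

Which join is reported depends on the order of the list; any join on the cycle could be the one resolved by an alias or a context.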
6. What is Context?
A method by which Designer can decide which path to choose when more
than one path is possible from one table to another in the universe. It
helps in resolving the loops created by various joins in the universe tables.
You can create contexts manually, or have Designer detect them
automatically.
7. What is an alias?
An alias is a logical pointer to an alternate table name. The purpose of an
alias is to resolve loops in the paths of joins. In some cases, more than
one alias may be necessary for a given table.
8. What is Fan Trap?
A fan trap occurs when a one-to-many join links a table that is in turn
linked by another one-to-many join. This fanning out of one-to-many joins
can cause incorrect (inflated) results to be returned when a query
includes objects based on both tables. A fan trap is typically resolved
by using aliases, usually together with contexts.
9. What is Chasm Trap?
A chasm trap occurs when many-to-one joins from two fact tables converge
on a single lookup table. A query that uses both fact tables at once can
then return inflated results. A chasm trap is resolved by using contexts.
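The effect of a chasm trap can be reproduced with a small sqlite3 sketch. The customer, sales, and returns tables are assumptions made for this example. Joining both fact tables through the shared lookup table multiplies the rows, so each sum is inflated; querying each fact table separately, which is in effect what contexts make possible, gives the correct totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: two fact tables share one lookup table.
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales   (sale_id   INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
CREATE TABLE returns (return_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
INSERT INTO customer VALUES (1, 'Asha');
INSERT INTO sales    VALUES (1, 1, 100), (2, 1, 200);
INSERT INTO returns  VALUES (1, 1, 10), (2, 1, 20), (3, 1, 30);
""")

# Single query across both fact tables: 2 sales x 3 returns = 6 rows,
# so every sale is counted 3 times and every return 2 times.
trapped = conn.execute("""
    SELECT SUM(s.amount), SUM(r.amount)
    FROM customer c
    JOIN sales   s ON s.customer_id = c.customer_id
    JOIN returns r ON r.customer_id = c.customer_id
""").fetchone()

# One query per fact table: each total is computed on its own join path.
sales_total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE customer_id = 1").fetchone()[0]
returns_total = conn.execute(
    "SELECT SUM(amount) FROM returns WHERE customer_id = 1").fetchone()[0]

print(trapped)                       # inflated totals
print((sales_total, returns_total))  # correct totals
```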
10. What is a Join? What are the different types of Joins?
Join: A relational operation that causes two tables with a common column
to be combined into a single table. Designer supports equi-joins, theta
joins, outer joins, short cut joins, isolated join.
Equi-join: A join based on the equality between the values in the column
of one table and the values in the column of another. Because the same
column is present in both tables, the join synchronizes the two tables.
Outer join: A join that links two tables, one of which has rows that do not
match those in the common column of the other table.
Theta join: A join that links tables based on a relationship other than
equality between two columns.
Short cut join: A join that links two tables by bypassing one or more other
tables in the universe.
Isolated Join: An isolated join is one that has not been included in any
context in your universe.
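The equi-join, outer join, and theta join can be compared on sample data using Python's sqlite3 module. The tables and column names below are assumptions made for this example; shortcut joins and isolated joins are universe design concepts and are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data.
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
CREATE TABLE orders   (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
CREATE TABLE age_band (band TEXT, min_age INTEGER, max_age INTEGER);
INSERT INTO customer VALUES (1, 'Asha', 34), (2, 'Ben', 61);
INSERT INTO orders   VALUES (10, 1, 50);
INSERT INTO age_band VALUES ('under 40', 0, 39), ('40 and over', 40, 200);
""")

# Equi-join: rows match when the common column values are equal.
equi = conn.execute("""
    SELECT c.name, o.amount
    FROM customer c
    JOIN orders o ON o.customer_id = c.customer_id
""").fetchall()

# Outer join: customers without a matching order are still returned.
outer = conn.execute("""
    SELECT c.name, o.amount
    FROM customer c
    LEFT OUTER JOIN orders o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# Theta join: the relationship is a range test, not an equality.
theta = conn.execute("""
    SELECT c.name, b.band
    FROM customer c
    JOIN age_band b ON c.age BETWEEN b.min_age AND b.max_age
    ORDER BY c.customer_id
""").fetchall()

print(equi, outer, theta)
```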
11. What is an Object? What are the different types of Objects?
Object: A component that maps to data or a derivation of data in the
database. For the purpose of multidimensional analysis, an object can be
qualified as a dimension, detail or measure. Objects are grouped into
classes.
There are 3 types of objects:

Object Type   Description
Dimension     Parameters for analysis. Dimensions typically relate to
              a hierarchy such as geography, product, or time.
Detail        Provides a description of a dimension, but is not the
              focus for analysis. For example: phone number.
Measure       Conveys numeric information, which is used to quantify
              a dimension object. For example: sales revenue.
12. What are object, class, and subclass?
Object: A component that maps to data or a derivation of data in the
database. For the purposes of multidimensional analysis, an object can be
qualified as a dimension, detail, or measure. Objects are grouped into
classes.
Class: A logical grouping of objects and conditions within a universe. In
general, the name of a class reflects a business concept that conveys the
category or type of objects.
Subclass: A component within a class that groups objects. A subclass
can itself contain other subclasses or objects.
13. What is Check Integrity?
Checks the validity of the active universe including its structure, joins,
cardinalities, objects, contexts, and conditions. It can also detect whether
there are any loops. You can check the entire universe or only certain of
its components.
14. What is a surrogate key? Why are we going for surrogate keys?
Surrogate keys are keys that are generated and maintained within the data
warehouse instead of keys taken from source data systems. We use them
because:
1. Data tables in various source systems may use different keys for the
same entity.
2. Keys may change or be reused in the data source systems.
3. Changes in organizational structures may move the keys in the
hierarchy.
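A minimal sketch of surrogate key assignment follows. The SurrogateKeyMap class, its method names, and the source system names are all hypothetical; a real warehouse would usually keep this mapping in a database table and use a sequence or identity column.

```python
from itertools import count
from typing import Dict, Tuple


class SurrogateKeyMap:
    """Hypothetical in-memory map from source keys to warehouse keys."""

    def __init__(self) -> None:
        self._next_key = count(1)                  # warehouse-owned sequence
        self._keys: Dict[Tuple[str, str], int] = {}

    def key_for(self, source_system: str, natural_key: str) -> int:
        """Return the surrogate key, creating one on first sight."""
        lookup = (source_system, natural_key)
        if lookup not in self._keys:
            self._keys[lookup] = next(self._next_key)
        return self._keys[lookup]

    def alias(self, source_system: str, natural_key: str, surrogate: int) -> None:
        """Record that another source key refers to an already known entity."""
        self._keys[(source_system, natural_key)] = surrogate


keys = SurrogateKeyMap()
crm_key = keys.key_for("crm", "C-001")
# The billing system uses a different key for the same customer.
keys.alias("billing", "9917", crm_key)
print(crm_key, keys.key_for("billing", "9917"))
```

Because fact tables reference only the surrogate key, a change or reuse of a key in a source system affects only this mapping.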
15. What are the different types of data warehouse tools available in market?

Business Objects
Cognos
Hyperion Essbase
MicroStrategy
Microsoft Reporting Services
Crystal Reports
16. What are cardinalities?
Cardinality expresses the min and max number of instances of an entity B
that can be associated with an instance of an entity A. The min and the
max number of instances can be equal to 0, 1, or N.
There are two main methods for detecting or editing cardinalities:
1. The Detect Cardinalities command
2. The Edit Join dialog box
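How Designer detects cardinalities internally is not described here. As a rough illustration of the concept only, the hypothetical detect_cardinality function below infers the cardinality of a join from the key values found on each side: if a key value repeats on one side, that side is the "N" side.

```python
from collections import Counter
from typing import Hashable, Iterable


def detect_cardinality(left_keys: Iterable[Hashable],
                       right_keys: Iterable[Hashable]) -> str:
    """Classify a join as 1-1, 1-N, N-1, or N-N from sample key values.

    This is an assumption-based sketch: it only inspects the data it is
    given, so an unrepresentative sample can give the wrong answer.
    """
    left_max = max(Counter(left_keys).values(), default=0)
    right_max = max(Counter(right_keys).values(), default=0)
    left_side = "1" if left_max <= 1 else "N"
    right_side = "1" if right_max <= 1 else "N"
    return f"{left_side}-{right_side}"


# customer.customer_id is unique; orders.customer_id repeats.
print(detect_cardinality([1, 2, 3], [1, 1, 2]))
```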
17. What is a strategy? How many types are there?
Strategies are scripts that automatically extract structural information
about tables, columns, joins, or cardinalities from a database. Designer
provides default (built-in) strategies, but a designer can also create
custom strategies. These are referred to as external strategies.
There are two types of strategies. They are:
1. Built-in strategy
2. External strategy
Built-in strategies:
Extract the joins with tables.
Detect cardinalities in joins.
18. What is List of Values (LOV)?
A list of values contains the data values associated with an object. These
data values can originate from a corporate database, or a flat file such as
a text file or Excel file. In Designer you create a list of values by running a
query from the Query Panel.
19. What is aggregate awareness?
Aggregate awareness is a feature that makes use of predefined aggregate
tables to enhance the performance of SQL queries. When a query can be
answered from a precalculated aggregate table, that table is used instead
of the detail table, which improves the speed at which aggregates are
returned from the database.
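The selection logic behind aggregate awareness can be sketched as follows: candidate tables are listed from most aggregated to most detailed, and the first one that covers every dimension requested by the query is used. The CandidateTable type, the pick_table function, and the table names are hypothetical and serve only to illustrate the idea.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class CandidateTable(NamedTuple):
    name: str
    dimensions: FrozenSet[str]   # dimensions this table can be grouped by


def pick_table(candidates: List[CandidateTable],
               requested: Iterable[str]) -> str:
    """Return the first (most aggregated) table that covers the request."""
    needed = frozenset(requested)
    for table in candidates:
        if needed <= table.dimensions:
            return table.name
    raise LookupError(f"no table covers dimensions {sorted(needed)}")


# Ordered from most aggregated to most detailed (names are assumptions).
candidates = [
    CandidateTable("agg_sales_by_year", frozenset({"year"})),
    CandidateTable("agg_sales_by_month", frozenset({"year", "month"})),
    CandidateTable("sales_detail", frozenset({"year", "month", "day", "product"})),
]

print(pick_table(candidates, ["year"]))
print(pick_table(candidates, ["year", "month"]))
print(pick_table(candidates, ["year", "product"]))
```

A yearly revenue query reads the small yearly table, while a query that adds the product dimension falls back to the detail table.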
20. What is multidimensional analysis?
Multidimensional analysis is a technique for manipulating data in order to
view it from different perspectives and on different levels of detail. In
Business Objects, multidimensional analysis involves drill mode and
slice-and-dice mode, and is enabled by the Analyzer and Explorer components
of the User module.
21. What is the Business Objects repository?
The Business Objects repository is a set of relational data structures
stored on a database. It enables Business Objects users to share
resources in a controlled and secured environment.
The repository is made up of three domains:
The security domain: A set of data structures in the Business Objects
repository. The security domain contains information on the other domains
(universe and document) and on the identification of Business Objects
users. It also contains information relating to the management of the
different products. The security domain is created with the wizard the first
time Supervisor is launched. A repository contains only one security
domain.
The universe domain: The area of the repository that holds exported
universes. The universe domain makes it possible to store, distribute, and
administer universes. There may be any number of universe domains in a
repository.
The document domain: The area of the repository that stores documents,
templates, scripts, and lists of values. There may be any number of
document domains in a repository.
The repository is created by the general supervisor with the setup wizard
during the first-time use of the product. You can create and use more than
one repository, typically to manage multiple sites.
22. What is Supervisor?
Supervisor is the product for the secured deployment of Business Objects
products. It provides a powerful and easy-to-use solution for user
administration.
Supervisor can run only in client/server mode. Its use requires a
connection to a relational database. Any operation you perform with
Supervisor is written to the repository.
23. What are OLAP, MOLAP, ROLAP, DOLAP, and HOLAP? Examples?
OLAP - On-Line Analytical Processing: Designates a category of
applications and technologies that allow the collection, storage,
manipulation and reproduction of multidimensional data, with the goal of
analysis.
MOLAP - Multidimensional OLAP: This term designates, more
specifically, a multidimensional (cube) data structure. MOLAP contrasts
with ROLAP: in the former, joins between tables are precomputed, which
enhances performance; in the latter, joins are computed at query time.
MOLAP is targeted at groups of users because it is a shared environment.
Data is stored in a proprietary server-based format. It performs more
complex analysis of data.
DOLAP - Desktop OLAP: Small OLAP products for local multidimensional
analysis. There can be a mini multidimensional database (using Personal
Express), or extraction of a data cube (using Business Objects). Designed
for a low-end, single departmental user. Data is stored in cubes on the
desktop. It is like having your own spreadsheet. Since the data is local,
end users don't have to worry about performance hits against the server.
ROLAP - Relational OLAP: Designates one or several star schemas
stored in relational databases. This technology permits multidimensional
analysis with data stored in relational databases. Used for large
departments or groups because it supports large amounts of data and
users.
HOLAP - Hybrid OLAP: A hybridization of OLAP, which can combine any of
the above.
24. What are the various modules in Business Object Products?
Business Objects Reporter - Reporting & Analyzing tool
Designer - Universe creation, database interaction & connectivity
Supervisor - For administrative purposes
Web intelligence - Access of report data through Internet.
Broadcast Agent - For scheduling the reports
Data Integrator - The ETL tool of Business Objects, designed to handle
huge amounts of data.
25. What is BAS? What is the function?
It is called Broadcast Agent Server. Its function is to run scheduled jobs
or reports, which can be monitored using Broadcast Agent Console.
26. What is OLAP?
OLAP (Online Analytical Processing) is a category of software tools that
provides analysis of data stored in a database. The data typically spans
multiple versions of a database schema, because of the evolution of the
organization, and is integrated from many organizational locations and
data stores.
OLAP tools enable users to analyze different dimensions of
multidimensional data. For example, they provide time series and trend
analysis views. The chief component of OLAP is the OLAP server, which
sits between a client and a database management system (DBMS). The
OLAP server understands how data is organized in the database and has
special functions for analyzing the data.
27. What is Data warehousing?
A data warehouse is a subject-oriented, integrated, nonvolatile
collection of data specifically structured for querying and reporting.
28. What is data mart?
It is a subset of a data warehouse that focuses on one or more specific
subject areas. Data marts are used at the business division or department
level.
29. Difference between data marts and data warehousing?
Data marts are used on a business division / department level whereas
data warehouse is used on enterprise level. A data mart only contains the
required subject specific data for local analysis.
30. What is Data model?
A data model is a roadmap to the data in the database. It describes how
data is structured as it moves from an OLTP database to an OLAP database,
where it is used to make predictions and decisions.
31. What is the difference between OLTP and OLAP?
                   OLTP                      OLAP
Focus              Day-to-day transactions   Future predictions and decisions
Data stability     Dynamic                   Static until refreshed
Data structure     Highly normalized         Denormalized and replicated data
Access frequency   High                      Medium to low
32. What is metadata?
Metadata is data about the data. It contains the location and description
of data warehouse system components such as names, definitions, and end
user views.
33. What is star schema? Why is it designed that way?
A star schema has a single fact table containing a compound primary key,
with one segment for each dimension, and additional columns of additive
numeric facts. Each segment of the key joins to one dimension table. It is
designed that way because it allows for the highest level of flexibility
of metadata, requires low maintenance as the data warehouse matures, and
gives the best possible query performance.
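A star schema can be sketched with sqlite3 as one fact table whose compound primary key has one segment per dimension. The fact_sales, dim_date, and dim_product tables and their columns are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimension tables, one per axis of analysis.
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);

-- Fact table: compound key with one segment per dimension, plus an
-- additive numeric fact.
CREATE TABLE fact_sales (
    date_key    INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    revenue     INTEGER,
    PRIMARY KEY (date_key, product_key)
);

INSERT INTO dim_date    VALUES (1, 2023, 12), (2, 2024, 1);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales  VALUES (1, 1, 100), (1, 2, 40), (2, 1, 70);
""")

# Every dimension is one join away from the fact table.
revenue_by_year = conn.execute("""
    SELECT d.year, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.year
    ORDER BY d.year
""").fetchall()
print(revenue_by_year)
```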
34. What is Snowflake Schema?
In a snowflake schema, each dimension has a primary dimension table, to
which one or more additional dimension tables can join. The primary
dimension table is the only table that can join to the fact table.
35. When should you use a star schema and when a snowflake schema?
A star schema is the simplest data warehouse schema and is the usual
choice when query simplicity and speed matter most. A snowflake schema
is similar to the star schema, but it normalizes the dimension tables to
save data storage space. Use it when you need to represent hierarchies of
information within a dimension.
36. What is a fact table?
This is the central table in the star architecture that contains all of the
key performance indicators (also called data points, attributes, or the
informational dimension). Technically, the fact table is an intersection
entity whose primary key is a composite key. The domain of each
component of the key consists of the union of the domains of the
different dimension levels.
37. What is Repository?
A centralized set of relational data structures stored in a database. It
enables Business Objects users to share resources in a controlled and
secured environment. The repository is made up of three domains: the
security domain, the universe domain and the document domain.
38. What are linked universes?
Linked universes are universes that share common components such as
parameters, classes, objects, or joins. Among linked universes, one
universe is said to be the kernel or master while the others are the derived
universes.
A kernel or master universe represents a re-useable library of
components. Derived universes may contain some or all of the
components of the kernel or master universe, in addition to any
components that have been added to it.
You can use one of three approaches when linking universes:
1. The kernel approach
2. The master approach
3. The component approach
Some of the benefits inherent in the use of linked universes are as follows:
A dynamic link may considerably reduce development and maintenance
time. When you modify a component in the kernel universe, Designer
propagates the change to the same component in all the derived
universes.
Instead of re-creating common components each time you create a new
universe, you can centralize such components in a kernel universe, and
then include them in all new universes.
Linked universes promote workgroup design. Common components can be
shared among several designers.
39. What is Broadcast Agent?
Broadcast Agent is a Business Objects server product that enables you to
automatically process documents via the Business Objects repository, an
Intranet, or the World Wide Web. You can process both Business Objects
documents and Web Intelligence documents.
Broadcast Agent can automate tasks using either add-ins or macros
contained in Business Objects documents. In order to do so, you deploy
Business Objects SDK add-ins or documents containing macros on
Broadcast Agent.
40. How do you distribute universes?
You can distribute a universe to end users or another designer by:
Moving it as a file through the file server
Exporting it to the repository
If you distribute a universe as a file through the file server, any designer or
end user can open it unless you have set a password on it.
41. What are the benefits of data warehouse?
Immediate information delivery.
Data integration from across, even outside the organization.
Future vision from historical trends.
Tools for looking at data in new ways.
Enhanced customer service.
42. What is a Designer?
Designer is a Business Objects product that is intended for developing
universes. The universe is the semantic layer over the database structure
that isolates end users from technical issues.
43. What is a dimension table?
It contains data used to reference data stored in the fact table. Compared
with a fact table, it typically has fewer rows, primarily character data,
one primary key, and updatable data.
44. What is Drill up, drill down, drill through, drill by?
Drill mode allows you to analyze data from different angles and different
levels of detail.
Drill down displays the next level of detail in the hierarchy.
Drill up goes back up through the hierarchy to display data on less detailed
levels.
Drill by lets you move to another hierarchy to analyze other data that
belongs to a different hierarchy.
Drill through moves from the summarized data to the underlying detail
data behind it.
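Drill down and drill up can be pictured as re-aggregating the same rows at a deeper or shallower level of one hierarchy. The sketch below assumes a hypothetical year, quarter, month hierarchy and sample rows; it does not reflect how Business Objects implements drill mode.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical time hierarchy, from least to most detailed.
HIERARCHY = ["year", "quarter", "month"]


def aggregate(rows: List[dict], level: str) -> Dict[Tuple, int]:
    """Sum revenue grouped by every hierarchy level down to `level`."""
    depth = HIERARCHY.index(level) + 1
    totals: Dict[Tuple, int] = defaultdict(int)
    for row in rows:
        key = tuple(row[name] for name in HIERARCHY[:depth])
        totals[key] += row["revenue"]
    return dict(totals)


def drill_down(level: str) -> str:
    """Next, more detailed level (stays put at the bottom)."""
    i = HIERARCHY.index(level)
    return HIERARCHY[min(i + 1, len(HIERARCHY) - 1)]


def drill_up(level: str) -> str:
    """Previous, less detailed level (stays put at the top)."""
    i = HIERARCHY.index(level)
    return HIERARCHY[max(i - 1, 0)]


rows = [
    {"year": 2024, "quarter": "Q1", "month": "Jan", "revenue": 10},
    {"year": 2024, "quarter": "Q1", "month": "Feb", "revenue": 20},
    {"year": 2024, "quarter": "Q2", "month": "Apr", "revenue": 5},
]

level = "year"
print(aggregate(rows, level))
level = drill_down(level)          # year -> quarter
print(aggregate(rows, level))
```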
45. Which columns go to the fact table and which columns go to the dimension
table?
The descriptive columns of the source tables (entities), together with
their primary keys, go to the dimension tables. The primary key columns of
the dimension tables go to the fact table as foreign keys, along with the
numeric measures.
46. Differences between star and snowflake schemas?
Star schema - all dimensions will be linked directly with a fact table.
Snowflake schema - dimensions may be interlinked or may have one-to-many
relationships with other tables.
47. What is performance tuning?
Performance tuning is the process of improving query and report response
time. One technique is to reduce complex filters, such as those that
combine many AND, OR, and NOT operations.
48. What are conformed dimensions?
Conformed dimensions mean exactly the same thing in every fact table to
which they are joined.
Ex: A date dimension is connected to all facts, such as sales facts,
inventory facts, etc.
49. What is ODS?
ODS means operational data store.
A collection of operational or base data that is extracted from operational
databases and standardized, cleansed, consolidated, transformed, and
loaded into an enterprise data architecture. An ODS is used to support data
mining of operational data or as the store for the base data that is
summarized for a data warehouse.
50. What are the different types of reports?
There are two types of reports
1. Standard reports
2. Template Reports
Standard reports:
Simple reports
Grouped reports
Pivot tables (cross tab)
Chart
51. Why are OLTP database designs not generally a good idea for a DWH?
In OLTP, tables are normalized, so response will be slow for end-user
analytical queries that must join many tables. In addition, an OLTP
database does not contain years of historical data, so trends cannot be
analyzed.
52. What is Business Object Auditor?
It is a web-based product that allows you to monitor and analyze user and
system activity for 3-tier deployments of Business Objects.
53. What are the job responsibilities of a general supervisor?
Creates repositories.
Creates any type of user, including other general supervisors.
Creates user groups.
Administers user accounts and privileges for repository users.
Imports and exports universes to and from the repository.
Uses any feature of all Business Objects products.
Defines a Broadcast Agent for a group.
Launches a Broadcast Agent from Broadcast Agent Administrator.
54. What is Business Intelligence?
It is a broad category of applications and technologies for gathering,
storing, analyzing and providing access to data to help enterprise users
make better business decisions.
55. What resources does Supervisor manage?
Supervisor lets you manage the following types of resources:
Products
Universes
Stored procedures
Documents
Domains
Categories
Channels