Download Document

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Extensible Storage Engine wikipedia , lookup

Clusterpoint wikipedia , lookup

Big data wikipedia , lookup

Relational model wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Object-relational impedance mismatch wikipedia , lookup

Functional Database Model wikipedia , lookup

Database model wikipedia , lookup

Transcript
Informatica
1)What is Data warehouse?
In 1980, Bill Inmon known as father of data warehousing. "A Data warehouse is a subject
oriented, integrated ,time variant, non volatile collection of data in support of
management's decision making process".

Subject oriented : means that the data addresses a specific subject such as sales, inventory
etc.

Integrated : means that the data is obtained from a variety of sources.

Time variant : implies that the data is stored in such a way that when some data is changed.

Non volatile : implies that data is never removed. i.e., historical data is also kept.
2)What is the difference between database and data warehouse?
A database is a collection of related data.
A data warehouse is also a collection of information as well as a supporting system.
3)What are the benefits of data warehousing?

Historical information for comparative and competitive analysis.

Enhanced data quality and completeness.

Supplementing disaster recovery plans with another data back up source.
4)What are the types of data warehouse?
There are mainly three type of Data Warehouse are :

Enterprise Data Warehouse

Operational data store

Data Mart
5)What is the difference between data mining and data warehousing?
Data warehouse is the base for Data Mining. Data Mining basically extract the knowledge
from Data warehouse. It means by analyzing data in data warehouse, will predict trend in
business.
6)What are the applications of data warehouse?

Datawarehouse are used extensively in banking and financial services, consumer goods.

Datawarehouse is mainly used for generating reports and answering predefined queries.

Datawarehouse is used for strategic purposes, performing multidimensional analysis.

Datawarehouse is used for knowledge discovery and strategic decision making using data
mining tools.
7)What are the types of datawarehouse applications?

Info processing

Analytical processing

Data mining.
8)What is metadata?
Metadata is defined as the data about data. Metadata describes the entity and attributes
description.
9)What are the benefits of Datawarehousing?
The implementation of a data warehouse can provide many benefits to an organization. A
data warehouse can :

Facilitate integration in an environment characterized by un–integrated applications.

Integrate enterprise data across a variety of functions.

Integrate external as well as internal data.

Support strategic and long–term business planning.

Support day–to–day tactical decisions.

Enable insight into business trends and business opportunities.

Organize and store historical data needed for analysis.

Make available historical data, extending over many years, which enables trend analysis.

Provide more accurate and complete information.

Improve knowledge about the business.

Enable cost–effective decision making.

Enable organizations to understand their customers, and their needs, as well competitors.

Enhance customer service and satisfaction.

Provide competitive advantage.

Provide easy access for end–users.

Provide timely access to corporate information.
10)What is the difference between dimensional table and fact table?
A dimension table consists of tuples of attributes of the dimension. A fact table can be
thought of as having tuples, one per a recorded fact. This fact contains some measured or
observed variables and identifies them with pointers to dimension tables.
11)What is data mart?
A data mart can contain one fact table to address one subject. In such a case,when a
number of data marts are integrated to create a data warehouse, it is important that the
facts in each table mean the same thing. Such facts (i.e., measures or metrics) that have the
same meaning in different data marts are called conformed facts.
12)What are the different types of datasmarts?

Stand–alone Data Marts : Data marts that do not interact with other data marts are called
stand–alone data marts. On the other hand, data marts can be integrated to create a data
warehouse.

Multi–source Data Mart : A data mart for which the input data is obtained from multiple
sources is called a multi–source data mart.

Personal Data Mart : A data mart for use by individuals such as Chief Executive Officer
(CEO), Chief Technology Officer (CTO) or Chief Financial Officer (CFO) is called Personal Data
Mart.

Operational Data Store : ODS is a database system that obtains data from different sources,
consolidates it and keeps it at a single location. However, it does not have the sophistication
of a data mart or a data warehouse for analytical processing. ODS is generally developed as
a pre-cursor for ERP systems.
13)What is fact table?
A fact table is the central table that contains the measures or facts of a business process.
14)What is the mapplet?
Mapplet is a set of transformations that you build in the mapplet designer and you can use
in multiple mappings.
15)What is meant by grain?
Granularity of data is a very important factor in the design of fact table. In fact table, we can
represent data at different atomic levels, called grains.
16)What is difference between view and materialized view?
Views contains query whenever execute views it has read from base table.
A view which is physically stored in a database. Materialized views allow query performance
to be enhanced by precalculating costly operations.
17)What is surrogate key?
A system generated artificial primary key that is not derived from any data in the database.
It is similar to the object identifier in an object oriented system.
18)What is a folder?
Folder contains repository objects such as sources, targets, mappings, transformation which
are helps logically organize our data warehouse.
19)What is pivot?
The pivot (or rotate) operation rotates the axes of a cube to provide an alternative
presentation of the data.
20)What is the difference between mapping and session?
Maping : It is a set of source and target definitions linked by transformation objects that
define the rules for transformation.
Session : It is a set of instructions that describe how and when to move data from source to
targets.
21)What is meant bt ETL?
The overall data acquisition process, called ETL (extraction, transformation,and loading), is
generally grouped into three main components :

Extraction : Involves obtaining the required data from the various sources.

Transformation : Source data undergoes a number of operations that prepare it for import
into the data warehouse (target database). To perform this task, integration and
transformation programs are used which can reformat, recalculate, modify structure and
data elements, and add lime elements. They can also perform calculations, summarization,
de-normal-ization, etc.

Loading : Involves physically placing extracted and transformed data in the target database.
The initial loading involves a massive data import into the data warehouse. Subsequently, an
extraction procedure periodically loads fresh data based on business rules and a pre–
determined frequency.

22)What are the types of dimensional schema?
There are two types of dimensional schema :

Star schema

Snowflake schema
23)What is star schema?
Star schema, there is only one central fact table, and a set of dimension tables, one for each
dimension. In star schema, each dimension is represented by only one table, and each table
contains a set of attributes.
24)What is snowflake schema?
A snowflake schema avoids the redundancy of star schemas by normalizing the dimension
tables. Therefore, a dimension is represented by several tables related by referential
integrity constraints.
25)What is meant by starflake schema?
A starflake schema is a combination of the star and the snowflake schemas where some
dimensions are normalized while others are not.
26)What is Operational Data Store?
Operational Data Store (ODS) is a hybrid data architecture to cover the requirements for
both analytical and operational tasks.
27)What is the difference between star schema and snowflake schema?
The star schema consists of a fact table with a single table for each dimension. The
snowflake schema is a variation on the star schema in which the dimensional tables from a
star schema are organized into a hierarchy by normalizing them. A fact constellation is a set
of fact tables that share some dimension tables.
28)What is data staging?
Data staging is the process of transferring the data from the data sources (operational
systems) into the target database of the data warehouse.
29)What is a session?
A session is a set of instructions that describes how and when to move data from sources to
targets.
30)What is mapping?
A mapping is a set of source and target definitions linked by transformation objects that
define the rules for data transformation.
31)What is Datadriven?
The informatica server follows instructions coded into update strategy transformations with
in the session maping determine how to flag records for insert,update,delete or reject. If u
do not choose data driven option setting, the informatica server ignores all update strategy
transformations in the mapping.
32)What are the three major types of metadata in a data warehouse?
Metadata in a data warehouse fall into three major categories :

Operational Metadata

Extraction and Transformation Metadata

End–User Metadata
33)What is OLAP?

Allow users to run complex dimensional queries.

Enable users to generate canned queries.

Two categories of online analytical processing are multidimensional online analytical
processing (MOLAP) and relational online analytical processing (ROLAP).
34)What is meant by geographic information system(GIS)?
A software system that allows users to define, create, maintain, and control access to a
geographic database.
35)What is dimension table?
A relational table that contains dimension data.
36)What is the difference between OLTP and OLAP?
The main differences between OLTP and OLAP are:

OLTP systems are for doing clerical/operational processing of data whereas OLAP systems
are for carrying out analytical processing of the data.

OLTP systems look at data in one dimension; whereas in OLAP systems, data can be viewed
in different dimensions and hence interesting business intelligence can be extracted from
the data.

Operational personnel of an organization use the OLTP systems whereas management uses
OLAP systems, though operational personnel may also use portions of OLAP system.

OLTP systems contain the current data as well as the details of the transactions. OLAP
systems contain historical data, and also data in summarized form.

OLTP database size is smaller as compared to OLAP systems. If the OLTP database occupies
Gigabytes (GB) of storage space, OLAP database occupies Terabytes (TB) of storage space.
37)What is DTM?
DTM transform data received from reader buffer and its moves transformation to
transformation on row by row basis and it uses transformation caches when necessary.
38)What is meant by spatial data warehouse?
A data warehouse that manipulates spatial data, thus allowing spatial analysis. This is to be
contrasted with conventional and temporal data warehouses.
39)What is a Batch?
Batches provide a way to group sessions for either serial or parallel execution by the
Informatica Server.
40)What are the types of batch?
There are two types of batches are :

Sequential batch : Runs sessions one after the other.

Concurrent batch : Runs sessions at the same time.
41)What is Rolap?
Relational OLAP (ROLAP) servers store data in relational databases and support extensions
to SQL and special access methods to efficiently implement the multidimensional data
model and the related operations.
42)What is Molap?
Multidimensional OLAP (MOLAP) servers directly store multidimensional data in special
data structures (for instance, arrays) and implement the OLAP operations over those data
structures. While MOLAP systems offer less storage capacity than ROLAP systems, MOLAP
systems provide better performance when multidimensional data is queried or aggregated.
43)What is Holap?
Hybrid OLAP (HOLAP) servers combine both technologies, benefiting from the storage
capacity of ROLAP and the processing capabilities of MOLAP.
44)Describe Informatica Architecture?
Informatica consist of client and server. Client tools such as Repository manager, Designer,
Server manager. Repository data base contains metadata it read by informatica server used
read data from source, transforming and loading into target.
45)What is fact?
A central component of a multidimensional model that contains the measures to be
analyzed. Facts are related to dimensions.
46)What are the types of dimensional table?
There are three types of dimensional table are :

Conformed dimensional table

Junk dimensional table

Degenerate dimensional model
47)What are the types of fact table?
There are three types of fact table are :

Additive fact

Semi additive fact

Non additive fact.
48)What are the types of data mart?
There are two types of data mart are :

Dependent data mart

Independent data mart
49)What is data source view?
A Data source view (DSV) allows one to define the relational schema that will be used in the
analysis services database. This schema is derived from the schemas of the various data
sources. Cubes and dimensions are created from data source views rather than directly from
data source objects.
50)What are the different threads in DTM process?

Master thread

Maping thread

Pre and post session threads

Reader thread

Writer thread

Transformation thread
51)What is a transformation?
A transformation is a repository object that generates, modifies, or passes data. The
Designer provides a set of transformations that perform specific functions.
52)What are the types of transformations?
There are two types of transformations.

Active

Passive
53)What parameters can be tweaked to get better performance from a session?
DTM shared memory, Index cache memory, Data cache memory, by indexing, using
persistent cache, increasing commit interval etc
54)What are the different types of transformations available in Informatica?

Aggregator

Application Source Qualifier

Custom

Expression

External Procedure

Filter

Input

Joiner

Lookup

Normalizer

Output

Rank

Router

Sequence Generator

Sorter

Source Qualifier

Stored Procedure

Transaction Control

Union

Update Strategy

XML Generator

XML Parser

XML Source Qualifier
55)What is a transformation language?
It is a language which is similar to SQL functions so, one can write expressions to modify the
data or test the data.
56)Is Aggregator a passive/active transformation?
Active
57)What are the constants used in update strategy?

DD_INSERT

DD_UPDATE

DD_DELETE

DD_REJECT
58)What are the benefits of DWH?

Immediate information delivery

Data Integration from across, even outside the organization

Future vision of historical trends

Tools for looking at data in new ways

Enhanced customer service.
59)Explain informatica repository?
The Informatica repository is a relational database that stores information, or metadata,
used by the Informatica Server and Client tools. Metadata can include information such as
mappings describing how to transform source data, sessions indicating when you want the
Informatica Server to perform the transformations, and connect strings for sources and
targets.
60)What is power centre repository?
The Informatica repository is a relational database that stores information, or metadata,
used by the Informatica Server and Client tools. Metadata can include information such as
mappings describing how to transform source data, sessions indicating when you want the
Informatica Server to perform the transformations, and connect strings for sources and
targets.
61)What is power centre repository?
The Power center repository allows you to share metadata across repositories to create a
data mart domain. In a data mart domain, you can create a single global repository to store
metadata used across an enterprise, and a number of local repositories to share the global
metadata as needed.
62)What are the types of Informaica client tools?

Designer

Server Manager

Repository Manager
63)What are the types of designer tools?
The Designer provides the following tools are :

Source Analyzer

Warehouse Designer

Transformation Developer

Mapplet Designer

Mapping Designer
64)What is Aggregator Transformation?

Aggregator transformation is an Active and Connected transformation.

The Aggregator transformation allows you to perform aggregate calculations, such as
averages and sums.

The Aggregator transformation is unlike the Expression transformation, in that you can use
the Aggregator transformation to perform calculations on groups.

The Expression transformation permits you to perform calculations on a row-by-row basis
only.
65)What is filter transformation?

Filter transformation is an Active and Connected transformation.

The Filter transformation provides the means for filtering rows in a mapping. You pass all
the rows from a source transformation through the Filter transformation, and then enter a
filter condition for the transformation.
66)What is Joiner transformation?
Joiner Transformation is an Active and Connected transformation. This can be used to join
two sources coming from two different locations or from same location. The Joiner
transformation to join two sources with at least one matching port. The Joiner
transformation uses a condition that matches one or more pairs of ports between the two
sources.
67)What is clustering?
Clustering is the process of grouping the data into classes or clusters so that objects with in
a cluster have high similarity in comparison to one another, but are very dissimilar to objects
in other clusters.
68)What is difference between aggregator and expression?
Aggregator is active transformation and Expressionis passive transformation. Aggregator
transformation used to perform aggregate calculation on group of records Where as
expression used perform calculation with single record.
69)What are the types of cache?

Static cache

Dynamic cache

Persistent cache

Recache from lookup

Shared cache
70)What are the costly transformation in informatica?

L : Look up

A : Aggregator

R : Rank

J : Joiner
71)What are the transformation having cache concept?

Data Cache - 2GB

Index Cache - 1GB
72)How does the recovery mode work in informatica?
In case of load failure an entry is made in OPB_SERV_ENTRY(?) table from where the extent
of loading can be determined.
73)What is a command that used to run a batch?
pmcmd is used to start a batch.
74)What are the main difference between Data Warehousing and Business Intelligence?
Data Warehousing : is a way of storing data and creating information through leveraging
data marts. DM's are segments or categories of information and/or data that are grouped
together to provide 'information' into that segment or category. DW does not require BI to
work. Reporting tools can generate reports from the DW.
Business Intelligence : is the leveraging of DW to help make business decisions and
recommendations. Information and data rules engines are leveraged here to help make
these decisions along with statistical analysis tools and data mining tools.
75)What are the advantages of Dimensional modelling?

Ease of use

High performance

Predictable,standard framework

Understandable

Extensible to accomodate unexpected new data elements and new design decisions
76)What transformations are used for variable port?

Expression Transformation

Aggregated transformation

Rank Transformation
77)What is the difference between OLTP and OLAP?

OLTP is nothing but OnLine Transaction Processing which contains a normalised tables. But
OLAP Online Analtical Programming contains the history of OLTP data which is non-volatile
acts as a Decisions Support System.

OLTP systems are for doing clerical / operational processing of data whereas OLAP systems
are for carrying out analytical processing of data.

OLAP system contains the current data as well as the details of the transactions.

OLTP system contains historical data, and also data in summarized form.
78)What are the various aggregate calculations?
Aggregate functions are :

AVG

COUNT

FIRST

LAST

MAX

MEDIAN

MIN

PERCENTILE

STDDEV

SUM

VARIANCE
79)What are designer objects?

Source

Target

Transformation

Mapping

Mapplet
80)What are Expression transformation?
Expression transformation is a Passive and Connected transformation. This can be used to
calculate values in a single row before writing to the target.
81)What are the types of joiner transformation?
Joiner transformation supports four types of joins at Informatica level are :

Normal join

Master Outer

Detail Outer

Full Outer
82)What is filter transformation?
Filter transformation is an Active and Connected transformation. This can be used to filter
rows in a mapping that do not meet the condition.
83)What are the tools of workflow manager?

Task developer

Workflow designer

Worklet designer
84)What is router transformation?
Router transformation is an Active and Connected transformation. A Router transformation
is similar to a Filter transformation because both transformations allow you to use a
condition to test data. The only difference is, filter transformation drops the data that do
not meet the condition whereas router has an option to capture the data that do not meet
the condition. It is useful to test multiple conditions.
85)What is lookup transformation?
Lookup transformation is Passive and it can be both Connected and UnConnected as well. It
is used to look up data in a relational table, view, or synonym. Lookup definition can be
imported either from source or from target tables.
86)What is source qualifier transformation?
Source Qualifier transformation is an Active and Connected transformation. When adding a
relational or a flat file source definition to a mapping, it is must to connect it to a Source
Qualifier transformation. The Source Qualifier performs the various tasks such as overriding
default SQL query, filtering records; join data from two or more tables etc.
87)what is update strategy transformation ?
Update strategy transformation is an active and connected transformation. It is used to
update data in target table, either to maintain history of data or recent changes.
88)What is the functionality of Repository manager?
Repository manager is used to navigate through multiple folders and perform the basic
repository tasks.
89)What is normaliser transformation?
Normaliser Transformation is an Active and Connected transformation. It is used mainly
with COBOL sources where most of the time data is stored in renormalized format. Also,
Normaliser transformation can be used to create multiple rows from a single row of data.
90)What is rank transformation?
Rank transformation is an Active and Connected transformation. It is used to select the top
or bottom rank of data.
91)What is sorter transformation?
Sorter transformation is a Connected and an Active transformation. It allows to sort data
either in ascending or descending order according to a specified field.
92)what is stored procedure transformation?
Stored Procedure transformation is an Passive & Connected or Unconnected
transformation. It is useful to automate time-consuming tasks and it is also used in error
handling, to drop and recreate indexes and to determine the space in database, a
specialized calculation.
93)What is the difference between lookup and fact tables?
A lookup table contains information about the entities. In general the Dimension and details
objects are derived from lookup tables.
A fact table contains the statistical information about transactions.
94)What is the functionality of Repository server administration console?
Repository server administration console is used to create and administer the repository
through the repository server.
95)What is the difference between OLTP and ODS?

OLTP is online transaction processing systems and ODS os operational database system. In
OLTP we can save the current data, it depends on the day to day transactions and it stores
the day to day data.

In ODS we can store data for a month also and it is not restricted to a specific day or
transaction.
96)What is sequence generator transformation?
Sequence Generator transformation is a Connected and an Passive transformation. The
Sequence Generator transformation generates numeric values. You can use the Sequence
Generator to create unique primary key values, replace missing primary keys, or cycle
through a sequential range of numbers.
97)what is SCD?
SCD means Slowly Changing Dimension. If the values of attributes in a dimension table
change over a period of time, then these dimensions are called slowly changing dimension.
98)What are the types of SCD?

Type 1 : In the dimensional table, the new data replaces the old data. In other words, the
historical data is not preserved.

Type 2 : New records are added to the dimension table. The old data is retained and the
new records contain the new data.

Type 3 : New fields are added to the dimension table so that the table can hold both old
values and new values in the same record.