1)What are Type 1 and Type 2 dimensions?
Ans)In a Type 1 Slowly Changing Dimension, the new information simply overwrites the
original information. In other words, no history is kept.
In a Type 2 Slowly Changing Dimension, a new record is added to the table to represent
the new information. Therefore, both the original and the new record will be present.
The new record gets its own primary key.
In a Type 3 Slowly Changing Dimension, the original record is modified to reflect the
change: an extra column holds the previous value alongside the current one, so only
limited history is kept.
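The Type 1 and Type 2 behaviours can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customer_dim table, its columns and the sample values are hypothetical and do not come from any particular warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_dim (
        customer_key INTEGER PRIMARY KEY,  -- surrogate key, one per row version
        customer_id  TEXT,                 -- business key (hypothetical)
        state        TEXT
    )
    """
)
conn.execute("INSERT INTO customer_dim (customer_id, state) VALUES ('C1', 'IL')")
conn.execute("INSERT INTO customer_dim (customer_id, state) VALUES ('C2', 'NY')")

# Type 1: overwrite in place, so the old value 'IL' is lost.
conn.execute("UPDATE customer_dim SET state = 'CA' WHERE customer_id = 'C1'")

# Type 2: add a new row with its own primary key; the old row stays.
conn.execute("INSERT INTO customer_dim (customer_id, state) VALUES ('C2', 'TX')")

type1_rows = conn.execute(
    "SELECT state FROM customer_dim WHERE customer_id = 'C1'"
).fetchall()
type2_rows = conn.execute(
    "SELECT customer_key, state FROM customer_dim"
    " WHERE customer_id = 'C2' ORDER BY customer_key"
).fetchall()

print(type1_rows)  # [('CA',)]
print(type2_rows)  # [(2, 'NY'), (3, 'TX')]
```

A real Type 2 dimension usually also carries effective dates or a current-row flag; those are left out here to keep the sketch short.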
2)What is a correlated subquery?
Ans)A correlated subquery is a subquery that contains a reference to a table that also
appears in the outer query.
SELECT * FROM t1 WHERE column1 = ANY
(SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);
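3)What is the difference between a function and a procedure?
Ans)A function must return a value and can be used inside an expression. A procedure
does not return a value, although it can pass results back through OUT parameters.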
4)What is an inline view?
Ans)The inline view is a construct in Oracle SQL where you can place a query in the
SQL FROM clause, just as if the query were a table name.
A common use for inline views in Oracle SQL is to simplify complex queries by
removing join operations and condensing several separate queries into a single query.
Example: The best example of the inline view is the common Oracle DBA script that
is used to show the amount of free space and used space within all Oracle tablespaces.
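As a rough illustration of a query placed in the FROM clause, here is a minimal sketch using Python's sqlite3 module rather than Oracle. The sales table and its data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 10), ("A", 30), ("B", 5), ("B", 15), ("C", 100)],
)

# The parenthesized SELECT in the FROM clause is the inline view; the outer
# query treats its result (aliased as t) as if it were a table.
big_stores = conn.execute(
    """
    SELECT t.store, t.total
    FROM (SELECT store, SUM(amount) AS total
          FROM sales
          GROUP BY store) AS t
    WHERE t.total >= 40
    ORDER BY t.store
    """
).fetchall()

print(big_stores)  # [('A', 40), ('C', 100)]
```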
6)What is Dimension?
Ans)A category of information. For example, the time dimension.
7)What is Attribute?
Ans)A unique level within a dimension. For example, Month is an attribute in the Time
Dimension.
8)What is Hierarchy?
Ans)The specification of levels that represents the relationship between different
attributes within a dimension.
For example, one possible hierarchy in the Time dimension is
Year → Quarter → Month → Day.
9)What is Fact Table?
Ans)A fact table is a table that contains the measures of interest. For example, sales
amount would be such a measure.
This measure is stored in the fact table with the appropriate granularity. For example,
it can be sales amount by store by day. In this case, the fact table would contain three
columns: a date column, a store column, and a sales amount column.
10)What is Lookup Table?
Ans)The lookup table provides the detailed information about the attributes. For
example, the lookup table for the Quarter attribute would include a list of all of the
quarters available in the data warehouse. Each row (each quarter) may have several
fields: one for the unique ID that identifies the quarter, and one or more additional
fields that specify how that particular quarter is represented on a report (for example,
first quarter of 2001 may be represented as "Q1 2001" or "2001 Q1").
A dimensional model includes fact tables and lookup tables. Fact tables connect to
one or more lookup tables, but fact tables do not have direct relationships to one
another. Dimensions and hierarchies are represented by lookup tables. Attributes are
the non-key columns in the lookup tables.
11)What is Star Schema?
Ans)In the star schema design, a single object (the fact table) sits in the middle and is
radially connected to other surrounding objects (dimension lookup tables) like a star.
A star schema can be simple or complex. A simple star consists of one fact table; a
complex star can have more than one fact table.
Its dimension tables are typically denormalized.
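A small star can be sketched with Python's sqlite3 module: one fact table in the middle, joined to two lookup tables. All table names, column names and rows below are hypothetical and only follow the sales-by-store-by-day example used earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE date_lookup  (date_id  INTEGER PRIMARY KEY, quarter TEXT);
    CREATE TABLE store_lookup (store_id INTEGER PRIMARY KEY, region  TEXT);
    -- The fact table holds the measure plus one key per lookup table.
    CREATE TABLE sales_fact (
        date_id      INTEGER REFERENCES date_lookup(date_id),
        store_id     INTEGER REFERENCES store_lookup(store_id),
        sales_amount INTEGER
    );
    INSERT INTO date_lookup  VALUES (1, 'Q1 2001'), (2, 'Q2 2001');
    INSERT INTO store_lookup VALUES (10, 'East'), (20, 'West');
    INSERT INTO sales_fact   VALUES (1, 10, 100), (1, 20, 50), (2, 10, 70), (2, 20, 30);
    """
)

# Each lookup table joins directly to the fact table; the lookup tables
# never join to one another.
by_quarter_region = conn.execute(
    """
    SELECT d.quarter, s.region, SUM(f.sales_amount)
    FROM sales_fact AS f
    JOIN date_lookup  AS d ON d.date_id  = f.date_id
    JOIN store_lookup AS s ON s.store_id = f.store_id
    GROUP BY d.quarter, s.region
    ORDER BY d.quarter, s.region
    """
).fetchall()

print(by_quarter_region)
```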
12)What is Snowflake Schema?
Ans)The snowflake schema is an extension of the star schema, where each point of the
star explodes into more points.
The main advantage claimed for the snowflake schema is improved query performance,
due to minimized disk storage requirements and joins against smaller lookup tables.
The main disadvantage of the snowflake schema is the additional maintenance effort
needed due to the increased number of lookup tables.
Its dimension tables are normalized.
13)What is Granularity?
Ans)The first step in designing a fact table is to determine the granularity of the fact
table. By granularity, we mean the lowest level of information that will be stored in
the fact table. Determining it involves two steps:
1. Determine which dimensions will be included.
2. Determine where along the hierarchy of each dimension the information will be
kept.
14)What are different Types of Facts?
Ans)There are three types of facts:
* Additive: Additive facts are facts that can be summed up through all of the
dimensions in the fact table.
* Semi-Additive: Semi-additive facts are facts that can be summed up for some of the
dimensions in the fact
table, but not the others.
* Non-Additive: Non-additive facts are facts that cannot be summed up for any of the
dimensions present in the
fact table.
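The difference between additive and semi-additive facts can be shown with a small sqlite3 sketch. The account_fact table is hypothetical: deposit is treated as an additive fact, and balance as a semi-additive one that can be summed across accounts but not across days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account_fact (day TEXT, account TEXT, deposit INTEGER, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account_fact VALUES (?, ?, ?, ?)",
    [
        ("Mon", "A", 100, 100),
        ("Tue", "A", 50, 150),
        ("Mon", "B", 200, 200),
        ("Tue", "B", 0, 200),
    ],
)

# Additive: deposits can be summed over every day and every account.
total_deposits = conn.execute(
    "SELECT SUM(deposit) FROM account_fact"
).fetchone()[0]

# Semi-additive: balances can be summed across accounts for a single day...
tuesday_balance = conn.execute(
    "SELECT SUM(balance) FROM account_fact WHERE day = 'Tue'"
).fetchone()[0]

# ...but summing them across days counts the same money more than once.
balance_over_days = conn.execute(
    "SELECT SUM(balance) FROM account_fact"
).fetchone()[0]

print(total_deposits, tuesday_balance, balance_over_days)  # 350 350 650
```

A non-additive fact, such as a ratio or percentage, would give a meaningless total along every dimension.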
15)What are the levels of data modeling?
Ans)There are three levels of data modeling. They are conceptual, logical, and physical.
This section explains the difference among the three, the order in which each one is
created, and how to go from one level to the other.
Conceptual Data Model
Features of conceptual data model include:
* Includes the important entities and the relationships among them.
* No attribute is specified.
* No primary key is specified.
At this level, the data modeler attempts to identify the highest-level relationships
among the different entities.
Logical Data Model
Features of logical data model include:
* Includes all entities and relationships among them.
* All attributes for each entity are specified.
* The primary key for each entity is specified.
* Foreign keys (keys identifying the relationship between different entities) are
specified.
* Normalization occurs at this level.
At this level, the data modeler attempts to describe the data in as much detail as possible,
without regard to how they will be physically implemented in the database.
In data warehousing, it is common for the conceptual data model and the logical data
model to be combined into a single step (deliverable).
The steps for designing the logical data model are as follows:
1. Identify all entities.
2. Specify primary keys for all entities.
3. Find the relationships between different entities.
4. Find all attributes for each entity.
5. Resolve many-to-many relationships.
6. Normalization.
Physical Data Model
Features of physical data model include:
* Specification of all tables and columns.
* Foreign keys are used to identify relationships between tables.
* Denormalization may occur based on user requirements.
* Physical considerations may cause the physical data model to be quite different from
the logical data model.
At this level, the data modeler will specify how the logical data model will be realized in
the database schema.
The steps for physical data model design are as follows:
1. Convert entities into tables.
2. Convert relationships into foreign keys.
3. Convert attributes into columns.
4. Modify the physical data model based on physical constraints / requirements.
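The first three steps can be sketched for a toy model with two entities, Customer and Purchase, linked by a one-to-many relationship. The entity names and columns are hypothetical, and SQLite (through Python's sqlite3 module) stands in for the target database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Step 1: each entity becomes a table.
    -- Step 3: each attribute becomes a column.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        -- Step 2: the relationship becomes a foreign key.
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        amount      INTEGER NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ann');
    INSERT INTO purchase VALUES (100, 1, 25);
    """
)

# A purchase that points at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO purchase VALUES (101, 999, 10)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)  # True
```

Step 4 (indexes, partitioning, denormalization and so on) depends on the target database and is not shown.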
What are the different types of OLAP?
In the OLAP world, there are mainly two different types: Multidimensional OLAP
(MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) refers to
technologies that combine MOLAP and ROLAP.
MOLAP
This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a
multidimensional cube. The storage is not in the relational database, but in proprietary
formats.
Advantages:
* Excellent performance: MOLAP cubes are built for fast data retrieval and are optimal
for slicing and dicing operations.
* Can perform complex calculations: All calculations have been pre-generated when the
cube is created. Hence, complex calculations are not only doable, but they return quickly.
Disadvantages:
* Limited in the amount of data it can handle: Because all calculations are performed
when the cube is built, it is not possible to include a large amount of data in the cube
itself. This is not to say that the data in the cube cannot be derived from a large amount of
data. Indeed, this is possible. But in this case, only summary-level information will be
included in the cube itself.
* Requires additional investment: Cube technology is often proprietary and does not
already exist in the organization. Therefore, adopting MOLAP technology is likely to
require additional investments in human and capital resources.
ROLAP
This methodology relies on manipulating the data stored in the relational database to give
the appearance of traditional OLAP's slicing and dicing functionality. In essence, each
action of slicing and dicing is equivalent to adding a "WHERE" clause in the SQL
statement.
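The idea that each slice or dice adds a WHERE condition can be sketched as follows. The sales table, the ALLOWED_DIMENSIONS set and the slice_total helper are all hypothetical; a real ROLAP tool generates far more elaborate SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (2001, "East", "pen", 10),
        (2001, "West", "pen", 20),
        (2002, "East", "ink", 30),
        (2002, "East", "pen", 40),
    ],
)

# Only these column names may appear in the generated SQL.
ALLOWED_DIMENSIONS = {"year", "region", "product"}


def slice_total(conn, **filters):
    """Return SUM(amount), adding one WHERE condition per chosen dimension."""
    unknown = set(filters) - ALLOWED_DIMENSIONS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    sql = "SELECT SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    return conn.execute(sql, list(filters.values())).fetchone()[0]


print(slice_total(conn))                             # 100 (no slice)
print(slice_total(conn, year=2001))                  # 30  (slice on one dimension)
print(slice_total(conn, year=2002, region="East"))   # 70  (dice on two dimensions)
```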
Advantages:
* Can handle large amounts of data: The data size limitation of ROLAP technology is the
limitation on data size of the underlying relational database. In other words, ROLAP
itself places no limitation on data amount.
* Can leverage functionalities inherent in the relational database: Often, the relational
database already comes with a host of functionalities. ROLAP technologies, since they sit
on top of the relational database, can therefore leverage these functionalities.
Disadvantages:
* Performance can be slow: Because each ROLAP report is essentially a SQL query (or
multiple SQL queries) in the relational database, the query time can be long if the
underlying data size is large.
* Limited by SQL functionalities: Because ROLAP technology mainly relies on
generating SQL statements to query the relational database, and SQL statements do not
fit all needs (for example, it is difficult to perform complex calculations using SQL),
ROLAP technologies are therefore traditionally limited by what SQL can do. ROLAP
vendors have mitigated this risk by building into the tool out-of-the-box complex
functions as well as the ability to allow users to define their own functions.
HOLAP
HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For
summary-type information, HOLAP leverages cube technology for faster performance.
When detail information is needed, HOLAP can "drill through" from the cube into the
underlying relational data.
16)What are stored procedures?
Ans)A stored procedure is a named collection of SQL statements and procedural logic
that is compiled, verified and stored in a server database. It is typically treated like any
other database object. Stored procedures accept input parameters so that a single
procedure can be used over the network by multiple clients using different input data. A
single remote message triggers the execution of a collection of stored SQL statements.
The results are a reduction of network traffic and better performance.
17)What are the differences between stored procedures and triggers?
Ans)Stored procedures are compiled collections of programs or SQL statements that live
in the database. A stored procedure can access and modify data present in many tables,
and it is not associated with any particular database object. Triggers, in contrast, are
event-driven special procedures that are attached to a specific database object, such as
a table. Stored procedures do not run automatically; they have to be called explicitly by
the user. Triggers are executed when the event associated with the trigger is fired.
For example, consider a database with, say, 200 users, where the last-modified timestamp
needs to be updated every time the data is changed. To ensure this, one may define a
trigger on the insert or update event, so that whenever an insert or update on the table
is fired, the trigger is activated and updates the last-modified timestamp column with
the current time.
Thus the main difference is that the logic of a stored procedure is executed on the
database server explicitly at the user's request, whereas a trigger is an event-driven
procedure attached to a database object (namely a table) that fires automatically when
its event occurs.
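The last-modified example can be sketched with a SQLite trigger run through Python's sqlite3 module. SQLite has no stored procedures, so only the trigger side is shown; the users table and the users_touch trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id            INTEGER PRIMARY KEY,
        name          TEXT,
        last_modified TEXT
    );
    -- Fires automatically after any update of the name column; nobody calls it.
    CREATE TRIGGER users_touch AFTER UPDATE OF name ON users
    BEGIN
        UPDATE users SET last_modified = CURRENT_TIMESTAMP WHERE id = NEW.id;
    END;
    INSERT INTO users (id, name) VALUES (1, 'ann');
    """
)

before = conn.execute("SELECT last_modified FROM users WHERE id = 1").fetchone()[0]

# An ordinary UPDATE; the trigger fills in last_modified as a side effect.
conn.execute("UPDATE users SET name = 'Ann' WHERE id = 1")

after = conn.execute("SELECT last_modified FROM users WHERE id = 1").fetchone()[0]

print(before)  # None
print(after)   # a UTC timestamp string set by the trigger
```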
18)What is the difference between Datasource and Database?
Ans)Database means any type of database, such as Oracle, DB2, Teradata, etc.
Data source means the place from which we are retrieving data. A data source may be a
database, Cognos EP series mappings, SAP BW mappings, and so on.
19)What is namespace?
Ans)A namespace is a name given to the Query Objects (QO) retrieved from one data
source, in order to avoid duplicate QO names.
Ex: If we retrieve an employee table from a DB2 database and from an Oracle database,
a name collision occurs between the two tables. In that case we use one namespace for
DB2 and another namespace for Oracle.
What is the meaning of stitched query?
Ans)A stitched query sends two separate queries to the data source and then merges the
results locally.
20)What is the difference between a correlated subquery and a nested subquery?
Ans)A nested (non-correlated) subquery does not reference the outer query. The inner
query is evaluated only once, and the outer query is evaluated from that result.
A correlated subquery references a column of the outer query, so the inner query is
logically evaluated once for each row of the outer query.
Ex. A subquery used with IN() that makes no reference to the outer query is a nested query.
A subquery whose WHERE clause compares its own column with an outer-query column is a
correlated query.
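To make the distinction concrete, here is a minimal sqlite3 sketch in which the nested subquery computes one company-wide average, while the correlated subquery refers to the outer row's department. The emp table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("Ann", "HR", 50), ("Bob", "HR", 70), ("Cy", "IT", 90), ("Di", "IT", 110)],
)

# Nested (non-correlated): the inner query does not mention the outer query,
# so its single result (the overall average, 80) can be computed once.
nested = conn.execute(
    "SELECT name FROM emp"
    " WHERE salary > (SELECT AVG(salary) FROM emp)"
    " ORDER BY name"
).fetchall()

# Correlated: the inner query refers to e.dept from the outer row, so it is
# logically re-evaluated for each outer row (HR average 60, IT average 100).
correlated = conn.execute(
    "SELECT name FROM emp AS e"
    " WHERE salary > (SELECT AVG(salary) FROM emp WHERE dept = e.dept)"
    " ORDER BY name"
).fetchall()

print(nested)      # [('Cy',), ('Di',)]
print(correlated)  # [('Bob',), ('Di',)]
```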
21)What is the main difference between the IN and EXISTS clause in subqueries?
Ans)The main difference between the IN and EXISTS predicates in a subquery is the way in
which the query is conceptually executed.
IN -- The inner query is executed first, and the list of values obtained as its result is
used by the outer query. The inner query is executed only once.
EXISTS -- The first row from the outer query is selected, then the inner query is executed,
and the outer query uses its result to decide whether to keep the row. This inner query
execution repeats once for each outer query row. That is, if ten rows can result from the
outer query, the inner query is executed ten times.
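Both forms can express the same question, for example "which customers have at least one order?". The sketch below uses hypothetical customers and orders tables in SQLite; it shows only that the two forms return the same rows, not how any particular database engine actually executes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (1, 20), (3, 5)])

# IN: the inner query stands alone and yields a list of customer ids.
with_in = conn.execute(
    "SELECT name FROM customers"
    " WHERE id IN (SELECT customer_id FROM orders)"
    " ORDER BY name"
).fetchall()

# EXISTS: the inner query refers to the current outer row (c.id).
with_exists = conn.execute(
    "SELECT name FROM customers AS c"
    " WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)"
    " ORDER BY name"
).fetchall()

print(with_in)      # [('Ann',), ('Cy',)]
print(with_exists)  # [('Ann',), ('Cy',)]
```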
22)What is a subquery?
Ans)A subquery is a query whose return values are used in the filtering conditions of the
main query.
When do you use the WHERE clause and when do you use the HAVING clause?
Ans)WHERE clause: used to filter the records from the table before the GROUP BY clause
is applied.
HAVING clause: used to filter the grouped records after the GROUP BY clause is applied.
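The order of the two filters can be seen in a short sqlite3 sketch with a hypothetical sales table: WHERE drops individual rows before grouping, and HAVING drops whole groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("A", 10), ("A", 30), ("B", 5), ("B", 50), ("C", 8)],
)

# HAVING only: every row is grouped first (A=40, B=55, C=8), then groups
# with a total of 40 or less are dropped.
having_only = conn.execute(
    "SELECT store, SUM(amount) FROM sales"
    " GROUP BY store HAVING SUM(amount) > 40 ORDER BY store"
).fetchall()

# WHERE and HAVING: rows under 10 are removed before grouping (A=40, B=50),
# and the HAVING filter is then applied to the remaining groups.
where_and_having = conn.execute(
    "SELECT store, SUM(amount) FROM sales"
    " WHERE amount >= 10"
    " GROUP BY store HAVING SUM(amount) > 40 ORDER BY store"
).fetchall()

print(having_only)       # [('B', 55)]
print(where_and_having)  # [('B', 50)]
```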
What is the difference between DW and BI?
Ans)There may be a feature film (movie) without a trailer, but there will be no trailer
without a movie. Similarly, data warehousing is the process of extracting a client's
business data, applying business processing to that data according to user needs, and
finally loading the processed data into a database. This database is what we call a
warehouse or data warehouse.
Once the data warehouse is complete, the business user ultimately wants to view the data
(precise, summarized data). As a business person, however, he or she may not know how to
access a database (a technical person can access the database with SQL). This is where
OLAP tools come in, which help that person access the database. We can call these OLAP
tools Business Intelligence tools ("intelligence" in the sense that they generate SQL
queries internally and give report developers many facilities for formatting the data and
presenting it in a highly convenient manner).
So the data warehouse (the movie) is a database, and business intelligence tools (the
trailers) present the content of that database in an efficient manner.