1) What are Type 1 and Type 2 dimensions?
Ans) In a Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept. In a Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record are present, and the new record gets its own primary key. In a Type 3 Slowly Changing Dimension, the original record is modified to reflect the change, typically by keeping the previous value in an additional column.

2) What is a correlated subquery?
Ans) A correlated subquery is a subquery that contains a reference to a table that also appears in the outer query.
SELECT * FROM t1 WHERE column1 = ANY (SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);

3) What is the difference between a function and a procedure?
Ans) A function returns a value and can be used inside an expression. A procedure does not return a value, although it can pass results back through output parameters.

4) What is an inline view?
Ans) An inline view is a construct in Oracle SQL where you place a query in the FROM clause, just as if the query were a table name.

5) What is a common use for inline views?
Ans) A common use for inline views in Oracle SQL is to simplify complex queries by removing join operations and condensing several separate queries into a single query. Example: a common Oracle DBA script uses an inline view to show the amount of free space and used space within all Oracle tablespaces.

6) What is a dimension?
Ans) A category of information. For example, the Time dimension.

7) What is an attribute?
Ans) A unique level within a dimension. For example, Month is an attribute in the Time dimension.

8) What is a hierarchy?
Ans) The specification of levels that represents the relationship between different attributes within a dimension. For example, one possible hierarchy in the Time dimension is Year → Quarter → Month → Day.

9) What is a fact table?
Ans) A fact table is a table that contains the measures of interest. For example, sales amount would be such a measure. This measure is stored in the fact table with the appropriate granularity.
For example, it can be sales amount by store by day. In this case, the fact table would contain three columns: a date column, a store column, and a sales amount column.

10) What is a lookup table?
Ans) The lookup table provides the detailed information about the attributes. For example, the lookup table for the Quarter attribute would include a list of all of the quarters available in the data warehouse. Each row (each quarter) may have several fields: one for the unique ID that identifies the quarter, and one or more additional fields that specify how that particular quarter is represented on a report (for example, the first quarter of 2001 may be represented as "Q1 2001" or "2001 Q1").
A dimensional model includes fact tables and lookup tables. Fact tables connect to one or more lookup tables, but fact tables do not have direct relationships to one another. Dimensions and hierarchies are represented by lookup tables. Attributes are the non-key columns in the lookup tables.

11) What is a star schema?
Ans) In the star schema design, a single object (the fact table) sits in the middle and is radially connected to other surrounding objects (dimension lookup tables) like a star. A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table. Its lookup tables are typically denormalized.

12) What is a snowflake schema?
Ans) The snowflake schema is an extension of the star schema, where each point of the star explodes into more points. The main advantage of the snowflake schema is reduced disk storage, which can also improve query performance when the lookup tables being joined are smaller. The main disadvantage of the snowflake schema is the additional maintenance effort needed due to the increased number of lookup tables. Its lookup tables are normalized.

13) What is granularity?
Ans) The first step in designing a fact table is to determine the granularity of the fact table.
By granularity, we mean the lowest level of information that will be stored in the fact table. Determining it involves two steps:
1. Determine which dimensions will be included.
2. Determine where along the hierarchy of each dimension the information will be kept.

14) What are the different types of facts?
Ans) There are three types of facts:
* Additive: Additive facts are facts that can be summed up through all of the dimensions in the fact table.
* Semi-additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.
* Non-additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.

15) What are the levels of data modeling?
Ans) There are three levels of data modeling: conceptual, logical, and physical. This section explains the differences among the three, the order in which each one is created, and how to go from one level to the next.

Conceptual Data Model
Features of the conceptual data model include:
* Includes the important entities and the relationships among them.
* No attribute is specified.
* No primary key is specified.
At this level, the data modeler attempts to identify the highest-level relationships among the different entities.

Logical Data Model
Features of the logical data model include:
* Includes all entities and the relationships among them.
* All attributes for each entity are specified.
* The primary key for each entity is specified.
* Foreign keys (keys identifying the relationship between different entities) are specified.
* Normalization occurs at this level.
At this level, the data modeler attempts to describe the data in as much detail as possible, without regard to how it will be physically implemented in the database. In data warehousing, it is common for the conceptual data model and the logical data model to be combined into a single step (deliverable).
The steps for designing the logical data model are as follows:
1. Identify all entities.
2. Specify primary keys for all entities.
3. Find the relationships between different entities.
4. Find all attributes for each entity.
5. Resolve many-to-many relationships.
6. Normalize.

Physical Data Model
Features of the physical data model include:
* Specification of all tables and columns.
* Foreign keys are used to identify relationships between tables.
* Denormalization may occur based on user requirements.
* Physical considerations may cause the physical data model to be quite different from the logical data model.
At this level, the data modeler specifies how the logical data model will be realized in the database schema.
The steps for physical data model design are as follows:
1. Convert entities into tables.
2. Convert relationships into foreign keys.
3. Convert attributes into columns.
4. Modify the physical data model based on physical constraints and requirements.

What are the different types of OLAP?
Ans) In the OLAP world, there are mainly two different types: Multidimensional OLAP (MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) refers to technologies that combine MOLAP and ROLAP.

MOLAP
This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats.
Advantages:
* Excellent performance: MOLAP cubes are built for fast data retrieval and are optimal for slicing and dicing operations.
* Can perform complex calculations: All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly.
Disadvantages:
* Limited in the amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible.
But in this case, only summary-level information will be included in the cube itself.
* Requires additional investment: Cube technology is often proprietary and does not already exist in the organization. Therefore, adopting MOLAP technology is likely to require additional investment in human and capital resources.

ROLAP
This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP's slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a "WHERE" clause to the SQL statement.
Advantages:
* Can handle large amounts of data: The data size limitation of ROLAP technology is the data size limitation of the underlying relational database. In other words, ROLAP itself places no limitation on the amount of data.
* Can leverage functionalities inherent in the relational database: A relational database often already comes with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities.
Disadvantages:
* Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple SQL queries) against the relational database, the query time can be long if the underlying data size is large.
* Limited by SQL functionalities: Because ROLAP technology mainly relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL), ROLAP technologies are traditionally limited by what SQL can do. ROLAP vendors have mitigated this risk by building complex out-of-the-box functions into the tool and by allowing users to define their own functions.

HOLAP
HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance.
When detail information is needed, HOLAP can "drill through" from the cube into the underlying relational data.

16) What are stored procedures?
Ans) A stored procedure is a named collection of SQL statements and procedural logic that is compiled, verified, and stored in a server database. It is typically treated like any other database object. Stored procedures accept input parameters, so a single procedure can be used over the network by multiple clients with different input data. A single remote message triggers the execution of a collection of stored SQL statements. The result is a reduction in network traffic and better performance.

17) What are the differences between stored procedures and triggers?
Ans) Stored procedures are compiled collections of programs or SQL statements that live in the database. A stored procedure can access and modify data in many tables, and it is not associated with any particular database object. Triggers, in contrast, are event-driven special procedures that are attached to a specific database object, such as a table. Stored procedures are not run automatically; they have to be called explicitly by the user. Triggers are executed automatically when the event they are associated with fires. For example, consider a database with, say, 200 users in which a last-modified timestamp must be updated every time the data is changed. To ensure this, one may define a trigger on the insert or update event of the table, so that whenever an insert or update occurs, the corresponding trigger is activated and sets the last-modified timestamp column to the current time.
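The last-modified timestamp example above can be sketched as follows. This is a minimal illustration, not a production design: it uses Python's built-in sqlite3 module with an in-memory database, and the `accounts` table, its columns, the trigger names, and the `deposit` helper are all hypothetical. SQLite has no stored procedures, so the explicitly called `deposit` function only stands in for one.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a table and two timestamp triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE accounts (
            id            INTEGER PRIMARY KEY,
            owner         TEXT NOT NULL,
            balance       REAL NOT NULL,
            last_modified TEXT
        );

        -- Fires automatically after every INSERT on accounts.
        CREATE TRIGGER accounts_touch_insert
        AFTER INSERT ON accounts
        BEGIN
            UPDATE accounts SET last_modified = datetime('now')
            WHERE id = NEW.id;
        END;

        -- Fires automatically after owner or balance is updated.
        -- Restricting it to those columns keeps the trigger's own
        -- UPDATE of last_modified from firing it again.
        CREATE TRIGGER accounts_touch_update
        AFTER UPDATE OF owner, balance ON accounts
        BEGIN
            UPDATE accounts SET last_modified = datetime('now')
            WHERE id = NEW.id;
        END;
    """)
    return conn


def deposit(conn, account_id, amount):
    """Stand-in for a stored procedure: runs only when called explicitly."""
    conn.execute(
        "UPDATE accounts SET balance = balance + ? WHERE id = ?",
        (amount, account_id),
    )


if __name__ == "__main__":
    conn = build_db()
    conn.execute("INSERT INTO accounts (owner, balance) VALUES (?, ?)",
                 ("alice", 100.0))
    deposit(conn, 1, 25.0)  # explicit call; the trigger then fires on its own
    print(conn.execute("SELECT balance, last_modified FROM accounts").fetchall())
```

Nothing in `deposit` mentions the timestamp: the caller invokes the function explicitly, while the database fires the trigger without being asked.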
Thus the main difference between a stored procedure and a trigger is that a stored procedure's program logic is executed on the database server explicitly at the user's request, whereas a trigger is an event-driven procedure attached to a database object (namely a table) that fires automatically when its event occurs.

18) What is the difference between a data source and a database?
Ans) A database means any type of database, such as Oracle, DB2, Teradata, and so on. A data source means the place from which we are retrieving data; a data source may be a database, Cognos EP series mappings, SAP BW mappings, and so on.

19) What is a namespace?
Ans) A namespace is a name given to the query objects (QO) retrieved from one data source, to avoid duplicate QO names. Example: if we retrieve an employee table from a DB2 database and from an Oracle database, a name collision occurs between the two tables. In that case we use one namespace for the DB2 database and another namespace for the Oracle database.

What is the meaning of stitched query?
Ans) Stitched queries send two separate queries to the data source and then merge the results locally.

20) What is the difference between a correlated subquery and a nested subquery?
Ans) A nested (non-correlated) subquery is one in which the inner query does not reference the outer query, so it can be evaluated only once, and the outer query is then evaluated from that result. A correlated subquery is one in which the inner query references a column of the outer query, so it is conceptually evaluated once for each row of the outer query. Which kind a subquery is depends on that reference, not on the operator: a subquery used with IN() or with the = operator can be either correlated or nested.

21) What is the main difference between the IN and EXISTS clauses in subqueries?
Ans) The main difference between the IN and EXISTS predicates in a subquery is the way in which the query is conceptually executed.
IN -- The inner query is executed first, and the list of values obtained as its result is used by the outer query. The inner query is executed only once.
EXISTS -- The first row from the outer query is selected, then the inner query is executed, and the outer query uses its result to decide whether to keep that row. This process of inner query execution repeats as many times as there are outer query rows. That is, if the outer query can produce ten rows, the inner query is executed ten times.

22) What is a subquery?
Ans) A subquery is a query whose return values are used in the filtering conditions of the main query.

When do you use the WHERE clause and when do you use the HAVING clause?
Ans) WHERE clause: used to filter the records from the table before the GROUP BY clause is applied. HAVING clause: used to filter the grouped records after the GROUP BY clause is applied.

What is the difference between DW and BI?
Ans) There may be a feature film (movie) without a trailer, but there is no trailer without a movie. Similarly, data warehousing is a concept related to extracting a client's business data, applying business processing to that data according to user needs, and finally loading the processed data into a database. This database is what we call a warehouse or data warehouse. Once the data warehouse is complete, the business user ultimately wants to view the data (precise, summarized data), but as a business person he or she may not know how to access a database (a technical person can access it with SQL). This is where OLAP tools come in: they help that person access the database. We can call these OLAP tools Business Intelligence tools (intelligent in the sense that they generate SQL queries internally and give report developers many facilities for formatting the data and presenting it in a highly convenient manner). So the data warehouse (the movie) is a database, and business intelligence tools (the trailers) present the content of that database in an efficient manner.
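The subquery and filtering answers above (WHERE versus HAVING, IN versus EXISTS) can be tried out with a small sketch. It assumes Python's built-in sqlite3 module and an in-memory database; the `stores` and `sales` tables and their rows are hypothetical sample data, loosely modeled on the "sales amount by store by day" fact table. It shows only which rows each query returns, not how a particular database engine chooses to execute it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stores (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sales  (store_id INTEGER, sale_date TEXT, amount REAL);

    INSERT INTO stores VALUES (1, 'East'), (2, 'West'), (3, 'West');
    INSERT INTO sales VALUES
        (1, '2001-01-05', 100.0),
        (1, '2001-01-06',  40.0),
        (2, '2001-01-05',  30.0),
        (2, '2001-02-01', 500.0);
""")

# WHERE removes individual rows before grouping (the February sale is dropped);
# HAVING then removes whole groups after the per-store totals are computed.
having_rows = conn.execute("""
    SELECT store_id, SUM(amount)
    FROM sales
    WHERE sale_date < '2001-02-01'
    GROUP BY store_id
    HAVING SUM(amount) > 50
    ORDER BY store_id
""").fetchall()

# IN with a non-correlated subquery: the inner query does not mention stores.
in_rows = conn.execute("""
    SELECT store_id
    FROM stores
    WHERE store_id IN (SELECT store_id FROM sales)
    ORDER BY store_id
""").fetchall()

# EXISTS with a correlated subquery: the inner query references s.store_id
# from the outer query.
exists_rows = conn.execute("""
    SELECT s.store_id
    FROM stores AS s
    WHERE EXISTS (SELECT 1 FROM sales AS f WHERE f.store_id = s.store_id)
    ORDER BY s.store_id
""").fetchall()

print(having_rows)   # [(1, 140.0)]
print(in_rows)       # [(1,), (2,)]
print(exists_rows)   # [(1,), (2,)]
```

Store 2 totals 530.0 over all dates, but the WHERE clause drops its February sale before grouping, so its January total of 30.0 fails the HAVING test. The IN and EXISTS forms return the same stores here; they differ in how the condition is written.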