Informatica Overview

Contents
• Introduction
• Clients
• Server(s)
• Source, Target, Repository
• Connectivity

What is Informatica?
• Allows you to load data into a centralized location, such as a datamart or data warehouse.
• ETL Tool
– Extract data from multiple sources
– Transform the data according to business logic and needs
– Load the transformed data into file and relational targets

Example
Source EMPLOYEE: EMP_ID, EMP_NAME, EMP_CITY, EMP_STATE, EMP_COUNTRY, EMP_DATE_OF_JOINING
Target EMP_DETAILS: EMP_ID, EMP_NAME, EMP_CITY, EMP_STATE, EMP_COUNTRY, YRS_OF_SRV
Transformation: Date of Joining to Yrs of Service (EMP_DATE_OF_JOINING to YRS_OF_SRV)

Data Warehousing
[Diagram: Operational Sources, Extract, Transform, Load, Data Warehouse; Metadata; Developer; End User]

Informatica
• Architecture
• Design Process
• Client Tool Review
– Repository Manager
– Designer
– Server Manager

Informatica’s Architecture
[Diagram: Data Models; Designer, Repository Manager, Server Manager; Sources 1-n; Targets 1-n; PowerPlugs; Repository; Server]

Informatica Design Process
[Diagram labels: Source Def, Target Def, Mapping, Sessions]
1. Create Repository
2. Import Source Definitions
3. Create Target Schema
4. Create Mappings
5. Load Data

Informatica Client
• Repository Manager – View much of the metadata in the Repository.
• Designer – Create source-to-target mappings that contain transformation instructions for the Informatica Server.
• Server Manager – Create, schedule, and monitor sessions. You create a session based on a mapping and schedule it to run on the Informatica Server.
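The EMPLOYEE to EMP_DETAILS example from earlier in this overview can be sketched in plain Python to show what extract, transform, and load mean for a single table. This is only an illustration, not how Informatica runs a mapping: the function names, the in-memory rows, and the "whole years completed" rounding rule are all assumptions made here.

```python
from datetime import date

# Columns copied unchanged from EMPLOYEE to EMP_DETAILS in the example.
PASS_THROUGH = ("EMP_ID", "EMP_NAME", "EMP_CITY", "EMP_STATE", "EMP_COUNTRY")


def years_of_service(joined: date, as_of: date) -> int:
    """Whole years completed between two dates (assumed rounding rule)."""
    years = as_of.year - joined.year
    # Subtract one year if this year's anniversary has not been reached yet.
    if (as_of.month, as_of.day) < (joined.month, joined.day):
        years -= 1
    return years


def transform(row: dict, as_of: date) -> dict:
    """Build one EMP_DETAILS row from one EMPLOYEE row."""
    out = {col: row[col] for col in PASS_THROUGH}
    out["YRS_OF_SRV"] = years_of_service(row["EMP_DATE_OF_JOINING"], as_of)
    return out


def run_etl(source_rows: list, as_of: date) -> list:
    """Extract (iterate the source rows), transform, and load (collect into a list)."""
    return [transform(row, as_of) for row in source_rows]
```

Passing `as_of` explicitly, rather than reading today's date inside the function, keeps the result reproducible for a given load date.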
Informatica Client: Repository Manager

Metadata Repository
• Information about the data mart system
• Catalogs the repository
• Directs the server
• Contains record of user access
• Can be shared
• Can be searched and reported
• Bridged through Metadata Exchange

Repository Manager
[Screenshot: Navigator Window, Analysis Window, Dependency Window, Output Window]

Folder Attributes
• FOLDER OWNER - user who serves as focal point for folder permissions
• PERMISSIONS - rights to read, write, and/or execute objects in a folder
• SHARED - property that allows you to make shortcuts to objects in a folder
• SHORTCUT - a dynamic link to an object stored in a shared folder
• VERSIONS - folder iterations that indicate development stages

Informatica Client: Designer

Designer Workspace
[Screenshot: Open Folder List, Navigator, Workspace, Workbook Tabs, Output Window, Status Bar]

Designer Options
• General – workspace options, reload objects on open, group source definitions
• Tables – columns viewed, column size, object size, object colors
• Format – workspace colors
• Import keys
• Automatic Source Qualifier creation

Informatica Client: Server Manager

Server Manager
[Screenshot: Navigator, Configure Window, Monitor Window, Output Window]

Designer
• Source Analyzer
• Warehouse Designer
• Transformation Developer
• Mapplet Designer
• Mapping Designer

Source Analyzer
• Identify the sources used to build the warehouse.
• Create repository definitions for these sources.

Analyzing Sources
• Relational – Oracle, Sybase, Informix, IBM DB2, Microsoft SQL Server, and Teradata
• File – Fixed and delimited flat file, COBOL file, and XML
• Other – Microsoft Excel, Microsoft Access
• Extended – PeopleSoft, SAP R/3, Siebel, and IBM MQSeries (need to purchase additional products for these sources)
• Mainframe – Need to purchase additional products.
Warehouse Designer
• Create relational tables in Target database
• Edit target definitions
• Preview relational target data

Targets
• Relational – Oracle, Sybase, Sybase IQ, Informix, IBM DB2, Microsoft SQL Server, and Teradata
• File – Fixed and delimited flat files and XML
• Extended – SAP BW, IBM MQSeries (need to purchase additional products for these targets)
• Other – Microsoft Access

Mixing Sources and Targets
• You can combine data from different platforms and source types.
[Diagram: Oracle, Sybase, Flat File]

Transformation Developer
• Generates, modifies, and passes data through ports
• 12 objects for transforming data

Transformation Types
• Source Qualifier – represents all data queried from the source
• Normalizer – normalizes records from VSAM or relational sources
• Expression – performs simple calculations
• Filter – serves as a conditional filter
• Aggregator – performs aggregate calculations
• Rank – limits records to top or bottom range

Transformation Types (contd.)
• Update Strategy – allows for logic to insert, update, delete, or reject data
• Lookup – looks up values and passes to other objects
• Stored Procedure – calls a stored procedure and captures return values
• External Procedure – calls a procedure defined in a shared library
• Sequence Generator – generates unique ID values
• Joiner – allows for heterogeneous joins

Source Qualifier Transformation
• Represents records that the Informatica Server reads when it runs a session
• Automatically attached when a Source is added to a mapping
Use a Source Qualifier to:
• Filter records when the Informatica Server reads source data
• Specify sorted ports (ORDER BY clause)
• Select only distinct values from a source
• Create a custom query for the Informatica Server to read source data

Expression Transformation
• Calculate values in a single row – adjust employee salaries, concatenate first and last names, convert string to number…
• Perform any non-aggregate calculations
• Test conditional statements before you output to target

Example
Source EMPLOYEE: EMP_ID, EMP_NAME, ROLE_CODE, BASIC_SALARY
Target EMP_SALARY: EMP_ID, EMP_NAME, ROLE_CODE, GROSS_SALARY
Expression: Gross Salary = Basic Salary * 3.5

Aggregator Transformation
• Allows you to perform aggregate calculations, such as averages and sums
• While the Expression works on a row-by-row basis, the Aggregator can perform calculations on groups

Example
Source REVENUE: PU_CODE, PROJECT_CODE, REVENUE
Target PU_REVNUE: PU_CODE, MAX_REVENUE, MIN_REVENUE, AVG_REVENUE
Aggregator: Max Revenue = Max(Revenue), Min Revenue = Min(Revenue), Avg Revenue = Avg(Revenue)

Filter Transformation
• Provides the means for filtering rows in a mapping – e.g. employees who are currently working in the project “NML” of “WENA” as “SE”
• Only rows that meet the condition pass through the mapping.

Filter Transformation
• All ports are input/output
• Returns TRUE or FALSE for each row passed through the mapping, based on the condition
• Discarded rows do not appear in the session log or reject files
• The input ports must come from only one transformation

Filter vs Source Qualifier (SQ)
• SQ provides better performance
• SQ only lets you filter rows from relational sources; the Filter Transformation filters rows from any source
• SQ only uses standard SQL; Filter can use any statement or function that returns True/False

Example
Source EMPLOYEE: EMP_ID, EMP_NAME, PROJECT_CODE, PU_CODE, ROLE_CODE
Target NM_EMP_DETAILS: EMP_ID, EMP_NAME, PROJECT_CODE, PU_CODE, ROLE_CODE
Filter: Where Project = “NML” and PU = “WENA” and Role = “SE”

Router Transformation
• Groups data into many groups
• Routes rows of data that do not meet any condition to a default group
• Can enter any expression that returns a single value
• Condition returns True or False for each row
• If the condition evaluates to NULL, the row is treated as FALSE

Router Transformation
• One group can be connected to one transformation or target
• One output port can be connected to multiple transformations or targets
• Multiple output ports in one group can be connected to multiple transformations or targets
• CANNOT connect more than one group to one transformation or target

Lookup Transformation
• Looks up data in a relational table
– Can be the Source, Target, or any database that the Informatica Client and Server can connect to
– Lookup table can be a single table or can join multiple tables
• Lookups can:
– Get a related value (your source includes Employee_ID and you want Employee_Name)
– Perform a calculation
– Update a slowly changing dimension table (check if records exist on a target)

Lookup Transformation
• For each input row, the Informatica Server queries the lookup table based on the lookup ports and the condition in the transformation
– The Informatica Server can return values from that lookup (static cache), OR
– The Informatica Server inserts a row into the cache to flag rows as new or existing (dynamic cache)

Connected and Unconnected Lookup Transformations
CONNECTED LOOKUP
• Part of the mapping data flow
• Returns multiple values (by linking output ports to another transformation)
• Executed for every record passing through the transformation
• More visible, shows where the lookup values are used
• Default values are used
UNCONNECTED LOOKUP
• Separate from the mapping data flow
• Returns one value (by checking the Return (R) port option for the output port that provides the return value)
• Only executed when the lookup function is called
• Less visible, as the lookup is called from an expression within another transformation
• Default values are ignored

Example
Source EMPLOYEE_PROJECT: EMP_ID, EMP_NAME, PROJECT_CODE
Lookup table PROJECT: PROJECT_CODE, PROJECT_DESC
Target NM_EMP_DETAILS: EMP_ID, EMP_NAME, PROJECT_CODE, PROJECT_DESC
Lookup: Get PROJECT.PROJECT_DESC where PROJECT.PROJECT_CODE = NM_EMP_DETAILS.PROJECT_CODE

Update Strategy
Two ways of doing it:
• Within a Session
• Within a Mapping

Update Strategy
• Within a Session – Instruct the Informatica Server how to treat the rows when the session is configured
• Within a Mapping – Use the Update Strategy transformation to flag records for insert, delete, update, or reject.
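One way to picture a mapping-level update strategy is as a function that tags each row with an operation flag, using the DD_ constants Informatica defines for insert, update, delete, and reject. The Python sketch below is an illustration only: the `RowFlag` enum, the `flag_row` function, and the rule it applies (reject rows without a key, update rows whose key already exists in the target, insert the rest) are assumptions chosen for the example, with the set of existing keys standing in for a lookup against the target.

```python
from enum import IntEnum


class RowFlag(IntEnum):
    """Row-level operation flags, named after Informatica's DD_ constants."""
    DD_INSERT = 0
    DD_UPDATE = 1
    DD_DELETE = 2
    DD_REJECT = 3


def flag_row(row: dict, existing_ids: set) -> RowFlag:
    """Hypothetical rule: decide what should happen to one source row.

    existing_ids stands in for a lookup on the target table's keys.
    """
    emp_id = row.get("EMP_ID")
    if emp_id is None:
        # A row without a key cannot be matched, so it is rejected.
        return RowFlag.DD_REJECT
    if emp_id in existing_ids:
        # The key already exists in the target, so the row is an update.
        return RowFlag.DD_UPDATE
    # Otherwise the row is new and should be inserted.
    return RowFlag.DD_INSERT
```

For instance, `flag_row({"EMP_ID": 7}, {7, 8})` yields `RowFlag.DD_UPDATE`, while a row whose key is absent from the set is flagged `RowFlag.DD_INSERT`.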
Constants for each Database Operation
Operation   Constant    Numeric Value
Insert      DD_INSERT   0
Update      DD_UPDATE   1
Delete      DD_DELETE   2
Reject      DD_REJECT   3

Joiner Transformation
• Active Transformation
• Join two flat files
• Join two tables from different databases
• Join a flat file with a relational table

Transformation Overview
Three views:
• Iconized View -- shows transformation in relation to mapping
• Normal View -- shows data flow through transformation
• Edit View -- shows transformation properties and allows for editing

Transformation Overview
• Normal view shows data flow through the transformation
• Data passes through I/O ports unchanged
• DATE_ENTERED passes into the transformation through an input port. It is used in the MTH port to extract the month. The month is passed through the MTH output port to another transformation.

Transformation Overview
• Edit view provides flexibility in defining transformation rules
• Define port level handling
– Switch between transformations
– Enter comments
– Make reusable
• Define transformation level properties

Transformations and Expressions
• Calculation or conditional statement
• Used in Expression, Aggregator, Rank, Filter, Update Strategy
• Performs calculation based on ports, functions, operators, variables, literals, constants, and return values from other transformations

Mapplets
• Reusable object
• Include multiple transformations
• Include Source definitions
• Multiple groups of output ports

Mapping
• Move and transform data from sources to targets
• Includes source definitions, target definitions, and transformations.
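A mapping, as described above, chains a source through transformations into a target. The sketch below models that flow in Python with generators, reusing the gross-salary expression from the earlier EMPLOYEE to EMP_SALARY example. It is a conceptual illustration rather than Informatica behaviour: the function names are hypothetical, and the filter on ROLE_CODE = "SE" is an assumption added to show two transformations in sequence.

```python
from typing import Callable, Iterable, Iterator

Row = dict


def filter_rows(rows: Iterable[Row], condition: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter step: pass through only rows for which the condition is true."""
    for row in rows:
        if condition(row):
            yield row


def expression(rows: Iterable[Row], compute: Callable[[Row], Row]) -> Iterator[Row]:
    """Expression step: add row-level calculated ports to each row."""
    for row in rows:
        yield {**row, **compute(row)}


def run_mapping(source_rows: Iterable[Row]) -> list:
    """Source -> Filter -> Expression -> Target (an in-memory list here)."""
    kept = filter_rows(source_rows, lambda r: r["ROLE_CODE"] == "SE")
    with_gross = expression(
        kept, lambda r: {"GROSS_SALARY": r["BASIC_SALARY"] * 3.5}
    )
    # Only the target's columns are loaded; BASIC_SALARY is dropped.
    target_cols = ("EMP_ID", "EMP_NAME", "ROLE_CODE", "GROSS_SALARY")
    return [{col: row[col] for col in target_cols} for row in with_gross]
```

Because each step is a generator, rows stream through one at a time, which loosely mirrors data passing from port to port between transformations.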
[Diagram: Source, Transformations, Target within a Mapping]

Mapping Designer
[Screenshot: Transformation Toolbar, Mapping List, Iconized, Status Bar]

Validation
Three different levels of validation:
• Connection validation
• Expression validation
• Mapping validation

Connection Validation
Invalid connections include:
• Connecting ports with mismatched datatypes
• Connecting output ports to a source
• Connecting a source to anything but a Source Qualifier or Normalizer
• Connecting an output to an output, or an input to an input
• Connecting more than one active transformation to another transformation
• Copying columns to a target definition

Expression Validation
• Parse the current expression, with remote port searching (references to a port in another transformation are resolved)
• Parse expression attributes such as filter condition, lookup condition, SQL Query, etc.
• Parse default values

Mapping Validation
Mapping validation will take place with menu commands:
• Mapping | Validate
• Repository | Save
Mapping validation will:
• Perform connection validation
• Perform expression validation
• Check the mapping flow validation
– Data from Source Qualifier mapped to a target
– Targets are connected to transformations

Informatica Server
• Reads information from the Repository
• Extracts data from the Sources and stores the data in memory while it applies the transformation rules you created
• Loads the transformed data into the mapping targets

Transformation Process
[Diagram: Repository; Session Metadata (Source Def, Mapping, Target Def); Server Manager; Server; Sources; Targets]
Session information:
• source information
• target information
• mapping
• scheduling
• error handling
• pre- / post-session scripts
• tuning parameters
• output log information
• transformation overrides

Definitions
• Session - A set of instructions that tells the Informatica Server how and when to move data from sources to targets
• Batch - A group of Sessions which are to be run together

Server Manager
[Screenshot: Navigator, Configure Window, Monitor Window, Output Window]

Process
• Configure server
• Create session
• Run session
• Monitor session
• View logs
• Tune session

Server Configuration
Server Variables
• Establish default directories for files and caches.
• Variables are server specific.
• Allows for easy deployment.
• Can be overridden at session level.
• Changing the variable updates sessions.
• Directories must exist prior to session launch.

Server Output Control
[Diagram: .dat, .perf, .idx, Target Data, E-mail, .bad, .log, Error Log, Event Log]

Source Settings - Session Wizard
• Source – Select source type: File, Relational, Heterogeneous
• Source Database (Source Type: Relational) – Database connection
• Treat rows as (Source Type: Relational, File) – Tells server how to treat source rows: Insert, update, delete, data driven. Works in conjunction with Target Options.
• Source Options...
– Source Type: Relational – Database name
– Source Type: File – fixed / delimited properties, file list, FTP properties

Target Settings - Session Wizard
• Target – Select target type: File, Relational
• Target Database (Target Type: Relational) – Database connection
• Flat File Options – File properties, FTP properties, Loader properties
• Target Options (Target Type: Relational) – Tell server how to load target: Insert, Update, Delete, Truncate target; Bulk, Normal, Test

Launching Sessions
• Manual start - Manually launch a session from within the Server Manager
• Session Schedule - Schedule a session using business cycle start, stop, and repeat intervals
• Batching - Use batches to run sessions concurrently or sequentially
• Event based - Configure a session to launch based upon the appearance of an indicator file
• Command Line - Launch a session from the command line prompt

After the Session Launches...
• Poll/Refresh Session Status
• View Session Details
• View Performance Monitor
• View Logs
• Tune the Session
• Correct Session Problems

Monitor Session
• Select Server Requests | Session Details
• Number of rows loaded/failed
• Read/Write throughput
• Most current Server message
• Audit trail in repository or session log

Monitor Session: Performance Monitor
• Select Server Requests | Session Performance Details or open file <sessionname>.perf
• Need to configure session properties to save the performance detail counters
• Helps determine where session performance can be improved

Log Files - Session Wizard
Log Files...
• Log file path and name
• Reject file path and name
• Session log archive options

Connectivity Overview