Integrating Relational Data with Netezza’s TwinFin™ Data Warehouse and Analytic Appliance

Copyright
This document is copyrighted and protected by worldwide copyright laws and treaty provisions. No portion of this documentation may be distributed or reproduced by any means, or in any form, without HiT Software's prior written permission.

COPYRIGHT NOTICE: Copyright © 2010 HiT Software, Inc., A BackOffice Associates, LLC Company. All Rights Reserved.

Disclaimer
Information in these HTML documents is subject to change without notice. Although efforts have been made to ensure the accuracy of these documents, HiT Software, Inc. assumes no responsibility for damages incurred directly or indirectly from errors, omissions, and discrepancies between the software and the documents. If you find any problems in the documentation, please report them to HiT Software, Inc.

Trademarks
HiT Software, DBMoto and Ritmo are registered trademarks or trademarks of HiT Software, Inc. All other marks are used for the benefit of their respective owners and HiT Software, Inc. disclaims any interest in such marks.

Contact Information
HiT Software, Inc.
Tel. +1-408-345-4001
Fax +1-408-345-4899
Electronic mail: [email protected]
Web site: www.hitsw.com

DBMoto’s Data Replication Options
DBMoto allows you to replicate data from relational database tables to Netezza’s TwinFin Data Warehouse in the following ways:

Refresh (Snapshot replication)
A one-time complete replication from any major relational database source to Netezza’s TwinFin as a target, according to replication settings and scripts. You can control the timing of the replication, identify the columns to be replicated and add scripts to transform data during replication. Source databases include Oracle, Microsoft SQL Server, IBM DB2 for i, IBM DB2 LUW, Sybase, Informix, and MySQL.
When Netezza TwinFin is used as a target in refresh operations, DBMoto uploads the data to the Netezza database using its Data Loading feature, which provides fast bulk loading of the data. The bulk size can be configured using the "Block Size" option in the DBMoto replication properties, which has a default value of 100,000 records.

Continuous refresh
A regularly scheduled refresh replication as described above. The schedule is defined in the replication settings.

One-way mirroring (Incremental Replication)
A continuous update of replicated tables based on changes to the source database that have been recorded in the database server log. Typically, this involves an initial refresh operation, as described above, to set up the target table. Then you can define the replication settings to check the transaction log on the source database at regular intervals. Any changes found in the log are applied to the Netezza TwinFin solution.
When Netezza TwinFin is used as a target in mirroring operations, DBMoto uploads changed data to the Netezza database using its Data Loading feature, which provides fast bulk loading of the data. The data is then processed in bulk using SQL statements. The bulk size can be configured using the "Block Size" option in the replication properties, which has a default value of 100,000 records.

This document describes how to set up replications using:
  Refresh Mode
  One-way Mirroring Mode

HiT Software, Inc., A Back Office Associates, LLC Company | Copyright 2010

Steps for Replicating Tables Using Refresh Mode
Set up database connections > Define the Replication > Run the Replication

A refresh is a one-time complete replication from source to target, according to replication settings and scripts. You can control the timing of the replication, identify the columns to be replicated and add scripts to transform data during replication.

1. Set Up Database Connections
This document demonstrates the process of setting up the replication using Oracle 10 as a source database and Netezza’s TwinFin Data Warehouse as a target.
1. Make sure you have an Oracle .NET data provider installed and running, with access to the tables you plan to replicate to TwinFin.
2. Install and configure the Netezza ODBC driver as directed by Netezza.
3. Install DBMoto using the setup.exe file provided via download from HiT Software (www.hitsw.com).
4. Start the DBMoto Enterprise Manager. DBMoto provides a default database (Microsoft SQL Server CE) for your DBMoto metadata, all the information that DBMoto needs to store about your replication setup.
5. In the Enterprise Manager tree, expand the metadata node to view the Sources and Targets nodes.
6. Select the Sources node.
7. From the right mouse button menu, choose Add New Connection.
8. In the Source Connection Wizard Select Provider screen, select Oracle in the Database field.
9. In the Provider field, select Oracle .NET Driver.
10. In the Assembly field, browse to find the path to the .NET Assembly Oracle.DataAccess. The filename is Oracle.DataAccess.dll.
NOTE: DBMoto can perform replication from any major database to Netezza TwinFin. Check the HiT Software knowledge base article on data providers for details.
11. Click Next to display the Set Connection String screen.
12. In the Data Source field, type the name or IP address of the Oracle server.
13. Enter your user ID and password for the server. Check that the Oracle user ID you are planning to use has sufficient permissions to complete all operations for a refresh replication in DBMoto. The user ID should have permissions to connect and to select from the tables.
14. Click Next to display the Setup Info screen (used for mirroring replications only).
15. Click Next to display the Select Tables screen.
16. Select the tables that you plan to replicate.
17. Click Next to review the source connection setup.
18. Click Finish to complete the wizard.
Now you need to create the connection to Netezza TwinFin.
19. In the Enterprise Manager tree, select the Targets node.
20. From the right mouse button menu, choose Add New Connection.
21. In the Target Connection Wizard Select Provider screen, select Netezza in the Database field. The Provider field automatically shows Netezza SQL ODBC Driver.
22. Click Next to display the Set Connection String screen.
23. Click in the Connection String field to display … in the Connection String value field.
24. Click … to open the Netezza ODBC Driver Setup dialog.
25. Enter the connection information for your Netezza data source (Server, Database, User Name and Password).
26. Click Test Connection to make sure that the connection to the TwinFin server is functioning. If you are unable to connect, verify that the connection parameters you entered are correct.
27. Click OK to close the dialog.
28. Click Next to display the Setup Info screen (used for mirroring replications only).
29. Click Next to display the Select Tables screen.
30. Select the tables to which you want to replicate source data.
NOTE: It is also possible to create target tables in TwinFin, based on source tables you have previously added.
31. Click Next to review the target connection setup.
32. Click Finish to complete the wizard.

2. Define the Replication
1. In the Enterprise Manager tree, select the Replications node and choose Create Multiple Replications.
2. In the Replication Type screen, type a replication name. All table replications will be created with this name and a suffix that consists of a sequential number.
3. Check the Use Group option to group your replications for optimization purposes.
4. Click Create New to create a group for the replications.
5. In the Group Properties dialog, type a group name and click OK.
6. In the Replication Mode area, make sure that Refresh is selected.
7. Click Next to display the Source Connection screen.
8. Select the Oracle connection you created above.
9. Click Next to display the Target Connection screen.
10. Select your Netezza connection.
11. Click Next to display the Set Replications screen. The Replication List displays a list of possible source table-target table replication pairs, based on information gathered from the source connection. Note that tables in the Target Table column may not yet exist in the target database. By default, all replications in the list are selected for creation. If you proceed from this screen without making any changes, a replication object will be created for each source table-target table replication pair that is displayed. You can deselect a pair (i.e. the replication will not be created) by clicking twice in the checkbox to the left of the Source Table column.
12. Click Next to go to the Scheduling screen.
13. Make sure the Enable Replication option is checked. This is required for the replication to run.
14. Set a start time for the replication. The Start Time field indicates the time at which the Data Replicator will begin considering the replication for execution.
15. Select how you want to run the replication:
  Once only: the replication will run once at the time specified in Start Time.
  Recurrently: the replication will run according to the schedule you set by clicking the Schedule button to open the Scheduler dialog.
16. Click Next to go to the Summary screen.
17. Click Finish to complete the wizard.
If the tables you indicated as target tables do not exist in Netezza TwinFin, DBMoto provides a dialog with the option to create them while setting up the replications.
18. To set the block size for the amount of data to be loaded at any one time (via the Netezza Data Loading feature), in the Enterprise Manager Tree View, select the replication. From the right mouse button menu, choose Replication Properties.
19. In the Preferences tab, Refresh Options section, enter a value in the Block Size field.

3. Run the Replication
If you installed DBReplicator as a service during DBMoto setup, you just need to start the service using the DBMServiceMonitor program (located in the DBMoto install folder or on the Windows Start > Programs > Startup menu). The replication that you have scheduled should start at the specified time. Use the Replication Monitor tab in the Enterprise Manager to track the progress of the replication.

If you would like to set up the DBMoto Replicator as a service:
  From the Windows Desktop Start menu, choose Programs, then HiT Software DBMoto, then Service Installer.
  Manage the service from the DBMServiceMonitor program (located in the DBMoto install folder or on the Windows Start > Programs > Startup menu).
  Use the Replication Monitor tab in the Enterprise Manager to track the progress of the replication.

To run the DBMoto replicator interactively:
  From the Windows Desktop Start menu, choose Programs, then HiT Software DBMoto, then DBReplicator.
  The replication that you have scheduled should start at the specified time.
  Use the Replication Monitor tab in the Enterprise Manager to track the progress of the replication.
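
The refresh described in this section (choose columns, optionally transform each row, then load the data in blocks of a configurable size) can be pictured with a small model. The sketch below is an illustration only and is not DBMoto's implementation: the names refresh, blocks and load_block are hypothetical, rows are modeled as plain Python dictionaries, and the only value taken from this document is the default block size of 100,000 records.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]


def blocks(rows: Iterable[Row], block_size: int) -> Iterator[List[Row]]:
    """Yield successive lists holding at most block_size rows each."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    iterator = iter(rows)
    while True:
        block = list(islice(iterator, block_size))
        if not block:
            return
        yield block


def refresh(
    source_rows: Iterable[Row],
    columns: List[str],
    transform: Callable[[Row], Row],
    load_block: Callable[[List[Row]], None],
    block_size: int = 100_000,  # default quoted in this document
) -> int:
    """Copy every source row to the target and return the number of blocks loaded.

    columns    -- the columns chosen for replication (hypothetical parameter)
    transform  -- stands in for a transformation script (hypothetical parameter)
    load_block -- stands in for one bulk-load call on the target (hypothetical)
    """
    # Keep only the selected columns, then apply the transformation to each row.
    selected = (transform({c: row[c] for c in columns}) for row in source_rows)
    loaded = 0
    for block in blocks(selected, block_size):
        load_block(block)
        loaded += 1
    return loaded
```

With five source rows and a block size of 2, load_block is called three times (2, 2 and 1 rows). In this model a larger block size means fewer load calls at the cost of holding more rows in memory per call.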

Steps for Replicating Tables with One-Way Mirroring (Data Changes Only)
Set up database connections > Create a target table > Define the Replication > Run the Replication

One-way mirroring provides a continuous update of a replicated table based on changes to the source database that have been recorded in the database server log. Typically, this involves an initial refresh operation, as described above, to set up the target table. Then you can define the replication settings to check the transaction log on the source database at regular intervals. Any changes found in the log are applied to the target database.

1. Set Up Database Connections
This document demonstrates the process of setting up the replication using Oracle 10 as a source database and Netezza’s TwinFin Data Warehouse as a target.
1. Make sure you have an Oracle .NET data provider installed and running, with access to the tables you plan to replicate to TwinFin.
2. Install and configure the Netezza ODBC driver as directed by Netezza.
3. Install DBMoto using the setup.exe file provided via download from HiT Software (www.hitsw.com).
4. Start the DBMoto Enterprise Manager. DBMoto provides a default database (Microsoft SQL Server CE) for your DBMoto metadata, all the information that DBMoto needs to store about your replication setup.
5. In the Enterprise Manager tree, expand the metadata node to view the Sources and Targets nodes.
6. Select the Sources node.
7. From the right mouse button menu, choose Add New Connection.
8. In the Source Connection Wizard Select Provider screen, select Oracle in the Database field.
9. In the Provider field, select Oracle .NET Driver.
10. In the Assembly field, browse to find the path to the .NET Assembly Oracle.DataAccess. The filename is Oracle.DataAccess.dll.
NOTE: DBMoto can perform replication from any major database to Netezza TwinFin. Check the HiT Software knowledge base article on data providers for details.
11. Click Next to display the Set Connection String screen.
12. In the Data Source field, type the name or IP address of the Oracle server.
13. Enter your user ID and password for the server. Check that the Oracle user ID you are planning to use has sufficient permissions to complete all operations for a refresh replication in DBMoto. The user ID should have permissions to connect and to select from the tables.
14. Click Next to display the Setup Info screen (used for mirroring replications only).
15. Check the Use Transactional Replication option.
16. Make sure that Use Online Dictionary is selected.
17. Click Next to display the Setup Info screen (used for mirroring replications only).
18. Click Next to display the Select Tables screen.
19. Select the tables that you plan to replicate.
20. Click Next to review the source connection setup.
21. Click Finish to complete the wizard.
Now you need to create the connection to Netezza TwinFin.
22. In the Enterprise Manager tree, select the Targets node.
23. From the right mouse button menu, choose Add New Connection.
24. In the Target Connection Wizard Select Provider screen, select Netezza in the Database field. The Provider field automatically shows Netezza SQL ODBC Driver.
25. Click Next to display the Set Connection String screen.
26. Click in the Connection String field to display … in the Connection String value field.
27. Click … to open the Netezza ODBC Driver Setup dialog.
28. Enter the connection information for your Netezza data source (Server, Database, User Name and Password).
29. Click Test Connection to make sure that the connection to the TwinFin server is functioning. If you are unable to connect, verify that the connection parameters you entered are correct.
30. Click OK to close the dialog.
31. Click Next to display the Setup Info screen (used for mirroring replications only).
32. Click Next to display the Select Tables screen.
33. Select the tables to which you want to replicate source data.
NOTE: It is also possible to create target tables in TwinFin, based on source tables you have previously added.
34. Click Next to review the target connection setup.
35. Click Finish to complete the wizard.

2. Define the Replication
1. In the Enterprise Manager tree, select the Replications node and choose Create Multiple Replications.
2. In the Replication Type screen, type a replication name. All table replications will be created with this name and a suffix that consists of a sequential number.
3. Check the Use Group option to group your replications for optimization purposes.
4. Click Create New to create a group for the replications.
5. In the Group Properties dialog, type a group name and click OK.
6. In the Replication Mode area, make sure that Continuous Mirroring is selected.
7. Click Next to display the Source Connection screen.
8. Select the Oracle connection you created above.
9. Select the appropriate schema (owner) from the drop-down list.
10. Click Next to display the Source Log Info screen.
11. Click Read to retrieve from the database the transaction ID and timestamp from which to begin replication.
12. For now, leave the read interval at 60 seconds, although this number can be adjusted for more or less frequent log checks.
13. Click Next to display the Target Connection screen.
14. Select your Netezza connection.
15. Click Next to display the Set Replications screen. The Replication List displays a list of possible source table-target table replication pairs, based on information gathered from the source connection. Note that tables in the Target Table column may not yet exist in the target database. By default, all replications in the list are selected for creation. If you proceed from this screen without making any changes, a replication object will be created for each source table-target table replication pair that is displayed. You can deselect a pair (i.e. the replication will not be created) by clicking twice in the checkbox to the left of the Source Table column.
16. Click Next to go to the Scheduling screen.
17. Make sure the Enable Replication option is checked. This is required for the replication to run.
18. Set a start time for the replication. The Start Time field indicates the time at which the Data Replicator will begin considering the replication for execution.
19. Select how you want to run the replication:
  Once only: the replication will run once at the time specified in Start Time.
  Recurrently: the replication will run according to the schedule you set by clicking the Schedule button to open the Scheduler dialog.
20. Click Next to go to the Summary screen.
21. Click Finish to complete the wizard.
If the tables you indicated as target tables do not exist in Netezza TwinFin, DBMoto provides a dialog with the option to create them while setting up the replications.

4. Run the Replication
If you installed DBReplicator as a service during DBMoto setup, you just need to start the service using the DBMServiceMonitor program (located in the DBMoto install folder or on the Windows Start > Programs > Startup menu). The replication that you have scheduled should start at the specified time. Use the Replication Monitor tab in the Enterprise Manager to track the progress of the replication.

If you would like to set up the DBMoto Replicator as a service:
  From the Windows Desktop Start menu, choose Programs, then HiT Software DBMoto, then Service Installer.
  Manage the service from the DBMServiceMonitor program (located in the DBMoto install folder or on the Windows Start > Programs > Startup menu).
  Use the Replication Monitor tab in the Enterprise Manager to track the progress of the replication.

To run the DBMoto replicator interactively:
  From the Windows Desktop Start menu, choose Programs, then HiT Software DBMoto, then DBReplicator.
  The replication that you have scheduled should start at the specified time.
  Use the Replication Monitor tab in the Enterprise Manager to track the progress of the replication.
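
The idea behind one-way mirroring (start from a known transaction ID, read the source log at each interval, apply only the newer changes to the target) can be modeled in a few lines. This is a conceptual sketch under stated assumptions, not DBMoto's or Oracle's log format: the Change record, its field names and apply_changes are hypothetical, the target table is modeled as a dictionary keyed by primary key, and an update is assumed to carry the full new row.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Row = Dict[str, object]


@dataclass(frozen=True)
class Change:
    """One logged change on the source table (hypothetical record layout)."""
    txn_id: int                 # position in the source transaction log
    op: str                     # "insert", "update" or "delete"
    key: int                    # primary key of the affected row
    row: Optional[Row] = None   # full new row; None for a delete


def apply_changes(target: Dict[int, Row], log: List[Change], last_txn_id: int) -> int:
    """Apply every change newer than last_txn_id and return the new last ID.

    A replicator would call this once per read interval, passing in the ID
    returned by the previous call so that no change is applied twice.
    """
    pending = sorted(
        (c for c in log if c.txn_id > last_txn_id),
        key=lambda c: c.txn_id,
    )
    for change in pending:
        if change.op == "delete":
            target.pop(change.key, None)
        elif change.op in ("insert", "update"):
            target[change.key] = dict(change.row or {})
        else:
            raise ValueError("unknown operation: " + change.op)
        last_txn_id = change.txn_id
    return last_txn_id
```

Passing the returned ID back in on the next pass is what makes the update incremental: a second call with an unchanged log applies nothing and returns the same ID.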