Teradata Parallel Data Pump
Reference
Release 12.00.00
B035-3021-067A
July 2007
The product or products described in this book are licensed products of Teradata Corporation or its affiliates.
Teradata, BYNET, DBC/1012, DecisionCast, DecisionFlow, DecisionPoint, Eye logo design, InfoWise, Meta Warehouse, MyCommerce,
SeeChain, SeeCommerce, SeeRisk, Teradata Decision Experts, Teradata Source Experts, WebAnalyst, and You’ve Never Seen Your Business Like
This Before are trademarks or registered trademarks of Teradata Corporation or its affiliates.
Adaptec and SCSISelect are trademarks or registered trademarks of Adaptec, Inc.
AMD Opteron and Opteron are trademarks of Advanced Micro Devices, Inc.
BakBone and NetVault are trademarks or registered trademarks of BakBone Software, Inc.
EMC, PowerPath, SRDF, and Symmetrix are registered trademarks of EMC Corporation.
GoldenGate is a trademark of GoldenGate Software, Inc.
Hewlett-Packard and HP are registered trademarks of Hewlett-Packard Company.
Intel, Pentium, and XEON are registered trademarks of Intel Corporation.
IBM, CICS, DB2, MVS, RACF, Tivoli, and VM are registered trademarks of International Business Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
LSI and Engenio are registered trademarks of LSI Corporation.
Microsoft, Active Directory, Windows, Windows NT, and Windows Server are registered trademarks of Microsoft Corporation in the United
States and other countries.
Novell and SUSE are registered trademarks of Novell, Inc., in the United States and other countries.
QLogic and SANbox are trademarks or registered trademarks of QLogic Corporation.
SAS and SAS/C are trademarks or registered trademarks of SAS Institute Inc.
SPARC is a registered trademark of SPARC International, Inc.
Sun Microsystems, Solaris, Sun, and Sun Java are trademarks or registered trademarks of Sun Microsystems, Inc., in the United States and other
countries.
Symantec, NetBackup, and VERITAS are trademarks or registered trademarks of Symantec Corporation or its affiliates in the United States
and other countries.
Unicode is a collective membership mark and a service mark of Unicode, Inc.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other product and company names mentioned herein may be the trademarks of their respective owners.
THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN “AS-IS” BASIS, WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR
NON-INFRINGEMENT. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THE ABOVE EXCLUSION
MAY NOT APPLY TO YOU. IN NO EVENT WILL TERADATA CORPORATION BE LIABLE FOR ANY INDIRECT, DIRECT, SPECIAL, INCIDENTAL,
OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS OR LOST SAVINGS, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
The information contained in this document may contain references or cross-references to features, functions, products, or services that are
not announced or available in your country. Such references do not imply that Teradata Corporation intends to announce such features,
functions, products, or services in your country. Please consult your local Teradata Corporation representative for those features, functions,
products, or services available in your country.
Information contained in this document may contain technical inaccuracies or typographical errors. Information may be changed or updated
without notice. Teradata Corporation may also make improvements or changes in the products or services described in this information at any
time without notice.
To maintain the quality of our products and services, we would like your comments on the accuracy, clarity, organization, and value of this
document. Please e-mail: [email protected]
Any comments or materials (collectively referred to as “Feedback”) sent to Teradata Corporation will be deemed non-confidential. Teradata
Corporation will have no obligation of any kind with respect to Feedback and will be free to use, reproduce, disclose, exhibit, display, transform,
create derivative works of, and distribute the Feedback and derivative works thereof without limitation on a royalty-free basis. Further, Teradata
Corporation will be free to use any ideas, concepts, know-how, or techniques contained in such Feedback for any purpose whatsoever, including
developing, manufacturing, or marketing products or services incorporating Feedback.
Copyright © 1996-2007 by Teradata Corporation. All Rights Reserved.
Preface
Purpose
This book provides information about Teradata TPump (TPump), which is a Teradata® Tools
and Utilities product. Teradata Tools and Utilities is a group of products designed to work
with Teradata Database.
TPump is a data loading utility that helps you maintain (update, delete, insert, and atomic
upsert) the data in your Teradata Database. TPump uses standard Teradata SQL to achieve
moderate to high data loading rates to the Teradata Database. Multiple sessions and
multi-statement requests are typically used to increase throughput.
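As an illustration only, a minimal TPump job script might take the following shape. All object names here (the log table, mydb.accounts, the input file infile.dat) are hypothetical, and the exact command syntax and required options are defined in Chapter 3, "TPump Commands"; this is a sketch of the overall structure, not an example taken from this book.

```sql
.LOGTABLE mydb.tpump_restartlog;     /* restart log for recovery        */
.LOGON tdpid/user,password;
.BEGIN LOAD
   SESSIONS 4                        /* multiple parallel sessions      */
   PACK 20                           /* statements per request          */
   ERRLIMIT 100
   ERRORTABLE mydb.acct_errors;
.LAYOUT acctlayout;
.FIELD acct_no  * INTEGER;
.FIELD balance  * DECIMAL(10,2);
.DML LABEL insacct;
INSERT INTO mydb.accounts (acct_no, balance)
   VALUES (:acct_no, :balance);
.IMPORT INFILE infile.dat
   LAYOUT acctlayout
   APPLY insacct;
.END LOAD;
.LOGOFF;
```

The SESSIONS and PACK options shown above are the mechanisms alluded to in the preceding paragraph: multiple sessions carry requests in parallel, and packing several statements into each multi-statement request raises throughput.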
Audience
This book is intended for use by:
• System and application programmers
• System administrators
Supported Releases
This book supports the following releases:
• Teradata Database 12.00.00
• Teradata Tools and Utilities 12.00.00
• Teradata TPump Version 12.00.00
Note: See “TPump Script Example” on page 72 to verify the Teradata TPump version
number.
To locate detailed supported release information:
1 Go to www.info.teradata.com.
2 Navigate to General Search > Publication Product ID.
3 Enter 3119.
4 Open the version of the Teradata Tools and Utilities Supported Versions spreadsheet associated with this release.
The spreadsheet includes supported Teradata Database versions, platforms, and product
release numbers.
Prerequisites
The following prerequisite knowledge is required for this product:
• Basic computer technology
• SQL and Teradata SQL
• Teradata Database and database management systems
• Teradata utilities that load and retrieve data
Changes to This Book
The following changes were made to this book in support of the current release. Changes are
marked with change bars. For a complete list of changes to the product, see the Release
Definition associated with this release.
Date and Release: July 2007, 12.00.00

Description:
• Updated to Teradata Warehouse 12.00.00, TTU 12.00.00, and TPump 12.00.00. See "Supported Releases" on page 3.
• Extended the text delimiter size and added multi-character delimiters. See the descriptions of the syntax elements "FORMAT" on page 148 and "'c'" on page 149.
• Added query banding feature support. See the Teradata SQL statement "SET QUERY_BAND" on page 30.
• Added a new data-related retryable error code. See "Error Types" on page 193 and "5991" on page 196 of "Table 19: TPump Error Conditions" on page 194.
• Added a note regarding the use of the latency option when using AXSMOD and NPAXSMOD. See "LATENCY" on page 102 and "AXSMOD name" on page 145.
• Updated run-time parameters information. See Table 5 on page 45.
• Corrected an incorrect title for the TPump Log Field. See "Example" on page 39, "TPump Script Example" on page 72, and "Table 13: TPump Statistics" on page 75.
• Added an option to show the version and stop. See "RVERSION" on page 50.
• Added support for multibyte characters in object names when the client session character set is UTF8 or UTF16. See "Rules for Using Chinese and Korean Character Sets" on page 24.
• Added a statement that the BOM is not supported on MVS in data files or AXSMODs using UTF8. See "UTF8 Character Sets" on page 25.
• Documented that TPump does not always place a bad row in the error table. See "ERRLIMIT" on page 98.
• Added Unicode data dictionary support. See "Multibyte Character Sets" on page 65.
• Limitations and characteristics of Teradata Database versions earlier than V2R6.0 documented in Teradata Parallel Data Pump Reference are no longer relevant and have been reworded or removed.
• Updated the NOTIMERPROCESS syntax element in the BEGIN LOAD command. On MVS, there were intermittent abends with SEC6, reason code = 0000FF0E. See "NOTIMERPROCESS" on page 102.
• Added missing Data Conversion Capabilities, Checkpoints, and Multibyte Character Sets information. See "Data Conversion Capabilities" on page 24 and "Character Set Specifications for AXSMODs" on page 65.
• Corrected a problem in which the DBS was unable to handle the 128th DML when the APPLY condition for the 128th DML has the value 1 in the TPump script. See changes in Chapter 3.
• Removed references to the ASF2TR product, which was discontinued effective with 12.00.00.
• Added information regarding the new logon string size limit of 30 bytes. See "Multibyte Character Sets" on page 65.
Additional Information
Additional information that supports this product and Teradata Tools and Utilities is available
at the web sites listed in the table that follows. In the table, mmyx represents the publication
date of a manual, where mm is the month, y is the last digit of the year, and x is an internal
publication code. Match the mmy of a related publication to the date on the cover of this book.
This ensures that the publication selected supports the same release.
Type of Information: Release overview; Late information

Description: Use the Release Definition for the following information:
• Overview of all of the products in the release
• Information received too late to be included in the manuals
• Operating systems and Teradata Database versions that are certified to work with each product
• Version numbers of each product and the documentation for each product
• Information about available training and the support center

Access to Information:
1 Go to www.info.teradata.com.
2 Select the General Search check box.
3 In the Publication Product ID box, type 2029.
4 Click Search.
5 Select the appropriate Release Definition from the search results.
Type of Information: Additional product information

Description: Use the Teradata Information Products Publishing Library site to view or download specific manuals that supply related or additional information to this manual.

Access to Information:
1 Go to www.info.teradata.com.
2 Select the Teradata Data Warehousing check box.
3 Do one of the following:
• For a list of Teradata Tools and Utilities documents, click Teradata Tools and Utilities and then select a release or a specific title.
• Select a link to any of the data warehousing publications categories listed.

Specific books related to Teradata TPump are as follows:
• Teradata Parallel Data Pump Reference, B035-3021-mmyx
• Teradata Tools and Utilities Command Summary, B035-2401-mmyx
Type of Information: CD-ROM images

Description: Access a link to a downloadable CD-ROM image of all customer documentation for this release. Customers are authorized to create CD-ROMs for their use from this image.

Access to Information:
1 Go to www.info.teradata.com.
2 Select the General Search check box.
3 In the Title or Keyword box, type CD-ROM.
4 Click Search.

Type of Information: Ordering information for manuals

Description: Use the Teradata Information Products Publishing Library site to order printed versions of manuals.

Access to Information:
1 Go to www.info.teradata.com.
2 Select the How to Order check box under Print & CD Publications.
3 Follow the ordering instructions.
Type of Information: General information about Teradata

Description: The Teradata home page provides links to numerous sources of information about Teradata. Links include:
• Executive reports, case studies of customer experiences with Teradata, and thought leadership
• Technical information, solutions, and expert advice
• Press releases, mentions, and media resources

Access to Information:
1 Go to Teradata.com.
2 Select a link.
Table of Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3
Purpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3
Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3
Supported Releases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3
Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4
Changes to This Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4
Additional Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5
Chapter 1:
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
TPump Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .15
Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .15
Complementing MultiLoad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .16
TPump Support Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17
What it Does . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .17
How it Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .18
Operating Features and Capabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .21
Operating Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .21
Input Data Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .22
Client Character Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .22
Data Conversion Capabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24
Checkpoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .25
Unicode Character Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .25
Client Character Set/Client Type Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .26
TPump Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
TPump Command Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Teradata SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
The TPump Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .31
Task Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
DML Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
Upsert Feature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
TPump Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
Locks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .33
Access Rights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .33
Fallback vs. Nonfallback Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35
Chapter 2:
Using TPump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .37
Invoking TPump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .37
TPump Support Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .37
File Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41
On IBM Mainframe Client-Based Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .41
On UNIX- and Windows-based Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42
In Interactive Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42
In Batch Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .42
Run-time Parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .44
Examples - Redirection of Inputs and Outputs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .51
Terminating TPump. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .51
Normal Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .51
Abort Termination. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52
After Terminating a TPump Job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52
Restarting and Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .53
Basic TPump Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .53
Protection and Location of TPump Database Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . .53
Reinitializing a TPump Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .55
Recovering an Aborted TPump Job . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .55
Recovering from Script Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56
Programming Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56
TPump Command Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56
Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .58
Using ANSI/SQL DateTime Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .63
Using Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .63
Specifying a Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .64
Using Graphic Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .66
Using Graphic Constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .67
Restrictions and Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .67
Termination Return Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .68
Writing a TPump Job Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .69
Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .69
Script Writing Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .69
Procedure for Writing a Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .71
TPump Script Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .72
Viewing TPump Output. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .74
TPump Statistics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
TPump Options Messages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Logoff/Disconnect Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Monitoring TPump Jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .80
Monitor Interface Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .81
TPump Monitor Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .82
TPump Monitor Macros . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .83
Estimating Space Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Space Calculation Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Chapter 3:
TPump Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Syntax Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
TPump Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
TPump Teradata SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
ACCEPT. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
BEGIN LOAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
DATABASE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
DATEFORM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
DELETE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
DISPLAY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
DML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .113
Serialization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .117
The Basic Upsert Feature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .120
Upsert . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .121
The Atomic Upsert Feature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .122
END LOAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
EXECUTE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
FIELD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
FILLER . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
IF, ELSE, and ENDIF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
IMPORT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
INSERT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
LAYOUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
LOGDATA. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
LOGMECH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
LOGOFF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
LOGON . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
LOGTABLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .168
NAME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .170
PARTITION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .172
ROUTE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .176
RUN FILE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .178
SET . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .180
SYSTEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .182
TABLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .184
UPDATE Statement and Atomic Upsert . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .186
Chapter 4:
Troubleshooting in TPump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .193
Early Error Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .193
Error Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .193
Error Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .194
Reading TPump Error Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .197
TPump Performance Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .200
Chapter 5:
Using INMOD and Notify Exit Routines. . . . . . . . . . . . . . . . . . . . . . . . . . .201
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .201
INMOD Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .201
Notify Exit Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .202
Programming Languages. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .202
Programming Structure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .203
Routine Entry Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .204
The TPump/INMOD Routine Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .205
TPump/Notify Exit Routine Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .206
Rules and Restrictions for Using Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .209
Using INMOD and Notify Exit Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .211
TPump-specific Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .211
TPump/INMOD Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .212
Preparing the INMOD Program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .214
INMOD Input Values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .215
INMOD Output Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .215
Programming INMODs for UNIX-based Clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .216
Compiling and Linking a C INMOD on a UNIX-based Client . . . . . . . . . . . . . . . . . . . .216
Compiling and Linking a C INMOD on MP-RAS and Sun Solaris SPARC . . . . . . . . .217
Compiling and Linking a C INMOD on a Sun Solaris Opteron . . . . . . . . . . . . . . . . . . .218
Compiling and Linking a C INMOD on HP-UX PA RISC . . . . . . . . . . . . . . . . . . . . . . .219
Compiling and Linking a C INMOD on HP-UX Itanium . . . . . . . . . . . . . . . . . . . . . . . .220
Compiling and Linking a C INMOD on an IBM AIX . . . . . . . . . . . . . . . . . . . . . . . . . . .221
Compiling and Linking a C INMOD on a Linux Client . . . . . . . . . . . . . . . . . . . . . . . . .222
Programming INMODs for a Windows Client . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
Compiling and Linking a C INMOD on a Windows Client . . . . . . . . . . . . . . . . . . . . . . 223
Appendix A:
How to Read Syntax Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Syntax Diagram Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Appendix B: TPump Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Simple Script Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Restarted Upsert Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Example Using the TABLE Command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
Appendix C: INMOD and Notify Exit Routine Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
COBOL Pass-Thru INMOD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Assembler INMOD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
PL/I INMOD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
C INMOD - UNIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Sample Notify Exit Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
List of Tables
Table 1: TPump Data Formats. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Table 2: TPump Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Table 3: Supported Teradata SQL Statements in TPump . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Table 4: Comparison of Fallback and Nonfallback Target Tables . . . . . . . . . . . . . . . . . . . . . . 35
Table 5: Run-time Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Table 6: TPump Operators. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Table 7: TPump Conditional Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Table 8: Predefined System Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Table 9: Ways to Either Specify a Character Set or Accept a Default Specification . . . . . . . . 64
Table 10: GRAPHIC Data Types for datadesc option in FIELD or FILLER Statement . . . . . 67
Table 11: Restrictions and Limitations on Operational Features and Functions . . . . . . . . . . 67
Table 12: Termination Return Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Table 13: TPump Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Table 14: Monitor Interface Table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Table 15: TPump Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Table 16: TPump Teradata SQL Statements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Table 17: Events that Create Notifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Table 18: ANSI/SQL DateTime Specifications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Table 19: TPump Error Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Table 20: Acquisition Error Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Table 21: Programming Routines by Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Table 22: TPump-to-INMOD Status Codes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Table 23: INMOD-to-TPump Interface Status Codes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
Table 24: Events Passed to the Notify Exit Routine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Table 25: INMOD Input Return Code Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Table 26: INMOD Output Return Code Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
CHAPTER 1
Overview
This chapter provides an introduction to the Teradata TPump (TPump) utility. Topics
include:
• TPump Utility
• Operating Features and Capabilities
• TPump Commands
• The TPump Task
TPump Utility
The following information provides a general overview of the TPump utility.
Description
TPump is a data loading utility that helps you maintain (update, delete, insert, and atomic
upsert) the data in your Teradata Database. TPump allows you to achieve near-real time data
in your data warehouse.
TPump uses standard Teradata SQL to achieve moderate to high data loading rates to the
Teradata Database. Multiple sessions and multistatement requests are typically used to
increase throughput.
TPump provides an alternative to Teradata MultiLoad for the low volume batch maintenance
of large databases under control of a Teradata system. Instead of updating Teradata Databases
overnight, or in batches throughout the day, TPump updates information in real time,
acquiring data from the client system with low processor utilization. It does this through a
continuous feed of data into the data warehouse, rather than through traditional batch
updates. Continuous updates result in more accurate, timely data.
Unlike most load utilities, TPump uses row hash locks rather than table level locks. This allows
you to run queries while TPump is running. This also means that TPump can be stopped
instantaneously.
TPump provides a dynamic throttling feature that enables it to run “all out” during batch
windows, but within limits when it may impact other business uses of the Teradata Database.
Operators can specify the number of statements run per minute, or may alter throttling
minute-by-minute, if necessary.
TPump’s main attributes are:
• Simple, hassle-free setup – does not require staging of data, intermediary files, or special hardware.
• Efficient, time-saving operation – jobs can continue running in spite of database restarts, dirty data, and network slowdowns. Jobs restart without intervention.
• Flexible data management – accepts an infinite variety of data forms from an infinite number of data sources, including direct feeds from other databases. TPump is also able to transform that data on the fly before sending it to Teradata. SQL statements and conditional logic are usable within the utilities, making it unnecessary to write wrapper jobs around the utilities.
Note: Full tape support is not available for any function in TPump for network-attached
client systems. If you want to import data from a tape, you will need to write a custom access
module that interfaces with the tape device. Refer to the Teradata Tools and Utilities Access
Module Programmer Guide for information about how to write a custom access module.
Complementing MultiLoad
TPump uses MultiLoad-like syntax, which leverages MultiLoad knowledge and power,
provides easy transition from MultiLoad to TPump, and supports the useful upsert feature.
TPump shares much of its command syntax with MultiLoad, which facilitates conversion of
scripts between the two utilities; however, there are substantial differences in how the two
utilities operate.
TPump complements MultiLoad in the following ways:
1 Economies of Scale: MultiLoad has an economy of scale and is not necessarily efficient on very large tables when there are few rows to insert or update. For MultiLoad to be efficient, it must touch more than one row per data block in the Teradata Database. For example, to achieve efficient MultiLoad performance on a table of two billion 65-byte rows, composed of 16KB blocks, more than 0.4% of the table (8,125,000 rows) must be affected. While 0.4% of a table is a small update, eight million records is probably more data than you would want to run through a BTEQ script.

2 Concurrency: MultiLoad is subject to a Teradata Database limit on the maximum number of instances running concurrently. TPump does not impose this limit. In addition, while MultiLoad uses table-level locks, TPump uses row-hash locks, making concurrent updates on the same table possible. Finally, because of the phased nature of MultiLoad, there are potentially inconvenient windows of time when MultiLoad cannot be stopped without losing access to the target tables. In contrast, TPump can always be stopped, and all of its locks dropped, with no ill effect.

3 Resource Consumption: MultiLoad is designed for the highest possible throughput, and uses any database and host resources that help to achieve this capability. There is no way to reduce MultiLoad's resource consumption, even if you are willing to accept a longer run time for your job. TPump, however, has a built-in resource governing facility. This allows the operator to specify how many updates occur (the statement rate) minute by minute, and then change the statement rate while the job continues to run. Thus, this facility can be used to increase the statement rate during windows when TPump is running by itself, but then decrease the statement rate later, if users log on for ad hoc query access.
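The break-even arithmetic in point 1 can be checked directly: MultiLoad touches at least one row per data block only when the number of affected rows exceeds the number of data blocks in the table. A minimal sketch of that calculation follows; the function name is illustrative, and a 16KB block is taken as 16,000 bytes to match the figures quoted above:

```c
/* Minimum number of affected rows for MultiLoad to touch more than
   one row per data block: roughly one row per block, which is simply
   the number of data blocks in the table. */
long long multiload_breakeven_rows(long long table_rows,
                                   long long row_bytes,
                                   long long block_bytes)
{
    long long table_bytes = table_rows * row_bytes;
    return table_bytes / block_bytes;   /* one row per block */
}
```

With the figures above (2,000,000,000 rows of 65 bytes in 16,000-byte blocks), this yields 8,125,000 rows, about 0.41% of the table, matching the 0.4% figure in the text.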
TPump Support Environment
The data-handling functionality of TPump is enhanced by the TPump support environment.
In addition to coordinating activities involved in TPump tasks, it provides facilities for
managing file acquisition, conditional processing, and performing certain Data Manipulation
Language (DML) and Data Definition Language (DDL) activities on the Teradata Database.
The TPump support environment enables an additional level of user control over TPump.
For more information, see “TPump Support Environment” on page 37.
What it Does
Within a single invocation of TPump, one or more distinct TPump tasks can be executed in
series with any TPump support commands.
The TPump task provides the acquisition of data from client files for application to target
tables through INSERT, UPDATE, or DELETE statements that specify the full primary index.
Data is retrieved from the client, and sent as transaction rows to the Teradata Database, which
are immediately applied to the various target tables.
Each TPump task can acquire data from one or many client files with similar or different
layouts. From each source record, one or more INSERT, UPDATE, or DELETE statements can
be generated and directed to any target table.
The following concepts may improve your understanding of TPump.
• The language of TPump commands and statements is used to describe the task you want to accomplish.
• TPump examines all commands and statements for a task, from the BEGIN LOAD command through the END LOAD command, before actually executing the task.
• After all commands and statements involved in a given task have been processed and validated by TPump, the TPump task is executed as described in this and subsequent chapters.
• Optionally, TPump supports data serialization for a given row, which guarantees that if a row insert is immediately followed by a row update, the insert is processed first. This is done by hashing records to a given session.
• TPump supports bulletproof restartability using time-based checkpoints. Frequent checkpoints make restarting easier, but at the expense of additional checkpointing overhead.
• TPump supports upsert logic similar to MultiLoad.
• TPump supports insert/update/delete statements in multiple-record requests.
• TPump uses macros to minimize network overhead. Before TPump begins a load, it sends the statements to the Teradata Database to create equivalent macros for every insert/update/delete statement used in the job script. The execute-macro requests, rather than lengthy text requests, are then executed iteratively during a job run.
• TPump supports interpretive, record-manipulating, and restart features similar to MultiLoad.
• TPump supports error treatment options similar to MultiLoad.
• TPump runs as a single process.
• TPump supports Teradata Database internationalization features such as kanji character sets.
• Up to 600 operations can be packed into a single request for network efficiency. The limit of 600 may vary because the overall limit for a request is one megabyte. TPump assumes that every statement is a one-step or, for fallback, two-step request.
How it Works
TPump is a Teradata utility with functions similar to the MultiLoad utility. MultiLoad edits
Teradata tables by processing inserts, updates, and deletes, and so does TPump. This section
provides insight into the important differences between MultiLoad and TPump. All of the
information in this section is discussed in further detail later in this document, either
explicitly or by implication.
Methods of Operation
MultiLoad performs an update on the Teradata Database in phases. During the first phase of
operation, MultiLoad uses a special database and CLIv2 protocol for efficiently sending
“large” (64 KB) data messages to the RDBMS. The data is stored in a temporary table. During
the second phase of operation, the temporary table is sorted and then changes from it are
“applied” to the various target tables. In this phase, processing is entirely in the RDBMS and
the MultiLoad application on the client waits to see if the job completes successfully.
TPump performs updates on the Teradata Database in a synchronous manner. Changes are
sent in conventional CLIv2 parcels and applied immediately to the target table(s). To improve
its efficiency, TPump builds multiple statement requests and provides the serialize option to
help reduce locking overhead.
Economy of Scale and Performance
MultiLoad performance improves as the volume of changes increases. This is because, in
phase two of MultiLoad, the changes are applied to the target table(s) in a single pass and all
changes for any physical data block are effected using one read and one write of the block.
Furthermore, the temporary table and the sorting process used by MultiLoad are additional
overheads that must be “amortized” through the volume of changes.
TPump, on the other hand, does better on relatively low volumes of changes because there is
no temporary table overhead. TPump becomes expensive for large volumes of data because
multiple updates to a physical data block will most likely result in multiple reads and writes of
the block.
Multiple Statement Requests
The most important technique used by TPump to improve performance over MultiLoad is the
multiple statement request. Placing more statements in a single request is beneficial for two
reasons. First, it reduces network overhead because large messages are more efficient than
small ones. Secondly, (in ROBUST mode) it reduces TPump recovery overhead, which
amounts to one extra database row written for each request. TPump automatically packs
multiple statements into a request based upon the PACK specification in the BEGIN LOAD
command.
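The overhead relationship described above can be made concrete: packing more statements per request divides both the number of messages sent and, in ROBUST mode, the number of restart-log rows written. A sketch of the arithmetic (the function name is illustrative, not part of TPump):

```c
/* Number of requests needed to send a given number of statements at a
   given PACK factor (ceiling division). In ROBUST mode, TPump writes
   one restart-log row per request, so this is also the restart-log
   row count for the job. */
long requests_needed(long statements, long pack)
{
    return (statements + pack - 1) / pack;
}
```

At the maximum PACK of 600, for example, one million statements need 1,667 requests instead of 1,000,000, and the ROBUST-mode recovery overhead shrinks by the same factor.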
Macro Creation
TPump uses macros to efficiently modify tables, rather than using the actual DML
commands. The technique of changing statements into equivalent macros before beginning
the job greatly improves performance. Specifically, the benefits of using macros are:
1 The size of network (and channel) messages sent to the RDBMS by TPump is reduced.

2 RDBMS parsing engine overhead is reduced because the execution plans (or “steps”) for macros are cached and reused. This eliminates “normal” parser handling, where each request sent by TPump is planned and optimized.
Because the space required by macros is negligible, the only issue regarding the macros is
where the macros are placed in the RDBMS. The macros are put into the database that
contains the restart log table or the database specified using the MACRODB keyword in the
BEGIN LOAD command.
Locking and Transactional Logic
In contrast to MultiLoad, TPump uses conventional row hash locking which allows for some
amount of concurrent read and write access to its target tables. At any point TPump can be
stopped and the target tables are fully accessible. Note however, that if TPump is stopped,
depending on the nature of the update process, it may mean that the “relational” integrity of
the data is impaired.
This differs from MultiLoad, which operates as a single logical update to one or more target
tables. Once MultiLoad goes into phase two of its logic, the job is “essentially” irreversible and
the (entire set of) table(s) is locked for write access until it completes.
If TPump operates on rows that have associated “triggers”, the triggers are invoked as
necessary.
Recovery Logic and Overhead
TPump, in “ROBUST mode”, writes one database row in the log restart table for every request
that it issues. This collection of rows in the restart log table can be referred to as the request
log. Because a request is guaranteed by the RDBMS to either completely finish or completely
rollback, the request log will always accurately reflect the completion status of a TPump
import. Thus, the request log overhead for restart logic decreases as the number of statements
packed per request increases.
TPump also allows you to specify a checkpoint interval. During the checkpoint process
TPump flushes all pending changes from the import file to the database and also cleans out
the request log. The larger the checkpoint interval, the larger the request log (and its table) is
going to grow. Upon an unexpected restart, TPump scans the import data source along with
the request log in order to re-execute the statements not found in the request log.
TPump in “SIMPLE” (non-ROBUST) mode provides basic checkpoints. If a restart occurs
between checkpoints, some requests will likely be reprocessed. This is adequate
protection under some circumstances.
During phase one, MultiLoad uses checkpoints so that restarts do not force the job to always
restart from the beginning. During phase two, MultiLoad uses its temporary table as a
repository of all changes to be applied and the RDBMS process of applying the changes
guarantees that no changes are missed or applied more than once.
Serialization of Changes
In certain uses of TPump or MultiLoad it is possible to have multiple changes to one row in
the same job. For instance, the row may be inserted and then updated during the batch job or
it may be updated and then deleted. In any case, the correct ordering of these operations is
obviously very important. MultiLoad automatically guarantees that this ordering of
operations is maintained correctly. By using the serialization feature, TPump can also
guarantee that this ordering of operations is maintained correctly, but it requires some small
amount of scripting work and a small amount of utility overhead.
The use of the serialize option on the BEGIN LOAD command guarantees that TPump will
send each change for a data record of a given key in order. The KEY modifier to the FIELD
command is how a script specifies that a given field is to be part of the serialization key. The
intent of this feature is to allow you to specify the key corresponding to the primary index of
the target table. In fact, the TABLE command automatically qualifies the generated fields with
the KEY modifier when the fields are part of the primary index of the table. If the DML
statements in the TPump script specify more than one target table then it is up to the script
author to make sure that primary indices of all the tables match when using the serialization
feature.
The serialization feature works by hashing each data record based upon its key to determine
which session transmits the record to the RDBMS. Thus the extra overhead in the application
is derived from the mathematical operation of hashing and from the extra amount of
buffering necessary to save data rows when a request is already pending on the session chosen
for transmission.
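The hashing step described above can be sketched as mapping each record's key bytes to a session number. TPump's actual hash function is internal to the utility and is not documented here; the FNV-1a hash below is only a stand-in that shows the essential property: records with equal keys always choose the same session, so their changes are transmitted in order.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative serialization sketch (not TPump's real hash): map the
   key bytes of a record to one of the available sessions. Equal keys
   always map to the same session, preserving per-row ordering. */
unsigned pick_session(const void *key, size_t key_len, unsigned sessions)
{
    const uint8_t *p = (const uint8_t *)key;
    uint32_t h = 2166136261u;              /* FNV-1a offset basis */
    for (size_t i = 0; i < key_len; i++) {
        h ^= p[i];
        h *= 16777619u;                    /* FNV-1a prime */
    }
    return h % sessions;
}
```

Because the session is a pure function of the key, an insert and a later update for the same primary-index value travel on one session and cannot overtake each other.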
The serialization feature greatly reduces the potential frequency of RDBMS deadlock.
Deadlocks can occur when requests for the application happen to affect row(s) that use the
same hash code within the RDBMS. Although deadlocks are handled by the RDBMS and by
TPump correctly, the resolution process is time-consuming and adds additional overhead to
the application because it must re-execute requests that roll back due to deadlock.
In addition to using SERIALIZEON in the BEGIN LOAD command, the SERIALIZEON
keyword can also be specified in the DML command. This lets you turn serialization on for
the fields you specify. For more information on the DML-based serialization feature, refer to
“DML” on page 113.
Dual Database Strategy
The serialization feature is intended to support a variety of other potential customer
applications that go under the general heading dual database. These are applications that in
some way take a “live feed” of inserts, updates, and deletes from another database and apply
them without any preprocessing to a Teradata Database.
Both TPump and MultiLoad are potential parts of the dual database strategy. A dual database
application will generate a DML stream which will be routed to TPump or MultiLoad through
a paramod/inmod specific to the application. The choice between TPump or MultiLoad will
depend on such things as the volume of data (with higher volumes favoring MultiLoad) and
the concurrent access requirements (with greater access requirements favoring TPump).
Resource Usage and Limitations
A feature unique to TPump is the ability to constrain run-time resource usage through the
statement rate feature. TPump gives you control over the rate per minute at which statements
are sent to the RDBMS and the statement rate correlates directly to resource usage on both the
client and in the RDBMS. The statement rate can be controlled in two ways, either
dynamically while the job is running, or it can be scripted into the job with the RATE keyword
on the BEGIN LOAD command. Dynamic control over the statement rate is provided by
updates to a table on the RDBMS.
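The relationship between the statement rate and request pacing on the client reduces to simple arithmetic. The sketch below is illustrative only; TPump's actual scheduling algorithm is internal and not documented here:

```c
/* Approximate pause between requests, in milliseconds, for a given
   statement rate (statements per minute) and PACK factor.
   Illustrative only; not TPump's actual pacing logic. */
long request_interval_ms(long stmts_per_minute, long pack)
{
    long requests_per_minute = (stmts_per_minute + pack - 1) / pack;
    return 60000L / requests_per_minute;
}
```

For example, a rate of 1,200 statements per minute with PACK 20 works out to 60 requests per minute, or roughly one request per second.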
In contrast with TPump, MultiLoad always uses CPU and memory very efficiently. During
phase one (assuming that the RDBMS is not a bottleneck), MultiLoad will probably
bottleneck on the client, consuming significant network or channel resources. During phase
two, MultiLoad uses very significant RDBMS disk, CPU, and memory resources. In fact, the
RDBMS limits the number of concurrent MultiLoad, FastLoad, and FastExport jobs for the
very reason that they are so resource-intensive. TPump has no such RDBMS-imposed
limitation.
Warning:
Although there is no RDBMS-imposed limitation on the number of concurrent TPump jobs, an
excessive number of small jobs causes contention on the Teradata Database system catalogue.
The limit will vary from one installation to another, and each installation should determine its
own capacity for running a multiplicity of TPump jobs to avoid potential deadlocks.
Operating Features and Capabilities
The following section describes the operating modes; input data formats; and client, unicode,
and site-defined character sets for TPump. For specific information on supported operating
systems, refer to Teradata Tools and Utilities 12.00.00 Supported and Certified Versions,
B035-3119-067K. This spreadsheet shows version numbers and platform information for all
Teradata Tools and Utilities release 12.00.00 products and is available
at www.info.teradata.com.
Operating Modes
TPump runs in the following operating modes:
• Interactive – Interactive processing involves the more or less continuous participation of the user.
• Batch – Batch programs process data in discrete groups of previously scheduled operations, typically in a separate operation, rather than interactively or in real time.
Input Data Formats
TPump supports the input data formats on UNIX and Windows platforms as listed in Table 1.
Mainframes have standard records.
Table 1: TPump Data Formats

BINARY      Specifies that each input record is a 2-byte integer, n, followed by n bytes of data.

FASTLOAD    Specifies that each input record is a 2-byte integer, n, followed by n bytes of data,
            followed by an end-of-record marker, either X’0A’ or X’0D’.

TEXT        Specifies that each input record is an arbitrary number of bytes followed by an
            end-of-record marker, which is a:
            • Linefeed (X’0A’) on UNIX platforms.
            • Carriage-return/linefeed pair (X’0D0A’) on Windows platforms.

UNFORMAT    Specifies that each input record is defined by FIELD commands of the specified layout.

VARTEXT     Specifies that each variable-length text record has each field separated by a
            delimiter character.
For a description of the supported input file formats, see the discussion of the FORMAT
option for network-attached client systems in the IMPORT Command description in
Chapter 3: “TPump Commands.”
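The BINARY and FASTLOAD formats in Table 1 differ only in the trailing end-of-record byte. A sketch of building one such record follows; the little-endian length field is an assumption typical of Intel-based network clients, so verify the byte order for your platform against the IMPORT command description:

```c
#include <stddef.h>
#include <string.h>

/* Build one BINARY-format record in buf: a 2-byte record length n
   followed by n bytes of data. For FASTLOAD format, also append an
   end-of-record byte (X'0A' used here). Returns total bytes written.
   The length field is written little-endian, which is an assumption;
   check the IMPORT command description for your client platform. */
size_t make_record(unsigned char *buf, const void *data,
                   unsigned short n, int fastload_format)
{
    buf[0] = (unsigned char)(n & 0xFFu);   /* low byte of length  */
    buf[1] = (unsigned char)(n >> 8);      /* high byte of length */
    memcpy(buf + 2, data, n);
    if (fastload_format) {
        buf[2 + n] = 0x0A;                 /* end-of-record marker */
        return (size_t)n + 3;
    }
    return (size_t)n + 2;
}
```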
Client Character Sets
Standard Character Sets
The following standard character sets are supported by Teradata Database.
Standard Character Sets

  System Configuration    Name
  Channel-Attached        EBCDIC
  Network-Attached        ASCII
The terms ASCII and EBCDIC are often used in ambiguous ways, and this presents a difficulty
for accented and non-Latin characters. The user should select a client character set that exactly
matches the character set that the import data uses.
If you use accented and non-Latin characters, do not use the ASCII or EBCDIC client
character sets. Instead, load and use one of the other Teradata-supplied character sets, or a
site-defined character set that exactly matches the application character set, such as:
EBCDIC037_0E for channel-attached clients (for the United States or Canada), LATIN1_0A,
LATIN9_0A (for Western European languages), LATIN1252_0A for Western European
Microsoft® Windows clients, or UTF8 for UNIX clients.
Japanese Character Sets
The following Japanese character sets are supported by Teradata Database.
Japanese Character Sets

  System Configuration    Character Set Name
  Channel-Attached        KATAKANAEBCDIC
                          KANJIEBCDIC5026_0I
                          KANJIEBCDIC5035_0I
  Network-Attached        KANJIEUC_0U
                          KANJISJIS_0S
For more information on kanji character sets, refer to International Character Set Support.
Caution:
TPump statements do not accept object names specified in internal RDBMS hexadecimal
form and do not display object names in hexadecimal form.
Chinese and Korean Character Sets
Chinese and Korean character sets are available for channel- and network-attached client
systems.
The following table defines the Chinese character sets:
Chinese Character Sets

  System Configuration    Name
  Channel-Attached        SCHEBCDIC935_2IJ
                          TCHEBCDIC937_3IB
  Network-Attached        SCHGB2312_1T0
                          TCHBIG5_1R0
The following table defines the Korean character sets:
Korean Character Sets

  System Configuration    Name
  Channel-Attached        HANGULEBCDIC933_1II
  Network-Attached        HANGULKSC5601_2R4
Rules for Using Chinese and Korean Character Sets
Certain rules apply when using Chinese and Korean character sets on channel- and network-attached platforms.
• Object Names
  Since 12.0, Teradata Database supports multi-byte characters in object names when the client session character set is UTF8 or UTF16. For a list of valid and non-valid characters when multi-byte object names are used, see the Appendix of International Character Set Support.
  If multi-byte characters are used in object names in a TPump script, they must be enclosed in double quotes.
• Maximum String Length
  The Teradata Database requires two bytes to process each of the Chinese or Korean characters. This limits both request size and record size. For example, if a record consists of one string, the length of that string is limited to a maximum of 32,000 characters, or 64,000 bytes.
Data Conversion Capabilities
Teradata TPump can redefine the data type specification of numeric, character, and date input
data so it matches the type specification of its destination column in the TPump table on the
Teradata Database.
For example, if an input field with numeric type data is targeted for a column with a character
data type specification, Teradata TPump can change the input data specification to character
before inserting it into the table.
You use the datadesc specification of the Teradata TPump FIELD command to convert input
data to a different type before inserting it into the TPump table on the Teradata Database.
The types of data conversions you can specify are:
• Numeric-to-numeric (for example, integer-to-decimal)
• Character-to-numeric
• Character-to-date
• Date-to-character
Note: Redundant conversions, such as integer-to-integer, are legal and necessary to support
the zoned decimal format. For more information about the zoned decimal format, data types,
and data conversions, see SQL Reference: Data Types and Literals.
Checkpoints
Teradata TPump supports the use of checkpoints. Checkpoints are entries posted to a restart
log table at regular intervals during the TPump data transfer operation. If processing stops
while a TPump job is running, you can restart the job at the most recent checkpoint.
For example, assume you are loading 1,000,000 records in a table and have specified
checkpoints every 50,000 records. Then Teradata TPump pauses and posts an entry to the
restart log table whenever multiples of 50,000 records are successfully sent to the Teradata
Database.
If the job stops after record 60,000 has been loaded, you can restart the job at the record
immediately following the last checkpoint—record 50,001.
You enable the checkpoint function by specifying a checkpoint value in the BEGIN LOAD
command. For more information, see “BEGIN LOAD” on page 95.
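The restart arithmetic in this example reduces to rounding the last loaded record down to the most recent checkpoint boundary and resuming one record later. As a sketch (the function name is illustrative):

```c
/* First record to reprocess after a restart: one past the last
   checkpoint boundary at or below the last record loaded. */
long restart_record(long last_loaded, long checkpoint_interval)
{
    return (last_loaded / checkpoint_interval) * checkpoint_interval + 1;
}
```

For the example above, a stop after record 60,000 with checkpoints every 50,000 records gives a restart point of record 50,001; a stop before the first checkpoint restarts at record 1.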
Unicode Character Sets
UTF8 and UTF16 are two of the standard ways of encoding Unicode character data. Teradata
Database supports UTF8 and UTF16 client character sets. The UTF8 client character set
supports UTF8 encoding. Currently Teradata Database supports UTF8 characters that can
consist of from one to three bytes. The UTF16 client character set supports UTF16 encoding.
Currently, the Teradata Database supports the Unicode 2.1 standard, where each defined
character requires exactly 16 bits.
There are restrictions imposed by Teradata Database on using the UTF8 or UTF16 character
set. Refer to Teradata Database International Character Set Support manual for restriction
details.
UTF8 Character Sets
TPump supports the UTF8 character set on network-attached platforms and on IBM MVS. When
using the UTF8 character set on a network-attached platform, predefined macros should be
used. This ensures that the import data is translated into the server character set
defined in the predefined macro, with Teradata Database applying the import data to the
macro during the load. When predefined macros are not supplied, TPump creates macros
according to the user's default server character set (defined when the user account is created),
which may lead to character translation errors if the user's default server character set is not
Unicode.
On IBM MVS, the job script must be in Teradata EBCDIC when using the UTF8 client
character set. TPump translates commands in the job script from Teradata EBCDIC to UTF8
during the load. Be sure to examine the definitions in the International Character Set Support
manual to determine the code points of any special characters required in the job script;
different versions of EBCDIC do not always agree on the placement of these characters. Refer
to the mappings between Teradata EBCDIC and Unicode in Appendix E of the International
Character Set Support manual.
Currently, the UTF8 Byte Order Mark (BOM) is not supported on the MVS platform when
using access modules or data files.
See Chapter 3 for complete information on TPump commands. For additional information on
using the UTF8 client character set on the mainframe, refer to the nullexpr and fieldexpr
parameters, the VARTEXT format delimiter, the WHERE condition, and the CONTINUEIF
condition.
UTF16 Character Sets
TPump supports the UTF16 client character set on network-attached platforms. In general,
the command language and the job output are in the same character set as the client character
set used by the job. However, for convenience, and because of the special properties of
Unicode, this is not required when using the UTF16 character set: the job script and the job
output can be in either the UTF8 or the UTF16 character set. Specify this with the runtime
parameters "-i" and "-u" when the job is invoked.
For more information on the runtime parameters "-i scriptencoding" and
"-u outputencoding", see "-u outputencoding" on page 46. Also refer to "fieldexpr" on
page 132, "nullexpr" on page 131, "WHERE condition" on page 109, and "CONTINUEIF
condition" on page 158 for additional information on using the UTF16 client character set.
Client Character Set/Client Type Compatibility
Use the following table as a general guideline for choosing the client character sets that may
work best for your client environment.
If the Client Type is Channel-attached, the client character sets that may work best are:
• EBCDIC
• EBCDIC037_0E
• KANJIEBCDIC5026_0I
• KANJIEBCDIC5035_0I
• KATAKANAEBCDIC
• SCHEBCDIC935_2IJ
• TCHEBCDIC937_3IB
• HANGULEBCDIC933_1II
• UTF8

If the Client Type is Network-attached running UNIX, the client character sets that may work best are:
• ASCII
• KANJIEUC_0U
• LATIN1_0A
• LATIN9_0A
• UTF8
• UTF16
• SCHGB2312_1T0
• TCHBIG5_1R0
• HANGULKSC5601_2R4

If the Client Type is Network-attached running Windows, the client character sets that may work best are:
• ASCII
• KANJISJIS_0S
• LATIN1252_0A
• UTF8
• UTF16
• SCHGB2312_1T0
• TCHBIG5_1R0
• HANGULKSC5601_2R4
Note: On channel-attached platforms, TPump supports the UTF8 client character set on IBM MVS but not on IBM VM.
Site-Defined Character Sets
When the character sets defined by Teradata Database are not appropriate for your site, you
can define your own character sets.
Refer to Teradata Database International Character Set Support for information on defining
your own character set.
TPump Commands
TPump accepts both TPump commands and a subset of Teradata SQL statements. These are
described in the following sections:
TPump Command Input
TPump commands perform two types of activities. The following table provides a description
of those activities and functions.
• Support: Support commands establish the TPump sessions with the Teradata Database and
establish the operational support environment for TPump. Support commands are not
directly involved in specifying a TPump task.
• Task: The TPump task commands specify the actual processing that takes place for each
TPump task. The task commands, combined with Teradata SQL INSERT, UPDATE, and
DELETE statements, are used to define TPump IMPORT and DELETE tasks.
The TPump commands that perform the support and task activities are listed in Table 2:
Table 2: TPump Commands

Support commands:
• ACCEPT: Allows the value of one or more utility variables to be accepted from a file.
• DATEFORM: Lets you define the form of the DATE data type specifications for the TPump job.
• DISPLAY: Writes messages to the specified destination.
• ELSE: Followed by commands and statements which execute when the preceding IF command is false.
• ENDIF: Delimits the group of TPump commands and statements that were subject to previous IF or ELSE commands, or both.
• IF: When followed by a conditional expression, initiates execution of subsequent commands and statements.
• LOGDATA: Supplies parameters to the LOGMECH command beyond those needed by the logon mechanism, such as user ID and password, to successfully authenticate the user.
• LOGMECH: Identifies the appropriate logon mechanism by name.
• LOGOFF: Disconnects all active sessions and terminates TPump support on the client.
• LOGON: Specifies the LOGON string to be used in connecting all sessions established by TPump.
• LOGTABLE: Identifies the table to be used for journaling checkpoint information required for safe, automatic restart of the TPump support environment in the event of a client or Teradata Database hardware platform failure.
• NAME: Sets the variable SYSJOBNAME to the jobname string specified. The jobname string can be up to 16 bytes in length and can contain kanji characters.
• ROUTE: Identifies the destination of output produced by the TPump support environment.
• RUN FILE: Invokes the specified external source as the current source of commands and statements.
• SET: Assigns a data type and a value to a utility variable.
• SYSTEM: Suspends TPump to issue commands to the local operating system.

Task commands:
• BEGIN LOAD: Specifies the kind of TPump task to be executed, the target tables to be used, and the parameters for executing the task.
• FIELD: Defines a field of the data source record. Used with the LAYOUT command.
• DML: Defines a label and error treatment option(s) for the Teradata SQL DML statement(s) following the DML command.
• END LOAD: Indicates completion of TPump command entries and initiates execution of the task.
• FILLER: Defines a field in the data source that is not sent to the Teradata Database. Used with the LAYOUT command.
• IMPORT: Identifies the data source, the layout, and the DML operation(s) to be performed, with optional conditions for performing these operations.
• LAYOUT: Introduces the record format of the data source to be used in the TPump task. This command is followed by a sequence or combination of FIELD, FILLER, and TABLE commands.
• PARTITION: Establishes session partitions to transfer SQL requests to the Teradata Database.
• TABLE: Identifies a table whose column names and data descriptions are used as the field names and data descriptions of the data source records. Used with the LAYOUT command.
Teradata SQL Statements
Teradata SQL statements define and manipulate the data stored in the Teradata Database.
TPump supports a subset of Teradata SQL statements so you do not need to invoke other
utilities to perform routine database maintenance functions before executing TPump utility
tasks. You can, for example, use the supported Teradata SQL statements to:
•
Create the table that you want to load
•
Establish a database as an explicit table name qualifier
•
Add checkpoint specifications to a journal table
The Teradata SQL statements supported by TPump are summarized in Table 3; TPump
supports only the Teradata SQL statements listed in this table. To use any other Teradata SQL
statement, you must enter it from another application, such as BTEQ.
The subset of Teradata SQL supported by the TPump support environment excludes user-generated transactions (BEGIN TRANSACTION; END TRANSACTION;).
Table 3: Supported Teradata SQL Statements in TPump

• ALTER TABLE: Changes the column configuration or options of an existing table.
• CHECKPOINT: Adds a checkpoint entry to a journal table.
• COLLECT STATISTICS: Collects statistical data for one or more columns of a table.
• COMMENT: Stores or retrieves a comment string associated with a database object.
• CREATE DATABASE, CREATE MACRO, CREATE TABLE, CREATE VIEW: Creates a new database, macro, table, or view.
• DATABASE: Specifies a new default database for the current session.
• DELETE: Removes rows from a table.
• DELETE DATABASE: Removes all tables, views, and macros from a database.
• DROP DATABASE: Removes an empty database from the Teradata Database.
• EXECUTE: Specifies a user-created (predefined) macro for execution. The macro named in this statement resides in the Teradata Database and specifies the type of DML statement (INSERT, UPDATE, DELETE, or UPSERT) being handled by the macro.
• GIVE: Transfers ownership of a database to another user.
• GRANT: Grants access privileges to a database object.
• INSERT: Inserts new rows into a table.
• MODIFY DATABASE: Changes the options of an existing database.
• RENAME: Changes the name of an existing table, view, or macro.
• REPLACE MACRO, REPLACE VIEW: Redefines an existing macro or view.
• REVOKE: Rescinds access privileges to a database object.
• SET QUERY_BAND: Sets the query band for a session and transaction.
  Note: The statement can be used in two ways:
  SET QUERY_BAND = 'Document=XY1234; Universe=East;' FOR SESSION;
  SET QUERY_BAND = NONE FOR SESSION;
• SET SESSION COLLATION: Overrides the collation specification for the current session.
• SET SESSION OVERRIDE REPLICATION ON/OFF: Turns the replication service on or off.
• UPDATE Statement and Atomic Upsert: Changes the column values of an existing row in a table.
TPump supports the statements in the above list only in the sense that it submits them to the
Teradata Database and handles the success, failure, or error response. TPump rejects as
unsupported any statement beginning with anything not in the above list and does not submit
it to the Teradata Database. During a restart, only DATABASE and SET statements are
reexecuted. The existence of a log table causes TPump on the client to execute its restart logic.
Note that, although SET is in the list, the only SET statements truly supported are the Teradata
SQL SET statements: SET SESSION COLLATION and SET SESSION DATABASE. Any other
SET statement passed through to the Teradata Database is rejected.
Teradata SQL statements from the input command file are sent to the Teradata Database for
execution via CLIv2. Pertinent information returned in SUCCESS, FAILURE, or ERROR
parcels is listed in the message destination.
Caution:
Do not issue a DELETE DATABASE statement to delete the database containing the restart log
table because this terminates the TPump job. See “Reinitializing a TPump Job” on page 55 for
restart instructions if the restart log table is accidentally dropped.
Support environment statements may be executed between invocations of TPump tasks.
These include DATABASE, CHECKPOINT, and CREATE TABLE statements. The BEGIN
LOAD command then starts a TPump task script.
You may direct the action of TPump by commands and DML statements retrieved from an
external source. The data source for these commands and statements may be specified in the
TPump RUN FILE command, if one is used.
The TPump support environment parses lines that begin with a period as commands. The
period distinguishes commands from Teradata SQL statements, which are passed to the
Teradata Database without parsing. More than one statement per line is not allowed but
statements can span multiple lines.
TPump follows the same rules as standard Teradata SQL for operations on nulls.
Refer to Teradata Database SQL Reference: Fundamentals for more information about using
Teradata SQL statements.
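As a brief sketch of these parsing rules (table and logon names are hypothetical), the following fragment mixes period-prefixed TPump commands, which the support environment parses, with Teradata SQL statements, which are passed through unparsed:

```sql
.LOGTABLE mydb.jobs_log;                       /* TPump command: parsed on the client */
.LOGON tdpid/user,password;                    /* TPump command */
DATABASE mydb;                                 /* Teradata SQL: sent to the Teradata Database */
CREATE TABLE target
  ( k INTEGER,
    v CHAR(10) );                              /* one statement spanning multiple lines */
.LOGOFF;                                       /* TPump command */
```

Note that the CREATE TABLE statement spans several lines, which is allowed, while placing two statements on one line is not.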
The TPump Task
The TPump task is designed for the batch application of client data to one or more tables on
the Teradata Database via DML commands and statements (INSERT, UPDATE, or DELETE).
TPump executes these DML statements in multiple-record requests.
The following information provides more information about the TPump task:
•
Task Limits
•
DML Commands
•
Upsert Feature
•
TPump Macros
•
Locks
•
Access Rights
•
Fallback vs. Nonfallback Tables
Task Limits
TPump supports only single-row, primary index operations. Up to 600 of these operations
can be packed into a single request for network efficiency. The 600-statement upper limit is
arbitrary; the effective limit may be lower for statements associated with large data parcels
that could exceed the overall 64 KB request limit, or where a statement itself is very long.
DML Commands
DML commands appear with their associated INSERT, UPDATE, or DELETE DML
statements, together with the IMPORT commands that identify data to be read from the
client.
TPump DML statements support a conditional apply logic similar to MultiLoad, in which
DML statements are applied based on record field contents.
Specified DML statements following a DML command apply data from one or more separate
data sources. The data sources contain a record for each table row to which one or more
statements apply. Each IMPORT command identifies a separate data source, and references
LAYOUT and DML commands. The IMPORT command matches records of the data source
to the applicable DML statement or statements by means of its APPLY clauses.
The LAYOUT command defines the layout of the records of a data source, using the
parameters and a sequence of FIELD, FILLER, and TABLE commands. The DML command
identifies an immediately following set of one or more DML statements.
Each DML statement is converted into a macro and used for the duration of the import.
As TPump reaches the end of one data source, as identified by the IMPORT command, it
continues with the next IMPORT command.
Upsert Feature
TPump’s upsert feature is a composite of UPDATE and INSERT functionality applied to a
single row. TPump upsert logic is similar to that used in MultiLoad, the only other load utility
with this feature. The DML statements required to execute each iteration of upsert are a single
UPDATE statement, followed by a single INSERT statement.
With upsert, if the UPDATE fails because the target row does not exist, TPump automatically
executes the INSERT statement. This capability can save considerable loading time by
completing this operation in a single pass instead of two.
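A minimal upsert sketch follows, assuming hypothetical table, layout, and field names; see the DML command in Chapter 3 for the exact clause that enables upsert processing:

```sql
.DML LABEL upserter DO INSERT FOR MISSING UPDATE ROWS;
UPDATE accounts SET balance = :bal
  WHERE acct_no = :acct;                          /* tried first */
INSERT INTO accounts (acct_no, balance)
  VALUES (:acct, :bal);                           /* executed only if the UPDATE found no row */
```

The single UPDATE is attempted first; when the target row does not exist, TPump automatically falls through to the single INSERT, completing the operation in one pass.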
TPump Macros
Before beginning a load, TPump creates equivalent macros on the RDBMS, based on the
actual DML statements. That is, for every INSERT, UPDATE, DELETE, and UPSERT
statement in the DML statement, TPump creates an equivalent macro for it. These macros are
then executed iteratively, in place of the actual DML statement, when an import task begins,
and are removed when all import tasks are complete. The use of macros in place of lengthy
requests helps to minimize network and parsing overhead.
For greater efficiency, TPump also supports the use of predefined macros, rather than creating
macros from the actual DML statements. A predefined macro is created by the user and
resides on the RDBMS before a TPump import task begins. When a predefined macro is used,
TPump uses this macro directly instead of creating another macro. The use of predefined
macros allows TPump to avoid the overhead of creating/dropping macros internally, and also
to avoid modifying the data dictionary on the Teradata Database during the job run.
TPump uses the EXECUTE command to support predefined macros. For more information
on using predefined macros, refer to the EXECUTE command in this manual. For more
information about creating a macro, see the Teradata Database SQL Reference: Data Definition
Statements.
For more information about executing a macro, see the Teradata Database SQL Reference:
Data Manipulation Statements.
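A hedged sketch of the predefined-macro approach (database, macro, and column names are hypothetical; see the EXECUTE command for the exact syntax): the macro is created on the RDBMS before the job runs and then referenced from the TPump script instead of an inline DML statement.

```sql
/* Created once, before the TPump job runs */
CREATE MACRO mydb.ins_acct (acct INTEGER, bal DECIMAL(10,2)) AS
  ( INSERT INTO mydb.accounts (acct_no, balance)
    VALUES (:acct, :bal); );

/* In the TPump script, the predefined macro replaces an inline INSERT */
.DML LABEL insdml;
EXECUTE mydb.ins_acct INSERT;
```

Because the macro already exists, TPump skips its usual create/drop macro steps and leaves the data dictionary untouched during the run.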
Locks
TPump uses conventional row hash locking, which allows for some amount of concurrent
read and write access to its target tables. At any point, TPump can be stopped, making the
target tables fully accessible. If TPump is stopped, however, depending on the nature of the
update process, the relational integrity of the data may be compromised.
Although TPump always uses conventional row hash locking, a TPump job may introduce
other levels of locking in a run, depending on the nature of the SQL statements used in the job
and the status of the target tables. For example, if a target table of a TPump job has a trigger
defined, and that trigger uses table-level locking when it fires, the TPump job may cause
table-level locking during the run. The TPump script developer should be familiar with the
properties of the database on which the TPump job will run and be aware of such possibilities.
Access Rights
TPump users must have access rights on the database containing the restart log table because
TPump orchestrates the creation of macros to use during the task.
Dropping the log table makes it impossible to restart a TPump job. Dropping the macros or
the error table makes it very difficult to restart a TPump job.
TPump does not have any special protections on database objects it creates. Therefore, it is the
responsibility of TPump administrators and users to ensure that access rights on databases
used by TPump have been established.
Most of the access rights for TPump are intuitive. For example:
•
CREATE TABLE is required on the database where the log table is placed.
•
CREATE TABLE is required on the database where the error table is placed.
•
CREATE/DROP MACRO is required on the database where macros are placed.
•
EXECUTE MACRO is required on the database where the macros are placed.
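These rights can be established with the supported GRANT statement; for example (the user and database names below are hypothetical):

```sql
GRANT CREATE TABLE ON logdb TO tpump_user;                /* restart log and error tables */
GRANT CREATE MACRO, DROP MACRO ON macrodb TO tpump_user;  /* TPump-generated macros */
GRANT EXECUTE MACRO ON macrodb TO tpump_user;
```

A TPump administrator would typically issue these grants once, before any job that targets the affected databases is run.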
Macros
The use of macros slightly complicates the access rights for TPump. The remaining access
rights necessary to run a TPump job fall into two scenarios:
1. When a TPump macro is placed in the same database as the table it affects, the required
rights are INSERT/UPDATE/DELETE on the affected table, corresponding to the DML
executed.
2. When a TPump macro is placed in a different database from the table it affects, the
database where the macro is placed must have INSERT/UPDATE/DELETE WITH GRANT
on the affected table, corresponding to the DML executed. You must also have EXECUTE
MACRO on the database where the macro is placed.
Note that when the TPump job uses EXEC to directly specify a macro, the access rights
scenarios are the same, except that the CREATE/DROP MACRO privilege is not needed
because the macro exists both before and after the job.
Tables
You must have the corresponding INSERT, UPDATE, or DELETE privilege for each table to be
changed by the TPump task. Multiple tables can be targeted by a single TPump job.
The BEGIN LOAD command invokes TPump to execute task processing. Any statement of
this task applies each matching imported data record to each of its target table rows having the
specified index value. TPump supports all table types. Unlike MultiLoad, there are no
forbidden index types. Thus, the tables may be either empty or populated, as well as being
either with or without secondary indices.
All required data is imported; none is obtained from tables already existing in the Teradata
Database. No statement of an IMPORT task may make any reference to a table or row other
than the one affected by the statement.
All INSERT statements, when considered in conjunction with each applicable imported
record, must explicitly specify values for all columns except those for which a default value
(including null) is defined. All UPDATE and DELETE statements, when considered in
conjunction with each applicable imported record, must explicitly specify values for all
columns of the primary index. In order to fulfill this requirement for UPDATE and DELETE
statements, you must supply a series of ANDed terms of either form:
column_reference = colon_variable_reference
or
column_reference = constant
TPump does not process UPDATE and DELETE statements that contain ORed terms because
TPump must hash the imported records with a value from the import file (or with a NULL).
Any attempt to use an OR with these statements causes TPump to fail. You can work around
this by simply creating two separate DML statements and applying them conditionally.
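The workaround can be sketched as follows, with hypothetical table, layout, and field names. Instead of a single UPDATE whose WHERE clause ORs two primary index conditions, define one DML statement per condition and let the IMPORT command's APPLY clauses choose between them:

```sql
.DML LABEL upd_k1;
UPDATE t SET val = :v WHERE pk = :k1;
.DML LABEL upd_k2;
UPDATE t SET val = :v WHERE pk = :k2;
.IMPORT INFILE datafile LAYOUT datalay
    APPLY upd_k1 WHERE use_k1 = 'Y'
    APPLY upd_k2 WHERE use_k1 <> 'Y';
```

Each record then satisfies exactly one apply condition, so every statement TPump sends still hashes on a single primary index value.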
TPump imposes some restrictions on the updates of an IMPORT task. It rejects updates that
try to change the value of the primary index of a row, but accepts even reflexive updates of
other columns. A reflexive update of a column computes the new value as the result of an
expression that involves the current value of one or more columns.
TPump processes and validates all statements from the BEGIN LOAD through the END
LOAD statements. TPump control and processing sessions are established and Teradata SQL
requests are transmitted to the Teradata Database. TPump creates a single error table and a set
of macros, one for each DML statement. Nothing protects target tables from concurrent
access.
TPump imports data, evaluating each record according to specified apply conditions. For each
satisfied apply condition, a record is sent to the Teradata Database. If the record causes an
error, this sequence number is available in the error table so that the record can be identified.
When the task completes, all locks are released, all macros dropped and, if empty, the error
table is dropped. Statistics concerning the outcome of the IMPORT task are reported.
Access logging can cause a severe performance penalty. If all successful table updates are
logged, a log entry is made for each operation. The primary index of the access logging table
may then create the possibility of row hash conflicts.
Fallback vs. Nonfallback Tables
Target tables can be either fallback or nonfallback. The differences between, and
characteristics of, these tables are listed in Table 4:
Table 4: Comparison of Fallback and Nonfallback Target Tables

Fallback Tables:
• The TPump task continues to execute even if AMPs are down, as long as there is not more
than one AMP down, either logically or physically, in a cluster.
• If two or more AMPs in a cluster are logically or physically down, or both, the task does
not run, or terminates if running. The TPump task may be restarted as soon as all AMPs
are back up.

Nonfallback Tables:
• If one or more AMPs are down prior to entering the task and one or more target tables are
nonfallback, TPump terminates.
• During the task, if AMPs are down to the extent that data on the DSU is corrupted, the
affected tables must be restored. If an AMP goes down once the task has started, the task
cannot be restarted until all AMPs are back up.
• If more than one AMP in the same cluster is down, the Teradata Database cannot come up.
• Certain I/O errors during the task corrupt the target table so that it must be restored; in
this case, TPump terminates.
CHAPTER 2
Using TPump
This chapter provides detailed information about using the TPump utility. Topics include:
• Invoking TPump
• Terminating TPump
• Restarting and Recovery
• Programming Considerations
• Writing a TPump Job Script
• Viewing TPump Output
• Monitoring TPump Jobs
• Estimating Space Requirements
Invoking TPump
TPump Support Environment
This section describes those TPump functions that are invoked from the TPump support
environment on the client system. The TPump support environment is a platform from which
TPump and a number of standard Teradata SQL, DDL, and DML operations can be
performed. This client program includes a facility for executing Teradata SQL and is separate
from TPump tasks that run in the Teradata Database.
TPump support environment functionality includes:
• Providing access to the data manipulation and data definition functions of the Teradata
SQL language.
• User-defined variables and variable substitution.
• System-defined variables (for example, date and time).
• Conditional execution based on the value of return codes and variables.
• Expression evaluation.
• Redirection of command input.
• Runtime parameters for TPump invocation, foreign language support, and error logging
functions.
• Character set selection options for IBM mainframe and UNIX client-based systems.
The TPump support environment allows you to prepare for an initial invocation or
resumption of a TPump task without having to invoke multiple distinct utilities. For example,
you may need to create the table that is to be loaded, establish a database as an implicit table name qualifier, or checkpoint the relevant permanent journal.
Any statement not preceded by a period (.) is assumed to be a Teradata SQL statement and is
sent to the Teradata Database for processing. An object name in a Teradata SQL statement
may contain Katakana or multibyte characters when the appropriate character set is used.
The TPump support environment interprets the commands and statements that define the
job. It also controls the execution of those commands and manages recovery from the
Teradata Database or client failures during processing.
Those commands not directly involved in defining a TPump task, but providing supportive
functions (routing output, for example), are considered TPump support commands. These
are individually executed as they are encountered.
The commands that define a single TPump task are processed by the client as a single unit.
These are considered to be TPump task commands. The actual execution of the TPump task is
deferred until all pertinent task commands have been considered and validated by the client
program.
Support Environment Input/Output
Support environment I/O appears in the following forms:
• Command and statement input (default = SYSIN/stdin)
• Accept command input from file
• Command and statement output (default = SYSPRINT/stdout)
• Display command output (default = SYSPRINT/stdout)
• Error output (default = SYSPRINT/stdout)
Note: For IBM statement input, the default is initially the user-provided invocation parameter
(JCL PARM), if specified. After all commands and nested files in the parameter are processed,
the default is SYSIN.
SYSPRINT/stdout Output
The characteristics of SYSPRINT output for VM/MVS and UNIX standard output (stdout)
are:
• The first five positions of each output line are reserved. They contain a statement number
if the line is the beginning of a TPump statement. This also applies to comments preceding
TPump statements.
• If the output line is a TPump-generated message, the first five positions contain the string
****.
• In all other cases, the first five positions are blank.
• A message indicating the processing start date appears at the beginning of every job.
• TPump-generated messages are preceded by a header displaying the system time. This
timestamp appears on the same line as the message and follows the **** string.
Example
This example depicts each type of SYSPRINT/stdout output line noted in the previous list.
**** 13:57:16 UTY6633 WARNING: No configuration file, using build defaults
========================================================================
=                                                                      =
=        Teradata Parallel Data Pump Utility    Release 12.00.00.00    =
=        Platform MP-RAS                                               =
=                                                                      =
========================================================================
=                                                                      =
=     Copyright 1997-2007, NCR Corporation. ALL RIGHTS RESERVED.       =
=                                                                      =
========================================================================
**** 13:57:16 UTY2411 Processing start date: FRI MAY 04, 2007
========================================================================
=                                                                      =
=                          Logon/Connection                            =
=                                                                      =
========================================================================
0001 .LOGTABLE sfdlogtable;
0002 .LOGON 9/sfd,;
**** 13:57:19 UTY8400 Teradata Database Release: 12.00.00.00
**** 13:57:19 UTY8400 Teradata Database Version: 12.00.00.00
**** 13:57:19 UTY8400 Default character set: ASCII
**** 13:57:19 UTY8400 Maximum supported buffer size: 1M
**** 13:57:19 UTY8400 Upsert supported by RDBMS server
**** 13:57:24 UTY6211 A successful connect was made to the RDBMS.
**** 13:57:24 UTY6217 Logtable 'sfd.sfdlogtable' has been created.
========================================================================
=                                                                      =
=                    Processing Control Statements                     =
=                                                                      =
========================================================================
0003 /*****************************************************************/
     /* Test handling multiple TPump tasks.                           */
     /*****************************************************************/
create table ImpX01A ( f1 char(1),
f2 char(2),
f3 char(3) );
**** 13:57:26 UTY1016 'CREATE' request successful.
0004 .begin LOAD
sessions 4 1
pack 10
robust off
serialize off
checkpoint 30
nomonitor
errortable ImpX01A_errtbl;
========================================================================
=                                                                      =
=                     Processing TPump Statements                      =
=                                                                      =
========================================================================
0005 .layout Lay1;
0006 .field f1 * char(1);
0007 .field f2 * char(2);
0008 .field f3 * char(3);
0009 .dml label dml1;
0010 insert ImpX01A (f1, f2, f3) values ( :f1, :f2, :f3);
0011 .import infile dat01
layout lay1
apply dml1;
0012 .end LOAD;
**** 13:57:27 UTY6609 Starting to log on sessions...
**** 13:57:27 UTY6610 Logged on 4 sessions.
========================================================================
=                                                                      =
=                      TPump Import(s) Beginning                       =
=                                                                      =
========================================================================
**** 13:57:27 UTY6630 Options in effect for following TPump Import(s):
     .       Tenacity:  4 hour limit to successfully connect load sessions.
     .   Max Sessions:  4 session(s).
     .   Min Sessions:  1 session(s).
     .     Checkpoint:  30 minute(s).
     .       Errlimit:  No limit in effect.
     .   Restart Mode:  SIMPLE.
     .  Serialization:  OFF.
     .        Packing:  10 Statements per Request.
     .   StartUp Rate:  UNLIMITED Statements per Minute.
**** 13:57:31 UTY6608 Import 1 begins.
**** 13:57:36 UTY6641 Since last chkpt., 200 recs. in, 200 stmts., 20 reqs
**** 13:57:36 UTY6647 Since last chkpt., avg. DBS wait time: 0.25
**** 13:57:36 UTY6612 Beginning final checkpoint...
**** 13:57:36 UTY6641 Since last chkpt., 200 recs. in, 200 stmts., 20 reqs
**** 13:57:36 UTY6647 Since last chkpt., avg. DBS wait time: 0.25
**** 13:57:36 UTY6607 Checkpoint Completes with 200 rows sent.
**** 13:57:36 UTY6642 Import 1 statements: 200, requests: 20
**** 13:57:36 UTY6643 Import 1 average statements per request: 10.00
**** 13:57:36 UTY6644 Import 1 average statements per record: 1.00
**** 13:57:36 UTY6645 Import 1 statements/session: avg. 50.00, min. 50.00, max. 50.00
**** 13:57:36 UTY6646 Import 1 requests/session: average 5.00, minimum 5.00, maximum 5.00
**** 13:57:36 UTY6648 Import 1 DBS wait time/session: avg. 1.25, min. 0.00, max. 3.00
**** 13:57:36 UTY6649 Import 1 DBS wait time/request: avg. 0.25, min. 0.00, max. 0.60
**** 13:57:36 UTY1803 Import processing statistics
.                                        IMPORT 1      Total thus far
.                                        =========     ==============
Candidate records considered:........    200.......    200
Apply conditions satisfied:..........    200.......    200
Errors loggable to error table:......    0.......      0
Candidate records rejected:..........    0.......      0
** Statistics for Apply Label : DML1
Type   Database   Table or Macro Name   Activity
I      sdf        ImpX01A               200
**** 13:57:37 UTY0821 Error table sdf.ImpX01A_errtbl is EMPTY, dropping table.
========================================================================
=                          Logoff/Disconnect                           =
========================================================================
**** 13:57:45 UTY6216 The restart log table has been dropped.
**** 13:57:45 UTY6212 A successful disconnect was made from the RDBMS.
**** 13:57:45 UTY2410 Total processor time used = '0.270389 Seconds'
.       Start : 13:57:16 - MON JUNE 25, 2007
.       End   : 13:57:45 - MON JUNE 25, 2007
.       Highest return code encountered = '0'.
File Requirements
Certain files are required for invoking TPump. In addition to the input data source, TPump
accesses four different data sets/files or input/output devices:
Data Set/File or Device   Provides
standard input            TPump commands and Teradata SQL statements that make up your TPump job
standard output           Destination for TPump output responses and messages
standard error            Destination for TPump errors
configuration             Optional specification of TPump utility default values
When running TPump in interactive mode, the keyboard functions as the standard input
device and the display screen is the standard output/error device.
When running TPump in batch mode, you must specify a data set or file name for each of
these functions. The method of doing this varies, depending on the configuration of your
client system:
• On network-attached client systems, use the standard redirection mechanism (< infilename and > outfilename) to specify the TPump files when you invoke the utility.
• On channel-attached client systems, use standard VM EXEC or MVS JCL control statements (FILEDEF and DD) to allocate and create the TPump data sets or files before you invoke the utility.
On IBM Mainframe Client-Based Systems
Start TPump with a RUN FILE command, with optional invocation parameters, such as JCL
PARM. These are interpreted as a string of TPump support environment commands,
separated by, and ending with, semicolons.
After invocation, the first two commands executed must be LOGON and LOGTABLE. These
commands are required and are permitted only once. Either can be supplied in the command
string invoking TPump, and the other (or both) can appear in the INPUT file, or in a file
called with the RUN FILE command. No commands can precede the LOGON, LOGTABLE,
or RUN FILE commands.
If you do not use a RUN FILE command to specify an initial source of commands and
Teradata SQL statements, TPump defaults to the conventional source of control input, such as
SYSIN.
If a RUN FILE command is found in the parameter (PARM) input, the input source it
identifies is used prior to SYSIN. Whether the input source is specified by RUN FILE, or by
SYSIN, processing continues until a LOGOFF command, the end of control input, or a
terminating error is encountered, whichever occurs first. If all input is exhausted without
encountering a LOGOFF command, or if the program terminates because of an error, TPump
automatically performs the LOGOFF function.
The LOGON command establishes a Teradata SQL session that TPump uses for processing.
The LOGTABLE command specifies a table to be used as the restart log in the event of failure.
This table is placed in the default database unless otherwise specified.
You must have CREATE TABLE, INSERT, UPDATE, and SELECT rights on the database
containing the restart log table.
Preparatory statements, which are processed by the Teradata SQL-processing function of
TPump, must be executed before beginning a TPump task. It is here that any desired
DATABASE statement and any desired CREATE TABLE statements are specified. At this point,
a BEGIN LOAD command initiates a TPump task.
On UNIX- and Windows-based Systems
Start the TPump utility on Teradata Database for UNIX and Windows with a UNIX-style
command format.
The rules for invoking TPump under UNIX are the same as for IBM mainframe client-based
systems described in the preceding section. The difference lies in UNIX syntax.
The Windows syntax and the UNIX syntax are essentially the same, the main difference being
that single quotes should be used on UNIX systems and double quotes should be used on
Windows systems.
In Interactive Mode
To invoke TPump in interactive mode, enter tpump at your system command prompt:
tpump
TPump displays the following message to begin your interactive session:
================================================================
=     Teradata Parallel Data Pump Utility  Release mm.mm.mm.mmm=
=     Platform xxxxx                                           =
================================================================
where:
• mm.mm.mm.mmm is the release level of your TPump utility software
• xxxxx is the platform on which the TPump utility software is running
In Batch Mode
This section covers invoking TPump in batch mode on network-attached and channel-attached systems.
For a description of how to read the syntax diagrams used in this book, see Appendix A: “How
to Read Syntax Diagrams.”
In Batch Mode on Network-attached UNIX Systems
Refer to the runtime parameter descriptions in Table 5 on page 45 and use the following
syntax to invoke TPump on network-attached UNIX client systems:
tpump [-b] [-c charactersetname] [-C filename] [-d periodicityvalue]
      [-e errorfilename] [-f numberofbuffers] [-m] [-r 'tpump command']
      [-v] [-y] [-i scriptencoding] [-u outputencoding] [-t nn] [-V]
      [< infilename] [> outfilename]
In Batch Mode on Network-attached Windows Systems
Refer to the runtime parameter descriptions in Table 5 on page 45 and use the following
syntax to invoke TPump on network-attached Windows client systems:
tpump [-b] [-c charactersetname] [-C filename] [-d periodicityvalue]
      [-e errorfilename] [-f numberofbuffers] [-m] [-r "tpumpcommand"]
      [-v] [-y] [-i scriptencoding] [-u outputencoding] [-t nn] [-V]
      [< infilename] [> outfilename]
Note: The Windows syntax is essentially the same as the UNIX syntax, the main difference being that single quotes should be used on UNIX systems and double quotes should be used on Windows systems.
In Batch Mode on Channel-attached MVS Systems
Refer to the runtime parameter descriptions in Table 5 on page 45 and use the following
syntax to invoke TPump on channel-attached MVS client systems.
// EXEC TDSTPUMP PARM='option , option , ...'

where each option is one of:
BRIEF, BUFFERS=numberofbuffers, CHARSET=charactersetname, CONFIG=filename,
ERRLOG=filename, MACROS, PRDICITY=periodicityvalue, RTYTIMES=nn, RVERSION,
VERBOSE, 'tpump command'
RVERSION
In Batch Mode on Channel-attached VM Systems
Refer to the runtime parameter descriptions in Table 5 on page 45 and use the following
syntax to invoke TPump on channel-attached VM client systems.
EXEC TPUMP option , option , ...

where each option is one of:
BRIEF, BUFFERS=numberofbuffers, CHARSET=charactersetname, CONFIG=filename,
ERRLOG=filename, MACROS, PRDICITY=periodicityvalue, RTYTIMES=nn, RVERSION,
VERBOSE, 'tpump command'
Note: On VM, you must use the following statement before the EXEC LOAD statement:
"GLOBAL LOADLIB DYNAMC"
Run-time Parameters
Table 5 describes the run-time parameters used by TPump on channel-attached and network-attached systems.
Table 5: Run-time Parameters (each entry shows the channel-attached form | the network-attached form, followed by its description)
BRIEF | -b
    Specifies reduced print output, which limits TPump printout to the minimal information required to determine the success of the job:
    • Header information
    • Logon/logoff information
    • Candidate records
    • Insert, update, and delete results
    • Error table counts

BUFFERS=numberofbuffers | -f numberofbuffers
Sets the number of request buffers.
For Teradata Tools and Utilities 06.02.00 and earlier, you
can set the buffers runtime parameter from 2 to a
maximum of 10. The default value is 2.
Beginning with Teradata Tools and Utilities 06.02.00.01,
you can set the buffers runtime parameter with a lower
limit of 2 and no upper limit. The default value is 3.
The maximum number of request buffers that may be
allocated is BUFFERS * session_count.
Beginning with Teradata Tools and Utilities 06.02.00.01,
request buffers are a global resource, so buffers are
assigned to any session as needed, and then returned to a
free pool. At any point in time, the number of request
buffers assigned to a session can vary from zero to
BUFFERS * session_count.
Prior to Teradata Tools and Utilities 06.02.00.01, a request buffer was permanently owned by the session to which it was first assigned, and so could not be used by any other session. The maximum number of request buffers that a session could own was determined by the value of BUFFERS.
CHARSET=charactersetname | -c charactersetname
    Defines a character set for the TPump job.
The character set specification remains in effect for the
entire TPump job, even if the Teradata Database server
resets, causing the TPump job to be restarted.
Note: The character set specification does not remain in
effect if the client system fails, or if you cancel the TPump
job. In these cases, when you resubmit the job, you must
use the same character set specification that you used on
the initial job. If you use a different character set
specification when you resubmit such a job, the data
loaded by the restarted job will not appear the same as the
data loaded by the initial job.
If you do not enter a character set specification, the default is whatever character set is specified for the Teradata Database when you invoke TPump.
Note: See Client Character Sets in Chapter 1 for more information on supported character sets.
When using a UTF16 client character set on the network or a UTF8 client character set on the mainframe, specify the client character set name with the runtime parameter (that is, "-c" on the network and "CHARSET" on the mainframe).
Not Applicable | -i scriptencoding
Specifies the encoding form of the job script. If this parameter is not specified and the client character set is UTF16, TPump interprets the job script as UTF16. If character-type data is also specified in the script, TPump converts the string literals and the corresponding field in the import data to the same character set before comparing or concatenating them. (String literals are specified with .APPLY…WHERE…, .LAYOUT…CONTINUEIF…, .FIELD…NULLIF…, and .FIELD…||… commands.)
Valid encoding options are:
• UTF8
• UTF16-BE
• UTF16-LE
• UTF16
The specified encoding character set applies to all script
files included by the .RUN FILE commands.
The UTF16 or UTF8 Byte Order Mark (BOM) can be present or absent in the script file.
When UTF16 BOM is present and 'UTF16' is specified,
TPump interprets the script according to the endianness
indicated by the UTF16 BOM. When the UTF16 BOM is
not present, TPump interprets the script according to the
endianness indicated by the encoding option.
Not Applicable | -u outputencoding
Specifies the encoding form of the job output. The
parameter is valid only when the UTF16 client character
set is used.
When this parameter is used, it should be placed in front
of other runtime parameters to ensure the whole job
output is printed in the desired encoding form.
If it is not placed ahead of other runtime parameters when invoking the job, a warning message is printed.
Available output encoding options are:
• UTF16-BE
• UTF16-LE
• UTF16
UTF16-BE instructs TPump to print the job output in the big endian UTF16 encoding scheme. UTF16-LE instructs TPump to print the job output in the little endian UTF16 encoding scheme. On big endian client systems, UTF16 instructs TPump to print the job output in the big endian UTF16 encoding scheme; on little endian client systems, UTF16 instructs TPump to print the job output in the little endian UTF16 encoding scheme.
The UTF16 BOM is not printed as a part of job output.
When this parameter is not specified, the client character set is UTF16, and TPump output needs to be redirected to a log file on network platforms, "-u outputencoding" must be specified.
CONFIG=filename | -C filename
Specifies the configuration file for the TPump job. The configuration file contains various configuration and tuning parameters for TPump. This file is particularly useful for values that:
• are site- or host-specific
• script developers may not necessarily be aware of
• will likely change independently of TPump scripts
The installation of TPump installs a default configuration
file. On UNIX, it also installs a shell script that calls
TPump using the default configuration file on the
command line.
The format of the entries in the file is:
<keyword> <value>
• Lines in the file that do not begin with a valid keyword
are ignored.
• Keywords are case insensitive.
• On UNIX systems, this file is called tdatpump.cfg and
is expected to be found in the directory /usr/lib.
• If the configuration file is not found, the program
issues a warning message and uses the default values
wherever necessary.
At this time, the only valid keyword is INMEMSORT,
which is an integer data type containing the maximum
number of bytes that can be sorted in memory. TPump
recovery logic uses this value. This keyword can be
modified if you want to increase the amount of memory
available for sorting.
If this keyword is not provided in the configuration file,
or the configuration file is not provided, the default value
for INMEMSORT is 6,000,000 for UNIX, 12,000,000 for
VM and MVS, and 3,000,000 for Windows.
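Given the format described above, a minimal tdatpump.cfg would contain a single keyword-value line (the value shown is the stated UNIX default; INMEMSORT is currently the only recognized keyword):

```
INMEMSORT 6000000
```

Lines that do not begin with a valid keyword are ignored, so comments can simply be written as ordinary text lines.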
PRDICITY=periodicityvalue | -d periodicityvalue
Changes the periodicity value, which controls the rate at which statements are transferred to the RDBMS. This parameter may be adjusted to improve the TPump workflow.
This parameter is in effect whenever the BEGIN LOAD command uses the RATE parameter to control the rate at which statements are sent to the RDBMS. The default periodicity value is four 15-second periods per minute.
The periodicityvalue variable contains a value between 1 and 600, which is the value range for the number of periods per minute. The default value is 4.
Alternatively, periodicity can be changed by executing the
PumpMacro.UserUpdateSelect macro (provided with
TPump Monitor SQL scripts) to update the monitor
interface table while the job is running.
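The arithmetic behind periodicity is straightforward: a periodicity of N divides each minute into N equal periods, across which the per-minute RATE is spread. A rough sketch (the function name is ours for illustration, not a TPump API):

```python
def statements_per_period(rate_per_minute, periodicity=4):
    """Spread a per-minute statement RATE across the periods in one minute.

    periodicity: number of periods per minute (1-600); the TPump
    default is 4, i.e. four 15-second periods per minute.
    Returns (seconds per period, statements per period).
    """
    if not 1 <= periodicity <= 600:
        raise ValueError("periodicity must be between 1 and 600")
    period_seconds = 60.0 / periodicity
    per_period = rate_per_minute / periodicity
    return period_seconds, per_period

# With the default periodicity, a RATE of 1200 statements per minute
# is sent as 300 statements every 15 seconds.
print(statements_per_period(1200))  # → (15.0, 300.0)
```

Raising the periodicity smooths the flow into more, smaller bursts; lowering it produces fewer, larger bursts at the same overall rate.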
ERRLOG=errorfilename | -e errorfilename
Activates the error logging function, which creates an alternate error log file to hold messages generated by TPump. Specifying an alternate file name produces a duplicate record of all TPump error messages, allowing you to examine any errors detected without having to go through the entire output stream.
The errorfilename you define is the location in which you
want to copy error messages. You can also include
directory identifiers in the file names you define.
On UNIX, the maximum length of the file name depends
on the UNIX version currently in use.
On channel-attached client systems, the alternate file
specification is limited to eight characters and:
• On MVS, it must be a DD name defined in the JCL
• On VM, it must be an existing file definition
(FILEDEF)
48
Teradata Parallel Data Pump Reference
Chapter 2: Using TPump
Invoking TPump
Table 5: Run-time Parameters (continued)
Run-time parameter/systems
Channel-attached
Network-attached
Description
Note: If the file names that you define already exist, they
are overwritten. Otherwise, they are automatically
created.
There is no default error log errorfilename specification.
MACROS | -m
Invocation option to tell TPump to keep macros that were
created during the job run. These macros can be used as
predefined macros for the same job.
In order to use the same script after the -m parameter is
used in the previous run, the EXECMACRO command
must be added to the script.
To avoid duplicate macro names, a random number from
1 to 99 is used in each macro name when the NAME
command is not used. The format in which the macro is
created is:
MYYYYMMDD_HHMMSS_LLLLL_DDD_SSS
where
• LLLLL is the low-order 5 digits of the logon sequence number returned by the DBS from the .LOGON command.
• DDD is the .DML sequence (ordinal) number. This value is not reset to one for successive loads (.BEGIN LOAD) in a single job, but continues to be incremented.
• SSS is the SQL statement sequence (ordinal) number within the .DML group.
RTYTIMES=nn | -t nn
Specifies the number of retry times. The default for nn is 16; if nn = 0, the retry count is set back to 16. The retry options in the BEGIN LOAD command can override this option for the requests and data sent between a "BEGIN LOAD" and "END LOAD" pair.
'tpump command' | -r 'tpump command'
Invocation option that can signify the start of a TPump
job. This is usually a RUN FILE command specifying the
file containing your TPump job script because only one
tpump command may be specified. For example, on
UNIX:
’.RUN FILE tpump.script;’
VERBOSE | -v
Turns on verbose mode, which provides additional statistical data beyond the regular statistics, such as the number of RDBMS requests sent in addition to the normal number of requests.
Note: In verbose mode, TPump displays each retryable error as it occurs.
Not Applicable | -y
Specifies the data encryption option. When specified, data and requests are encrypted in all sessions used by the job. The encryption options in the BEGIN LOAD or PARTITION commands can override this option for the sessions associated with those commands.
Not Applicable | < infilename
Name of the standard input file containing your TPump
commands and Teradata SQL statements. Your infilename
specification redirects the standard input (stdin). If you
do not enter an infilename specification, the default is
stdin. If end-of-file is reached on the specified input file,
the input does not refer to stdin and the job terminates.
Note: On channel-attached client systems, you must use
the FILEDEF or DD control statement to specify the
input file before you invoke TPump.
Not Applicable | > outfilename
Name of the standard output file for TPump messages.
Your outfilename specification redirects the standard
output (stdout). If you do not enter an outfilename
specification, the default is stdout.
Note: If you use an outfilename specification to redirect
stdout, do not use the same outfilename as an output or
echo destination in the DISPLAY or ROUTE commands.
Doing so produces incomplete results because of the
conflicting write operations to the same file.
Note: On channel-attached client systems, you must use
the FILEDEF or DD control statement to specify the
output file before you invoke the utility.
RVERSION | -V
    Displays the version number and stops.
Note: See the invocation examples in Appendix B: “TPump Examples” for sample JCL
listings, commands, and output samples for the invocation options.
Examples - Redirection of Inputs and Outputs
The following examples show various ways to redirect stdin and stdout via UNIX.
Example 1
tpump </home/tpuser/tests/test1 >/home/tpuser/tests/out1
This example specifies both an input file and an output file. The TPump script is in
/home/tpuser/tests/test1 and the job output is written to /home/tpuser/tests/out1.
Example 2
tpump </home/tpuser/tests/test1
This example specifies only an input file. The TPump script is in /home/tpuser/tests/test1 and
the job output is written to stdout, which ordinarily would be your terminal.
Example 3
tpump >/home/tpuser/tests/out1
This example specifies only an output file. You enter the TPump script via stdin, normally at
your terminal. When input is complete, type Control-D to indicate end-of-file. Type
Control-D by simultaneously pressing the Control key and the letter D. The job output is
written to /home/tpuser/tests/out1.
Example 4
tpump
This example specifies neither an input nor an output file. TPump commands are typed at
your terminal via stdin and job output is written to your terminal via stdout.
Terminating TPump
This section covers methods of termination and other topics related to terminating TPump.
There are two ways to terminate TPump:
• Normal termination
• Abort termination
Normal Termination
Use the TPump LOGOFF command in your TPump job script to terminate the utility
normally on both network- and channel-attached client systems:
.LOGOFF retcode;
TPump logs off all sessions with the Teradata Database and returns a status message indicating:
• The total processor time that was used
• The job start and stop date/time
• The highest return code that was encountered:
  • 0 if the job completed normally
  • 4 if a warning condition occurred
  • 8 if a user error occurred
  • 12 if a fatal error occurred
  • 16 if no message destination is available
TPump also:
• Either maintains or drops the restart log table, depending on the success or failure of the job.
• If specified, returns the optional retcode value to your client operating system.
See the LOGON command description in Chapter 3 for more information about return codes
and the conditions that maintain or drop the restart log table.
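Job-control scripts commonly branch on this return code. A small helper (ours, for illustration; not part of TPump) that maps the codes listed above to their documented meanings:

```python
# Meanings of TPump return codes, per the list above.
TPUMP_RETURN_CODES = {
    0: "job completed normally",
    4: "a warning condition occurred",
    8: "a user error occurred",
    12: "a fatal error occurred",
    16: "no message destination is available",
}

def describe_return_code(rc):
    """Return a human-readable description of a TPump return code."""
    return TPUMP_RETURN_CODES.get(rc, "unknown return code")

print(describe_return_code(4))  # → a warning condition occurred
```

A wrapper script could treat 0 as success, 4 as success-with-warnings, and anything higher as a failure requiring restart or reinitialization.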
Abort Termination
The procedure for aborting a TPump job depends on whether the utility is running on a
network-attached or a channel-attached client system:
To abort a TPump job running on a channel-attached client system
✔ Cancel the job from the client system console.
To abort a TPump job running on a network-attached client system
✔ Press the Control + C key combination three times on your workstation keyboard.
After Terminating a TPump Job
After terminating a TPump job, you can:
• Restart the job and allow it to run to completion, or
• Reinitialize the job and run it to completion, or
• Abandon the job and clean up the database objects.
For more information on how to perform the above options, see the following section.
Restarting and Recovery
This section explains TPump’s handling of restart and recovery operations in the event of a
system failure.
The TPump facility includes a number of features that enable recovery from client or Teradata
Database failure, with minimal requirements for job resubmission or continuation. Upon
restart or resubmission, TPump interrogates the restart log table on the Teradata Database
and resumes operations from where it had left off.
Caution:
Do not tamper with the contents of the restart log table. A missing or altered restart log table
will cause the TPump job to be recovered incorrectly.
Basic TPump Recovery
Whenever an RDBMS restart is detected or a TPump job is restarted on the host system, the following activity occurs:
1  The restart log table is scanned with reference to the TPump script. Each statement within the script is either executed, because a row does not exist, or ignored, because a row exists in the restart log.
2  In the case of the END LOAD statement, there are a number of rows placed in the restart log table which let TPump decide what to do. TPump ignores any complete IMPORT within a LOAD and begins at the incomplete IMPORT.
3  Within an unfinished IMPORT, TPump begins processing at the last complete checkpoint. If the TPump job was running in SIMPLE mode before the restart, then recovery is complete and processing continues at the last complete checkpoint.
4  If TPump was running in ROBUST mode before it was restarted, then TPump must next ascertain how much processing has been completed since the last checkpoint. This is accomplished by reading back a set of "Partial Checkpoints" from the restart log table in the Teradata Database, sorting them, and then reprocessing all transactions which were left incomplete when the job was interrupted.
Protection and Location of TPump Database Objects
The restart log table is critical to the recovery process. If the restart log table is dropped, there
is no way to recover an interrupted TPump job.
In addition to the restart log table, TPump also creates an error table and a number of macros (where each macro corresponds to a DML SQL statement involved in the current IMPORT). If these database objects are dropped, they can, with some effort, be recreated. However, it is much more convenient for this NOT to be necessary.
TPump does not have special locks that it places on database objects. It is important that
administrators take security precautions to avoid the loss of these objects.
If the objects are dropped accidentally, the following information should allow an
administrator to recreate them.
TPump macros are placed in the same database that contains the restart log table.
The macros are named according to the following convention:
Jobname_DDD_SSS
where
• Jobname is the job name, which, if not explicitly specified, defaults to MYYYYMMDD_HHMMSS_LLLLL.
• LLLLL is the low-order 5 digits of the logon sequence number returned by the DBS from the .LOGON command.
• DDD is the .DML sequence (ordinal) number. This value is not reset to one for successive loads (.BEGIN LOAD) in a single job, but continues to be incremented.
• SSS is the SQL statement sequence (ordinal) number within the .DML group.
Thus, given the following script fragment:
.LOGTABLE LT_SIGH;
.LOGON TDPID/CME,CME;
...
.LAYOUT LAY1A
...
.DML LABEL TAB1PART1;
INSERT into tab1 values (:F0,:F1,:F2,:F3);
.DML LABEL TAB2PART1;
INSERT into tab2 values (:F0,:F1,:F2,:F3);
...
.IMPORT INFILE TPDAT
LAYOUT LAY1A
APPLY TAB1PART1
APPLY TAB2PART1;
and assuming the job name is defaulted, the macros are named:
M20020530_171209_06222_001_001 and M20020530_171209_06222_002_001.
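The default naming can be reproduced mechanically. The sketch below (our helper, not a TPump API) builds a default macro name from a job timestamp, logon sequence number, .DML ordinal, and statement ordinal, and reproduces the names in the example above:

```python
from datetime import datetime

def tpump_macro_name(logon_time, logon_seq, dml_seq, stmt_seq):
    """Build a default TPump macro name: MYYYYMMDD_HHMMSS_LLLLL_DDD_SSS.

    logon_seq: logon sequence number; only the low-order 5 digits are used.
    dml_seq:   .DML ordinal within the job (not reset per BEGIN LOAD).
    stmt_seq:  SQL statement ordinal within the .DML group.
    """
    return "M{}_{:05d}_{:03d}_{:03d}".format(
        logon_time.strftime("%Y%m%d_%H%M%S"),
        logon_seq % 100000,  # keep only the low-order 5 digits
        dml_seq,
        stmt_seq,
    )

print(tpump_macro_name(datetime(2002, 5, 30, 17, 12, 9), 6222, 1, 1))
# → M20020530_171209_06222_001_001
```

This kind of helper is useful when writing cleanup scripts that must locate and drop the DML macros left behind by an interrupted job.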
The contents of a TPump macro are taken directly from the script and consist of a parameter clause derived from the LAYOUT and the actual statement which is specified in the script.
Continuing the example above, if the LAYOUT associated with the statement is as follows:
.LAYOUT LAY1A;
.FIELD  F0 * integer key;
.FIELD  F1 * integer;
.FIELD  F2 * integer;
.FILLER FX * integer;
.FIELD  F3 * char(38);
then the macros will be created as follows:
CREATE MACRO CME.M20020530_171209_06222_001_001 (
F0 (INTEGER), F1 (INTEGER), F2 (INTEGER), F3 (CHAR(38))
) AS (INSERT INTO TAB1 VALUES(:F0, :F1, :F2, :F3);
);
CREATE MACRO CME.M20020530_171209_06222_002_001 (
F0 (INTEGER), F1 (INTEGER), F2 (INTEGER), F3 (CHAR(38))
) AS ( INSERT INTO TAB2 VALUES(:F0, :F1, :F2, :F3);
);
Note that the actual names of the parameters in the parameter list are not important; however,
what is important is that the types of the parameters are specified in the macro in exactly the
same order as the types in the LAYOUT. Also important is the fact that FILLER fields are not
included in the parameter list since they are stripped out by TPump.
The error table name, if it is not explicitly specified, is:
<JobName>_nnn_ET
where nnn is the load sequence number.
If the database for the error table is not explicit in the script, the table is placed in the database
associated with the TPump user logon, unless the DATABASE command has been issued.
Continuing the above example, assuming the user defaults the error table, then the create table
command for it will be:
CREATE SET TABLE M20020530_171209_06222_001_ET,
NO BEFORE JOURNAL,
NO AFTER JOURNAL
(
ImportSeq BYTEINT,
DMLSeq BYTEINT,
SMTSeq BYTEINT,
ApplySeq BYTEINT,
sourceseq INTEGER,
DataSeq BYTEINT,
ErrorCode INTEGER,
ErrorMsg VARCHAR(255) CHARACTER SET UNICODE NOT CASESPECIFIC,
ErrorField SMALLINT,
HostData VARBYTE(63677))
UNIQUE PRIMARY INDEX ( ImportSeq ,DMLSeq ,SMTSeq ,ApplySeq ,sourceseq ,
DataSeq );
Reinitializing a TPump Job
If the restart log table has been accidentally dropped or corrupted for a TPump job, follow this
procedure to reinitialize the job:
1  Determine how much of the job has completed in order to take data out of the TPump input data set. How this is done will depend on the table and procedures involved with table maintenance. This will vary between jobs and with the procedures in effect at each customer site.
2  Delete any database objects associated with the TPump job that may exist, since TPump did not get a chance to clean up. These objects include the error table and any DML-associated macros. Directions for finding these objects are provided in the previous section.
Recovering an Aborted TPump Job
An aborted TPump job is one that has been terminated early for any number of reasons (out of database space, accidental cancellation by mainframe operators, UNIX kernel panic, error limit in the TPump job exceeded, and so on) while all TPump database objects (the restart log table, the error table, and the DML macros) remain intact.
An aborted TPump job may be restarted using the same job script that was used in the
original job, and TPump will perform the recovery of the job.
Recovering from Script Errors
When TPump encounters an error in the input script, a diagnostic message is generated and
the operation is stopped with a non-zero return code. You can then modify the script, correct
the faulty code, and resubmit the job. Operations begin with the statement following the last
one that was successfully completed.
Programming Considerations
This section provides information to help applications programmers to design and script
TPump jobs. Additional information needed by programmers and/or system administrators
includes space requirements, locks, and the use of fallback or nonfallback tables.
The information in this section includes TPump command conventions, variables, and ANSI/
SQL DateTime Data types. You will find information related to using comments, specifying a
character set, using graphic data types, and using graphic constants. Restrictions and
limitations, as well as termination return codes, are covered as well.
TPump Command Conventions
The following command conventions apply when using TPump.
TPump Reserved Words
Commands supported by TPump do not use reserved words (or keywords), except for those
that are operators, and then only where an expression is allowed. Although there is no official
restriction against using TPump reserved words as variable names, it is strongly
recommended that you avoid them, as well as the Teradata SQL reserved words. Especially
avoid words that are operators (see Table 6), because their use can result in ambiguous
expressions.
Table 6: TPump Operators

AND   BETWEEN   EQ    GE   GT    IN    IS   LE
LIKE  LT        MOD   NE   NOT   NULL  OR
Teradata SQL Reserved Words
TPump supports a subset of Teradata SQL, listed in Table 16 on page 91. This subset
consists only of statements beginning with one of the reserved words (or keywords) in
Table 1. Avoid using the listed Teradata SQL reserved words in TPump commands.
Conditional Expressions
Some of the commands described in this chapter use conditional expressions. A conditional
expression returns a result of 1 if it evaluates to true, and a result of 0 if it evaluates to false.
Table 7: TPump Conditional Expressions

+   -   /   MOD   ||
IS NOT NULL   IS NULL
EQ   (=)
NE   (<>   ^=   NOT=   ~=)
GE   (>=)
GT   (>)
LE   (<=)
LT   (<)
BETWEEN   NOT BETWEEN
AND   OR
IN   NOT IN
NOT
These conditional expressions are similar to those described in the Teradata Database SQL
Reference: Functions and Operators, with the following exceptions:
1  A column name in a conditional expression in the reference manual is equivalent, in this
   document, to a field name in records from an external data source, or to a utility variable.
2  In the logical expressions that make up a conditional expression, the LIKE operator is not
   supported. In these expressions, only the following operators are supported:
   a  All comparison operators documented in Teradata Database SQL Reference: Functions
      and Operators
   b  The NOT IN operator (only the first of the two forms)
3  In the arithmetic expressions that make up a logical expression, the following elements are
   not supported:
   a  The exponentiation operator
   b  Aggregate operators
   c  Arithmetic functions
Using Task Commands
Each task must begin with a BEGIN LOAD command, which declares, at a minimum, the
number of sessions involved in the load.
The logged-on user must have the appropriate privileges on the tables involved. At the time
the BEGIN LOAD is initiated, you must have SELECT privileges, as well as INSERT,
UPDATE, and DELETE privileges, depending on the DML statements specified in the current
task. Access privileges follow standard Teradata access privilege rules: TPump tasks require
that you either own, or have SELECT access to, the target table, plus an additional privilege
on the target table that corresponds to the DML command used (INSERT, UPDATE, or
DELETE). The additional privilege is described for each statement type in later sections.
Regardless of the kind of statement, you must have the CREATE TABLE privilege on the
database where the error tables are to be placed, and the CREATE TABLE privilege for
TPump to create the restart log table. If the restart log table specified for the support
environment already exists, INSERT and UPDATE privileges on that table are required.
In a TPump task, more than one statement/data record combination can affect a single row.
If applying any statement/data record combination to a row would produce an error, that
combination is not applied, but all prior and subsequent error-free combinations affecting the
same row or other rows are applied.
TPump can guarantee the order of operations on a given row through correct use of the
serialize option, which identifies the primary index of a given target table. When serialize is
used, operations for a given set of rows occur in order on one session. Without serialize,
statements are executed on the first available session, so operations may occur out of order.
When the serialize option is in effect, the order in which DML statement/host record pairings
are applied to a given target row is fully deterministic, as is the order in which rows are
applied to the target rows. Operations occur in exactly the same order as they are read from
the data source and, if there are multiple APPLY clauses, in order by APPLY clause from first
to last.
In addition to the serialize option in the BEGIN LOAD command, the SERIALIZEON
keyword can be specified in the DML command, which turns serialization on for the fields
you specify. You can use the SERIALIZEON keyword in the DML command together with the
SERIALIZE keyword in the BEGIN LOAD command; when you do, the DML-level
serialization overrides the BEGIN LOAD-level serialization, and the DML command with the
serialization option in effect is serialized on the fields specified. Operations generated from
the first IMPORT statement take place before operations generated from the second
IMPORT.
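The interplay between the two serialization levels can be sketched as follows. This is a hedged illustration only: the table, layout, and field names are hypothetical, and the exact option syntax is documented with the BEGIN LOAD and DML commands.

```
.BEGIN LOAD SESSIONS 4 SERIALIZE ON ERRORTABLE acct_err;
.LAYOUT acctlay;
.FIELD Account_Number * INTEGER KEY; /* KEY: part of the primary index for serialization */
.FIELD Balance * DECIMAL(9,2);
.DML LABEL updbal SERIALIZEON (Account_Number); /* DML-level serialization overrides the BEGIN LOAD level */
UPDATE accounts SET Balance = :Balance WHERE Account_Number = :Account_Number;
.IMPORT INFILE acctdata LAYOUT acctlay APPLY updbal;
.END LOAD;
```

With this sketch, all operations for a given Account_Number would flow through one session, preserving their input order.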
Variables
This section contains information about the variables used in TPump.
Predefined System Variables
Avoid using the prefix &SYS in user-defined symbols, because the names of predefined utility
variables begin with this prefix. Predefined system variables are listed in Table 8.
Table 8: Predefined System Variables

&SYSDATE
  8-character date in the format yy/mm/dd.
&SYSDATE4
  10-character date in the format yyyy/mm/dd.
&SYSDAY
  3-character day of week: MON TUE WED THU FRI SAT SUN.
&SYSDELCNT[n]
  Number of rows deleted from all the target tables of import n. If n is not
  specified, it gives the count of deletes done to all the target tables for all
  imports. The maximum value of n is 4.
&SYSETCNT[n]
  Number of records inserted into the error table for import n. If n is not
  specified, it gives the total count of all the records inserted into the error
  table for all imports. The maximum value of n is 4.
&SYSINSCNT[n]
  Number of rows inserted into all the target tables for import n. If n is not
  specified, it gives the total inserts done to all the target tables for all
  imports. The maximum value of n is 4.
&SYSJOBNAME
  Up to 16 characters (ASCII or EBCDIC) in length, in whichever character set is
  appropriate. If &SYSJOBNAME is not set using the NAME command, it defaults to
  MYYYYMMDD_hhmmss_lllll, where:
    M = macro
    YYYY = year
    MM = month
    DD = day
    hh = hour
    mm = minute
    ss = second
    lllll = the low-order 5 digits of the logon sequence number returned by the
    Teradata Database from the .LOGON command
&SYSOS
  Client operating system:
  • UNIX
  • HP-UX
  • IBM-AIX
  • Win32
  • Linux
  • For VM: VM/SP, VM/XA SP, VM/HPO, VM/XA, VM/ESA
  • For MVS: VS1, MVS, MVS/SP, MVS/ESA
&SYSRC
  Completion code from the last response by the Teradata Database.
&SYSRCDCNT[n]
  Total number of records read for import n. If n is not specified, it gives the
  total records read for all imports.
&SYSTIME
  8-character time in the format hh:mm:ss.
&SYSUPDCNT[n]
  Total updates to all target tables for import n. If n is not given, it gives the
  total updates done to all the target tables for all the imports. The maximum
  value of n is 4.
&SYSUSER
  Client system dependent: CMS user ID or MVS batch user ID. (MVS batch returns
  the user ID only when a security package such as RACF, ACF2, or Top Secret has
  been installed.)
&SYSAPLYCNT[n]
  Number of records applied for import n. If n is not given, it gives the total
  number of records applied for all imports.
&SYSNOAPLYCNT[n]
  Number of records not applied for import n. If n is not given, it gives the
  total number of records not applied for all imports.
&SYSRJCTCNT[n]
  Number of records rejected for import n. If n is not given, it gives the total
  number of rejected records for all imports. The maximum value of n is 4.
Date and Time Variables
&SYSDATE, &SYSDATE4, &SYSTIME, and &SYSDAY reflect the time when TPump begins
execution. The original values are restored at restart. These values are character data types and
should not be used in numeric operations. System variables cannot be modified, only
referenced.
The values returned by &SYSDAY are all uppercase. Monday, for example, is returned as
'MON':
0003 .IF ’&SYSDAY’ NOT = ’MON’ THEN;
14:10:28 - FRI JUL 30, 1993
UTY2402 Previous statement modified to:
0004 .IF ’FRI’ NOT = ’MON’ THEN;
0005 .RUN FILE UTNTS39;
0006 .ENDIF;
This example causes the RUN FILE command to be executed every day except Monday. As the
example shows, any of the system variables can be used as the subject condition within an
IF/ELSE/ENDIF command construct. This allows you to create a script that forces certain
events to occur, or tasks to operate in a predetermined sequence, based on the current setting
of the variable.
As another example, suppose we create the following table:
.SET TABLE TO 'TAB&SYSDAY';
Create table &TABLE (
Account_Number INTEGER NOT NULL,
Last_Name VARCHAR(25),
First_Name VARCHAR(25),
Street_Address VARCHAR(30),
City VARCHAR(20),
State CHAR(2),
Zip_Code CHAR(5),
Balance DECIMAL(9,2) FORMAT '-$,$$$,$$$.99' )
Unique Primary Index (Account_Number);
We then check the system variable &SYSRC for a return code to verify whether the table
already exists; if so, a file offering the options to continue or quit is run at the console. Any
other nonzero return code terminates the job with a Teradata Database error, as follows:
.SET CREATERC TO &SYSRC;
.IF CREATERC = 3803 /* Table &TABLE already exists */
.RUN FILE RUN01;
.ELSE
.IF CREATERC <> 0 THEN
.LOGOFF CREATERC;
.ENDIF
.BEGIN LOAD ----------; /* No errors returned. We can start the job.*/
/*
TPump statements go here.....
*/
.END LOAD;
.LOGOFF;
File RUN01, which runs when the 3803 error causes the RUN FILE command to execute,
contains the following options:
.DISPLAY '&SYSUSER: Table FOO already exists....'
to FILE console;
.DISPLAY '&SYSUSER: Reply <C> Continue anyway...'
to FILE console;
.DISPLAY '&SYSUSER: Reply <A> Abort this JOB....'
to FILE console;
.DISPLAY '&SYSUSER: Reply <C> or <A>. Default <A>'
to FILE console;
.ACCEPT RESPONSE FROM FILE CONSOLE;
.IF RESPONSE <> 'C' THEN
.LOGOFF CREATERC; /* Quit before we cause trouble */
.ENDIF;
Row Count Variables
The row count variables, which are updated for each TPump task, allow you to query the
insert, update, and delete row counts and the error table counts for each import:
• &SYSDELCNT[n]
• &SYSETCNT[n]
• &SYSINSCNT[n]
• &SYSUPDCNT[n]
The values are stored in the TPump utility restart log table and are restored after a client
system or Teradata Database restart.
When EXECUTE <macroname> INSERT|UPDATE|DELETE is used, TPump must rely on the
user to have correctly identified the action (INSERT, UPDATE, or DELETE) that the macro
performs. TPump cannot always determine the number of target tables, and therefore can
only provide a single combined value for all target tables. Using the existing facility of variable
substitution, each system variable can be referenced as soon as it is defined. These variables
are defined during the import phase and should be referenced after the END LOAD
command and before any subsequent BEGIN LOAD command in your TPump job script.
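A hedged sketch of referencing these variables between loads (the threshold, return-code value, and display text are arbitrary choices, not prescribed values):

```
.END LOAD;
/* &SYSINSCNT1 and &SYSETCNT1 are defined once import 1 has completed */
.DISPLAY 'Import 1: &SYSINSCNT1 inserts, &SYSETCNT1 error rows'
to FILE console;
.IF &SYSETCNT1 > 50 THEN;
.LOGOFF 21; /* stop before any subsequent BEGIN LOAD if too many error rows */
.ENDIF;
```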
Utility Variables
TPump supports utility variables, which are set via either the SET command or the
ACCEPT command; Chapter 3 describes them in greater detail.
Additionally, TPump predefines some utility variables that provide information about the
TPump environment at execution time. The name of a variable must begin with an
ampersand (&) when variable substitution is desired, and the rest of the name must obey the
rules for standard Teradata SQL column names. Consequently, the name of the variable
without the ampersand can be no longer than 29 characters, so that with the ampersand it
does not exceed the 30-character limit.
TPump supports an environmental variable for each DML statement executed. At the end of
an import, a variable is established for each statement executed. The variable is named using
the number of the import (one through four), the label of the clause containing the DML
statement, and the number of the statement within the IMPORT's APPLY clause.
Variable Substitution
Variable substitution, which allows dynamic statement modification, is permitted in any
statement: precede the variable name with an ampersand, and each occurrence of the variable
name, preceded by an ampersand, is replaced by its current value. Numeric values are
permitted, but they are converted to character form for the replacement. This replacement
occurs before the statement is analyzed, and the replacement operation for a given statement
occurs only once (one scan). This means that replacements generating ampersand variable
names are not themselves replaced.
Even when it appears in a quoted string, an ampersand is always interpreted as the first
character of a utility variable unless it is immediately followed by another ampersand. Such a
double ampersand is converted to a single textual ampersand.
If a reference to a utility variable is followed by a nonblank character that could appear in a
variable name, there must be a period between the variable and the nonblank character(s).
TPump discards the period in this context.
For example, if a utility variable called &x has the value xy and is to be immediately followed
by the characters .ab in some context, the sequence of variable and characters must appear as
&x..ab to produce xy.ab as the result. Such a double period is converted to a single textual
period and concatenated with the value of the utility variable.
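The substitution rules above can be illustrated with a short sketch (the variable name and values are hypothetical):

```
.SET CITY TO 'Dayton';
/* &CITY..TAB -> Dayton.TAB  (double period yields one literal period)   */
/* &&CITY     -> &CITY       (double ampersand yields one literal '&')   */
.DISPLAY 'Table &CITY..TAB; literal name &&CITY'
to FILE console;
```

After substitution, the DISPLAY line would emit: Table Dayton.TAB; literal name &CITY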
Using ANSI/SQL DateTime Data Types
The following ANSI/SQL DateTime data types can be specified as column or field modifiers in
some of the Teradata SQL statements you use with TPump:
• DATE
• TIME
• TIMESTAMP
• INTERVAL
For example, you can use them in CREATE TABLE statements and in INSERT statements.
However, some restrictions may apply when you use ANSI/SQL DateTime data types.
In the FIELD command, you must convert ANSI/SQL DateTime data types to fixed-length
CHAR data types. See “Using ANSI/SQL DateTime Data Types” on page 135 for a
description of the fixed-length CHAR representation for each ANSI/SQL DateTime data
type.
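A hedged sketch of the FIELD conversion described above (the layout and field names are hypothetical, and the CHAR length must match the character form of the DateTime value, e.g. 10 characters for an ANSI DATE such as 1999-01-01):

```
.LAYOUT datelay;
.FIELD Start_Date * CHAR(10); /* carries an ANSI DATE value as 'yyyy-mm-dd' */
```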
Using Comments
TPump supports C language style comments. A comment begins with a slash-asterisk ('/*'),
and all subsequent input is treated as part of the comment until a terminating asterisk-slash
('*/') is encountered. Comments may nest, and they do not begin within string or character
literals; for example, a '/*' within a quoted string is not treated as the beginning of a
comment. Comments are written to the message destination. Substitution of values for
variable names continues within comments; if the literal variable name is required, code two
ampersands ('&&'). Note that because comments may be recursive, the number of
terminating '*/' sequences must match the number of outstanding '/*' comment sequences
to end the comment.
You have the option of either sending or not sending comments to the Teradata Database. If a
comment is used together with a Teradata SQL statement, a semicolon may be placed as a
terminating character to end the comment; the semicolon tells TPump to strip out the
comment so that it is not sent to the Teradata Database. If a semicolon is not used, the
comment is sent to the Teradata Database together with the Teradata SQL statement.
Nested comments are supported when they occur before or after Teradata SQL statements;
nested comments within Teradata SQL statements are not supported. Nested comments must
terminate with a semicolon. If the semicolon is omitted, the comment is erroneously sent to
the Teradata Database and a syntax error is returned.
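The comment-handling rules above can be sketched as follows (a hedged illustration; the table name is hypothetical):

```
/* Sent along with the statement: no semicolon terminates the comment */
SELECT COUNT(*) FROM accounts /* row check */;

/* Stripped by TPump before the next statement is sent: */
/* nightly maintenance pass */;
DELETE FROM accounts WHERE Balance = 0;
```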
Specifying a Character Set
Table 9 describes the ways to either specify the character set or accept a default specification.

Table 9: Ways to Either Specify a Character Set or Accept a Default Specification

Runtime parameter specification
  Use when you invoke TPump, as described earlier in this chapter:
  • charset=charactersetname for channel-attached VM and MVS client systems
  • -c charactersetname for network-attached UNIX and Windows client systems
Client system specification
  Specify the character set for your client system before invoking TPump by
  configuring the:
  • HSHSPB parameter for channel-attached VM and MVS client systems
  • clispb.dat file for network-attached UNIX and Windows client systems
  Note: The charactersetname specification used when you invoke TPump always
  takes precedence over your current client system specification.
Teradata Database default
  If you do not use a charactersetname specification when you invoke TPump, and
  there is no character set specification for your client system, TPump uses the
  default specification in the Teradata Database system table DBC.Hosts.
  Note: If you rely on the DBC.Hosts table specification for the default character
  set, make sure that the initial logon is in the default character set:
  • EBCDIC for channel-attached VM and MVS client systems
  • ASCII for network-attached UNIX and Windows client systems
TPump utility default
  If there is no character set specification in DBC.Hosts, then TPump defaults to:
  • EBCDIC for channel-attached VM and MVS client systems
  • ASCII for network-attached UNIX and Windows client systems
Character Set Specifications for AXSMODs
When you use an AXSMOD with TPump, the session character set is passed as an attribute to
the AXSMOD for possible use. The attribute value is a variable-length character string with
either the character set name or the character representation of the character set ID. The
attribute name varies based on how you specify the character set:

If you specify the session character set by    The attribute name is
name                                           CHARSET_NAME
ID                                             CHARSET_NUMBER
Multibyte Character Sets
Teradata Database supports multibyte characters in object names when the client session
character set is UTF8 or UTF16. Refer to the Teradata Database International Character Set
Support manual for a list of the valid characters that can be used in object names. In TPump
scripts, enclose object names that use multibyte characters in double quotation marks.
To log on with the UTF8 character set or another supported multibyte character set (Chinese,
Japanese, or Korean), keep object names shorter than 30 bytes. This limitation applies to the
userid, password, and account; the logon may fail if any of these exceeds 30 bytes.
Multibyte character sets affect the operation of certain TPump commands, as well as object
names in Teradata SQL statements, as shown in the following table.

ACCEPT (utility variables)
  The utility variables may contain multibyte characters. If the client does not
  allow multibyte character set names, then the filename must be in uppercase
  English.
BEGIN LOAD (target table and error table names)
  Target table names and error table names may contain multibyte characters.
DML (DML label name)
  The label name in a DML statement may contain multibyte characters. The label
  name may be referenced in the APPLY clause of an IMPORT statement.
FIELD (field name)
  The field name specified may contain multibyte characters. The name can be
  referenced in other FIELD commands in NULLIF and field concatenation
  expressions, and in APPLY WHERE conditions in IMPORT commands. The FIELD
  command can also contain a NULLIF expression, which may use multibyte
  characters.
FILLER (filler name)
  The name specified in a FILLER command may contain multibyte characters.
IF (IF condition)
  The condition in an IF statement may compare multibyte character strings.
LAYOUT (layout name, CONTINUEIF condition)
  The layout name may contain multibyte characters and may be used in the LAYOUT
  clause of an IMPORT command. The CONTINUEIF condition may specify multibyte
  character set character comparisons.
LOGON (user name, password)
  The user name and password may contain multibyte characters.
LOGTABLE (table name, database name)
  The logtable name and database name may contain multibyte characters.
NAME (sets &SYSJOBNAME)
  This variable may contain kanji characters.
SET (utility variable)
  The utility variable may contain multibyte characters. The variable can be
  substituted wherever substitution is allowed.
TABLE (table and database name)
  The table name (and the database name, if the table name is fully qualified)
  specified in a TABLE statement may contain multibyte characters. Avoid using
  the TABLE command when using UTF8 or UTF16 character sets by explicitly
  specifying the layout.
Using Graphic Data Types
The GRAPHIC, VARGRAPHIC, and LONG VARGRAPHIC data types are supported for
defining double-byte character set data. TPump accepts GRAPHIC data in the input data set
or file containing the TPump statements to be executed.
The FIELD and FILLER statements that describe the input data are the statements affected by
GRAPHIC data support. Table 10 lists the GRAPHIC data types that can be specified for the
datadesc option in a FIELD or FILLER statement.
Table 10: GRAPHIC Data Types for the datadesc Option in FIELD or FILLER Statements

GRAPHIC(n)
  Input length: (n*2) bytes if n is specified; otherwise 2 bytes (n=1 is assumed).
  Description: n double-byte characters.
VARGRAPHIC(n)
  Input length: m+2 bytes, where m/2 <= n.
  Description: 2-byte integer, followed by m/2 double-byte characters.
LONG VARGRAPHIC
  Input length: m+2 bytes, where m/2 <= 32000.
  Description: 2-byte integer, followed by m/2 double-byte characters.
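Building on Table 10, FIELD specifications for double-byte data might look like this (a hedged sketch; the layout and field names are hypothetical):

```
.LAYOUT kanjilay;
.FIELD KName * GRAPHIC(10);    /* 20 bytes of input: 10 double-byte characters */
.FIELD KAddr * VARGRAPHIC(30); /* 2-byte length, then up to 30 double-byte characters */
```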
Using Graphic Constants
TPump supports two forms of graphic constants. A graphic literal or string constant is
allowed only in the Kanji EBCDIC character sets, and it must have an even number of bytes
within the quoted string to represent double-byte characters.
The two forms of graphic constants are:
• The KanjiEBCDIC graphic constant form (used on the IBM mainframe)
• The hexadecimal representation of graphic data (used on both IBM mainframe and
  network platforms)
For more information on graphic constants and hexadecimal constants, refer to the Teradata
Database SQL Reference: Fundamentals.
Restrictions and Limitations
Table 11 describes TPump restrictions and limitations on operational features and functions.
Table 11: Restrictions and Limitations on Operational Features and Functions

Maximum file size
  2 gigabytes (MP-RAS UNIX systems).
Maximum row size
  The maximum row size for a TPump job, data plus indicators, is approximately
  64,000 bytes. This limit is a function of the row size limit of the Teradata
  Database.
Aggregate operators; concatenation of data files; data retrieval from the Teradata
Database via SELECT statements
  Not allowed.
Expressions
  Evaluated from left to right, using the Teradata Database order of preference,
  which can be overridden by parentheses.
IMPORT commands
  Limit of four IMPORT commands within a single TPump load task.
Date specification
  For dates before 1900 or after 1999, the year portion of the date must be
  represented by four numerals (yyyy). The default of two numerals (yy) to
  represent the year is interpreted as the 20th century, and must be overridden
  to avoid spurious year information. If the table column defined as type DATE
  does not have the proper format, your dates may not process properly. The
  correct date format must be specified at the time of table creation, for
  example:
  CREATE TABLE tab (ADATE DATE);
  DEFINE a (char(10))
  INSERT tab (ADATE = :a(DATE, FORMAT 'yyyy-mm-dd'));
Access logging
  Unlike the MultiLoad and FastLoad utilities, access logging can cause a severe
  performance penalty in TPump. This is because TPump uses normal SQL operations
  rather than a proprietary protocol, so if all successful table updates are
  logged, a log entry is made for each operation. The primary index of the access
  logging table may then create the possibility of row hash conflicts.
Primary indexes and partitioning column sets
  Specify values for the partitioning column set when performing TPump deletes
  and updates to avoid lock contention problems that can degrade performance.
  Avoid updating primary index and partitioning columns with TPump to minimize
  performance degradation.
Termination Return Codes
When a TPump job terminates, it returns a completion code to the client system using the
conventions listed in Table 12:

Table 12: Termination Return Codes

Code  Description
0     Normal completion
4     Warning
8     User error
12    Severe internal error
16    No message destination is available
Note: To avoid ambiguous or conflicting results, always use values greater than 20 when you
specify a return code with your LOGOFF command.
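A hedged sketch of the note above (the chosen value 21 and the threshold are arbitrary; any value greater than 20 avoids colliding with TPump's own completion codes):

```
.IF &SYSRJCTCNT1 > 100 THEN;
.LOGOFF 21; /* user-defined return code, deliberately greater than 20 */
.ENDIF;
```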
Many CLI and Teradata Database errors generate a return code of 12. For Teradata Database
errors that can be corrected and resubmitted, TPump tries up to 16 times to resubmit and, at
the end of this process, returns a code of 12. The exceptions are:
• Errors on Teradata SQL statements outside of the TPump task (before the BEGIN LOAD
  command or after the END LOAD command). TPump ignores these errors; they display
  error messages but do not cause early termination, nor do they generate TPump return
  codes.
• Retryable errors or errors caused by Teradata Database restarts.
Writing a TPump Job Script
This section describes the contents of a TPump job script and explains how to write one.
Definition
The TPump job script, or program, is a set of TPump commands and Teradata SQL statements
that alter the contents of the specified target tables in the Teradata Database. These commands
and statements:
• Insert new rows
• Update some or all of the contents of selected existing rows
• Delete selected existing rows
Each TPump job includes a number of support commands that establish and maintain the
TPump support environment, and a number of task commands that perform the actual
database insert, update, or delete operations. These commands and statements identify and
describe the input data to be applied to the target table, and then place that data into the target
table. These activities may commence anytime after configuring the program as described in
“TPump Support Environment” on page 37.
Caution:
In the event of a client failure, the identical script must be resubmitted in order to restart. If
the script is edited, a restart will not be permitted.
Script Writing Guidelines
The following script writing guidelines will help you write a TPump job script:
• A script may contain up to four IMPORTs (tasks), delimited by a leading BEGIN LOAD
  command and a trailing END LOAD command.
• The BEGIN LOAD command specifies the number of sessions and establishes a number of
  controlling parameters.
  The BEGIN LOAD command also specifies the error table, which is the only table specified
  in the command. An optional qualifying database name may also be specified. This
  database name may be different from the database being modified, thus allowing tables to
  be created and dropped with no impact on the production database.
  In addition, the BEGIN LOAD command establishes acceptable threshold levels for
  important task controls, such as the number and percentage of errors, session limits,
  duration of logon attempts in hours (tenacity), and checkpointing frequency. This
  command also provides optional controls to:
  • determine where any macros are placed
  • guarantee serial operations on given rows
  • select the number of statements to pack into a multiple-statement request
  • select a restart logic mode
• The next item appearing in a script is usually a description of the records in the external
  file containing the change data for the target tables. The description of these input records
  appears in a sequence of commands headed by the LAYOUT command.
  The LAYOUT command tags the record layout being depicted with a unique name, which
  is then referenced by subsequent script commands in tasks throughout the rest of the job.
  The LAYOUT command is followed by the supporting information contained in the
  sequence of one or more FIELD, FILLER, and TABLE commands.
• Each FIELD command describes a single data item occupying a column in the input row.
  These items are described by data type, starting position, length, and several other
  characteristics. The FIELD command is used only for those items (columns) relevant to
  the current task, which are to be sent to the Teradata Database as changes to the target
  table.
  The FIELD command may include the KEY modifier if the column is to be considered part
  of the primary index for purposes of serialization.
• Each FILLER command describes a column in the input row in the same way as the FIELD
  command, but FILLER fields are never sent to the Teradata Database; the FILLER
  command identifies those columns that you do not want sent. Thus, if a sequence of 10
  alternating FIELD and FILLER commands is used to describe 10 contiguous columns in
  the row, every other column, a total of five columns, would be sent to the Teradata
  Database.
• The TABLE command identifies an existing table with the same layout as the input. The
  TABLE command is used when the changes are being enacted on entire rows, rather than
  selected columns.
• The next entry in the script is the DML command, which is followed by the DML
  statements INSERT, UPDATE, and DELETE. The DML command creates an identifying
  label for the DML statement input, which immediately follows the command. The DML
  command also defines an error handling process for missing and duplicate rows, with
  respect to the error table.
  The three DML statements (INSERT, UPDATE, and DELETE) follow the DML command,
  and may occur in any order and in any quantity. The INSERT statement places a complete
  and entirely new row into the target table.
  The UPDATE statement takes the data contents from columns in the input record, as
  defined by the LAYOUT, FIELD, FILLER command sequence, and substitutes the data
  into the target table. The UPDATE rows are selected based on criteria specified in a
  conditional clause in the statement.
  The DML command also allows UPDATE and INSERT statements to be paired to provide
  TPump with an upsert capability. This allows TPump, in a single pass, to attempt an
  UPDATE and, if it fails, perform an INSERT on the same row.
  The DELETE statement removes entire rows from the target table whenever the deleting
  condition, specified in a conditional clause in the statement, evaluates to true.
• The only information not yet provided in the task is the identity of the input data file, the starting and ending records in the file to be used in this task, and other related information. This is supplied with the IMPORT command, which tells the TPump utility to bring in file X, from record A through record N, to associate the layout name (and specifications) with the input records, and to apply the desired DML (INSERT, UPDATE, and DELETE) statements to each record.
• The last command in the script is the END LOAD command. This command flags the end of the commands and statements for the task, and triggers the program to begin executing the task.
For compatibility with the MultiLoad utility, multiple IMPORTs (up to four) are allowed within a single BEGIN/END LOAD pair. However, because TPump does not have an apply phase, there is no significant difference between a script containing four BEGIN/END LOAD pairs, each with one IMPORT, and a script with one BEGIN/END LOAD pair and four IMPORTs.
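Taken together, the commands above give a task the following general shape. This is a minimal sketch only, with hypothetical table, column, and file names, not a complete, runnable job:

```
.BEGIN LOAD SESSIONS 4 ERRORTABLE err_accts;    /* start the task              */
.LAYOUT acct_layout;                            /* describe the input record   */
.FIELD  acct_no  * INTEGER KEY;                 /* sent; serialization key     */
.FILLER pad1     * CHAR(2);                     /* read but never sent         */
.FIELD  balance  * DECIMAL(10,2);               /* sent                        */
.DML LABEL upd_acct;                            /* label for the DML below     */
UPDATE accounts SET balance = :balance WHERE acct_no = :acct_no;
.IMPORT INFILE acctfile FROM 1 THRU 1000        /* records 1 through 1000      */
    LAYOUT acct_layout
    APPLY upd_acct;
.END LOAD;                                      /* begin executing the task    */
```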
Procedure for Writing a Script
A complete TPump job includes:
• Invoking TPump
• Logging onto the Teradata Database and establishing the TPump support environment
• Specifying the TPump tasks
• Logging off from the Teradata Database and terminating TPump
Use the following procedure as a guide for writing TPump job scripts:
1  Invoke TPump, specifying your runtime options:
   • Normal or abbreviated (brief) printout
   • Number of buffers per session
   • Character set
   • Configuration file
   • Periodicity rate
   • Error logging function
   • Macro save option
   • Alternate run file
   • Verbose mode
   Refer to “Invoking TPump” for detailed information about how to specify these options.
2  Establish the TPump support environment using the support commands summarized in Table 2.
As a minimum, this part of your TPump job must include:
• A LOGTABLE command to specify the restart log table
• A LOGON command to provide a logon string that is used to connect all Teradata SQL and TPump utility sessions with the Teradata Database
3  Specify the TPump task using the task commands summarized in Table 2.
4  If you want to specify another TPump task:
   • Use the support commands to modify the TPump support environment for the next task.
   • Use the task commands to specify the next task.
   Repeat these steps for each task in your TPump job.
   Note: Though a single TPump job can include a number of different tasks, limiting your jobs to a single task for each invocation of TPump provides the highest assurance of a successful restart/recovery operation if a system failure interrupts your job.
5  Use the LOGOFF command to disconnect all active sessions with the Teradata Database and terminate TPump on your client system.
TPump Script Example
The following example shows what a simple TPump script and its corresponding output might look like. Lines that begin with four-digit numbers (for example, 0001) are script statements; the rest is output.
**** 16:07:17 UTY6633 WARNING: No configuration file, using build defaults
     ========================================================================
     =                                                                      =
     =                 Teradata Parallel Data Pump Utility                  =
     =                         Release 12.00.00.00                          =
     =                            Platform MVS                              =
     =                                                                      =
     ========================================================================
     =                                                                      =
     =      Copyright 1997-200, NCR Corporation. ALL RIGHTS RESERVED.       =
     =                                                                      =
     ========================================================================
**** 16:07:17 UTY2411 Processing start date: TUE JUL 10, 2007
     ========================================================================
     =                                                                      =
     =                           Logon/Connection                           =
     =                                                                      =
     ========================================================================
0001 .LOGTABLE tpperf1a_lt1a;
0002 .LOGON TDQY/tpperf1a,;
**** 16:07:18 UTY8400 Teradata Database Release: 12.00.00.00
**** 16:07:18 UTY8400 Teradata Database Version: 12.00.00.00
**** 16:07:18 UTY8400 Default character set: EBCDIC
**** 16:07:18 UTY8400 Maximum supported buffer size: 1M
**** 16:07:26 UTY6211 A successful connect was made to the RDBMS.
**** 16:07:26 UTY6217 Logtable 'TPPERF1A.TPPERF1A_LT1A' has been created.
     ========================================================================
     =                                                                      =
     =                     Processing Control Statements                    =
     =                                                                      =
     ========================================================================
0003 .NAME TPUP1;
0004 .BEGIN LOAD
         SESSIONS 20
         PACK 20
         ROBUST ON
         SERIALIZE ON
         CHECKPOINT 0
         ERRORTABLE ET_TPPERF1A_TASK1A;
     ========================================================================
     =                                                                      =
     =                      Processing TPump Statements                     =
     =                                                                      =
     ========================================================================
0005 .LAYOUT LAY1A;
0006 .FIELD F0 * integer key;
0007 .FIELD F1 * integer;
0008 .FIELD F2 * integer;
0009 .FIELD F3 * char(38);
0010 .DML LABEL ONE;
0011 UPDATE tpperf1a set F2 = F2 + 1 where F0 = :F0 and F1 = :F1;
0012 .IMPORT INFILE DATA
         FROM 1 THRU 96
         LAYOUT LAY1A
         APPLY ONE;
0013 .END LOAD;
**** 16:07:27 UTY6609 Starting to log on sessions...
**** 16:07:28 UTY6610 Logged on 20 sessions.
     ========================================================================
     =                                                                      =
     =                       TPump Import(s) Beginning                      =
     =                                                                      =
     ========================================================================
**** 16:07:28 UTY6630 Options in effect for following TPump Import(s):
     .       Tenacity: 4 hour limit to successfully connect load sessions.
     .   Max Sessions: 20 session(s).
     .   Min Sessions: 16 session(s).
     .     Checkpoint: 0 minute(s).
     .       Errlimit: No limit in effect.
     .   Restart Mode: ROBUST.
     .  Serialization: ON.
     .        Packing: 20 Statements per Request.
     .   StartUp Rate: UNLIMITED Statements per Minute.
**** 16:07:36 UTY6608 Import 1 begins.
**** 16:07:39 UTY6641 Since last chkpt., 96 recs. in, 96 stmts., 20 reqs
**** 16:07:39 UTY6647 Since last chkpt., avg. DBS wait time: 0.05
**** 16:07:39 UTY6612 Beginning final checkpoint...
**** 16:07:39 UTY6641 Since last chkpt., 96 recs. in, 96 stmts., 20 reqs
**** 16:07:39 UTY6647 Since last chkpt., avg. DBS wait time: 0.05
**** 16:07:40 UTY6607 Checkpoint Completes with 96 rows sent.
**** 16:07:40 UTY6642 Import 1 statements: 96, requests: 20
**** 16:07:40 UTY6643 Import 1 average statements per request: 4.80
**** 16:07:40 UTY6644 Import 1 average statements per record: 1.00
**** 16:07:40 UTY6645 Import 1 statements/session: avg. 4.80, min. 4.00, max. 5.00
**** 16:07:40 UTY6646 Import 1 requests/session: average 1.00, minimum 1.00, maximum 1.00
**** 16:07:40 UTY6648 Import 1 DBS wait time/session: avg. 0.05, min. 0.00, max. 1.00
**** 16:07:40 UTY6649 Import 1 DBS wait time/request: avg. 0.05, min. 0.00, max. 1.00
**** 16:07:40 UTY1803 Import processing statistics
     .                                        IMPORT 1    Total thus far
     .                                       =========    ==============
     Candidate records considered:........          96.......          96
     Apply conditions satisfied:..........          96.......          96
     Errors loggable to error table:......           0.......           0
     Candidate records rejected:..........           0.......           0
     ** Statistics for Apply Label : ONE
     Type   Database   Table or Macro Name   Activity
     U      tpperf1a   tpperf1a              96
**** 16:07:42 UTY0821 Error table tpperf1a.ET_TPPERF1A_TASK1A is EMPTY, dropping table.
0014 .if &imp1_one_1 = 96 then;
**** 16:07:44 UTY2402 Previous statement modified to:
0015 .if 96 = 96 then;
0016 .display 'rowcount ok' to file systest;
0017 .else;
0018 .display 'rowcount not ok' to file systest;
0019 .endif;
0020 .if &sysetcnt = 0 then;
**** 16:07:44 UTY2402 Previous statement modified to:
0021 .if 0 = 0 then;
0022 .display 'no errors' to file systest;
0023 .else;
0024 .display 'errors!!!' to file systest;
0025 .endif;
     ========================================================================
     =                                                                      =
     =                           Logoff/Disconnect                          =
     =                                                                      =
     ========================================================================
**** 16:07:55 UTY6216 The restart log table has been dropped.
**** 16:07:55 UTY6212 A successful disconnect was made from the RDBMS.
**** 16:07:55 UTY2410 Total processor time used = '0.23474 Seconds'
     .      Start : 16:07:17 - MON JULY 16, 2007
     .      End   : 16:07:55 - MON JULY 16, 2007
     .      Highest return code encountered = '0'.
Viewing TPump Output
TPump reporting functions provide timely information about the status of tasks in progress
and those just completed.
• TPump Statistics—The TPump Statistics facility provides information on the success or failure of TPump processing, with respect to data records transferred, target table row modifications, and error table statistics. These statistics are accumulated and presented at the end of the job.
• TPump Options Messages—The options messages list the settings of some important TPump task parameters.
• Logoff/Disconnect Messages—The logoff/disconnect messages report several key run statistics.
TPump Statistics
For each task, TPump accumulates statistical items and writes them to the customary output
destination of the external system, SYSPRINT/stdout (or the redirected stdout), or the
destination specified in the ROUTE command. The statistics listed in Table 13 are kept:
Table 13: TPump Statistics

Reference
Number   Reference Item                   Statistic
1        Candidate records considered     The number of records read.
2        Apply conditions satisfied       The number of statements sent to the RDBMS. If
                                          there are no rejected or skipped records, this
                                          value is equal to the number of candidate
                                          records, multiplied by the number of APPLY
                                          statements referenced in the import.
3        Errors loggable to error table   The number of records resulting in errors on the
                                          RDBMS. These records are found in the associated
                                          error table.
4        Candidate records rejected       The number of records which are rejected by the
                                          TPump client code because they are formatted
                                          incorrectly.
5        Statistics for Apply Label       This area breaks out the total activity count
                                          for each statement within each DML APPLY clause.
                                          The ‘Type’ column contains the values U for
                                          update, I for insert, and D for delete. Note
                                          that unlike the other reported statistics, these
                                          values are NOT accumulated across multiple
                                          imports.
6        Number of RDBMS requests sent    These statistics are displayed only in the
                                          verbose mode, which is selected as a runtime
                                          parameter, VERBOSE, in MVS, or -v in UNIX.
In addition, Teradata TPump receives a count of the number of rows deleted from the
Teradata Database. Teradata TPump writes it either to SYSPRINT/stdout (or the redirected
stdout), or the destination specified in the ROUTE command.
If a record is rejected due to an error, as in the case of a duplicate, missing, or extra insert,
update, or delete row, the following statistical output shows that an error condition occurred.
     .                                        IMPORT 1    Total thus far
     .                                       =========    ==============
     Candidate records considered:........           8.......           8 <-----(1)
     Apply conditions satisfied:..........           8.......           8 <-----(2)
     Errors loggable to error table:......           1.......           1 <-----(3)
     Candidate records rejected:..........           1.......           1 <-----(4)
     Number of RDBMS requests sent:.......           6.......           6 <-----(6)
     ** Statistics for Apply Label : LABELB
     Type   Database   Table or Macro Name   Activity
     I      CME        TDBTB734_TAL          7 <-(5)
Restart Statistics
Teradata TPump stores statistics in the restart log table. After a restart, all statistics are
properly restored.
Teradata TPump Statistical Output
The following is an example of Teradata TPump output. Lines marked on the right-hand side with <-----(n) are explained in Table 13.
**** 16:53:31 UTY6633 WARNING: No configuration file, using build defaults
     ========================================================================
     =                                                                      =
     =                 Teradata Parallel Data Pump Utility                  =
     =                         Release 12.00.00.00                          =
     =                            Platform MVS                              =
     =                                                                      =
     ========================================================================
     =                                                                      =
     =      Copyright 1997-200, NCR Corporation. ALL RIGHTS RESERVED.       =
     =                                                                      =
     ========================================================================
**** 16:53:31 UTY2411 Processing start date: MON JULY 16, 2007
     ========================================================================
     =                                                                      =
     =                           Logon/Connection                           =
     =                                                                      =
     ========================================================================
0001 .LOGTABLE LT_SIGH;
0002 .LOGON pebble/cme,;
**** 16:53:32 UTY8400 Teradata Database Release: 12.00.00.00
**** 16:53:32 UTY8400 Teradata Database Version: 12.00.00.00
**** 16:53:32 UTY8400 Default character set: EBCDIC
**** 16:53:32 UTY8400 Maximum supported buffer size: 1M
**** 16:53:41 UTY6211 A successful connect was made to the RDBMS.
**** 16:53:41 UTY6217 Logtable 'CME.LT_SIGH' has been created.
     ========================================================================
     =                                                                      =
     =                     Processing Control Statements                    =
     =                                                                      =
     ========================================================================
0003 CREATE TABLE TAB1, FALLBACK, NO JOURNAL
(F0 integer
,F1 integer
,F2 integer
,F3 char(38)) UNIQUE PRIMARY INDEX(F0)
;
**** 16:53:44 UTY1016 'CREATE' request successful.
0004 CREATE TABLE TAB2, FALLBACK, NO JOURNAL
(F0 integer
,F1 integer
,F2 integer
,F3 char(38)) UNIQUE PRIMARY INDEX(F0)
;
**** 16:53:48 UTY1016 'CREATE' request successful.
0005 .BEGIN LOAD
         SESSIONS 10
         ROBUST ON
         SERIALIZE ON
         CHECKPOINT 10
         NOMONITOR
         ERRORTABLE ET_TEST1;
     ========================================================================
     =                                                                      =
     =                      Processing TPump Statements                     =
     =                                                                      =
     ========================================================================
0006 .LAYOUT LAY1A;
0007 .FIELD F0 * integer key;
0008 .FIELD F1 * integer;
0009 .FIELD F2 * integer;
0010 .FIELD F3 * char(38);
0011 .DML LABEL TAB1PART1;
0012 INSERT into tab1 values (:F0,:F1,:F2,:F3);
0013 .DML LABEL TAB2PART1;
0014 INSERT into tab2 values (:F0,:F1,:F2,:F3);
0015 .DML LABEL TAB1UPSERT
         DO INSERT FOR MISSING UPDATE ROWS
         IGNORE DUPLICATE INSERT ROWS;
0016 UPDATE tab1 set F2=F2 + 1 where f0=:f0 + 50 and f1 > 4;
0017 INSERT into tab1 ( F0, F1, F2, F3) values (:F0 + 100,:F1,:F2,:F3);
0018 .DML LABEL TAB2UPSERT
         DO INSERT FOR MISSING UPDATE ROWS
         IGNORE DUPLICATE INSERT ROWS;
0019 UPDATE tab2 set F2=F2 + 1 where f0=:f0 + 50 and f1 > 4;
0020 INSERT into tab2 ( F0, F1, F2, F3) values (:F0 + 100,:F1,:F2,:F3);
0021 .IMPORT INFILE INDATA
         FROM 1 THRU 100
         LAYOUT LAY1A
         APPLY TAB1PART1
         APPLY TAB2PART1;
0022 .IMPORT INFILE INDATA
         FROM 1 THRU 100
         LAYOUT LAY1A
         APPLY TAB1UPSERT
         APPLY TAB2UPSERT;
0023 .END LOAD;
**** 16:53:48 UTY6609 Starting to log on sessions...
**** 16:53:49 UTY6610 Logged on 10 sessions.
     ========================================================================
     =                                                                      =
     =                       TPump Import(s) Beginning                      =
     =                                                                      =
     ========================================================================
**** 16:53:49 UTY6630 Options in effect for following TPump Import(s):
     .       Tenacity: 4 hour limit to successfully connect load sessions.
     .   Max Sessions: 10 session(s).
     .   Min Sessions: 8 session(s).
     .     Checkpoint: 10 minute(s).
     .       Errlimit: No limit in effect.
     .   Restart Mode: ROBUST.
     .  Serialization: ON.
     .        Packing: 20 Statements per Request.
     .   StartUp Rate: UNLIMITED Statements per Minute.
**** 16:54:00 UTY6608 Import 1 begins.
**** 16:54:05 UTY6641 Since last chkpt., 100 recs. in, 200 stmts., 10 reqs
**** 16:54:05 UTY6647 Since last chkpt., avg. DBS wait time: 0.30
**** 16:54:05 UTY6612 Beginning final checkpoint...
**** 16:54:05 UTY6641 Since last chkpt., 100 recs. in, 200 stmts., 10 reqs
**** 16:54:05 UTY6647 Since last chkpt., avg. DBS wait time: 0.30
**** 16:54:05 UTY6607 Checkpoint Completes with 200 rows sent.
**** 16:54:05 UTY6642 Import 1 statements: 200, requests: 10
**** 16:54:05 UTY6643 Import 1 average statements per request: 20.00
**** 16:54:05 UTY6644 Import 1 average statements per record: 1.00
**** 16:54:05 UTY6645 Import 1 statements/session: avg. 20.00, min. 20.00, max. 20.00
**** 16:54:05 UTY6646 Import 1 requests/session: average 1.00, minimum 1.00, maximum 1.00
**** 16:54:05 UTY6648 Import 1 DBS wait time/session: avg. 0.30, min. 0.00, max. 3.00
**** 16:54:05 UTY6649 Import 1 DBS wait time/request: avg. 0.30, min. 0.00, max. 3.00
**** 16:54:05 UTY1803 Import processing statistics
     .                                        IMPORT 1    Total thus far
     .                                       =========    ==============
     Candidate records considered:........         100.......         100 <-----(1)
     Apply conditions satisfied:..........         200.......         200 <-----(2)
     Errors loggable to error table:......           0.......           0 <-----(3)
     Candidate records rejected:..........           0.......           0 <-----(4)
     Number of RDBMS requests sent:.......          10.......          10 <-----(6)
     ** Statistics for Apply Label : TAB1PART1
     Type   Database   Table or Macro Name   Activity
     I      CME        tab1                  100 <-(5)
     ** Statistics for Apply Label : TAB2PART1
     Type   Database   Table or Macro Name   Activity
     I      CME        tab2                  100
**** 16:54:19 UTY6608 Import 2 begins.
**** 16:54:29 UTY6641 Since last chkpt., 100 recs. in, 300 stmts., 171 reqs
**** 16:54:29 UTY6647 Since last chkpt., avg. DBS wait time: 0.00
**** 16:54:29 UTY6612 Beginning final checkpoint...
**** 16:54:29 UTY6641 Since last chkpt., 100 recs. in, 300 stmts., 171 reqs
**** 16:54:29 UTY6647 Since last chkpt., avg. DBS wait time: 0.00
**** 16:54:29 UTY6607 Checkpoint Completes with 200 rows sent.
**** 16:54:29 UTY6642 Import 2 statements: 300, requests: 171
**** 16:54:29 UTY6643 Import 2 average statements per request: 1.75
**** 16:54:29 UTY6644 Import 2 average statements per record: 1.50
**** 16:54:29 UTY6645 Import 2 statements/session: avg. 30.00, min. 30.00, max. 30.00
**** 16:54:29 UTY6646 Import 2 requests/session: average 17.10, minimum 17.00, maximum 18.00
**** 16:54:29 UTY6648 Import 2 DBS wait time/session: avg. 0.00, min. 0.00, max. 0.00
**** 16:54:29 UTY6649 Import 2 DBS wait time/request: avg. 0.00, min. 0.00, max. 0.00
**** 16:54:29 UTY1803 Import processing statistics
     .                                        IMPORT 2    Total thus far
     .                                       =========    ==============
     Candidate records considered:........         100.......         200
     Apply conditions satisfied:..........         200.......         400
     Errors loggable to error table:......           0.......           0
     Candidate records rejected:..........           0.......           0
     ** Statistics for Apply Label : TAB1UPSERT
     Type   Database   Table or Macro Name   Activity
     U      CME        tab1                  50
     I      CME        tab1                  50
     ** Statistics for Apply Label : TAB2UPSERT
     Type   Database   Table or Macro Name   Activity
     U      CME        tab2                  50
     I      CME        tab2                  50
**** 16:54:36 UTY0821 Error table CME.ET_TEST1 is EMPTY, dropping table.
     ========================================================================
     =                                                                      =
     =                           Logoff/Disconnect                          =
     =                                                                      =
     ========================================================================
**** 16:54:49 UTY6216 The restart log table has been dropped.
**** 16:54:49 UTY6212 A successful disconnect was made from the RDBMS.
**** 16:54:49 UTY2410 Total processor time used = '1.0363 Seconds'
     .      Start : 16:53:31 - MON JULY 16, 2007
     .      End   : 16:54:49 - MON JULY 16, 2007
     .      Highest return code encountered = '0'.
The above script has a realistic degree of complexity: it demonstrates a TPump job that contains two imports, each with at least two associated statements.
For the first import there are two statements, each specified in a separate DML command. The IMPORT statement references the two statements through two APPLY clauses.
The second import adds complexity by placing two statements in each DML command. In this case, the two statements in each DML form an upsert.
TPump Options Messages
The options message lists the settings of some important TPump task parameters. A few
examples follow:
Example 1
The following example depicts a typical options message.
**** 17:09:34 UTY6630 Options in effect for following TPump Import(s):
     .       Tenacity: 4 hour limit to successfully connect load sessions.
     .   Max Sessions: 10 session(s).
     .   Min Sessions: 8 session(s).
     .     Checkpoint: 10 minute(s).
     .       Errlimit: 1 rejected record(s).
     .   Restart Mode: ROBUST.
     .  Serialization: ON.
     .        Packing: 20 Statements per Request.
     .   StartUp Rate: UNLIMITED Statements per Minute.
Example 2
In this example, the error limit is expressed as a percent of rows, not as a hard limit, the
recovery mode is simple, and serialization is on.
**** 17:09:34 UTY6630 Options in effect for following TPump Import(s):
     .       Tenacity: 4 hour limit to successfully connect load sessions.
     .   Max Sessions: 4 session(s).
     .   Min Sessions: 4 session(s).
     .     Checkpoint: 5 minutes.
     .       Errlimit: 10% of records rejected.
     .   Restart Mode: SIMPLE.
     .  Serialization: ON.
     .        Packing: 20 Statements per Request.
     .   StartUp Rate: 500 Statements per Minute.
Example 3
In this example, there is no error limit in effect and tenacity has been set to zero.
**** 17:09:34 UTY6630 Options in effect for following TPump Import(s):
     .       Tenacity: Sessions must successfully connect on first try.
     .   Max Sessions: 1 session(s).
     .   Min Sessions: 1 session(s).
     .     Checkpoint: 5 minutes.
     .       Errlimit: No limit in effect.
     .   Restart Mode: ROBUST.
     .  Serialization: OFF.
     .        Packing: 40 Statements per Request.
     .   StartUp Rate: UNLIMITED Statements per Minute.
Logoff/Disconnect Messages
In response to the LOGOFF command, TPump completes the step by disconnecting active
sessions and reporting on total run statistics. The logtable is either dropped or kept,
depending on the success or failure of the previous activity.
When you log off a TPump session, the following status messages are written to the
SYSPRINT/stdout (or the redirected stdout) data destination, or to the destination specified
in the ROUTE command.
**** 13:57:45 UTY6216 The restart log table has been dropped.
**** 13:57:45 UTY6212 A successful disconnect was made from the RDBMS.
**** 13:57:45 UTY2410 Total processor time used = '0.270389 Seconds'
     .      Start : 13:57:16 - MON JULY 16, 2007
     .      End   : 13:57:45 - MON JULY 16, 2007
     .      Highest return code encountered = '0'.

Progress Monitoring
TPump differs from most other Teradata load utilities in that QrySessn provides no summarized support for it. Instead, the optional TPump Monitor (see Table 14) is the only method for remotely overseeing the progress of the utility. Note, however, that while TPump requests do appear in the QrySessn output, they are displayed as a collection of individual transactions instead of being summarized into one utility instance.
Monitoring TPump Jobs
TPump provides an optional monitoring tool for monitoring and updating TPump jobs. The TPump Monitor provides run-time monitoring of a TPump job, allowing you, through a command-line interface, to track and alter the rate at which requests are issued to the RDBMS.
The TPump Monitor provides the following functions:
• Provides a set of SQL scripts that create a Monitor Interface table. TPump updates this table approximately once every minute.
• Allows you to learn the status of an import by querying against the Monitor Interface table.
• Allows you to alter the statement rate of an import by updating the Monitor Interface table.
Monitor Interface Table
Use SQL scripts shipped with TPump to create a Monitor Interface Table
(SysAdmin.TPumpStatusTbl) in the RDBMS where TPump maintains information about an
import. TPump both reads commands from and updates status in the Monitor Interface
Table.
This table is required in order to use the TPump Monitor functionality, but is otherwise optional. If the table does not exist, the worst that happens is that TPump issues a warning message to that effect.
Because this table must be secure, it is created by the DBA. An SQL script, tpumpar.csql, provided with the TPump installation, performs the appropriate setup. The tpumpar.csql script includes an action request.
Table 14 describes the following columns (other columns exist to support future functionality):
Table 14: Monitor Interface Table

Name            Type         Notes
LogDB           VARCHAR(32)  Part of primary index. The name of the log table database.
LogTable        VARCHAR(32)  Part of primary index. The name of the log table.
Import          INTEGER      Part of primary index. The import number. (There may be
                             multiple imports in a TPump job.)
UserName        VARCHAR(32)  The name of the user running the job. Used for security.
InitStartDate   DATE         The initial start date of the import.
InitStartTime   FLOAT        The initial start time of the import.
CurrStartDate   DATE         The last date this import was started (may be a restart).
CurrStartTime   FLOAT        The last time this import was started (may be a restart).
LastUpdateDate  DATE         The last date this import updated the table.
LastUpdateTime  FLOAT        The last time this import updated the table.
RestartCount    INTEGER      The number of times this import has been restarted.
Complete        CHAR(1)      ‘Y’ if this import is complete. (There may be multiple
                             imports.)
RecordsOut      INTEGER      The number of statements sent to the RDBMS.
RecordsSkipped  INTEGER      The number of records skipped for apply conditions.
RecordsRejcted  INTEGER      The number of records rejected for bad data (on host).
RecordsRead     INTEGER      The number of records read.
RecordsErrored  INTEGER      The number of records resulting in errors on the RDBMS.
StmtsUnLimited  CHAR(1)      ‘Y’ if this import is running without a statement rate
                             limit. If ‘N’, refer to StmtsDesired for the statement
                             rate.
StmtsDesired    INTEGER      The statement rate (if StmtsUnLimited is ‘N’).
PeriodsDesired  INTEGER      Allows you to specify the desired periodicity.
PleaseAbort     CHAR(1)      Set to ‘Y’ if you desire to abort.
RequestAction   CHAR(1)      Before processing any action request, a message is logged
                             stating that the requested action is being taken. The
                             following action requests are permitted:
                             • Blank – No action
                             • C – Take a checkpoint and continue the job
                             • P – Take a checkpoint and pause until a subsequent
                               action request resumes or terminates the job
                             • R – Resume the job
                             • T – Take a checkpoint and terminate the job with
                               rc = 8. The job may be restarted.
                             • A – Terminate the job immediately with rc = 12. The
                               job may be restarted.
RequestChange   CHAR(1)      Set to ‘Y’ by the user if you desire TPump to pick up the
                             changes. Set to ‘N’ by TPump after changes are accepted.
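For example, a database administrator with direct access to the table could post an action request to pause a running import. This is a sketch only; the log database, log table, and import number are hypothetical, and the supplied views and macros are the recommended interface:

```sql
/* Ask import 1 of the job logged in CME.LT_SIGH to checkpoint and pause. */
UPDATE SysAdmin.TPumpStatusTbl
SET    RequestAction = 'P'
WHERE  LogDB    = 'CME'
AND    LogTable = 'LT_SIGH'
AND    Import   = 1;
```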
Security concerns dictate that the SQL script to set up the Monitor Interface table for TPump
monitoring also establishes a set of views and macros in addition to the TPumpStatusTbl.
Although database administrators can access the table directly, using macros and views is
recommended because they provide for security and ensure rational use of the table.
Without action on the part of the database administrator, no normal user can update the
status of jobs. To grant controlled update access to the TPumpStatusTbl, a single command
will suffice:
“GRANT EXEC ON TPumpMacro TO _____;”
The macros for TPump monitoring reside in the databases TPumpMacro and SysAdmin.
TPump Monitor Views
The following views of the Monitor Interface table are available:
View SysAdmin.TPumpStatus
This view is for database administrators and lets them see all running TPump imports.
CREATE VIEW SysAdmin.TPumpStatus AS LOCKING
SysAdmin.TPumpStatusTbl FOR ACCESS
SELECT * FROM SysAdmin.TPumpStatusTbl;
View SysAdmin.TPumpStatusX
This view is for all users and provides a view of TPump jobs. However, this view shows only the jobs that you “own”.
CREATE VIEW SysAdmin.TPumpStatusX AS LOCKING
SysAdmin.TPumpStatusTbl FOR ACCESS
SELECT * FROM SysAdmin.TPumpStatusTbl
WHERE UserName = USER;
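As a usage sketch, a user could check the progress of an import through this view with a query like the following. The log table name and import number are hypothetical; the columns referenced are those described in Table 14:

```sql
/* How far along is import 1 of the job whose restart log table is LT_SIGH? */
SELECT RecordsRead, RecordsOut, RecordsErrored, Complete
FROM   SysAdmin.TPumpStatusX
WHERE  LogTable = 'LT_SIGH'
AND    Import   = 1;
```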
TPump Monitor Macros
TPump Monitor provides a set of macros that you can use to update the Monitor Interface
table and to monitor and control individual TPump import jobs. The following TPump
Monitor macros are provided:
Macro TPumpMacro.TPumpUpdateSelect
This macro is provided for database administrators to use to manipulate and monitor
individual TPump jobs:
CREATE MACRO SysAdmin.TPumpUpdateSelect
(
    LogDB          VARCHAR(32),
    LogTable       VARCHAR(32),
    UserName       VARCHAR(32),
    Import         INTEGER,
    RequestChange  CHAR(1),
    StmtsUnLimited CHAR(1),
    StmtsDesired   INTEGER,
    PeriodsDesired INTEGER
)
AS
(
    LOCK ROW WRITE /* OR LOCKING Sysadmin.TPumpStatus FOR WRITE */
    SELECT
        RecordsOut,
        RecordsSkipped,
        RecordsRejcted,
        RecordsRead,
        RecordsErrored
    FROM
        SysAdmin.TPumpStatusTbl
    WHERE
        UserName = :UserName AND
        LogDB    = :LogDB    AND
        LogTable = :LogTable AND
        Import   = :Import
    ; UPDATE SysAdmin.TPumpStatusTbl
    SET
        RequestChange  = :RequestChange,
        StmtsUnLimited = :StmtsUnLimited,
        StmtsDesired   = :StmtsDesired,
        PeriodsDesired = :PeriodsDesired
    WHERE
        UserName = :UserName AND
        LogDB    = :LogDB    AND
        LogTable = :LogTable AND
        Import   = :Import
    ;
);
Macro TPumpMacro.UserUpdateSelect
The macro UserUpdateSelect is provided to let you monitor and update your own TPump jobs.
CREATE MACRO TPumpMacro.UserUpdateSelect
(
    LogDB          VARCHAR(32),
    LogTable       VARCHAR(32),
    Import         INTEGER,
    RequestChange  CHAR(1),
    StmtsUnLimited CHAR(1),
    StmtsDesired   INTEGER,
    PeriodsDesired INTEGER
)
AS
(
    LOCK ROW WRITE /* OR LOCKING Sysadmin.TPumpStatus FOR WRITE */
    SELECT
        RecordsOut,
        RecordsSkipped,
        RecordsRejcted,
        RecordsRead,
        RecordsErrored
    FROM
        SysAdmin.TPumpStatusTbl
    WHERE
        UserName = USER      AND
        LogDB    = :LogDB    AND
        LogTable = :LogTable AND
        Import   = :Import
    ; UPDATE SysAdmin.TPumpStatusTbl
    SET
        RequestChange  = :RequestChange,
        StmtsUnLimited = :StmtsUnLimited,
        StmtsDesired   = :StmtsDesired,
        PeriodsDesired = :PeriodsDesired
    WHERE
        UserName = USER      AND
        LogDB    = :LogDB    AND
        LogTable = :LogTable AND
        Import   = :Import
    ;
);
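As a usage sketch, a user could throttle a running import with this macro. The log database, log table, import number, new statement rate, and periodicity shown are hypothetical placeholders:

```sql
/* Report progress counters for import 1 of the job logged in CME.LT_SIGH,
   then request a limit of 100 statements per minute. RequestChange = 'Y'
   asks TPump to pick up the change; StmtsUnLimited = 'N' enables the limit. */
EXEC TPumpMacro.UserUpdateSelect ('CME', 'LT_SIGH', 1, 'Y', 'N', 100, 4);
```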
Estimating Space Requirements
This section discusses space requirements for the TPump log table.
A row of approximately 200 bytes is written to the log table on each of the following events:
1  One row is written at initialization.
2  One row is written for each SQL statement issued through the TPump support environment.
3  One row is written at the BEGIN LOAD command.
4  One row is written at the END LOAD command.
5  Two rows are written for each IMPORT command.
6  One row is written for each statement used in a load (between the BEGIN LOAD command and the END LOAD command).
7  One row is written for each checkpoint taken.
8  In ROBUST mode, for each packed request, a number of partial checkpoint rows are written to the log between checkpoints. These rows are deleted each time a checkpoint is written.
The partial checkpoint row contains 117 + (12 * packfactor) bytes per transaction. So the
number of partial checkpoints will vary, depending on the checkpoint frequency, the power of
the loading host, and the power of the Teradata target RDBMS.
Thus, an equation for the space is:

200 + 200 * (each statement in the support environment) + 400 * (each BEGIN/END LOAD pair) + 200 * (each statement issued as DML) + 200 * (the estimated number of checkpoints) + (117 + (12 * packfactor)) * (the number of partial checkpoints)

A simplified version is:

R = 200 + 200S + 400L + 200D + 200C + (117 + 12P)N

where:
R = Required space for the TPump log table
S = Number of SQL statements issued through the support environment
L = Number of BEGIN/END LOAD command pairs
D = Number of DML statements
C = Estimated number of checkpoints
P = Packfactor
N = Number of partial checkpoints
Space Calculation Example
The following example of how TPump log table space is derived takes a simple load that
consists of the following script:
LOGTABLE CME.TLddNT14H;
.LOGON OPNACC1/CME,CME;
DROP TABLE TBL14TA;
DROP TABLE TBL14TB;
DROP TABLE tlnt14err;
CREATE TABLE TBL14TA,FALLBACK
   (ABYTEINT  BYTEINT,
    ASMALLINT SMALLINT,
    AINTEGER  INTEGER,
    ADECIMAL  DECIMAL (5,2),
    ACHAR     CHAR (5),
    ABYTE     BYTE(1),
    AFLOAT    FLOAT,
    ADATE     DATE)
UNIQUE PRIMARY INDEX (ASMALLINT);
CREATE TABLE TBL14TB,FALLBACK
(ABYTEINT BYTEINT,
ASMALLINT SMALLINT,
AINTEGER INTEGER,
ADECIMAL DECIMAL (5,2),
CHAR
CHAR (5),
ABYTE
BYTE(1),
AFLOAT
FLOAT,
ADATE
DATE)
UNIQUE PRIMARY INDEX (ASMALLINT);
/*****************************************************************/
/* BEGIN TLOAD WITH ALL THE OPTIONS SPECIFIED SUCH AS ERRLIMIT, **/
/* CHECKPOINT, SESSIONS,TENACITY
**/
/*****************************************************************/
.BEGIN LOAD ERRLIMIT 5 CHECKPOINT 15 SESSIONS 4 1 TENACITY 2
ERRORTABLE tlnt14err ROBUST ON PACK 20;
.LAYOUT LAY1A;
.FILLER ATEST
* BYTEINT;
.FIELD ABYTEINT * BYTEINT;
.FIELD ASMALLINT * SMALLINT;
.FIELD AINTEGER * INTEGER;
.FIELD ADECIMAL * DECIMAL (5,2);
.FIELD ACHAR
* CHAR (5);
.FIELD ABYTE
* BYTE(1);
.FIELD AFLOAT
* FLOAT;
.FIELD ADATE
* DATE;
.DML LABEL LABELA IGNORE DUPLICATE ROWS IGNORE MISSING ROWS
IGNORE EXTRA ROWS;
INSERT INTO TBL14TA VALUES (:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,
:ACHAR,:ABYTE,:AFLOAT,:ADATE);
.DML LABEL LABELB IGNORE DUPLICATE ROWS IGNORE MISSING ROWS
IGNORE EXTRA ROWS;
INSERT INTO TBL14TB VALUES (:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,
:ACHAR,:ABYTE,:AFLOAT,:ADATE);
.IMPORT INFILE ./tlnt014.dat
LAYOUT LAY1A FROM 1 FOR 400
APPLY LABELA WHERE ATEST = 1
APPLY LABELB WHERE ATEST = 2;
.END LOAD;
.LOGOFF;
From this script the space requirements can be calculated to be:
• 200 bytes for initialization +
• 200 bytes * 6 for support environment statements +
• 200 bytes * 2 for DML SQL statements +
• 400 bytes for the BEGIN/END LOAD pair +
• 200 bytes for the IMPORT
which is a starting total of 2,400 bytes.
Further, assume that the Teradata Database can accept about 32 statements per second and that the load takes a little more than an hour to complete. The space for partial and complete checkpoints is calculated with the following steps:
1. 32 statements per second translates to 1920 statements per minute.
2. 1920 / 20 (the packing factor) = 96 partial checkpoints per minute.
3. Multiply by 15 (the 15-minute checkpoint frequency) = 1440 partial checkpoint rows maximum.
4. Each partial checkpoint row is 117 + (12 * 20) = 357 bytes, so 514,080 bytes are in partial checkpoint rows.
5. Given that the load takes just more than an hour, assume 5 checkpoints are written at 300 bytes each, or 1,500 bytes.
Now we have the grand total of space in the log table:
2,400 + 514,080 + 1,500 = 517,980 bytes.
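The example's arithmetic can be checked with a short script. This is illustrative only; the 300-byte checkpoint row size is taken from the example, not from the general formula:

```python
stmts_per_min = 32 * 60                  # 32 statements/second -> 1920 per minute
partials_per_min = stmts_per_min // 20   # packing factor 20 -> 96 partial checkpoints/minute
partial_rows = partials_per_min * 15     # 15-minute checkpoint frequency -> 1440 rows
partial_row_bytes = 117 + 12 * 20        # 357 bytes per partial checkpoint row
partial_total = partial_rows * partial_row_bytes  # bytes in partial checkpoint rows

fixed = 200 + 200 * 6 + 200 * 2 + 400 + 200       # 2,400 bytes of one-time rows
checkpoint_total = 5 * 300               # 5 checkpoints at 300 bytes each
grand_total = fixed + partial_total + checkpoint_total
print(grand_total)                       # 517980
```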
CHAPTER 3
TPump Commands
This chapter describes the TPump commands and Teradata SQL statements that you can
execute from the TPump utility.
Experienced TPump users can also refer to the simplified command descriptions in the
TPump chapter of the Teradata Tools and Utilities Command Summary. This book provides
the syntax diagrams and a brief description of the syntax variables for each Teradata client
utility.
Syntax Notes
This section provides information you should know before using TPump commands and
Teradata SQL statements.
Each TPump command:
• Must begin on a new line.
• Must start with a period (.) character. In this document, TPump command periods are shown only in syntax diagrams and examples, but not in the narrative text.
• Must end with a semicolon (;) character.
• May continue for as many lines as necessary, as long as it satisfies the beginning and ending requirements.
Statements are standard Teradata SQL statements and are not preceded by periods.
See Appendix A: “How to Read Syntax Diagrams” for more information about how to read the syntax diagrams used in this book.
TPump Commands
Table 15 is an alphabetical list of the commands supported by TPump. The syntax and use of
these commands is described in detail in this chapter.
Table 15: TPump Commands

ACCEPT
    Accepts the data type and value of one or more utility variables from an external source.
BEGIN LOAD
    Indicates the start of a TPump task and specifies the parameters for executing the task.
DATEFORM
    Lets you define the form of the DATE data type specifications for the TPump job.
DISPLAY
    Writes messages to the specified destination.
DML
    Defines a label and error treatment option for the Teradata SQL DML statement(s) following the DML command. INSERT, UPDATE, DELETE, and EXECUTE are the DML statement options.
ELSE
    The ELSE command is followed by commands and statements that execute when the preceding IF command is false.
ENDIF
    Exits from the conditional IF or IF/ELSE command sequences. ENDIF is followed by the commands and statements that resume the program.
END LOAD
    Indicates completion of TPump command entries and initiates the task. This is the last command of a TPump task.
FIELD
    Defines a field of the data source record. Fields specified by this command are sent to the Teradata Database. This command is used with the LAYOUT command.
FILLER
    Defines a field in the data source that is not sent to the Teradata Database. This command is used with the LAYOUT command.
IF
    The IF command is followed by a conditional expression which, if true, executes the commands and statements following the IF command.
IMPORT
    Identifies the data source, layout, and optional selection criteria to the client program.
LAYOUT
    Specifies the layout of the externally stored data records to be used in the TPump task. This command is used in conjunction with an immediately following sequence of FIELD, FILLER, and TABLE commands.
LOGOFF
    Disconnects all active sessions and terminates execution of TPump on the client.
LOGON
    Establishes a Teradata SQL session on the Teradata Database, and specifies the LOGON string to be used in connecting all sessions required by subsequent functions.
LOGTABLE
    Identifies the table to be used for journaling checkpoint information required for safe, automatic restart of TPump in the event of a client or Teradata Database failure.
NAME
    Sets the utility variable &SYSJOBNAME with a job name of up to 16 characters.
PARTITION
    Establishes session partitions to transfer SQL requests to the Teradata Database.
ROUTE
    Identifies the destination of output produced by TPump.
RUN FILE
    Invokes the specified external source as the current source of commands and statements.
SET
    Assigns a data type and a value to a utility variable.
SYSTEM
    Suspends TPump to issue commands to the local operating system.
TABLE
    Identifies a table whose column names and data descriptions are used as the names and data descriptions of the input record fields. Used in place of, or in addition to, the FIELD command. This command is used with the LAYOUT command.
    Note: When the UTF16 session character set is used, the TABLE command generates a field of CHAR(2n) for a CHAR(n) column and a field of VARCHAR(2m) for a VARCHAR(m) column, because each character in the column takes 2 bytes of storage under the UTF16 session character set.
TPump Teradata SQL Statements
The following Teradata SQL statements supported by TPump are included in this chapter
because they require special considerations for use with TPump. They are used for loading
purposes and for creating TPump macros. The syntax and use of these Teradata SQL statements are described in detail in this chapter.
Table 16: TPump Teradata SQL Statements

DATABASE
    Changes the default database qualification for all DML statements.
DELETE
    Removes specified rows from a table.
EXECUTE
    Specifies a user-created (predefined) macro for execution. The macro named in this statement resides in the Teradata Database and specifies the type of DML statement (INSERT, UPDATE, or DELETE) being handled by the macro.
INSERT
    Adds new rows to a table by directly specifying the row data to be inserted.
UPDATE Statement and Atomic Upsert
    Changes field values in existing rows of a table.
ACCEPT
Purpose
The ACCEPT command accepts data types and values from an external source and uses them
to set one or more utility variables.
Syntax

.ACCEPT var [, var ...] FROM FILE fileid
   [IGNORE {charpos1 | charpos1 THRU | THRU charpos2 | charpos1 THRU charpos2}] ;
where

var
    name of the utility variable that is to be set with the value accepted from the designated source. Character string values appear as quoted strings in the data file.

fileid
    data source of the external system. The external system DD (or similar) statement specifies a file.
    UNIX and Windows: infilename (the path name for a file). If the path name has embedded white space characters, enclose the entire path name in single or double quotes.
    MVS: a true DDNAME. If DDNAME is specified, TPump reads data records from the specified source. A DDNAME must obey the same rules for its construction as Teradata SQL column names, except that:
    • the “at” sign (@) is allowed as an alphabetic character
    • the underscore (_) is not allowed
    The DDNAME must also obey the applicable rules of the external system. If DDNAME represents a data source on magnetic tape, the tape may be either labelled or nonlabelled (if the operating system supports it).
    VM/CMS: a FILEDEF name.

charpos1 and charpos2
    start and end character positions of a field in each input record that contains extraneous information. TPump ignores the specified field(s) as follows:
    • If charpos1 is specified, TPump ignores only the single specified character position.
    • If charpos1 THRU is specified, TPump ignores character positions from charpos1 through the end of the record.
    • If THRU charpos2 is specified, TPump ignores character positions from the beginning of the record through charpos2.
    • If charpos1 THRU charpos2 is specified, TPump ignores character positions from charpos1 through charpos2.
Usage Notes
A single record, row, or input line is accepted from the designated source. Ensure that there is
only one record in the file from which the ACCEPT command is getting the variables.
If multiple variables are coded, each is sequentially assigned input text up to the first white
space character encountered that is not within a quoted string.
Input text for numeric values must be delimited only by white space or record boundaries.
Input text for character strings must be enclosed in apostrophes. For example:
.Accept age, name from file info;
Teradata Parallel Data Pump Reference
93
Chapter 3: TPump Commands
ACCEPT
The data record provided to satisfy the preceding ACCEPT should include two fields. The
following example shows two sample data records, where the first is correct but the second is not:

32 'Tom'    /* This line contains valid data. */
32 Tom      /* Tom is invalid data.           */

An additional method of placing comments in input text is as follows:

32 'Tom'; This line contains valid data.
When the number of variables listed is greater than the number of responses available, unused
variables remain undefined (NULL). If there are not enough variables to hold all responses, a
warning message is issued. If the input source is a file, the next record (starting with the first)
of the file is always retrieved.
BEGIN LOAD
Purpose
The BEGIN LOAD command initiates or restarts a TPump task, specifying the number of
sessions to use and any other parameters needed to execute the task.
Syntax

.BEGIN LOAD SESSIONS number [threshold]
   [ERRORTABLE [dbname.]tname [APPEND] [NODROP] [QUEUETABLE]]
   [ERRLIMIT errcount [errpercent]]
   [CHECKPOINT frequency]
   [SERIALIZE ON|OFF]
   [DATAENCRYPTION ON|OFF]
   [ARRAYSUPPORT ON|OFF]
   [TENACITY hours]
   [PACK statements | PACKMAXIMUM]
   [LATENCY seconds]
   [RATE statement_rate]
   [RETRYTIMES nn]
   [SLEEP minutes]
   [NOTIMERPROCESS]
   [NOATOMICUPSERT]
   [NOMONITOR]
   [ROBUST ON|OFF]
   [MACRODB dbname]
   [NOTIFY OFF|LOW|MEDIUM|HIGH|ULTRA [EXIT name [TEXT 'string'] [MSG 'string']]] ;
where

SESSIONS
    keyword for the number of TPump sessions

number
    number of sessions to be logged on for update purposes for TPump. A TPump task logs on and uses the number of sessions specified. One additional session is used for performing various utility functions. There is no default value for number; it must be specified. Neither is there a maximum value, except for system-wide session limitations, which vary among machines.
    Limiting the number of sessions conserves resources on both the external system and the Teradata Database. This conservation is at the expense of a potential decrease in throughput and increase in elapsed time.

threshold
    minimum number of sessions to be logged on for update purposes for the utility. When logging on sessions, if the limits are reached above the threshold value, TPump stops trying to log on and uses whatever sessions are already logged on. If the sessions run out before the threshold is reached, TPump logs off all sessions, waits for the time determined by the SLEEP value, and tries to log on again.
ERRORTABLE
    optional keyword for identifying a database and error table. You can use a database name as a qualifying prefix to the error table name. Specifying a database that is not your production database avoids cluttering your production system with error tables; because that database should probably have a lot of PERM space, the space will not have to be increased for all databases with tables involved in the TPump task.
    Caution: Do not share the restart table between two or more TPump jobs. Each TPump job must have its own restart log table to ensure that it runs correctly. If you do not use a distinct restart log table for each TPump job, the results are unexpected, and you may not be able to restart one or more of the affected jobs.

APPEND
    tells TPump to use the existing error table. If the specified error table does not exist, TPump creates it. If the structure of the existing error table is not compatible with the error table TPump creates, the job encounters an error when TPump tries to insert into or update the error table.

NODROP
    tells TPump not to drop the error table even if it is empty at the end of the job. NODROP can be used with APPEND to persist the error table, or alone.

QUEUETABLE
    tells TPump to select the error table as a Queue Table.
dbname.
    the qualifying database for the error table. If the database is not specified, the database that contains the log table is used. The period following dbname separates the database name from the tname parameter. Specifying a different database may help avoid cluttering the production database with error tables.

tname
    error table that receives information about errors detected during the load. tname may be preceded by a database name qualifier. This table is referred to as the error table or ET table.
    TPump does not explicitly specify the level of protection applied to this table; it uses the default protection level of the database in which the error table is placed. If the database specifies fallback, tname becomes fallback.
    The default error table name is composed of the job name, followed by an underscore and the sequence number of the load, then an underscore and ET, as in jobname_nnn_ET.
    tname identifies a nonexisting table for a nonrestart task, or an existing table for a restart task.
    For all errors inserted in this error table, the identifiers for the offending combination of statement and data record are included in the appropriate row of tname. The columns in the error table allow you to identify the specific data record and statement combination that produced an error. The column names and definitions of the error table are:
    • ImportSeq: a byteint containing the IMPORT command sequence number.
    • DMLSeq: a byteint containing the sequence number of the DML command within the command file.
    • SMTSeq: a byteint containing the sequence number of the DML statement within the DML command.
    • ApplySeq: a byteint containing the sequence number of the APPLY clause within its IMPORT command.
    • SourceSeq: an integer containing the position of a data record within a data source.
    • DataSeq: a byteint identifying the data source. This value is always one.
    • ErrorCode: an integer containing an error return code.
    • ErrorMsg: contains the corresponding error message for the error code.
    • ErrorField: a smallint which, if valid, indicates the bad field. The names of record fields sent to the Teradata Database are specified via the LAYOUT command, in conjunction with FIELD and TABLE commands.
    • HostData: a variable-length byte string containing the data sent by the external system.
ERRLIMIT
    optional keyword for setting a limit on records rejected for errors. When the ERRLIMIT is exceeded, TPump performs a checkpoint, then terminates the job. The data read before ERRLIMIT was exceeded is submitted and completed before the job is terminated. This means that when a job is terminated because ERRLIMIT was exceeded, there may be more error records in the error table than the number specified in ERRLIMIT. To facilitate diagnosis of data errors, the ERRLIMIT should be greater than the number of statements packed into one request.
errcount
    error threshold for controlling the number of rejected records. Usage depends on whether it is used with the errpercent parameter.
    1. When used without the errpercent parameter, specifies, as an unsigned integer, the number of records that can be rejected and recorded in tname during a load (all records sent between the BEGIN LOAD and END LOAD commands). The default is no limit.
    2. When used with the errpercent parameter (which is approximate), specifies the maximum number of records that must be sent to the Teradata Database before the errpercent parameter is applied.
    For example, if errcount = 100 and errpercent = 5, then 100 records must be sent to the Teradata Database before the approximate 5 percent rejection limit is applied. If only the first five records are rejected when the 100th record is sent, the limit is not exceeded. If there are six rejections, however, the limit is exceeded. After the 100th record is sent, TPump stops processing if the 5 percent limit has been exceeded.
    When the limit has been exceeded, TPump writes an error message to the external system’s customary message destination and terminates the task.
    All tables in use are left in their state at the time of the termination. This allows you to correct errors in data records and restart the task from the last checkpoint. If a restart is not possible or not desired, any unwanted tables should be dropped.
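The errcount/errpercent interaction described above can be sketched as a small decision function. This is an illustrative model only, not TPump source; the function name and signature are invented for this example:

```python
def errlimit_exceeded(records_sent, records_rejected, errcount, errpercent=None):
    """Model of the ERRLIMIT decision described above (illustrative sketch)."""
    if errpercent is None:
        # errcount alone acts as a hard cap on rejected records.
        return records_rejected > errcount
    # With errpercent, the percentage test applies only after
    # errcount records have been sent to the database.
    if records_sent < errcount:
        return False
    return records_rejected * 100 > records_sent * errpercent

# The manual's example: errcount = 100, errpercent = 5.
print(errlimit_exceeded(100, 5, 100, 5))   # 5 rejections of 100: limit not exceeded
print(errlimit_exceeded(100, 6, 100, 5))   # 6 rejections of 100: limit exceeded
```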
CHECKPOINT
    keyword indicating the number of minutes between the occurrences of checkpoints. This is followed by a frequency value.
frequency
    the interval, in minutes, between checkpointing operations. Specify an unsigned integer from 0 through 60, inclusive.
    If you specify a CHECKPOINT frequency of 60 or less, a checkpoint is recorded at the specified frequency, in minutes. If you specify a CHECKPOINT frequency of more than 60, TPump terminates the job.
    Specifying a CHECKPOINT frequency of zero bypasses the checkpoint operation. Prior to Teradata Tools and Utilities 07.00.00, TPump initiated a checkpoint operation when the import began, regardless of whether the CHECKPOINT frequency was set to zero. After Teradata Tools and Utilities 07.00.00, the initial checkpoint operation is bypassed when the CHECKPOINT frequency is set to zero. Bypassing the initial checkpoint operation may cause data corruption when TPump restarts a job, or when a DBS restart occurs during a TPump job, whether the job is running in SIMPLE mode or ROBUST mode.
    If you do not specify a CHECKPOINT frequency, checkpointing occurs every 15 minutes by default. Whether specified or not, checkpoints are written at the end of each data input source.
    Note: Checkpoints should not be set if you use an FDL-compatible INMOD routine with the FOR, FROM, or THRU options. In that case, TPump terminates and an error message appears if the checkpoint frequency is other than zero.
DATAENCRYPTION ON/OFF
    keyword to encrypt import data and the request text during communication between TPump and the Teradata Database. If ON, encryption is performed. If DATAENCRYPTION is not specified, the default is OFF.
    The "-y" runtime parameter applies encryption to all connected sessions, which include the control session and the load sessions. This option applies encryption only to the load sessions (the sessions specified by the SESSIONS keyword in the BEGIN LOAD command), and overrides the "-y" runtime parameter when OFF is explicitly specified. For example, assuming the PARTITION command is not used in the job, when the "-y" runtime parameter is specified and DATAENCRYPTION OFF is specified in the script, encryption applies only to the control session. Similarly, when the "-y" runtime parameter is not specified and DATAENCRYPTION ON is specified in the script, encryption applies to all load sessions but not the control session.
    When the PARTITION command is used, an encryption setting explicitly specified in the PARTITION command overrides the setting of this option for the sessions defined by that PARTITION command.
ARRAYSUPPORT ON/OFF
    "ArraySupport ON|OFF" option to the .BEGIN LOAD command and the .DML command.
    When ArraySupport ON is specified in the .BEGIN LOAD command, the .DML commands enclosed in the .BEGIN LOAD/.END LOAD command pair use the ArraySupport feature for their DML statements, unless ArraySupport OFF is specified for a particular .DML command. The default value of ArraySupport for the .BEGIN LOAD command is OFF. When ArraySupport ON|OFF is not specified with a .DML command, the default value for that .DML command is the effective setting of ArraySupport in the .BEGIN LOAD command where the .DML command resides. When ArraySupport ON|OFF is specified at the .DML command, the specified value overrides the default setting determined by the .BEGIN LOAD command.
    When a .DML command uses the ArraySupport feature, it must contain one and only one DML statement, and the session partition that the .DML command references must be used by that .DML command exclusively.
    If the DML statement is an UPSERT-type statement, it can be specified as a pair of INSERT/UPDATE statements with the DO INSERT FOR MISSING UPDATE clause. TPump creates its equivalent form of UPDATE ... ELSE INSERT ... (that is, Atomic Upsert) and uses it as the actual DML statement. Alternatively, an UPDATE ... ELSE INSERT ... statement can be directly specified with the DO INSERT FOR MISSING UPDATE clause. The non-atomic form of UPSERT is not supported by TPump Array Support.
TENACITY
    keyword (with the hours parameter) defining how long the utility tries to log on the sessions needed to perform the TPump job. If a logon is denied, TPump delays for the time specified by the SLEEP parameter (the default is six minutes) and retries the logon. It retries until either the logon succeeds or the number of hours specified by TENACITY is exceeded. If the TENACITY parameter is not specified, the utility retries the logons for four hours.

hours
    TPump tenacity factor, as an integral number of hours; specifies how long TPump keeps trying to log on the required sessions. The default value for hours is 4 if the parameter is not specified. If hours is specified as 0, TPump does not retry logons after a logon fails because of a capacity limit. When a "no more sessions" error appears (either a 301 return code from a workstation CLI or a 513 return code from a mainframe CLI), TPump drops the sessions already acquired and terminates the job without trying another logon.
LATENCY
    keyword for flushing stale buffers.
    Note: When using the TPump latency option with the Named Pipe Access Module, the need_full_block = no option should be added in the Named Pipe Access Module initialization string.

seconds
    flushing threshold, based on the number of seconds the oldest record has resided in the buffer. LATENCY cannot be less than one second. If the SERIALIZE parameter is set to OFF, only the current buffer can possibly be stale. If SERIALIZE is ON, the number of stale buffers can range from zero to the number of sessions.
NOTIMERPROCESS
    keyword to tell TPump not to fork a child process as a timer process. When a child process is forked, the SIGUSR2 signal notifies the parent process when the latency period expires. When a child process is not forked, the SIGALRM signal notifies the TPump process when the latency period expires. A child process is necessary for the latency function to work properly on UNIX platforms when the MQSeries Access Module is used.

minutes
    number of minutes to wait between unsuccessful logon attempts due to session limit errors on the Teradata Database or CLIv2. If SLEEP is not specified, the default between unsuccessful logon attempts is 6 minutes.
SERIALIZE ON/OFF
    keyword to set the state (ON/OFF) of the serialization feature which, if ON, guarantees that operations on a given key combination (row) occur serially. If SERIALIZE is not specified, the default is OFF. This feature is meaningful only when a primary key for the loaded data is specified by using the KEY option of the FIELD command. To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an explicit value if there are upserts in the TPump job.
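The serial-order guarantee can be pictured as deterministic routing: if every record with a given key value is always sent on the same session, the operations on that key are applied in order. The sketch below illustrates the idea only and is not TPump's actual routing algorithm:

```python
import hashlib

def route_to_session(key, num_sessions):
    """Map a key value to a session number deterministically, so records
    with the same key always share one session and are applied serially
    (an illustration of the SERIALIZE ON guarantee, not TPump internals)."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_sessions

# The same key value always routes to the same session,
# regardless of the order in which records arrive.
print(route_to_session(12345, 4) == route_to_session(12345, 4))  # True
```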
PACKMAXIMUM
    keyword requesting TPump to dynamically determine the maximum possible PACK factor for the current load. The maximum value is 600. The determined value, displayed in message UTY6652, should be specified explicitly on subsequent runs, because the use of PACKMAXIMUM requires iterative interactions with the RDBMS during initialization to heuristically determine the maximum possible PACK factor.

PACK
    keyword for the number of statements to pack into a multiple-statement request. The maximum value is 600. Packing improves network/channel efficiency by reducing the number of sends and receives between the application and the Teradata Database.
statements
    number of statements, as a positive integer up to 600, to pack into a multiple-statement request. The default value is 20 statements per request.
    Under certain conditions, TPump may determine that the pack factor has been set too high. TPump then automatically lowers the pack setting to an appropriate value, issues warning message UTY6625, for instance: "UTY6625 WARNING: Packing has been changed to 12 statements per request", and continues.
    The packing factor is validated by sending a fully packed request to the Teradata Database using a prepare. This test checks for syntax problems and requests that are excessively large and may overwhelm the parser. To simplify the script development process, TPump ignores certain errors returned by an overloaded parser, shrinks the request, retries the prepare until it executes successfully and, finally, issues a warning noting the revised packing factor size. When this happens, the TPump script should be modified to eliminate the warning, thereby avoiding the time-consuming process of shrinking the request.
    A packing failure may occur if the source parcel length does not match the data defined. If this happens, TPump issues the message: "UTY2819 WARNING: Packing may fail because input data does not match with the data defined." To resolve this problem, increase the packing factor and resubmit the job.
RATE
    keyword for entering the rate at which statements are sent to the Teradata Database.

RETRYTIMES nn
    keyword for the number of retry times. The default is 16. If nn equals 0, the retry times is set to 16. If RETRYTIMES is set, it only takes effect for the requests/data between the BEGIN LOAD and END LOAD pair.

statement_rate
    initial maximum rate at which statements are sent to the Teradata Database per minute. The statement rate must be a positive integer. If the statement rate is unspecified, the rate is unlimited. If the statement_rate is less than the statement packing factor, TPump sends requests smaller than the packing factor. If the TPump Monitor is in use, the statement_rate can be changed later on.

SLEEP
    keyword for the number of minutes to sleep.
NOATOMICUPSERT
    keyword to perform non-atomic upsert operations for UPSERT DMLs in the job script if these UPSERT DMLs are not provided in the Atomic UPSERT form.

NOMONITOR
    keyword to prevent TPump from checking for statement rate changes from, or updating status information for, the TPump Monitor.
ROBUST ON/OFF
    The OFF parameter signals TPump to use simple restart logic. In this case, restarts cause TPump to begin where the last checkpoint occurred in the job. Any processing that occurred after the checkpoint is redone. This method does not have the extra overhead of the additional database writes in the robust logic.
    Caution: Certain errors may cause reprocessing, resulting in extra error table rows due to re-executing statements (attempting to re-insert rows, for example). Or, if the target table allows duplicate rows, re-executing statements may cause extra duplicate rows to be inserted into the target table instead of causing extra error table rows.
    Simple logic is adequate for DML statements that can be repeated without changing the results of the operation. Examples of statements that are NOT simple include the following:
    • INSERTs into tables that allow duplicate rows (MULTISET tables)
    • Self-referencing DML statements, such as:
      UPDATE FOO SET A=A+1...
      UPDATE FOO SET A = 3 WHERE A=4
MACRODB
    keyword for the database to contain any macros used by TPump.

dbname
    name of the database that is to contain any macros built/used by TPump. This database overrides the default placement of macros into the database that contains the log restart table.
NOTIFY
    TPump implementation of the notify user exit option:
    • NOTIFY OFF suppresses the notify user exit option.
    • NOTIFY LOW enables the notify user exit option for those events signified by "Yes" in the Low Notification Level column of Table 17.
    • NOTIFY MEDIUM enables the notify user exit option for the most significant events, as specified by "Yes" in the Medium Notification Level column of Table 17.
    • NOTIFY HIGH enables the notify user exit option for every TPump event that involves an operational decision point, as specified by "Yes" in the High Notification Level column of Table 17.
    • NOTIFY ULTRA enables the notify user exit option for every TPump event that involves an operational decision point, as specified by "Yes" in the Ultra Notification Level column of Table 17.
104
Teradata Parallel Data Pump Reference
Chapter 3: TPump Commands
BEGIN LOAD
Syntax Element
Description
EXIT name
keyword phrase that calls a user-defined exit where name is the name of
a user-supplied library with a member name of _dynamn
The default library names are:
• NOTFYEXT for channel-attached VM and MVS client systems
• libnotfyext.so for network-attached UNIX and Windows client
systems
The exit must be written in C, or in a language with a runtime
environment that is compatible with C.
On some versions of UNIX, you may have to add ./ prefix characters to
the EXIT name specification if the module is in the current directory.
TEXT 'string'
user-supplied string of up to 80 characters that TPump passes to the
named exit routine
The string specification must be enclosed in single quote characters (').
MSG 'string'
user-supplied string of up to 16 characters that TPump logs to:
• The operator’s console for channel-attached VM and MVS client
systems
• The system log for network-attached UNIX and Windows client
systems
The string specification must be enclosed in single quote characters (').
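Putting these options together (the session count and strings are illustrative; the library name is the network-attached default noted above), a BEGIN LOAD fragment might request high-level notifications through a user exit:

```
.BEGIN LOAD
   SESSIONS 8
   NOTIFY HIGH EXIT libnotfyext.so TEXT 'nightly load';
```

The TEXT string is passed to the exit routine on each event listed under the High Notification Level column of Table 17.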
Table 17: Events that Create Notifications

Event                      Low  Medium  High  Ultra  Signifies
Initialize                 Yes  Yes     Yes   Yes    Successful processing of the Notify option (BEGIN LOAD command)
File or INMOD Open         No   No      Yes   Yes    Successful processing of the IMPORT command
Checkpoint Begin           No   No      Yes   Yes    TPump started a checkpoint
Checkpoint End             No   No      Yes   Yes    TPump successfully completed a checkpoint
Error Table                No   No      Yes   Yes    Successful processing of the SEL COUNT(*) request for the error table
Table Statistics           No   Yes     Yes   Yes    TPump has successfully written the table statistics
Teradata Database Restart  No   Yes     Yes   Yes    TPump received a crash error from Teradata or CLI
CLIv2 Error                Yes  Yes     Yes   Yes    TPump received a CLIv2 error
RDBMS Error                Yes  Yes     Yes   Yes    A Teradata Database error that terminates TPump
Exit                       Yes  Yes     Yes   Yes    TPump completed a load task
Import Begin               No   No      Yes   Yes    TPump is about to start reading records
Import End                 No   No      Yes   Yes    Last record has been read
Interim Run Statistics     No   No      No    Yes    TPump is about to update the Monitor Interface table, TPump successfully completed a checkpoint, or an Import has just completed successfully
DML Error                  No   No      Yes   Yes    TPump is about to log a DML error to the error table
Usage Notes
Multiple tables can be targeted by a single TPump job.
If the script author is uncertain whether or not to use ROBUST restart logic, it is always safe to
use the ROBUST ON parameter.
To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an
explicit value if there are upserts in the TPump job.
The statement rate per minute you set using the RATE keyword is also affected by the
periodicity value. By default, TPump uses a periodicity value of four when enforcing the
statement rate limit. You can adjust the periodicity rate from 1 to 600 using a run-time
parameter.
For example, if you set the statement rate at 1600 and the periodicity at 10, then the maximum
number of statements processed is 160 (1600/10) statements every 6 (60/10) seconds.
Caution:
A LOGOFF command entered after a BEGIN LOAD and before the matching END LOAD logs you
off the TPump utility.
DATABASE
Purpose
TPump supports the Teradata SQL DATABASE statement, which changes the default database
qualification for all unqualified DML and DDL statements.
Syntax
DATABASE database ;
where
Syntax Element
Description
database
new qualified default database for the error table
Changes the database from the one originally specified by the BEGIN
LOAD command.
Usage Notes
The DATABASE command only affects native SQL commands. In particular, it has no effect
on the BEGIN LOAD command.
The DATABASE command does affect INSERT, UPDATE, DELETE, and EXEC statements
issued as part of a load. (When TPump logs on sessions, it immediately issues a DATABASE
statement on each session.)
The DATABASE command does not affect the placement of TPump macros.
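As an illustrative sketch (the database, table, and file names are invented for this example), the DATABASE statement qualifies the unqualified DELETE that follows it, while macro placement and the BEGIN LOAD specifications are unaffected:

```
.LOGTABLE TPLOG01;
.LOGON tdpid/user,password;
DATABASE Payroll;
.BEGIN LOAD SESSIONS 4;
.LAYOUT Lay01;
.FIELD EmpNum * INTEGER;
.DML LABEL Del01;
DELETE Employee WHERE EmpNo = :EmpNum;
.IMPORT INFILE Infile01 LAYOUT Lay01 APPLY Del01;
.END LOAD;
.LOGOFF;
```

Here the unqualified Employee resolves to Payroll.Employee, but TPump macros are still placed in the database that contains the log restart table (or in the MACRODB database, if one was specified).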
DATEFORM
Purpose
The DATEFORM command lets you define the form of the DATE data type specifications for
the TPump job.
Syntax
.DATEFORM { INTEGERDATE | ANSIDATE } ;
where
Syntax Element
Description
INTEGERDATE
keyword that specifies integer DATE data types for the TPump job
This is the default Teradata DATE data type specification for TPump jobs
if you do not enter a DATEFORM command.
ANSIDATE
keyword that specifies ANSI fixed-length CHAR(10) DATE data types for
the TPump job
Usage Notes
The following topics describe the things you should consider when using the
DATEFORM command.

Command Frequency and Placement
• You can only use one DATEFORM command.
• You must enter the command before the LOGON command.

Data Type Conversions
When you use the ANSIDATE specification, you must convert ANSI/SQL
DateTime data types to fixed-length CHAR data types when specifying
the column/field names in the TPump FIELD command.
See the Usage Notes subsection of the FIELD command for a description
of the fixed-length CHAR representations for each DATE, TIME,
TIMESTAMP, and INTERVAL data type specification.

Release Applicability
The ANSIDATE specification is valid for TPump jobs on the Teradata
Database for UNIX.
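A minimal sketch (names are illustrative): the DATEFORM command precedes the LOGON command, and with ANSIDATE a DATE field is described as fixed-length CHAR(10) in the layout:

```
.DATEFORM ANSIDATE;
.LOGTABLE TPLOG01;
.LOGON tdpid/user,password;
.BEGIN LOAD SESSIONS 4;
.LAYOUT Lay01;
.FIELD EmpNum   * INTEGER;
.FIELD HireDate * CHAR(10);
.DML LABEL Ins01;
INSERT Employee (EmpNo, HireDate) VALUES (:EmpNum, :HireDate);
.IMPORT INFILE Infile01 LAYOUT Lay01 APPLY Ins01;
.END LOAD;
.LOGOFF;
```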
DELETE
Purpose
TPump supports the DELETE Teradata SQL statement, which removes rows from a table.
Syntax
DELETE [FROM] tname [ WHERE condition ] ;
where
Syntax Element
Description
tname
table from which rows are to be deleted
tname is qualified either explicitly by database name, or by the current default
database.
WHERE
condition
conditional clause identifying the row(s) to delete
The conditional clause uses values from input data record fields as defined by a
FIELD command or TABLE command of the layout referenced by an IMPORT
using this statement.
Usage Notes
The following notes describe how to use DELETE statements following a DML command.
A DELETE statement may also be used in the support environment; normal rules for DELETE
are followed in that case.
TPump operates only on single-table statements, so DELETE statements must not contain any
joins.
To delete records from a table, the username specified on the LOGON command must have
DELETE privilege on the specified table.
When the condition(s) of the DELETE statement’s WHERE clause are evaluated, the result
can be definitely true, definitely false, or indeterminate. If the result is true for a specific row,
TPump deletes the row. An indeterminate result, due to an abnormal arithmetic condition
such as underflow, overflow, or division by zero, is treated as an error, and TPump records
both row and error code in the error table.
The DELETE statement must identify only one object.
Remember the following when constructing scripts:
• A DELETE statement can be applied to either a table or view, provided that the view does
not specify a join.
• Equality values for all the primary index columns should normally be specified in the
WHERE clause. The OR construct can be used in the WHERE clause of a DELETE statement;
alternatively, two or more separate DML statements (one per OR term) can be used, with the
DML statements applied conditionally with the APPLY clause of the IMPORT command. The
nature of the alternatives will usually make one of the methods more appropriate.
• The column(s) specified in this clause need not be a part of any index, but can be one or
more nonindexed columns. This clause may specify nonequality values for any
combination of columns of unique indices, or any values for other columns.
• The maximum number of INSERT, UPDATE, DELETE, and EXECUTE statements that
can be referenced in an IMPORT is 127.
• The maximum number of DML statements that can be packed into a request is 600. The
default number of statements packed is 20.
Example
The following example uses an input data source containing a series of one-field, four-byte
records. Each record contains the value (EmpNum) of the primary index column (EmpNo) of
a row to be deleted from the Employee table.
.BEGIN LOAD SESSIONS number;
.LAYOUT Layoutname;
.FIELD EmpNum 1 INTEGER;
.DML LABEL DMLlabelname;
DELETE Employee WHERE EmpNo = :EmpNum;
.IMPORT INFILE Infilename LAYOUT Layoutname APPLY DMLlabelname;
.END LOAD;
DISPLAY
Purpose
The DISPLAY command can be used to write messages to the specified destination.
Syntax
DISPLAY 'text' [TO] FILE fileid ;
where
Syntax Element
Description
‘text’
text to be written to the specified output destination
fileid
destination file on the external system
The external system DD (or similar) statement specifies a file.
UNIX and Windows
outfilename (the path name for a file)
If the path name has embedded white space characters, enclose the
entire path name in single or double quotes.
MVS
a true DDNAME.
If DDNAME is specified, TPump writes the display text to the
specified destination.
A DDNAME must obey the same rules for its construction as
Teradata SQL column names, except that:
• the “at” sign (@) is allowed as an alphabetic character
• the underscore (_) is not allowed
A DDNAME must obey the applicable rules of the external system.
If DDNAME represents a data destination on magnetic tape, the tape
may be either labelled or nonlabelled (if the operating system
supports it).
VM/CMS
a FILEDEF name.
Usage Notes
Utility variables are replaced by their values before the text is displayed. A variable is
referenced by preceding its name with an ampersand (&). To display the name of a utility
variable rather than its value, code two ampersands (&&) instead of one.
To display an apostrophe within the text string, use two consecutive apostrophes (single
quotes) to distinguish it from the single quotes enclosing the string.
In UNIX-based systems, if the same outfilename is used both to redirect stdout and as the
fileid in a DISPLAY command, the results written to outfilename may be incomplete due to
conflicting writes to the same file.
On UNIX systems, you can use an asterisk (*) as the fileid specification to direct the display
messages to the system console/standard output (stdout) device. The system console is the:
• Display screen in interactive mode
• Standard output device in batch mode
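For example (the variable and file names are illustrative, and the variable is assumed to have been established earlier with the SET command), a script might echo a utility variable to a log file and a literal message to stdout:

```
.SET RUNDATE TO '2007-07-15';
.DISPLAY 'Load of &RUNDATE started' TO FILE runlog.txt;
.DISPLAY 'Starting load...' TO FILE *;
```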
DML
Purpose
The DML command defines a label and error treatment options for one or more immediately
following DML statements. DML statements relevant to a TPump job are INSERT, UPDATE,
DELETE, and EXECUTE, with UPDATE and INSERT statements sometimes paired to form
either a basic upsert or an Atomic upsert operation.
Syntax
.DML LABEL label
     [ MARK | IGNORE DUPLICATE [ INSERT | UPDATE ] ROWS ]
     [ MARK | IGNORE MISSING [ UPDATE | DELETE ] ROWS ]
     [ MARK | IGNORE EXTRA [ UPDATE | DELETE ] ROWS ]
     [ DO INSERT FOR MISSING UPDATE ROWS ]
     [ SERIALIZEON ( serialize_on_field [, serialize_on_field ...] ) ]
     [ USE ( use_field [, use_field ...] ) ]
     [ PARTITION partition_name ]
     [ ARRAYSUPPORT ON | OFF ] ;
where
Syntax Element
Description
LABEL
keyword indicating that the following parameter is a label for the DML
statements that follow
label
unique label is to be used for the immediately following set of one or more
DML statements
A label must obey the same rules for its construction as Teradata SQL column
names.
The label name may be referenced in the APPLY clause of an IMPORT
command.
MARK
keyword indicating that the system should make a duplicate, missing, or extra
INSERT, UPDATE, or DELETE row entry in the error table and continue
processing
If MARK is set and a uniqueness violation occurs, the duplicate rows go to the
uniqueness violation table. In the case of an upsert, both the INSERT and
UPDATE portions must fail for an error to be recorded.
A row is a duplicate as follows: The table must be a nonunique primary index
(NUPI) set table. (A set table does not allow duplicates; a MULTISET table
does. MULTISET tables are only supported with Teradata for Windows NT.)
Rows with NUPIs are duplicates if all the columns of a row are the exact
duplicate of another row.
If neither MARK nor IGNORE is specified for duplicate rows, MARK applies to
both INSERTs and UPDATEs. Similarly, if neither MARK nor IGNORE is
specified for missing or extra rows, MARK applies to both UPDATEs and
DELETEs.
MARK is the default for:
• Both UPDATEs and DELETEs that refer to missing or extra rows.
• Duplicate rows arising from both INSERTs and UPDATEs, except when
those statements are combined to form an upsert, in which case the default
is IGNORE.
IGNORE
keyword indicating that the system should not make an error table entry for
the duplicate, missing, or extra INSERT, UPDATE, or DELETE row
The system should continue processing instead.
TPump determines whether a row is a duplicate as follows: The table must be a
NUPI set table. TPump treats rows with NUPIs as duplicates if all the columns
of a row are the exact duplicate of another row.
IGNORE DUPLICATE ROWS does not apply if there are ANY unique indexes
in the row.
If neither INSERT nor UPDATE is specified for duplicate rows, IGNORE
applies to both INSERTs and UPDATEs.
Similarly, if neither UPDATE nor DELETE is specified for missing or extra
rows, IGNORE applies to both UPDATEs and DELETEs. IGNORE is the
default condition for an upsert operation.
INSERT
The upsert feature may be used (when used as DO INSERT FOR MISSING
UPDATE ROWS or DO INSERT ROWS).
An upsert saves you time while loading a database. An upsert completes, in
one pass, an operation which requires two passes for other utilities. The DML
statements that follow this option must be in the order of a single UPDATE
statement followed by a single INSERT statement.
This option first executes the UPDATE statement. If the UPDATE fails because
the target row does not exist, TPump automatically executes the INSERT
statement. This capability lets you update the database without first presorting
the data. Otherwise, the data would have to be sorted into:
• rows that need to be updated
• rows that need to be inserted
Further information on the usage and restrictions of the upsert feature
appears in the following usage notes.
PARTITION
optional keyword used to name a session partition to be used for all SQL
requests associated with this DML command
If this keyword is not present, a session created from the SESSIONS
specification will be used.
Note: If serialization of two or more DML statements is required, the
statements cannot be put in different partitions. Serialization requires that all
DML statements with identical hash values of the rows be submitted from the
same session.
partition_name
parameter identifying the partition name
The partition name must obey the same rules for its construction as Teradata
SQL column names.
SERIALIZEON
keyword used to turn serialization on for the fields you specify
SERIALIZEON keyword may be used before, after, or between any IGNORE
or MARK statements.
serialize_on_field
parameter identifying the field names for which you want to turn serialization
on
This is the same field name you used in the LAYOUT command used by the
INSERT statement and referenced by the APPLY clause you have written.
Separate the field names with a comma and enclose them in parentheses.
USE
keyword used to specify the fields that are to be used with a DML’s SQL
statements
Use of this keyword allows you to specify which FIELDs from the LAYOUT
command are actually needed for each DML, so that data from all fields will
not be sent.
The USE keyword may be placed before, after, or between any IGNORE/
MARK statements.
use_field
parameter identifying the field names to use
Every LAYOUT FIELD used by any of the DML’s SQL statements must be
enumerated in the USE list; otherwise, an error will occur.
Separate the field names with a comma and enclose them in parentheses.
ArraySupport
ON/OFF
"ArraySupport ON|OFF" option to the .BEGIN LOAD command and the
.DML command
When "ArraySupport ON" is specified in the .BEGIN LOAD command, the
.DML commands enclosed in .BEGIN LOAD and .END LOAD command pair
will use the ArraySupport feature for its DML statement, unless "ArraySupport
OFF" is specified for the .DML command. The default value of ArraySupport
for the .BEGIN LOAD command is OFF.
When "ArraySupport ON|OFF" is not specified with the .DML command, the
default value for ArraySupport for that .DML command is the effective setting
of ArraySupport in the .BEGIN LOAD command where the .DML command
resides. When "ArraySupport ON|OFF" is specified at the .DML command,
the specified value overrides the default setting determined by the .BEGIN
LOAD command.
When a .DML command is using the ArraySupport feature, it must contain
one and only one DML statement and the session partition that the .DML
command references needs to be used exclusively by this .DML command.
If the DML statement is an UPSERT-type statement, it can be specified as a
pair of INSERT/UPDATE statements with the DO INSERT FOR MISSING
UPDATE clause. TPump will create the equivalent UPDATE … ELSE
INSERT … form (an Atomic upsert) and use it as the actual DML statement.
Alternatively, an UPDATE … ELSE INSERT … statement can be specified
directly with the DO INSERT FOR MISSING UPDATE clause.
The non-atomic form of UPSERT is not supported by TPump Array Support.
Usage Notes
The SQL EXECUTE command must be used between the BEGIN LOAD command and the
END LOAD command.
All INSERT, UPDATE, DELETE, and EXECUTE statements specified in the TPump script
should fully specify the primary index of the referenced table to prevent the generation of
table-level locks.
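For instance (the table and field names here are illustrative, not from this reference), for a table whose primary index is the compound (StoreNo, ItemNo), the WHERE clause should supply equality conditions on both index columns so that each statement resolves to a row hash lock:

```
UPDATE Inventory
SET    Qty = :NewQty
WHERE  StoreNo = :StoreNum
AND    ItemNo  = :ItemNum;
```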
A maximum of 600 DML statements may be packed into a request; the default is
20 statements.
TPump assumes that row hash locking is used by INSERT, UPDATE, DELETE, and EXECUTE
statements. If row hash locking is not used, TPump will run anyway, but may encounter
trouble because table-level locking will cause each statement to block.
In addition, TPump does not support UPDATE or EXECUTE statements that modify the
primary index of the target table. TPump performs no checking to prevent the script author
from creating DML that requests unreasonable updates, except that TPump will not use
Atomic UPSERT if the UPDATE portion of the UPSERT specifies an unreasonable update.
This restriction is imposed by the Teradata Database.
IGNORE DUPLICATE ROWS does not apply if there are ANY unique indexes in the row.
TPump converts INSERT, UPDATE, and DELETE statements into macro equivalents, and,
depending on the packing specified, submits multiple statements in one request. TPump also
supports the EXECUTE statement, which can be used to bypass the macro creation step for
frequently executed macros. For more information on the EXECUTE statement, refer to
EXECUTE in this chapter.
The maximum number of INSERT, UPDATE, DELETE, and EXECUTE statements that can
be referenced in an IMPORT is 127.
At the end of an IMPORT, an environmental variable is established for each DML command
executed. TPump variables are not limited to 30 characters. These variables contain the
activity counts associated with each statement. The variables created are of the form:
&IMP <n>_<Apply label>_<x>
where
n = the number of the IMPORT, from one through four.
Apply label = the label of the clause containing the DML command in question.
x = the number of the statement within the containing APPLY clause.
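For example (the label and file names are illustrative, and the variable name assumes the pieces above concatenate as shown), a script could report the activity count of the first statement applied by label LABEL01 in the first IMPORT:

```
.END LOAD;
.DISPLAY 'LABEL01 statement 1 affected &IMP1_LABEL01_1 rows' TO FILE stats.txt;
```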
Serialization
The SERIALIZEON keyword lets you turn serialization on for the fields you specify. You can
use the SERIALIZEON keyword before, after, or between any IGNORE or MARK statements.
You can also use the SERIALIZEON keyword with the SERIALIZE keyword in the BEGIN
LOAD command and with the KEY keyword in the FIELD command. When you do, the
DML-level serialization ignores and overrides the BEGIN LOAD-level serialization.
In addition, you can mix DML serialized APPLYs with nonserialized DML APPLYs in the
same IMPORT command.
See “BEGIN LOAD” and “FIELD” for details about these commands.
Sample Scripts
The following is an example using the SERIALIZEON keyword:
.LOGTABLE TPLOG01;
.LOGON slugger/dbc,dbc;
.BEGIN LOAD
ERRLIMIT 100 50
CHECKPOINT 15
SESSIONS 20
TENACITY 2
ERRORTABLE TPERR01
PACK 30
ROBUST ON
NOMONITOR;
.LAYOUT LAY01;
.FIELD cc1 * CHAR(8);
.FIELD cc2 * CHAR(4);
.FIELD cc3 * CHAR(6);
.FIELD cc4 * CHAR(62);
.DML LABEL LABEL01
DO INSERT FOR MISSING UPDATE ROWS
SERIALIZEON (CC1);
UPDATE TPTBL01
SET C4 = :CC4
WHERE C1 = :CC1;
INSERT TPTBL01 (C1, C2, C4)
VALUES (:CC1,:CC2, :CC4);
.DML LABEL LABEL02
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE TPTBL02 SET C4 = :CC4 WHERE C1 = :CC1
AND C2 = :CC2;
INSERT TPTBL02 (C1, C2, C3, C4)
VALUES (:CC1,:CC2,:CC3, :CC4);
.IMPORT INFILE .\TPDAT.txt FORMAT TEXT
LAYOUT LAY01
APPLY LABEL01
APPLY LABEL02;
.END LOAD;
.LOGOFF;
The following is an example using the USE keyword:
.LOGTABLE TPLOG01;
.LOGON slugger/cfl,cfl;
DROP TABLE TPERR01;
DROP TABLE TPTBL01;
DROP TABLE TPTBL02;
DROP TABLE TPTBL03;
CREATE TABLE TPTBL01(
C1 INTEGER,
C2 VARCHAR(30),
C3 VARCHAR(30),
C4 VARCHAR(30),
C5 VARCHAR(30),
C6 VARCHAR(30))
UNIQUE PRIMARY INDEX (C1);
CREATE TABLE TPTBL02(
C1 INTEGER,
C2 VARCHAR(30),
C3 VARCHAR(30),
C4 VARCHAR(30),
C5 VARCHAR(30))
UNIQUE PRIMARY INDEX (C1);
CREATE TABLE TPTBL03(
C1 INTEGER,
C2 VARCHAR(30),
C3 VARCHAR(30),
C4 VARCHAR(30),
C5 VARCHAR(30),
C6 VARCHAR(30),
C7 VARCHAR(30),
C8 VARCHAR(30),
C10 VARCHAR(30),
C11 VARCHAR(30))
UNIQUE PRIMARY INDEX (C1);
.BEGIN LOAD
CHECKPOINT 15
SESSIONS 5
ERRORTABLE TPERR01
NOMONITOR;
.LAYOUT LAY01;
.FIELD FLD1 * VARCHAR(10);
.FIELD FLD2 * VARCHAR(10);
.FIELD FLD3 * VARCHAR(10);
.FIELD FLD4 * VARCHAR(15);
.FIELD FLD5 * VARCHAR(20);
.FIELD FLD6 * VARCHAR(25);
.FIELD FLD7 * VARCHAR(30);
.FIELD FLD8 * VARCHAR(30);
.FIELD FLD9 * VARCHAR(1);
.FIELD FLD10 * VARCHAR(30);
.FIELD FLD11 * VARCHAR(30);
.DML LABEL LABEL01 USE(FLD1, FLD2, FLD4, FLD6, FLD8, FLD10);
INSERT TPTBL01 (C1, C2, C3, C4, C5, C6)
VALUES (:FLD1,:FLD2,:FLD4,:FLD6,:FLD8,:FLD10);
.DML LABEL LABEL02 USE(FLD1, FLD3, FLD5, FLD7, FLD11);
INSERT TPTBL02 (C1, C2, C3, C4, C5)
VALUES (:FLD1,:FLD3,:FLD5,:FLD7,:FLD11);
.DML LABEL LABEL03;
INSERT TPTBL03 (C1, C2, C3, C4, C5, C6, C7, C8, C10, C11)
VALUES (:FLD1, :FLD2, :FLD3, :FLD4, :FLD5,
:FLD6, :FLD7, :FLD8, :FLD10, :FLD11);
.IMPORT INFILE INDATA FORMAT VARTEXT ','
LAYOUT LAY01
APPLY LABEL01 WHERE FLD9 = 'A'
APPLY LABEL02 WHERE FLD9 = 'B'
APPLY LABEL03;
.VERSION;
.END LOAD;
.LOGOFF;
Note that as in the above example, DML USE APPLYs can be mixed with DML APPLYs not
using the USE keyword within the same IMPORT.
The following is an example using partitioning:
.LOGTABLE TPLOG01;
.LOGON cs4400s3/cfl,cfl;
DROP TABLE TPTBL01;
DROP TABLE TPTBL02;
DROP TABLE TPERR01;
CREATE TABLE TPTBL01, FALLBACK(
C1 CHAR(12) not null,
C2 CHAR(8) not null)
PRIMARY INDEX (C1);
CREATE TABLE TPTBL02, FALLBACK(
C1 CHAR(12),
C2 CHAR(8),
C3 CHAR(6))
UNIQUE PRIMARY INDEX (C1);
.BEGIN LOAD
ERRLIMIT 100 50
CHECKPOINT 15
TENACITY 2
ERRORTABLE TPERR01
ROBUST off
serialize on
;
.LAYOUT LAY02;
.FIELD cc1 * CHAR(12) key;
.FIELD cc2 * CHAR(8);
.FIELD cc3 * CHAR(6);
.filler space1 * char(1);
.partition part1 pack 10 sessions 10;
.partition part2 sessions 5 1 packmaximum;
.DML LABEL LABEL01 partition part1
DO INSERT FOR MISSING UPDATE ROWS
ignore extra update rows
use(cc1, cc2);
UPDATE TPTBL01
SET C2 = :CC2
WHERE C1 = :CC1;
INSERT TPTBL01 (C1, C2)
VALUES (:CC1,:CC2);
.DML LABEL LABEL02 partition part2
serializeon( cc1 )
ignore extra update rows
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE TPTBL02 SET C2 = :CC2 WHERE C1 = :CC1;
INSERT TPTBL02 (C1, C2, C3)
VALUES (:CC1,:CC2,:CC3);
.IMPORT INFILE c:\NCR\Test\TpumpData001.txt FORMAT TEXT
LAYOUT LAY02
APPLY LABEL01
APPLY LABEL02 where CC2 = '00000001';
.END LOAD;
.LOGOFF;
The Basic Upsert Feature
When using the basic upsert feature:
• There must be exactly two DML statements in this DML group.
• The first DML statement must be an UPDATE statement that follows all of the TPump
task rules.
• The second DML statement must be an INSERT statement.
• Both DML statements must refer to the same table.
• The INSERT statement, when built, must reflect the same primary index specified in the
WHERE clause of the UPDATE statement. This is true for both a single column primary
index and a compound primary index.
By following these rules, you will find a number of uses for the DO INSERT ROWS option. In
the past, you could either presort data into INSERTs and UPDATEs, or attempt UPDATEs
with all the data, and then do an INSERT on any UPDATEs that failed. With upsert, TPump
needs only one pass of the data to UPDATE rows that need to be updated and INSERT rows
that need to be inserted.
Note: To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an
explicit value if there are upserts in the TPump job.
If you specify MARK MISSING UPDATE ROWS, while using DO INSERT ROWS, TPump
records any UPDATE that fails. This record appears in the Application Error Table, together
with an error code that shows that the INSERT of the DO INSERT ROWS was then executed.
If the INSERT fails, the INSERT row is also recorded in the Application Error table. The
default for an upsert function, however, is not to mark missing update rows. This is because
when you perform the upsert function, you expect the INSERT to occur when the UPDATE
fails. The failure of the UPDATE portion of an upsert does not, in itself, constitute an error
and should not be treated as one.
The MARK MISSING DELETE ROW option has no meaning when used with the DO
INSERT ROWS option.
The option of MARK (IGNORE) EXTRA DELETE (UPDATE) ROWS provides TPump with a
way to protect against an update or delete affecting multiple rows, which can happen in
TPump because the primary index can be non-unique.
MARK is the default for all DML options, except for an upsert.
Example Upsert
Each record in the following example contains the value of the primary index column
(EmpNo) of a row of the Employee table whose PhoneNo column is to be assigned a new
phone number from field Fone.
When the UPDATE fails, the INSERT statement is activated and TPump enters the upsert
mode. In this case, each record contains the primary index value (EmpNum) of a row that is
to be inserted successively into the Employee table whose columns are EmpNo and PhoneNo.
.BEGIN LOAD SESSIONS number;
.LAYOUT Layoutname;
.FIELD EmpNum 1 INTEGER;
.FIELD Fone * CHAR(10);
.DML LABEL DMLlabelname
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE Employee SET PhoneNo = :Fone WHERE EmpNo = :EmpNum;
INSERT Employee (EmpNo, PhoneNo) VALUES (:EmpNum, :Fone);
.IMPORT INFILE Infilename LAYOUT Layoutname APPLY DMLlabelname;
.END LOAD;
The scope of a DML command (and its label) terminates at the first following command of
any kind or at the end of the file containing the DML statements, whichever occurs first.
The SQL EXECUTE command must be between the BEGIN LOAD command and END
LOAD command.
For IMPORT tasks, you may specify up to five distinct error treatment options for one DML
command. For example:
.DML LABEL COMPLEX
IGNORE DUPLICATE INSERT ROWS
MARK DUPLICATE UPDATE ROWS
IGNORE MISSING UPDATE ROWS
MARK MISSING DELETE ROWS
DO INSERT FOR MISSING UPDATE ROWS;
It is valid to specify that missing update rows be both marked and treated as INSERTs or, as in
the example, both ignored and treated as INSERTs.
If TPump encounters any of the following:
• no DML command in an IMPORT task,
• DML statements outside the scope of a DML command in an IMPORT task, or
• a DML command with no DML statements in an IMPORT task,
it writes a diagnostic message to the primary output destination for the system, terminates the
TPump task, and returns to the main TPump control module with a conventional nonzero
return code. You can then correct the error and resubmit the TPump task.
The DML commands (with their following DML statements) must appear between the
appropriate BEGIN LOAD command and the IMPORT commands that refer to them. When
the END LOAD command is encountered, the sequence of DML commands and DML
statements is forgotten. The DML command cannot be shared by multiple BEGIN LOAD
statements. The DML statements are described in the following sections.
The maximum number of DML commands that can be used in a single TPump task is 128. If
an excessive number of DML commands and statements are sent to the Teradata Database, an
error message is logged, stating that there are too many DML steps for one TPump job.
The Atomic Upsert Feature
The basic upsert function has been enhanced to support an Atomic upsert capability. This
enhancement permits TPump to perform single-row upserts in a single pass. This one-pass
logic adopts the upsert-handling technique used by MultiLoad. The one-pass logic is
designated Atomic to distinguish the grouping of paired UPDATE and INSERT statements
which are executed as a single SQL statement.
The syntax for Atomic upsert consists of an UPDATE statement and an INSERT statement,
separated by an ELSE keyword.
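Using the names from the basic upsert example above, the Atomic form would read as follows (a sketch; the statement TPump generates internally may differ in detail):

```
UPDATE Employee SET PhoneNo = :Fone WHERE EmpNo = :EmpNum
ELSE INSERT Employee (EmpNo, PhoneNo) VALUES (:EmpNum, :Fone);
```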
Existing TPump scripts using the Atomic upsert form do not have to be changed. TPump will
automatically convert the old UPDATE/INSERT pairs to the Atomic upsert feature whenever
appropriate. Any attempts to change this will result in a syntax error.
The new syntax, which can also be used by CLIv2 and BTEQ applications, is dependent on
whether or not the RDBMS version, against which the TPump job is run, supports this
feature. If the RDBMS does not support Atomic Upsert, TPump reverts to the earlier logic of
sending the INSERT request if an UPDATE request fails.
1. SAME TABLE: The UPDATE and INSERT statements must specify the same table.
2. SAME ROW: The UPDATE and INSERT statements must specify the same row; that is, the
primary index value in the INSERT row must be the same as the primary index value in the
targeted UPDATE row.
3. HASHED ROW ACCESS: The UPDATE fully specifies the primary index, allowing the
target row to be accessed with a one-AMP hashed operation.
Although TPump does not verify basic upsert constraints, the Teradata Database will reject
Atomic upsert constructs that fail the constraint test, and notify TPump by returning an
appropriate error message to the client.
Other Restrictions on Atomic Upsert Feature
Some of these restrictions concern syntax that is supported in UPDATE and INSERT
statements separately, but not when combined in an Atomic upsert statement. Other
restrictions concern the upsert feature's not supporting certain Teradata Database features
such as triggers and join/hash indexes, meaning that the upsert statement cannot be used on
any table utilizing those features.
The following constructs are not supported by the Atomic upsert feature, and return an error
if submitted to the Teradata Database:
1. INSERT-SELECT: Syntax not supported. The INSERT may not use a subquery to specify
   any of the inserted values. Note that support of this syntax is likely to be linked to support
   of subqueries in the UPDATE's WHERE clause constraints, as described below, and may
   involve new syntax features to allow the UPDATE and INSERT to effectively reference the
   same subquery.
2. UPDATE-WHERE-CURRENT: Syntax not supported. The WHERE clause cannot use an
   updatable cursor to do what is called a positioned UPDATE. (It is unlikely that this syntax
   will ever be supported.) Note that this restriction does not prevent cursors from being
   used in other ways with Atomic upsert statements. For example, a DECLARE CURSOR
   statement may include upsert statements among those to be executed when the cursor is
   opened, as long as the upserts are otherwise valid.
3. UPDATE-FROM: Not supported. The SET clause cannot use a FROM clause table
   reference in the expression for the updated value of a column.
4. UPDATE-WHERE SUBQUERIES: Not supported. The WHERE clause cannot use a
   subquery either to specify the primary index or to constrain a nonindex column. Note that
   supporting this UPDATE syntax would also require support for either INSERT-SELECT or
   some other INSERT syntax feature that lets it specify the same primary index value as the
   UPDATE.
5. UPDATE-PRIMARY INDEX: Not supported. The UPDATE cannot change the primary
   index. This is sometimes called an unreasonable update.
6. TRIGGERS: Feature not supported if either the UPDATE or INSERT could cause a trigger
   to be fired. The restriction applies as if the UPDATE and INSERT were both executed,
   because the parser trigger logic does not attempt to account for their conditional execution.
   UPDATE triggers on columns not referenced by the UPDATE clause, however, will never
   be fired by the upsert and are therefore permitted. DELETE triggers cannot be fired at all
   by an upsert and are likewise permitted. Note that an upsert could be used as a trigger
   action, but it would be subject to the same constraints as any other upsert. Because an
   upsert is not allowed to fire any triggers itself, an upsert trigger action must not generate
   any further cascaded trigger actions.
7. JOIN/HASH INDEXES: Feature not supported if either the UPDATE or INSERT could
   cause the join/hash index to be updated. As with triggers, the restriction applies to each
   upsert as if the UPDATE and INSERT were both executed. While the UPDATE could
   escape this restriction if the join/hash index does not reference any of the updated
   columns, it is much less likely (and maybe impossible) for the INSERT to escape. If the
   benefit of lifting the restriction for a few unlikely join/hash index cases turns out to be not
   worth the implementation cost, the restriction may have to be applied more broadly to any
   table with an associated join/hash index.
TPump treats a failed constraint as a nonfatal error, reports the error in the job log for
diagnostic purposes, and continues with the job by reverting to the old non-Atomic upsert
protocol.
Existing TPump Scripts
Existing TPump scripts for upserts do not need to be changed. The syntax as described below
for an upsert will continue to be supported:
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE <update-operands>;
INSERT <insert-operands>;
Atomic Upsert Examples
This section describes several examples that demonstrate how the Atomic upsert feature
works, including cases where errors are detected and returned to the user. All of the examples
use the same table, called Sales, as shown below:
CREATE TABLE Sales, FALLBACK
   (ItemNbr   INTEGER NOT NULL,
    SaleDate  DATE FORMAT 'MM/DD/YYYY' NOT NULL,
    ItemCount INTEGER)
PRIMARY INDEX (ItemNbr);
It is assumed that the table has been populated with the following data:
INSERT INTO Sales (10, '05/30/2005', 1);
Assume that there exists a table called NewSales that has the same column definitions as those
of table Sales.
Example 1 (Error: different target tables)
This example demonstrates an upsert statement that does not specify the same table name for
the UPDATE part and the INSERT part of the statement.
.Dml label upsertdml do insert for missing update rows;
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005');
INSERT INTO NewSales (10, '05/30/2005', 1);
An upsert statement processes exactly one table. Because the tables Sales and NewSales
are not the same in this upsert statement, an error is returned, indicating that the name of
the table must be the same for both the UPDATE and the INSERT.
Example 2 (Error: different target rows)
This example demonstrates an upsert statement that does not specify the same primary index
value for the UPDATE part and the INSERT part of the statement.
.Dml label upsertdml do insert for missing update rows;
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005');
INSERT INTO Sales (20, '05/30/2005', 1);
The primary index values for the UPDATE and the INSERT must be the same. Otherwise,
two different rows would be targeted: one for the UPDATE and another for the INSERT,
which is not the purpose of an upsert. An error is returned for the upsert statement because
the specified primary index values of 10 and 20 are not the same (the primary index value
must be the same for both the UPDATE and the INSERT).
Example 3 (Valid Upsert UPDATE)
This example demonstrates a successful upsert statement where a row gets updated.
.Dml label upsertdml do insert for missing update rows;
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005');
INSERT INTO Sales (10, '05/30/2005', 1);
After all of the rules have been validated, the row with ItemNbr = 10 and SaleDate =
'05/30/2005' gets updated. A successful update of one row is returned.
Example 4 (Valid Upsert INSERT)
This example demonstrates a successful upsert statement where a row gets inserted.
.Dml label upsertdml do insert for missing update rows;
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 20 AND
SaleDate = '05/30/2005');
INSERT INTO Sales (20, '05/30/2005', 1);
After all of the rules have been validated and no row was found with ItemNbr = 20 and
SaleDate = '05/30/2005' for the UPDATE, a new row is inserted with ItemNbr = 20. A
successful insert of one row is returned.
END LOAD
Purpose
The END LOAD command must be present as the last command of a TPump task to initiate
the execution of the task.
Syntax
.END LOAD ;
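A minimal sketch of where END LOAD falls within a task follows; the file, layout, label, and
parameter values are illustrative only, and a real job would use parameters appropriate to
its workload:

.BEGIN LOAD SESSIONS 4 ERRORTABLE errtbl;
.LAYOUT lay1;
.FIELD ItemNbr * INTEGER;
.DML LABEL ins1;
INSERT INTO Sales (:ItemNbr, DATE, 1);
.IMPORT INFILE datafile LAYOUT lay1 APPLY ins1;
.END LOAD;

The END LOAD command closes the task opened by BEGIN LOAD and triggers its execution.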
EXECUTE
Purpose
TPump supports the Teradata SQL EXECUTE statement, which specifies a user-created
(predefined) macro for execution. The EXECUTE statement specifies the type of DML
statement (INSERT, UPDATE, DELETE, or UPSERT) to be handled by the macro.
The macro named in this EXECUTE statement must reside in the Teradata Database before
the import task starts. Only one DML statement (INSERT, UPDATE, DELETE, or UPSERT)
can be specified in a TPump predefined macro.
Caution:
The SQL EXECUTE command must be used between the BEGIN LOAD command and the
END LOAD command.
Syntax
{EXECUTE | EXEC} [database.]name
    {INSERT | INS | UPDATE | UPD | DELETE | DEL | UPSERT | UPS} ;
where

database
   name of the database in the Teradata Database where the macro to be executed resides

name
   name of the macro resident in the Teradata Database to be executed

DELETE/DEL
   keyword indicating a DELETE statement is being executed by the macro

INSERT/INS
   keyword indicating an INSERT statement is being executed by the macro

UPDATE/UPD
   keyword indicating an UPDATE statement is being executed by the macro

UPSERT/UPS
   keyword indicating an Atomic upsert is being executed by the macro
Usage Notes
Using predefined macros saves time because TPump does not need to create and drop new
macros each time a TPump job script is run.
The rules for user-created macros are:
•  TPump expects the parameter list for any macro to match the FIELD list specified by the
   LAYOUT in the script. FILLER fields are ignored. If the USE clause is used in the DML
   statement, TPump expects the parameter list for every macro in the DML statement to
   match the field list specified by the USE clause. The order should be the same as the fields
   in the LAYOUT.
•  The macro should specify a single primary index operation: INSERT, UPDATE, DELETE,
   or UPSERT. TPump reports an error if the macro contains more than one supported
   statement.
•  The restrictions on INSERT, UPDATE, DELETE, and UPSERT statements supported by
   TPump are described in the corresponding sections of this manual.
If the EXECUTE statement is replacing an INSERT, UPDATE, DELETE, or UPSERT statement
in a job script, the EXECUTE statement must be placed at the same location as the statement
it replaces. The following example shows an INSERT statement replaced by an equivalent
predefined macro:
.DML LABEL LABELA;
DELETE <delete-operands> ;
INSERT <insert-operands> ;
UPDATE <update-operands> ;
.DML LABEL LABELA ;
DELETE <delete-operands> ;
EXECUTE <insert-macro-name> INSERT ;
UPDATE <update-operands> ;
The correct syntax for a TPump predefined macro is one of the following:
•  CREATE MACRO <name> (<parameter list>) as (UPDATE....; ) ;
•  CREATE MACRO <name> (<parameter list>) as (INSERT.....; ) ;
•  CREATE MACRO <name> (<parameter list>) as (DELETE.....; ) ;
•  CREATE MACRO <name> (<parameter list>) as (UPSERT.....; ) ;
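For instance, a predefined Atomic upsert macro for the Sales table used elsewhere in this
chapter could be sketched as follows; the macro and parameter names are illustrative, and
the macro body uses the UPDATE ... ELSE INSERT form described earlier in this chapter:

CREATE MACRO UpsertSales (p_item INTEGER, p_date DATE, p_count INTEGER) AS
   (UPDATE Sales SET ItemCount = ItemCount + :p_count
       WHERE ItemNbr = :p_item AND SaleDate = :p_date
    ELSE
    INSERT INTO Sales (:p_item, :p_date, :p_count); );

The macro would then be referenced in the job script with:

.DML LABEL upslbl;
EXECUTE UpsertSales UPSERT;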
If the Teradata Database server supports Atomic upsert, then automatic use of Atomic upsert
is allowed, when possible, without changing existing TPump scripts. This is accomplished in
the following manner:
•  TPump attempts to use the Atomic upsert syntax in defining a single UPSERT macro
   (instead of an UPDATE/INSERT macro pair).
•  If the UPSERT macro is successfully defined, TPump uses the Atomic upsert function for
   the UPSERT.
•  If an error occurs during UPSERT macro definition, presumably due to a violation of
   Teradata Database Atomic upsert restrictions, TPump issues a warning and reverts to the
   current TPump upsert method of paired UPDATE/INSERT statements.
TPump will continue to operate as it does now when the existing TPump syntax for upsert is
encountered, and references to predefined macros are used for either the UPDATE or the
INSERT or both.
For example:
.DML LABEL <dml-label-name> DO INSERT FOR MISSING UPDATE ROWS ... ;
EXECUTE <update-macro-name> UPDATE ;
INSERT <insert-operands> ;
.DML LABEL <dml-label-name> DO INSERT FOR MISSING UPDATE ROWS ... ;
UPDATE <update-operands> ;
EXECUTE <insert-macro-name> INSERT ;
.DML LABEL <dml-label-name> DO INSERT FOR MISSING UPDATE ROWS ... ;
EXECUTE <update-macro-name> UPDATE ;
EXECUTE <insert-macro-name> INSERT ;
To allow for the use of predefined macros that take advantage of Atomic upsert, TPump
command syntax supports an UPSERT macro:
.DML LABEL <dml-label>;
EXECUTE <upsert-macro-name> UPSERT ;
When using predefined macros for atomic upserts, the DML statement will have “Ignore
Missing Update Rows” as a default option.
Atomic upsert syntax is not backward compatible; thus you cannot use it until you update the
Teradata Database server to a compatible release. If the Teradata Database supports Atomic
upsert, a TPump run can handle a mix of both standard and Atomic upserts.
Upserts are reported as UPDATEs and INSERTs in the statistics displayed by TPump (and
passed to the NOTIFY EXIT routine), because an Atomic upsert that results in an UPDATE
will be reported by the Teradata Database as an UPDATE activity type, and an Atomic upsert
that results in an INSERT will be reported by the Teradata Database as an INSERT activity
type.
FIELD
Purpose
The FIELD command specifies a field of the input record; it can also contain a NULLIF
expression. All fields specified by FIELD commands are sent to the Teradata Database. Only
fields relevant to the tasks using this layout need be specified.
Syntax
.FIELD fieldname { startpos datadesc | fieldexpr }
    [ NULLIF nullexpr ]
    [ DROP {LEADING | TRAILING} {BLANKS | NULLS}
        [ AND {TRAILING | LEADING} {NULLS | BLANKS} ] ]
    [ KEY ] ;
where

fieldname
   name of an input record field to which:
   1. a DML statement refers,
   2. a nullexpr of a FIELD command or a condition expression of a LAYOUT command
      refers, or
   3. a condition expression of the IMPORT command APPLY clause refers.
   A fieldname must obey the same rules for its construction as Teradata SQL column names.
   fieldname can be referenced in other FIELD commands via NULLIF and field
   concatenation expressions, and in APPLY WHERE conditions in IMPORT commands.

startpos
   starting position of a field of the data records in an external data source.
   It may be specified as an unsigned integer, which is a character position starting with 1,
   or as an asterisk, which means the next available character position beyond the preceding
   field. Nothing prevents redefinition of positions of input records by specifying the same
   positions in multiple FIELD commands. See the example below.
   Note that where input records may be continued by use of the CONTINUEIF condition, a
   startpos specified as an unsigned integer refers to a character position in the final
   concatenated result from which the continuation indicator fields have been removed.
   Refer to the description of the condition parameter of the LAYOUT command.

datadesc
   type and length of data in the field.
   TPump generates the USING phrase accordingly with the user-assigned field name to
   which the body of the DML statement refers.

nullexpr
   condition used for selectively inserting a null value into the affected column.
   The condition is specified as a conditional expression involving any number of fields,
   each represented by its fieldname, and constants. All fieldnames appearing in the
   conditional expression must be defined by any of the following specifications:
   1. The startpos and datadesc parameters of the FIELD command
   2. A FILLER command
   3. A TABLE command
   If the specified condition is true, TPump sends the data to the Teradata Database with
   indicators, whether or not the INDICATORS option is specified on the LAYOUT command.
   When the character set of the job script is different from the client character set used for
   the job (for example, on MVS the job script must be in Teradata EBCDIC when using the
   UTF8 client character set, or the UTF16 client character set can be used with the job
   script in UTF8), TPump translates the string constants specified in the expression and the
   import data referenced in the expression to the same character set before evaluating the
   expression.
   For example, when the client character set is UTF16 and the script character set is UTF8,
   if the following commands are given:
   .field C1 * varchar(20);
   .field C2 * varchar(40) nullif c1 = 'DELETED';
   TPump translates the data in the C1 field to the UTF8 form and compares it with the
   UTF8 form of 'DELETED' to obtain the evaluation result.
   Similarly, on the mainframe, when the client character set is UTF8 and the script
   character set is Teradata EBCDIC, if the following commands are given:
   .field C1 * char(20);
   .field C2 * varchar(40) nullif c1 = 'removed';
   TPump translates the data in the C1 field from the UTF8 form to the Teradata EBCDIC
   form and compares it to the Teradata EBCDIC form of 'removed' to obtain the
   evaluation result.
   Caution: When using the UTF8 client character set on the mainframe, be sure to examine
   the definition in the International Character Set Support manual to determine the code
   points of the special characters you might require. Different versions of EBCDIC do not
   always agree as to the placement of these characters. The mappings between Teradata
   EBCDIC and Unicode are given in Appendix E of the International Character Set Support
   manual.
   The fieldname1 parameter in other FIELD commands can be referenced in nullexpr.

fieldexpr
   concatenation of two or more items: fields, character constants, string constants, or a
   combination of these, as in:
   fieldname1||'C'||fieldname2||'STRING'||fieldname3...
   The field names within a layout must be unique, and the data type of each field must be
   either CHAR or VARCHAR. Nested concatenations are not supported.
   If all items of the fieldexpr are fixed character (for example, no VARCHARs), the data type
   of the resulting field is CHAR(m), where "m" is the sum of the length of each component
   item.
   If at least one component of the fieldexpr is a VARCHAR, the data type of the resulting
   field is VARCHAR(m), where "m" is the sum of the maximum length of each component
   item.
   When the character set of the job script is different from the client character set used for
   the job (for example, on MVS the job script must be in Teradata EBCDIC when using the
   UTF8 client character set, or the UTF16 client character set can be used with the job
   script in UTF8), TPump translates the character constants and the string constants
   specified in the expression from the script character set form to the client character set
   form before concatenating the constants with the specified fields.
   Caution: When using the UTF8 client character set on the mainframe, be sure to examine
   the definition in the International Character Set Support manual to determine the code
   points of the special characters you might require. Different versions of EBCDIC do not
   always agree as to the placement of these characters. The mappings between Teradata
   EBCDIC and Unicode are given in Appendix E of the International Character Set Support
   manual.

DROP
   characters present in the specified position(s) to be dropped from the specified fieldname,
   which must be of a character data type.
   TPump drops the specified characters and presents the field to the Teradata Database as
   a VARCHAR data type.
   Usage rules: If you specify two dropping actions, they must not be identical. If a FIELD
   command defines a fieldname in terms of two or more concatenated fieldname fields, any
   specified DROP clause applies to the concatenated result, not to the individual fieldname
   fields. But, because each fieldname must be defined by its own previous FIELD command,
   a DROP clause can be specified on those commands to apply to the individual fields.

KEY
   keyword which, when added to the end of the FIELD command, specifies that the field is
   part of the hash key for purposes of serialization, if the SERIALIZE parameter on the
   BEGIN LOAD command is active.
   The serialization feature is meaningful only when a primary key for the loaded data is
   specified via this KEY option.
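As an example of position redefinition with startpos, the same input bytes can be described
by more than one FIELD command; the layout and field names here are illustrative:

.LAYOUT redeflay;
.FIELD FullCode 1 CHAR(10);
.FIELD Region   1 CHAR(4);
.FIELD Serial   5 CHAR(6);

FullCode covers character positions 1 through 10 of each record, while Region and Serial
redefine positions 1 through 4 and 5 through 10 of the same record.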
Usage Notes
One or more FIELD commands may be intermixed with the TABLE command and the
FILLER command. These commands must follow a LAYOUT command.
If you redefine an input record field in fieldname, you cannot change the data type from
character to decimal with the datadesc parameter. This is illegal in TPump, aborts the job,
and returns an error message.
If you specify both NULLIF and DROP LEADING/TRAILING BLANKS/NULLS on the same
FIELD command, the DROP clause is evaluated after the NULLIF clause. As an example, in
the FIELD command:
.FIELD FIELDNAME * CHAR (5) NULLIF FIELDNAME = 'x'
DROP LEADING BLANKS;
if the input for fieldname is 'x' preceded by leading blanks, the NULLIF expression evaluates
to false because the leading blanks are not dropped before the NULLIF evaluation.
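When the SERIALIZE parameter of the BEGIN LOAD command is in effect, the KEY option
marks the fields that form the hash key, which typically mirror the primary index of the
target table. A sketch follows; the layout and field names are illustrative:

.LAYOUT serlay;
.FIELD ItemNbr   * INTEGER KEY;
.FIELD SaleDate  * CHAR(10);
.FIELD ItemCount * INTEGER;

Rows with the same ItemNbr value are then applied in the order read, because ItemNbr
forms the hash key used for serialization.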
Specifying Data Types
Use the datadesc parameter to specify the type and length of data in the field. TPump
generates the USING phrase accordingly with the user-assigned field name to which the body
of the DML statement refers.
For complete details on data types and data conversions, see SQL Reference: Data Types and
Literals.
The following is a short list of the input length and field description for the data type
specifications you can make in the datadesc parameter:
Graphic Data Type Specifications
GRAPHIC(n)
Where n is the length of the input stream in terms of double-byte characters.
Length: n*2 bytes, if n is specified; otherwise 2 bytes, as n=1 is assumed.
Description: n double-byte characters.
The following example illustrates the use of the GRAPHIC data types in support of kanji or
multibyte character data. The FIELD statement can contain GRAPHIC data types.
.LAYOUT KANJIDATA;
.FIELD EMPNO     * SMALLINT;
.FIELD LASTNAME  * GRAPHIC(30);
.FILLER FIRSTNAME * GRAPHIC(30);
.FIELD JOBTITLE  * VARGRAPHIC(30);
VARGRAPHIC(n)
Where n is the length of the input stream in terms of double-byte characters.
Length: m + 2 bytes where m/2 <= 32000.
Description: 2-byte integer followed by m/2 double-byte characters.
LONG VARGRAPHIC
Length: m + 2 bytes where m/2 <= 32000.
Description: 2 byte integer followed by m/2 double-byte characters.
Note: For both VARGRAPHIC and LONG VARGRAPHIC, m, a value occupying the first
2 bytes of the input data, is the length of the input in bytes, not characters. Each multibyte
character set character is 2 bytes.
Note: LONG VARGRAPHIC also implies VARGRAPHIC (32000). Range is 0 to 32000 in a
64,000-byte field.
Decimal Data Type Specifications
DECIMAL(x) and DECIMAL(x,y)
Length: 1, 2, 4, or 8 bytes for network; packed decimal for mainframe
Description: 64-bit double precision, floating point
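As an illustration of a decimal datadesc entry, a field holding a monetary amount with two
fractional digits might be declared as follows (the field name is illustrative):

.FIELD Amount * DECIMAL(9,2);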
NULLIF Performance
Using a large number of NULLIF clauses can cause a significant increase in the CPU usage on
the system where you are running TPump. This rise in CPU usage may increase the time the
job takes to run.
An increase in CPU usage is most noticeable when you do not have:
•  FILLER commands in the LAYOUT
•  Input position gaps or overlaps
•  Concatenated fields
•  DROP clauses
To avoid an increase in CPU usage on the system running TPump, transfer the processing of
NULLIF expressions to the Teradata Database.
Example 1
Instead of specifying the following:
...
.FIELD fc * CHAR(5) NULLIF fc = 'empty';
.FIELD fi * INTEGER NULLIF fi = 0;
...
.DML LABEL ins;
INSERT INTO tbl1 VALUES(...,:fc,:fi,...);
You would use this instead:
...
.FIELD fc * CHAR(5);
.FIELD fi * INTEGER;
...
.DML LABEL ins;
INSERT INTO tbl1 VALUES(...,NULLIF(:fc,'empty'),NULLIF(:fi,0),...);
Example 2
In more complex situations, as in the following example:
...
.FIELD fs * CHAR(1) ;
.FIELD fc * CHAR(5) NULLIF (fs <> 'M') AND (fs <> 'F');
.FIELD fi * INTEGER NULLIF fi < 0;
...
.DML LABEL ins;
INSERT INTO tbl2 VALUES(...,:fs,:fc,:fi,...);
You would use this instead:
...
.FIELD fs * CHAR(1) ;
.FIELD fc * CHAR(5);
.FIELD fi * INTEGER;
...
.DML LABEL ins;
INSERT INTO tbl2 VALUES(...,:fs,
CASE WHEN (:fs = 'M') OR (:fs = 'F') THEN :fc ELSE NULL END,
CASE WHEN (:fi >= 0) THEN :fi ELSE NULL END,...);
Using ANSI/SQL DateTime Data Types
When the DATEFORM command is used to specify ANSIDATE as the DATE data type, each
DATE field is internally converted to a CHAR(10) field. You must convert all ANSI/SQL
DateTime TIME, TIMESTAMP, and INTERVAL data types to fixed-length CHAR data types
to specify column and field names in a TPump FIELD command.
Table 18 shows how to use ANSI/SQL DateTime specifications.
Table 18: ANSI/SQL DateTime Specifications

TIME
TIME (n)
   n = number of digits after decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(8 + n + (1 if n > 0, otherwise 0))
   Format (n = 0): hh:mm:ss                 Example: 11:37:58
   Format (n = 4): hh:mm:ss.ssss            Example: 11:37:58.1234

TIMESTAMP
TIMESTAMP (n)
   n = number of digits after decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(19 + n + (1 if n > 0, otherwise 0))
   Format (n = 0): yyyy-mm-dd hh:mm:ss      Example: 1998-09-04 11:37:58
   Format (n = 4): yyyy-mm-dd hh:mm:ss.ssss Example: 1998-09-04 11:37:58.1234

TIME WITH TIME ZONE
TIME (n) WITH TIME ZONE
   n = number of digits after decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(14 + n + (1 if n > 0, otherwise 0))
   Format (n = 0): hh:mm:ss{±}hh:mm         Example: 11:37:58-08:00
   Format (n = 4): hh:mm:ss.ssss{±}hh:mm    Example: 11:37:58.1234-08:00

TIMESTAMP WITH TIME ZONE
TIMESTAMP (n) WITH TIME ZONE
   n = number of digits after decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(25 + n + (1 if n > 0, otherwise 0))
   Format (n = 0): yyyy-mm-dd hh:mm:ss{±}hh:mm        Example: 1998-09-24 11:37:58+07:00
   Format (n = 4): yyyy-mm-dd hh:mm:ss.ssss{±}hh:mm   Example: 1998-09-24 11:37:58.1234+07:00

INTERVAL YEAR
INTERVAL YEAR (n)
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n)
   Format (n = 2): yy                       Example: 98
   Format (n = 4): yyyy                     Example: 1998

INTERVAL YEAR TO MONTH
INTERVAL YEAR (n) TO MONTH
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n + 3)
   Format (n = 2): yy-mm                    Example: 98-12
   Format (n = 4): yyyy-mm                  Example: 1998-12

INTERVAL MONTH
INTERVAL MONTH (n)
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n)
   Format (n = 2): mm                       Example: 12
   Format (n = 4): mmmm                     Example: 0012

INTERVAL DAY
INTERVAL DAY (n)
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n)
   Format (n = 2): dd                       Example: 31
   Format (n = 4): dddd                     Example: 0031

INTERVAL DAY TO HOUR
INTERVAL DAY (n) TO HOUR
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n + 3)
   Format (n = 2): dd hh                    Example: 31 12
   Format (n = 4): dddd hh                  Example: 0031 12

INTERVAL DAY TO MINUTE
INTERVAL DAY (n) TO MINUTE
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n + 6)
   Format (n = 2): dd hh:mm                 Example: 31 12:59
   Format (n = 4): dddd hh:mm               Example: 0031 12:59

INTERVAL DAY TO SECOND
INTERVAL DAY (n) TO SECOND
INTERVAL DAY TO SECOND (m)
INTERVAL DAY (n) TO SECOND (m)
   n = number of digits; valid values: 1-4; default = 2
   m = number of digits after decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(n + 9 + m + (1 if m > 0, otherwise 0))
   Format (n = 2, m = 0): dd hh:mm:ss       Example: 31 12:59:59
   Format (n = 4, m = 4): dddd hh:mm:ss.ssss  Example: 0031 12:59:59.1234

INTERVAL HOUR
INTERVAL HOUR (n)
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n)
   Format (n = 2): hh                       Example: 12
   Format (n = 4): hhhh                     Example: 0012

INTERVAL HOUR TO MINUTE
INTERVAL HOUR (n) TO MINUTE
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n + 3)
   Format (n = 2): hh:mm                    Example: 12:59
   Format (n = 4): hhhh:mm                  Example: 0012:59

INTERVAL HOUR TO SECOND
INTERVAL HOUR (n) TO SECOND
INTERVAL HOUR TO SECOND (m)
INTERVAL HOUR (n) TO SECOND (m)
   n = number of digits; valid values: 1-4; default = 2
   m = number of digits after the decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(n + 6 + m + (1 if m > 0, otherwise 0))
   Format (n = 2, m = 0): hh:mm:ss          Example: 12:59:59
   Format (n = 4, m = 4): hhhh:mm:ss.ssss   Example: 0012:59:59.1234

INTERVAL MINUTE
INTERVAL MINUTE (n)
   n = number of digits; valid values: 1-4; default = 2
   Conversion: CHAR(n)
   Format (n = 2): mm                       Example: 59
   Format (n = 4): mmmm                     Example: 0059

INTERVAL MINUTE TO SECOND
INTERVAL MINUTE (n) TO SECOND
INTERVAL MINUTE TO SECOND (m)
INTERVAL MINUTE (n) TO SECOND (m)
   n = number of digits; valid values: 1-4; default = 2
   m = number of digits after decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(n + 3 + m + (1 if m > 0, otherwise 0))
   Format (n = 2, m = 0): mm:ss             Example: 59:59
   Format (n = 4, m = 4): mmmm:ss.ssss      Example: 0059:59.1234

INTERVAL SECOND
INTERVAL SECOND (n)
INTERVAL SECOND (n,m)
   n = number of digits; valid values: 1-4; default = 2
   m = number of digits after decimal point; valid values: 0-6; default = 6
   Conversion: CHAR(n + m + (1 if m > 0, otherwise 0))
   Format (n = 2, m = 0): ss                Example: 59
   Format (n = 4, m = 4): ssss.ssss         Example: 0059.1234
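As a worked example of the conversion rules above, a TIMESTAMP(6) value converts to
CHAR(19 + 6 + 1) = CHAR(26), so a layout field for it could be declared as follows (the
field name is illustrative):

.FIELD TxnTime * CHAR(26);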
FILLER
Purpose
The FILLER command describes a named or unnamed field as filler which is not to be sent to
the Teradata Database. Only fields relevant to this TPump task need to be specified.
Syntax
.FILLER [fieldname] startpos datadesc ;
where

fieldname
   name of an input record field to which a nullexpr of a FIELD command refers, or to
   which a condition expression of the IMPORT command's APPLY clause refers.
   The only reason for naming a filler field is to enable one of these expressions to refer to
   it. A fieldname must obey the same rules for its construction as Teradata SQL column
   names.
   The reason for describing a field that is not to be sent to the Teradata Database and is not
   used in any of the expressions mentioned in the previous paragraph is to make it possible
   for you to specify startpos as an asterisk for subsequent fields of the input records. If the
   use of the asterisk is not important to you, you do not need to define fields that do not
   participate in the TPump task.

startpos
   starting position of a field of the data records in an external data source.
   It may be specified as an unsigned decimal integer, which is a character position starting
   with 1, or as an asterisk, which is the next available character position beyond the
   preceding field.
   Note that where input records may be continued by use of the CONTINUEIF condition, a
   startpos specified as an unsigned integer refers to a character position in the final
   concatenated result from which the continuation indicators have been removed. Refer to
   the description of the condition parameter of the LAYOUT command.

datadesc
   type and length of data in the field
Usage Notes
One or more FILLER commands may be intermixed with the FIELD command or the
TABLE command. These commands must follow a LAYOUT command.
Example
This example illustrates the use of the GRAPHIC data types in support of kanji or multibyte
character data. The FILLER statement describing the input data set or file can contain
GRAPHIC data types.
.LAYOUT KANJIDATA;
.FIELD EMPNO     * SMALLINT;
.FIELD LASTNAME  * GRAPHIC(30);
.FILLER FIRSTNAME * GRAPHIC(30);
.FIELD JOBTITLE  * VARGRAPHIC(30);
IF, ELSE, and ENDIF
Purpose
TPump provides a structure of IF, ELSE, and ENDIF commands for the conditional control of
execution processes. Conditional execution works as follows:
Syntax
.IF conditional expression THEN;
    statements to execute if TRUE
.ELSE;
    statements to execute if FALSE
.ENDIF;
    statements to resume with
where

conditional expression
   user-defined variables or predefined system variables following the IF command, whose
   condition (TRUE or FALSE) triggers the execution of alternative groups of statements

statements to execute if TRUE
   statements to be executed whenever the conditional expression following the IF
   command evaluates as TRUE

statements to execute if FALSE
   statements following the optional ELSE command, which execute only when the
   conditional expression following the IF command evaluates as FALSE

statements to resume with
   statements following the ENDIF command to terminate the conditional statement
   execution process and resume the normal command sequence
Usage Notes
The conditional expression in the IF command may consist of either user-defined variables or
predefined system variables.
The ELSE command clause is optional. ELSE is used only when there are statements to be
executed when the condition is evaluated as false. Conditional expression is an expression
which can be evaluated as either true or false. When evaluation of the expression returns a
numeric result, 0 is interpreted as false; nonzero results are interpreted as true. See
"Utility Variables" on page 62.
TPump supports the nesting of IF commands to a level of 100.
Any ELSE or ENDIF commands must be present in their entirety and cannot be composed
simply of variables in need of substitution.
Commands and statements following an IF, ELSE, or ENDIF structure that are not executed
are not parsed and do not have their variables substituted.
Example 1
TPump is case-sensitive when comparing an ‘&SYS’ system variable. The RUN FILE
command does not execute because the substituted values returned in this example are all
uppercase. Consider this when creating a script to force the execution of a predetermined
sequence of events. If ‘FRI’ had been used in line 0003, the comparison would succeed and
the RUN FILE command would execute.
0003 .IF ’&SYSDAY’ = ’Fri’ THEN;
14:10:28 - FRI MAY 09, 1997
UTY2402 Previous statement modified to:
0004 .IF ’FRI’ = ’Fri’ THEN;
0005 .RUN FILE UTNTS38;
0006 .ENDIF;
Example 2
In Example 2, the user has created the table named &TABLE and a variable named
CREATERC, into which is set the system return code resulting from the execution of the
CREATE TABLE statement. If the table does not already exist and the return code is
nonzero, the return code indicates an error condition and the job logs off, displaying the
error code.
0010 .SET CREATERC TO &SYSRC;
0011 .IF &CREATERC = 3803 /* Table &TABLE already exists */ THEN;
UTY2402 Previous statement modified to:
0012 .LOGOFF 08;
0013 .RUN FILE RUN01;
0014 .ELSE;
0015 .IF &CREATERC <> 0 THEN;
0016 .LOGOFF &CREATERC;
0017 .ENDIF;
IMPORT
Purpose
The IMPORT command identifies a source for data input. By referencing the LAYOUT
command and DML command, IMPORT ties the previous commands together. The input
data source used for IMPORT depends on whether the TPump utility is running on an IBM
VM or MVS client, or on a network-attached client platform, as shown in the following syntax
diagram.
Syntax
For Channel-Attached Client Systems:
.IMPORT INFILE ddname [AXSMOD name ['init-string']] [HOLD | FREE]
    [INMOD modulename [USING (parms)]]
    [FROM m] [FOR n | THRU k]
    [FORMAT VARTEXT ['c'] [DISPLAY ERRORS] [NOSTOP]]
    LAYOUT layoutname
    [APPLY label [WHERE condition]] ... ;
For Network-Attached Client Systems:
.IMPORT INFILE filename [AXSMOD name ['init-string']]
    [INMOD modulename [USING (parms)]]
    [FROM m] [FOR n | THRU k]
    [FORMAT {FASTLOAD | BINARY | TEXT | UNFORMAT | VARTEXT ['c'] [DISPLAY ERRORS] [NOSTOP]}]
    LAYOUT layoutname
    [APPLY label [WHERE condition]] ... ;
where
Syntax Element
Description
INFILE ddname
external data source that contains the input records on channel-attached
client systems
In MVS, this is a DDNAME. In VM, it is a FILEDEF name.
If DDNAME is specified, TPump reads data records from the specified
source. If modulename is also specified, TPump passes the records it reads to
the specified module.
The DDNAME must obey the applicable rules of the external system.
A DDNAME must obey the same construction rules as Teradata SQL column
names except that:
• The “at” character (@) is allowed as an alphabetic character
• The underscore character (_) is not allowed
If the DDNAME represents a data source on magnetic tape, the tape may be
either labeled or nonlabeled, as supported by the operating system.
AXSMOD name
name of the access module file to be used to import data
The names of the access module files are:
OLE DB Access Module
oledb_axsmod.dll on Microsoft® Windows platforms
Named Pipes Access Module
• np_axsmod.sl on Hewlett-Packard® HP-UX platforms
• np_axsmod.so on NCR® MP-RAS, IBM® AIX®, Sun® Solaris® SPARC®,
Sun® Solaris® Opteron®, and Novell® SUSE® Linux Enterprise and Red
Hat® Enterprise Linux® Advanced Server platforms
• np_axsmod.dll on Windows platforms
Note: When using TPump latency option with Named Pipe Access
Module, the Named Pipe Access Module parameter file should use
need_full_block = no option.
WebSphere® Access Module for Teradata (client version)
• libmqsc.sl on HP-UX platforms
• libmqsc.so on MP-RAS, AIX, Solaris SPARC, Solaris Opteron, and Linux
platforms
• libmqsc.dll on Windows platforms
WebSphere® Access Module for Teradata (server version)
• libmqs.sl on HP-UX platforms
• libmqs on IBM MVS/ESA platforms
• libmqs.so on AIX, Solaris SPARC, Solaris Opteron, and Linux platforms
• libmqs.dll on Windows platforms.
You may use your own shared library file name if you have a custom access
module.
Large File Access Module is no longer available because the Data Connector
API supports file sizes greater than 2 gigabytes on Windows, HP-UX, AIX,
and Solaris SPARC platforms.
The AXSMOD option is not required for importing:
• Disk files on either network- or channel-attached client systems
• Magnetic tape files on channel-attached client systems
It is required for importing magnetic tape and other types of files on network-attached
client systems.
Refer to Teradata Tools and Utilities Access Module Reference for more
information about the specific access modules.
‘init-string’
optional initialization string for the access module
INFILE filename
fully qualified UNIX or Windows path name for an input file on network-attached client systems
If the path name has embedded white space characters, you must enclose the
entire path name in single or double quotes.
If you specify the INFILE filename, the data is read from the specified source.
If you also specify the INMOD modulename, the data is passed to the
specified module.
HOLD
default condition to not deallocate the input tape device specified by ddname
when the import operation completes on channel-attached client systems
Instead, the HOLD specification de-allocates the device when the entire
TPump operation completes.
FREE
deallocation of the tape input device specified by ddname when the import
operation completes on channel-attached client systems
When de-allocated, any attempt to open the input device, either in the same
TPump utility task or in another task within the same script, produces an
undefined ddname error.
The default is to not deallocate the device.
INMOD
modulename
optional user-written routine for preprocessing the input data
In MVS, the modulename is the name of a load module. In UNIX and
Windows, it is the pathname for the INMOD executable code file.
The modulename must obey the applicable rules of the external system.
A modulename must obey the same construction rules as Teradata SQL
column names except that on channel-attached client systems:
• The “at” character (@) is allowed as an alphabetic character
• The underscore character (_) is not allowed
When you specify both the INFILE fileid and the INMOD modulename
parameters, the input file is read and the data is passed to the INMOD
routine for preprocessing.
If you do not specify the INFILE fileid parameter, your INMOD routine must
provide the input data record.
Note: When you use an INMOD routine with the INFILE specification,
TPump performs the file read operation, and the INMOD routine acts as a
pass-through filter.
Because the FDL-compatible INMOD routine must always perform the file
read operation, you cannot use an FDL-compatible INMOD routine with the
INFILE specification of a TPump IMPORT command.
Note: On some versions of UNIX, you may have to add ./ prefix characters to
the modulename specification if the module is in the current directory.
USING (parms)
character string with the parameters you want to pass to the user exit routine
The parms string can include one or more character strings, each delimited
on either end by an apostrophe or quotation mark.
Maximum size of the parms string is 1K bytes.
Parentheses within delimited character strings or comments have the same
syntactical significance as alphabetic characters.
Before passing the parms string to the user exit routine, the following items
are replaced with a single blank character:
• Each comment.
• Each consecutive sequence of white space characters, such as blank, tab
and so on, that appears outside of delimited strings.
The entire parms string must be enclosed in parentheses. On channel-attached client systems, the parentheses are included in the string passed to
the user exit routine.
Note: The parms string must be FDLINMOD for the user exit routines
written for the prior Pascal version of the FastLoad utility (program
FASTMAIN).
FROM m
logical record number, as an integer, of the record in the identified data
source where processing is to begin
If you do not use a FROM m specification, TPump begins processing with the
first record received from the data source.
FOR n
number of records, as an integer, starting at record m, to be processed
If you do not use a FOR n or a THRU k specification, TPump continues
processing through the last record obtained from the data source.
THRU k
logical record number, as an integer, of the record in the identified data
source where processing is to end
If you do not use a THRU k or a FOR n specification, TPump continues
processing through the last record obtained from the data source.
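The interaction of the FROM, FOR, and THRU defaults described above can be sketched as ordinary 1-based slicing. The following Python sketch is a hypothetical illustration, not TPump code; it also assumes that when both FOR and THRU are given, the earlier endpoint wins.

```python
def record_window(total, m=None, n=None, k=None):
    """Return the 1-based logical record numbers processed for
    FROM m / FOR n / THRU k (sketch of the defaults described above)."""
    first = m if m is not None else 1        # default: start at the first record
    last = total                             # default: through the last record
    if k is not None:
        last = min(last, k)                  # THRU k: end at record k
    if n is not None:
        last = min(last, first + n - 1)      # FOR n: n records starting at m
    return list(range(first, last + 1))
```

For example, `record_window(10, m=3, n=4)` selects logical records 3 through 6 of a 10-record source.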
FORMAT
record format of the input file
The format can be:
FASTLOAD—A 2-byte integer, n, followed by n bytes of data and an end-of-record marker (either X’0A’ or X’0D’).
BINARY—A 2-byte integer, n, followed by n bytes of data.
TEXT—An arbitrary number of bytes, followed by an end-of-record marker
which is a:
• Line feed (X’0A’) on UNIX platforms.
• Carriage-return and line-feed pair (X’0D0A’) on Windows platforms.
UNFORMAT—defined by FIELD, FILLER, and TABLE commands of the
specified layout.
Note: When using UNFORMAT formatting in MVS, ensure that the data
stream and data source are consistent with the layout defined in the utility
script. Discrepancies in the length of the data stream could result in data
corruption.
VARTEXT—in variable-length text record format, with each field separated
by a delimiter. Rules for a delimiter are:
• No control characters other than a TAB character can be used as a
delimiter.
• Any character that appears in the data cannot be used as a delimiter.
• Delimiters can be up to 10 single characters long.
If you do not specify a FORMAT option, the default is FASTLOAD.
Note: On the mainframe platform, when an access module is not used, the default is to
read the input data record by record and apply the LAYOUT to each record.
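The FASTLOAD record layout above can be illustrated with a small Python reader. This is a sketch, not TPump code, and it assumes the 2-byte length is little-endian, which the manual does not specify.

```python
import io
import struct

def read_fastload_records(stream):
    """Read FASTLOAD-format records: a 2-byte length n, n data bytes,
    then a 1-byte end-of-record marker (X'0A' or X'0D')."""
    records = []
    while True:
        header = stream.read(2)
        if len(header) < 2:                   # end of input
            break
        (n,) = struct.unpack("<H", header)    # assumed byte order
        data = stream.read(n)
        if stream.read(1) not in (b"\x0a", b"\x0d"):
            raise ValueError("missing end-of-record marker")
        records.append(data)
    return records

# two records: b"abc" then b"xy"
demo = struct.pack("<H", 3) + b"abc\x0a" + struct.pack("<H", 2) + b"xy\x0d"
```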
'c'
optional specification of the delimiter characters that separate fields in the
variable-length text records of the input data source
The default, if you do not use a 'c' specification, is the vertical bar character
( | ).
When the character set of the job script is different from the client character
set used for the job (for example, on MVS the job script must be in Teradata
EBCDIC when using the UTF8 client character set, or UTF16 client character
set can be used with the job script in UTF8), TPump will translate the
effective delimiter from the script character set form to the client character set
form before separating the fields with it.
For example, when the client character set is UTF16 and the script character
set is UTF8, if the following command is given:
… FORMAT VARTEXT '-' ...
TPump translates '-' from the UTF8 form to the UTF16 form and then separates the fields
in the record according to the UTF16 form of '-'.
Similarly, on the mainframe, when the client character set is UTF8 and the
script character set is Teradata EBCDIC, if the following command is given:
… FORMAT VARTEXT '6A'xc ...
TPump interprets x’6A’ according to Teradata EBCDIC and translates it to the
corresponding Unicode code point, U+007C "VERTICAL LINE", and uses
the UTF8 encoding scheme of U+007C, 0x7C (which is '|' in 7-bit ASCII), as
the delimiter character for the record.
Caution:
When using the UTF8 client set on the mainframe, examine the
definition in the International Character Set Support manual to
determine the code points of the special characters you might
require. Different versions of EBCDIC do not always agree as to
the placement of these characters.
For example, the code point of '|' in most IBM EBCDIC code pages
is x'4F'. If you specify '|' as the delimiter in the script or leave the
delimiter to default in a system environment using such an IBM
EBCDIC code page, (which is essentially the same as specifying '|'),
but your UTF8 data uses x'7C' ('|' in Unicode) as the delimiter, the
job will run into errors because:
1 the code point of x'4F' in Teradata EBCDIC maps to U+008D but not
U+007C, and
2 the delimiter must use single-byte characters when it is in the client
character set form.
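The translation described above amounts to re-encoding the script-character-set delimiter into the client character set before it is used to separate fields. A minimal Python sketch follows; the encoding names are illustrative stand-ins, not TPump internals.

```python
def client_delimiter(delim, client_encoding):
    """Return the client-character-set form of a delimiter that was
    written in the job script (sketch of the translation rule above)."""
    return delim.encode(client_encoding)

# '-' written in a UTF8 script, used with a UTF16 client character set
utf16_delim = client_delimiter("-", "utf-16-le")
```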
DISPLAY ERRORS
optional keyword specification that writes input data records that produce
errors to the standard error file
NOSTOP
optional keyword specification that inhibits the TPump termination in
response to an error condition associated with a variable-length text record
LAYOUT
layoutname
layout of the input record, as specified by a previous command
APPLY label
error treatment options specified by a previous DML LABEL command for
subsequent INSERT, UPDATE, or DELETE statements
WHERE condition
condition that determines whether the indicated label options are applied to
the records and sent to the Teradata Database, where:
• condition true = yes
• condition false = no
The condition specification can reference:
• Any combination of fields defined in the currently active layout
• System and user-defined constants and variables
• The fieldname1 specified in commands
When you specify VARTEXT, the TPump utility assumes that the input data
is variable-length text fields separated by a field delimiter character. The
utility parses each input data record on a field-by-field basis, and creates a
VARCHAR field for each input text field.
When the character set of the job script is different from the client character
set used for the job (for example, on MVS the job script must be in Teradata
EBCDIC when using the UTF8 client character set, or UTF16 client character
set can be used with the job script in UTF8), TPump translates the string
constants specified in the condition and the import data referenced in the
condition to the same character set before evaluating the condition.
For example, when the client character set is UTF16 and the script character
set is UTF8, if the following command is given
… APPLY label1 WHERE C1 = 'INSERT';
TPump translates the data in the C1 field to the UTF8 form and compares it
with the UTF8 form of 'INSERT' to obtain the evaluation result.
Similarly, on the mainframe, when the client character set is UTF8 and the
script character set is Teradata EBCDIC, if the following command is given:
… APPLY label2 WHERE C2 = 'DELETE';
TPump translates the data in the C2 field from the UTF8 form to the Teradata
EBCDIC form and performs the comparison with the Teradata EBCDIC form
of 'DELETE'.
Caution: When using the UTF8 client set on the mainframe, be sure to
examine the definition in the International Character Set Support
manual to determine the code points of the special characters you
might require. Different versions of EBCDIC do not always agree
as to the placement of these characters. The mappings between
Teradata EBCDIC and Unicode can be referred to in Appendix E
of the International Character Set Support manual.
Usage Notes
A maximum of four IMPORT commands can be used in a single TPump load task. A single
load comprises the set of commands and statements bounded by a BEGIN LOAD-END LOAD
command pair. If the number of IMPORTs sent to the Teradata Database for the load exceeds
four, an error message is logged. TPump has been limited to four IMPORTs per load in order
to limit the amount of memory needed to keep track of job-related statistics.
The maximum number of INSERT, UPDATE, DELETE, and EXECUTE statements that can
be referenced in an IMPORT is 127.
The only DML statements that are candidates for application by an IMPORT command are
those within the scope of DML commands whose labels appear in one or more of the
IMPORT command’s APPLY clauses. The referenced DML commands and their following
DML statement(s) must appear between the BEGIN LOAD command that defines the task
and the referencing IMPORT commands. A statement or group of statements is applied if no
condition is specified, or if the specified condition is true.
TPump permits multiple statements to be applied to the same data record in either of two
ways. First, if an APPLY clause refers to a label whose scope includes multiple DML
statements, each of these statements is applied to the same data record under the same
condition specified in the clause. Second, if multiple APPLY clauses are used, each can refer to
the label of a different DML statement or group of statements. Each label’s statements are
applied to the same data record under that condition specified in the respective clause. These
features allow the same data record to be applied to different tables under the same or
differing conditions.
VARTEXT Record Usage
When you specify VARTEXT, TPump assumes that the input data is variable-length text fields
separated by a field delimiter character. It parses each input data record on a field-by-field
basis, and creates a VARCHAR field for each input text field.
When using the VARTEXT specification, VARCHAR, VARBYTE, and LONG VARCHAR are
the only valid data type specifications to use in TPump layout FIELD and FILLER commands.
Two consecutive delimiter characters direct TPump to null the field corresponding to the one
immediately following the first delimiter character.
Also, if the last character in a record is a delimiter character, and there is at least one more field
to be processed, then TPump nulls the field corresponding to the next one to be processed, as
defined in the layout FIELD and FILLER commands.
The total number of fields in each input record must be equal to or greater than the number of
fields described in the TPump layout FIELD and FILLER commands.
If it is less, TPump generates an error message. If it is more, the Teradata Database ignores the
extra fields.
The last field of a record does not have to end with a delimiter character. It can end with a
delimiter character, but it is not required.
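The VARTEXT field rules above (two consecutive delimiters null a field, a trailing delimiter nulls the next field, short records are errors, extra fields are ignored) can be sketched in Python. This is a hypothetical re-implementation for illustration, not TPump code.

```python
def parse_vartext(record, n_fields, delim="|"):
    """Split one VARTEXT record into n_fields values, nulling (None)
    empty fields per the rules described above."""
    raw = record.split(delim)        # "" appears between consecutive
                                     # delimiters and after a trailing one
    if len(raw) < n_fields:
        raise ValueError("record has fewer fields than the layout")
    return [None if v == "" else v for v in raw[:n_fields]]  # extras ignored
```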
When TPump encounters an error condition in an input record, it normally discards the
record and terminates. When loading variable-length text records, you can inhibit either or
both of these functions by specifying the error-handling options:
•
DISPLAY ERRORS
•
NOSTOP
If NOSTOP is specified, TPump will not terminate even if an error is encountered.
By specifying both options and redirecting STDERR to a file location instead of your terminal
screen, the TPump job runs to completion and saves all the error records. Then you can
manually modify them and load them into the table.
All IMPORT commands for a TPump task must appear between the BEGIN LOAD and END
LOAD commands for the task.
TPump imposes several syntax rules for the parms string for an INMOD user exit routine. On
entry to any INMOD user exit routine for TPump, the conventional parameter register points
to a parameter list of two 32-bit addresses used to communicate with the INMOD.
At the end of an IMPORT, an environmental variable is established for each DML command
executed. TPump variables are not constrained to 30 characters. These variables contain the
activity counts associated with each statement. The variables created are of the form:
&IMP<n>_<Apply label>_<x>
where
n = the number of the IMPORT, from one through four.
Apply label = the label of the clause containing the DML command in question.
x = the number of the statement within the containing APPLY clause.
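The naming rule can be sketched as simple string assembly; note that the example names below (for instance &IMP1_UPSERTAC_1) show no space between IMP and the import number, so none is assumed in this sketch.

```python
def imp_variable(import_number, apply_label, statement_number):
    """Form an IMPORT activity-count variable name, following the
    pattern of the example names such as &IMP1_UPSERTAC_2."""
    return f"&IMP{import_number}_{apply_label}_{statement_number}"
```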
The following script is an example of a TPump job using the APPLY keyword to create
conditional clauses to apply DML INSERTs, UPDATEs, and UPSERTs to the IMPORT.
APPLY Example
.BEGIN LOAD SESSIONS 34;
.LAYOUT EQTTB535;
.FIELD Pool_Upd_Code * CHAR(01);
.FIELD Eqmt_Init * CHAR(04);
....
.DML LABEL UPSERTAC
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE EQTDBT50.EQTTB535_TAL SET
TCS_POOL_IDFR_NUM =:TCS_POOL_IDFR_NUM
.....
WHERE
.....
;
INSERT INTO EQTDBT50.EQTTB535_TAL
VALUES(
POOL_EXPN_DATE
=:POOL_EXPN_DATE (DATE, FORMAT 'YYYYMMDD')
.....
);
.DML LABEL UPSERTDL;
UPDATE EQTDBT50.EQTTB535_TAL SET
.....
WHERE
.....
;
.IMPORT INFILE INFILE
LAYOUT EQTTB535
APPLY UPSERTAC WHERE (POOL_UPD_CODE = 'C'
OR POOL_UPD_CODE = 'A')
APPLY UPSERTDL WHERE POOL_UPD_CODE = 'D'
;
.END LOAD;
/* For the upsert: */
/* (first statement in .DML UPSERTAC) */
/* make sure we have the 50 updates */
.IF &IMP1_UPSERTAC_1 <> 50 THEN
.LOGOFF 100;
/* ... and 50 inserts */
/* (second statement in .DML UPSERTAC) */
.IF &IMP1_UPSERTAC_2 <> 50 THEN
.LOGOFF 101;
/* And for the plain update: */
/* (first statement in .DML UPSERTDL) */
/* we should have 10 of these. */
.IF &IMP1_UPSERTDL_1 <> 10 THEN
.LOGOFF 102;
.LOGOFF;
INSERT
Purpose
TPump supports the Teradata SQL INSERT statement, which adds new rows to a table by
directly specifying the row data to be inserted.
Syntax
{INSERT | INS} [INTO] tname .* ;

{INSERT | INS} [INTO] tname [( cname [, cname]... )]
    VALUES ( :fieldname | expression [, :fieldname | expression]... ) ;
where
Syntax Element
Description
tname
table that is to receive rows created from input data records
If the table is not explicitly qualified by database name, the default database
qualifies it.
cname
column of the specified table that is to receive the value from a field of
matching input records, where the value is identified by the corresponding
entry in the fieldname list
fieldname
field of an input record, whose value is given to a column of the specified table
that is identified by the corresponding entry in the cname clause in this
statement
If this statement did not specify a cname, the object’s CREATE statement
provides the corresponding column identifier. This assumes that all columns
from the table correspond to those specified in the original CREATE
statement.
expression
alternative to the fieldname clause, an expression that includes one or more
actual fieldnames as terms may instead be used
Usage Notes
The following notes describe how to use an INSERT statement following a DML command.
An INSERT statement may also be used in the support environment; normal rules for INSERT
are followed in that case.
One way of specifying the applicable DML statements is to relate each field name to the name
of the column to which the field’s data is applied. Another way tells TPump to apply the first
nonfiller field of a record that is sent to the Teradata Database to the first defined column of
the affected table, the second nonfiller field to the second column, and so on.
TPump converts INSERT statements into macro equivalents and, depending on the packing
specified, submits multiple statements on one request.
To insert records into the table identified by tname, the username specified in the LOGON
command must have the INSERT privilege for the table.
A value must be specified for every column, either explicitly or by default.
For TPump use, if the object of the INSERT statement is a view, it must not specify a join.
TPump operates only on single table statements so INSERT statements must not contain any
joins.
The correspondence between the fields of data records to be inserted into a table, and the
columns of the table, can be specified in any of four ways. These appear in the following
examples, using targetable as the table or view name.
The maximum number of INSERT, UPDATE, and DELETE statements that can be referenced
in an IMPORT is 127.
The maximum number of DML statements that can be packed into a request is 600. The
default number of statements packed is 20.
ANSI/SQL DateTime Specifications
You can use the ANSI/SQL DATE, TIME, TIMESTAMP, and INTERVAL DateTime data types
in Teradata SQL CREATE TABLE statements. Specify them as column/field modifiers in
INSERT statements.
Example 1
.BEGIN LOAD SESSIONS number;
.LAYOUT Layoutname;
.TABLE Targetablename;
.DML LABEL DMLlabelname;
INSERT INTO Targetablename.*;
.IMPORT INFILE Infilename LAYOUT Layoutname
APPLY DMLlabelname;
.END LOAD;
Example 2
.LAYOUT lname;
.FIELD first 1 somedatatype;
.FIELD f2nd * anydatatype;
.
.
.
.FIELD flast * datatype;
.DML LABEL label;
INSERT INTO targetable VALUES (:first, :f2nd, ... :flast);
Example 3
.LAYOUT lname;
.FIELD first 1 somedatatype;
.FIELD f2nd * anydatatype;
.
.
.
.FIELD flast * datatype;
.DML LABEL label;
INSERT INTO targetable (col1, col2, ... colast)
VALUES (:f2nd, :first, ... :flast);
Example 4
An input data source contains a series of 10- to 40-byte records. Each record contains the
primary index value (EmpNum) of a row that is to be inserted successively into the Employee
table whose columns are EmpNo, Name, and Salary.
.BEGIN LOAD SESSIONS number ;
.LAYOUT Layoutname;
.FIELD EmpNum 1 INTEGER;
.FIELD Name * (VARCHAR (30));
.FIELD Sal * (DECIMAL (7,2));
.DML LABEL DMLlabelname;
INSERT Employee (EmpNo, Name, Salary) VALUES (:EmpNum, :Name, :Sal);
.IMPORT INFILE Infilename LAYOUT Layoutname APPLY DMLlabelname;
.END LOAD;
LAYOUT
Purpose
The LAYOUT command, in conjunction with the immediately following sequence of FIELD,
FILLER, and TABLE commands, specifies the layout of the externally stored data records.
Syntax
.LAYOUT layoutname [CONTINUEIF condition] [INDICATORS] ;
where
Syntax Element
Description
layoutname
name assigned to the layout for reference by one or more subsequent IMPORT
commands
A layoutname must obey the same rules for its construction as Teradata SQL
column names.
The name specified also may be used in the LAYOUT clause of an IMPORT
command.
CONTINUEIF
condition
condition that, when satisfied, indicates that the current input record is continued in the
next input record, specified as the following:
position = value
where position is an unsigned integer (never an asterisk) that specifies the
starting character position of the field of every input record that contains the
continuation indicator, and where value is the continuation indicator specified
as a character constant or a string constant. TPump uses the length of the
constant as the length of the continuation indicator field.
In the CONTINUEIF option, the condition specified as position = value is
case-sensitive; verify that the correct case has been specified for this parameter.
If the condition is true, TPump forms a single record to be sent to the Teradata
Database, by concatenating the next input record at the end of the current
record. (The current record is the one most recently obtained from the external
data source.)
If the condition parameter is false, TPump sends to the Teradata Database, the
current input record either by itself, or as the last of a sequence of concatenated
records.
Note that the starting position of the continuation indicator field is specified as
a character position of the input record. Character positions start with
character position one. TPump removes the continuation indicator field from
the input records so that they are not part of the final concatenated result.
For other fields whose startpos is specified as an unsigned integer, the startpos
refers to the field position within the final concatenated result. Consequently,
you cannot define the continuation indicator field as all or part of a field
defined with the FIELD, FILLER, or TABLE commands. Refer to the startpos
parameter of the FIELD command.
When the character set of the job script is different from the client character set
used for the job (for example, on MVS the job script must be in Teradata
EBCDIC when using the UTF8 client character set, or UTF16 client character
set can be used with the job script in UTF8), TPump translates the specified
value, which is either a character constant or a string constant, from the script
character set form to the client character set form before evaluating the
condition. TPump uses the length of the constant in the client character set
form as the length of the continuation indicator field.
Caution: When using the UTF8 client set on the mainframe, be sure to
examine the definition in the International Character Set Support
manual to determine the code points of the special characters you
might require. Different versions of EBCDIC do not always agree as
to the placement of these characters.
The mappings between Teradata EBCDIC and Unicode can be
referred to in Appendix E of International Character Set Support.
INDICATORS
condition that the data is in the indicator mode
This means that the first n bytes of each record are indicator bytes, where n is
the rounded-up integer quotient of the number of fields defined by this
LAYOUT command for transmission to the Teradata Database, divided by 8.
TPump sends all the FIELD commands, including redefines, to the Teradata
Database. If a field has been defined and then redefined, indicator bits must be
set for both. FILLER commands also need to have indicator bits set. TPump
sends both the defined and the redefined fields to the Teradata Database. This
demonstrates the inefficiency of redefines which cause the transfer of an
extraneous field.
If INDICATORS is specified on the LAYOUT command and the data file does
not contain indicator bytes in each record, the target table will be loaded with
spurious data. Conversely, if INDICATORS is not specified and the data file
contains indicator bytes in each record, the target table will likewise be
corrupted. Exercise caution to guard against either occurrence.
A LAYOUT command that includes the INDICATORS option must accurately
describe all fields of the record to agree with the column descriptions and
ordering of the table from which this indicator-mode data was previously
selected. If the INDICATORS option is specified, TPump sends the data to the
Teradata Database with indicators.
The NULLIF parameter of the FIELD command can be specified with or
without the INDICATORS option. If NULLIF is specified, TPump sends the
data to the Teradata Database with indicators, whether or not the
INDICATORS option is specified.
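The indicator-byte arithmetic above (n is the rounded-up quotient of fields divided by 8) can be sketched in Python. The bit order shown, most significant bit first with a set bit marking a NULL field, is the usual Teradata indicator convention but is an assumption here, not a statement from this section.

```python
import math

def indicator_bytes(null_flags):
    """Build the leading indicator bytes for one record: one bit per
    field, padded to whole bytes; a set bit marks a NULL field."""
    n = math.ceil(len(null_flags) / 8)       # rounded-up fields / 8
    out = bytearray(n)
    for i, is_null in enumerate(null_flags):
        if is_null:
            out[i // 8] |= 0x80 >> (i % 8)   # assumed MSB-first bit order
    return bytes(out)
```

For example, a layout with nine fields always contributes two indicator bytes, whether or not any field is NULL.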
Usage Notes
Although there is no explicit limit to the number of LAYOUT commands allowed, there is a
practical limit. The implied limit on usable LAYOUT commands per TPump load is four,
because TPump allows up to four IMPORT commands within a load, and each IMPORT can
reference only one LAYOUT.
A LAYOUT command must be immediately followed by a combination of FIELD, FILLER,
and/or TABLE commands. This sequence of commands, referenced by the layoutname, may
describe one or more record formats contained in one or more client data sources (see
redefinition options for FIELD, FILLER, and TABLE). The LAYOUT command sequence is
terminated by the first subsequent command that is not a FIELD, FILLER, or TABLE
command.
A layoutname may be used by one or more TPump tasks (delimited by BEGIN LOAD and
END LOAD) in a single job step and must be defined prior to any IMPORT commands that
reference it. All IMPORT commands in a single TPump task must reference the same
layoutname in the LAYOUT clause.
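A minimal sketch of this ordering follows (table, field, and file names are hypothetical, and the .DML and .IMPORT commands are assumed to appear inside a BEGIN LOAD/END LOAD task). The .DML command, being neither a FIELD, FILLER, nor TABLE command, terminates the LAYOUT sequence:
.LAYOUT lay01;
.FIELD c1 * CHAR(12);
.FILLER pad1 * CHAR(1);
.FIELD c2 * CHAR(8);
.DML LABEL ins01;
INSERT INTO tgttbl (col1, col2) VALUES (:c1, :c2);
.IMPORT INFILE datafile FORMAT TEXT
LAYOUT lay01
APPLY ins01;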
LOGDATA
Purpose
Supplies parameters, beyond the user ID and password, that the logon mechanism specified by the LOGMECH command needs to successfully authenticate the user. The LOGDATA command is optional. Whether parameters are supplied, and their values and types, depends on the selected logon mechanism.
LOGDATA is available only on network-based platforms.
Syntax
.LOGDATA logdata_string ;

.LOGDATA 'logdata_string' ;
where
Syntax Element
Description
logdata_string ‘logdata_string’
parameters required for the logon mechanism specified using
“LOGMECH” on page 161
For information about the logon parameters for supported
mechanisms, see the Security Administration guide.
The string is limited to 64 KB and must be in the session
character set.
To specify a string containing white space or other special
characters, enclose the data string in single quotes.
Note: The security feature this command supports is not
supported with the UTF16 session character set.
Usage Notes
For more information about logon security, see the Security Administration guide.
Examples
If used, the LOGDATA command and the LOGMECH command must precede the LOGON
command. The commands themselves may occur in any order.
The example demonstrates using the LOGDATA, LOGMECH, and LOGON commands in
combination to specify the Kerberos logon authentication method and associated parameters:
.logmech KRB5;
.logdata joe@domain1@@mypassword;
.logon cs4400s3;
LOGMECH
Purpose
Identifies the appropriate logon mechanism by name. If the specified mechanism requires
parameters other than user ID and password for authentication, the LOGDATA command
provides these parameters. The LOGMECH command is optional and available only on
network-attached systems.
Syntax
.LOGMECH logmech_name ;
where
Syntax Element
Description
logmech_name
name of the logon mechanism to be used
For a discussion of supported logon mechanisms, see Security Administration.
The name is limited to 8 bytes; it is not case-sensitive.
Usage Notes
Every session to be connected requires a mechanism name. If none is supplied, a default
mechanism can be used instead, as defined on either the server or client system in an
XML-based configuration file.
For more information about logon security, see Security Administration.
Examples
If used, the LOGDATA and LOGMECH commands must precede the LOGON command. The
commands themselves may occur in any order.
The following example demonstrates using the LOGDATA, LOGMECH, and LOGON
commands in combination to specify the Windows logon authentication method and
associated parameters:
.logmech NTLM;
.logdata joe@domain1@@mypassword;
.logon cs4400s3;
LOGOFF
Purpose
The LOGOFF command disconnects all active sessions and terminates execution of TPump
on the client. An optional return code value may be specified as a conditional or arithmetic
expression, evaluated to a signed integer.
Syntax
.LOGOFF [retcode] ;
where
Syntax Element
Description
retcode
completion code returned to the client operating system
If retcode is not specified, TPump returns the value generated by the error
condition.
Usage Notes
TPump tracks the internal error condition code throughout the job and returns 0 for
complete success, 4 for warnings, 12 for fatal errors, or 16 for no sysprint. These values are
the “error conditions”.
To avoid ambiguity or conflict with standard TPump completion codes, values greater than 20
should be used. TPump returns the higher value between the value generated by the error
condition and the return code specified in LOGOFF.
If the LOGOFF command processes, this indicates that the highest return code reached was no
more than 4 (warning). Any return code other than 0 or 4 would have terminated the job.
LOGOFF is permitted at any point in the input script and logs you off immediately.
Example
Suppose successful execution of a Teradata SQL statement (such as CREATE TABLE) is
necessary to prepare for TPump. If you determine the statement has failed with an
unacceptable completion code, and if BADRC is set to &SYSRC after the failed SQL statement,
you can terminate execution of TPump and return the unacceptable code to the client by
executing this command:
.LOGOFF &BADRC;
The restart table is dropped when this command is executed. If execution is terminated before
the LOGOFF command is encountered, the restart table is not dropped, in order to support a
restart at a later time.
If a serious error terminates the program before the LOGOFF command is processed, the
return code output is the value generated by the error condition rather than the optional
retcode specified as a LOGOFF command option.
LOGON
Purpose
The LOGON command establishes a Teradata SQL session between TPump and the Teradata
Database. You use it to specify the LOGON string for connecting sessions required by
subsequent functions.
Syntax
Standard LOGON
.LOGON [tdpid/]username[,password][,'acctid'] ;
Note: On VM/MVS, with the use of the User Logon Exit routine in TDP, the user name is not
required. See Teradata Director Program Reference for more information.
Single Sign On LOGON
.LOGON [tdpid/][username][,password][,'acctid'] ;
Note: When logon encryption is enabled on the gateway, single sign on is disabled on the
client and standard logon syntax should be used instead.
where
Syntax Element
Description
tdpid
optional identifier associated with a particular copy of the Teradata Database
If this field is not specified, the default tdpid, established by the system
administrator, is used.
For channel-attached systems, the tdpid string must be in the form:
TDPn
where n is the TDP identifier
username
user identifier of up to a 30-character maximum
password
optional password associated with the username, up to a 30-character
maximum
The Teradata Database must be configured to recognize the password specified.
’acctid’
optional account identifier associated with the username, up to a 30-character
maximum
You must enclose the string specification in single quotes.
If this field is not specified, a default ’acctid’ is used.
Usage Notes
Both the LOGON command and the LOGTABLE command are required to set up the TPump
support environment. You can use them in any order, but they must precede any other
commands. However, you can use a RUN FILE command to identify a file containing the
LOGON command before the LOGON and LOGTABLE commands.
LOGON and LOGTABLE commands typically occur as:
.logtable logtable001;
.logon tdpx/me,paswd;
When the LOGON command is executed, the initial TPump utility session is logged on. The
logon information is saved and re-used when processing the BEGIN LOAD command to
connect the appropriate number of sessions.
The parameters (tdpid, username, password, and ’acctid’) are optional and are used in all
sessions established with the Teradata Database. The LOGON command may only occur once.
The period (.) preceding LOGON is also optional.
The tdpid identifier specifies a particular Teradata Database. See your system or site
administrator for the identifier that you plan to use. If you do not specify a tdpid and the site
administrator has not updated the System Parameter Block, the default identifier is Teradata
Database. The long form of this parameter, tdpx, should be used to avoid CLI errors that can
occur when the short form is used.
The tdpid parameter is optional if your site has only one TDP, if you have previously executed
a TDP command, or if you select the default TDP. This parameter is not case-sensitive.
TPump does not prompt for a username or password. If either or both of these are required,
TPump fails and reports the error. Both of these parameters may be optional if a logon exit is
used.
Where possible, you should not use special characters in the ’acctid’ parameter string.
Although ’acctid’ may contain special characters, they might be interpreted differently by
different output devices. Therefore, you might have to modify a script containing special
characters if your output is routed to another device. If the ’acctid’ contains an apostrophe
(single quote) character, you should use either the second form of the LOGON command,
which is delimited by quotes, or double the apostrophe character as follows:
.LOGON 0/fml,fml,"engineering's account"
or
.LOGON 0/fml,fml,'engineering''s account'
If the ’acctid’ does not contain an apostrophe, the two LOGON command forms are the same.
If you enter any parameter incorrectly, the logon fails and TPump returns an error message.
For security reasons, the error message does not state in which parameter the error occurred.
If password security on a channel-attached client is a concern, use the RUN FILE command to
alter the script to accept the LOGON command from another dataset/file under the control of
ACF2 or another client-resident security system. For example:
//stepname EXEC PGM=TPUMP,...
//SECURE DD DSN = FOO
//SYSIN DD *
.LOGTABLE MYTABLE;
.RUN SECURE;
You can then log on by simply entering the LOGON command with a valid user name and no
password if your system administrator has granted this option. As an example, to log onto
TPump as user ABC with ABC as the password (which is masked from view on the output
listing), specify the LOGON command on one line as follows:
.logon ABC,ABC
When the command is entered, TPump displays something like:
**** 22:13:18 UTY8400 Teradata Database Release: 12.00.00.00
**** 22:13:18 UTY8400 Teradata Database Version: 12.00.00.00
**** 22:13:18 UTY8400 Default character set: EBCDIC
**** 22:13:18 UTY8400 Maximum supported buffer size: 1M
**** 22:13:26 UTY6211 A successful connect was made to the RDBMS
Logon exits are supported on both mainframe and UNIX clients. The CLIv2 User Logon Exit
routine can be used to make some or all logon string elements optional.
Note: If the RDBMS is configured to use single sign on (SSO) and you are logged on to the
Teradata client machine, the machine name, user name, and password are not required in the
LOGON command. The user name and password combination specified when you logged on
to your Teradata client machine are authenticated via network security for a SSO such that
valid Teradata users will be permitted to log on to the Teradata Database. The use of SSO is
strictly optional, unless the Gateway has been configured to accept only SSO-style logons.
If you want to connect to a Teradata Database other than the default, the TDPid must be
included in the LOGON command. If the TDPid is not specified, the default contained in
clispb.dat will be used. To be interpreted correctly, the TDPid must be followed by the slash
separator (‘/’), to distinguish the TDPid from a Teradata Database user name. For example, to
connect to slugger, you would enter one of the following:
.LOGON slugger/;
.LOGON slugger/,,'acctinfo';
If you enter the LOGON command first, TPump warns you that the LOGTABLE command is
also required.
To use an account ID, specify the optional ’acctid’ parameter in the LOGON command.
LOGTABLE
Purpose
The LOGTABLE command identifies the table to be used for journaling checkpoint
information required for safe, automatic restart of the TPump support environment in the
event of a client or Teradata Database hardware platform failure.
The LOGTABLE command is used in conjunction with the LOGON command, both of which
are required. Both LOGON and LOGTABLE may appear in any order, but must precede any
other commands except any RUN FILE commands used to identify the file containing the
LOGON command. If you enter LOGON first, you are warned that the LOGTABLE is
required.
Caution:
Do not share the restart log table between two or more TPump jobs. Each TPump job must
have its own restart log table to ensure that it runs correctly. If you do not use a distinct restart
log table for each TPump job, the results are unexpected. In addition, you may not be able to
restart one or more of the affected jobs.
Syntax
.LOGTABLE [dbname.]tname ;
where
Syntax Element
Description
dbname
(optional) database name under which the log table exists
The default is the database name associated with the username specified in the
LOGON command. TPump searches for the table (tname) in that database
unless another database name is specified in this option.
tname
identifier for the restart log table
Usage Notes
A LOGTABLE command is required for each invocation of TPump. Only a single LOGTABLE
command is allowed for each execution. It must precede all environmental and application
commands (other than RUN FILE and LOGON) in the input stream.
The specified table is used as the TPump restart log. It does not have to be fully qualified. If the
table exists, it is examined to determine if this is a restart. When this is the case, a restart is
done automatically. If the table does not exist, it is created and used as a restart log during this
invocation of TPump.
The log table is maintained automatically by TPump. If you manipulate this table
in any way, the restart capability is lost. The only action that you should take is to DROP the
log table; never attempt to delete rows from the table. The log table should not be dropped
while the TPump job using it is running. If the log table is dropped during a job run,
TPump encounters errors.
You cannot override the default for the database name with a DATABASE statement, because it
would have to come after LOGTABLE/LOGON. Instead, use the LOGTABLE dbname
option.
TPump allows a DELETE DATABASE statement because DELETE is a standard Teradata SQL
function. This statement can delete the current restart log after it has been created, which
terminates the job.
Example
The following example presents both the LOGTABLE command and the LOGON command
as they typically occur.
.logtable Mine.Logtable001;
.logon tdpx/me,paswd;
Log Table Space Requirements
The calculation of space requirements for a TPump log table is highly dependent on the
specifics of the job. Although there are mandatory inserts for every TPump job, others occur
on a job-dependent basis. See “Estimating Space Requirements” for details on how to
calculate log table space.
NAME
Purpose
The NAME command assigns a unique job name identifier to the environmental variable
&SYSJOBNAME.
Syntax
.NAME jobname ;
where
Syntax Element
Description
jobname
character string of up to 16 characters that identifies the job
If this command is not specified, the default job name ltdbase_logtable is used, where:
• ltdbase is a character string of up to the first seven characters of the name of the database containing the log table.
• logtable is a character string with the first eight characters of the log table name.
Usage Notes
The NAME environmental command must be used only once, in order to set the job name
and the variable &SYSJOBNAME. Further attempts to execute the command will fail.
The NAME command sets the variable &SYSJOBNAME to the specified string. The string is
truncated to 16 characters. It is an error to use this command more than once in a TPump
script or after the first BEGIN LOAD command in the script.
If &SYSJOBNAME is not set using the NAME command, it defaults to
MYYYYMMDD_HHMMSS_LLLLL, where:
• M = macro
• YYYY = year
• MM = month
• DD = day
• hh = hour
• mm = minute
• ss = second
• lllll = the low-order five digits of the logon sequence number returned by the Teradata Database in response to the .LOGON command
This variable is not set until created with the NAME command, or with the first BEGIN LOAD
by default. Any attempt to use it before a NAME command is issued (or before the first
BEGIN LOAD if there is no NAME command), results in a syntax error. This variable is
significant because it is used by TPump when composing default names for various database
artifacts, namely the error table and TPump-created macros.
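For instance, using a hypothetical job name:
.NAME NIGHTLY01;
After this command, &SYSJOBNAME resolves to NIGHTLY01 (truncated to 16 characters if longer) and is used when composing the default names of the error table and TPump-created macros.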
Note: If serialization for two or more DML statements is required, they cannot be put in
different partitions. Serialization requires that all DML statements with identical hash values
of the rows be submitted from the same session.
PARTITION
Purpose
The PARTITION command defines a collection of sessions used to transfer SQL requests to
the Teradata RDBMS. A DML command may name the partition to be used for its requests to
the RDBMS.
A default session partition may still be created using the SESSIONS and PACK parameters of
the BEGIN LOAD command.
This command works in conjunction with a DML parameter, PARTITION, which names the
session partition that a DML’s SQL will use. If the DML command does not have a
PARTITION parameter, then the default partition created using the SESSIONS and PACK
parameters of the BEGIN LOAD command will be used.
Syntax
.PARTITION partition_name SESSIONS number [threshold] [DATAENCRYPTION {ON | OFF}] [PACK statements | PACKMAXIMUM] ;
where
Syntax Element
Description
number
number of sessions to be logged on for the partition
TPump logs on and uses the number of sessions specified to communicate
requests to the Teradata Database.
There is no default value for number; it must be specified. Neither is there a
maximum value, except for system-wide session limitations, which vary
among machines.
Limiting the number of sessions conserves resources on both the external
system and the Teradata Database. This conservation is at the expense of a
potential decrease in throughput and increase in elapsed time.
DATAENCRYPTION
ON/OFF
keyword to encrypt import data and the request text during the
communication between TPump and Teradata Database for the sessions
defined in the PARTITION command
If ON, encryption is performed; if OFF, it is not. If DATAENCRYPTION is not
specified, the default is OFF, unless the "-y" runtime parameter is specified or
DATAENCRYPTION is ON in the BEGIN LOAD command, in which case the default
is ON.
This option applies to the sessions defined by the PARTITION command.
When specified explicitly, it overrides the encryption setting from the "-y"
runtime parameter and from the DATAENCRYPTION option in the BEGIN LOAD
command for the sessions defined in the PARTITION command.
PACK
keyword for the number of statements to pack into a multiple-statement
request
Maximum value is 600.
Packing improves network/channel efficiency by reducing the number of
sends and receives between the application and the Teradata Database.
PACKMAXIMUM
keyword requesting TPump to dynamically determine the maximum
possible PACK factor for the current partition
Maximum value is 600.
The value thus determined is displayed in message UTY6652 and should be
specified explicitly on subsequent runs, because PACKMAXIMUM requires
iterative interactions with the RDBMS during initialization to heuristically
determine the maximum possible PACK factor.
partition_name
name assigned to the partition for reference by one or more subsequent
DML commands
A partition name must obey the same rules for its construction as Teradata
SQL column names. The name specified may be used in the PARTITION
clause of a DML command.
SESSIONS
keyword for designating the number of sessions for the partition
statements
number of statements, as a positive integer of up to 600, to pack into a
multiple-statement request
Default value is 20 statements per request.
Note: Under certain conditions, TPump may determine that the pack
factor has been set too high. TPump then automatically lowers the pack
setting to an appropriate value and issues warning message UTY6625, for
example:
“UTY6625 WARNING: Packing has been changed to 12 statements per
request”, and continues.
Packing improves network/channel efficiency by reducing the number of
sends/receives between the application and the RDBMS.
The packing factor is validated by sending a fully packed request to the
Teradata Database using a prepare. This test checks for syntax problems and
requests that are excessively large and overwhelm the parser.
To simplify the script development process, TPump ignores certain errors
returned by an overloaded parser, shrinks the request, retries the prepare
until it executes successfully and finally, issues a warning noting the revised
packing factor size.
When this happens, the TPump script should be modified to eliminate the
warning, which avoids the time-consuming process of shrinking the
request.
Note: A packing failure may occur if the source parcel length does not
match the data defined. If this happens, TPump issues the message:
“UTY2819 WARNING: Packing may fail because input data does not
match with the data defined.”
To resolve this problem, increase the packing factor and resubmit the job.
threshold
minimum number of sessions to be logged on for the partition
When logging on sessions, if system limits are reached above the threshold
value, TPump stops trying to log on, and uses whatever sessions are already
logged on.
If the sessions run out before the threshold is reached, TPump logs off all
sessions, waits for the time determined by the SLEEP value (specified in the
BEGIN LOAD command), and tries to log on again.
Example
A sample script that uses partitioning follows:
.LOGTABLE TPLOG01;
.LOGON cs4400s3/cfl,cfl;
DROP TABLE TPTBL01;
DROP TABLE TPTBL02;
DROP TABLE TPERR01;
CREATE TABLE TPTBL01, FALLBACK(
C1 CHAR(12) not null,
C2 CHAR(8) not null)
PRIMARY INDEX (C1);
CREATE TABLE TPTBL02, FALLBACK(
C1 CHAR(12),
C2 CHAR(8),
C3 CHAR(6))
UNIQUE PRIMARY INDEX (C1);
.BEGIN LOAD
ERRLIMIT 100 50
CHECKPOINT 15
TENACITY 2
ERRORTABLE TPERR01
ROBUST off
serialize on
;
.LAYOUT LAY02;
.FIELD cc1 * CHAR(12) key;
.FIELD cc2 * CHAR(8);
.FIELD cc3 * CHAR(6);
.filler space1 * char(1);
.partition part1 pack 10 sessions 10;
.partition part2 sessions 5 1 packmaximum;
.DML LABEL LABEL01 partition part1
DO INSERT FOR MISSING ROWS
ignore extra update rows
use(cc1, cc2);
UPDATE TPTBL01
SET C2 = :CC2
WHERE C1 = :CC1;
INSERT TPTBL01 (C1, C2)
VALUES (:CC1,:CC2);
.DML LABEL LABEL02 partition part2
serializeon( cc1 )
ignore extra update rows
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE TPTBL02 SET C2 = :CC2 WHERE C1 = :CC1;
INSERT TPTBL02 (C1, C2, C3)
VALUES (:CC1,:CC2,:CC3);
.IMPORT INFILE c:\NCR\Test\TpumpData001.txt FORMAT TEXT
LAYOUT LAY02
APPLY LABEL01
APPLY LABEL02 where CC2 = '00000001';
.END LOAD;
.LOGOFF;
ROUTE
Purpose
The ROUTE command identifies the destination of various outputs produced by TPump.
Syntax
.ROUTE MESSAGES [TO] FILE fileid1 [WITH ECHO [TO] FILE fileid2 | WITH ECHO OFF] ;
where
Syntax Element
Description
MESSAGES
preferred location where the messages be redirected (normally written to
DDNAME SYSPRINT in VM/MVS or stdout in UNIX); that is, sent to an
additional destination, or both
fileid1 and fileid2
alternate message destination in the external system
UNIX and Windows
Fileid is the path name for a file. If the path name has embedded white
space characters, enclose the entire path name in single or double quotes.
VM
A FILEDEF name.
MVS
A DDNAME. See the MVS fileid topic in the “Usage Notes” section.
ECHO
additional destination, with a fileid specification
Use the ECHO keyword to specify, for example, that messages be captured
in a file (fileid2) while still being written to your terminal.
Note: The ECHO OFF specification cancels the additional file
specification of a previously established ECHO destination.
Usage Notes
In MVS, fileid is a true DDNAME; in VM/CMS, it is a FILEDEF name; and in UNIX, it is a file
pathname. If DDNAME is specified, TPump writes data records to the specified destination. A
DDNAME must obey the same rules for its construction as Teradata SQL column names
except that the “at” sign (@) is allowed as an alphabetic character and the underscore ( _ ) is
not allowed. The DDNAME must also obey the applicable rules of the external system. If
DDNAME represents a data source on magnetic tape, the tape may be either labelled or
nonlabelled (if the operating system supports it).
On UNIX systems, you can use an asterisk (*) as the fileid1 or fileid2 specification to route
messages to the system console/standard output (stdout) device. The system console is the:
• Display screen in interactive mode, or
• Standard output device in batch mode
Example 1
.ROUTE MESSAGES TO FILE OUTPUT WITH ECHO TO FILE SYSOUT;
ECHO, when specified with OFF, stops routing output to the previously established echo
destination.
Example 2
.ROUTE MESSAGES FILE OUTPUT;
The messages are written to the file designated by OUTPUT from this point unless redirected
by another ROUTE command.
In UNIX-based systems, if “outfilename” is used to redirect “stdout,” and also as the file in a
ROUTE WITH ECHO command, the results written to “outfilename” may be incomplete due
to conflicting writes to the same file.
RUN FILE
Purpose
The RUN FILE command invokes the specified external source as the current source of
commands and statements.
Syntax
.RUN FILE fileid [IGNORE {charpos1 | charpos1 THRU | THRU charpos2 | charpos1 THRU charpos2}] ;
where
Syntax Element
Description
fileid
data source of the external system
The client system DD or equivalent statement specifies a file.
UNIX and Windows
infilename (the path name for a file). If the path name has embedded
white space characters, enclose the entire path name in single or double
quotes.
MVS
a true DDNAME.
If DDNAME is specified, TPump reads data records from the specified
source.
A DDNAME must obey the same rules for its construction as Teradata
SQL column names, except that:
• the “at” sign (@) is allowed as an alphabetic character
• the underscore (_) is not allowed
The DDNAME must also obey the applicable rules of the external system.
If DDNAME represents a data source on magnetic tape, the tape may be
either labelled or nonlabelled (if the operating system supports it).
VM/CMS
A FILEDEF name.
charpos1 and charpos2
start and end character positions of a field in each input record which
contains extraneous information
TPump ignores the specified field(s) as follows:
• charpos1: only the single specified character position is ignored.
• charpos1 THRU: character positions from charpos1 through the end of the record are ignored.
• THRU charpos2: character positions from the beginning of the record through charpos2 are ignored.
• charpos1 THRU charpos2: character positions from charpos1 through charpos2 are ignored.
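For example, a command file with fixed 80-byte records might carry sequence numbers in columns 73 through 80; assuming a hypothetical DDNAME of CMDS, the following command reads the file while ignoring those positions:
.RUN FILE CMDS IGNORE 73 THRU 80;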
Usage Notes
Once TPump executes the RUN FILE command, further commands and DML statements are
read from the specified source until a LOGOFF command or end-of-file condition is
encountered, whichever occurs first. An end-of-file condition automatically causes TPump to
resume reading its commands and DML statements from the previously active source (or the
previous RUN source when RUNs are nested), either SYSIN for VM/MVS, or stdin (normal or
redirected) in UNIX. After SYSIN/stdin processes any user-provided invocation parameter, it
remains the active input source.
If an end-of-file condition occurs on fileid, SYSIN/stdin is read because there are no more
commands or statements in the PARM string.
The command source specified by a RUN FILE command may contain nested RUN FILE
commands to 16 levels.
On UNIX systems, you can use an asterisk (*) as the fileid specification for the system console/
standard input (stdin) device. The system console is the:
• Keyboard in interactive mode, or
• Standard input device in batch mode
Example
.RUN FILE LOGON;
SET
Purpose
The SET command assigns a data type and a value to a utility variable. Variables need not be
declared in advance to be the object of the SET command. If a variable does not already exist,
the SET command creates it.
The SET command also dynamically changes the data type to that of the assigned value if it
has already been defined. Variables used to the right of TO in the expression must have already
been defined.
Syntax
.SET var TO expression ;
where
Syntax Element
Description
var
name of the utility variable which is set to the evaluated expression
following it
expression
value and data type to which the utility variable is to be set
Usage Notes
The utility variable can be substituted wherever substitution is allowed.
If the expression evaluates to a numeric value, the symbol is assigned an integer value, as in:
.SET FOONUM TO -151 ;
If the expression is a quoted string, the symbol is assigned a string value, as in:
.SET FOOCHAR TO '-151' ;
The minimum and maximum limits for Floating Point data types are as follows:
4.0E-75 <= abs(float variable) < 7.0E75
Example 1
TPump supports concatenation of variables, using the SET command, such as:
.SET C TO 1;
.SET D TO 2;
.SET X TO &C.&D;
Example 2
In Example 1, X evaluates to 12. If a decimal point is added to the concatenated variables, as
in:
.SET C TO 1;
.SET D TO 2;
.SET X TO &C..&D;
X then evaluates to 1.2.
SYSTEM
Purpose
The SYSTEM command allows you to access the local operating system during TPump
operations.
Syntax
.SYSTEM 'oscommand' ;
where

'oscommand'
   Command string (enclosed within single quotes) that is appropriate to the local
   operating system.
   The SYSTEM command suspends the current TPump application in order to execute the
   command. When the command completes, the return code from the invoked command is
   displayed, and the &SYSRC variable is updated with the return code.
Usage Notes
On MVS clients, the command is passed to the PGM executor. The first token of the
command string is interpreted as a load module and the remainder as a PARM string. As an
example, the following statement calls the load module IEBUPDTE, passing the PARM string
'NEW'.
.SYSTEM 'IEBUPDTE NEW';
This command calls IEBUPDTE in the same way it is called via the JCL statement:
//EXEC PGM=IEBUPDTE,PARM='NEW'
On MVS, the program must be present in the STEPLIB or JOBLIB concatenation, be resident
in one of the LPAs, or be located in the linklist concatenation.
Otherwise, the invocation will fail, with code SYS_ABTM (-14) returned, resulting from an
underlying abend S806-04. Other types of failures also are possible.
Similarly, on VM clients, if the command to be executed is not found, an abend is likely to
occur.
On VM clients, the SYSTEM command is passed to the CMS SUBSET executor and the result
returned.
On UNIX clients, the SYSTEM command invokes the standard UNIX interface to issue the
command to the shell (sh), and waits for its completion.
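For example, the following sketch issues a shell command from a TPump job on a UNIX client; the command string is illustrative, not prescribed:
.SYSTEM 'ls -l';
When the command completes, the script can test &SYSRC to confirm that the shell returned zero before continuing.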
TABLE
Purpose
The TABLE command identifies a table whose column names and data descriptions are used
as the names and data descriptions of the input record fields. These are assigned in the order
defined. The TABLE command must be used with the LAYOUT command.
Syntax
.TABLE tableref ;
where

tableref
   Name of an existing table whose column names and data descriptions are assigned, in the
   order defined, to fields of the input data records.
   The column names of the table specified by the TABLE command must be Teradata SQL
   column names that do not require being enclosed in quotation marks. Tables cannot be
   created with invalid column names. Any nonstandard column name causes one of three
   kinds of errors, depending on the nature of the divergence from the standard:
   1 Embedded blanks cause a syntax error, depending on the nonblank contents of the name.
   2 Invalid characters cause an invalid name error.
   3 Reserved words cause a syntax error that mentions invalid use of the reserved word.
Usage Notes
One or more TABLE commands may be intermixed with the FIELD command or FILLER
command following a LAYOUT command.
This method of specifying record layout fields assumes each field, as defined by the data
description of the corresponding column of tableref, is contiguous with the previous one,
beginning at the next available character position beyond any previous field specifications for
the input records. The fields must appear in the order defined for the columns of the table.
The object identified by the tableref parameter must be a table. It need not appear as a
parameter of the BEGIN LOAD command, but you must either be an owner of the object or
have at least one privilege on it. If specified as an unqualified table name, the current default
database qualifies it.
When serialization has been set to ON by the BEGIN LOAD command, the primary index
columns for the specified table are considered key fields for serialization purposes.
When the TABLE command is used and the table contains a structured UDT type, TPump
returns an external representation of the UDT, which the user is required to transform. The
term “external type” means the data type of the external opaque container for a structured
UDT; it is the type returned by the from-sql transform method.
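For example, the following sketch assumes an existing Employee table with columns EmpNo and PhoneNo (hypothetical names); the TABLE command assigns those names and data descriptions to the input record fields:
.LAYOUT EmpLayout;
.TABLE Employee;
.DML LABEL NewEmp;
INSERT INTO Employee VALUES (:EmpNo, :PhoneNo);
Because the fields are taken from the table definition, no FIELD commands are needed, and the input records must supply the column values in the order defined for Employee.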
UPDATE Statement and Atomic Upsert
Purpose
TPump supports the UPDATE Teradata SQL statement, which changes field values in existing
rows of a table.
Syntax
{ UPDATE | UPD } tname SET cname = expr [, cname = expr ...] WHERE conditional ;
where

tname
   Table or view to be updated. This table was previously identified as tname in the BEGIN
   LOAD command. If tname is not explicitly qualified by database name, the current default
   database qualifies it.

cname
   Column whose value is to be replaced by the value of expr. The column named must not be
   a column of the primary index.

expr
   Expression whose resulting value is to replace the current value of the identified column.
   The expression can contain any combination of constants, current values of columns of
   the referenced row, or values from fields of input data records. References to fields of
   input data records take the form :fieldname, where fieldname is defined by a FIELD
   command or TABLE command of the layout referenced by an IMPORT using this UPDATE.

conditional
   Conditional clause specifying the row or rows to be updated. The conditional clause can
   use values from fields of input data records by referring to their field names as
   :fieldname, where fieldname is defined by a FIELD command or TABLE command. Equality
   values for all the columns of the primary index must be explicitly specified in this
   clause.
Usage Notes - Update
The following notes describe how to use an UPDATE statement following a DML command.
An UPDATE statement may also be used in the support environment; normal rules for
UPDATE are followed in that case.
1 To update records in a table, the userid that is logged on must have UPDATE privilege for
the table.
2 In an IMPORT task, if you specify multiple UPI columns, they should all be specified in
the WHERE clause of the UPDATE statement. In this case, the WHERE clause is fully
qualified, thereby allowing TPump to avoid table locks and optimize the processing.
3 For TPump use, if the object of the UPDATE statement is a view, it must not specify a join.
TPump operates only on single-table statements, so UPDATE statements must not contain
any joins.
4 Only one object may be identified.
5 The OR construct can be used in the WHERE clause of a DELETE statement; alternatively,
two or more separate DML statements (one per OR term) can be used, with the DML
statements applied conditionally with the APPLY clause of the IMPORT command. The
nature of the alternatives usually makes one of the methods more appropriate.
6 The maximum number of INSERT, UPDATE, DELETE, and EXECUTE statements that
can be referenced in an IMPORT is 127.
7 The maximum number of DML statements that can be packed into a request is 600. The
default number of statements packed is 20.
Note: To ensure data integrity, the SERIALIZE parameter defaults to ON in the absence of an
explicit value if there are upserts in the TPump job.
Example
The following example describes an input data source containing a series of 14-byte records.
Each record contains the value of the primary index column (EmpNo) of a row of the
Employee table whose PhoneNo column is to be assigned a new phone number from field
Fone.
.BEGIN LOAD SESSION number;
.LAYOUT Layoutname;
.FIELD EmpNum 1 INTEGER;
.FIELD Fone * (CHAR (10));
.DML LABEL DMLlabelname;
UPDATE Employee SET PhoneNo = :Fone WHERE EmpNo = :EmpNum;
.IMPORT INFILE Infilename LAYOUT Layoutname APPLY DMLlabelname;
.END LOAD;
Usage Notes - Atomic Upsert
The syntax for Atomic upsert consists of an UPDATE statement and an INSERT statement
separated by an ELSE keyword as follows:
UPDATE <update-operands> ELSE INSERT <insert-operands>;
TPump inserts the ELSE keyword between the UPDATE and INSERT statements by itself, so
the user should not enter it in the script. If the ELSE keyword is used in this context, TPump
will terminate with a syntax error.
The <update-operands> and <insert-operands> are operands for regular UPDATE and
INSERT SQL statements, respectively. Only certain types of UPDATE and INSERT operands
are valid in an Atomic upsert statement, and the operand parameters within a given upsert
statement are subject to further constraints linking the update and insert parameters.
When using the standard upsert feature, the primary index should always be fully specified for
the UPDATE statement, just as for other DML in a TPump script, so that the update can be
processed as a one-AMP rather than an all-AMP operation. In addition, both the UPDATE
and the INSERT of an upsert statement pair should specify the same target table, and the
primary index value specified in the UPDATE's WHERE clause should match the primary
index value implied by the column values in the INSERT. When processing an Atomic upsert
statement, the Teradata Database will usually reject statements that fail to meet these basic
upsert constraints and return an error, enabling TPump to detect and handle constraint
violations.
Constraints considered to be basic to the upsert operation are:
1 SAME TABLE: The UPDATE and INSERT statements must specify the same table.
2 SAME ROW: The UPDATE and INSERT statements must specify the same row; that is, the
primary index value in the inserted row must be the same as the primary index value in the
targeted UPDATE row.
3 HASHED ROW ACCESS: The UPDATE must fully specify the primary index, allowing the
target row to be accessed with a one-AMP hashed operation.
Some of these restrictions concern syntax that is supported in UPDATE and INSERT
statements separately but not when combined in an Atomic upsert statement. Restrictions
that return an error if submitted to the Teradata Database are:
1 INSERT-SELECT: Syntax not supported. The INSERT may not use a subquery to specify
any of the inserted values. Note that support of this syntax is likely to be linked to support
of subqueries in the UPDATE's WHERE clause constraints as described above, and may
involve new syntax features to allow the UPDATE and INSERT to effectively reference the
same subquery.
2 UPDATE-WHERE-CURRENT: Syntax not supported. The WHERE clause cannot use an
updatable cursor to do what is called a positioned UPDATE. (It is unlikely that this syntax
will ever be supported.) Note that this restriction does not prevent cursors from being
used in other ways with Atomic upsert statements. For example, a DECLARE CURSOR
statement may include upsert statements among those to be executed when the cursor is
opened, as long as the upserts are otherwise valid.
3 UPDATE-FROM: Not supported. The SET clause cannot use a FROM clause table
reference in the expression for the updated value for a column.
4 UPDATE-WHERE SUBQUERIES: Not supported. The WHERE clause cannot use a
subquery either to specify the primary index or to constrain a nonindex column. Note that
supporting this UPDATE syntax would also require support for either INSERT-SELECT or
some other INSERT syntax feature that lets it specify the same primary index value as the
UPDATE.
5 UPDATE-PRIMARY INDEX: Not supported. The UPDATE cannot change the primary
index. This is sometimes called an unreasonable update.
6 TRIGGERS: Feature not supported if either the UPDATE or INSERT could cause a trigger
to be fired. The restriction applies as if the UPDATE and INSERT were both executed,
because the parser trigger logic will not attempt to account for their conditional execution.
UPDATE triggers on columns not referenced by the UPDATE clause, however, will never
be fired by the upsert and are therefore permitted. DELETE triggers cannot be fired at all
by an upsert and are likewise permitted. Note that an upsert could be used as a trigger
action, but it would be subject to the same constraints as any other upsert. Because upsert
is not allowed to fire any triggers itself, an upsert trigger action must not generate any
further cascaded trigger actions.
7 JOIN/HASH INDEXES: Feature not supported if either the UPDATE or INSERT could
cause the join/hash index to be updated. As with triggers, the restriction applies to each
upsert as if the UPDATE and INSERT were both executed. While the UPDATE could
escape this restriction if the join/hash index does not reference any of the updated
columns, it is much less likely (and maybe impossible) for the INSERT to escape this
restriction. If the benefit of lifting the restriction for a few unlikely join/hash index cases
turns out to be not worth the implementation cost, the restriction may have to be applied
more broadly to any table with an associated join/hash index.
TPump treats a failed constraint as a nonfatal error, reports the error in the job log for
diagnostic purposes, and continues with the job by reverting to nonbasic upsert protocol.
To resolve order-dependency issues, TPump always processes the UPDATE before the INSERT
because:
• It matches the ordering implied by the upsert name: UP[date] + [in]SERT.
• It matches the ordering implied by the UPDATE-ELSE-INSERT syntax.
• It matches the common definition of upsert semantics.
• It allows for an upsert operation on MULTISET tables, where an insert-first policy would
always succeed on INSERT and never on UPDATE.
Existing TPump scripts for upsert do not need to be changed. The syntax as described below
for upsert will continue to be supported:
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE <update-operands>;
INSERT <insert-operands>;
TPump changes this syntax into Atomic upsert syntax by replacing the semicolon between the
UPDATE and INSERT statements with an ELSE keyword to convert the statement pair into a
single Atomic upsert statement.
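For example, a legacy-style upsert against the Sales table used later in this section might look like the following sketch; the field names ItemNbr, SaleDate, and Qty are illustrative, not prescribed:
.DML LABEL UpsSales DO INSERT FOR MISSING UPDATE ROWS;
UPDATE Sales SET ItemCount = ItemCount + :Qty WHERE ItemNbr = :ItemNbr;
INSERT INTO Sales (:ItemNbr, :SaleDate, :Qty);
TPump converts this UPDATE/INSERT pair into a single Atomic upsert statement before submitting it to the Teradata Database.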
If user-created macros are used in place of the UPDATE and INSERT statements, TPump
does not generate:
EXEC <update-macro> ELSE EXEC <insert-macro>;
because this statement does not conform to the Atomic upsert syntax used by TPump.
Atomic Upsert Examples
This section describes several examples that demonstrate how the Atomic upsert feature
works, including cases where errors are detected and returned to the user. All of the examples
use the same table, called Sales, as shown below:
CREATE TABLE Sales, FALLBACK,
(ItemNbr   INTEGER NOT NULL,
 SaleDate  DATE FORMAT 'MM/DD/YYYY' NOT NULL,
 ItemCount INTEGER)
PRIMARY INDEX (ItemNbr);
Assume that the table has been populated with the following data:
INSERT INTO Sales (10, '05/30/2005', 1);
A table called NewSales has the same column definitions as those of table Sales.
Example 1 (Error: different target tables)
This example demonstrates an upsert statement that does not specify the same table name for
the UPDATE part and the INSERT part of the statement.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005') ELSE INSERT INTO NewSales (10, '05/30/2005',
1);
A rule of an upsert is that only one single table is processed for the statement. Because the
tables, Sales and NewSales, are not the same for the upsert statement, an error is returned to
the user indicating that the name of the table must be the same for both the UPDATE and the
INSERT.
Example 2 (Error: different target rows)
This example demonstrates an upsert statement that does not specify the same primary index
value for both the UPDATE and INSERT parts of the statement.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005') ELSE INSERT INTO Sales (20, '05/30/2005', 1);
The primary index values for the UPDATE and the INSERT must be the same. In this case, an
error is returned to the user indicating that the primary index value must be the same for both
the UPDATE and the INSERT.
Example 3 (Error: unqualified primary index)
This example demonstrates an upsert statement that does not specify the primary index in the
WHERE clause.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE SaleDate = '05/30/2005'
ELSE INSERT INTO Sales (10, '05/30/2005', 1);
When the primary index is not fully specified in the UPDATE of an upsert statement, an all-row scan to find rows to update might result. This is again not the purpose of upsert, and an
error is returned to the user.
Example 4 (Error: missing ELSE)
This example demonstrates an upsert statement with a missing ELSE keyword.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005') INSERT INTO Sales (10, '05/30/2005', 1);
Without the ELSE keyword, the UPDATE and INSERT do not form a valid Atomic upsert statement, and a syntax error is returned.
Example 5 (Error: INSERT-SELECT)
This example demonstrates an upsert statement that specifies INSERT-SELECT.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005') ELSE INSERT INTO Sales SELECT * FROM NewSales
WHERE (ItemNbr = 10 AND SaleDate = '05/30/2005');
The INSERT part of an upsert may not use a subquery to specify any of the inserted values. In
this case, a syntax error is returned.
Example 6 (Error: UPDATE-FROM)
This example demonstrates an upsert statement that specifies UPDATE-FROM.
UPDATE Sales FROM NewSales SET Sales.ItemCount = NewSales.ItemCount
WHERE Sales.ItemNbr = NewSales.ItemNbr ELSE INSERT INTO Sales (10, '05/30/2005', 1);
The SET clause may not use a FROM clause table reference in the expression for the updated
value for a column, and an error is returned.
Example 7 (Error: UPDATE-WHERE SUBQUERIES)
This example demonstrates an upsert statement that specifies UPDATE-WHERE
SUBQUERIES.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE ItemNbr IN (SELECT
ItemNbr FROM NewSales) ELSE INSERT INTO Sales (10, '05/30/2005', 1);
The WHERE clause of the UPDATE may not use a subquery for any purpose. In this case,
error ERRTEQUPSCOM is returned.
Example 8 (Error: UPDATE-PRIMARY INDEX)
This example demonstrates an upsert statement that tries to update a primary index value.
UPDATE Sales SET ItemNbr = 20 WHERE (ItemNbr = 10 AND SaleDate = '05/30/2005')
ELSE INSERT INTO Sales (20, '05/30/2005', 1);
Unreasonable updates or updates that change the primary index values are not allowed in an
upsert statement, and an error is returned.
Example 9 (Valid Upsert UPDATE)
This example demonstrates a successful upsert statement that updates a row.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 10 AND
SaleDate = '05/30/2005') ELSE INSERT INTO Sales (10, '05/30/2005', 1);
After all of the rules have been validated, the row with ItemNbr = 10 and SaleDate =
'05/30/2005' gets updated. A successful update of one row results.
Example 10 (Valid Upsert INSERT)
This example demonstrates a successful upsert statement that inserts a row.
UPDATE Sales SET ItemCount = ItemCount + 1 WHERE (ItemNbr = 20 AND
SaleDate = '05/30/2005') ELSE INSERT INTO Sales (20, '05/30/2005', 1);
After all of the rules have been validated and no row was found with ItemNbr = 20 and
SaleDate = '05/30/2005' for the UPDATE, a new row is inserted with ItemNbr = 20. A
successful insert of one row results.
CHAPTER 4
Troubleshooting in TPump
This chapter provides a description of the user aids for identifying and correcting errors that
may occur during a TPump task. Foremost among these tools are a large number of error
messages. For more information on error messages, refer to the Messages manual.
Troubleshooting information in this chapter includes:
• Early Error Detection
• Error Types
• Error Messages
• Reading TPump Error Tables
• TPump Performance Checklist
Early Error Detection
The TPump utility avoids wasting time and resources on a task that contains “terminating”
errors in input statements, commands, or both. To accomplish this, statements and
commands for a task are acquired and analyzed for detectable syntax and other errors before
the TPump task is initiated on the Teradata Database.
When a BEGIN LOAD command invokes TPump, and the utility can complete an error-free
pass, it proceeds. If not, TPump cleans up and terminates after an error pass. TPump uses the
Teradata Database to detect errors in the set of DML statements for the task. The first
statement in error terminates TPump.
Error Types
Most errors are fatal, resulting in termination of TPump. The exceptions to this general rule
are as follows:
• User-specified SQL commands fail with no adverse effect. The variable &SYSRC is set,
and if the script tests this variable, it can stop the job if necessary.
• Data-related errors in the RDBMS can reach the user-specified error limit before
terminating the job. A list of data-related errors is provided in Table 19.
• Errors which can be retried. The error numbers for these types of errors are: 2595, 2631,
2639, 2641, 2826, 2834, 2835, 3110, 3111, 3231, 3120, 3319, 3598, 3603, 5991, 6699, 8018,
and 8024.
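For example, a support-environment script can test &SYSRC after an SQL statement and stop the job on failure; this is a sketch, and the database name Sandbox is illustrative:
DATABASE Sandbox;
.IF &SYSRC <> 0 THEN;
.LOGOFF &SYSRC;
.ENDIF;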
Error Messages
Teradata Database error message numbers that identify errors that can be fixed and
resubmitted are 2631, 2639, 2641, 2834, 2835, 3110, 3111, 3120, 3127, 3178, 3598, 3603,
and 8024.
TPump ignores errors on Teradata SQL statements outside of the TPump task; that is, before
the BEGIN LOAD command or after the END LOAD command. The TPump job continues
and no return code is returned, although Teradata Database error messages are displayed.
When TPump encounters errors caused by Teradata Database failure, it neither terminates the
job nor produces a return code. When the Teradata Database recovers, TPump restarts the job
and continues without user intervention. Teradata Database error message numbers
identifying a Teradata Database failure are 2825, 2826, 2827, 2828, 3897, and 8018.
When one of these errors occurs, a row is inserted into TPump’s error table for the statement or
data record in question. If the error occurs for one of the statements in a multiple-statement
request, the remaining statements are re-driven.
The retryable errors are automatically retried up to 16 times if retry times are not specified.
For the complete text and explanation of error messages, refer to the Messages manual.
A row is inserted into TPump’s error table for the statement in error. If the error occurs for
one of the statements in a multiple-statement request, then the remaining statements are re-driven. These errors include the conditions listed in Table 19.
Table 19: TPump Error Conditions

Error  Description
2603   Bad argument for SQRT function.
2604   Bad argument involving %TVMID.%FLDID for SQRT function.
2605   Bad argument for LOG function.
2606   Bad argument involving %TVMID.%FLDID for LOG function.
2607   Bad argument for LN function.
2608   Bad argument involving %TVMID.%FLDID for LN function.
2614   Precision loss during expression evaluation.
2615   Precision loss calculating expression involving %TVMID.%FLDID.
2616   Numeric overflow occurred during computation.
2617   Overflow occurred computing an expression involving %TVMID.%FLDID.
2618   Invalid calculation: division by zero.
2619   Division by zero in an expression involving %TVMID.%FLDID.
2620   The format or data contains a bad character.
2621   Bad character in format or data of %TVMID.%FLDID.
2622   Bad argument for ** operator.
2623   Bad argument involving %TVMID.%FLDID for ** operator.
2650   Numeric Processor Operand Error.
2651   Operation Error computing expression involving %TVMID.%FLDID.
2665   Invalid date.
2666   Invalid date supplied for %TVMID.%FLDID.
2674   Precision loss during data conversion.
2675   Numerical overflow occurred during computation.
2676   Invalid calculation: division by zero.
2679   The format or data contains a bad character.
2682   Precision loss during data conversion.
2683   Numerical overflow occurred during computation.
2684   Invalid calculation: division by zero.
2687   The format or data contains a bad character.
2689   Non-nullable field was null.
2700   Referential constraint violation: invalid Foreign Key value.
2726   Referential constraint violation: cannot delete/update the Parent Key value.
2801   Duplicate unique prime key error in %DBID.%TVMID.
2802   Duplicate row error in %DBID.%TVMID.
2803   Secondary index uniqueness violation in %DBID.%TVMID.
2805   Maximum row length exceeded in %TVMID.
2814   Data size exceeds the maximum specified.
2816   Failed to insert duplicate row into TPump target table. This error occurs if MARK
       DUPLICATE INSERT/UPDATE ROWS is specified and a duplicate row is detected.
2817   Activity count greater than one for TPump UPDATE/DELETE. This error occurs if MARK
       EXTRA UPDATE/DELETE ROWS is specified and an activity count greater than one was
       seen. In this case, the error table row is inserted, but the corresponding UPDATE/DELETE
       also completes.
2818   Activity count zero for TPump UPDATE or DELETE. This error occurs if MARK
       MISSING UPDATE/DELETE ROWS is specified and an activity count of zero was seen.
2844   Journal image is longer than maximum.
2893   Right truncation of string data.
3535   A character string failed conversion to a numeric value.
3564   Range constraint: Check error in field %TVMID.%FLDID.
3577   Row size overflow.
3578   Scratch space overflow.
3604   Cannot place a null value in a NOT NULL field.
3751   Expected a digit for the exponent.
3752   Too many digits in exponent.
3753   Too many digits in integer or decimal.
3754   Numeric precision error.
3755   Numeric overflow error.
3756   Numeric divided-by-zero error.
3757   Numeric stack overflow error.
3758   Numeric stack underflow error.
3759   Numeric illegal character error.
3996   Right truncation of string data.
5317   Check constraint violation.
5326   Operand of EXTRACT function is not a valid data type or value.
5410   Invalid TIME literal.
5411   Invalid TIMESTAMP literal.
5991   Error during plan generation.
6705   Illegally formed character string was encountered during translation.
6706   The string contains an untranslatable character.
7433   Invalid time.
7441   Date not corresponding to an existing era.
7442   Invalid era.
7451   Invalid timestamp.
7452   Invalid interval.
7453   Invalid field overflow.
7454   DateTime field overflow.
7455   Invalid time specified.
Reading TPump Error Tables
This section describes the reading and usage of TPump error tables as a diagnostic device to
locate and fix problems. For more information, refer to the description of these tables in
“BEGIN LOAD”.
Occasionally, TPump encounters rows that cannot be correctly processed. When this happens,
TPump creates a row in the error table that is produced for each target table. Error tables are
structured to provide enough information to reveal the cause of a problem and allow
correction.
In the case of missing, duplicate, or extra rows, these are noted in the error table only if the
DML command specifies that requirement with the MARK parameter, which is the default for
DML statements, except for those participating in an upsert.
There are three error codes that relate specifically to the incidence of missing, duplicate, and
extra rows. These are:
1
2816: Failed to insert duplicate row into TPump target table.
This error occurs if MARK DUPLICATE INSERT/UPDATE ROWS is specified and a
duplicate row is detected.
2
2817: Activity count greater than one for TPump UPDATE/DELETE.
This error occurs if MARK EXTRA UPDATE/DELETE ROWS is specified and an activity
count greater than one resulted. In this case, the error table row is inserted, but the
corresponding UPDATE/DELETE also completes.
3
2818: Activity count zero for TPump UPDATE or DELETE.
This error occurs if MARK MISSING UPDATE/DELETE ROWS is specified and an
activity count of zero resulted.
The error table is used primarily to hold information about errors that occur while the
Teradata Database is trying to redistribute the data during the acquisition phase. If the
Teradata Database is unable to build a valid primary index, some application phase errors may
be put into this table.
Table 20 defines the Acquisition Error Table, with column entries comprising the unique
primary index.
Table 20: Acquisition Error Table

Column      Data Type       Definition
ImportSeq   byteint         Sequence number assigned to the IMPORT command in which the
                            error occurred.
DMLSeq      byteint         Sequence number assigned to the DML command in which the error
                            occurred.
SMTSeq      byteint         Sequence number of the DML statement in the DML command that
                            was being executed while this error occurred.
ApplySeq    byteint         Sequence number of the apply clause in the IMPORT command
                            executing when the error occurred.
SourceSeq   integer         The data row number in the client file that the DBC was building
                            when the error occurred.
DataSeq     byteint         The data source where the record resides.
ErrorCode   char(255)       The RDBMS code for the error.
ErrorMsg    char            The corresponding error message for the error code.
ErrorField  smallint        The number of the field in error, if it can be determined.
HostData    varbyte(63677)  The first 63,677 bytes of client data associated with the error.
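For example, assuming the acquisition error table was named TPumpErr1 in the BEGIN LOAD command (a hypothetical name), its rows can be inspected with ordinary SQL:
SELECT ImportSeq, DMLSeq, SMTSeq, ApplySeq, SourceSeq, ErrorCode, ErrorMsg
FROM TPumpErr1
ORDER BY SourceSeq;
Each returned row ties an error to the IMPORT, DML command, statement, and source record that produced it, using the sequence numbers described above.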
The following TPump task describes how to interpret the error table information to isolate
and fix the problem. This task is greatly abbreviated, containing only the DML command and
the IMPORT command. A probable sequence of actions for locating and fixing the problem
follows the task.
SEQ TYPE  SEQ #  Statement
--------  -----  ---------------------------------------------------------
DML       001    .DML LABEL FIRSTDML;
STMT      001    INSERT INTO table1 VALUES( :FIELD1, :FIELD2 );
STMT      002    UPDATE table2 SET field3 = :FIELD3 WHERE field4 = :FIELD4;
DML       002    .DML LABEL SECNDDML;
STMT      001    DELETE FROM table3 WHERE field3 = :FIELD3;
IMPORT    001    .IMPORT INFILE file1 LAYOUT layout1
APPLY     001      APPLY FIRSTDML;
IMPORT    002    .IMPORT INFILE file2 LAYOUT layout2
APPLY     001      APPLY FIRSTDML
APPLY     002      APPLY SECNDDML;
In this example, the Statement column represents the user entry. The SEQ # and SEQ TYPE
columns are the Sequence Number and Sequence Type assigned to each statement. If an error
occurs while using this task and the information in the following error table is displayed, you
can determine where the error occurred and what was being executed at the time of the error.
ImportSeq  DMLSeq  SMTSeq  ApplySeq  SourceSeq  DBCErrorCode  DBCErrorField
---------  ------  ------  --------  ---------  ------------  -------------
002        001     002     001       20456      2679          field3
The following sequence provides a series of analytical steps for extracting and interpreting the
information in this row of the error table.
1 Check the DMLSeq field to find the statement being executed. It contains the sequence
number 001.
2 Check the SMTSeq field. The sequence number 002 in this field indicates that the error
occurred while executing the second statement of the first DML command, which is the
UPDATE statement in the above task.
3 Verify that the script shows that the DML command is used twice, once in each IMPORT.
4 The value of 002 in the ImportSeq field shows that the error occurred in the second
IMPORT clause.
5 The value of 001 in the ApplySeq field indicates that the error occurred in the first apply of
that clause, which was being executed when the error occurred.
6 The value of 2679 in the DBCErrorCode field shows:
The format or data contains a bad character
which indicates that bad data is coming from the client.
7 The ErrorField field of the error row shows that the error occurred while building field3 of
the table.
8 The script then shows that the error occurred when field3 was being built from :FIELD3 in
the client data.
9 The LAYOUT clause in the script shows where the problem data is positioned within the
row coming from the client.
10 The script shows that the IMPORT clause with the error was loading file2, and indicates
what error occurred, which statement detected the error, and which file has the error.
11 The SourceSeq field of the error table pinpoints the problem location in the 20456th
record of this file. The problem is isolated and can now be fixed.
Most problems in the error tables do not require as much research as this example required.
This error was selected in order to use all of the information in the error table. As a rule, you
only need to look at one or two items in the error tables to be able to locate and correct the
problem.
TPump Performance Checklist
The following checklist helps to isolate and analyze TPump performance problems and their
causes.
1  Monitor the TPump job using the Monitor macros. Determine whether the job is making
   progress.
2  Check for locks. The existence of locks can be detected by using the Teradata Database
   Showlocks utility. The existence of transaction locks can be detected by checking for
   'blocked' status using Teradata Database utilities that use the performance monitor feature
   of the Teradata Database (Teradata Manager).
3  Check table DBC.Resusage for problem areas (for example, data bus capacity or CPU
   capacity at 100% for one or more processors).
4  Avoid large error tables, if possible, because error processing is generally expensive.
5  Verify that the primary index is unique. Nonunique primary indexes can cause severe
   TPump performance problems.
CHAPTER 5
Using INMOD and Notify Exit Routines
This chapter provides a detailed description of the INMOD feature used in TPump and the
notify exit routines that are associated with INMODs.
An INMOD is a user-generated module, called by the IMPORT command, that reads data
from a data source. TPump honors INMODs developed for other load utilities.
Owing to the complexity of this feature, it is described separately in this chapter rather than
in the command syntax descriptions.
The following information is included in this chapter:
• Overview
• Using INMOD and Notify Exit Routines
Overview
This section provides an overview of the INMOD and Notify Exit routines. Information
includes INMOD routines, notify exit routines, programming languages, programming
structure, routine entry points, the TPump/INMOD routine interface, the TPump/notify exit
routine interface, and rules and restrictions for using routines.
INMOD Routines
The term INMOD is an acronym for input modification routines. An INMOD is a
user-written routine used by TPump to supply or preprocess input records before they are
sent to the Teradata Database.
You can use an INMOD routine to supply input records or to perform preprocessing tasks on
the input records before passing them to TPump. Such tasks, for example, could:
• Generate records to be passed to TPump.
• Validate data records before passing them to TPump.
• Read data directly from one or more database systems, such as IMS or Total.
• Convert fields in a data record before passing it to TPump.
The INMOD is specified as part of the IMPORT command. See “IMPORT” for INMOD
syntax information.
Notify Exit Routines
A notify exit routine specifies a predefined action to be performed whenever certain
significant events occur during a TPump job.
Notify exit routines are especially useful in an operator-free environment where job
scheduling relies heavily on automation to optimize system performance.
For example, by writing an exit routine in C (without using CLIv2) and using the NOTIFY . . .
EXIT option of the BEGIN LOAD command, you can provide a routine to detect whether a
TPump job succeeds or fails, how many records were loaded, what the return code was for a
failed job, and so on.
Programming Languages
The TPump utility is written in:
• SAS/C for channel-attached VM and MVS client systems
• C for network-attached UNIX and Windows client systems
You can write INMOD and notify exit routines in the following programming languages,
depending on the platform that runs TPump:

Platform       Routines
-------------  ----------------------------------------------------
VM, MVS        • INMOD routines in Assembler, COBOL, PL/I, or SAS/C
               • Notify exit routines in SAS/C
UNIX, Windows  • INMOD and notify exit routines in C

Note: Although it is neither certified nor supported, you can write INMOD routines in
COBOL on network-attached client systems if you use the Micro Focus COBOL for UNIX
compiler.
Programming Structure
Table 21 defines the structure by programming language for communicating between TPump
and INMOD or notify exit routines.
Table 21: Programming Routines by Language

Assembler

First parameter:
RRECORD  DSECT
RTNCODE  DS    F
RLENGTH  DS    F
RBODY    DS    CLxxxxx

Note: In the RBODY specification, the body length xxxxx is:
• 32004 for Teradata for Windows
• 64004 for Teradata Database for UNIX

Second parameter:
IPARM    DSECT
RSEQ     DS    F
PLEN     DS    H
PBODY    DS    CL100

C

First parameter:
struct {
    long Status;
    long RecordLength;
    char buffer[xxxxx];
}

Note: In the char buffer specification, the buffer length xxxxx is:
• 32004 for Teradata for Windows
• 64004 for Teradata Database for UNIX

Second parameter:
struct {
    long seqnum;
    short parmlen;
    char parm[80];
}

COBOL

First parameter:
01 INMOD-RECORD.
   03 RETURN-CODE   PIC S9(9) COMP.
   03 RECORD-LENGTH PIC 9(9) COMP.
   03 RECORD-BODY   PIC X(xxxxx).

Note: In the RECORD-BODY specification, the body length xxxxx is:
• 32004 for Teradata for Windows
• 64004 for Teradata Database for UNIX

Second parameter:
01 PARM-STRUCT.
   03 SEQ-NUM   PIC 9(9) COMP.
   03 PARM-LEN  PIC 9(4) COMP.
   03 PARM-BODY PIC X(80).

PL/I

First parameter:
DCL 1 PARMLIST,
    10 STATUS  FIXED BINARY(31,0),
    10 RLENGTH FIXED BINARY(31,0),
    10 REC     CHAR(xxxxx);

Note: In the REC CHAR specification, the length xxxxx is:
• 32004 for Teradata for Windows
• 64004 for Teradata Database for UNIX

Second parameter:
DCL 1 PARMLIST,
    10 SEQNUM  FIXED BINARY(31,0),
    10 PLENGTH FIXED BINARY(15,0),
    10 PBODY   CHAR(80);
In each structure, the records must be constructed so that the left-to-right order of the data
fields corresponds to the order of the field names specified in the TPump LAYOUT command
and subsequent FIELD, FILLER, and TABLE commands.
Routine Entry Points
The following table shows the entry points for INMOD routines.

INMOD Routine Language                  Entry Point
--------------------------------------  ---------------------
SAS/C on VM and MVS platforms           _dynamn
COBOL and PL/I on VM and MVS platforms  DYNAMN
C on UNIX and Windows platforms         _dynamn (or BLKEXIT*)

* Only for FDL-compatible INMODs compiled and linked with BLKEXIT as the
entry point. When the FDL-compatible INMOD is used, 'USING("FDLINMOD")'
must be specified in the IMPORT statement.

The following table shows the entry points for Notify Exit routines.

Notify Exit Routine Language            Entry Point
--------------------------------------  -----------
SAS/C on VM and MVS platforms           _dynamn
COBOL and PL/I on VM and MVS platforms  DYNAMN
C on UNIX and Windows platforms         _dynamn
Teradata Parallel Data Pump Reference
Chapter 5: Using INMOD and Notify Exit Routines
Overview
The TPump/INMOD Routine Interface
TPump exchanges information with an INMOD routine by using the conventional parameter
register to point to a parameter list of two 32-bit addresses.
The first 32-bit address points to a three-value structure consisting of status code, length, and
body. The second 32-bit address points to a data structure containing a sequence number and
a parameter list.
Status Code
Status Code is a 32-bit signed binary value that carries information in both directions. The
TPump-to-INMOD interface uses eight status codes, as defined in Table 22.
Table 22: TPump-to-INMOD Status Codes

Value  Description
-----  -------------------------------------------------------------------
0      TPump is calling for the first time and expects the INMOD routine to
       return a record. At this point, the INMOD routine should perform its
       initialization tasks before sending a data record to TPump.
1      TPump is calling, not for the first time, and expects the INMOD
       routine to return a record.
2      The client system has been restarted, the INMOD routine should
       reposition to the last checkpoint, and TPump is not expecting the
       INMOD routine to return a data record.
       Note: If the client system restarts before the first checkpoint,
       TPump sends entry code 0 to re-initialize. Repositioning
       information, provided by the INMOD after a code 3, is read from the
       restart log table and returned in the buffer normally used for the
       data record.
3      A checkpoint has been written, the INMOD routine should remember the
       checkpoint position, and TPump does not expect the INMOD routine to
       return a data record.
       In the buffer normally used to return data, the INMOD should return
       any information (up to 100 bytes) needed to reposition to this
       checkpoint. The utility saves this information in the restart log
       table.
4      The Teradata Database has failed, the INMOD routine should
       reposition to the last checkpoint, and TPump is not expecting the
       INMOD routine to return a data record.
       Note: If the RDBMS restarts before the first checkpoint, TPump sends
       entry code 5 for cleanup, and then it sends entry code 0 to
       re-initialize. TPump reads the repositioning information, provided
       by the INMOD after a code 3, from the restart log table and returns
       it to the INMOD in the buffer normally used for the data record.
5      The TPump job has ended and the INMOD routine should perform any
       required cleanup tasks.
6      The INMOD should initialize and prepare to receive records.
7      The next record is available for the INMOD.
Table 23 explains the two status codes used by the INMOD-to-TPump interface.
Table 23: INMOD-to-TPump Interface Status Codes

Value              Description
-----------------  --------------------------------------------------------
0                  A record is being returned as the body value for a read
                   call (code 1). For calls other than read, a value of 0
                   indicates successful completion.
Any nonzero value  The INMOD routine is at an end-of-file condition for a
                   read call (code 1). For calls other than read, a nonzero
                   value indicates a processing error that terminates TPump.
Length

Length is the 32-bit binary value that the INMOD routine uses to specify the length, in bytes,
of the data record. The INMOD routine can use a length value of zero to indicate an
end-of-file condition.
Body
Body is the area where the INMOD routine places the data record. Maximum record length is
31K or 31,744 bytes for Teradata for Windows. Maximum record length for Teradata Database
for UNIX is 62K or 63,488 bytes.
Sequence Number

Sequence number is a 4-byte integer that carries the record counter portion of the source
sequence number.
Parameter List

The parameter list in the second 32-bit address consists of the following:
• VARCHAR specification
• Two-byte length specification, m
• The m-byte parms string, as parsed and presented by TPump

Caution: To prevent data corruption, INMOD routines that cannot comply with these
protocols should terminate if they encounter a restart code 2, 3, or 4. To support proper
TPump restart operations, INMOD routines must save and restore checkpoint information as
described here. If the INMOD saves checkpoint information in some other manner, a
subsequent restart/recovery operation could result in data loss or corruption.
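The entry codes in Table 22 and the checkpoint protocol above can be sketched as a minimal C INMOD. This is an illustrative sketch only, not TPump code: the record source is an in-memory array standing in for a real input file, and the struct layouts follow the UNIX/Windows shapes in Table 21. Field and helper names are assumptions for the example.

```c
/* Minimal INMOD sketch: dispatch on the TPump entry code (Table 22)
   and honor the checkpoint/restart protocol. The "file" is a static
   array so the protocol logic can be exercised without real I/O. */
#include <string.h>

#define MAXBODY 64004            /* UNIX buffer limit from Table 21 */

typedef struct {
    long Status;                 /* in: entry code; out: return code  */
    long RecordLength;           /* out: length of returned record    */
    char buffer[MAXBODY];        /* out: record / checkpoint info     */
} InmodRecord;

typedef struct {
    long  seqnum;                /* record counter from TPump         */
    short parmlen;               /* length of parms string            */
    char  parm[80];              /* parms string as parsed by TPump   */
} InmodParam;

/* hypothetical in-memory "file" of three records */
static const char *records[] = { "alpha", "beta", "gamma" };
static int next_rec;             /* current read position             */
static int ckpt_rec;             /* position saved at last checkpoint */

void _dynamn(InmodRecord *r, InmodParam *p)
{
    (void)p;                     /* parms unused in this sketch */
    switch (r->Status) {
    case 0:                      /* first call: initialize, then read */
        next_rec = ckpt_rec = 0;
        /* FALLTHROUGH */
    case 1:                      /* return the next record */
        if (next_rec >= 3) {
            r->Status = 1;       /* nonzero => end-of-file (Table 23) */
            r->RecordLength = 0;
            return;
        }
        r->RecordLength = (long)strlen(records[next_rec]);
        memcpy(r->buffer, records[next_rec], (size_t)r->RecordLength);
        next_rec++;
        r->Status = 0;
        return;
    case 3:                      /* checkpoint: remember position and
                                    return repositioning info (<=100B) */
        ckpt_rec = next_rec;
        memcpy(r->buffer, &ckpt_rec, sizeof ckpt_rec);
        r->RecordLength = sizeof ckpt_rec;
        r->Status = 0;
        return;
    case 2:                      /* client restart */
    case 4:                      /* Teradata Database restart: TPump
                                    hands back the saved info in buffer */
        memcpy(&ckpt_rec, r->buffer, sizeof ckpt_rec);
        next_rec = ckpt_rec;
        r->Status = 0;
        return;
    case 5:                      /* end of job: clean up */
        r->Status = 0;
        return;
    default:                     /* ignore codes this sketch ignores */
        r->Status = 0;
        return;
    }
}
```

A read after a simulated restart (code 2) resumes at the record following the last checkpoint, which is the behavior the Caution above requires.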
TPump/Notify Exit Routine Interface
TPump accumulates operational information about specific events that occur during a
TPump job. If the BEGIN LOAD command includes a NOTIFY option with an EXIT
specification, then, when the specific events occur, TPump calls the named notify exit routine
and passes to it:
• An event code to identify the event
• Specific information about the event
Table 24 lists the event codes and describes the data that TPump passes to the notify exit
routine for each event. (See the description of the NOTIFY option in the “BEGIN LOAD”
command description in Chapter 3: “TPump Commands,” for a description of the events
associated with each level of notification—low, medium, high, and ultra.)
Note: To support future enhancements, always make sure that your notify exit routines ignore
invalid or undefined event codes, and that they do not cause TPump to terminate abnormally.
Table 24: Events Passed to the Notify Exit Routine

Initialize (event code 0)
Event description: Successful processing of the NOTIFY option of the BEGIN LOAD command.
Data passed to the notify exit routine:
• Version ID length—4-byte unsigned integer
• Version ID string—32-character (maximum) array
• Utility ID—4-byte unsigned integer
• Utility name length—4-byte unsigned integer
• Utility name string—36-character (maximum) array
• User name length—4-byte unsigned integer
• User name string—64-character (maximum) array
• Optional string length—4-byte unsigned integer
• Optional string—80-character (maximum) array

File or INMOD open (event code 1)
Event description: Successful processing of the IMPORT command that specifies the file or
INMOD routine name.
Data passed to the notify exit routine:
• File name length—4-byte unsigned integer
• File name—256-character (maximum) array
• Import number—4-byte unsigned integer

Checkpoint begin (event code 2)
Event description: TPump is about to perform a checkpoint operation.
Data passed to the notify exit routine: Record number—4-byte unsigned integer

Import begin (event code 3)
Event description: The first record is about to be read for each import task.
Data passed to the notify exit routine: Import number—4-byte unsigned integer

Import end (event code 4)
Event description: The last record has been read for each import task. The returned data is
the record statistics for the import task.
Data passed to the notify exit routine:
• Import number—4-byte unsigned integer
• Records read—4-byte unsigned integer
• Records skipped—4-byte unsigned integer
• Records rejected—4-byte unsigned integer
• Records sent to the Teradata Database—4-byte unsigned integer
• Data errors—4-byte unsigned integer

Error table (event code 5)
Event description: Processing of the SEL COUNT(*) request completed successfully for the
error table.
Data passed to the notify exit routine:
• Table name—128-byte character (maximum) array
• Number of rows—4-byte unsigned integer

Teradata Database restart (event code 6)
Event description: TPump received a crash message from the Teradata Database or from the
CLIv2.
Data passed to the notify exit routine: No data accompanies the Teradata Database restart
event code.

CLIv2 error (event code 7)
Event description: TPump received a CLIv2 error.
Data passed to the notify exit routine: Error code—4-byte unsigned integer

Teradata Database error (event code 8)
Event description: TPump received a Teradata Database error that will produce an exit code
of 12.
Data passed to the notify exit routine: Error code—4-byte unsigned integer

Exit (event code 9)
Event description: TPump completed a load task.
Data passed to the notify exit routine: Exit code—4-byte unsigned integer

Table statistics (event code 10)
Event description: TPump has successfully written the table statistics.
Data passed to the notify exit routine:
• Type (I = Insert, U = Update, or D = Delete)—1-byte character variable
• Database name—64-character (maximum) array
• Table/macro name—64-character (maximum) array
• Activity count—4-byte unsigned integer

Checkpoint end (event code 11)
Event description: TPump successfully completed the checkpoint operation.
Data passed to the notify exit routine: Record number—4-byte unsigned integer

Interim run statistics (event code 12)
Event description: TPump is updating the Monitor Interface table, has just completed a
checkpoint, or has read the last record for an import task. The returned data is the statistics
for the current load.
Data passed to the notify exit routine:
• Import number—4-byte unsigned integer
• Statements sent to the Teradata Database—4-byte unsigned integer
• Requests sent to the Teradata Database—4-byte unsigned integer
• Records read—4-byte unsigned integer
• Records skipped—4-byte unsigned integer
• Records rejected—4-byte unsigned integer
• Records sent to the Teradata Database—4-byte unsigned integer
• Data errors—4-byte unsigned integer

DML error (event code 13)
Event description: TPump received a Teradata Database error that was caused by DML and
will introduce an error-row insert to the error table.
Note: Not all Teradata Database errors cause this event. A 3807 error, for example, while
trying to drop or create a table, does not terminate TPump.
Data passed to the notify exit routine:
• Import number—4-byte unsigned integer
• Error code—4-byte unsigned integer
• Error message—256-character (maximum) array
• Record number—4-byte unsigned integer
• Apply number—1-byte unsigned char
• DML number—1-byte unsigned char
• Statement number—1-byte unsigned char
• Record data—64,004-character (maximum) array
• Record data length—4-byte unsigned integer
• Feedback—a pointer to a 4-byte unsigned integer
“Feedback” always points to integer 0 when it is passed to the user exit routine. The user may
change the value of this integer to 1 to instruct TPump not to log the error to the error table.
In this case, TPump does not log the error, but continues other regular processing of this
error.
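The event dispatch pattern described above can be sketched in C. This is an illustrative sketch, not the actual TPump interface: real TPump passes event-specific data structures as listed in Table 24, so the flat (event, a, b) signature and the accessor helpers here are simplifying assumptions for the example. Per the note above, unknown event codes are silently ignored.

```c
/* Sketch of a notify exit routine that tallies a few Table 24 events.
   The (event, a, b) argument list is a simplification; field choices
   per event are noted in comments and are assumptions. */

enum {
    EV_IMPORT_END = 4,           /* Import end   */
    EV_EXIT       = 9            /* Exit         */
};

/* running totals accumulated across calls */
static unsigned long total_read;
static unsigned long total_rejected;
static unsigned long final_exit_code;

long _dynamn(long event, unsigned long a, unsigned long b)
{
    switch (event) {
    case EV_IMPORT_END:          /* a = records read, b = records rejected
                                    (a subset of the Table 24 payload)    */
        total_read += a;
        total_rejected += b;
        break;
    case EV_EXIT:                /* a = exit code for the load task */
        final_exit_code = a;
        break;
    default:                     /* ignore invalid or undefined event
                                    codes, as the note above requires */
        break;
    }
    return 0;
}

/* accessors so a caller (or scheduler) can inspect the totals */
unsigned long records_read(void)     { return total_read; }
unsigned long records_rejected(void) { return total_rejected; }
unsigned long exit_code(void)        { return final_exit_code; }
```

In an automated environment, a routine like this could publish the final exit code and record counts to a job scheduler after the Exit event fires.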
Rules and Restrictions for Using Routines
The following sections describe the operational rules and restrictions for using INMOD and
notify exit routines in TPump jobs.
Specifying Routines
INMOD and notify exit routine names must be unique within the system.
A TPump job can specify one INMOD routine with each IMPORT command. These
specifications can be to the same or different INMOD routines.
In addition to the multiple INMOD routines, each TPump job can specify an exit routine with
the NOTIFY... EXIT option of the BEGIN LOAD command.
Compiling and Linking Routines
The methods for compiling and linking routines vary with the operating system. The
following sections describe the methods for VM, MVS, UNIX, and Windows.
Using VM
On channel-attached VM client systems, INMOD and notify exit routines must be compiled
under SAS/C and passed to CLINK with the following options:
•
CLINK <filename>
•
LKED
•
LIBE
•
DYNAMC
•
NAME <modulename>
The resulting module, which can be loaded by SAS/C at run time, is placed in a load library
called DYNAMC LOADLIB. (The first name must be DYNAMC because this is the only place
that SAS/C looks for user load modules.)
Multiple load modules can exist in the local library as long as each module has a unique name.
Using MVS
The procedure on MVS platforms is similar to the procedure on VM platforms, with one
exception:
• User load modules can be located anywhere, as long as the location is identified by one of
  the DD name STEPLIB specifications in the JCL.
Using UNIX
On network-attached UNIX client systems, INMOD and notify exit routines must:
• Be compiled with the MetaWare High C compiler
• Be linked into a shared object module
• Use an entry point named _dynamn
Using Windows
On network-attached Windows client systems, INMOD and notify exit routines must:
• Be written in C
• Have a dynamn entry point that is a _declspec
• Be saved as a Dynamic Link Library (DLL) file
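The _declspec requirement above can be sketched as follows. This is a hedged sketch: the preprocessor guard is there only so the fragment compiles on non-Windows compilers, and the placeholder struct stands in for the first INMOD parameter.

```c
/* Sketch of exporting the dynamn entry point from a Windows DLL with
   __declspec(dllexport). The guard keeps the sketch portable; the
   struct is a placeholder for the real first-parameter layout. */
#ifdef _WIN32
#define INMOD_EXPORT __declspec(dllexport)
#else
#define INMOD_EXPORT                 /* no-op off Windows */
#endif

typedef struct {
    long Status;                     /* return code back to TPump    */
    long RecordLength;               /* length of the returned record */
} InmodHeader;

INMOD_EXPORT void dynamn(InmodHeader *r, void *parm)
{
    (void)parm;                      /* parms unused in this sketch */
    r->Status = 0;                   /* 0 = success */
    r->RecordLength = 0;             /* no record returned here */
}
```

With MSVC, compiling this source with the /LD option would produce the DLL that TPump loads at runtime; consult your compiler documentation for the exact build step.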
For more information, see the examples in Appendix C: “INMOD and Notify Exit Routine
Examples” for sample programs and procedures that compile and link INMOD and notify
exit routines for your operating system environment.
Addressing Mode on VM and MVS Systems
You can use either 31-bit or 24-bit addressing for INMOD routines on channel-attached
systems. The 31-bit mode provides access to more memory, which enhances performance for
TPump jobs with a large number of sessions.
Use the following linkage parameters to specify the addressing mode when building INMOD
routines for VM and MVS systems:
• For 31-bit addressing: AMODE(31) RMODE(24)
• For 24-bit addressing: AMODE(24) RMODE(24)
INMOD Routine Compatibility with Other Load Utilities
You can use FDL-compatible INMOD routines that were created for FastLoad by including
the FDLINMOD parameter as the USING (parms) option of your IMPORT command. Using
this parameter provides compatible support operations except for the way checkpointing is
performed:
• If your TPump job uses the FROM, FOR, or THRU options to request a range of records
  from an FDL-compatible INMOD routine, then TPump bypasses any default record
  checkpoint function. By default, TPump takes a checkpoint every 15 minutes. You can
  bypass the TPump checkpoint function by specifying a CHECKPOINT rate of zero in your
  BEGIN LOAD commands.
  If the Teradata Database experiences a restart/recovery operation, TPump starts over and
  gets the records again from the beginning of the range.
  Under these same circumstances, if your BEGIN LOAD command included a
  CHECKPOINT rate other than zero, TPump terminates with an error condition.
• If your TPump job does not request a range of records, then TPump performs
  checkpointing either by default (every 15 minutes) or per your job specifications.
  If the Teradata Database experiences a restart/recovery operation and the INMOD routine
  supports recovery, TPump continues the data acquisition activity from the last recorded
  checkpoint.
  Note, however, that the source sequence numbers generated by TPump may not correctly
  identify the sequence in which the INMOD routine supplied the records. The data is still
  applied correctly, despite this discrepancy.
You cannot specify an FDL-compatible INMOD routine with the INFILE specification of a
TPump IMPORT command. When you specify an INMOD routine with the INFILE
specification:
• TPump performs the file-read operation
• The INMOD routine acts as a pass-through filter
The combination of an FDL-compatible INMOD routine with a TPump INFILE specification
is not valid because an FDL-compatible INMOD routine must always perform the file read
operation.
Checkpoints
To support TPump restart operations, your INMOD routine must support checkpoint
operations, as described in “The TPump/INMOD Routine Interface” on page 205.
If you use an INMOD routine that does not support the checkpoint function, your job may
encounter problems when TPump takes a checkpoint.
By default, TPump takes a checkpoint every 15 minutes. You can bypass the TPump
checkpoint function by specifying a CHECKPOINT rate of zero in your BEGIN LOAD
command; that way, the job completes without taking a checkpoint.
Though this would nullify the TPump restart/reload capability, it would allow you to use an
INMOD routine that does not support the checkpoint function.
Using INMOD and Notify Exit Routines
This section provides some specific information you need for using INMOD and notify exit
routines in TPump. Topics include TPump-specific restrictions, the TPump/INMOD
interface for different client operating systems, preparation of the INMOD program, INMOD
input values, INMOD output values, and programming specifications for Unix-based and
Windows clients.
TPump-specific Restrictions
INMOD names should be unique within the system. INMODs are not re-entrant and cannot
be shared by two TPump (or FastLoad, MultiLoad, or FastExport) sessions at the same time.
Some changes have been made to the INMOD utility interface for TPump because of
operational differences between TPump and the older utilities. For compatibility with
INMODs, the FDLINMOD parameter should be used. The use of this parm provides support
of existing INMODs, with the following restrictions:
• When the FDLINMOD parm is used, INMODs that are compatible with other utilities
  may be used. However, if a range of records is requested from an FDL-compatible INMOD
  (using FROM, FOR, or THRU on the IMPORT command), TPump bypasses any default
  record checkpointing. If there is a recovery under these circumstances, TPump starts over
  and acquires the records again from the beginning of the range. Under these same
  circumstances, if checkpointing is requested by specifying the CHECKPOINT parameter
  on the BEGIN LOAD command, TPump terminates with an error.
• If a range of records is not requested when using an FDL-compatible INMOD, TPump
  performs checkpointing, either by default or by the user’s request. If there is a recovery and
  the INMOD supports recovery, TPump continues its data acquisition from the last
  recorded checkpoint. However, the source sequence numbers generated by TPump may
  not correctly identify the sequence in which the INMOD supplied the records. Despite this
  discrepancy, the data is still applied correctly.
• You cannot specify an FDL-compatible INMOD routine in conjunction with the INFILE
  specification of a TPump IMPORT command. If an INMOD is specified together with the
  INFILE specification, TPump performs the file read operation and the INMOD acts as a
  pass-through filter. Since an FDL-compatible INMOD always performs the file read
  operation, it is not valid with a TPump INFILE specification.

Warning: The TPump default is to take a checkpoint every 15 minutes. With other loading
utilities, checkpointing must be explicitly requested. If you attempt to run with an INMOD
that does not use checkpointing, problems may arise when TPump defaults to a checkpoint
mode. To avoid this condition, you can disable TPump checkpointing by specifying zero as
the checkpoint rate parameter on the BEGIN LOAD command, so that the checkpoint is
never reached. This may be imperative for users who do not have INMODs capable of
checkpointing.
TPump/INMOD Interface
This section discusses the TPump/INMOD Interface for different client operating systems:
TPump/INMOD Interface on IBM Client-based Systems
The use of an INMOD is specified on the IMPORT command. On IBM client-based systems,
the Teradata Database interfaces with INMODs written in C, COBOL, PL/I, and Assembler.
Examples of these INMODs are presented in Appendix C: “INMOD and Notify Exit Routine
Examples”. An optional parms string to be passed to the INMOD may also be specified on the
IMPORT command. TPump imposes the following syntax rules for this string:
• The parms string may include one or more character strings, each delimited on either end
  by an apostrophe, or delimited on either end by a quotation mark. The maximum size of
  the parms string is 1 KB.
• If a FastLoad INMOD is used, the parms string of the IMPORT command must be
  FDLINMOD.
• The parms string passed to an INMOD includes the parentheses used to specify the parm.
  Thus, if the IMPORT specifies USING ('5'), the entire expression ('5') is passed to the
  INMOD.
• Parentheses within delimited character strings or comments have the same syntactical
  significance as alphabetic characters.
• In the parms string that TPump passes to the INMOD routine, each comment is replaced
  by a single blank character.
• In the parms string that TPump passes to the INMOD routine, each consecutive sequence
  of whitespace characters, such as blank, tab, and so on, that appears outside of delimited
  strings, is replaced by a single blank character.
• FDLINMOD is used for compatibility by pointing to a data structure that is the same for
  BDL and FDL INMODs.
TPump/INMOD Interface on UNIX-based Systems
On UNIX-based client platforms, TPump is written in C and, therefore, the INMOD
procedure is dynamically loaded at runtime, rather than link-edited into the TPump module
or operated as a separate executable program.
The runtime loader requires that the INMOD module be compiled and linked as a shared
object, and that the entry point for the procedure be named _dynamn.
The use of an INMOD is specified in the IMPORT command. On UNIX-based systems, the
Teradata Database interfaces only with INMODs written in C. An example of a C INMOD is
presented in Appendix C: “INMOD and Notify Exit Routine Examples”. An optional parms
string to be passed to the INMOD may also be specified on the IMPORT command. TPump
imposes these syntax rules:
• One INMOD is allowed for each IMPORT command. Multiple IMPORTs are allowed;
  these may use the same or different INMODs.
• The input filename parameter specified on the IMPORT command must be the fully
  qualified UNIX pathname for the input file.
• The INMOD filename parameter specified on the IMPORT command must be the fully
  qualified UNIX pathname of the INMOD shared object file.
• The parms string may include one or more character strings, each delimited on either end
  by an apostrophe, or delimited on either end by a quotation mark. The maximum size of
  the parms string is 1 KB.
• If a FastLoad INMOD is used, the parms string of the IMPORT command must be
  “FDLINMOD”.
• The parms string as a whole must be enclosed in parentheses.
• Parentheses within delimited character strings or comments have the same syntactical
  significance as alphabetic characters.
• In the parms string that TPump passes to the INMOD routine, each comment is replaced
  by a single blank character.
• In the parms string that TPump passes to the INMOD routine, each consecutive sequence
  of whitespace characters, such as blank, tab, and so on, that appears outside of delimited
  strings, is replaced by a single blank character.
• FDLINMOD is used for compatibility by pointing to a data structure that is the same for
  FDL INMODs.
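The parms string arrives through the second INMOD parameter, whose layout is given in Table 21 for C. A small helper can lift it into a NUL-terminated string and check for the FastLoad-compatibility flag. This is a sketch under stated assumptions: the struct field names are illustrative, and the helper names are invented for this example.

```c
/* Sketch of handling the second INMOD parameter: a 4-byte sequence
   number, a 2-byte length m, and the m-byte parms string as parsed
   by TPump (comments and whitespace runs already collapsed). */
#include <stdlib.h>
#include <string.h>

typedef struct {
    long  seqnum;                /* record counter from TPump       */
    short parmlen;               /* length m of the parms string    */
    char  parm[80];              /* the m-byte parms string         */
} InmodParam;

/* copy the parms into a NUL-terminated C string; caller frees it */
char *parms_to_cstring(const InmodParam *p)
{
    char *s = malloc((size_t)p->parmlen + 1);
    if (!s) return NULL;
    memcpy(s, p->parm, (size_t)p->parmlen);
    s[p->parmlen] = '\0';
    return s;
}

/* TPump passes the enclosing parentheses through, so a check for the
   FastLoad-compatibility flag must look inside them */
int is_fdl_inmod(const char *parms)
{
    return strstr(parms, "FDLINMOD") != NULL;
}
```

An INMOD would typically run this once on the initialization call (entry code 0) and branch on the result.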
TPump/INMOD Interface on Windows Systems
On Windows client platforms, TPump is written in C and, therefore, the INMOD procedure is
dynamically loaded at runtime, rather than link-edited into the TPump module or run as a
separate executable program.
The runtime loader requires that the INMOD module be compiled and linked as a
Dynamic-Link Library (DLL) file, and that the entry point for the procedure be named
_dynamn.
The use of an INMOD is specified in the IMPORT command. On Windows systems, the
Teradata Database interfaces only with INMODs written in C. An optional parms string to be
passed to the INMOD may also be specified on the IMPORT command. TPump imposes the
following syntax rules:
• One INMOD is allowed for each IMPORT command. Multiple IMPORTs are allowed;
  these may use the same or different INMODs.
• The input filename parameter specified on the IMPORT command must be the fully
  qualified Windows pathname for the input file.
• The INMOD filename parameter specified on the IMPORT command must be the fully
  qualified Windows pathname of the INMOD DLL file.
• The parms string may include one or more character strings, each delimited on either end
  by an apostrophe, or delimited on either end by a quotation mark. The maximum size of
  the parms string is 1 KB.
• If a FastLoad INMOD is used, the parms string of the IMPORT command must be
  “FDLINMOD”.
• The parms string as a whole must be enclosed in parentheses.
• Parentheses within delimited character strings or comments have the same syntactical
  significance as alphabetic characters.
• In the parms string that TPump passes to the INMOD routine, each comment is replaced
  by a single blank character.
• In the parms string that TPump passes to the INMOD routine, each consecutive sequence
  of whitespace characters, such as blank, tab, and so on, that appears outside of delimited
  strings, is replaced by a single blank character.
• FDLINMOD is used for compatibility by pointing to a data structure that is the same for
  FDL INMODs.
Preparing the INMOD Program
This section describes the protocol used between TPump and an INMOD written for TPump.
The protocols are applicable to all client platforms running TPump. Considerations applicable
exclusively to UNIX-based clients are contained in “Programming INMODs for UNIX-based
Clients” on page 216.
On entry to an INMOD user exit routine for TPump, the conventional parameter register
points to a parameter list of two 32-bit addresses. The first 32-bit address points to a data
structure containing the following fields:
•  Return Code/Function Code: 4-byte integer.
•  Length: 4-byte integer; the length of the data record.
•  Data Record: the input data record buffer. The maximum length is:
   •  31K (31,744 bytes) for Teradata for Windows
   •  62K (63,488 bytes) for Teradata Database for UNIX
INMOD Input Values
As input to the INMOD routine, Table 25 lists valid values of the Return Code/Function Code
field and their meanings:
Table 25: INMOD Input Return Code Values
Code   Description
0      Request for INMOD to initialize and return first record.
1      Request for INMOD to return a record.
2      Request for INMOD to reposition to last checkpoint because of client restart. Repositioning information, provided by the INMOD after a code 3, is read from the restart log table and returned to the INMOD in the buffer normally used for the data record.
3      Request for INMOD to take a checkpoint. In the buffer normally used to return data, the INMOD should return any information (up to 100 bytes) that it may need to reposition to this checkpoint. TPump then saves this information in its restart log table.
4      Request for INMOD to reposition to last checkpoint because of Teradata Database failure. Repositioning information, provided by the INMOD after a code 3, is read from the restart log table and returned to the INMOD in the buffer normally used for the data record.
5      Request for INMOD to wrap up at termination.
6      Request for INMOD to initialize.
7      Request for INMOD to receive first (next) record.
INMOD Output Values
As output from the INMOD routine, Table 26 lists valid values of the Return Code field and
their meanings:
Table 26: INMOD Output Return Code Values
Code                                  Description
0 on read call (code 1)               Indicates End Of File not reached. The length field should be set to the length of the output record. If an input record was supplied to the INMOD and it is to be skipped, set the length field to zero. If no input record was supplied, setting the length to zero acts as an End Of File.
Non-0 on read call (code 1)           Indicates End Of File.
0 on non-read call (not code 1)       Indicates successful operation.
Non-0 on non-read call (not code 1)   Indicates a processing error. TPump terminates.
The second 32-bit address points to a data structure containing the following fields:
•  Sequence Number: 4-byte integer; the integer record counter portion of the source sequence number.
•  Parameter List: VARCHAR; a 2-byte length, m, followed by the m-byte parms string as parsed and presented by TPump.
INMODs that cannot comply with these protocols should terminate if a Restart Code 2,
Code 3, or Code 4 is encountered. Otherwise, data might become corrupted. In order to be
restartable, INMODs must make use of TPump to save and restore checkpoint information as
described above. If the INMOD saves its checkpointing information privately, recovery might
result in data corruption.
Note: For VM users, INMODs must be link-edited into a CMS LOADLIB with the name
DYNAMC LOADLIB to be available for use with TPump.
Note: On MVS, the module must reside in the steplib/joblib (for JCL), task library (for clist/exec), or the system linklist (for any).
Programming INMODs for UNIX-based Clients
In addition to the techniques for preparing INMODs listed in “Preparing the INMOD
Program” on page 214 which apply to all platforms, there are several rules that must be
followed only for developing C INMODs for UNIX-based clients. These are:
•  The INMOD subroutine must be named _dynamn.
•  The INMOD must be compiled with the MetaWare High C compiler.
•  The compiled INMOD module must be linked into a shared object module.
Compiling and Linking a C INMOD on a UNIX-based Client
Note: For a description of the syntax diagrams used in this book, see Appendix A: “How to
Read Syntax Diagrams.”
The following syntax example can be used to compile a C INMOD on a UNIX-based client.
Compile Syntax
cc -c inmod.c
where
Syntax Element   Description
cc               Program that invokes the MetaWare High C Compiler
-c               Compiler option specifying to compile without linking, producing an object file
inmod.c          A C source module for the INMOD
Use the following syntax example to link the object modules into a shared object module.
Link Syntax
ld -dy -G inmod.o -o inmod.so
where
Syntax Element   Description
ld               Invokes the UNIX linker editor
-dy              Specifies to use dynamic linking
-G               Specifies to create a shared object
inmod.o          An object module derived from the compile step (see above)
-o               Specifies the output filename; default is a.out
inmod.so         Specifies the resulting shared object module. This is the user-specified name in the IMPORT command.
Compiling and Linking a C INMOD on MP-RAS and Sun Solaris SPARC
Use the following syntax example to compile a C INMOD on MP-RAS or Sun Solaris SPARC
client systems.
cc -G -KPIC sourcefile.c -o shared-object-name
where
Syntax Element       Description
cc                   Invokes the MetaWare High C Compiler
-G                   Specifies to create a shared object
-KPIC                Compiler option that generates Position Independent Code (PIC) for all user exit routines
sourcefile           A C source module for the INMOD
-o                   Specifies the output file name
shared-object-name   Specifies the resulting shared object module. This is the name you specify as:
                     •  The INMOD modulename parameter of the IMPORT command of your TPump job script.
                     •  The EXIT name parameter for the NOTIFY option of the BEGIN LOAD command of your TPump job script.
                     The shared-object-name can be any valid UNIX file name.
Compiling and Linking a C INMOD on a Sun Solaris Opteron
Use the following syntax example to compile a C INMOD on a Sun Solaris Opteron client
system.
cc -dy -G sourcefile.c -o shared-object-name
where
Syntax Element       Description
cc                   Invokes the MetaWare High C Compiler
-dy                  Specifies to use dynamic linking
-G                   Specifies to create a shared object
sourcefile           A C source module for the INMOD
-o                   Specifies the output file name
shared-object-name   Specifies the resulting shared object module. This is the name you specify as:
                     •  The INMOD modulename parameter of the IMPORT command of your TPump job script.
                     •  The EXIT name parameter for the NOTIFY option of the BEGIN LOAD command of your TPump job script.
                     The shared-object-name can be any valid UNIX file name.
Compiling and Linking a C INMOD on HP-UX PA RISC
Use the following syntax example to compile a C INMOD on HP-UX PA RISC client.
Compile Syntax
cc +z +ul inmod.c
where
Syntax Element   Description
cc               Invokes the MetaWare High C Compiler
+z               Compiler option specified to generate Position Independent Code (PIC) for all user exit routines
+ul              Compiler option that allows pointers to access non-natively aligned data
inmod.c          A C source module for the INMOD
Use the following syntax example to link the object modules on HP-UX PA-RISC into the
shared object.
Link Syntax
ld -b inmod.o -o inmod.so
where
Syntax Element   Description
ld               Invokes the UNIX linker editor
-b               Linker option specified to generate a shared object file
inmod.o          An object module derived from the compile step (see above)
-o               Specifies the output filename; default is a.out
inmod.so         Specifies the resulting shared object module. This is the user-specified name in the IMPORT command.
Compiling and Linking a C INMOD on HP-UX Itanium
Use the following syntax example to compile a C INMOD on an HP-UX Itanium-based client.
Compile Syntax
cc +u1 -D_REENTRANT +DD64 -c inmod.c
where
Syntax Element   Description
cc               Invokes the MetaWare High C compiler
+u1              Compiler option that allows pointers to access non-natively aligned data
-D_REENTRANT     Ensures that all the Pthread definitions are visible at compile time
+DD64            Generates 64-bit object code for PA2.0 architecture
-c               Compiles one or more source files but does not enter the linking phase
inmod.c          A C source module for the INMOD
Use the following syntax example to link the object modules on HP-UX Itanium into the
shared object.
Link Syntax
ld -n -b inmod.o -lc -o inmod.so
where
Syntax Element   Description
ld               Invokes the UNIX linker editor
-n               Generates an executable with file type SHARE_MAGIC. This option is ignored in 64-bit mode.
-b               Linker option specified to generate a shared object file
inmod.o          An object module derived from the compile step (see above)
-lc              Searches a library: libc.a, libc.so, or libc.sh
-o               Specifies the output filename; default is a.out
inmod.so         Specifies the resulting shared object module. This is the user-specified name in the IMPORT command.
Compiling and Linking a C INMOD on an IBM AIX
Use the following syntax example to compile a C INMOD on an IBM AIX-based client.
Compile Syntax
cc -c -brtl -fPIC sourcefile.c
where
Syntax Element   Description
cc               A call to the program that invokes the native UNIX C compiler
-c               Compiler option that specifies to not send object files to the linkage editor
-brtl            Tells the linkage editor to accept both .sl and .a library file types
-fPIC            Compiler option that generates Position Independent Code (PIC) for all user exit routines
sourcefile.c     A C source module for the INMOD
Use the following syntax example to link the object modules into a shared object module.
Link Syntax
ld -G objectfile.o -bE:export_dynamn.txt -e_dynamn -o shared-object-name -lm -lc
where
Syntax Element          Description
ld                      Invokes the UNIX linker editor
-G                      Produces a shared object enabled for use with the run-time linker
-e_dynamn               Sets the entry point of the exit routine to _dynamn
-bE:export_dynamn.txt   Linker option that exports the symbol "_dynamn" explicitly; the file export_dynamn.txt contains the symbol
objectfile.o            An object module created during the compile step
-o                      Specifies the output file name
shared-object-name      Specifies the resulting shared object module. This is the name you specify as:
                        •  The INMOD modulename parameter of the IMPORT command of your TPump job script.
                        •  The EXIT name parameter for the NOTIFY option of the BEGIN LOAD command of your TPump job script.
                        The shared-object-name can be any valid UNIX file name.
-lm                     Linker option specifying to link with the /lib/libm.a library
-lc                     Linker option specifying to link with the /lib/libc.a library
Compiling and Linking a C INMOD on a Linux Client
Use the following syntax example to compile a C INMOD on a Linux client.
Note: Be sure to compile your INMOD and notify exit routines in 32-bit mode so they are
compatible with Teradata TPump.
gcc -shared -fPIC inmod.c -o inmod.so
where
Syntax Element   Description
gcc              Invokes the C compiler on Linux
-shared          Produces a shared object, which can then be linked with other objects to form an executable
-fPIC            Produces Position Independent Code
-o               Specifies the output file name
Programming INMODs for a Windows Client
The previous section lists INMOD preparation techniques that apply to all platforms. There are several additional rules to follow when developing C INMODs for Windows clients. These are:

•  The INMOD routine must be written in C.
•  The INMOD routine must have an entry point named _dynamn, declared with the __declspec keyword.
•  The file must be saved as a DLL file.
Compiling and Linking a C INMOD on a Windows Client
Use the following syntax example to create a DLL on a Windows client.
cl /DWIN32 /LD inmod.c
where
Syntax Element   Description
cl               Invokes the Microsoft C Compiler
/D               Defines a macro
/LD              Creates a .dll
inmod.c          A C source module for the INMOD
APPENDIX A
How to Read Syntax Diagrams
This appendix describes the conventions that apply to reading the syntax diagrams used in
this book.
Syntax Diagram Conventions
Notation Conventions
The following table defines the notation used in this section:
Item          Definition/Comments
Letter        An uppercase or lowercase alphabetic character ranging from A through Z.
Number        A digit ranging from 0 through 9. Do not use commas when entering a number with more than three digits.
Word          Variables and reserved words:
              •  UPPERCASE LETTERS represent a keyword. Syntax diagrams show all keywords in uppercase, unless operating system restrictions require them to be in lowercase. If a keyword is shown in uppercase, you may enter it in uppercase or mixed case.
              •  lowercase letters represent a keyword that you must enter in lowercase, such as a UNIX command.
              •  lowercase italic letters represent a variable such as a column or table name. You must substitute a proper value.
              •  lowercase bold letters represent a variable that is defined immediately following the diagram that contains it.
              •  UNDERLINED LETTERS represent the default value. This applies both to uppercase and to lowercase words.
Spaces        Use one space between items, such as keywords or variables.
Punctuation   Enter all punctuation exactly as it appears in the diagram.
Paths
The main path along the syntax diagram begins at the left, and proceeds, left to right, to the
vertical bar, which marks the end of the diagram. Paths that do not have an arrow or a vertical
bar only show portions of the syntax.
Note that the only part of a path that reads from right to left is a loop.
Paths that are too long for one line use continuation links. Continuation links are small circles
with letters indicating the beginning and ending of a link:
A
A
When you see a circled letter in a syntax diagram, go to the corresponding circled letter and
continue.
Required Items
Required items appear on the main path:
SHOW
If you can choose from more than one item, the choices appear vertically, in a stack. The first
item appears on the main path:
SHOW
CONTROLS
VERSIONS
Optional Items
Optional items appear below the main path:
SHOW
CONTROLS
If choosing one of the items is optional, all the choices appear below the main path:
SHOW
CONTROLS
VERSIONS
You can choose one of the options, or you can disregard all of the options.
Abbreviations
If a keyword or a reserved word has a valid abbreviation, the unabbreviated form always
appears on the main path. The shortest valid abbreviation appears beneath.
SHOW
CONTROLS
CONTROL
In the above syntax, the following formats are valid:
•
SHOW CONTROLS
•
SHOW CONTROL
Loops
A loop is an entry or a group of entries that you can repeat one or more times. Syntax
diagrams show loops as a return path above the main path, over the item or items that you can
repeat.
(diagram: a loop over cname, with a comma on the return path as the separator, the number 4 in a circle and the number 3 in a square on the return paths, and the whole list enclosed in parentheses)
The following rules apply to loops:
If there is a maximum number of entries allowed, the number appears in a circle on the return path. In the example, you may enter cname a maximum of 4 times.

If there is a minimum number of entries required, the number appears in a square on the return path. In the example, you must enter at least 3 groups of column names.

If a separator character is required between entries, the character appears on the return path. If the diagram does not show a separator character, use one blank space. In the example, the separator character is a comma.

If a delimiter character is required around entries, the beginning and ending characters appear outside the return path. Generally, a space is not needed between delimiter characters and entries. In the example, the delimiter characters are the left and right parentheses.
Excerpts
Sometimes a piece of a syntax phrase is too large to fit into the diagram. Such a phrase is
indicated by a break in the path, marked by | terminators on either side of the break. A name
for the excerpted piece appears between the break marks in boldface type.
The named phrase appears immediately after the complete diagram, as illustrated by the
following example.
(diagram: a LOCKING path broken at |excerpt| markers; the named excerpt phrase, involving HAVING con, where_cond, and comma-separated lists of cname and col_pos entries, is defined immediately after the main diagram)
APPENDIX B
TPump Examples
This appendix provides some examples of TPump scripts and their corresponding output.
Included are:
•  Simple Script Example
•  Restarted Upsert Example
•  Example Using the TABLE Command
In the output examples, the lines that begin with 4-digit numbers (for example, 0001) are script lines; the rest are output.
Simple Script Example
The following is an example of a simple script. This script:
/**************************************************************/
/*
*/
/* MLNT002H MVSJCL
*/
/*
*/
/**************************************************************/
/***********************************************/
/* STEP01 CREATES THE TABLES FOR THE TPump JOB */
/***********************************************/
.LOGTABLE CME.TLddNT2H;
.LOGON OPNACC1/CME,CME;
DROP TABLE TBL1T;
DROP TABLE TBL2T;
DROP TABLE tlnt2err;
CREATE TABLE TBL2T,FALLBACK
(ABYTEINT BYTEINT,
ASMALLINT SMALLINT,
AINTEGER INTEGER,
ADECIMAL DECIMAL (5,2),
ACHAR CHAR (5),
ABYTE BYTE(1),
AFLOAT FLOAT,
ADATE DATE)
UNIQUE PRIMARY INDEX (ASMALLINT);
/*****************************************************************/
/* BEGIN LOAD WITH ALL THE OPTIONS SPECIFIED SUCH AS ERRLIMIT,
*/
/* CHECKPOINT, SESSIONS,TENACITY
*/
/*****************************************************************/
.BEGIN LOAD
SESSIONS 6 4
PACK 10
CHECKPOINT 1
TENACITY 2
ERRLIMIT 5
ERRORTABLE tlnt2err;
.LAYOUT LAY1A;
.FIELD ABYTEINT * BYTEINT;
.FIELD ASMALLINT * SMALLINT;
.FIELD AINTEGER * INTEGER;
.FIELD ADECIMAL * DECIMAL (5,2);
.FIELD ACHAR * CHAR (5);
.FIELD ABYTE * BYTE(1);
.FIELD AFLOAT * FLOAT;
.FIELD ADATE * DATE;
.DML LABEL LABELA
IGNORE DUPLICATE ROWS
IGNORE MISSING ROWS
IGNORE EXTRA ROWS;
INSERT INTO TBL2T VALUES
(:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,:ACHAR,:ABYTE,:AFLOAT,:ADATE);
.IMPORT INFILE ./tlnt002.dat
LAYOUT LAY1A
APPLY LABELA FROM 1 FOR 1000;
.END LOAD;
.LOGOFF;
produces the following output:
0002 /**************************************************************/
/*
*/
/* MLNT002H MVSJCL
*/
/*
*/
/**************************************************************/
/***********************************************/
/* STEP01 CREATES THE TABLES FOR THE TPump JOB */
/***********************************************/
.LOGTABLE CME.TLddNT2H;
0003 .LOGON OPNACC1/CME,;
**** 09:47:17 UTY8400 Teradata Database Release: 12.00.00.00
**** 09:47:17 UTY8400 Teradata Database Version: 12.00.00.00
**** 09:47:17 UTY8400 Default character set: EBCDIC
**** 09:47:17 UTY8400 Maximum supported buffer size: 1M
**** 09:47:17 UTY8400 Upsert supported by RDBMS server
**** 09:47:17 UTY6211 A successful connect was made to the DBS.
**** 09:47:17 UTY6217 Logtable 'CME.TLddNT2H' has been created.
========================================================================
=                                                                      =
=                    Processing Control Statements                     =
=                                                                      =
========================================================================
0004 DROP TABLE TBL1T;
**** 09:47:23 UTY1016 'DROP' request successful.
0005 DROP TABLE TBL2T;
**** 09:47:29 UTY1016 'DROP' request successful.
0006 DROP TABLE tlnt2err;
**** 09:47:30 UTY1008 DBS failure: 3807, Table/view 'tlnt2err' does not exist.
0007 CREATE TABLE TBL2T,FALLBACK
(ABYTEINT BYTEINT,
ASMALLINT SMALLINT,
AINTEGER INTEGER,
ADECIMAL DECIMAL (5,2),
ACHAR CHAR (5),
ABYTE BYTE(1),
AFLOAT FLOAT,
ADATE DATE)
UNIQUE PRIMARY INDEX (ASMALLINT);
**** 09:47:42 UTY1016 'CREATE' request successful.
0008 /*****************************************************************/
/* BEGIN LOAD WITH ALL THE OPTIONS SPECIFIED SUCH AS ERRLIMIT,     */
/* CHECKPOINT, SESSIONS,TENACITY                                   */
/*****************************************************************/
.BEGIN LOAD
SESSIONS 6 4
PACK 10
CHECKPOINT 1
TENACITY 2
ERRLIMIT 5
ERRORTABLE tlnt2err;
========================================================================
=                                                                      =
=                     Processing TPump Statements                      =
=                                                                      =
========================================================================
0009 .LAYOUT LAY1A;
0010 .FIELD ABYTEINT * BYTEINT;
0011 .FIELD ASMALLINT * SMALLINT;
0012 .FIELD AINTEGER * INTEGER;
0013 .FIELD ADECIMAL * DECIMAL (5,2);
0014 .FIELD ACHAR * CHAR (5);
0015 .FIELD ABYTE * BYTE(1);
0016 .FIELD AFLOAT * FLOAT;
0017 .FIELD ADATE * DATE;
0018 .DML LABEL LABELA
IGNORE DUPLICATE ROWS
IGNORE MISSING ROWS
IGNORE EXTRA ROWS;
0019 INSERT INTO TBL2T VALUES
(:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,:ACHAR,:ABYTE,:AFLOAT,:ADATE);
0020 .IMPORT INFILE ./tlnt002.dat
LAYOUT LAY1A
APPLY LABELA FROM 1 FOR 1000;
0021 .END LOAD;
**** 09:47:43 UTY6609 Starting to log on sessions...
**** 09:47:57 UTY6610 Logged on 6 sessions.
========================================================================
=                                                                      =
=                      TPump Import(s) Beginning                       =
=                                                                      =
========================================================================
**** 09:47:57 UTY6630 Options in effect for following TPump Import(s):
.        Tenacity:        2 hour limit to successfully connect load sessions.
.        Max Sessions:    6 session(s).
.        Min Sessions:    4 session(s).
.        Checkpoint:      1 minute(s).
.        Errlimit:        5 rejected record(s).
.        Restart Mode:    ROBUST.
.        Serialization:   OFF.
.        Packing:         10 Statements per Request.
.        StartUp Rate:    UNLIMITED Statements per Minute.
**** 09:48:13 UTY6608 Import 1 begins.
**** 09:48:51 UTY6641 Since last chkpt., 1000 recs. in, 1000 stmts., 104 reqs
**** 09:48:51 UTY6647 Since last chkpt., avg. DBS wait time: 0.26
**** 09:48:51 UTY6612 Beginning final checkpoint...
**** 09:48:51 UTY6641 Since last chkpt., 1000 recs. in, 1000 stmts., 104 reqs
**** 09:48:51 UTY6647 Since last chkpt., avg. DBS wait time: 0.26
**** 09:48:51 UTY6607 Checkpoint Completes with 1000 rows sent.
**** 09:48:51 UTY6642 Import 1 statements: 1000, requests: 104
**** 09:48:51 UTY6643 Import 1 average statements per request: 9.62
**** 09:48:51 UTY6644 Import 1 average statements per record: 1.00
**** 09:48:51 UTY6645 Import 1 statements/session: avg. 166.67, min. 154.00, max. 182.00
**** 09:48:51 UTY6646 Import 1 requests/session: average 17.33, minimum 16.00, maximum 19.00
**** 09:48:51 UTY6648 Import 1 DBS wait time/session: avg. 4.50, min. 2.00, max. 11.00
**** 09:48:51 UTY6649 Import 1 DBS wait time/request: avg. 0.25, min. 0.11, max. 0.58
**** 09:48:51 UTY1803 Import processing statistics
.                                        IMPORT 1       Total thus far
.                                       =========       ==============
Candidate records considered:........        1000.......          1000
Apply conditions satisfied:..........        1000.......          1000
Records logged to error table:.......           0.......             0
Candidate records rejected:..........           0.......             0
** Statistics for Apply Label : LABELA
Type   Database   Table or Macro Name   Activity
I      CME        TBL2T                 1000
**** 09:48:52 UTY0821 Error table CME.tlnt2err is EMPTY, dropping table.
0022 .LOGOFF ;
========================================================================
=                                                                      =
=                         Logoff/Disconnect                            =
=                                                                      =
========================================================================
**** 09:49:00 UTY6216 The restart log table has been dropped.
**** 09:49:00 UTY6212 A successful disconnect was made from the RDBMS.
**** 09:49:00 UTY2410 Total processor time used = '0.791138 Seconds'
.        Start : 09:47:17 - MON JULY 16, 2007
.        End   : 09:49:00 - MON JULY 16, 2007
.        Highest return code encountered = '0'.
Restarted Upsert Example
This restarted upsert example uses two IMPORT clauses. The first one loads half of the
records from the data source into an empty table. The second one does an upsert using all the
records in the same data file. The result is that it updates the rows inserted during the first
import and inserts all of the rows that the first import skipped.
This script:
/***********************************************/
/* STEP01 CREATES THE TABLES FOR THE TPump JOB */
/***********************************************/
.LOGTABLE TLddNT13H;
.LOGON cs4400s3/wth,wth;
DROP TABLE TBL13TA;
DROP TABLE tlnt13err;
CREATE TABLE TBL13TA,FALLBACK
(ABYTEINT BYTEINT,
ASMALLINT SMALLINT,
AINTEGER INTEGER,
ADECIMAL DECIMAL (5,2),
ACHAR CHAR (5),
ABYTE BYTE(1),
AFLOAT FLOAT,
ADATE DATE)
UNIQUE PRIMARY INDEX (ASMALLINT);
/*****************************************************************/
/* BEGIN LOAD WITH ALL THE OPTIONS SPECIFIED SUCH AS ERRLIMIT,
*/
/* CHECKPOINT, SESSIONS,TENACITY
*/
/*****************************************************************/
.BEGIN LOAD
SESSIONS 1 1
PACK 10
CHECKPOINT 1
TENACITY 2
ERRLIMIT 50
ERRORTABLE tlnt13err;
.LAYOUT LAY1A;
/*.FILLER ATEST * BYTEINT;*/
.FIELD ABYTEINT * BYTEINT;
.FIELD ASMALLINT * SMALLINT KEY;
.FIELD AINTEGER * INTEGER;
.FIELD ADECIMAL * DECIMAL (5,2);
.FIELD ACHAR * CHAR (5);
.FIELD ABYTE * BYTE(1);
.FIELD AFLOAT * FLOAT;
.FIELD ADATE * DATE;
/* insert half of the rows ......................*/
.DML LABEL LABELA
IGNORE DUPLICATE ROWS
IGNORE MISSING ROWS
IGNORE EXTRA ROWS;
INSERT INTO TBL13TA VALUES
(:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,:ACHAR,:ABYTE,:AFLOAT,:ADATE);
/* ... and then upsert all of the rows ..........*/
.DML LABEL LABELB
IGNORE DUPLICATE ROWS
IGNORE MISSING ROWS
IGNORE EXTRA ROWS
DO INSERT FOR MISSING UPDATE ROWS;
UPDATE TBL13TA SET ADECIMAL = ADECIMAL + 1 WHERE ASMALLINT = :ASMALLINT;
INSERT INTO TBL13TA VALUES
(:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,:ACHAR,:ABYTE,:AFLOAT,:ADATE);
/* should result in an upsert with half inserts and half updates */
.IMPORT INFILE ./tlnt013.dat
LAYOUT LAY1A FROM 1 FOR 400
APPLY LABELA WHERE ABYTEINT = 1;
.IMPORT INFILE ./tlnt013.dat
LAYOUT LAY1A FROM 1 FOR 400
APPLY LABELB;
.END LOAD;
.LOGOFF;
produces the following output (assuming it was restarted during the second import):
0001 /***********************************************/
/* STEP01 CREATES THE TABLES FOR THE TPump JOB */
/***********************************************/
.LOGTABLE TLddNT13H;
0002 .LOGON cs4400s3/wth,;
**** 16:57:43 UTY8400 Teradata Database Release: 12.00.00.00
**** 16:57:43 UTY8400 Teradata Database Version: 12.00.00.00
**** 16:57:43 UTY8400 Default character set: ASCII
**** 16:57:43 UTY8400 Maximum supported buffer size: 1M
**** 16:57:43 UTY8400 Upsert supported by RDBMS server
**** 16:57:43 UTY6211 A successful connect was made to the RDBMS.
**** 16:57:43 UTY6210 Logtable 'WTH.TLddNT13H' indicates that a restart is in progress.
========================================================================
=                                                                      =
=                    Processing Control Statements                     =
=                                                                      =
========================================================================
0003 DROP TABLE TBL13TA;
**** 16:57:43 UTY1012 A restart is in progress. This request has already been executed.
The return code was: 0.
0004 DROP TABLE tlnt13err;
**** 16:57:43 UTY1011 A restart is in progress. This request has already been executed.
The return code was: 3807, accompanied by the following message text:
Table/view/trigger/procedure 'tlnt13err' does not exist.
0005 CREATE TABLE TBL13TA,FALLBACK
(ABYTEINT BYTEINT,
ASMALLINT SMALLINT,
AINTEGER INTEGER,
ADECIMAL DECIMAL (5,2),
ACHAR CHAR (5),
ABYTE BYTE(1),
AFLOAT FLOAT,
ADATE DATE)
UNIQUE PRIMARY INDEX (ASMALLINT);
**** 16:57:43 UTY1012 A restart is in progress. This request has already been executed.
The return code was: 0.
0006 /*****************************************************************/
/* BEGIN LOAD WITH ALL THE OPTIONS SPECIFIED SUCH AS ERRLIMIT,
*/
/* CHECKPOINT, SESSIONS,TENACITY
*/
/*****************************************************************/
.BEGIN LOAD
SESSIONS 1 1
PACK 10
CHECKPOINT 1
TENACITY 2
ERRLIMIT 50
ERRORTABLE tlnt13err;
========================================================================
=                                                                      =
=                     Processing TPump Statements                      =
=                                                                      =
========================================================================
0007 .LAYOUT LAY1A;
0008 /*.FILLER ATEST * BYTEINT;*/
.FIELD ABYTEINT * BYTEINT;
0009 .FIELD ASMALLINT * SMALLINT KEY;
0010 .FIELD AINTEGER * INTEGER;
0011 .FIELD ADECIMAL * DECIMAL (5,2);
0012 .FIELD ACHAR * CHAR (5);
0013 .FIELD ABYTE * BYTE(1);
0014 .FIELD AFLOAT * FLOAT;
0015 .FIELD ADATE * DATE;
0016 /* insert half of the rows ......................*/
.DML LABEL LABELA
IGNORE DUPLICATE ROWS
IGNORE MISSING ROWS
IGNORE EXTRA ROWS;
0017 INSERT INTO TBL13TA VALUES
(:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,:ACHAR,:ABYTE,:AFLOAT,:ADATE);
0018 /* ... and then upsert all of the rows ..........*/
.DML LABEL LABELB
IGNORE DUPLICATE ROWS
IGNORE MISSING ROWS
IGNORE EXTRA ROWS
DO INSERT FOR MISSING UPDATE ROWS;
0019 UPDATE TBL13TA SET ADECIMAL = ADECIMAL + 1 WHERE ASMALLINT = :ASMALLINT;
0020 INSERT INTO TBL13TA VALUES
(:ABYTEINT,:ASMALLINT,:AINTEGER,:ADECIMAL,:ACHAR,:ABYTE,:AFLOAT,:ADATE);
0021 /* should result in an upsert with half inserts and half updates */
.IMPORT INFILE ./tlnt013.dat
LAYOUT LAY1A FROM 1 FOR 400
APPLY LABELA WHERE ABYTEINT = 1;
0022 .IMPORT INFILE ./tlnt013.dat
LAYOUT LAY1A FROM 1 FOR 400
APPLY LABELB;
0023 .END LOAD;
**** 16:57:43 UTY6609 Starting to log on sessions...
**** 16:57:43 UTY6610 Logged on 1 sessions.
========================================================================
=                                                                      =
=                      TPump Import(s) Beginning                       =
=                                                                      =
========================================================================
**** 16:57:43 UTY6630 Options in effect for following TPump Import(s):
.        Tenacity:        2 hour limit to successfully connect load sessions.
.        Max Sessions:    1 session(s).
.        Min Sessions:    1 session(s).
.        Checkpoint:      1 minute(s).
.        Errlimit:        50 rejected record(s).
.        Restart Mode:    ROBUST.
.        Serialization:   ON.
.        Packing:         10 Statements per Request.
.        StartUp Rate:    UNLIMITED Statements per Minute.
**** 16:57:43 UTY6615 Processing complete for load 1, import 1.
**** 16:57:44 UTY6622 Restart recovery processing begins.
**** 16:57:44 UTY6623 Restart recovery processing complete.
**** 16:57:44 UTY8800 WARNING: Rate Monitoring turned off - no permission on macro:
TPumpMacro.ImportCreate.
Teradata Parallel Data Pump Reference
235
Appendix B: TPump Examples
Example Using the TABLE Command
**** 16:57:44 UTY6608 Import 2 begins.
**** 16:57:58 UTY6641 Since last chkpt., 370 recs. in, 370 stmts., 37 reqs
**** 16:57:58 UTY6647 Since last chkpt., avg. DBS wait time: 0.38
**** 16:57:58 UTY6612 Beginning final checkpoint...
**** 16:57:58 UTY6641 Since last chkpt., 370 recs. in, 370 stmts., 37 reqs
**** 16:57:58 UTY6647 Since last chkpt., avg. DBS wait time: 0.38
**** 16:57:59 UTY6607 Checkpoint Completes with 400 rows sent.
**** 16:57:59 UTY6642 Import 2 statements: 400, requests: 40
**** 16:57:59 UTY6643 Import 2 average statements per request: 10.00
**** 16:57:59 UTY6644 Import 2 average statements per record: 1.00
**** 16:57:59 UTY6645 Import 2 statements/session: avg. 400.00, min. 400.00, max. 400.00
**** 16:57:59 UTY6646 Import 2 requests/session: avg. 40.00, min. 40.00, max. 40.00
**** 16:57:59 UTY6648 Import 2 DBS wait time/session: avg. 15.00, min. 15.00, max. 15.00
**** 16:57:59 UTY6649 Import 2 DBS wait time/request: avg. 0.38, min. 0.38, max. 0.38
**** 16:57:59 UTY1803 Import processing statistics
     .                                      IMPORT 2       Total thus far
     .                                      =========      ==============
     Candidate records considered:........        400.......          800
     Apply conditions satisfied:..........        400.......          600
     Records logged to error table:.......          0.......            0
     Candidate records rejected:..........          0.......            0
** Statistics for Apply Label : LABELB
Type    Database    Table or Macro Name    Activity
U       WTH         TBL13TA                     200
I       WTH         TBL13TA                     200
**** 16:58:01 UTY0821 Error table WTH.tlnt13err is EMPTY, dropping table.
0024 .LOGOFF;
========================================================================
=                                                                      =
=                           Logoff/Disconnect                          =
=                                                                      =
========================================================================
**** 16:58:08 UTY6216 The restart log table has been dropped.
**** 16:58:08 UTY6212 A successful disconnect was made from the RDBMS.
**** 16:58:08 UTY2410 Total processor time used = '0.450648 Seconds'
     .    Start : 16:57:39 - MON JULY 16, 2007
     .    End   : 16:58:08 - MON JULY 16, 2007
Highest return code encountered = '0'.
Example Using the TABLE Command
This example script uses the TABLE command and the “INSERT <TABLENAME>.*” feature.
/***********************************************/
/* STEP01 CREATES THE TABLES FOR THE TPump JOB */
/***********************************************/
.LOGTABLE TLddNT10H;
.LOGON cs4400s3/wth,wth;
DROP TABLE TBL10T;
DROP TABLE TLNT10ERR;
CREATE TABLE TBL10T, FALLBACK
(RAND INTEGER,
ATIME INTEGER,
ASESS INTEGER)
UNIQUE PRIMARY INDEX (RAND);
/*****************************************************************/
/* BEGIN LOAD WITH ALL THE OPTIONS SPECIFIED SUCH AS ERRLIMIT,   */
/* CHECKPOINT, SESSIONS, TENACITY                                */
/*****************************************************************/
.BEGIN LOAD
SESSIONS 8 1
PACK 20
SERIALIZE ON
CHECKPOINT 1
TENACITY 2
ERRORTABLE TLNT10ERR;
.LAYOUT LAY1A;
.TABLE TBL10T;
.DML LABEL LABELA
MARK DUPLICATE ROWS
MARK MISSING ROWS
MARK EXTRA ROWS;
INSERT INTO TBL10T.*;
.IMPORT INFILE ./tlnt010.dat
LAYOUT LAY1A
APPLY LABELA FROM 1 FOR 111;
.END LOAD;
.LOGOFF;
produces the following results. In the results, notice that the output fields
generated by the TABLE command include the “KEY” modifier for the field coming
from the primary index of the table. This is what enables the use of the
“SERIALIZE” option:
0001 /***********************************************/
/* STEP01 CREATES THE TABLES FOR THE TPump JOB */
/***********************************************/
.LOGTABLE TLddNT10H;
0002 .LOGON cs4400s3/wth,;
**** 17:14:07 UTY8400 Teradata Database Release: 12.00.00.00
**** 17:14:07 UTY8400 Teradata Database Version: 12.00.00.00
**** 17:14:07 UTY8400 Default character set: ASCII
**** 17:14:07 UTY8400 Maximum supported buffer size: 1M
**** 17:14:07 UTY8400 Upsert supported by RDBMS server
**** 17:14:12 UTY6211 A successful connect was made to the RDBMS.
**** 17:14:12 UTY6217 Logtable 'WTH.TLddNT10H' has been created.
========================================================================
=                                                                      =
=                    Processing Control Statements                     =
=                                                                      =
========================================================================
0003 DROP TABLE TBL10T;
**** 17:14:13 UTY1016 'DROP' request successful.
0004 DROP TABLE TLNT10ERR;
**** 17:14:14 UTY1008 RDBMS failure: 3807, Table/view/trigger/procedure 'TLNT10ERR' does
not exist.
0005 CREATE TABLE TBL10T, FALLBACK
(RAND INTEGER,
ATIME INTEGER,
ASESS INTEGER)
UNIQUE PRIMARY INDEX (RAND);
**** 17:14:15 UTY1016 'CREATE' request successful.
0006 /*****************************************************************/
     /* BEGIN LOAD WITH ALL THE OPTIONS SPECIFIED SUCH AS ERRLIMIT,   */
     /* CHECKPOINT, SESSIONS, TENACITY                                */
     /*****************************************************************/
.BEGIN LOAD
SESSIONS 8 1
PACK 20
SERIALIZE ON
CHECKPOINT 1
TENACITY 2
ERRORTABLE TLNT10ERR;
========================================================================
=                                                                      =
=                     Processing TPump Statements                      =
=                                                                      =
========================================================================
0007 .LAYOUT LAY1A;
0008 .TABLE TBL10T;
**** 17:14:15 UTY6009 Fields generated by .TABLE command begin.
**** 17:14:15 UTY6010 *** .FIELD RAND * INTEGER KEY;
**** 17:14:15 UTY6010 *** .FIELD ATIME * INTEGER;
**** 17:14:15 UTY6010 *** .FIELD ASESS * INTEGER;
**** 17:14:15 UTY6011 Fields generated by .TABLE command end.
0009 .DML LABEL LABELA
     MARK DUPLICATE ROWS
     MARK MISSING ROWS
     MARK EXTRA ROWS;
0010 INSERT INTO TBL10T.*;
0011 .IMPORT INFILE ./tlnt010.dat
     LAYOUT LAY1A
     APPLY LABELA FROM 1 FOR 111;
0012 .END LOAD;
**** 17:14:15 UTY6609 Starting to log on sessions...
**** 17:14:16 UTY6610 Logged on 7 sessions.
========================================================================
=                                                                      =
=                       TPump Import(s) Beginning                      =
=                                                                      =
========================================================================
**** 17:14:16 UTY6630 Options in effect for following TPump Import(s):
     .  Tenacity:        2 hour limit to successfully connect load sessions.
     .  Max Sessions:    8 session(s).
     .  Min Sessions:    1 session(s).
     .  Checkpoint:      1 minute(s).
     .  Errlimit:        No limit in effect.
     .  Restart Mode:    ROBUST.
     .  Serialization:   ON.
     .  Packing:         20 Statements per Request.
     .  StartUp Rate:    UNLIMITED Statements per Minute.
**** 17:14:21 UTY8800 WARNING: Rate Monitoring turned off - no permission on macro:
     TPumpMacro.ImportCreate.
**** 17:14:21 UTY6608 Import 1 begins.
**** 17:14:24 UTY6641 Since last chkpt., 111 recs. in, 111 stmts., 7 reqs
**** 17:14:24 UTY6647 Since last chkpt., avg. DBS wait time: 0.43
**** 17:14:24 UTY6612 Beginning final checkpoint...
**** 17:14:24 UTY6641 Since last chkpt., 111 recs. in, 111 stmts., 7 reqs
**** 17:14:24 UTY6647 Since last chkpt., avg. DBS wait time: 0.43
**** 17:14:24 UTY6607 Checkpoint Completes with 111 rows sent.
**** 17:14:24 UTY6642 Import 1 statements: 111, requests: 7
**** 17:14:24 UTY6643 Import 1 average statements per request: 15.86
**** 17:14:24 UTY6644 Import 1 average statements per record: 1.00
**** 17:14:24 UTY6645 Import 1 statements/session: avg. 15.86, min. 14.00, max. 18.00
**** 17:14:24 UTY6646 Import 1 requests/session: avg. 1.00, min. 1.00, max. 1.00
**** 17:14:24 UTY6648 Import 1 DBS wait time/session: avg. 0.43, min. 0.00, max. 2.00
**** 17:14:24 UTY6649 Import 1 DBS wait time/request: avg. 0.43, min. 0.00, max. 2.00
**** 17:14:24 UTY1803 Import processing statistics
     .                                      IMPORT 1       Total thus far
     .                                      =========      ==============
     Candidate records considered:........        111.......          111
     Apply conditions satisfied:..........        111.......          111
     Records logged to error table:.......          0.......            0
     Candidate records rejected:..........          0.......            0
** Statistics for Apply Label : LABELA
Type    Database    Table or Macro Name    Activity
I       WTH         TBL10T                      111
**** 17:14:25 UTY0821 Error table WTH.TLNT10ERR is EMPTY, dropping table.
0013 .LOGOFF;
========================================================================
=                                                                      =
=                           Logoff/Disconnect                          =
=                                                                      =
========================================================================
**** 17:14:33 UTY6216 The restart log table has been dropped.
**** 17:14:33 UTY6212 A successful disconnect was made from the RDBMS.
**** 17:14:33 UTY2410 Total processor time used = '0.330475 Seconds'
     .    Start : 17:14:05 - MON JULY 16, 2007
     .    End   : 17:14:33 - MON JULY 16, 2007
Highest return code encountered = '0'.
APPENDIX C
INMOD and Notify Exit Routine
Examples
This appendix provides INMOD examples using:
•  COBOL Pass-Thru INMOD
•  Assembler INMOD
•  PL/I INMOD
•  C INMOD - UNIX
These examples contain MVS control statements. Each of these INMODs also works
under VM once the appropriate changes are made to convert the JCL to REXX.
Workstation-based clients support only INMODs written in C; an example of this is also
provided in this appendix.
COBOL INMOD
//DBCCB1 JOB 1,'DBC',MSGCLASS=A,NOTIFY=DBC,CLASS=B,REGION=4096K
//COBCOMPL EXEC COBUCL
//COB.SYSIN DD *
IDENTIFICATION DIVISION.
PROGRAM-ID. DYNAMN.
AUTHOR. JCK.
INSTALLATION. TERADATA.
DATE-WRITTEN.
DATE-COMPILED.
SECURITY. OPEN.
REMARKS.
THIS PROGRAM IS A COBOL INMOD ROUTINE FOR TPUMP.
FUNCTION: THIS PROGRAM READS AND RETURNS A RECORD
OF 80 BYTES LONG VIA STRUCT-1 AND STRUCT-2.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER. IBM-370.
OBJECT-COMPUTER. IBM-370.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
SELECT INMOD-DATA-FILE ASSIGN TO SYSIN-INDATA.
DATA DIVISION.
FILE SECTION.
FD INMOD-DATA-FILE
BLOCK CONTAINS 0 RECORDS
LABEL RECORDS STANDARD.
01  INPUT-PARM-AREA           PICTURE IS X(80).
WORKING-STORAGE SECTION.
01  NUMIN                     PICTURE S9(4) COMP VALUE +0.
01  NUMOUT                    PICTURE S9(4) COMP VALUE +0.
LINKAGE SECTION.
*TPump COMMUNICATES WITH INMOD VIA STRUCT-1 AND STRUCT-2.
01  STRUCT-1.
    02 RETURN-INDICATE        PIC S9(9) COMP.
    02 RECORD-LEN             PIC S9(9) COMP.
    02 RECORD-BODY.
       03 DATA-AREA1          PIC X(80).
01  STRUCT-2.
    02 SEQ-NUMBER             PIC S9(9) COMP.
    02 PARM-LIST.
       05 PARM-LENTH          PIC X(2).
       05 PARM-STRING         PIC X(80).
PROCEDURE DIVISION USING STRUCT-1, STRUCT-2.
BEGIN.
MAIN.
    DISPLAY "=============================================="
    DISPLAY STRUCT-1.
    DISPLAY STRUCT-2.
    IF RETURN-INDICATE = 0 THEN
*       INMOD INITIALIZATION - OPEN FILE AND READ THE 1ST REC.
        DISPLAY "INMOD CALLED - RETURN CODE 0 "
        PERFORM OPEN-FILES
        PERFORM READ-RECORDS
        GOBACK
    ELSE
    IF RETURN-INDICATE = 1 THEN
*       READ A RECORD.
        DISPLAY "INMOD CALLED - RETURN CODE 1 "
        PERFORM READ-RECORDS
        GOBACK
    ELSE
    IF RETURN-INDICATE = 5 THEN
*       CLOSE INMOD - JUST SEND RETURN CODE = 0
        DISPLAY "INMOD CALLED - RETURN CODE 5 "
        MOVE 0 TO RECORD-LEN
        MOVE 0 TO RETURN-INDICATE
        GOBACK
    ELSE
*       UNKNOWN CODE.
        DISPLAY "INMOD CALLED - RETURN CODE X "
        MOVE 0 TO RECORD-LEN
        MOVE 16 TO RETURN-INDICATE
        GOBACK.
OPEN-FILES.
    OPEN INPUT INMOD-DATA-FILE.
    MOVE 0 TO RETURN-INDICATE.
READ-RECORDS.
    READ INMOD-DATA-FILE INTO DATA-AREA1
        AT END GO TO END-DATA.
    ADD 1 TO NUMIN.
    MOVE 80 TO RECORD-LEN.
    MOVE 0 TO RETURN-INDICATE.
    ADD 1 TO NUMOUT.
END-DATA.
    CLOSE INMOD-DATA-FILE.
    DISPLAY "NUMBER OF INPUT RECORDS = " NUMIN.
    DISPLAY "NUMBER OF OUTPUT RECORDS = " NUMOUT.
    MOVE 0 TO RECORD-LEN.
    MOVE 0 TO RETURN-INDICATE.
GOBACK.
/*
//LKED.SYSLMOD DD DSN=JCK.INMOD.LOAD(INMODG1),DISP=MOD
//LKED.SYSIN DD *
ENTRY DYNAMN
NAME INMODG1(R)
/*
//******************************************************************
//* NEXT 3 STEPS PREPARE TERADATA RDBMS FOR THE TPump'S INMOD TEST *
//******************************************************************
//TPUMPDEL EXEC PGM=IEFBR14
//TPUMPLOG DD DSN=JCK.INMOD.TDQ8.TPumpLOG,
//         DISP=(MOD,DELETE),UNIT=SYSDA,SPACE=(TRK,0)
//TPUMPCAT EXEC PGM=TPUMP
//SYSPRINT DD SYSOUT=*
//TPumpLOG DD DSN=JCK.INMOD.TDQ8.TPumpLOG,DISP=(NEW,CATLG),
//         UNIT=SYSDA,DCB=(RECFM=F,DSORG=PS,LRECL=8244),
//         SPACE=(8244,(12,5))
//SYSIN    DD *
//*******************************************************************
//* THIS STEP WILL ONLY DROP THE TABLES IF TPump NOT IN APPLY PHASE *
//*******************************************************************
//CREATE   EXEC BTEQ
//STEPLIB  DD DSN=STV.GG00.APP.L,DISP=SHR
//         DD DSN=STV.TG00.APP.L,DISP=SHR
//         DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSABEND DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGON TDQ8/DBC,DBC;
RELEASE TPump XXXX.INMODCB1;
.IF ERRORCODE = 2572 THEN .GOTO NODROP;
DROP TABLE XXXX.LOGTABLE;
DROP TABLE XXXX.ET_INMODCB1;
DROP TABLE XXXX.UV_INMODCB1;
DROP TABLE XXXX.WT_INMODCB1;
.QUIT;
.LABEL NODROP;
.EXIT 4;
DROP USER XXXX;
##
//*****************************************************************
//*                                                               *
//*                         RUN TPump                             *
//*                                                               *
//*****************************************************************
//LOADIT   EXEC PGM=TPump
//STEPLIB  DD DISP=SHR,DSN=JCK.INMOD.LOAD
//SYSPRINT DD SYSOUT=*
//SYSTERM  DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGTABLE XXXX.LOGTABLE;
.LOGON TDQ8/XXXX,XXXX;
/* TEST DATAIN, DATALOC */
DROP TABLE XXXX.INMODCB1;
CREATE TABLE INMODCB1 (F1 CHAR(10), F2 CHAR(70));
.BEGIN IMPORT TPump TABLES INMODCB1;
.Layout layname1;
Teradata Parallel Data Pump Reference
243
Appendix C: INMOD and Notify Exit Routine Examples
COBOL Pass-Thru INMOD
.Field L1Fld1 1 Char(10);
.Field L1Fld2 * Char(70);
.DML Label DML1;
INSERT INMODCB1(F1,F2) VALUES (:L1FLD1, :L1FLD2);
.IMPORT INMOD INMODG1 USING ("AAA" "BBB") LAYOUT LAYNAME1 APPLY DML1;
.End LOAD;
.LOGOFF;
##
//INDATA DD DATA,DLM=##
01COBOL1 AAAAAAAAAAAAAAAA
02COBOL1 BBBBBBBBBBBBBBBB
03COBOL1 CCCCCCCCCCCCCCCC
04COBOL1 DDDDDDDDDDDDDDDD
##
//SELECT   EXEC BTEQ
//STEPLIB  DD DSN=STV.GG00.APP.L,DISP=SHR
//         DD DSN=STV.TG00.APP.L,DISP=SHR
//         DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSABEND DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGON TDQ8/XXXX,XXXX;
SELECT * FROM INMODCB1;
.LOGOFF;
##
//
COBOL Pass-Thru INMOD
IDENTIFICATION DIVISION.
PROGRAM-ID. INMOD2.
AUTHOR. STV.
INSTALLATION. TERADATA.
DATE-WRITTEN.
DATE-COMPILED.
SECURITY. OPEN.
REMARKS.
THIS PROGRAM IS AN EXAMPLE OF A COBOL INMOD ROUTINE
WHICH RECEIVES A RECORD FROM TPump THEN MODIFIES OR
REJECTS IT.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER. IBM-370.
OBJECT-COMPUTER. IBM-370.
DATA DIVISION.
WORKING-STORAGE SECTION.
01  COUNTROWS                 PICTURE S9(4) COMP VALUE +0.
01  REJROWS                   PICTURE S9(4) COMP VALUE +0.
01  INSROWS                   PICTURE S9(4) COMP VALUE +0.
01  I                         PICTURE S9(4) COMP.
01  MATCHFLAG                 PIC 9.
    88 NOTMATCH               VALUE 0.
    88 MATCH                  VALUE 1.
LINKAGE SECTION.
01  STRUCT-1.
    02 RETURN-INDICATE        PIC S9(9) COMP.
    02 RECORD-LEN             PIC S9(9) COMP.
    02 RECORD-BODY OCCURS 80 TIMES.
       03 DATA-AREA1          PIC X.
01  STRUCT-2.
    02 SEQ-NUMBER             PIC S9(9) COMP.
    02 PARM-LIST.
       05 PARM-LENGTH         PIC S9(4) COMP.
       05 PARM-STRING OCCURS 80 TIMES.
          07 PARM-DATA        PIC X.
PROCEDURE DIVISION USING STRUCT-1, STRUCT-2.
BEGIN.
MAIN.
    DISPLAY "================================================"
    IF RETURN-INDICATE = 6 THEN
        DISPLAY "INMOD2 CALLED - RETURN CODE 6 "
        PERFORM INITIALIZE
        GOBACK
    ELSE
    IF RETURN-INDICATE = 7 THEN
        DISPLAY "INMOD2 CALLED - RETURN CODE 7 "
        PERFORM PROCESS-RECORD
        GOBACK
    ELSE
    IF RETURN-INDICATE = 5 THEN
        DISPLAY "INMOD2 CALLED - RETURN CODE 5 "
        PERFORM FINALIZE
        GOBACK
    ELSE
        DISPLAY "BLKEXIT CALLED - RETURN CODE X "
        MOVE 0 TO RETURN-INDICATE.
    GOBACK.
INITIALIZE.
    MOVE 0 TO COUNTROWS INSROWS REJROWS.
    MOVE 0 TO RETURN-INDICATE.
PROCESS-RECORD.
    ADD 1 TO COUNTROWS.
    MOVE 0 TO RETURN-INDICATE.
    MOVE 1 TO I.
    MOVE 1 TO MATCHFLAG.
    PERFORM COMPARE UNTIL (I > PARM-LENGTH) OR (NOTMATCH).
    IF NOTMATCH THEN
        DISPLAY "REJECTED"
        ADD 1 TO REJROWS
        MOVE 0 TO RECORD-LEN
    ELSE
        DISPLAY "ACCEPTED"
        ADD 1 TO INSROWS.
COMPARE.
    IF (RECORD-BODY(I) = PARM-STRING(I)) THEN
        NEXT SENTENCE
    ELSE
        MOVE 0 TO MATCHFLAG.
    ADD 1 TO I.
FINALIZE.
    MOVE 0 TO RETURN-INDICATE.
    DISPLAY "NUMBER OF TOTAL RECORDS    = " COUNTROWS.
    DISPLAY "NUMBER OF REJECTED RECORDS = " REJROWS.
    DISPLAY "NUMBER OF ACCEPTED RECORDS = " INSROWS.
    GOBACK.
Assembler INMOD
//JCKAS1 JOB 1,'JCK',MSGCLASS=A,NOTIFY=JCK,CLASS=B,
// REGION=4096K
//*****************************************************************
//*                                                               *
//*      IDENTIFY NECESSARY LOAD LIBRARIES FOR RELEASE            *
//*                                                               *
//*****************************************************************
//JOBLIB   DD DISP=SHR,DSN=STV.GG10.APP.L
//         DD DISP=SHR,DSN=STV.GG00.APP.L
//         DD DISP=SHR,DSN=STV.TG00.APP.L
//         DD DISP=SHR,DSN=STV.RG00.APP.L
//         DD DISP=SHR,DSN=TER2.SASC301H.LINKLIB
//ASMFCL   EXEC ASMFCL
//ASM.SYSIN DD *
DYNAMN   TITLE '-- CONCATENATE INPUT RECORDS FOR INPUT TO TPump'
DYNAMN   CSECT
         USING DYNAMN,15
*******************************************************************
*   THIS PROGRAM IS CALLED BY THE TERADATA TPump PROGRAM          *
*   TO OBTAIN A RECORD TO BE USED TO INSERT, UPDATE, OR           *
*   DELETE ROWS OF A TARGET TABLE.                                *
*                                                                 *
*   THIS PROGRAM IS NOT REENTRANT.                                *
*                                                                 *
*   FUNCTION:                                                     *
*   READ AN INPUT RECORD AND ADD A FOUR-BYTE INTEGER FIELD TO     *
*   THE FRONT OF THE RECORD. THE NEW FIELD WILL CONTAIN           *
*   A SEQUENCE NUMBER WHICH RANGES FROM 1 TO ...                  *
*   NUMBER-OF-INPUT-RECORDS.                                      *
*                                                                 *
*   RETURN TO THE CALLER (TPump) INDICATING                       *
*   EITHER MORE RECORDS ARE AVAILABLE OR NO MORE RECORDS          *
*   ARE TO BE PROCESSED.                                          *
*                                                                 *
* THIS INMOD PROGRAM CAN BE USED TO ENSURE UNIQUE RECORDS         *
* IN CERTAIN APPLICATIONS, THE SEQUENCE FIELD                     *
*   CAN BE USED FOR "DATA SAMPLING".                              *
*                                                                 *
*   DDNAME OF THE INPUT DATA SET: "INDATA"                        *
*                                                                 *
*******************************************************************
         B     STOREGS                BRANCH AROUND EP
         DC    AL1(31)                DEFINE EP LENGTH
         DC    CL9'DYNAMN '           DEFINE
         DC    CL9'&SYSDATE'          ENTRY
         DC    CL8'    VM '           POINT
         DC    CL5'&SYSTIME'          IDENTIFIER
*******************************************************************
*   SAVE REGISTERS                                                *
*******************************************************************
STOREGS  DS    0H                     DEFINE AND ALIGN SYMBOL
         STM   R14,R12,12(R13)        STORE OFF CALLER'S REGISTERS
         LR    R12,R15                COPY BASE ADDRESS
         DROP  R15                    DROP VOLATILE BASE REGISTER
         USING DYNAMN,R12             ESTAB PERM CSECT ADDRBLTY
         LA    R14,SAVEAREA           POINT AT LOCAL SAVE WORK
         ST    R14,8(,R13)            STORE FWD LINK IN SA CHAIN
         ST    R13,4(,R14)            STORE BWD LINK IN SA CHAIN
         LR    R13,R14                COPY LOCAL SAVE/WORK AREA ADDR
         L     R11,0(,R1)             POINT TO PARM
         SPACE 1
*******************************************************************
*   OPEN "DATA" DATA SET                                          *
*   (ONLY THE FIRST TIME)                                         *
*******************************************************************
         USING PREBUF,R11             COVER PRE-PROC AREA
         LA    R9,PREREC              POINT TO START OF PREPROC. DATA
         OC    PRECODE,PRECODE        FIRST ENTRY ? (0=FIRST ENTRY)
         BNZ   NOOPEN                 NO, SKIP OPEN
         USING IHADCB,R10             YES,COVER DCB FOR OPEN
         LA    R10,INDATA             POINT TO DATA DCB
         OPEN  INDATA                 OPEN INPUT DATA SET
         TM    DCBOFLGS,X'10'         DID IT OPEN ?
         BO    OPENOK                 YES,
         WTO   'UNABLE TO OPEN INDATA DATA SET',ROUTCDE=11
         B     BADRET                 RETURN WITH ERROR CODE
*******************************************************************
*   CHECK TPump STATUS CODES                                      *
*   0 = FIRST ENTRY          (TPump EXPECTS TO RECEIVE A RECORD)  *
*   1 = GET NEXT RECORD      (TPump EXPECTS TO RECEIVE A RECORD)  *
*   2 = CLIENT RESTART CALL  (TPump DOES NOT EXPECT A RECORD)     *
*   3 = CHECKPOINT CALL      (TPump DOES NOT EXPECT A RECORD)     *
*   4 = RESTART CALL         (TPump DOES NOT EXPECT A RECORD)     *
*   5 = CLOSE INMOD          (TPump DOES NOT EXPECT A RECORD)     *
*                                                                 *
*   NOTE: CODES 2,3 AND 4 ARE NOT HANDLED BY THIS PROGRAM         *
*                                                                 *
*******************************************************************
OPENOK   DS    0H
NOOPEN   L     R15,PRECODE            CHECK ON CODE FROM TPump
         C     R15,=F'1'              NEED RECORD ?
         BH    NOREC                  NO , DO NOT "GET" A RECORD
         L     R15,SAMPNUM            GET CURRENT SAMPLE NUM.
         LA    R15,1(R15)             INCR BY 1
         ST    R15,0(R9)              STORE AT FRONT OF RECORD
         ST    R15,SAMPNUM            RESET COUNTER
         LA    R9,4(R9)               ADVANCE FOR READ ADDR.
         LA    R10,INDATA             COVER INDATA DCB
GETNEXT  GET   INDATA,(R9)            READ A RECORD
INCREC   LH    R9,DCBLRECL            GET RECORD LENGTH
         AH    R9,=H'4'               ADD 4 FOR NEW FIELD
         SR    R15,R15                SET RETURN CODE VALUE
RETURN   ST    R9,PRELEN              SET LENGTH (ZERO AFTER EOF)
         ST    R15,PRECODE
         L     R13,4(R13)
         RETURN (14,12),RC=0          RETURN
         SPACE 5
*******************************************************************
*   EOF ENTERED AT END-OF-FILE                                    *
*******************************************************************
*
EOF      CLOSE INDATA               CLOSE INPUT DATA SET
*
*******************************************************************
NOREC    SR    R15,R15                SET ZERO RETURN CODE
         SR    R9,R9                  SET ZERO LENGTH
         B     RETURN                 RETURN
*
BADRET   LA    R15,16                 SET RETURN CODE FOR ERROR
         SR    R9,R9                  SET LENGTH = 0
         B     RETURN                 ERROR RETURN
         EJECT
*
*   CONSTANTS
*
         REGEQU
R0       EQU   0
R1       EQU   1
R2       EQU   2
R3       EQU   3
R4       EQU   4
R5       EQU   5
R6       EQU   6
R7       EQU   7
R8       EQU   8
R9       EQU   9
R10      EQU   10
R11      EQU   11
R12      EQU   12
R13      EQU   13
R14      EQU   14
R15      EQU   15
         EJECT
*
*   DATA STRUCTURES AND VARIABLES
*
         SPACE 1
SAVEAREA DC    9D'0'                  SAVE AREA
SAMPNUM  DC    F'0'
         SPACE 10
INDATA   DCB   DDNAME=INDATA,MACRF=(GM),DSORG=PS,EODAD=EOF
PREBUF   DSECT
PRECODE  DS    F
PRELEN   DS    F
PREREC   DS    0XL31000
         DCBD  DEVD=DA,DSORG=PS
PREPRM   DSECT
PRESEQ   DS    F
PREPRML  DS    H
PREPRMS  DS    CL80
         END
//LKED.SYSLMOD DD DSN=JCK.INMOD.LOAD(INMODG1),DISP=MOD,UNIT=3380,
//         VOLUME=SER=TSO805
//LKED.SYSIN DD *
  ENTRY DYNAMN
  NAME INMODG1(R)
/*
//TPUMPDEL EXEC PGM=IEFBR14
//TPUMPLOG DD DSN=JCK.INMOD.TDQ8.TPumpLOG,
//         DISP=(MOD,DELETE),UNIT=SYSDA,SPACE=(TRK,0)
//TPUMPCAT EXEC PGM=TPUMP
//STEPLIB  DD DSN=STV.GG00.APP.L,DISP=SHR
//         DD DSN=STV.TG00.APP.L,DISP=SHR
//         DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=*
//TPUMPLOG DD DSN=JCK.INMOD.TDQ8.TPumpLOG,DISP=(NEW,CATLG),
//         UNIT=SYSDA,DCB=(RECFM=F,DSORG=PS,LRECL=8244),
//         SPACE=(8244,(12,5))
//SYSIN    DD *
//**********************************************************************
//* THIS STEP WILL ONLY DROP THE TABLES IF TPump IS NOT IN APPLY PHASE *
//**********************************************************************
//CREATE   EXEC BTEQ
//STEPLIB  DD DSN=STV.GG00.APP.L,DISP=SHR
//         DD DSN=STV.TG00.APP.L,DISP=SHR
//         DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=A
//SYSABEND DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGON TDQ8/DBC,DBC;
RELEASE TPump XXXX.INMODAS1;
.IF ERRORCODE = 2572 THEN .GOTO NODROP;
DROP TABLE XXXX.LOGTABLE;
DROP TABLE XXXX.ET_INMODAS1;
DROP TABLE XXXX.UV_INMODAS1;
DROP TABLE XXXX.WT_INMODAS1;
.QUIT;
.LABEL NODROP;
.EXIT 4;
##
//*****************************************************************
//*                                                               *
//*                         RUN TPump                             *
//*                                                               *
//*****************************************************************
//LOADIT   EXEC PGM=TPump
//STEPLIB  DD DISP=SHR,DSN=STV.GG10.APP.L
//         DD DISP=SHR,DSN=STV.GG00.APP.L
//         DD DISP=SHR,DSN=STV.TG00.APP.L
//         DD DISP=SHR,DSN=STV.RG00.APP.L
//         DD DISP=SHR,DSN=TER2.SASC301H.LINKLIB
//         DD DISP=SHR,DSN=JCK.INMOD.LOAD,VOLUME=SER=TSO805,UNIT=3380
//SYSPRINT DD SYSOUT=*
//SYSTERM  DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGTABLE XXXX.LOGTABLE;
.LOGON TDQ8/XXXX,XXXX;
/* TEST DATAIN, DATALOC */
DROP TABLE XXXX.INMODAS1;
CREATE TABLE INMODAS1 (F1 CHAR(10), F2 CHAR(70));
.BEGIN IMPORT TPump TABLES INMODAS1;
.Layout layname1;
.FIELD L1FLD0 1 CHAR(4);
.FIELD L1FLD1 * CHAR(10);
.Field L1Fld2 * Char(70);
.DML Label DML1;
INSERT INMODAS1(F1,F2) VALUES (:L1FLD1, :L1FLD2);
.IMPORT INMOD INMODG1 USING ("AAA" "BBB") LAYOUT LAYNAME1 APPLY DML1;
.End LOAD;
.LOGOFF;
##
//INDATA DD DATA,DLM=##
01ASSEMBLEAAAAAAAAAAAAAAAA
02ASSEMBLEBBBBBBBBBBBBBBBB
03ASSEMBLECCCCCCCCCCCCCCCC
04ASSEMBLEDDDDDDDDDDDDDDDD
##
//SELECT   EXEC BTEQ
//STEPLIB  DD DSN=STV.GG00.APP.L,DISP=SHR
//         DD DSN=STV.TG00.APP.L,DISP=SHR
//         DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=A
//SYSABEND DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGON TDQ8/XXXX,XXXX;
SELECT * FROM INMODAS1;
.LOGOFF;
##
//
PL/I INMOD
//SFDPL2 JOB (22150000),'SFD',MSGCLASS=A,CLASS=B,
// REGION=4096K
//*****************************************************************
//*                                                               *
//*      IDENTIFY NECESSARY LOAD LIBRARIES FOR RELEASE            *
//*                                                               *
//*****************************************************************
//JOBLIB   DD DSN=STV.RG20.APPLOAD,DISP=SHR
//         DD DSN=STV.EG14MLL1.APP.L,DISP=SHR
//         DD DSN=STV.TG13BLD.APP.L,DISP=SHR
//         DD DSN=TER2.SASC450F.LINKLIB,DISP=SHR
//STEP1    EXEC ASMFC
//ASM.SYSGO DD DSN=&&LOADSET1,DISP=(MOD,PASS),UNIT=VIO,
//         SPACE=(880,(500,100),,,ROUND)
//ASM.SYSIN DD *
PLIA     TITLE 'TPump INTERFACE TO PL/I EXIT ROUTINE'
DYNAMN   CSECT
         CNOP  0,4
         B     START-*(,R15)          BRANCH AROUND CONSTANTS
         DC    AL1(L'PLIAFLAG)        LENGTH OF CONSTANTS
PLIAFLAG DC    C'ASSEMBLED AT &SYSTIME ON &SYSDATE.. BLKPLIA'
*-----------------------------------------------------------------*
*                             G1_01                               *
*                                                                 *
* ON ENTRY: R1 -> PARAMETER LIST                                  *
*     PARM 1 -> MULTI-FIELD RECORD                                *
*       FIELD 1: COMMAND CODE/RETURN CODE                         *
*                (32 BIT INTEGER)                                 *
*                0 = INITIAL CALL                                 *
*                1 = RECORD CALL                                  *
*                2 = HOST RESTART - ALSO INITIAL CALL             *
*                3 = CHECKPOINT                                   *
*                4 = DBC RESTART                                  *
*                5 = FINAL CALL                                   *
*                6 = W/ INFILE - ALSO INITIAL CALL                *
*                7 = RECEIVE RECORD CALL                          *
*       FIELD 2: DATA RECORD LENGTH                               *
*                (32 BIT INTEGER)                                 *
*       FIELD 3: DATA RECORD                                      *
*                (UP TO 31K BYTES)                                *
*     PARM 2 -> EXIT ROUTINE WORK WORD                            *
*                (32 BIT INTEGER)                                 *
*                                                                 *
* OPERATION:                                                      *
*   INITIAL CALL:                                                 *
*     1) BULK LOADER LOADS THIS MODULE AND CALLS                  *
*     2) BLKPLIA (THIS PROGRAM) WHICH CALLS                       *
*     3) BLKPLI (PL/I PROGRAM TO ESTABLISH PL/I ENVIRONMENT)      *
*        WHICH CALLS                                              *
*     4) BLKASM (ENTRY POINT IN THE PROGRAM) WHICH CALLS          *
*     5) BLKEXIT (USER EXIT PROGRAM IN PL/I).                     *
*   UPON RETURN:                                                  *
*     1) BLKEXIT RETURNS TO                                       *
*     2) BLKASM WHICH PERFORMS MAGIC AND RETURNS DIRECTLY TO      *
*     3) BULK LOADER, THEREBY PRESERVING THE PL/I ENVIRONMENT     *
*        FOR SUBSEQUENT CALLS.                                    *
*   RECORD CALL:                                                  *
*     1) BULK LOADER CALLS                                        *
*     2) BLKPLIA WHICH REVERSES THE MAGIC AND BRANCHES INTO       *
*     3) BLKASM WHICH CALLS                                       *
*     4) BLKEXIT WITH THE PL/I ENVIRONMENT SAVED BEFORE.          *
*   UPON RETURN:                                                  *
*     1) BLKEXIT RETURNS TO                                       *
*     2) BLKASM WHICH PERFORMS MAGIC AND RETURNS DIRECTLY TO      *
*     3) BULK LOADER, THEREBY PRESERVING THE PL/I ENVIRONMENT     *
*        FOR SUBSEQUENT CALLS.                                    *
*-----------------------------------------------------------------*
START    SAVE  (14,12)
         LR    R11,R15                -> PROGRAM ENTRY POINT
         USING DYNAMN,R11
         L     R2,4(,R1)              -> EXIT ROUTINE WORD
         L     R3,0(,R1)              -> COMMAND WORD
         L     R3,0(,R3)              COMMAND WORD
         CH    R3,=H'0'               INITIAL CALL?
         BE    INITCALL               YES , DO INITIAL CODE
         CH    R3,=H'6'               INITIAL CALL?
         BE    INITCALL               YES , DO INITIAL CODE
         CH    R3,=H'2'               INITIAL CALL?
         BNE   CALLPGM                NO, JUST GO CALL PROGRAM
*-----------------------------------------------------------------*
*   SETUP WORK AREA AND PL/I ENVIRONMENT                          *
*-----------------------------------------------------------------*
INITCALL LA    R0,WORKALEN
         SR    R1,R1
         L     R15,=V(DBCMEM)
         BALR  R14,R15
         ST    R1,0(,R2)              SAVE WORKAREA ADDRESS
         ST    R1,WORKADDR            SAVE WORKAREA ADDRESS
         LR    R10,R1                 -> CURRENT WORK AREA
         USING WORKAREA,R10
         SPIE  MF=(E,NOSPIE)          CLEAR PASCAL INTERRUPT EXIT
         ST    R1,SPIEPAS             SAVE PASCAL SPIE
         MVC   WRKFLAG,WRKFLAGP       IDENTIFY WORK AREA
         XC    SAVE1(12),SAVE1        CLEAR START OF SAVEAREA
         LA    R1,SAVE1               INITIAL PROGRAM SAVE AREA
         ST    R13,4(,R1)             BACK CHAIN SAVE AREAS
         ST    R1,8(,R13)             FORW CHAIN SAVE AREAS
         LR    R13,R1                 -> NEW SAVE AREA
         ST    R3,COMMAND             KEEP COMMAND FOR LATER
         LA    R1,PLIPARM             -> STARTUP PARAMETERS
         L     R15,=V(PLISTART)       PL/I SETUP ENTRY POINT
         BALR  R14,R15                CALL PL/I SETUP PROGRAM
*-----------------------------------------------------------------*
*   FINAL RETURN FROM PL/I: FREE WORKAREA AND RETURN              *
*-----------------------------------------------------------------*
         L     R1,SPIEPAS             -> PASCALVS SPIE
         SPIE  MF=(E,(1))             RESTORE PASCALVS SPIE
         L     R13,4(,R13)            BACK UP SAVE AREA CHAIN
         LR    R1,R10
         LA    R0,WORKALEN
         L     R15,=V(DBCMEM)
         BALR  R14,R15
         DROP  R10                    WORKAREA
RETURN   XR    R15,R15                INDICATE ALL IS WELL
         ST    R15,16(,R13)           SET CALLER'S RETURN CODE
         RETURN (14,12)               RETURN TO CALLER
*-----------------------------------------------------------------*
*   REESTABLISH PL/I ENVIRONMENT AND CALL USER                    *
*-----------------------------------------------------------------*
ALLPGM   L     R10,0(R2)              -> WORK AREA
CALLPGM  L     R10,WORKADDR           -> WORK AREA
         USING WORKAREA,R10
         ST    R3,COMMAND             KEEP COMMAND FOR LATER
         LR    R3,R1                  SAVE -> PARMS FOR LATER
         LA    R1,SAVE1               -> BLKPLIA SAVE AREA
         ST    R13,4(,R1)             REBUILD BACK CHAIN
         ST    R1,8(,R13)             REBUILD FORW CHAIN
         LM    R12,R13,SAVE2          REESTABLISH PL/I ENVIRONMENT
         B     AGAIN                  CALL EXIT ROUTINE
         DROP  R10                    WORKAREA
         DROP  R11                    BLKPLIA
*-----------------------------------------------------------------*
*   PL/I CALLS HERE WITH CORRECT ENVIRONMENT                      *
*-----------------------------------------------------------------*
         ENTRY BLKASM
BLKASM   B     ASMSTART-*(,R15)       BRANCH AROUND CONSTANTS
         DC    AL1(L'ASMFLAG)         LENGTH OF CONSTANTS
ASMFLAG  DC    C'BLKASM'
ASMSTART SAVE  (14,12)                SAVE BLKPLI REGISTERS
         LR    R11,R15                ADDRESS PROGRAM
         USING BLKASM,R11
         SPIE  MF=(E,NOSPIE)          CLEAR PASCAL INTERRUPT EXIT
         LR    R4,R1                  HOLD PL/I SPIE FOR LATER
*-----------------------------------------------------------------*
*   PREPARE PROPER PL/I DSA FOR FURTHER MURGLING                  *
*-----------------------------------------------------------------*
         LA    R0,88                  LENGTH OF NEW DSA
         L     R1,76(,R13)            -> FIRST AVAILABLE STORAGE
         ALR   R0,R1                  -> POSSIBLE END + 1
         CL    R0,12(,R12)            ENOUGH ROOM FOR NEW DSA?
         BNH   ENOUGH                 YES, GO USE IT
         L     R15,116(,R12)          NO, POINT TO OVERFLOW ROUTINE
         BALR  R14,R15                AND CALL IT
ENOUGH   ST    R0,76(,R1)             NEW FIRST AVAILABLE STORAGE
         ST    R13,4(,R1)             BACK CHAIN SAVE AREAS
         MVC   72(4,R1),72(R13)       COPY LIB WORKSPACE ADDRESS
         LR    R13,R1                 ADDRESS NEW DSA
         MVI   0(R13),X'80'           SET FLAGS IN DSA TO
         MVI   1(R13),X'00'           PRESERVE PL/I
         MVI   86(R13),X'91'          ERROR HANDLING
         MVI   87(R13),X'C0'          IN THE ASSEMBLER ROUTINE
*-----------------------------------------------------------------*
*   CALL USER PL/I ROUTINE WITH ORIGINAL BULK PARMS               *
*-----------------------------------------------------------------*
         L     R2,4(,R13)             -> REGISTERS TO BLKASM
         L     R2,4(,R2)              -> PREVIOUS REGISTERS
         L     R2,4(,R2)              -> REGISTERS TO BLKPLI
         L     R2,4(,R2)              -> REGISTERS TO BLKPLIA
         L     R3,24(,R2)             -> PARMS TO BLKPLIA
         L     R1,4(,R3)              -> EXIT ROUTINE WORD
         L     R10,0(,R1)             -> WORK AREA
         L     R10,WORKADDR           -> WORK AREA .G1_01.
         USING WORKAREA,R10
         CLC   WRKFLAG,WRKFLAGP       DID IT WORK?
         BE    GOODWRK                YES, USE IT
         ABEND 1,DUMP                 NO, ABEND RIGHT HERE
GOODWRK  STM   R12,R13,SAVE2          SAVE PL/I ENVIRONMENT
         ST    R4,SPIEPLI             SAVE PL/I SPIE
         L     R11,16(,R2)            -> BLKPLIA ENTRY POINT
         DROP  R11                    BLKASM
         USING DYNAMN,R11
AGAIN    L     R1,SPIEPLI             -> PLI SPIE
         SPIE  MF=(E,(1))             RESTORE PL/I SPIE AGAIN
         XR    R5,R5                  MUST BE ZERO CALLING PL/I
         OI    4(R3),X'80'            LAST PARAMETER .G1_01.
         LR    R1,R3                  RESTORE ORIGINAL R1 .G1_01.
         L     R15,=V(BLKEXIT)        -> USER ROUTINE
         BALR  R14,R15                CALL USER
*-----------------------------------------------------------------*
*   CHECK WHETHER OR NOT TO HOLD PL/I ENVIRONMENT                 *
*-----------------------------------------------------------------*
         L     R1,SPIEPAS             -> PASCALVS SPIE
         SPIE  MF=(E,(1))             RESTORE PASCALVS SPIE
         L     R13,SAVE1+4            RETURN AROUND PL/I
         B     RETURN                 GO PERFORM RETURN
         DROP  R10                    WORKAREA
         DROP  R11                    BLKPLIA
         LTORG
         SPACE 2
NOSPIE   SPIE  MF=L
         SPACE 2
STRUC    DC    F'0'                   OFFSET OF FIRST ELEMENT
         DC    F'4'                   OFFSET OF SECOND ELEMENT
         DC    F'8'                   OFFSET OF THIRD ELEMENT
         DC    Y(31*1024,0)           31K FIXED LENGTH STRING
         SPACE 2
PLIPARM  DC    A(*+4+X'80000000')     -> PL/I INITIAL ARGUMENT
         DC    Y(L'PLIARG)            LENGTH OF PL/I INITIAL ARGUMENT
PLIARG   DC    C'NOSTAE/'             DISABLE ERROR RECOVERY
         SPACE 2
WORKADDR DS    F                      ADDRESS FOR WORKAREA .G1_01.
WRKFLAGP DC    C'BLKPLIA WORK AREA'
         DC    CL(((*-WRKFLAGP+7)/8*8)-(*-WRKFLAGP))' ' FILL TO DWORD
WRKFLAGL EQU   *-WRKFLAGP
         SPACE 2
WORKAREA DSECT
WRKFLAG  DS    CL(WRKFLAGL)
SAVE1    DS    18F                    SAVE AREA FOR BULKPLI
COMMAND  DS    F
SAVE2    DS    2F
SPIEPAS  DS    F
SPIEPLI  DS    F
EXITPRM  DS    A
AGGLOC   DS    2A
         DS    0D                     ALIGN END OF WORK AREA
WORKALEN EQU   *-WORKAREA
R0       EQU   0
R1       EQU   1
R2       EQU   2
R3       EQU   3
R4       EQU   4
R5       EQU   5
R6       EQU   6
R7       EQU   7
R8       EQU   8
R9       EQU   9
R10      EQU   10
R11      EQU   11
R12      EQU   12
R13      EQU   13
R14      EQU   14
R15      EQU   15
         END   DYNAMN
/*
//STEP2    EXEC PLIXC
//PLI.SYSLIN DD DSN=&&LOADSET2,DISP=(MOD,PASS),UNIT=VIO,
//         SPACE=(80,(250,100))
//PLI.SYSIN DD DATA,DLM=##
BLKPLI: /* BULK LOADER INTERFACE TO PL/I USER EXIT ROUTINE */
PROC OPTIONS (MAIN);
/* THIS PROGRAM IS CALLED BY BLKPLIA (THE SPECIAL EXIT ROUTINE ENTRY */
/* POINT PROGRAM, WRITTEN IN ASSEMBLER).                             */
/* IT THEN CALLS BLKASM (ANOTHER ENTRY POINT IN BLKPLIA).            */
DCL BLKASM ENTRY;
CALL BLKASM;
END;
##
//STEP3    EXEC PLIXCL
//PLI.SYSIN DD DATA,DLM=##
BLKEXIT: PROCEDURE (X,Y);
/* ONLY BLKEXIT ACCEPTED HERE. */
DCL X FIXED,
Y FIXED;
DCL 1 PARM_LIST ALIGNED BASED(P),
      10 STATUS  FIXED BINARY (31,0),
      10 RLENGTH FIXED BINARY (31,0),
      10 BUFFER  CHAR(80);
DCL 1 PARM_PARM2 ALIGNED BASED(Q),
      10 SEQ       FIXED BINARY (31,0),
      10 LEN       FIXED BINARY (15,0),
      10 PARAMETER CHAR(80);
DCL COUNT   STATIC FIXED BINARY (31,0),
    INSROWS STATIC FIXED BINARY (31,0),
    REJROWS STATIC FIXED BINARY (31,0);
DCL I,NOTMATCH FIXED BINARY (31,0);
DCL ADDR BUILTIN;
DCL SUBSTR BUILTIN;
P=ADDR(X);
Q=ADDR(Y);
DISPLAY('### INSIDE PL/I INMOD ROUTINE...');
DISPLAY(P->STATUS);
DISPLAY(P->RLENGTH);
DISPLAY(P->BUFFER);
DISPLAY(Q->SEQ);
DISPLAY(Q->LEN);
DISPLAY(Q->PARAMETER);
SELECT (P->STATUS);
WHEN (6) DO; /* Initialize */
COUNT=0;
REJROWS=0;
INSROWS=0;
P->STATUS=0;
END;
WHEN (7) DO; /* Process */
DISPLAY('Processing...');
COUNT=COUNT+1;
NOTMATCH=0;
P->STATUS =0;
DO I = 1 TO Q->LEN;
IF SUBSTR(P->BUFFER,I,1) ^= SUBSTR(Q->PARAMETER,I,1)
THEN DO;
NOTMATCH = 1;
LEAVE; END;
END;
IF NOTMATCH = 1
THEN DO;
DISPLAY('------> REJECTED <--------');
REJROWS = REJROWS +1;
P->RLENGTH = 0;
END;
ELSE DO;
DISPLAY('------> accepted <--------');
INSROWS = INSROWS +1;
END;
END;
WHEN (5) DO; /* Finalizing */
DISPLAY('Finalizing...');
P->STATUS=0;
END;
OTHERWISE DO;
DISPLAY('UNKNOWN CODE...');
P->STATUS=99;
END;
END;
DISPLAY('P->STATUS=');DISPLAY(STATUS); DISPLAY('P->RLENGTH=');DISPLAY(RLENGTH);
DISPLAY('TOTAL =');DISPLAY(COUNT);
DISPLAY('INSERTS=');DISPLAY(INSROWS);
DISPLAY('REJROWS=');DISPLAY(REJROWS);
DISPLAY('--------------------------------------------------------');
END BLKEXIT;
##
//LKED.SYSIN DD *
INCLUDE BLKPLI
INCLUDE BLKPLIA
INCLUDE CLILIB(DBCMEM)
ENTRY DYNAMN
NAME INMDPL2(R)
/*
//LKED.BLKPLIA DD DSN=*.STEP1.ASM.SYSGO,DISP=(OLD,PASS),
//         VOL=REF=*.STEP1.ASM.SYSGO
//LKED.BLKPLI DD DSN=*.STEP2.PLI.SYSLIN,DISP=(OLD,PASS),
//         VOL=REF=*.STEP2.PLI.SYSLIN
//LKED.CLILIB DD DISP=SHR,DSN=STV.RG20APP.APP.L,UNIT=3380,
//         VOLUME=SER=CNFG03
//COPY EXEC PGM=IEBGENER
//SYSIN DD DUMMY
//SYSPRINT DD SYSOUT=*
//SYSUT2 DD DISP=(NEW,PASS),DSN=&&TEMP,UNIT=SYSDA,
//         DCB=(LRECL=80,BLKSIZE=1760,RECFM=FB),
//         SPACE=(CYL,(1,1),RLSE)
//SYSUT1 DD DATA,DLM=@@
("SASC") A0000000000000000000000000000A
("PASC") A0000000000000000000000000000A
("COBOL") A0000000000000000000000000000A
("ASSEM") A0000000000000000000000000000A
("SASC") B1111111111111111111111111111B
("PASC") B1111111111111111111111111111B
("COBOL") B1111111111111111111111111111B
("ASSEM") B1111111111111111111111111111B
("SASC") C2222222222222222222222222222C
("PASC") C2222222222222222222222222222C
("COBOL") C2222222222222222222222222222C
("ASSEM") C2222222222222222222222222222C
("PL/I") C2222222222222222222222222222C
("SASC") D3333333333333333333333333333D
("PASC") D3333333333333333333333333333D
("PL/I") D3333333333333333333333333333D
("SASC") E4444444444444444444444444444E
("PASC") E4444444444444444444444444444E
("PL/I") E4444444444444444444444444444E
("SASC") F5555555555555555555555555555F
("PASC") F5555555555555555555555555555F
("PL/I") F5555555555555555555555555555F
@@
//**********************************************************************
//* THIS STEP WILL ONLY DROP THE TABLES IF TPump IS NOT IN APPLY PHASE *
//**********************************************************************
//CREATE   EXEC BTEQ
.LOGON TDP5/DMD,DMD;
/* INMOD TEST CASE II - PL/I */
RELEASE TPump DMD.INMODPL2;
.IF ERRORCODE = 2572 THEN .GOTO NODROP;
DROP TABLE DMD.LOGTABLE;
DROP TABLE DMD.ET_INMODPL2;
DROP TABLE DMD.UV_INMODPL2;
DROP TABLE DMD.WT_INMODPL2;
DROP TABLE DMD.INMODPL2;
.QUIT;
.LABEL NODROP;
.EXIT 4;
CREATE TABLE INMODPL2 (F1 CHAR(10), F2 CHAR(70));
##
//*****************************************************************
//*                                                               *
//*                        RUN TPump                              *
//*                                                               *
//*****************************************************************
//LOADIT   EXEC PGM=TPump,TIME=(,3)
//STEPLIB DD DSN=STV.RG20.APPLOAD,DISP=SHR
//        DD DSN=STV.EG14MLL1.APP.L,DISP=SHR
//        DD DSN=STV.TG13BLD.APP.L,DISP=SHR
//        DD DSN=TER2.SASC450F.LINKLIB,DISP=SHR
//        DD DSN=*.STEP3.LKED.SYSLMOD,DISP=(OLD,PASS),
//        VOL=REF=*.STEP3.LKED.SYSLMOD
//SYSPRINT DD SYSOUT=*
//SYSTERM DD SYSOUT=*
//SYSOUT  DD SYSOUT=*
//INDATA DD DISP=OLD,DSN=*.COPY.SYSUT2,DCB=(LRECL=80,RECFM=F),
//        VOL=REF=*.COPY.SYSUT2
//SYSIN   DD DATA,DLM=##
.LOGON TDP5/DMD,DMD;
.LOGTABLE DMD.LOGTABLE_SFD;
.BEGIN LOAD TABLES INMODPL2;
.Layout layname1;
.Field L1Fld1 1 Char(10);
.Field L1Fld2 * Char(30);
.Field L1Fld3 * Char(40);
.DML Label DML1;
INSERT INMODPL2(F1,F2) VALUES (:L1FLD1, :L1FLD2);
.IMPORT INFILE INDATA
INMOD INMDPL2 USING ("PL/I") LAYOUT LAYNAME1 APPLY DML1;
.End LOAD;
.LOGOFF;
##
C INMOD - MVS
//JCKLC1 JOB 1,'JCK',MSGCLASS=A,NOTIFY=JCK,CLASS=B,
// REGION=4096K
//******************************************************************
//*                                                                *
//*       IDENTIFY NECESSARY LOAD LIBRARIES FOR RELEASE            *
//*                                                                *
//******************************************************************
//JOBLIB  DD DISP=SHR,DSN=STV.GG10.APP.L
//        DD DISP=SHR,DSN=STV.GG00.APP.L
//        DD DISP=SHR,DSN=STV.TG00.APP.L
//        DD DISP=SHR,DSN=STV.RG00.APP.L
//        DD DISP=SHR,DSN=TER2.SASC301H.LINKLIB
//C        EXEC PGM=LC370B
//STEPLIB  DD DSN=TER2.SASC301H.LOAD,DISP=SHR
//         DD DSN=TER2.SASC301H.LINKLIB,DISP=SHR
//SYSTERM  DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD UNIT=SYSDA,SPACE=(TRK,(10,10))
//SYSUT2   DD UNIT=SYSDA,SPACE=(TRK,(10,10))
//SYSLIN   DD DSN=&&OBJECT,SPACE=(3200,(10,10)),DISP=(MOD,PASS),
//         UNIT=SYSDA
//SYSLIB   DD DSN=TER2.SASC301H.MACLIBC,DISP=SHR
//SYSDBLIB DD DSN=&&DBGLIB,SPACE=(4080,(20,20,1)),DISP=(,PASS),
//         UNIT=SYSDA,DCB=(RECFM=U,BLKSIZE=4080)
//SYSTMP01 DD UNIT=SYSDA,SPACE=(TRK,25)          VS1 ONLY
//SYSTMP02 DD UNIT=SYSDA,SPACE=(TRK,25)          VS1 ONLY
//SYSIN DD DATA,DLM=##
/* This program is for TPump INMOD testing using C user exit routine.
   When this routine is activated it looks at the content of the
   function code passed (a->code) and depending on its value, it
   0) initializes, i.e., opens a file, etc...
   1) reads a record
   5) acknowledges "close inmod" request. The user exit routine
   must return "return code" (a->code) and "length" (a->len). You
   should send return code = zero when no errors occur and non-zero for
   an error. TPump expects length = zero at the end of file. Then
   it sends "CLOSE INMOD" request. THE USER EXIT routine must
   explicitly return "return code" = ZERO to terminate the
   conversation.
*/
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
typedef unsigned short    Int16;
typedef unsigned char     Int8;
typedef unsigned long int Int32;
/* PASSING parameter structures */
typedef struct {
    Int32 code;
    Int32 len;
    Int8  buf[80];
} inmodbuf;
typedef struct {
    Int32 seq;
    Int16 len;
    char  param[80];
} inmodpty;
static FILE *IN;
static int count=0;
char *memcpy();
void _dynamn(a,b)
inmodbuf *a;
inmodpty *b;
{
    int code=0;
    char tempbuf[80];
    memcpy(tempbuf,a->buf,sizeof(a->buf));
    tempbuf[79]='\0';
    printf("BEGIN--> %d %d %s\n",a->code,a->len,tempbuf);
    printf("     +++ %d %d %s\n",b->seq,b->len,b->param);
    code = (int) a->code;
    switch (code) {
    case 0:
        /* Here you open the file and read the first record */
        printf("## CODE=0, opening...\n");
        IN=fopen("ddn:INDATA","rb");
        if (! ferror(IN)) {
            if (! readrecord(a))
                fclose(IN);
        };
        break;
    case 1:
        /* TPump requested next record, read it */
        printf("## CODE=1, reading...\n");
        if (! readrecord(a))
            fclose(IN);
        break;
    case 5:
        /* TPump is closing INMOD routine */
        a->code=0;
        a->len=0;
        printf("## CODE=5, terminating...\n");
        break;
    default:
        a->code=12; /* any number not = to zero */
        a->len=0;
        printf("##### UNKNOWN code ######\n");
    };
    memcpy(tempbuf,a->buf,sizeof(a->buf));
    tempbuf[79]='\0';
    printf("END --> %d %d %s\n",a->code,a->len,tempbuf);
    printf("    +++ %d %d %s\n",b->seq,b->len,b->param);
}
int readrecord(a)
inmodbuf *a;
{
    int rtn=0;
    char tempbuf[80];
    if (fread((char *)&(a->buf),sizeof(a->buf),1,IN)) {
        count++;
        memcpy(tempbuf,a->buf,sizeof(a->buf));
        tempbuf[79]='\0';
        printf("    %d %s \n",count,tempbuf);
        a->len=80;
        a->code=0;
        rtn=1;
    };
    if (ferror(IN)) {
        printf("==== error ====\n");
        a->code=16; /* any non zero number */
        a->len=0;
    };
    if (feof(IN)) {
        /* EOF, set length = zero */
        printf("=== EOF ===\n");
        a->code=0;
        a->len=0;
    };
    return(rtn);
}
##
//LKED EXEC PGM=LINKEDIT,PARM=’LIST,MAP’,COND=(8,LT,C)
//SYSPRINT DD SYSOUT=*,DCB=(RECFM=FBA,LRECL=121,BLKSIZE=1210)
//SYSTERM  DD SYSOUT=*
//SYSLIN   DD DSN=*.C.SYSLIN,DISP=(OLD,PASS),VOL=REF=*.C.SYSLIN
//         DD DDNAME=SYSIN
//SYSLIB   DD DSN=TER2.SASC301H.SUBLIB,DISP=SHR
//SYSUT1   DD DSN=&&SYSUT1,UNIT=SYSDA,DCB=BLKSIZE=1024,
//         SPACE=(1024,(200,50))
//SYSLMOD  DD DSN=JCK.INMOD.LOAD(INMODG1),DISP=MOD,UNIT=3380,
//         VOLUME=SER=TSO805
//SYSIN    DD DATA,DLM=##
NAME INMODG1(R)
##
//BDLDEL   EXEC PGM=IEFBR14
//BDLCAT   EXEC PGM=TPUMP
//STEPLIB DD DSN=STV.GG00.APP.L,DISP=SHR
//        DD DSN=STV.TG00.APP.L,DISP=SHR
//        DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=*
//        UNIT=SYSDA,DCB=(RECFM=F,DSORG=PS,LRECL=8244),
//        SPACE=(8244,(12,5))
//SYSIN   DD *
//*******************************************************************
//* THIS STEP WILL ONLY DROP THE TABLES IF TPump NOT IN APPLY PHASE *
//*******************************************************************
//CREATE   EXEC BTEQ
//STEPLIB DD DSN=STV.GG00.APP.L,DISP=SHR
//        DD DSN=STV.TG00.APP.L,DISP=SHR
//        DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSABEND DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGON TDQ8/DBC,DBC;
DROP TABLE XXXX.LOGTABLE;
DROP TABLE XXXX.ET_INMODLC1;
DROP TABLE XXXX.UV_INMODLC1;
DROP TABLE XXXX.WT_INMODLC1;
.QUIT;
.LABEL NODROP;
.EXIT 4;
##
//******************************************************************
//*                                                                *
//*                         RUN TPump                              *
//*                                                                *
//******************************************************************
//LOADIT   EXEC PGM=TPump
//STEPLIB DD DISP=SHR,DSN=STV.GG10.APP.L
//        DD DISP=SHR,DSN=STV.GG00.APP.L
//        DD DISP=SHR,DSN=STV.TG00.APP.L
//        DD DISP=SHR,DSN=STV.RG00.APP.L
//        DD DISP=SHR,DSN=TER2.SASC301H.LINKLIB
//        DD DISP=SHR,DSN=JCK.INMOD.LOAD,VOLUME=SER=TSO805,
//        UNIT=3380
//SYSPRINT DD SYSOUT=*
//SYSTERM DD SYSOUT=*
//SYSOUT  DD SYSOUT=*
//SYSIN   DD DATA,DLM=##
.LOGTABLE XXXX.LOGTABLE;
.LOGON TDQ8/XXXX,XXXX;
/* TEST DATAIN, DATALOC
*/
DROP TABLE XXXX.INMODLC1;
CREATE TABLE INMODLC1 (F1 CHAR(10), F2 CHAR(70));
.BEGIN LOAD TABLES INMODLC1;
.Layout layname1;
.Field L1Fld1 1 Char(10);
.Field L1Fld2 * Char(70);
.DML Label DML1;
INSERT INMODLC1(F1,F2) VALUES (:L1FLD1, :L1FLD2);
.IMPORT INMOD INMODG1 USING ("AAA" "BBB") LAYOUT LAYNAME1 APPLY DML1;
.End LOAD;
.LOGOFF;
##
//INDATA DD DATA,DLM=##
01C       AAAAAAAAAAAAAAAA
02C       BBBBBBBBBBBBBBBB
03C       CCCCCCCCCCCCCCCC
04C       DDDDDDDDDDDDDDDD
##
//SELECT   EXEC BTEQ
//STEPLIB DD DSN=STV.GG00.APP.L,DISP=SHR
//        DD DSN=STV.TG00.APP.L,DISP=SHR
//        DD DSN=STV.RG00.APP.L,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSABEND DD SYSOUT=*
//SYSIN    DD DATA,DLM=##
.LOGON TDQ8/XXXX,XXXX;
SELECT * FROM INMODLC1;
.LOGOFF;
##
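The function-code protocol spelled out in the listing's opening comment (0 = initialize and supply the first record, 1 = supply the next record, 5 = acknowledge close; return code zero for success, length zero at end of file) can be reduced to a small host-independent sketch. The names `inmod_entry` and `next_record`, and the in-memory record table, are illustrative stand-ins, not part of the TPump interface:

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* Same shape as the inmodbuf the listings pass to _dynamn. */
typedef struct {
    long code;        /* in: function code; out: return code     */
    long len;         /* out: record length, 0 signals EOF       */
    char buf[80];     /* out: the 80-byte record itself          */
} inmodbuf;

/* Stand-in record source so the protocol can run without a file. */
static const char *records[] = { "01C  AAAA", "02C  BBBB" };
static int next_rec = 0;

static void next_record(inmodbuf *a)
{
    if (next_rec < (int)(sizeof records / sizeof records[0])) {
        /* Blank-pad to the fixed 80-byte record length. */
        memset(a->buf, ' ', sizeof(a->buf));
        memcpy(a->buf, records[next_rec], strlen(records[next_rec]));
        next_rec++;
        a->len  = 80;    /* fixed-length 80-byte record  */
        a->code = 0;     /* zero return code = success   */
    } else {
        a->len  = 0;     /* length 0 tells TPump: end of file */
        a->code = 0;
    }
}

/* The three function codes the comment describes. */
void inmod_entry(inmodbuf *a)
{
    switch (a->code) {
    case 0:              /* initialize, then supply first record */
    case 1:              /* supply the next record               */
        next_record(a);
        break;
    case 5:              /* close: acknowledge with code 0       */
        a->code = 0;
        a->len  = 0;
        break;
    default:
        a->code = 12;    /* any nonzero code reports an error    */
        a->len  = 0;
        break;
    }
}
```

TPump itself always enters the exit through the `_dynamn` entry point with the `inmodbuf`/`inmodpty` pair shown in the listings; this sketch only isolates the state machine those listings implement.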
C INMOD - UNIX
/* This program is for TPump INMOD testing using C user exit routine.
   When this routine is activated it looks at the content of the
   function code passed (a->code) and depending on its value, it
   0) initializes, i.e., opens a file, etc...
   1) reads a record
   5) acknowledges "close inmod" request. The user exit routine
   must return "return code" (a->code) and "length" (a->len). You
   should send return code = zero when no errors occur and non-zero for
   an error. TPump expects length = zero at the end of file. Then
   it sends "CLOSE INMOD" request. THE USER EXIT routine must
   explicitly return "return code" = ZERO to terminate the
   conversation.
*/
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
typedef unsigned short    Int16;
typedef unsigned char     Int8;
typedef unsigned long int Int32;
/* PASSING parameter structures */
typedef struct {
    Int32 code;
    Int32 len;
    Int8  buf[80];
} inmodbuf;
typedef struct {
    Int32 seq;
    Int16 len;
    char  param[80];
} inmodpty;
static FILE *IN;
static int count=0;
char *memcpy();
void _dynamn(a,b)
inmodbuf *a;
inmodpty *b;
{
    int code=0;
    char tempbuf[80];
    memcpy(tempbuf,a->buf,sizeof(a->buf));
    tempbuf[79]='\0';
    printf("BEGIN--> %d %d %s\n",a->code,a->len,tempbuf);
    printf("     +++ %d %d %s\n",b->seq,b->len,b->param);
    code = (int) a->code;
    switch (code) {
    case 0:
        /* Here you open the file and read the first record */
        printf("## CODE=0, opening...\n");
        IN=fopen("ddn:INDATA","rb");
        if (! ferror(IN)) {
            if (! readrecord(a))
                fclose(IN);
        };
        break;
    case 1:
        /* TPump requested next record, read it */
        printf("## CODE=1, reading...\n");
        if (! readrecord(a))
            fclose(IN);
        break;
    case 5:
        /* TPump is closing INMOD routine */
        a->code=0;
        a->len=0;
        printf("## CODE=5, terminating...\n");
        break;
    default:
        a->code=12; /* any number not = to zero */
        a->len=0;
        printf("##### UNKNOWN code ######\n");
    };
memcpy(tempbuf,a->buf,sizeof(a->buf));
    tempbuf[79]='\0';
    printf("END --> %d %d %s\n",a->code,a->len,tempbuf);
    printf("    +++ %d %d %s\n",b->seq,b->len,b->param);
}
int readrecord(a)
inmodbuf *a;
{
    int rtn=0;
    char tempbuf[80];
    if (fread((char *)&(a->buf),sizeof(a->buf),1,IN)) {
        count++;
        memcpy(tempbuf,a->buf,sizeof(a->buf));
        tempbuf[79]='\0';
        printf("    %d %s \n",count,tempbuf);
        a->len=80;
        a->code=0;
        rtn=1;
    };
    if (ferror(IN)) {
        printf("==== error ====\n");
        a->code=16; /* any non zero number */
        a->len=0;
    };
    if (feof(IN)) {
        /* EOF, set length = zero */
        printf("=== EOF ===\n");
        a->code=0;
        a->len=0;
    };
    return(rtn);
}
Sample Notify Exit Routine
The following is the listing of tldnfyxt.c, the sample notify exit routine that is provided with
TPump software.
/*********************************************************************
 *
 * tldnfyxt.c  - Sample Notify Exit for Tpump.
 *
 * Copyright 1997-2007, NCR Corporation. ALL RIGHTS RESERVED.
 *
 * Purpose     - This is a sample notify exit for Tpump.
 *
 * Execute     - Build Notify on a Unix system
 *               compile and link into shared object
 *                 cc -G tldnfyxt.c -o libtldnfyxt.so
 *
 *             - Build Notify on a Win32 system
 *               compile and link into dynamic link library
 *                 cl /DWIN32 /LD tldnfyxt.c
 *
 *             - Build Notify on AIX system
 *                 cc -c -brtl -qalign=packed tldnfyxt.c
 *                 ld -G -e_dynamn -bE:export_dynamn.txt tldnfyxt.o
 *                    -o libtldnfyxt.so -lm -lc
 *               where export_dynamn.txt contains the symbol "_dynamn"
 *
 *             - Build Notify on Linux system
 *                 gcc -shared -fPIC tldnfyxt.c -o libtldnfyxt.so
 *
 *********************************************************************/
#include <stdio.h>
typedef unsigned long UInt32;
#define NOTIFYID_FASTLOAD    1
#define NOTIFYID_MULTILOAD   2
#define NOTIFYID_FASTEXPORT  3
#define NOTIFYID_BTEQ        4
#define NOTIFYID_TPUMP       5
#define MAXVERSIONIDLEN    32
#define MAXUTILITYNAMELEN  36
#define MAXUSERNAMELEN     64
#define MAXUSERSTRLEN      80
#define MAXTABLENAMELEN   128
#define MAXFILENAMELEN    256
typedef enum {
    NMEventInitialize    =  0,
    NMEventFileInmodOpen =  1,
    NMEventCkptBeg       =  2,
    NMEventImportBegin   =  3,
    NMEventImportEnd     =  4,
    NMEventErrorTable    =  5,
    NMEventDBSRestart    =  6,
    NMEventCLIError      =  7,
    NMEventDBSError      =  8,
    NMEventExit          =  9,
    NMEventTableStats    = 10,
    NMEventCkptEnd       = 11,
    NMEventRunStats      = 12,
    NMEventDMLError      = 13
} NfyTLDEvent;
#define TIDUPROW 2816
typedef enum {
    DEFeedbackDefault   = 0,
    DEFeedbackNoLogging = 1
} DMLErrorFeedbackType;
typedef struct _TLNotifyExitParm {
    long Event;                     /* should be NfyTLDEvent values */
    union {
        struct {
            int  VersionLen;
            char VersionId[MAXVERSIONIDLEN];
            int  UtilityId;
            int  UtilityNameLen;
            char UtilityName[MAXUTILITYNAMELEN];
            int  UserNameLen;
            char UserName[MAXUSERNAMELEN];
            int  UserStringLen;
            char UserString[MAXUSERSTRLEN];
        } Initialize;
        struct {
            int nImport;
        } ImpStart;
        struct {
            UInt32 FileNameLen;
            char   FileOrInmodName[MAXFILENAMELEN];
            UInt32 nImport;
        } FileOpen;
        struct {
            unsigned long Records;
        } CheckPt;
        struct {
            char *TableName;
            unsigned long Rows;
        } ETDrop;
        struct {
            long ReturnCode;
        } Exit;
        struct {
            int nImport;
            unsigned long RecsIn;
            unsigned long RecsSkipped;
            unsigned long RecsRejd;
            unsigned long RecsOut;
            unsigned long RecsError;
        } Complete;
        struct {
            char type;
            char *dbasename;
            char *szName;
            unsigned long Activity;
        } TableStats;
        struct {
            UInt32 ErrorCode;
        } DBSError;
        struct {
            UInt32 ErrorCode;
        } CLIError;
        struct {
            int nImport;
            unsigned long nSQLstmt;
            unsigned long nReqSent;
            unsigned long RecsIn;
            unsigned long RecsSkipped;
            unsigned long RecsRejd;
            unsigned long RecsOut;
            unsigned long RecsError;
        } Stats;
        struct {
            UInt32 nImport;
            UInt32 ErrorCode;
            char  *ErrorMsg;
            UInt32 nRecord;
            unsigned char nApplySeq;
            unsigned char nDMLSeq;
            unsigned char nSMTSeq;
            char  *ErrorData;
            UInt32 ErrorDataLen;
            UInt32 *feedback;
        } DMLError;
    } Vals;
} TLNotifyExitParm;
#ifdef I370
#define TLNfyExit MLNfEx
#endif
extern long TLNfyExit(
#ifdef __STDC__
TLNotifyExitParm *Parms
#endif
);
#ifdef WIN32
/* Change for WIN32 */
__declspec(dllexport) long _dynamn(TLNotifyExitParm *P)
#else
long _dynamn(P)
TLNotifyExitParm *P;
#endif
{
FILE *fp;
int i;
if (!(fp = fopen("NFYEXIT.OUT", "a")))
return(1);
switch(P->Event) {
case NMEventInitialize :
fprintf(fp, "exit called @ Tpump init.\n");
fprintf(fp, "Version: %s\n", P->Vals.Initialize.VersionId);
P->Vals.Initialize.UtilityName[MAXUTILITYNAMELEN - 1] = '\0';
fprintf(fp, "Utility: %s\n", P->Vals.Initialize.UtilityName);
fprintf(fp, "User: %s\n", P->Vals.Initialize.UserName);
if (P->Vals.Initialize.UserStringLen)
fprintf(fp, "UserString: %s\n", P->Vals.Initialize.UserString);
break;
case NMEventFileInmodOpen :
fprintf(fp, "Exit called @ File/Inmod Open\n"
"File/Inmod Name : %s Import : %d\n",
P->Vals.FileOpen.FileOrInmodName,
P->Vals.FileOpen.nImport);
break;
case NMEventCkptBeg :
fprintf(fp,"exit called @ checkpoint begin : %u Records .\n",
P->Vals.CheckPt.Records);
break;
case NMEventCkptEnd :
fprintf(fp,"exit called @ checkpoint End : %u Records Sent.\n",
P->Vals.CheckPt.Records);
break;
case NMEventCLIError :
fprintf(fp, "exit called @ CLI error %d\n",
P->Vals.CLIError.ErrorCode);
break;
case NMEventErrorTable :
fprintf(fp,"exit called @ Error Table : %s "
"%u logable records.\n",
P->Vals.ETDrop.TableName, P->Vals.ETDrop.Rows);
break;
case NMEventDBSError :
fprintf(fp, "exit called @ DBS error %d\n",
P->Vals.DBSError.ErrorCode);
break;
case NMEventImportBegin: /* DR51679 event name should be consistent */
fprintf(fp, "exit called @ import %d starting. \n",
P->Vals.ImpStart.nImport);
break;
case NMEventImportEnd : /* DR51679 event name should be consistent */
fprintf(fp, "exit called @ import %d ending \n",
P->Vals.Complete.nImport);
fprintf(fp,
"Total Records Read: %u \nRecords Skipped: "
"%u \nUnreadable Records: %u \nRecords Sent: "
"%u \nData Errors : %u \n",
P->Vals.Complete.RecsIn,
P->Vals.Complete.RecsSkipped,
P->Vals.Complete.RecsRejd,
P->Vals.Complete.RecsOut,
P->Vals.Complete.RecsError);
break;
case NMEventDBSRestart :
fprintf(fp, "exit called @ RDBMS restarted\n");
break;
case NMEventExit :
fprintf(fp, "exit called @ tpump notify out of scope:"
" return code %d.\n",
P->Vals.Exit.ReturnCode);
break;
case NMEventTableStats:
fprintf(fp,"exit called @ Table Stats: \n");
if (P->Vals.TableStats.type == 'I')
    fprintf(fp,"Rows Inserted    : %u \n"
        "Table/Macro Name : %s \n"
        "Database Name    : %s \n",
        P->Vals.TableStats.Activity,
        P->Vals.TableStats.szName,
        P->Vals.TableStats.dbasename);
if (P->Vals.TableStats.type == 'U')
    fprintf(fp,"Rows Updated     : %u \n"
        "Table/Macro Name : %s \n"
        "Database Name    : %s \n",
        P->Vals.TableStats.Activity,
        P->Vals.TableStats.szName,
        P->Vals.TableStats.dbasename);
if (P->Vals.TableStats.type == 'D')
    fprintf(fp,"Rows Deleted     : %u \n"
        "Table/Macro Name : %s \n"
        "Database Name    : %s \n",
        P->Vals.TableStats.Activity,
        P->Vals.TableStats.szName,
        P->Vals.TableStats.dbasename);
break;
case NMEventRunStats :
fprintf(fp, "exit called @ stats\n");
fprintf(fp, "import %d \n",
P->Vals.Stats.nImport);
fprintf(fp,
"Total SQL Statements: %u \nRequest Sent: %u \n"
"Records Read: %u \nRecords Skipped: %u \n"
"Unreadable Records: %u \nRecords Sent: %u \n"
"Data Errors : %u \n",
P->Vals.Stats.nSQLstmt,
P->Vals.Stats.nReqSent,
P->Vals.Stats.RecsIn,
P->Vals.Stats.RecsSkipped,
P->Vals.Stats.RecsRejd,
P->Vals.Stats.RecsOut,
P->Vals.Stats.RecsError);
break;
case NMEventDMLError :
fprintf(fp, "exit called @ DML error \n");
fprintf(fp, "import %d \n",
P->Vals.DMLError.nImport);
fprintf(fp,
"Error code: %u \nError text: %s \n"
"Record number: %u \nApply number: %d \n"
"DML number: %d \nStatement number: %d \n"
"Error data length : %u \n"
"feedback : %u \n",
P->Vals.DMLError.ErrorCode,
P->Vals.DMLError.ErrorMsg,
P->Vals.DMLError.nRecord,
P->Vals.DMLError.nApplySeq,
P->Vals.DMLError.nDMLSeq,
P->Vals.DMLError.nSMTSeq,
P->Vals.DMLError.ErrorDataLen,
*(P->Vals.DMLError.feedback));
fprintf(fp, "Error data: ");
for (i=0 ;i<P->Vals.DMLError.ErrorDataLen; i++) {
fprintf(fp, "%c", P->Vals.DMLError.ErrorData[i]);
}
fprintf(fp, "\n");
if (P->Vals.DMLError.ErrorCode == TIDUPROW) {
*(P->Vals.DMLError.feedback) = DEFeedbackNoLogging;
fprintf(fp, "Returning feedback = %u \n",
DEFeedbackNoLogging);
}
break;
default :
fprintf(fp,"\nAn Invalid Event Passed to the Exit Routine\n");
break;
}
fclose(fp);
return(0);
}
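Outside TPump, the exit's event dispatch can be exercised with a small driver. The `NotifyParm` structure and `notify` function below are deliberately trimmed stand-ins for `TLNotifyExitParm` and `_dynamn` — only the Initialize and Exit members are modeled, and this sketch writes to a caller-supplied buffer instead of the real exit's NFYEXIT.OUT file:

```c
#include <stdio.h>
#include <string.h>
#include <assert.h>

/* Trimmed stand-ins for the event codes and parameter block above;
   only what this sketch touches is modeled. */
enum { NMEventInitialize = 0, NMEventExit = 9 };

typedef struct {
    long Event;
    union {
        struct {
            int  UserStringLen;
            char VersionId[32];
            char UtilityName[36];
            char UserName[64];
            char UserString[80];
        } Initialize;
        struct { long ReturnCode; } Exit;
    } Vals;
} NotifyParm;

/* Scaled-down dispatcher: format one log line per event.
   Returns 0 for handled events, nonzero otherwise, mirroring
   the sample exit's return-code convention. */
int notify(NotifyParm *p, char *log, size_t n)
{
    switch (p->Event) {
    case NMEventInitialize:
        snprintf(log, n, "init: utility=%s user=%s",
                 p->Vals.Initialize.UtilityName,
                 p->Vals.Initialize.UserName);
        return 0;
    case NMEventExit:
        snprintf(log, n, "exit: rc=%ld", p->Vals.Exit.ReturnCode);
        return 0;
    default:
        snprintf(log, n, "unknown event %ld", p->Event);
        return 1;
    }
}
```

A real notify exit is loaded by TPump itself (via the NOTIFY EXIT clause) and must be built as a shared object or DLL exporting `_dynamn`, as the banner comment in the listing describes.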
Glossary
Numeric
24x7 Lights Out Operations: The use of Systems Management tools to ensure the reliable
movement and update of data from operational systems to analytical systems.
2PC: Two-Phase Commit
A
abend: Abnormal END of task. Termination of a task prior to its completion because of an
error condition that cannot be resolved by the recovery facilities that operate during
execution.
ABORT: In Teradata SQL, a statement that stops a transaction in progress and backs out
changes to the database only if the conditional expression associated with the abort statement
is true.
Access Lock: A lock that allows selection of data from a table that may be locked for write
access. The Teradata MultiLoad utility maintains access locks against the target tables during
the Acquisition Phase.
Access Module: A software component that provides a standard set of I/O functions to
access data on a specific device.
Access Module Processor (AMP): A virtual processor that receives steps from a parsing
engine (PE) and performs database functions to retrieve or update data. Each AMP is
associated with one virtual disk, where the data is stored. An AMP manages only its own
virtual disk and not the virtual disk of any other AMP.
access right: A user’s right to perform the Teradata SQL statements granted to him against a
table, database, user, macro, or view. Also known as privilege.
account: The distinct account name portion of the system account strings, excluding the
performance group designation. Accounts can be employed wherever a user object can be
specified.
Acquisition Lock: A lock that is a flag in the table header that effectively rejects certain types
of Teradata SQL access statements. An acquisition lock allows all concurrent DML access and
the DROP DDL statement, and rejects DDL statements other than DROP.
Acquisition Phase: Responsible for populating the primary data subtables of the work
tables. Data are received from the host, converted into internal format, and inserted into the
work tables. The work tables will be sorted at the end of the Acquisition Phase and prior to the
Application Phase.
action definition: A logical action consisting of a single physical action and related attributes.
active data warehouse (ADW): An active data warehouse provides information that enables
decision-makers within an organization to manage customer relationships quickly, efficiently
and proactively. Active data warehousing is about integrating advanced decision support with
day-to-day, even minute-to-minute decision making that increases quality which encourages
customer loyalty and thus secures an organization's bottom line. The market is maturing as it
progresses from first-generation “passive” decision-support systems to current- and
next-generation “active” data warehouse implementations.
Active Database: Active database systems integrate event-based rule processing with
traditional database functionality. The behavior of the database is achieved through a set of
Event-Condition-Action rules associated with the database. When an event is detected the
relevant rules fire. Firing of a rule implies evaluating a condition on the database and carrying
out the corresponding action. An active database system derives its power from the variety of
events it can respond to and the kind of actions it can perform in response.
Ad Hoc Query: Any query that cannot be determined prior to the moment the query is issued.
administrator: A special user responsible for allocating resources to a community of users.
Aggregation: Used in the broad sense to mean aggregating data horizontally, vertically, and
chronologically.
all joins: In Teradata SQL, a join is a SELECT operation that allows you to combine
columns and rows from two or more tables to produce a result. Join types restricted by DWM
are: inner join, outer join, merge join, product join, and all joins.
All joins are a combination of the above types, depending on how the user selects the
information to be returned. In addition to the four types listed above, selecting all joins may
include an exclusion join, nested join, and RowID join.
allocation group: (AG) A set of parameters that determine the amount of resources
available to the sessions assigned to a PG referencing a specific AG. Has an assigned weight
that is compared to other AG weights. An AG can limit the total amount of CPU used by
sessions under its control.
AMP: Access Module Processor (UNIX-based systems), a type of virtual processor (vproc)
that controls the management of the Teradata Database and the disk subsystem, with each
AMP being assigned to a virtual disk (vdisk). For more information, see the Introduction to
Teradata Warehouse.
AMP worker task: (AWT) Processes (threads on some platforms) dedicated to servicing the
Teradata Database work requests. For each AMP vproc, a fixed number of AWTs are preallocated during Teradata Database initialization. Each AWT looks for a work request to arrive
in the Teradata Database, services the request, and then looks for another. An AWT can
process requests of any work type. Each Teradata Database query is composed of a series of
work requests that are performed by AWTs. Each work request is assigned a work type
indicating when the request should be executed relative to other work requests waiting to
execute.
Analytical Data Store: Useful in making strategic decisions, this data storage area maintains
summarized or historical data. This stored data is time variant, unlike operational systems
which contain real-time data. Information contained in this data store is determined and
collected based on the corporate business rules.
ANSI: American National Standards Institute. ANSI maintains a standard for SQL. For
information about Teradata compliance with ANSI SQL, see the SQL Reference: Fundamentals.
AP: Application Processor
APE: Alert Policy Editor. Use this Teradata Manager component to define alert policies:
create actions, set event thresholds, assign actions to events, and apply the policy to the
Teradata Database.
APH: Alternate Parcel Header.
Application Lock: A flag set in the table header of a target table indicating that the
Application Phase is in progress. An application lock allows all concurrent access lock select
access and the DROP DDL statement, and rejects all other DML and DDL statements.
Application Lifecycle: Includes the following three stages:
• process and change management
• analysis and design
• construction and testing
Application Phase: Responsible for turning rows from a work table into updates, deletes,
and inserts and applying them to a single target table.
APRC: Application Processor Reset Containment
API: Application Program Interface. An interface (calling conventions) by which an
application program accesses an operating system and other services. An API is defined at
source code level and provides a level of abstraction between the application and the kernel
(or other privileged utilities) to ensure the portability of the code.
An API can also provide an interface between a high level language and lower level utilities
and services written without consideration for the calling conventions supported by compiled
languages. In this case, the API may translate the parameter lists from one format to another
and the interpret call-by-value and call-by-reference arguments in one or both directions.
Architecture: A definition and preliminary design which describes the components of a
solution and their interactions. An architecture is the blueprint by which implementers
construct a solution which meets the users’ needs.
ARCMAIN: ARC executable that extracts (or inserts) database headers and data rows from
the HUT (Host UTility) archive interface.
ASCII: American Standard Code for Information Interchange, a character set used
primarily on personal computers.
Availability: A measure of the percentage of time that a computer system is capable of
supporting a user request. A system may be considered unavailable as a result of events such as
system failures or unplanned application outages.
B
B Tree: An indexing technique in which pointers to data are kept in a structure such that all
referenced data is equally accessible in an equal time frame.
BAR: Backup and restore; also referred to as Backup/Archive/Restore; a software and
hardware product set.
BLOB: An acronym for binary large object. A BLOB is a large database object that can be
anything that doesn’t require character set conversion. This includes MIDI, MP3, PDF,
graphics and much more. BLOBs can be up to 2 GB in size.
BTEQ: Basic Teradata Query facility. A utility that allows users on a workstation to access
data on a Teradata Database, and format reports for both print and screen output.
Business-Driven: An approach to identifying the data needed to support business activities,
acquiring or capturing those data, and maintaining them in a data resource that is readily
available.
bypass objects: Specific users, groups, and accounts can be set up to circumvent DWM
query management by declaring them to be bypassed. Basically, this turns off the DWM query
checking mechanism for all of the requests issued by those users and/or using those accounts.
C
Call-Level Interface Version 2 (CLIv2): A collection of callable service routines that provide
an interface to the Teradata Database. Specifically, CLI is the interface between the application
program and the Micro Teradata Directory Program (for network-attached clients). CLI
builds parcels that MTDP packages for sending to the Teradata Database using the Micro
Operating System Interface (for network-attached clients), and provides the application with
a pointer to each of the parcels returned from the Teradata Database.
Capture:
The process of capturing a production data source.
cardinality: In set theory, cardinality refers to the number of members in the set. When
specifically applied to database theory, the cardinality of a table refers to the number of rows
contained in a table.
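In SQL terms, the cardinality of a table is simply its row count. The sketch below uses Python's sqlite3 module as a stand-in for a Teradata session; the table and column names are invented for illustration.

```python
import sqlite3

# In-memory table for illustration; table and column names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "Adams"), (2, "Baker"), (3, "Chen")])

# The cardinality of the table is its row count.
(cardinality,) = conn.execute("SELECT COUNT(*) FROM employee").fetchone()
print(cardinality)  # 3
```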
CASE:
Computer Aided Software Engineering.
Change Data Capture: The process of capturing changes made to a production data source.
Change data capture is typically performed by reading the source DBMS log. It consolidates
units of work, ensures data is synchronized with the original source, and reduces data volume
in a data warehousing environment.
channel-attached: A mainframe computer that communicates with a server (for example, a
Teradata RDBMS) through a channel driver.
Character Set: A grouping of alphanumeric and special characters used by computer
systems to support different user languages and applications. Various character sets have been
codified by the American National Standards Institute (ANSI).
Checkpoint Rate: The interval between checkpoint operations during the Acquisition Phase
of a MultiLoad import task expressed as either the number of rows read from your client
system or sent to the Teradata Database, or an amount of time, in minutes.
CICS:
Customer Information Control System
CLI: Call-Level Interface. The interface between the application program and the MTDP
(for network-attached clients) or TDP (for channel-attached clients). CLIv2 refers to version
two of the interface.
Client:
A computer that can access the Teradata Database.
CLIv2: Call-Level Interface Version 2. The interface between the application program and
the MTDP (for network-attached clients) or TDP (for channel-attached clients).
CLIv2so: Call-Level Interface Version 2 Shared Object (CLIv2so); this program installs the
CLI libraries required by other utilities. When the CLIv2so program submits a request to a
Teradata Database, CLI Library components transform the request into Teradata Database
formats. The CLI Library sends requests to, and receives responses from, the Teradata
Database over a network.
client-server environment: The distribution of work on a LAN in which the processing of
an application is divided between a front-end client and a back-end server, resulting in faster,
more efficient processing. The server performs shared functions such as managing
communication and providing database services. The client performs individual user
functions such as providing customized interfaces, performing screen-to-screen navigation,
and offering help functions.
CMS:
Conversational Monitor System
CLOB: An acronym for character large object. A CLOB is a pure character-based large
object in a database, such as a large text file, HTML, RTF, or other character-based file.
CLOBs can be up to 2 GB in size. Also see BLOB and LOB.
Cluster: Logical, table-level archive whereby only those rows residing on specific AMPs, and
which are members of the specified cluster, are archived onto a single tape data set. This allows
multiple jobs to be applied for backup of large tables, to reduce the backup window. This
method is used to affect a parallel archive/restore operation via a “divide and conquer” backup
strategy.
COBOL:
Common Business-Oriented Language
Coexistence System:
A Teradata system running on mixed platforms
column: In the relational model of Teradata SQL, databases consist of one or more tables. In
turn, each table consists of fields, organized into one or more columns by zero or more rows.
All of the fields of a given column share the same attributes.
COP: Communications Processor. One kind of interface processor (IFP) on the Teradata
Database. A COP contains a gateway process for communicating with workstations via a
network.
COP Interface: Workstation-resident software and hardware, and Teradata Database-resident software and hardware, that allow workstations and the Teradata Database to
communicate over networks.
CPU: Central processing unit.
D
DASD: Direct access storage device (pronounced DAZ-dee). A general term for magnetic
disk storage devices that have historically been used in the mainframe and minicomputer (mid-range computer) environments. When used, it may also include hard disk drives for personal
computers. A recent form of DASD is the redundant array of independent disks (RAID).
The “direct access” means that all data can be accessed directly in about the same amount of
time rather than having to progress sequentially through the data.
database: A related set of tables that share a common space allocation and owner. A
collection of objects that provide a logical grouping for information. The objects include
tables, views, macros, triggers, and stored procedures.
Data Cardinality: Cardinality is a property of data elements which indicates the number of
allowable entries in the element. A data element such as gender only allows two entries (male
or female) and is said to possess low cardinality. Data elements for which many allowable
entries are possible, such as age or income are said to have high cardinality.
Data Definition Language (DDL): In Teradata SQL, the statements and facilities that
manipulate database structures (such as CREATE, MODIFY, DROP, GRANT, REVOKE, and
GIVE) and the Data Dictionary information kept about those structures. In the typical pre-relational data management system, data definition and data manipulation facilities are
separated, and the data definition facilities are less flexible and more difficult to use than in a
relational system.
Data Dictionary: In the Teradata Database, the information automatically maintained
about all tables, views, macros, databases, and users known to the Teradata Database system,
including information about ownership, space allocation, accounting, and access right
relationships between those objects. Data Dictionary information is updated automatically
during the processing of Teradata SQL data definition statements, and is used by the parser to
obtain information needed to process all Teradata SQL statements.
data loading: The process of loading data from a client platform to a Teradata RDBMS
server. For TPump, data loading includes any combination of INSERT, UPDATE, DELETE,
and/or UPSERT operations.
data manipulation: In Teradata SQL, the statements and facilities that change the
information content of the database. These statements include INSERT, UPDATE, and
DELETE.
Data Mart: A type of data warehouse designed to meet the needs of a specific group of users
such as a single department or part of an organization. Typically a data mart focuses on a
single subject area such as sales data. Data marts may or may not be designed to fit into a
broader enterprise data warehouse design.
Data Mining: A process of analyzing large amounts of data to identify hidden relationships,
patterns, and associations.
Data Model: A logical map that represents the inherent properties of the data independent
of software, hardware, or machine performance considerations. The model shows data
elements grouped into records, as well as the association around those records.
Data Synchronization: The process of identifying active data replicates and ensuring that
data concurrency is maintained. Also known as data version synchronization or data version
concurrency because all replicated data values are consistent with the same version as the
official data.
Data Scrubbing: The process of filtering, merging, decoding, and translating source data to
create validated data for the data warehouse.
data streams: Buffers in memory for temporarily holding data. A data stream is not a
physical file; instead, it is more like a pipe (in UNIX or Windows), or a batch pipe in MVS.
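The pipe analogy in this definition can be shown directly with an operating-system pipe: bytes written at one end are buffered in memory and read from the other end, never touching a file on disk. This is a minimal sketch, not how TPump implements its data streams internally.

```python
import os

# A pipe is an in-memory channel, not a file on disk: bytes written at
# one end are read back from the other, much as the "data streams"
# described above buffer data between processing steps.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"row 1\nrow 2\n")
os.close(write_fd)                 # closing the write end signals end of stream
data = os.read(read_fd, 1024)
os.close(read_fd)
print(data.decode())
```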
Data Warehouse: A subject oriented, integrated, time-variant, non-volatile collection of
data in support of management’s decision making process. A repository of consistent
historical data that can be easily accessed and manipulated for decision support.
DB2:
IBM DATABASE 2
DBA:
Database Administrator
DBQL: Database Query Log. DBQL is a series of system tables created in the DBC database
during the Teradata Database installation process. They are used to track query processing.
See Database Administration to learn more about DBQL.
DD:
Data dictionary or data definition.
DDL: Data definition language, which supports manipulating database structures and the
Data Dictionary information kept about these structures.
DDL operator: The DDL operator is a stand-alone operator that allows you to perform any
necessary database routines prior to a load/apply job without having to use another utility
such as BTEQ. For example, you can create tables or indexes, or drop tables, as needed, before
starting a load/apply job. As a stand-alone operator, supporting only one instance, the DDL
operator does not send or retrieve data to or from a Teradata TPump operator interface.
DEFINE Statement: A statement preceding the INSERT statement that describes the fields
in a record before the record is inserted in the table. This statement is similar to the SQL
USING clause.
Delete Task: A task that uses a full file scan to remove a large number of rows from a single
Teradata Database table. A delete task is composed of three major phases: Preliminary,
Application, and End. The phases are a collection of one or more transactions that are
processed in a predefined order according to the Teradata MultiLoad protocol.
delimiter: In Teradata SQL, a punctuation mark or other special symbol that separates one
clause in a Teradata SQL statement from another, or that separates one Teradata SQL
statement from another.
DIT: Directory Information Tree. A graphical display of an organization's directory
structure, sites, and servers, shown as a branching structure. The top-level (root) directory
usually represents the organization level.
DLL: Dynamic-link library. A feature of the Windows family of operating systems that
allows executable routines to be stored separately as files with .dll extensions and to be loaded
only when needed by a program.
DML: Data manipulation language. In Teradata SQL, the statements and facilities that
manipulate or change the information content of the database. These statements include
SELECT, INSERT, UPDATE, and DELETE.
domain name: A group of computers whose host names (the unique name by which a
computer is known on a network) share a common suffix, that is the domain name.
Drill down: A method of exploring detailed data that was used in creating a summary level
of data.
DSN: Digital Switched Network. The completely digital version of the PSTN.
Dual Active System: A dual active system is comprised of two active database systems that
operate in tandem and serve the needs of both the production and development
environments. Dual active systems virtually eliminate all down time and provide seamless
disaster recovery protection for critical users and applications.
Duplicate Row Check: A logic within the Teradata Database used to check for duplicate
rows while processing each primary data row for INSERTs and UPDATEs.
DWM: Dynamic Workload Manager. The product described in this document, which
manages access to the Teradata Database.
EBCDIC: Extended binary coded decimal interchange code. An IBM code that uses 8 bits to
represent 256 possible characters. It is used primarily in IBM mainframes, whereas personal
computers use ASCII.
E-CLI:
Extended Call-Level Interface
Error Tables: Tables created during the Preliminary Phase used to store errors detected
while processing a Teradata MultiLoad job. There are two error tables, ET and UV, which
contain errors found during the Acquisition Phase and Application Phase, respectively.
EOF:
End of File
ETL:
Extract, transform, and load
EUC: Extended UNIX Code. Extended UNIX Code (EUC) for Japanese and Traditional Chinese defines a set of encoding rules that can support from 1 to 4 character sets.
exclusion join: In Teradata SQL, a product join or merge join where only the rows that do
not satisfy (are NOT in) the conditional specified in the SELECT are joined.
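The NOT IN semantics of an exclusion join can be illustrated with standard SQL. The sketch below uses Python's sqlite3 module as a stand-in for a Teradata session; the table and column names are invented, and sqlite's optimizer will not necessarily use the product-join or merge-join strategies the definition mentions.

```python
import sqlite3

# Toy tables (names invented) illustrating an exclusion join:
# return customers with no matching row in orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER, cust_id INTEGER)")
conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(1, "Adams"), (2, "Baker"), (3, "Chen")])
conn.execute("INSERT INTO orders VALUES (10, 1)")

# Only the rows that do NOT satisfy the subquery condition are returned.
rows = conn.execute(
    "SELECT name FROM customer "
    "WHERE cust_id NOT IN (SELECT cust_id FROM orders) "
    "ORDER BY cust_id").fetchall()
print(rows)  # [('Baker',), ('Chen',)]
```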
Exclusive Lock: Supports the manual recovery procedure when a RELEASE MLOAD
statement is executed after a Teradata MultiLoad task has been suspended or aborted.
execution time frame:
waiting to run.
A period of time when DWM can execute scheduled requests that are
Extract: The process of copying a subset of data from a source to a target environment.
Exit Routines: Specifies a predefined action to be performed whenever certain significant
events occur during a Teradata MultiLoad job.
F
Failover: Failover is when Teradata QD switches from one connected system to another
when an error occurs. Many factors affect how failover occurs.
failure: Any condition that precludes complete processing of a Teradata SQL statement. Any
failure will abort the current transaction.
FastExport: Teradata FastExport utility. A program that quickly transfers large amounts of
data from tables and views of the Teradata Database to a client-based application.
FastLoad: Teradata FastLoad utility. A program that loads empty tables on the Teradata
Database with data from a network-attached or channel-attached client.
field: The basic unit of information stored in the Teradata Database. A field is either null, or
has a single numeric or string value. See also column, database, row, table.
FIFO:
First-in, first-out queue.
FIPS:
Federal Information Processing Standards
filter operator: A type of operator that performs filtering on data en route from other
operators.
Flat File: As a noun, an ASCII text file consisting of records of a single type, in which there is
no embedded structure information governing relationships between records.
As an adjective, describes a flattened representation of a database as single file from which the
structure could implicitly be rebuilt.
A particular type of database structure, as opposed to relational.
Foreign Key: The primary key of a parent data subject that is placed in a subordinate data
subject. Its value identifies the data occurrence in the parent data subject that is the parent of
the data occurrence in the subordinate data subject.
Formatted Records:
See Records.
Function: User Defined Functions (UDF) are extensions to Teradata SQL. Users can write
UDFs to analyze and transform data already stored in their data warehouse in ways that are
beyond the functionality of Teradata’s native functions.
G
Gateway:
A device that connects networks having different protocols.
global rule: Object Access and Query Resource rules can be specified as being global, that is,
they apply to all objects, and therefore to all requests. When a rule is specified as being global,
no query objects need be (or can be) associated with the rule because all objects are implicitly
included. Care should be taken when defining a global access rule, as it causes all requests to be
rejected except those from the DBC user and any bypassed objects.
Globally Distributed Objects (GDO): A data structure that is shared by all of the virtual
processors in the Teradata Database system configuration.
graphical user interface (GUI): The use of pictures rather than just words to represent the
input and output of a program. A program with a GUI runs under a Windows operating
system. The GUI displays certain icons, buttons, dialog boxes in its windows on the screen and
the user controls it by moving a pointer on the screen (typically controlled by a mouse) and
selecting certain objects by pressing buttons on the mouse. This contrasts with a command
line interface where communication is by exchange of strings of text.
GSS: Generic Security Services. An application level interface (API) to system security
services. It provides a generic interface to services which may be provided by a variety of
different security mechanisms. Vanilla GSS-API supports security contexts between two
entities (known as “principals”).
H
heuristics: Statistics recommendations, based on general rules of thumb.
HOSI:
Acronym for hash-ordered secondary index.
I
IPT:
I/Os Per Transaction
import: The process of pulling system information into a program; that is, adding system
information from an external source to another system. The system receiving the data
must support the internal format or structure of the data.
Import Task: A task that quickly applies large amounts of client data to one or more tables
or views on the Teradata Database. Composed of four major phases: Preliminary, Acquisition,
Application, and End. The phases are a collection of one or more transactions that are
processed in a predefined order according to the Teradata MultiLoad protocol. An import task
references up to five target tables.
In-Doubt: A transaction that was in process on two or more independent computer
processing systems when an interruption of service occurred on one or more of the systems.
The transaction is said to be in doubt because it is not known whether the transaction was
successfully processed on all of the systems.
Information engineering: The discipline for identifying information needs and developing
information systems that produce messages that provide information to a recipient.
Information engineering is a filtering process that reduces masses of data to a message that
provides information.
INMOD: INput MODule, a program that administrators can develop to select, validate, and
preprocess input data.
INMOD Routine: User-written routines that Teradata MultiLoad and other load/export
utilities use to provide enhanced processing functions on input records before they are sent to
the Teradata Database. Routines can be written in C language (for network-attached
platforms), or SAS/S, COBOL, PL/I or Assembler (for channel-attached platforms). A routine
can read and preprocess records from a file, generate data records, read data from other
database systems, validate data records, and convert data record fields.
inner join: In Teradata SQL, a join operation on two or more tables, according to a join
condition, that returns the qualifying rows from each table.
instance: In object-oriented programming, refers to the relationship between an object and
its class. The object is an instance of the class. In Teradata Parallel Transporter (Teradata PT),
an instance is an occurrence of a fully defined Teradata PT operator, with its source and target
data flows, number of sessions, and so on. Teradata PT can process multiple instances of
operators.
interface processor (IFP): Used to manage the dialog between the Teradata Database and
the host. Its components consist of session control, client interface, the parser, the dispatcher,
and the BYNET. One type of IFP is a communications processor (COP). A COP contains a
gateway process for communicating with workstations via a network.
Intermediary: A computer software process written by a third party which interfaces to one
or more Teradata servers and initiates a change data capture or change data apply operation
with replication services.
internet protocol (IP): Data transmission standard; the standard that controls the routing
and structure of data transmitted over the Internet.
interval histogram: Interval histograms are a form of synopsis data structure. A synopsis
data structure is a data structure that is substantially smaller than the base data it represents.
Interval histograms provide a useful statistical profile of attribute values that characterize the
properties of that raw data. The Teradata Database uses interval histograms to represent the
cardinalities and certain other statistical values and demographics of columns and indexes for
all-AMPs sampled statistics and for full-table statistics. Each histogram is composed of a
maximum of 100 intervals.
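An equal-height interval histogram can be sketched in a few lines: sort the column values, split them into at most 100 buckets, and keep summary demographics per bucket. This is a crude illustration of the idea, not the Teradata internal format, and the bucket statistics chosen here (maximum value, row count, distinct count) are a simplification.

```python
def interval_histogram(values, max_intervals=100):
    """Crude equal-height histogram sketch: split the sorted values into
    at most max_intervals buckets and record each bucket's maximum value,
    row count, and distinct-value count. Illustrative only."""
    data = sorted(values)
    size = max(1, -(-len(data) // max_intervals))  # ceiling division
    intervals = []
    for i in range(0, len(data), size):
        bucket = data[i:i + size]
        intervals.append({
            "max_value": bucket[-1],
            "rows": len(bucket),
            "distinct": len(set(bucket)),
        })
    return intervals

hist = interval_histogram(range(1000), max_intervals=10)
print(len(hist), hist[0])  # 10 {'max_value': 99, 'rows': 100, 'distinct': 100}
```

The synopsis is "substantially smaller than the base data": ten small dictionaries summarize a thousand values.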
I/O:
Input/output.
ISO:
International Standards Organization
J
JES: Job Entry Subsystem (JES) is an MVS subsystem of the OS/390 and z/OS mainframe
operating systems that manages jobs (units of work) that execute on the system. Each job is
described to the operating system by system administrators or other users in job control
language (JCL). There are two versions, JES2 and JES3. JES3 allows central control of the
processing of jobs using a common work queue. Both OS/390 and z/OS provide an interactive
menu for initiating and managing jobs.
JCL: Job Control Language is a language for describing jobs (units of work) to the OS/390,
z/OS, and VSE operating systems, which run on IBM's OS/390 and z800/900 large server
(mainframe) computers. These operating systems allocate their time and space resources
among the total number of jobs that have been started in the computer. Jobs in turn break
down into job steps. All the statements required to run a particular program constitute a job
step. Jobs are background (sometimes called batch) units of work that run without requiring
user interaction (for example, print jobs). In addition, the operating system manages
interactive (foreground) user requests that initiate units of work. In general, foreground work
is given priority over background work.
JIS: Japanese Industrial Standards specify the standards used for industrial activities in
Japan. The standardization process is coordinated by Japanese Industrial Standards
Committee and published through Japanese Standards Association.
Job Script: A job script, or program, is a set of MultiLoad commands and Teradata SQL
statements that make changes to specified target tables and views in the Teradata Database.
These changes can include inserting new rows, updating the contents of existing rows, and
deleting existing rows.
join: A select operation that combines information from two or more tables to produce a
result.
L
LAN: Local Area Network. LANs supported by Teradata products must conform to the IEEE
802.3 standard (Ethernet LAN).
Least Used: Least used (-lu) in a command line parameter that tells Teradata QD to route
queries to the least used database.
Load operator: A consumer-type operator that emulates some of the functions of the
FastLoad utility in the Teradata PT infrastructure.
LOB: An acronym for large object. A large object is a database object that is large in size.
LOBs can be up to 2 gigabytes. There are two types of LOBs, CLOBs and BLOBs. CLOBs are
character-based objects; BLOBs are binary-based objects.
Locks: Teradata FastLoad automatically locks any table being loaded and frees the lock only
after an END LOADING statement is entered. Therefore, access to the table becomes available
only when Teradata FastLoad completes.
log: A record of events. A file that records events. Many programs produce log files. Often
you will look at a log file to determine what is happening when problems occur. Log files have
the extension “.log”.
log stream: A log stream is a series of log messages defined in one message catalog and
initiated from one originator. One originator may initiate several log streams (for example, if
there are multiple operators in one originator).
logical action: A named action that is defined on the Alert Policy Editor's Actions tab.
Logical actions can be assigned to events in the alert policy.
Logical Data Model: A data model that represents the normalized design of data needed to
support an information system. Data are drawn from the common data model and
normalized to support the design of a specific information system.
Actual implementation of a conceptual model in a database. It may take multiple logical data
models to implement one conceptual data model.
loner value: A value that has a frequency greater than the total number of table rows divided
by the maximum interval times 2.
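Reading "maximum interval" as the maximum number of histogram intervals (100, per the interval histogram entry above), the loner threshold works out as a simple formula. The numbers below are invented for illustration.

```python
# Hedged sketch of the loner-value threshold described above; the row
# count is invented, and "maximum interval" is read as the maximum
# number of histogram intervals (100).
total_rows = 1_000_000
max_intervals = 100

# A value is a "loner" if its frequency exceeds this threshold:
threshold = total_rows / (max_intervals * 2)
print(threshold)  # 5000.0
```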
M
MAPI: Messaging Application Programming Interface. A set of Microsoft-defined functions
and interfaces that support E-mail capabilities.
macro: A file that is created and stored on the Teradata RDBMS, and is executed in response
to a Teradata SQL EXECUTE statement.
merge join: In Teradata SQL, the type of join that occurs when the WHERE conditional of a
SELECT statement causes the system first to sort the rows of two tables based on a join field
(specified in the statement), then traverse the result while performing a merge/match process.
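The sort-then-merge/match process can be sketched in a few lines. This is an illustrative simplification of the classic sort-merge join algorithm, not Teradata's implementation; the row shapes and the duplicate handling (matching runs on the right side only) are chosen for brevity.

```python
def merge_join(left, right, key):
    """Sort both row lists on the join field, then walk them in step,
    emitting matched pairs. Illustrative sort-merge join sketch."""
    left = sorted(left, key=key)
    right = sorted(right, key=key)
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # emit all right-side rows sharing this join value
            jj = j
            while jj < len(right) and key(right[jj]) == lk:
                out.append((left[i], right[jj]))
                jj += 1
            i += 1
    return out

pairs = merge_join([(1, "a"), (2, "b")], [(2, "x"), (3, "y")],
                   key=lambda r: r[0])
print(pairs)  # [((2, 'b'), (2, 'x'))]
```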
Metadata: Data about data. For example, information about where the data is stored, who is
responsible for maintaining the data, and how often the data is refreshed.
methods: In object-oriented programming, methods are the programming routines by
which objects are manipulated.
NFS:
Network file system.
MIB:
Management Information Base
MOSI: Micro Operating System Interface. A library of routines that implement operating
system dependent and protocol dependent operations on the workstation.
MTDP: Micro Teradata Director Program. A library of routines that implement the session
layer on the workstation. MTDP is the interface between CLI and the Teradata Database.
MPP:
Massively Parallel Processing
multi-threading: An option that enables you to speed up your export and import
operations with multiple connections.
MultiLoad: Teradata MultiLoad utility. A command-driven utility that performs fast, high-volume maintenance functions on multiple tables and views of the Teradata Database.
Multiset Tables: Tables that allow duplicate rows.
MVS (Multiple Virtual Storage): One of the primary operating systems for large IBM
computers.
N
name: A word supplied by the user that refers to an object, such as a column, database,
macro, table, user, or view.
nested join: In Teradata SQL, this join occurs when the user specifies a field that is a unique
primary index on one table and which is in itself an index (unique/non-unique primary or
secondary) to the second table.
Network:
In the context of the Teradata Database, a LAN (see LAN).
network attached: A computer that communicates over the LAN with a server (for example,
a Teradata RDBMS).
NIC:
Network Interface Card.
NO REWIND: A tape device definition that prevents a rewind operation at either file open
or file close. NO REWIND allows a program to access multiple files on a tape by leaving the
tape positioned at the end of the current file at close, thus allowing the subsequent file to be
easily accessed by the next open.
notify exit: A user-defined exit routine that specifies a predefined action to be performed
whenever certain significant events occur during a TPump job.
For example, by writing an exit in C (without using CLIv2) and using the NotifyExit attribute
in an operator definition, you can provide a routine to detect whether a TPump job succeeds
or fails, how many records were loaded, what the return code is for a failed job, and so on.
null:
The absence of a value for a field.
Nullif Option: This option allows the user to null a column in a table under certain
conditions; it is only used in conjunction with DEFINE statements.
NUPI: Non-unique primary index; an NUPI is typically assigned to minor entities in the
database.
NUSI: Non-unique secondary index; an NUSI is efficient for range query access, while a
unique secondary index (USI) is efficient for accessing a single value.
O
object: In object-oriented programming, a unique instance of a data structure defined
according to the template provided by its class. Each object has its own values for the variables
belonging to its class and can respond to the messages, or methods, defined by its class.
object access rule: An Object Access filter allows you to define the criteria for limiting access
to issuing objects and/or query objects. Queries that reference objects associated with the rule
(either individually or in combination) during the specified dates and times are rejected.
Global rules are not applicable for this type.
object definition: The details of the structure and instances of the objects used by a given
query. Object definitions are used to create the tables, views, macros, triggers, join
indexes, and stored procedures in a database.
ODBC: (Open Database Connectivity) Under ODBC, drivers are used to connect
applications with databases. The ODBC driver processes ODBC calls from an application, but
passes SQL requests to the Teradata Database for processing.
ODBC operator: A producer-type operator that enables universal open data access with
many ODBC compliant data sources, including Oracle, SQL Server, DB2, and so on. The
ODBC operator runs on all Teradata PT supported platforms. It reads data close to the
sources, and then feeds the data directly to the Teradata Database without the need of an
intermediate staging platform.
OLTP: (On-Line Transaction Processing) Processing that supports the daily business
operations. Also known as operational processing.
operator routine: In object-oriented programming, refers to a function that implements a
method. The terms operator routine and operator function may be used interchangeably.
OS/VS:
Operating System/Virtual Storage
OTB: Open Teradata Backup; a product set consisting of OTB-Veritas, OTB-BakBone, and
others; Teradata backup products for MP-RAS/UNIX, NT and Windows 2000 platforms.
outer join: In Teradata SQL, an extension of an inner join operation. In addition to
returning qualifying rows from tables joined according to a join condition (the inner join), an
outer join returns non-matching rows from one or both of its tables. Multiple tables are joined
two at a time.
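The "non-matching rows" behavior is easiest to see in a LEFT OUTER JOIN. The sketch below uses Python's sqlite3 module as a stand-in for a Teradata session; the table and column names are invented.

```python
import sqlite3

# Toy tables (names invented): HR has no employees, so an outer join
# returns it with NULL (None) in the employee columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (dept_id INTEGER, dept_name TEXT)")
conn.execute("CREATE TABLE emp (emp_id INTEGER, dept_id INTEGER)")
conn.executemany("INSERT INTO dept VALUES (?, ?)", [(1, "Sales"), (2, "HR")])
conn.execute("INSERT INTO emp VALUES (100, 1)")

# The inner-join rows, plus the non-matching row from the left table.
rows = conn.execute(
    "SELECT d.dept_name, e.emp_id FROM dept d "
    "LEFT OUTER JOIN emp e ON d.dept_id = e.dept_id "
    "ORDER BY d.dept_id").fetchall()
print(rows)  # [('Sales', 100), ('HR', None)]
```

An inner join on the same tables would return only the ('Sales', 100) row.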
owner: In Teradata SQL, the user who has the ability to grant or revoke all access rights on a
database to and from other users. By default, the creator of the database is the owner, but
ownership can be transferred from one user to another by the GIVE statement.
P
parameter: A variable name in a macro for which an argument value is substituted when
the macro is executed.
parser: A program executing in a PE that translates Teradata SQL statements entered by a
user into the steps that accomplish the user’s intentions.
parsing engine (PE): An instance (virtual processor) of the database management session
control, parsing, and dispatching processes and their data context (caches).
Paused MultiLoad Job: A job that was halted, before completing, during the Acquisition
Phase of the Teradata MultiLoad operation. The paused condition can be intentional, or the
result of a system failure or error condition.
PDE: Parallel Database Extensions
peak perm: Highest amount of permanent disk space, in bytes, used by a table.
performance groups: A performance group is a collection of parameters used to control
and prioritize resource allocation for a particular set of Teradata Database sessions within the
Priority Scheduler. Every Teradata Database session is assigned to a performance group during
the logon process. Performance groups are the primary consideration in partitioning the
working capacity of the Teradata Database. To learn more about performance groups, see the
Priority Scheduler section of Utilities.
performance period: A threshold or limit value that determines when a session is under the control of that performance period. A performance period links PGs/Teradata Database sessions under its control to an AG that defines a scheduling strategy. A performance period allows you to change AG assignments based on time-of-day or resource usage.
Physical Data Model: A data model that represents the denormalized physical implementation of data that supports an information system. The logical data model is
denormalized to a physical data model according to specific criteria that do not compromise
the logical data model but allow the database to operate efficiently in a specific operating
environment.
Primary server: A Teradata server in which client applications execute transactions through
use of Teradata SQL or utilities such as Teradata MultiLoad and update the tables of one or
more replication groups. The changes are captured by replication services and given to an
intermediary connected to the server.
priority definition set: A collection of data that includes the resource partition,
performance group, allocation group, performance period type, and other definitions that
control how the Priority Scheduler manages and schedules session execution.
product join: In Teradata SQL, the type of join that occurs when the WHERE conditional of
a SELECT statement causes the Teradata Database system to compare all qualifying rows from
one table to all qualifying rows from the other table. Because each row of one table is
compared to each row of another table, this join can be costly in terms of system performance.
Note that product joins without an overall WHERE constraint are considered unconstrained
(Cartesian). If the tables to be joined are small, the effect of an unconstrained join on
performance may be negligible, but if they are large, there may be a severe negative effect on
system performance.
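The quadratic cost described above can be seen in a small sketch (plain Python, with hypothetical row counts): every qualifying row of one table is compared with every qualifying row of the other, so the work grows as the product of the two table sizes:

```python
t1_rows = list(range(300))   # hypothetical qualifying rows from table 1
t2_rows = list(range(200))   # hypothetical qualifying rows from table 2

comparisons = 0
result = []
for a in t1_rows:            # each row of one table is compared with ...
    for b in t2_rows:        # ... every row of the other table
        comparisons += 1
        result.append((a, b))  # unconstrained: every pair qualifies

# 300 rows x 200 rows -> 60,000 comparisons and 60,000 result rows.
```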
profiles: A profile is a set of parameters assigned to a user, group of users, or account that determines what scheduling capabilities are available and how the Teradata Query Scheduler server handles their scheduled requests.
physical action: A basic action type, such as <Send a Page>, <Send an E-Mail>, etc.
Physical actions must be encapsulated by logical actions in order to be used in the alert policy.
PIC: Position independent code
Pipeclient: A command line program used to send commands to Teradata Query Director.
The program uses named pipes formatting.
PL/I: Programming Language/1, a programming language supported for MultiLoad
development.
PMPC: Performance Monitor and Production Controls
PP2: Preprocessor2
PPP: Point-to-Point Protocol
Primary Key: A set of one or more data characteristics whose value uniquely identifies each
data occurrence in a data subject. A primary key is also known as a unique identifier.
privilege: A user’s right to perform the Teradata SQL statements granted to him against a
table, database, user, macro, or view. Also known as access right.
procedure: Short name for Teradata stored procedure. Teradata provides Stored Procedural
Language (SPL) to create stored procedures. A stored procedure contains SQL to access data
from within Teradata and SPL to control the execution of the SQL.
producer: A type of operator that retrieves data from an external data store, such as a file,
Teradata Database table, and so on, and provides it to other operators. A producer operator
produces the data into the data stream’s buffer.
production system: A database used in a live environment. A system that is actively used for
day to day business operations. This differs from a test or development system that is used to
create new queries or test new features before using them on the production system.
Protocol: The rules for the format, sequence, and relative timing of messages exchanged on a network.
Q
query analysis: A feature that estimates the answer set size (number of rows) and processing
time of a SELECT type query.
Query Capture Database (QCD): A database of relational tables that store the steps of any
query plan captured by the Query Capture Facility (QCF).
Query Capture Facility (QCF): Provides a method to capture and store the steps from any
query plan in a set of predefined relational tables called the Query Capture Database (QCD).
query: A Teradata SQL statement, particularly a SELECT statement.
Query Director: Teradata Query Director. A Teradata client application used to balance
sessions between systems according to user provided algorithms.
query management: The primary function of DWM is to manage logons and queries. This feature examines logon and query requests before they are dispatched for execution within the Teradata Database, and may reject logons or reject or delay queries. It does this by comparing the objects referenced in the requests to DBA-defined rules.
Query Resource filter: A Query Resource filter allows you to define the criteria for limiting
resource usage associated with queries. You can define resource criteria such as:
• Row count
• Processing time
• No joins permitted
• No full table scans permitted
Queries that are estimated to meet or exceed the limits for the rule during the specified dates
and times are rejected. You may define global rules for this type.
Query Session Utility: A separate utility program used to monitor the progress of your
Teradata MultiLoad job. It reports different sets of status information for each phase of your
job.
R
random AMP sample (RAS): An arbitrary sample from an Access Module Processor
(AMP). These are samples of the tables in a query or all of the tables in a given database. Also
known as RAS.
RDBMS (Relational Database Management System): A database management system in
which complex data structures are represented as simple two-dimensional tables consisting of
columns and rows.
Records: When using the Teradata MultiLoad utility, both formatted and unformatted
records are accepted for loading. A formatted record, in the Teradata Database world, consists
of a record created by a Teradata Database utility, such as BTEQ, where the record is packaged
with begin- and end-record bytes specific to the Teradata Database. Unformatted records are
any records not originating on a Teradata Database, such as Lotus 1-2-3 files. These files
contain records that must be defined before loading onto the Teradata Database.
recursive query: A named query expression that is allowed to reference itself in its own
definition, giving the user a simple way to specify a search of a table using iterative self-join
and set operations. Use a recursive query to query hierarchies of data. Hierarchical data could
be organizational structures such as department and sub-department, forums of discussions
such as posting, response, and response to response, bill of materials, and document
hierarchies.
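The iterative self-join behavior can be sketched in plain Python (a hypothetical department hierarchy, not Teradata's implementation): start from a seed set of rows, then repeatedly join the result back against the table until no new rows appear:

```python
# Hypothetical (child, parent) rows of a department table.
edges = [("eng", "company"), ("qa", "eng"), ("tools", "eng"), ("docs", "qa")]

def descendants(root):
    """All departments below 'root', found by repeated self-join."""
    found = set()
    frontier = {c for c, p in edges if p == root}      # seed (non-recursive) part
    while frontier:                                    # recursive part: join back
        found |= frontier
        frontier = {c for c, p in edges if p in frontier} - found
    return found
```

Calling descendants("company") walks the whole hierarchy, one level of sub-departments per iteration, which is exactly what the iterative self-join in a recursive query computes.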
Replication Group: A set of tables for which either data changes are being captured on a
primary server or applied on a subscriber server.
Replication Services: A set of software functions implemented in the Teradata server that
interact with an intermediary to capture or apply change data to the tables of a replication
group.
request: In host software, a message sent from an application program to the Teradata
Database.
resource partition: A collection of prioritized PGs related by their users’ associations. Has
an assigned weight that determines the proportion of resources available to that partition
relative to the other partitions defined for that Teradata Database.
Restart Log Table: One of four restart tables the Teradata MultiLoad utility creates that are
required for restarting a paused Teradata MultiLoad job.
Restoration Lock: A flag set in the table header of a target table indicating that the table was
aborted during the Application Phase and is now ready to be restored. A limited set of
operations can be done on the table: Delete All, Drop Fallback, Drop Index, Drop Table, and
Select with access lock. No Teradata MultiLoad restart will be allowed on a table with a
Restoration Lock.
result: The information returned to the user to satisfy a request made of the Teradata
Database.
results table/file: In the Schedule Request environment, a results table or file is a database
table or a Windows file into which result data for a schedule request that is not self-contained
are stored.
results file storage: A symbolic name to a root directory where scheduled requests results
are stored. You map a file storage location to a Windows root directory where results are
stored.
rule: Rules are the method used by DWM to define which requests are prohibited from being immediately executed on the Teradata Database. That is, the rules enforced by DWM provide the Query Management capabilities.
Round Robin: Round robin (-rr) is a command line parameter that tells Teradata Query
Director to route sessions in a specific order.
Routing: A general term that describes how Teradata Query Director receives sessions and
sends them to one system or another.
Routing Configuration File: The routing configuration file in Teradata Query Director
allows administrators to associate specific userids and account strings to specific systems.
row: The fields, whether null or not, that represent one entry under each column in a table. The row is the smallest unit of information operated on by data manipulation statements.
RowID join: In Teradata SQL, this join occurs when one of the join tables has a non-unique
primary index constant, and another column of that table matches weakly with a non-unique
secondary index column of the second table.
RSG: Relay Services Gateway. A virtual processor residing on a node in which the replication
services software will execute.
RT: Response Time
RTF: Rich Text Format
run file: A script that is not contained within the SYSIN file, but rather executed through
use of the .RUN BTEQ command.
S
scheduled requests: The capability to store scripts of SQL requests and execute them at a
scheduled later time.
schema: Schemas are used to identify the structure of the data. Producers have an output
schema, to define what the source data will look like in the data stream. Consumers have an
input schema, to define what will be read from the data stream. If the input and output
schemas are the same, you only define the schema once.
script:
A file that contains a set of BTEQ commands and/or SQL statements.
Security token: A binary string generated by a server when a replication group is created or
altered that must be input to secure a change data capture or apply operation.
self-contained statement: A query request that stores the result data that it generates, if any.
For example, an INSERT/SELECT statement would be self-contained, whereas a SELECT
statement would not.
separator: A character or group of characters that separates words and special symbols in
Teradata SQL. Blanks and comments are the most common separators.
server: A computer system running the Teradata Database. Typically, a Teradata Database
server has multiple nodes, which may include both TPA and non-TPA nodes. All nodes of the
server are connected via the Teradata BYNET or other similar interconnect.
Session: A session begins when the user logs on to the Teradata Database and ends when the
user logs off the Teradata Database. Also called a Teradata Database session.
session: In client software, a logical connection between an application program on a host
and the Teradata Database that permits the application program to send one request to and
receive one response from the Teradata Database at a time.
skew: This value (which is only available in V2R4 and above) is calculated based on a single
Database collection interval. If the Session Collection rate is 60, then the skew is calculated for
a 60 second period.
The value is calculated using ‘current' data values. For example, the Max CPU used during the
past 60 seconds relative to the Average used over that same 60 seconds:
skew = 100 * (1 - avg / max)
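For example, the formula above can be checked with a quick calculation (a Python sketch):

```python
def skew(avg_cpu, max_cpu):
    """skew = 100 * (1 - avg / max), per the formula above."""
    return 100 * (1 - avg_cpu / max_cpu)

# Every AMP equally busy over the interval: no skew.
even = skew(50, 50)        # 0.0
# One AMP doing most of the work: skew approaches 100.
lopsided = skew(25, 100)   # 75.0
```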
SMP: Symmetric Multi-Processing
SNMP: Simple Network Management Protocol.
Sockclient: A command line program used to send commands to Teradata Query Director.
Source Database: The database from which data will be extracted or copied into the Data
Warehouse.
SQL: Structured Query Language. An industry-standard language for creating, updating, and querying relational database management systems. SQL was developed by IBM in the
1970s for use in System R. It is the de facto standard as well as being an ISO and ANSI
standard. It is often embedded in general purpose programming languages.
Programming language used to communicate with the Teradata Database.
SSO: Single sign-on, an authentication option that allows users of the Teradata Database on
Windows 2000 systems to access the Teradata Database based on their authorized network
usernames and passwords. This feature eliminates the need for users to enter an additional username and password when logging on to the Teradata Database via client applications.
stand-alone operator: In TPump, a type of operator that does not exchange data with other operators.
Star Schema: A modeling scheme that has a single object in the middle connected to a
number of objects around it radially.
statement: A request for processing by the Teradata Database that consists of a keyword
verb, optional phrases, operands and is processed as a single entity.
statistics: These are the details of the processes used to collect, analyze, and transform the
database objects used by a given query.
stored procedure: Teradata supports stored procedures. A stored procedure is a
combination of SQL statements and control and conditional handling statements that run
using a single call statement.
Stream operator: A consumer-type operator that allows parallel inserts, updates, and
deletes to new or preexisting Teradata tables.
Subscriber server: A Teradata server in which changes captured from a primary server by an
intermediary are applied to tables that duplicate those of the primary. Replication services
executing on the servers provide the capture and apply functions.
supervisory user: In Data Dictionary, a user who has been delegated authority by the
administrator to further allocate Teradata Database resources such as space and the ability to
create, drop, and modify users within the overall user community.
T
table: A set of one or more columns with zero or more rows that consist of fields of related
information.
Target Database: The database in which data will be loaded or inserted.
Target table: A user table where changes are to be made by a Teradata MultiLoad task.
TCP/IP: Transmission Control Protocol/Internet Protocol.
TDPID: Teradata Director Program Identifier. The name of the Teradata Database being
accessed.
tdwm: The database shared by Teradata Dynamic Workload Manager and Teradata Query
Scheduler. Previously called the dbqrymgr database.
Teradata SQL: The Teradata Database dialect of the relational language SQL, having data
definition and data manipulation statements. A data definition statement would be a CREATE
TABLE statement and a data manipulation statement would be a data retrieval statement (a
SELECT statement).
TDP: Teradata Director Program; TDP provides a high-performance interface for messages
communicated between the client and the Teradata system.
Target Level Emulation (TLE): Permits you to emulate a target environment (target system)
by capturing system-level information from that environment. The captured information is
stored in the relational tables SystemFE.Opt_Cost_Table and SystemFE.Opt_RAS_Table. The
information in these tables can be used on a test system with the appropriate column and
indexes to make the Optimizer generate query plans as if it were operating in the target system
rather than the test system.
test system: A Teradata Database where you want to import Optimizer-specific information
to emulate a target system and create new queries or test new features.
title: In Teradata SQL, a string used as a column heading in a report. By default, it is the
column name, but a title can also be explicitly declared by a TITLE phrase.
TPA: Trusted Parallel Application.
TOS: Teradata Operating System
TPM: Transactions Per Minute
Transport: The process of extracting data from a source, interfacing with a destination
environment, and then loading data to the destination.
transaction: A set of Teradata SQL statements that is performed as a unit. Either all of the
statements are executed normally or else any changes made during the transaction are backed
out and the remainder of the statements in the transaction are not executed. The Teradata
Database supports both ANSI and Teradata transaction semantics.
trigger: One or more Teradata SQL statements associated with a table and executed when
specified conditions are met.
TSM: Tivoli Storage Management; IBM’s storage management solution.
TTU: Teradata Tools and Utilities is a robust suite of tools and utilities that enables users and system administrators to enjoy optimal response time and system manageability with their Teradata system. TPump is included in Teradata Tools and Utilities.
tuple: In a database table (relation), a set of related values, one for each attribute (column).
A tuple is stored as a row in a relational database management system. It is analogous to a
record in a nonrelational file.
Two Phase Commit: Two Phase Commit is the process by which a relational database
ensures that distributed transactions are performed in an orderly manner. In this system,
transactions may be terminated by either committing them or rolling them back.
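A minimal sketch of the idea (hypothetical Python classes, not Teradata's actual protocol): the coordinator first asks every participant to prepare, and only if all vote yes does it commit; any no vote rolls everything back:

```python
class Participant:
    """A hypothetical resource manager taking part in a distributed transaction."""
    def __init__(self, can_commit):
        self.can_commit = can_commit
        self.state = "pending"
    def prepare(self):             # phase 1: vote yes/no
        return self.can_commit
    def commit(self):              # phase 2a: make changes permanent
        self.state = "committed"
    def rollback(self):            # phase 2b: back out changes
        self.state = "rolled back"

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):   # phase 1: all must vote yes
        for p in participants:
            p.commit()                           # phase 2: commit everywhere
        return "committed"
    for p in participants:
        p.rollback()                             # a single no vote rolls back everywhere
    return "rolled back"
```

Either every participant commits or every participant rolls back, which is the orderly all-or-nothing outcome the definition describes.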
type: An attribute of a column that specifies the representation of data values for fields in
that column. Teradata SQL data types include numerics and strings.
U
UDF: User-Defined Functions
UDM: User-Defined Methods. The database developer can create custom functions that are
explicitly connected to UDTs; these are known as UDMs. Functionalities directly applicable to
a UDT can be located within the UDMs associated with that UDT rather than being replicated
to all of the applications that use that UDT, resulting in increased maintainability.
UDT: A custom data type, known as a user-defined type. By creating UDTs, a database
developer can augment the Teradata Database with data types having capabilities not offered
by Teradata predefined (built-in) data types. Use TPump to import values into tables
containing UDT columns in the same manner as is done for other tables. The input records to
TPump must have the column data for UDT columns in its external type format.
Unformatted Records: See Records.
Unicode: A fixed-width (16 bits) encoding of virtually all characters present in all languages
in the world.
unique secondary index (USI): One of two types of secondary indexes. A secondary index
may be specified at table creation or at any time during the life of the table. It may consist of
up to 16 columns. To get the benefit of the index, the query has to specify a value for all
columns in the secondary index. A USI has two purposes: It can speed up access to a row
which otherwise might require a full table scan without having to rely on the primary index,
and it can be used to enforce uniqueness of a column or set of columns.
Update operator: A consumer-type operator that emulates some of the functions of the
Teradata MultiLoad utility in the Teradata PT infrastructure.
UPI: Unique primary index; a UPI is required and is typically assigned to major entities in
the database.
user: A database associated with a person who uses the Teradata Database. The database
stores the person’s private information and accesses other Teradata Databases.
user groups: A group of users can be specified within DWM either as a collection of individual users, or as all user names that satisfy a character string pattern (such as SALE*). The Teradata concept of roles is not used to define user groups, as it applies to privileges. User groups can generally be employed wherever an issuing object can be specified, and any condition applied to a group implicitly applies to all users within that group.
UTF-8: In simple terms, UTF-8 is an 8-bit encoding of 16-bit Unicode used to achieve an international character representation.
In more technical terms, in UTF-8, characters are encoded using sequences of 1 to 6 octets. The only octet of a one-octet sequence has the high-order bit set to 0; the remaining 7 bits are used to encode the character value. UTF-8 uses all bits of an octet, but has the quality of preserving the full US-ASCII range. The UTF-8 encoding of Unicode and UCS avoids the problems of fixed-length Unicode encodings because an ASCII file encoded in UTF-8 is exactly the same as the original ASCII file, and all non-ASCII characters are guaranteed to have the most significant bit set (bit 0x80). This means that normal tools for text searching work as expected.
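These properties are easy to verify with any UTF-8 codec (a Python sketch):

```python
# ASCII text is preserved byte-for-byte by UTF-8.
ascii_text = "MultiLoad"
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Non-ASCII characters encode to multi-octet sequences in which
# every octet has the most significant bit (0x80) set.
encoded = "é".encode("utf-8")      # b'\xc3\xa9'
assert len(encoded) == 2
assert all(byte & 0x80 for byte in encoded)
```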
UTF16: A 16-bit Unicode Transformation Format.
V
value-ordered secondary index (VOSI): A non-unique secondary index (NUSI) can be value ordered, which means the NUSI can be sorted on the key values themselves rather than on the corresponding hash codes. This is useful for range queries where only a portion of the
index subtable will be accessed. With a value-ordered NUSI, only those blocks in the NUSI
subtable that are within the range are scanned. It must be a number value, up to 4 bytes,
versus a longer character column. DATE is the most commonly used data type. The actual
data value is stored as part of the NUSI structure.
Varbyte: A data type that represents a variable-length binary string.
Varchar: A data type that represents a variable-length character string.
Vargraphic: A data type that represents a variable-length string of characters.
view: An alternate way of organizing and presenting information in a Teradata Database. A
view, like a table, has rows and columns. However, the rows and columns of a view are not
directly stored by the Teradata Database. They are derived from the rows and columns of
tables (or other views) whenever the view is referenced.
VM (Virtual Machine): One of the primary operating systems for large IBM computers.
VM/CMS: Virtual Machine/Conversational Monitor System
W
Weighted Round Robin: Weighted round robin (-wrr) is a startup command line parameter
that allows the administrator to weight specific databases for Teradata QD.
workgroups: Workgroups represent collections of related scheduled request work for users, user groups, or accounts. Each workgroup is assigned a maximum number of requests that can be executing from that workgroup simultaneously, thereby ensuring that requests for all workgroups get a fair share of their scheduled work done within the execution time frames.
workload limits rule: A Workload Limits rule allows you to limit the number of logon sessions and all-AMP queries, as well as reject or delay queries when workload limits are encountered. You can define which users, accounts, performance groups, or users within performance groups are associated with this type of rule.
Workstation: A network-attached client.
Work Table: A table created during the Preliminary Phase used to store intermediate data
acquired from the host during a Teradata MultiLoad task. These data will eventually be
applied to a target table.
Write Lock: A write lock enables a single user to modify a table. The Teradata MultiLoad
utility maintains write locks against each target table during the Application Phase, and work
tables and error tables for each task transaction.
X
XML: XML is the eXtensible Markup Language—a system created to define other markup
languages. For this reason, it can also be referred to as a metalanguage. XML is commonly
used on the Internet to create simple methods for the exchange of data among diverse clients.
Index
Symbols
- 46
&SYSAPLYCNT system variable 60
&SYSDATE system variable 59
&SYSDATE4 system variable 59
&SYSDAY system variable 59
&SYSDELCNT system variable 59
&SYSETCNT system variable 59
&SYSINSCNT 59
&SYSINSCNT system variable 59
&SYSJOBNAME system variable 59
&SYSNOAPLYCNT system variable 60
&SYSOS system variable 60
&SYSRC system variable 60
&SYSRCDCNT system variable 60
&SYSRJCTCNT system variable 60
&SYSUPDCNT system variable 60
&SYSUSER system variable 60
(serialize_on_field specification
DML command 115
./ prefix
EXIT name specification, BEGIN EXPORT command 146
A
abort termination 52
abort, defined 271
aborted TPump job
recovery 55
ACCEPT command
definition 90
function 28
syntax 92
access rights 33
accounts
defined 271
acctid specification
LOGON command 165
Acquisition Error Table 197
administrator, defined 272
aggregate operators, programming considerations 67
all joins
defined 272
ALTER TABLE SQL statement 29
alternate error file runtime parameter 48
ANSI/SQL DateTime specifications
programming considerations 63
restrictions 63
ANSIDATE keyword
DATEFORM command 108
API
defined 273
APPEND keyword
BEGIN LOAD command 96
application commands
syntax 89
APPLY label specification, IMPORT command 149
ArraySupport keyword
BEGIN LOAD 101, 116
Assembler INMOD
programming structure 203
Atomic upsert feature 122
Atomic UPSERT keyword
EXECUTE command 127
AXSMOD keyword
IMPORT command 145
AXSMOD name, IMPORT command 145
B
-b runtime parameter 45
batch mode
syntax for invoking on MVS 44
syntax for invoking on UNIX 43
syntax for invoking on VM 44
syntax for invoking on Windows 43
BEGIN LOAD command
definition 90
function 28
in script 69
syntax 95
BRIEF runtime parameter 45
buffers per session runtime parameter 45
BUFFERS runtime parameter 45
bypass objects
defined 274
C
-c charactersetname runtime parameter 45
C language INMODs
programming structure 203
C language, comment support 63
-C runtime parameter 47
Character Sets
Unicode 25
UTF16 26
UTF8 25
Character sets
Japanese 23
character sets
Chinese and Korean 23
client system specifications 64
default 65
effects on TPump commands 65
for AXSMOD 65
runtime parameters 45
site defined 27
Teradata RDBMS default 64
character-to-date data conversions 24
character-to-numeric
data conversions 24
charpos1 93
charpos2 93
CHARSET= charactersetname runtime parameter 45
CHECKPOINT keyword
BEGIN LOAD command 99
CHECKPOINT SQL statement 29
checkpoints
description 25
Chinese and Korean character sets 23
cname specification
INSERT command 154
UPDATE statement 186
COBOL INMOD
programming structure 203
COLLECT STATISTICS SQL statement 29
command functions 27
commands
ACCEPT
definition 90
function 28
syntax 92
BEGIN LOAD
definition 90
function 28
syntax 95
DATEFORM
definition 90
function 28
syntax 108
DISPLAY
definition 90
function 28
syntax 111
DML
definition 90
function 28
syntax 113
ELSE
definition 90
function 28
syntax 141
END LOAD
definition 90
function 28
syntax 126
ENDIF
definition 90
function 28
syntax 141
FIELD
definition 90
function 28
syntax 130
FILLER
definition 90
function 29
syntax 139
IF
definition 90
function 28
syntax 141
IMPORT
definition 90
function 29
syntax 143
LAYOUT
definition 90
function 29
syntax 157
LOGOFF
definition 90
function 28
syntax 162
LOGON
definition 90
function 28
syntax 164
LOGTABLE
definition 90
function 28
syntax 168
NAME
definition 90
function 28
syntax 170
PARTITION
definition 90
function 29
syntax 172
ROUTE
definition 91
function 28
syntax 176
RUN FILE
definition 91
function 28
syntax 178
SET
definition 91
function 28
syntax 180
SYSTEM
definition 91
function 28
syntax 182
TABLE
definition 91
function 29
syntax 184
usage 57
COMMENT 29
comment support 63
condition specification
LAYOUT command 158
conditional expression specification
IF, ELSE, and ENDI command 141
conditional expressions 57
CONFIG runtime parameter 47
configuration file
optional specification 41
parameters overridden by runtime parameters
BRIEF 45
CHARSET 45
ERRLOG 48
runtime parameter 47
conversions
character-to-numeric 24
integer-to-decimal 24
numeric-to-numeric 24
CREATE DATABASE SQL statement 30
CREATE MACRO SQL statement 30
CREATE TABLE SQL statement 30
CREATE VIEW SQL statement 30
D
-d runtime parameter 48
data
file concatenation, programming considerations 67
data conversions
capabilities 24
character-to-date 24
character-to-numeric 24
date-to-character 24
integer-to-decimal 24
numeric-to-numeric 24
data definition language 17, 37, 107
data dictionary, defined 276
data formats 22
data manipulation language 17, 37, 107
data manipulation, defined 277
data serialization 17
data types
ANSI/SQL DateTime 63
ANSI/SQL DateTime restrictions 63
database objects
protection and location 53
database specification
DATABASE command 107
EXECUTE command 127
DATABASE SQL statement 30
definition 91
syntax 107
datadesc specification
FIELD command 131
FILLER command 139
DATAENCRYPTION keyword
BEGIN LOAD command 100, 173
DATEFORM command
definition 90
function 28
syntax 108
DateTime data types, specifying 63
date-to-character data conversions 24
dbname specification
BEGIN LOAD command 104
LOGTABLE command 168
dbname. specification
BEGIN LOAD command 97
DBQL
defined 277
DDL 17, 37, 107
ddname specification
IMPORT command 144
decimal, zoned 24
DELETE DATABASE SQL statement 30
DELETE DML statement
in script 70, 71
DELETE keyword
EXECUTE command 127
DELETE macro 128
DELETE SQL statement 30, 109
definition 91
syntax 109
delimiters
defined 278
DISPLAY command
definition 90
function 28
syntax 111
DISPLAY ERRORS keyword, IMPORT command 149
DIT
defined 278
DML 17, 37, 107
DML command
definition 90
function 28
in script 70
overview 32
syntax 113
DML statements 32
DROP DATABASE SQL statement 30
DROP keyword, FIELD command 132
DWM
defined 278
dynamn entry point
for C INMOD routines 204
for SAS/C INMOD routines 204
E
-e filename runtime parameter 48
echo 176
ELSE command
definition 90
function 28
syntax 141
END LOAD command
definition 90
function 28
in script 71
syntax 126
ENDIF command
definition 90
function 28
syntax 141
errcount specification
BEGIN LOAD command 99
ERRLIMIT keyword
BEGIN LOAD command 98
ERRLOG=filename runtime parameter 48
error detection 193
error table
acquisition 197
error tables
reading 197
troubleshooting 197
errors 193
ERRORTABLE keyword
BEGIN LOAD command 96
EUC
defined 279
exclusion join
defined 279
EXECUTE SQL statement
definition 91
EXECUTE statement
syntax 127
execution time frame
defined 279
EXIT keyword
BEGIN LOAD command 105
EXIT name specification, BEGIN LOAD command 105
exit routines, definition 202
expr specification
UPDATE statement 186
expression specification
INSERT command 154
SET command 180
expressions, programming considerations 67
F
-f runtime parameter 45
failure, defined 279
features, advanced, INMODs 201
FIELD command
definition 90
function 28
in script 70
syntax 130
field, defined 279
fieldexpr specification
FIELD command 132
fieldname specification
FIELD command 130
FILLER command 139
INSERT command 154
file
size, maximum 67
file requirements
for invoking TPump 41
fileid 93, 111
filename specification
IMPORT command 146
FILLER command
definition 90
function 29
in script 70
syntax 139
FOR n specification, IMPORT command 147
FREE option, IMPORT command 146
frequency specification
BEGIN LOAD command 100
FROM keyword
IMPORT command 147
G
GIVE SQL statement 30
global rule
defined 280
GRANT SQL statement 30
graphic constants
hexadecimal 67
KanjiEBCDIC 67
support for 67
graphic data types
support for 66
GSS
defined 280
H
HOLD option
IMPORT command 146
hours specification
BEGIN LOAD command 101
I
IF command
definition 90
function 28
syntax 141
IGNORE keyword
DML command 114
IMPORT command
definition 90
function 29
in script 71
syntax 143
INDICATORS keyword
LAYOUT command 159
INFILE filename specification
IMPORT command 146
INFILE keyword
IMPORT command 144, 146
infilename standard input file specification 50
init-string specification
IMPORT command 145
INMODs 201
assembler example 246
C example 250
COBOL example 241
COBOL pass-thru example 244
compiling and linking 216
FastLoad 213
IBM interface 212
input return code values 215
input values 215
interface 213
major functions 201
output return code values 215
PL/I example 250
preparing program 214
programming 216
routines
entry points 204
platforms supported 202
programming languages supported 202
programming structure 203
rules and restrictions 209
using 211
TPump interface 212
UNIX interface 213
UNIX programming 216
Windows interface 213
inner join
defined 281
input/output
controlling 38
INSERT DML statement
in script 70
INSERT keyword
DML command 114
EXECUTE command 127
INSERT macro 128
INSERT SQL statement 30
definition 91
syntax 154
INTEGERDATE keyword
DATEFORM command 108
integer-to-decimal
data conversions 24
internationalization 18
invoking
on UNIX platform 42
on Windows platform 42
invoking TPump 41
J
Japanese character sets 23
JIS
defined 282
job
recovery if aborted 55
jobname specification
NAME command 170
join, defined 282
K
KanjiEBCDIC
graphic constants 67
KEY keyword, FIELD command 132
Korean and Chinese character sets 23
L
LABEL keyword
DML command 113
label specification
DML command 114
IMPORT command 149
LATENCY keyword
BEGIN LOAD command 102
LAYOUT command
definition 90
function 29
in script 70
syntax 157
LAYOUT name specification, IMPORT command 149
layoutname specification
IMPORT command 149
LAYOUT command 157
lock
access 33
acquisition 33
application 33
row hash locking 33
write 33
log table
space requirement calculation example 85
space requirements 84
LOGDATA command
syntax 160
LOGMECH command
syntax 161
LOGOFF command
definition 90
function 28
messages 80
syntax 162
logoff/disconnect message 74
LOGON command
definition 90
function 28
syntax 164
LOGTABLE command
definition 90
function 28
syntax 168
logtables
non-shareability 168
space requirements 169
M
-m runtime parameter 49
macro runtime parameters 49
MACRODB keyword
BEGIN LOAD command 104
macros 32
predefined 33
TPump usage 17
MACROS runtime parameter 49
MARK keyword
DML command 114
maximum
file size, programming considerations 67
row size, programming considerations 67
merge join
defined 283
messages 176
minutes specification
BEGIN LOAD command 102
MODIFY DATABASE SQL statement 30
monitor facility 80
MSG string specification, BEGIN LOAD command 105
MultiLoad utility
data conversion capabilities 24
MULTISET table 104, 114
MVS
syntax for invoking in batch mode 44
N
name 127
NAME command
definition 90
function 28
syntax 170
name specification
BEGIN LOAD command 105
IMPORT command 145
name, defined 284
Named Pipes Access Module 145
nested join
defined 284
NOATOMICUPSERT 104
NODROP keyword
BEGIN LOAD command 96
NOMONITOR keyword
BEGIN LOAD command 104
non-shareability
logtables 168
normal termination 51
NOSTOP keyword, IMPORT command 149
NOTIFY option specification, BEGIN LOAD command 104
NOTIMERPROCESS keyword
BEGIN LOAD command 102
null, defined 284
nullexpr specification
FIELD command 131
number specification
BEGIN LOAD command 96
PARTITION command 172
numeric-to-numeric
data conversions 24
O
Object Access filter
defined 285
OLE DB Access Module 145
operators
reserved words 56
options messages 74, 79
oscommand string specification
SYSTEM command 182
outer join
defined 285
outfilename standard output file specification 50
owner, defined 285
P
pack factor 85
PACK keyword
BEGIN LOAD command 102
PARTITION command 173
packing 116, 155
packing factor 85, 103, 174
PACKMAXIMUM keyword
BEGIN LOAD command 102
PARTITION command 173
parms specification
IMPORT command 147
parser, defined 286
parsing engine (PE), defined 286
PARTITION command
definition 90
function 29
syntax 172
PARTITION keyword
DML command 115
partition_name specification
DML command 115
PARTITION command 173
password specification
LOGON command 165
performance checklist
troubleshooting 200
performance group
defined 286
periodicity runtime parameter 48
PL/I language INMODs
programming structure 204
PRDICITY runtime parameter 48
predefined
macros 33
preparing a TPump script 69
procedures
defined 287
product join
defined 286
product version numbers 3
profiles
defined 287
programming INMODs
for UNIX-based clients 216
UNIX-based clients 216
Q
queries
defined 288
query analysis
defined 288
query management
defined 288
Query Resource filter
defined 288
R
-r tpump command runtime parameter 49
RATE keyword
BEGIN LOAD command 103
recovery
aborted TPump job 55
procedures 53
reduced print output runtime parameter 45
redundant conversions
supported 24
RENAME SQL statement 30
REPLACE MACRO SQL statement 30
REPLACE VIEW SQL statement 30
reporting
options messages 79
statistics 75
restart 76
request, defined 289
reserved words
use in TPump 56
restart
statistics 76
restart log table 52, 53
restart procedures 53
restrictions and limitations
programming considerations
aggregate operators 67
data file concatenation 67
expressions 67
maximum file size 67
maximum row size 67
Teradata RDBMS data retrieval 67
results file storage
defined 289
results files
defined 289
results tables
defined 289
retcode specification
LOGOFF command 162
return codes 52
termination 68
REVOKE SQL statement 30
ROBUST keyword
BEGIN LOAD command 104
ROUTE command
definition 91
function 28
syntax 176
row
size, maximum 67
row count variables 62
row hash locking 33
row, defined 290
RowID join
defined 289, 290
rule
defined 289
RUN FILE command
definition 91
function 28
syntax 178
S
scheduled requests
defined 290
script
example 72
preparation 69
writing guidelines 69
writing procedure 71
scriptencoding parameter 46
seconds specification
BEGIN LOAD command 102
self-contained statement
defined 290
separator, defined 290
SERIALIZE keyword
BEGIN LOAD command 102
SERIALIZEON keyword
DML command 115
session, defined 291
sessions 90, 95, 101, 162, 164
SESSIONS keyword
BEGIN LOAD command 96
PARTITION command 173
SET command
definition 91
function 28
syntax 180
SET QUERY_BAND SQL statement 30
SET SESSION COLLATION SQL statement 30
single sign on 164
SLEEP keyword 96, 101, 174
BEGIN LOAD command 103
software releases
supported 3
space requirements
for TPump log tables 84
log table 84
SQL
defined 291
SQL statements
DATABASE
definition 91
syntax 107
DELETE
definition 91
syntax 109
EXECUTE
definition 91
syntax 127
INSERT
definition 91
syntax 154
supported by TPump 29
UPDATE
definition 91
syntax 186
SQL, Teradata 37
SSO
LOGON command 164
starting TPump
on UNIX platform 42
on Windows platform 42
startpos specification
FIELD command 130
FILLER command 139
statement
defined 291
statement_rate specification
BEGIN LOAD command 103
statements specification
BEGIN LOAD command 103
PARTITION command 174
statements to execute if FALSE specification
IF, ELSE, and ENDIF command 141
statements to execute if TRUE specification
IF, ELSE, and ENDIF command 141
statements to resume with specification
IF, ELSE, and ENDIF command 141
statistics 75
facility 74
restart 76
stored procedures
defined 291
string variable
MSG specification, BEGIN LOAD command 105
TEXT specification, BEGIN LOAD command 105
supervisory user, defined 292
support commands, defined 27
support environment 37
SYSAPLYCNT system variable 60
SYSDATE system variable 59
SYSDATE4 system variable 59
SYSDAY system variable 59
SYSDELCNT system variable 59
SYSETCNT system variable 59
SYSINSCNT system variable 59
SYSJOBNAME 28
SYSJOBNAME system variable 59
SYSNOAPLYCNT system variable 60
SYSOS system variable 60
SYSRC system variable 60
SYSRCDCNT system variable 60
SYSRJCTCNT system variable 60
SYSTEM command
definition 91
function 28
syntax 182
system failure
restart and recovery 53
system variables 59
&SYSAPLYCNT 60
&SYSDATE 59
&SYSDATE4 59
&SYSDAY 59
&SYSDELCNT 59
&SYSETCNT 59
&SYSJOBNAME 59
&SYSNOAPLYCNT 60
&SYSOS 60
&SYSRC 60
&SYSRCDCNT 60
&SYSRJCTCNT 60
&SYSTIME 60
&SYSUPDCNT 60
&SYSUSER 60
SYSTIME system variable 60
SYSUPDCNT system variable 60
SYSUSER system variable 60
T
TABLE command
definition 91
function 29
in script 70
syntax 184
table, defined 292
tableref specification
TABLE command 184
tables
fallback 35
nonfallback 35
task
commands 27
task commands 38
syntax and usage 89
usage 57
tdpid specification
LOGON command 164
TENACITY keyword 101
BEGIN LOAD command 101
Teradata RDBMS
data retrieval, programming considerations 67
Teradata SQL 37
Teradata SQL statements
supported by TPump 29
terminating a TPump job 52
termination
return codes 52, 68
text 111
TEXT string specification, BEGIN LOAD command 105
threshold specification
PARTITION command 174
THRU keyword
IMPORT command 147
time and date variables 61
title, defined 292
tname specification
BEGIN LOAD command 97
DELETE statement 109
INSERT command 154
LOGTABLE command 168
UPDATE statement 186
TPump
advanced features 201
invoking
batch mode on MVS 44
batch mode on UNIX 43
batch mode on VM 44
batch mode on Windows 43
macros 32
Monitor facility 80
Monitor facility interface 80
script, example of 72
support command, defined 27
support environment 38
using INMOD routines 211
tpump command runtime parameter 49
TPump Conditional Expressions 57
TPump/INMOD Routine Interface 205
transaction, defined 293
troubleshooting 193
early error detection 193
error detection 193
reading error tables 197
type, defined 293
U
Unicode
Character Sets 25
UNIX
starting TPump 42
syntax for invoking in batch mode 43
UPDATE DML statement 70
in script 70
UPDATE keyword
EXECUTE command 127
UPDATE macro 128
UPDATE SQL statement 30
definition 91
syntax 186
upsert
Atomic 122
example 121
feature 32, 120
UPSERT keyword 32
DML command 121
EXECUTE command 127
UPSERT macro 128
USE keyword
DML command 115
use_field specification
DML command 115
user 294
user groups
defined 294
username specification
LOGON command 164
USING (parms) specification
IMPORT command 147
USING keyword
IMPORT command 147
UTF16
Character Sets 26
UTF-8
defined 294
UTF8
Character Sets 25
utility variables
supported by TPump 62
V
-v 75
-V runtime parameter 50
-v runtime parameter 50
var 92
var specification
SET command 180
variables
date and time 61
row count 62
substitution 62
utility
supported by TPump 62
VARTEXT variable-length text record format 148
verbose mode runtime parameter 50
VERBOSE runtime parameter 50
version numbers 3
VM
syntax for invoking in batch mode 44
W
WebSphere® Access Module for Teradata (client version) 145
WebSphere® Access Module for Teradata (server version) 145
WHERE condition specification
DELETE statement 109
IMPORT command 150
UPDATE statement 186
Windows
starting TPump 42
syntax for invoking in batch mode 43
workgroups
defined 295
workload limits rule
defined 295
Z
zoned decimal format 24