Teradata Aster Big Analytics Appliance 3H
Database Administrator Guide
Release 5.10
B700-7011-510K
May 2013
The product or products described in this book are licensed products of Teradata Corporation or its affiliates.
Teradata, Active Data Warehousing, Active Enterprise Intelligence, Applications-Within, Aprimo, Aprimo Marketing Studio, Aster, BYNET,
Claraview, DecisionCast, Gridscale, MyCommerce, Raising Intelligence, Smarter. Faster. Wins., SQL-MapReduce, Teradata Decision Experts,
"Teradata Labs" logo, "Teradata Raising Intelligence" logo, Teradata ServiceConnect, Teradata Source Experts, "Teradata The Best Decision Possible"
logo, The Best Decision Possible, WebAnalyst, and Xkoto are trademarks or registered trademarks of Teradata Corporation or its affiliates in the
United States and other countries.
Adaptec and SCSISelect are trademarks or registered trademarks of Adaptec, Inc.
AMD Opteron and Opteron are trademarks of Advanced Micro Devices, Inc.
Apache, Apache Hadoop, Hadoop, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation
in the United States and/or other countries.
Axeda is a registered trademark of Axeda Corporation. Axeda Agents, Axeda Applications, Axeda Policy Manager, Axeda Enterprise, Axeda Access,
Axeda Software Management, Axeda Service, Axeda ServiceLink, and Firewall-Friendly are trademarks and Maximum Results and Maximum
Support are servicemarks of Axeda Corporation.
Data Domain, EMC, PowerPath, SRDF, and Symmetrix are registered trademarks of EMC Corporation.
GoldenGate is a trademark of Oracle.
Hewlett-Packard and HP are registered trademarks of Hewlett-Packard Company.
Hortonworks, the Hortonworks logo and other Hortonworks trademarks are trademarks of Hortonworks Inc. in the United States and other
countries.
Intel, Pentium, and XEON are registered trademarks of Intel Corporation.
IBM, CICS, RACF, Tivoli, and z/OS are registered trademarks of International Business Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
LSI is a registered trademark of LSI Corporation.
Microsoft, Active Directory, Windows, Windows NT, and Windows Server are registered trademarks of Microsoft Corporation in the United States
and other countries.
NetVault is a trademark or registered trademark of Quest Software, Inc. in the United States and/or other countries.
Novell and SUSE are registered trademarks of Novell, Inc., in the United States and other countries.
Oracle, Java, and Solaris are registered trademarks of Oracle and/or its affiliates.
QLogic and SANbox are trademarks or registered trademarks of QLogic Corporation.
Red Hat is a trademark of Red Hat, Inc., registered in the U.S. and other countries. Used under license.
SAS and SAS/C are trademarks or registered trademarks of SAS Institute Inc.
SPARC is a registered trademark of SPARC International, Inc.
Symantec, NetBackup, and VERITAS are trademarks or registered trademarks of Symantec Corporation or its affiliates in the United States and
other countries.
Unicode is a registered trademark of Unicode, Inc. in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other product and company names mentioned herein may be the trademarks of their respective owners.
THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN "AS-IS" BASIS, WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR
NON-INFRINGEMENT. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THE ABOVE EXCLUSION
MAY NOT APPLY TO YOU. IN NO EVENT WILL TERADATA CORPORATION BE LIABLE FOR ANY INDIRECT, DIRECT, SPECIAL, INCIDENTAL,
OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS OR LOST SAVINGS, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
The information contained in this document may contain references or cross-references to features, functions, products, or services that are not
announced or available in your country. Such references do not imply that Teradata Corporation intends to announce such features, functions,
products, or services in your country. Please consult your local Teradata Corporation representative for those features, functions, products, or
services available in your country.
Information contained in this document may contain technical inaccuracies or typographical errors. Information may be changed or updated
without notice. Teradata Corporation may also make improvements or changes in the products or services described in this information at any time
without notice.
To maintain the quality of our products and services, we would like your comments on the accuracy, clarity, organization, and value of this document.
Please email: [email protected].
Any comments or materials (collectively referred to as "Feedback") sent to Teradata Corporation will be deemed non-confidential. Teradata
Corporation will have no obligation of any kind with respect to Feedback and will be free to use, reproduce, disclose, exhibit, display, transform,
create derivative works of, and distribute the Feedback and derivative works thereof without limitation on a royalty-free basis. Further, Teradata
Corporation will be free to use any ideas, concepts, know-how, or techniques contained in such Feedback for any purpose whatsoever, including
developing, manufacturing, or marketing products or services incorporating Feedback.
Copyright © 2000-2013 by Teradata Corporation. All Rights Reserved.
Table of Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Conventions Used in This Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Typefaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
SQL Text Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Command Shell Text Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Contact Teradata Global Technical Support (GTS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
About Teradata Aster. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
About This Document. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
VOLUME 1
Aster Database Administrator Guide
Chapter 1: Overview of Cluster Management . . . . . . . . . . . . . . . . . . 16
Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
PostgreSQL and Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Node Types in Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Single-System View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Data Partitioning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Massively Parallel Processing (MPP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
High Availability Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
User Interfaces to Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Cluster Administration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Database Administration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Data Path Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Bulk Data Utilities. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Chapter 2: The Aster Management Console (AMC). . . . . . . . . . . . 26
AMC Setup and Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
AMC System Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
AMC Installation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Log in to the AMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Manage the AMC Certificate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Disable HTTPS access for the AMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
AMC Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Configure Firefox to Trust the AMC Certificate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Configure Internet Explorer to Trust the AMC Certificate . . . . . . . . . . . . . . . . . . . . . . . . 30
Overview of the AMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
The AMC Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Processes Section of the Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Process Summary Box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Query Statistics Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Nodes Section of the Dashboard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Nodes Summary Box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Nodes Statistics Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Cluster-Wide Disk Capacity/Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Aster Database Cluster Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
The Status Icon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Cluster Status Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Allowed Administrative Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Chapter 3: Process Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Processes Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Process Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Filter the Process List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Disable Automatic Refresh of Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Query Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Monitor Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Monitor Process Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Cancel SQL Statements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Monitor SQL-MapReduce Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Chapter 4: Manage Data and Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Node Overview Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
The Node List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Node Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Node States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Disk Storage Utilization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Monitor Hardware. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Node Failures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
View Hardware Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
View Partition Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Replication Factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Check the Current Replication Factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Restore the Replication Factor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Change the Replication Factor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Inspect Individual Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Read Aster Database Logs in the AMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Create Log Bundles for Support Inquiries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Aster Database Log Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Detect and Manage Skew. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Table Skew (Data Skew) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Partition Level Skew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Process Skew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Chapter 5: Cluster Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Convert the Secondary Queen to a Loader . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Add New Nodes to the Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Delete All Data to Re-Provision a Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Add Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Activate Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Incorporate the New Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Balance Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Split Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Prepare for Partition Splitting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Partition Splitting Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Chapter 6: Queen Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Introduction to Queen Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Replace a Failed Queen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Run the Queen Replacement Script . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
What is kept and what is lost during queen replacement? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Best Practices for Ensuring Queen Recoverability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Supporting Procedures for Queen Replacement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Set Up Passwordless Root SSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Install the Secondary Queen Software. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Chapter 7: Administrative Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Cluster Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Check Hardware Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Check Node Hardware Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Remove Nodes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Manage Network Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Multi-NIC Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
NIC Bonding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Set up IP Pools in the AMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Manage Backups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Add a New Backup Manager to the AMC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Start a Backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Monitor and Manage Backups. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Configure Cluster Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Cluster Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
Sparkline Graph Scale Units. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Graph Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Internet Access Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Support Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
QoS Concurrency Threshold Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Roles and Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
View the List of Available AMC User Privileges. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Create an AMC User. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Check Current AMC Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Edit AMC Privileges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Configure Hosts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Set Up Host Entries for all Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Set up DNS entries for all Aster Database nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Restart Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Soft Restart. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Backup interaction with soft-restart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Hard Restart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Soft Shutdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Activate Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Situations that Require an Activation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Activate Aster Database: The Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Balance Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Balance Data: The Procedure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Balance Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Balance Process: The Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Cluster Management from the Command Line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Check Cluster Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Soft Restart. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Soft Shutdown . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Soft Startup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Free Space Occupied By Defunct VWorkers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Chapter 8: Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Aster Database Firewall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Default Firewall Settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Open TCP Ports for Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Enable or Disable the Aster Database Firewall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Chapter 9: Monitor Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Monitor Events with the Event Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Event Engine Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Manage Event Subscriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Upgrades of Event Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
View Event Subscriptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Supported Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Remediations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Event Engine Best Practices/FAQs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Test the Event Engine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Troubleshoot Event Engine Issues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Monitor Aster Database with SNMP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Set Aster Database to send SNMP traps to an NMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Set an NMS to perform SNMP reads on Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . 146
Chapter 10: Command Line Interface (ncli) . . . . . . . . . . . . . . . . . . . 148
ncli Installation and Setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Install ncli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Required Privileges to Run ncli Commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
ncli Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Who Should Use ncli?. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Issue ncli Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Command Line Conventions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
ncli Help. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
ncli Command Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
ncli Command Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
ncli apm Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
ncli database Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
ncli disk Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
ncli events Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
ncli ice Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
ncli ippool Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
ncli netconfig Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
ncli node Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
ncli nsconfig Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
ncli process Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
ncli procman Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
ncli qos Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
ncli query Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
ncli replication Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
ncli session Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
ncli sqlh Section. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
ncli sqlmr Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
ncli statsserver Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
ncli sysman Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
ncli system Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
ncli tables Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
ncli util Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
ncli vworker Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
ncli Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Limit Actions of ncli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Format and Sort Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Miscellaneous Flags. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
8
Teradata Aster Big Analytics Appliance Database Administrator Guide
Chapter 11: Executables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Executables Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Executables Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Preinstalled Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Executable Jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Running Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Best Practices for Running Scripts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Creating Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
SQL Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Cluster Utility SQL-MapReduce Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Best practices for building scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
Upgrades . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Rebuilding Custom Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Executables Not Supported from Prior Versions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Chapter 12: Teradata Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Managing Aster Database with Teradata Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Information available through Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
Administrative operations available through Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . 215
Configuring Aster Database for use with Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Troubleshooting the Viewpoint integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Chapter 13: Logs in Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
Diagnostic Log Bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
View Diagnostic Bundle Jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Send a Diagnostic Log Bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Save a Diagnostic Log Bundle on Your Local Filesystem . . . . . . . . . . . . . . . . . . . . . . . . . 223
Include All Nodes in a Diagnostic Log Bundle. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
Prepare a Custom Diagnostic Log Bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
Run Custom Commands in a Diagnostic Log Bundle Job . . . . . . . . . . . . . . . . . . . . . . . . 224
View Diagnostic Log Bundle Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Aster Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Preface
This guide explains the software tasks you will perform to manage your Aster Database cluster.
If you’re using a later version, you must download a newer edition of this guide!
The following additional resources are available:
• Aster Database upgrades, clients, and other packages: http://downloads.teradata.com/download/tools
• Documentation for existing customers with a Teradata @ Your Service login: http://tays.teradata.com/
• Documentation that is available to the public: http://www.info.teradata.com/
Other documentation available from Teradata Aster includes:
• Aster Analytics Foundation Guide documents all available SQL-MR analytic functions, including function name, description, syntax, arguments, and examples.
• Aster Development Environment User Guide documents the Teradata Aster plug-in for Eclipse, which enables developers to create their own SQL-MR Java-based functions within a visual development environment.
• Aster Database Upgrade Guide explains how to upgrade an Aster Database cluster or Aster Database Backup cluster.
Conventions Used in This Guide
This document assumes that the reader is comfortable working in Windows and Linux/UNIX
environments. Many sections assume you are familiar with SQL.
This document uses the following typographical conventions.
Typefaces
Command line input and output, commands, program code, filenames, directory names, and
system variables are shown in a monospaced font. Words in italics indicate an example or
placeholder value that you must replace with a real value. Bold type is intended to draw your
attention to important or changed items.
SQL Text Conventions
In the SQL synopsis sections, we follow these conventions:
• Square brackets ([ and ]) indicate one or more optional items.
• Curly braces ({ and }) indicate that you must choose an item from the list inside the braces. Choices are separated by vertical lines (|).
• An ellipsis (...) means the preceding element can be repeated.
• A comma and an ellipsis (, ...) means the preceding element can be repeated in a comma-separated list.
• In command line instructions, SQL commands and shell commands are typically written with no preceding prompt, but where needed the default Aster Database SQL prompt is shown: beehive=>
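For example, given a hypothetical synopsis written in this notation, CREATE TABLE name [ IF NOT EXISTS ] ( column type [, ...] ), the brackets and comma-ellipsis are notation only and are dropped in a concrete statement:

```sql
-- The optional IF NOT EXISTS item from the synopsis is included here, and
-- the comma-ellipsis expands into an actual comma-separated column list.
CREATE TABLE IF NOT EXISTS employees (
    id   INTEGER,
    name VARCHAR(100)
);
```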
Command Shell Text Conventions
For shell commands, the prompt is usually shown. The $ sign introduces a command that’s
being run by a non-root user:
$ ls
The # sign introduces a command that’s being run as root:
# ls
Contact Teradata Global Technical Support (GTS)
For assistance and updated documentation, contact Teradata Global Technical Support (GTS):
• Support Portal: http://tays.teradata.com/
• International: 212-444-0443
• US Customers: 877-698-3282
• Toll Free Number: 877-MyT-Data
About Teradata Aster
Teradata Aster provides data management and advanced analytics for diverse and big data,
enabling the powerful combination of cost-effective storage and ultra-fast analysis of
relational and non-relational data. Teradata Aster is a division of Teradata and is
headquartered in San Carlos, California.
For more information, go to http://www.asterdata.com
About This Document
This is the “Teradata Aster Big Analytics Appliance Database Administrator Guide,” version
5.10, edition 1. This edition covers Aster Database version AD 5.10.00.00 and was published
May 2, 2013 11:28 am.
Get the latest edition of this guide! This document is updated very frequently. You can find the
latest edition at http://tays.teradata.com/
Revision History
Date        Description
May 2013    Initial Release 5.10
Aster Database Administrator Guide
This guide explains how to manage your Aster Database cluster. The subsections are:
• Overview of Cluster Management
• The Aster Management Console (AMC)
• Process Management
• Manage Data and Nodes
• Cluster Expansion
• Queen Replacement
• Administrative Operations
• Security
• Monitor Events
• Command Line Interface (ncli)
• Executables
• Teradata Tools
• Logs in Aster Database
• Aster Glossary
CHAPTER 1
Overview of Cluster Management
Aster Database is a massively parallel processing (MPP) database for general-purpose and
analytic data warehousing. The Aster Database system is built on the foundations of cluster
commodity computing and can glue thousands of commodity machines together into a single
database that gives the user and administrator a single-system view. The notion of a
single-system view is central to the architecture of Aster Database. Administrators, analysts, and
software applications (business intelligence tools, management consoles, etc.) interact with a
single machine that offers the speed and processing power of the entire cluster, but hides
cluster management tasks from the end user.
This overview of Aster Database cluster management includes the following sections:
• Architecture
• High Availability Overview
• User Interfaces to Aster Database
Architecture
Aster Database forms a massively-parallel database that can scale from three nodes (servers) to
thousands of nodes. The smallest Aster Database configuration, at three nodes, includes one
queen and two worker nodes. A node in Aster Database is a standard, inexpensive commodity
x86 server from vendors such as HP, IBM, or Dell, with locally-attached storage (also known
as direct-attached storage), and networked with other nodes using commodity Gigabit
Ethernet (GigE) technology. In larger Aster Database configurations that span multiple racks,
individual nodes are internetworked using 1 GigE, while racks are hierarchically networked
using 10 GigE ports or multiple-trunked 1 GigE ports.
The Aster Database system is based on a multi-tiered architecture that emphasizes a clean
separation of roles for nodes to meet the challenges of massive-scale data warehousing and
analytic processing. A tier is formed by a category of nodes that are dedicated to executing a
particular type of warehousing task. The presence of multiple tiers helps isolate workloads
that compete for different cluster resources.
Incremental scaling is a hallmark of Aster Database. Each tier can be independently and
incrementally scaled in response to workload demands. Traditional data warehouses require
customers to plan out warehousing capacity requirements months or years in advance,
necessitating large up front capital investments. In stark contrast, Aster Database can start out
as a small installation and scale out incrementally, one node at a time. You can add capacity
(worker nodes) and loading/exporting bandwidth (loader/exporter nodes) on an as-needed
basis.
Figure 1: Aster Database Cluster Architecture
Aster Database is built to scale on heterogeneous hardware. You can extend multiple
generations of commodity hardware to scale easily over time. Teradata Aster does, however,
recommend that all worker nodes use the same size disk.
PostgreSQL and Aster Database
Aster Database utilizes components from the best-in-breed open source database, PostgreSQL
(or Postgres), to provide high performance local database processing on worker nodes. Aster
Database version 5.10 runs a new database kernel on the nodes. This kernel incorporates
components of version 8.4 of PostgreSQL.
Node Types in Aster Database
Aster Database divides the set of warehousing tasks among various classes of task-dedicated
nodes. The basic classes include the queen, workers, loaders, and backup nodes. Different
classes of nodes typically run on different classes of server hardware.
Figure 2: Aster Database Architecture
• Queen: The queen node is the cluster coordinator, top-level query planner/coordinator, and keeper of the data dictionary and other system tables. You can maintain an inactive queen as a backup.
  • Cluster coordination: The cluster logic that glues all nodes of the system together is hosted on the queen. This software component is responsible for all cluster, transaction, and storage management aspects of the system. In this role, the queen is also responsible for seamless software delivery to all other nodes in the cluster.
  • Distributed query planning: The queen manages the distribution of data in the cluster, prepares top-level, partition-aware query plans, issues queries to virtual workers, and assembles the query results. The virtual workers, in turn, prepare local query plans and execute the queen's queries in parallel. The queen structures top-level queries so that little or no data is shipped to the queen until the final phase, when the query results are assembled and sent to the client.
  • System tables: The queen hosts the Aster Database system tables.
• Worker Nodes: As the name implies, worker nodes are the physical machines where the bulk of the data storage, analysis, and retrieval tasks get done in Aster Database. Actually doing these tasks is the responsibility of the virtual workers (vworkers) that reside on each worker node. There is usually more than one vworker per worker node. The number of virtual workers on each worker node is a function of the hardware configuration of the node: the number of CPU cores, memory, and direct-attached disk capacity. The queen communicates with vworkers via standard SQL, and the vworkers on various worker nodes communicate with each other via Teradata Aster's Optimized Transport mechanism.
• Loader Nodes (also called "Exporters"): These are CPU-heavy nodes that typically have little to no disk capacity and help in independent scaling of CPU and disk in the cluster. They are responsible for loading and exporting data into and from the cluster. These independent nodes also help isolate loads and exports from query processing.
• Backup Nodes: Individual tables or the contents of your entire Aster Database system can be backed up to an Aster Database backup cluster. The backup cluster is not an Aster Database. Instead, it is a set of disk-heavy Aster Database Backup Nodes designed to efficiently maintain copies of the data you store in Aster Database. This data can be restored to Aster Database.
Single-System View
All clients of Aster Database communicate with the cluster as if it is a single large system. The
Aster Database Management Console (AMC), the Aster Database Cluster Terminal (ACT), the
Aster Database JDBC and ODBC drivers, and the Aster Database Backup Terminal each
interact with Aster Database as if it were a single database. More details on these client
tools follow in subsequent sections.
Data Partitioning
Aster Database achieves a massively-parallel database by exploiting the popular "divide-and-conquer" principle in computing. Data is partitioned (distributed) among various shared-nothing nodes (worker nodes), and within each worker node, among various shared-nothing vworkers.
• Distribution: Tables in Aster Database are created with an added SQL qualifier of FACT or DIMENSION. Fact (or large dimension) tables are distributed into individual vworkers that span across multiple worker nodes in the cluster. The key (column) to use for distribution is provided in the CREATE TABLE DDL (data definition language).
• Logical Partitioning: Physical partitions of fact (or large dimension) tables are further divided within each vworker via logical partitioning within a physical partition. Multilevel partition hierarchies provide a powerful mechanism to prune data required during query processing. These sub-partitions are called "child partitions" in Aster Database and they report into a "parent table" at the highest level of the hierarchy.
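As an illustrative sketch only (the exact qualifiers and partitioning clauses vary by Aster Database release, and the table and column names here are hypothetical), a distributed fact table with logical child partitions might be declared along these lines; consult the CREATE TABLE reference for your version before relying on this form:

```sql
-- Hypothetical DDL sketch: a FACT table distributed across vworkers by a
-- hash of customer_id, with logical (child) partitions pruned by sale_date.
-- Clause spellings are illustrative, not authoritative for any release.
CREATE FACT TABLE sales (
    sale_id     BIGINT,
    customer_id BIGINT,
    sale_date   DATE,
    amount      NUMERIC(12,2)
)
DISTRIBUTE BY HASH (customer_id)   -- distribution key named in the DDL
PARTITION BY RANGE (sale_date);    -- logical partitioning within each vworker
```

Queries that filter on sale_date can then prune child partitions, and queries that join on customer_id can often be satisfied without repartitioning, which is the benefit this section describes.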
Figure 3: A table showing both distribution and logical partitioning
Massively Parallel Processing (MPP)
Aster Database is a true massively-parallel database and has been built from the ground up
with inexpensive commodity hardware in mind. Every operation in Aster Database – queries,
loads, exports, index builds and rebuilds, upgrades, replication, backups, restores – is always
executed in a massively parallel fashion.
• MPP at workers: An Aster Database is composed of numerous vworkers hosted at the worker nodes, as described earlier. Each individual query, planned at the queen, is executed in massively-parallel fashion across all partitions. Partitions communicate with each other during dynamic repartitioning of data – a concept in shared-nothing databases explored in later sections – via a massively parallel communication transport called Optimized Transport.
• MPP in ETL: The Aster Database Loader utility and loader nodes form the massively-parallel backbone of the Aster Database ETL pipeline. The Aster Database Loader utility communicates with loader nodes and acts as a landing zone for bulk data, both during loads and exports.
• MPP in Backup and Recovery: When a backup administrator starts a backup by interacting with the Aster Database Backup Terminal, massively parallel streams of backup data travel from each partition source (N) to the destination backup nodes (M). This NxM communication is true for both backups and for the reverse traffic during recoveries.
• MPP in Upgrades: During a normal cluster startup, software is delivered via the network in a massively-parallel way. Similarly, after an upgrade package is deployed on the queen, workers and loaders in Aster Database are upgraded in a massively-parallel fashion without any manual administrator intervention.
• MPP in Network: Once you add a worker or loader node, the Aster Database Network Aggregation feature automatically self-discovers all the other NIC IP addresses connected to that node. Once that node is activated, you have multiple network links aggregated together, a feature called Network Aggregation. This "bonding" offers two advantages, described below.
  • Bandwidth Aggregation: NIC bonding aggregates all the bandwidth of the individual 1GbE links for expanded network throughput. Before bonding, you have 1x1GbE link per node. After bonding, you have n times that. For example, if you have 4x1GbE NICs on a node, after bonding you have four times the bandwidth, or effectively 4Gb bandwidth.
  • Transparent failover in the event of any single network failure: In the event of any single network failure (NIC port, cable link, switch), there will be transparent failover to the remaining links. The only impact is that bandwidth will adjust down accordingly. If you have 4x1GbE NICs with two links connected to separate redundant GbE switches and one of the switches fails, the result is an automatic failover to the redundant switch, and the bandwidth shrinks from 4x1GbE links to 2x1GbE links. There is no downtime and continuous availability is maintained.
High Availability Overview
Cluster commodity hardware fails in more ways than one. Disks, RAID controllers, chipsets,
DIMM modules, CPU, network cards, and switches are all commodity components that have
a distinct possibility of failing in a large cluster. A RAID-10 disk configuration does not
provide sufficient reliability for the high availability demands of modern data users. Aster
Database, therefore, has high availability built into the cluster and offers replication as a first-class feature.
• Replication Factor: Each vworker in Aster Database has zero or more replicas in the cluster. The Aster Database administrator sets the replication factor to two (recommended; Aster Database tries to ensure each vworker has a replica at all times and alerts the administrators if there is not) or one (no replication; not recommended), depending on the desired trade-off between reliability and storage capacity.
  With the introduction of Analytic Tables in Aster Database 5.10, there is less of a trade-off, because you can declare analytic tables to hold copies of data for analysis. The analytic tables are not replicated, so they should only be used to hold derived data, and never for the master data source. This allows you to use a replication factor of 2 for the cluster, with the option of using analytic tables (with a replication factor of 1, effectively) to hold large amounts of data for several days. DML (data manipulation language) statements, loads, and queries go directly to the primary partition. To view your system's replication factor, view the Dashboard tab of the AMC.
• Automatic Failover and Online Resync: On any failure of the primary partition, the Aster Database clusterware automatically fails workloads over to a chosen secondary partition. Stale secondary replicas catch up using delta replication, where "delta" signifies changes present at the primary but not at the secondary. This is particularly useful after transient (temporary) failures. For example, assume Node 2 suffers a transient failure and a partition fails over to Node 4. If Node 2 fully recovers after two minutes, there may have been small changes (e.g. additional inserts/updates/deletes). Delta replication enables "delta re-synchronization" between the partition on Node 4 (up-to-date) and the partition on Node 2 (slightly out-of-date). Online re-sync can save significant recovery time compared to conventional approaches that rely on full copy restoration techniques that take hours or even days to complete. Queries are transparently retried on a failover.
Figure 4: Replication Failover in Aster Database
• Balance Data: When a new worker is added or a failed worker is brought back into the system after a transient failure, the node is incorporated in a completely seamless manner without any perturbation to the existing workload. Any new replicas are created in the background, totally online.
• Spare Machines: Teradata Aster recommends keeping spare servers in the cluster so that such nodes can be quickly repurposed to replace a failed queen, loader, worker, or backup node.
• Network Aggregation: Once you add a worker or loader node, the Aster Database Network Aggregation feature automatically self-discovers all the other NIC IP addresses connected to that node. When that node is activated, you have multiple network links aggregated together. In the event of any single network failure (NIC port, cable link, switch), there is transparent failover to the remaining links.
User Interfaces to Aster Database
The tools you use to interact with Aster Database include data-path interfaces such as SQL,
JDBC, and ODBC, bulk data path interfaces for loading/exporting, and tools for managing the
cluster. The sections below explain the most common Aster Database-related tools.
Cluster Administration
Aster Database Management Console (AMC): The AMC is a graphical user interface that is
the primary cluster management interface for Aster Database. For details, see The Aster
Management Console (AMC) (page 26).
Event Notification and System Monitoring: The Aster Database Event Engine lets you set up
alerts that fire when certain events happen on the cluster. See “Monitor Events with the Event
Engine” on page 132. In addition, you can get help from Teradata Aster consulting services to
implement node, service, and network monitoring tools using other frameworks such as the
popular open-source Nagios API.
Command line: Most management tasks can be done in the AMC, but to run some less-frequently used utilities, you must use the executables framework, the ncli (Aster Database Command Line Interface), or open a command line session on the Aster Database queen. See "Executables" on page 192, see "Command Line Interface (ncli)" on page 148, and see "Cluster Management from the Command Line" on page 124.
Database Administration
Aster Database Cluster Terminal (ACT): ACT is the basic command line interactive terminal
to interact with the cluster. Administrators can run DDLs and other commands to administer
the database, browse the catalog and create/modify objects, and run SQL statements and
scripts.
AquaFold’s Aqua Data Studio (ADS): ADS lets you perform DDL operations and query data
interactively. This is a third-party tool that you may purchase from AquaFold directly.
Data Path Interfaces
JDBC and ODBC: Aster Database provides JDBC and ODBC drivers.
SQL via ACT: Developers and administrators can also run SQL statements and scripts by
using the interactive ACT terminal.
AquaFold’s Aqua Data Studio (ADS): ADS lets you query data interactively and provides
tools that help you write and manage queries efficiently.
Bulk Data Utilities
Aster Database Loader Tool
The Aster Database Loader Tool, ncluster_loader, is a full-featured, high-speed bulk loading
application.
Aster Database Backup
Aster Database Backup lets you back up individual tables or your entire Aster Database.
Backups are online operations, meaning your cluster remains up and servicing queries while
the backup runs. Backups can occur automatically and in an incremental fashion to save space
and bandwidth.
CHAPTER 2
The Aster Management Console (AMC)
The Aster Database Management Console (AMC) is the main administrative interface to Aster
Database. This chapter provides an overview of the AMC and describes the AMC Dashboard.
• AMC Setup and Configuration
• Overview of the AMC
• Processes Section of the Dashboard
• Nodes Section of the Dashboard
• "Aster Database Cluster Status" on page 38
AMC Setup and Configuration
AMC System Requirements
You can use the AMC from most common web browsers. See the Aster Database 5.10 Server
Platform Guide for a list.
AMC Installation
No installation is required. The AMC is installed by default when you install Aster Database
on the queen.
Log in to the AMC
To connect to the AMC, the queen node must be powered on, and the Aster Database software
must be active. To access the AMC:
1 Open a web browser and enter the IP address of the queen node in the URL field:
https://<Queen_IP_Address>/chrysalis/login
If the login window does not appear, see “Troubleshooting: AMC login window does not
appear” on page 31.
Tip! AMC uses browser cookies. You must enable your browser to accept cookies from the queen node in order to
use AMC.
2 In the Login window, enter your username and password.
When you log in for the first time, the default username/password is db_superuser/db_superuser.
Figure 5: AMC Login Screen
Role-based access
The AMC enforces role-based access privileges. A user sees only those sections of the AMC to
which his or her user account has been granted access. See “Create an AMC User” on
page 114.
HTTPS and certificate warnings
AMC runs on HTTPS and runs with a default Aster Database self-signed certificate. As a
result, your browser warns you and displays a certificate error. See “Trust the AMC
Certificate” on page 28 for instructions on hiding these error messages.
Other useful URLs
Ganglia is accessible at this URL:
http://<queen IP address>/ganglia
Manage the AMC Certificate
Install the AMC certificate at a new location
The default location of the certificate is /home/beehive/certs/server.cert. You can
install your own AMC certificate at a different location by performing the following steps.
1 Place the new certificate file in the desired location on the Aster Database queen.
2 In the file /home/beehive/apache/conf/conf.d/ssl.conf, change the line
SSLCertificateFile /home/beehive/server.cert
to
SSLCertificateFile <absolute_path_of_new_cert_file>
3 Open a command-line session on the queen as user root and issue the following statement to restart Apache:
# /home/beehive/toolchain/x86_64-unknown-linux-gnu/httpd-2.2.15/bin/httpd -f /home/beehive/apache/conf/httpd.conf -k restart
This will use the certificate file from the new location.
4 If the restart of Apache fails, do the following:
a Add the path to the Python installation in the Aster toolchain to the library path:
# LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/beehive/toolchain/x86_64-unknown-linux-gnu/python-2.5.2/lib/
b Restart Apache again:
# /home/beehive/toolchain/x86_64-unknown-linux-gnu/httpd-2.2.15/bin/httpd -f /home/beehive/apache/conf/httpd.conf -k restart
Disable HTTPS access for the AMC
You may optionally allow users to connect to the AMC over an unencrypted HTTP
connection. This procedure disables HTTPS. All the AMC traffic will travel over HTTP.
Procedure
1 Open the file /home/beehive/apache/conf/conf.d/jk.conf in a text editor.
2 Comment out the line LoadModule rewrite_module modules/mod_rewrite.so.
3 Issue the following statement to restart Apache:
/home/beehive/toolchain/x86_64-unknown-linux-gnu/httpd-2.2.15/bin/httpd -f /home/beehive/apache/conf/httpd.conf -k restart
AMC Troubleshooting
Trust the AMC Certificate
The AMC runs using secure HTTP, at an https URL on the queen machine. To open the AMC,
type https:// and the IP address or hostname of your queen:
https://<Queen IP Address>/chrysalis/login
Your browser will display a warning stating that the connection is not trusted. This warning
appears because the AMC connection is secured by Teradata Aster’s self-signed digital
certificate. You can configure your browser to trust this certificate so that the warning does not
appear. To do this, follow the instructions in one of these sections:
• Configure Firefox to Trust the AMC Certificate
• Configure Internet Explorer to Trust the AMC Certificate
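Before adding a permanent browser exception, it is reasonable to confirm the certificate's identity from the command line. This sketch uses stock openssl (not an Aster tool); a throwaway self-signed certificate stands in for the queen's server.cert, and the CN shown is a made-up example.

```shell
# Sketch: print a certificate's subject and SHA-1 fingerprint so they can
# be compared with what the browser warning reports before trusting it.
dir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj '/CN=aster-queen.example' \
  -keyout "$dir/server.key" -out "$dir/server.cert" 2>/dev/null

openssl x509 -in "$dir/server.cert" -noout -subject -fingerprint -sha1
```

Against a live queen you could fetch the served certificate with `openssl s_client -connect <queen>:443` and inspect it the same way.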
Configure Firefox to Trust the AMC Certificate
1. In Firefox, click the I understand the risks link.
Figure 6: Firefox: Trusting the AMC certificate
2. Click the Add Exception button.
Figure 7: Add Exception button in Firefox
3. In the next window, make sure the checkbox, Permanently store this exception, is checked.
4. Click the Confirm Security Exception button. Firefox will now trust the AMC certificate.
Figure 8: Add Security Exception in Firefox
Configure Internet Explorer to Trust the AMC Certificate
1. In the IE browser, you will see the message, “There is a problem with this website’s security certificate.” Click Continue to this website.
Figure 9: Continue to this website in Internet Explorer
2. The navigation bar appears in pink with the message, Certificate Error.
Figure 10: Certificate Error in Internet Explorer
3. Click on the words, Certificate Error.
Figure 11: Click on “Certificate Error”
4. Click View Certificates.
Figure 12: Install the certificate in Internet Explorer
5. Click Install Certificate. Internet Explorer will now trust the AMC certificate.
Error message “/home/beehive/tmp/server.cert does not exist”
When you start the Aster Database queen, it prints this error if the AMC certificate is missing:
SSLCertificateFile: file '/home/beehive/tmp/server.cert' does not exist or is empty
This indicates the certificate was not found in the location specified in /home/beehive/apache/conf/conf.d/ssl.conf. Follow the instructions in “Install the AMC certificate at a new location” on page 27 to fix the problem.
The full text of the error message is:
root@<queen-machine>:/home/beehive/amc/webserver/webapps/ROOT# /home/beehive/toolchain/x86_64-unknown-linux-gnu/httpd-2.2.15/bin/httpd -f /home/beehive/apache/conf/httpd.conf -k restart
Syntax error on line 18 of /home/beehive/apache/conf/conf.d/ssl.conf:
SSLCertificateFile: file '/home/beehive/tmp/server.cert' does not exist or is empty
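A quick way to confirm this diagnosis is to check the file that ssl.conf actually names. The following sketch mirrors Apache's own test (the file must exist and be non-empty); it runs against a scratch config, so it is safe to try anywhere.

```shell
# Sketch: extract the SSLCertificateFile path from a config and apply
# Apache's "does not exist or is empty" check to it.
conf=$(mktemp)
printf 'SSLCertificateFile /home/beehive/tmp/server.cert\n' > "$conf"

cert=$(awk '/^SSLCertificateFile/ {print $2}' "$conf")
if [ -s "$cert" ]; then
  echo "ok: $cert"
else
  echo "does not exist or is empty: $cert"
fi
```

On the queen, point conf at /home/beehive/apache/conf/conf.d/ssl.conf instead.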
Troubleshooting: Certificate errors shown in browser
Your browser may display a certificate error when you try to connect to the AMC. This is
expected. See “Trust the AMC Certificate” on page 28 for instructions.
Troubleshooting: AMC login window does not appear
If the AMC refuses to load in your browser, do one of the following:
• If you previously used the pre-version-4.5 AMC, then clear the browser’s cache.
• If the AMC login window still fails to appear after you have cleared the browser’s cache, there may be a certificate problem. See the instructions in “Install the AMC certificate at a new location” on page 27.
Troubleshooting: AMC Add Node dialog box displays unexpectedly
The AMC requires that the browser be set to accept cookies from the queen machine. One symptom of blocked cookies is that the Add Node dialog box displays unexpectedly. This problem manifests as:
• When you go to the AMC Admin tab, part of the Add Node dialog box appears, even though you didn't click the Add Nodes button.
• The Admin tab never finishes loading, and the cursor hangs, indicating “busy,” for a long time.
If this happens, enable cookies from the queen machine in your browser.
Overview of the AMC
The AMC is a web-based interface that lets you manage, configure, and monitor Aster Database activity. It provides administrators with an authoritative view of the system and mechanisms for invoking administrative actions, and gives developers and other users insight into Aster Database activity, such as details on currently executing SQL statements and statement histories.
Figure 13: The AMC Dashboard
The AMC Dashboard
The AMC Dashboard is the main information center where you can view the condition of the
cluster and the jobs currently running on it. Many field labels in this window are clickable. By
clicking a label or message, you can usually see more details about the message or navigate to
the commands related to it.
Figure 14: Navigation and Status Messages in AMC (callouts: status lamp, cluster name, links to documentation and downloads, login details, status summary, message board)
Top of the Dashboard window
As shown in the image above, the top of the Dashboard consists of the following items.
Clockwise from the upper left, they are:
• Status Lamp: The status lamp lights green to show the cluster is running correctly. The legend next to the status lamp shows the name of the cluster and its status, and the current queen time, converted to browser-local time. The Aster Database statuses are discussed in more detail in “Aster Database Cluster Status” on page 38.
• Cluster Name: The name assigned to the cluster.
• Link to Docs and Downloads:
  • Resource Center: Click this link to open the Teradata Aster Resource Center, a web page where you can find documentation, videos, and downloadable client software for various operating systems.
  • Help Link: Click this link to open an HTML page containing information about the AMC page you are currently viewing.
• Login Details: In the top right of the window is the Teradata Aster logo. Directly below that is your current, logged-in AMC user account name. Your user account determines what actions you can perform in the AMC.
• Status Summary: In the upper right of the Dashboard tab is the status box. This box is a fixture not only of the Dashboard, but of all AMC windows. The status box notifies you of important events in Aster Database.
• Message Board: In the upper left of the Dashboard tab is the message board. Here, you and other Aster Database administrators can post messages to all AMC users. To add a message, click the pencil icon, type the message in the dialog box that appears, and click OK to post it. All AMC users on this cluster will see your message immediately on the message board in their AMC session.
Navigation Tabs in the Dashboard Window
Below the Status Icon are Navigation Tabs that provide access to various types of tasks that are
accessible through the AMC. Each tab provides details on a different aspect of Aster Database.
For information about how to use these tabs, see:
• Process Management
• Manage Data and Nodes
• The Aster Management Console (AMC)
The Processes Section of the Dashboard Window
Below the message board and information box is the Processes section of the Dashboard tab.
The Processes section shows an overview of the current and recent jobs in the cluster, as well
as statistics about queries and user activity. See “Processes Section of the Dashboard” on
page 35.
The Nodes Section of the Dashboard Window
At the bottom of the Dashboard tab, below the Processes section, is the Nodes section. The
Nodes section summarizes the operational status of the machines in your cluster, including the
quantity of data stored and the remaining free space in the cluster. See “Nodes Section of the
Dashboard” on page 36.
AMC Version Number and Aster Database Version Number
To find out the version of the AMC you are running, click on the About the AMC link at the
bottom of the AMC window.
Hint: To find out your Aster Database version number (release number) and build number
(revision number) from the command line, view the file /home/beehive/.build on the
queen.
Processes Section of the Dashboard
The Processes section of the dashboard shows an overview of the current and recent jobs in
the cluster, as well as statistics including the Most Active Users rankings and the Process
Execution Time graph. The Active Applications box shows currently installed applications that
run on the cluster. The Processes section corresponds to the Processes tab, and clicking most
labels in this section will take you to the Processes tab.
Figure 15: The Processes tab in AMC
Process Summary Box
The green summary box provides a quick overview of the queries running in the cluster. It lists the counts of the following states (click any label to show its details):
• Running: Count of currently running queries and processes.
• Pending: Count of queries queued for admission to the cluster.
• Active Sessions: Number of users and applications currently connected to Aster Database.
• Completed: Count of queries that finished running without error in the last 24 hours.
• Cancelled: Count of queries cancelled by an administrator or user in the last 24 hours.
• Error: Count of queries that failed and reported an error in the last 24 hours.
• Unknown: Count of queries that started in the last 24 hours, but whose status is now unknown.
• My Processes: Count of finished queries run by you (based on your AMC username) in the last 24 hours.
• SQL-MapReduce: Count of finished SQL-MapReduce queries that have run in the last 24 hours.
Query Statistics Summary
The Query Statistics Summary area in the Processes section provides an overview of the most active users and longest-running queries. It shows:
• My Last 5 Processes
• Top 5 Longest Processes
• Process Execution Time
• Top 5 Most Active Users
• Active Applications
Nodes Section of the Dashboard
The lower part of the Dashboard shows the Nodes overview. This section summarizes the
operational status of the machines in your cluster, including the quantity of data stored and
the remaining free space in the cluster.
Figure 16: Nodes overview in AMC
Nodes Summary Box
The green summary box lists the counts of nodes in your cluster and summarizes the status of
the nodes.
This section shows the following (click any label to show its details):
• Queen(s): Count of queen nodes in this cluster. The Active count is the number of active queen nodes in this cluster. This can only be 1 or zero. The Passive count is the number of passive or secondary queens in this cluster.
• Loader(s): Count of the loader nodes in the cluster.
• Worker Nodes: Count of worker machines in the cluster. Note this is the count of worker machines, not the count of virtual workers. Below this are listed the counts of Active, New, Suspect, and Failed nodes. See “Node States” on page 51 for more details.
Nodes Statistics Summary
The center panel of the Nodes section shows the current replication factor of Aster Database. If
the current replication factor is below your target replication factor (your Aster Database
administrator specified this when installing Aster Database), a warning appears at the top of
this section.
The Replication Factor section shows, first, the cluster-wide current replication factor. Below
that, it shows how many virtual workers are at RF=2 (these are workers that have a valid
backup worker stored in Aster Database) and how many are lacking a backup (RF=1).
Teradata Aster’s recommended setting is to maintain the cluster at RF=2. If you have some
tables which should not be replicated, create those as analytic tables.
The bottom of this section is the Hardware Statistics panel, showing current and recent CPU
usage, memory usage, network bandwidth usage, and disk I/O usage. Click the Nodes >
Hardware Stats tab for more hardware statistics.
Cluster-Wide Disk Capacity/Usage
The right side of the Nodes panel of the AMC Dashboard shows the Data Payload Panel. This
panel provides a cluster-wide view of the data capacity of your cluster and shows how much
disk space is currently being occupied by data and other system files. (Note that you can also
view disk capacity and available space for an individual node, as explained in “Per-Node Disk
Capacity and Current Usage” on page 53.)
This information can be used to quickly determine whether you have sufficient data storage capacity in Aster Database or should begin planning to add storage to the cluster.
The measures shown here include:
• Total Size of Active Data Stored shows the amount of data currently stored in Aster Database. Active data refers to the raw, uncompressed data size before it is stored on disk. The graph’s colors indicate the degree of compression applied to different portions of the data. Hover your mouse pointer over the graph to see the amounts of data stored at each compression level. The darker the color, the greater the degree of compression applied.
• Total Data Stored and Disk Capacity: Just below the Total Size of Active Data Stored field is the Total Data Stored and Disk Capacity graph and a breakdown of its contents. The horizontal bar graph represents your total available disk space in the cluster, and the colors represent used and unused portions of the disk space.
• The % Full icon provides a visual summary of the disk space remaining on your cluster. This graph turns orange to indicate that more than 70% of the cluster’s disk space has been used, and turns red to indicate that more than 90% has been used. If this graph is displayed in orange or red, you must take action by contacting Teradata Support.
Important! Always maintain at least 30% free disk space in Aster Database. This space is required for routine aggregation and sorting operations.
• User Data is shown in dark green. The amount shown here represents the amount of on-disk data in Aster Database. This is the size of data on disk, after (optionally) being compressed. To see the disk usage statistics for each node, click the Nodes Tab and then click the Node Overview tab.
• Data on Secondary is the amount of space occupied by the replica copies of your data.
• System represents the amount of disk space consumed by operating system files, Aster Database software files, and other files that do not contain your data.
• Available represents the amount of unused storage currently available in the cluster.
• Total Space shows the total amount of disk space in the cluster.
• Alert fields: If any nodes’ disks are full or nearly full, the AMC displays alerts just below the Total Space field. Click the alert text to display the Node Overview tab, where you can find the nodes that are out of space or nearly out of space. See “Disk Storage Utilization” on page 52 for details.
Aster Database Cluster Status
The Status Icon
The Status Icon at the top left of the AMC shows the current overall status of Aster Database.
The status icon will show one of five colors and a message describing the current status in
more detail.
• Green: Aster Database is operating normally and is able to accept new connections and process statement requests.
• Blue: Aster Database is operating normally and is able to accept new connections and process statement requests; however, a current administrative activity may result in a decrease in performance.
• Yellow: Aster Database is able to accept new connections; however, it is unable to process statement requests due to an administrative activity.
• Red: Aster Database is currently stopped and cannot accept new connections or process statement requests.
• White/Clear: The browser client is no longer able to establish a connection to Aster Database.
Cluster Status Descriptions
For more details on the current status, position your mouse pointer over the Status Icon. The
table below details the various Aster Database statuses that may be displayed.
Table 2 - 1: Cluster Status Descriptions

• Active: Aster Database is currently active and operating normally. It has a replication factor of at least 1 and is able to receive and execute statement requests.
• Activating: Aster Database is currently activating new nodes into the system. During this process, new nodes are brought into service and the data is redistributed across the workers. During this time, Aster Database is unable to execute statement requests, although clients can establish connections to it. For details on node activation, please refer to “Node Overview Tab” on page 50.
• Replicating: Aster Database is currently replicating online. This process involves restoring Aster Database’s replication factor, but does not bring the processing capabilities of new workers into use. For details on replication, please refer to “Node Overview Tab” on page 50.
• Data Imbalanced: The cluster currently has less than the required number of copies of your data (as specified by your replication factor; usually two copies are required), or at least one partition of data and its replica are currently residing on the same physical node. Generally, this occurs following a node failure or other administrative action. Either of these conditions is undesirable and should be resolved using the Balance Data command. See “Balance Data: The Procedure” on page 123 for details.
• Processing Imbalanced: The active vworkers are not evenly distributed on your cluster’s hardware, meaning that at least one worker node contains more active vworkers than it should. Generally, this occurs following a node failure or other administrative action. This condition is undesirable and should be resolved using the Balance Process command. See “Activate Aster Database” on page 120 for details.
  Warning! Before you invoke the Balance Process, be aware this process will cancel any running queries. Before doing this operation, verify there are no running queries and no queries about to enter the system.
• Restarting: Aster Database is currently restarting. During this time, it is unable to execute statement requests. After a short period of time, the status will change to Unavailable (see below). Please refer to “Restart Aster Database” on page 118.
• Backing Up: Aster Database is currently making a data backup to an Aster Database backup cluster. During this process, Aster Database continues to execute statement requests and other activities normally. However, there may be some performance overhead as a result of the backup activities. For details on backing up to an Aster Database backup cluster, see the Teradata Aster Big Analytics Appliance 3H Database User Guide.
• Restoring: Aster Database is currently restoring data from an Aster Database backup cluster. During this process, Aster Database is unable to execute statement requests, although clients can establish connections to it. For details on backing up to an Aster Database backup cluster, see the Teradata Aster Big Analytics Appliance 3H Database User Guide.
• Stopped: Aster Database is currently stopped. During this time, Aster Database is unable to execute statement requests, although clients may be able to establish connections to it. This status indicates that there is some issue preventing Aster Database from being able to operate normally. Such issues include having no active worker nodes in the system, having a replication factor below the required minimum, and the occurrence of a serious failure in Aster Database.
• Unavailable: Aster Database is currently unavailable. This means that the AMC browser client is unable to establish a connection to Aster Database. In most situations, this will be the result of a network issue (i.e. Aster Database is active, but the web browser cannot establish a connection to it). However, if there are no network problems, then it indicates that an AMC connection to the queen node could not be established, which may indicate a failure of the queen node (during a restart of Aster Database, the status will briefly be unavailable as the queen node is rebooted).
Allowed Administrative Actions
Whether or not you can perform an administrative action depends on:
• the rights that have been granted to you as an AMC user (see “Roles and Privileges” on page 113); and
• the current status of the cluster. The table below shows which actions can be done in which Aster Database states. In this table, an “X” indicates the operation is allowed.
Table 2 - 2: Allowed administrative actions
Cluster Status | Soft Restart Cluster | Hard Restart Cluster | Soft Restart Node | Add Nodes | Remove Nodes | Activate Cluster | Balance Data | Balance Processing | Upgrade Software
Unavailable
Active  X  X  X  X  X  X
Table 2 - 2: Allowed administrative actions (continued)
Cluster Status | Soft Restart Cluster | Hard Restart Cluster
Stopped  X  X
Activating  X  X
Restarting  X  X
Replicating  X  X
Backing Up  X  X
Restoring  X  X
Data Imbalanced  X
Processing Imbalanced  X
Soft Restart Node | Add Nodes | Remove Nodes | Activate Cluster
X  X  X  X  X  X  X  X  X  X  X  X  X  X  X  X
Balance Data | Balance Processing | Upgrade Software
X  X  X
CHAPTER 3
Process Management
Monitor and track the SQL statements running in Aster Database using the AMC Processes
tab. The filtering area at the top left is useful for showing and hiding different subsets of the
processes, so you can focus on just the processes of interest to you. The green summary box at
the top right shows counts of current and past statements, categorized by status.
Figure 17: The AMC Process tab
The Processes tab contains three sub-tabs:
• Processes shows a table with statistics and status for current and past commands. See “Processes Tab” on page 42.
• Query Timeline shows a graphical representation of commands run in the past 24 hours. See “Query Timeline” on page 46.
• Sessions shows user sessions with the AMC. See “Monitor Sessions” on page 46.
Processes Tab
By default, when you click the Processes tab, AMC displays the list of processes in the
Processes sub-tab. The list displays information about running processes or processes that
finished running on the Aster Database.
Each process is a SQL command or a block of SQL statements (BEGIN ... END). The
statements can contain SQL-MapReduce functions.
The Processes list is useful for monitoring activity on your cluster, checking on the progress of
queries you have submitted, and finding performance issues such as statements that take
much longer to run than others.
To display processes:
1. Navigate to Processes > Processes. Table 3 - 1 describes the type of information displayed for every process.
2. To filter the display of processes, use the Change Filter button, as described in “Filter the Process List” on page 44.
3. To display summary information about a process, move the mouse over the process ID.
4. To display detailed information about a connected process, click its ID. See “Process Information” on page 43 for more information.
Process Information
The following table describes the type of information displayed for every process in the
Processes List.
Table 3 - 1: The AMC Processes Tab Columns

ID: Unique identifying number for the process. The number is truncated in the list view, but you can see the complete number (and other details, such as the database on which the statement is acting) by hovering the mouse cursor over the process ID. Click it to display the process detail page, described in “Process Information” on page 43.

Statement: The command being executed by the process. Can be any sort of SQL statement. The statement is truncated in the list view, but you can see the complete statement by hovering the mouse cursor over the process ID number or clicking it to display the Process Detail tab.

User: Account that issued the request to run the statement.

Status: A color-coded icon is displayed to indicate the current state of the process:
• Cancelled: the Administrator or user cancelled the statement.
• Cancelling: the Administrator or user has requested that the statement be cancelled, but the statement is still running while Aster Database makes a best-effort attempt to cancel it.
• Completed: the statement ran successfully.
• Pending: the user submitted the statement, and it is in a queue on Aster Database waiting to be run. Or the statement is blocked pending the release of a system resource and may be potentially deadlocked on another concurrent statement. In the rare case that this happens, please look through other concurrently executing statements, or check your Quality of Service parameters to ensure that they are functioning properly.
• Running: the statement has started and is underway.
• Unknown: Aster Database is not providing a status at the moment.
• Error: the statement could not finish normally. To see more details about the error, consult the log files; see “Logs in Aster Database” on page 220.

Execution Time: How long the process ran.
Submit Time: Time when the user requested the statement to be run. At this time, the statement was queued up on the cluster, but the statement did not necessarily start running immediately at this time.

Completion Time: Time when the process finished.

Type: Either SQL for an ordinary SQL statement, or SQL-MR for a statement that includes an SQL-MapReduce function.

Workload Policy: The name of a set of rules governing how the process is handled when the cluster allocates resources. See the Workload Management chapter in the Teradata Aster Big Analytics Appliance 3H Database User Guide.

Priority: A number from 0 to 3 indicating how important the process is, where 3 is most important. Inherited from the workload policy. See the “Priority” section in the Workload Management chapter in the Teradata Aster Big Analytics Appliance 3H Database User Guide.

Session ID: Unique identifying number of the user’s AMC session. This is the command interface session where the user has logged in and issued the SQL command to the cluster. The value is truncated in the list view, but you can see the complete value by clicking the process ID number to display the process detail page.

Cancel: If it is possible to cancel the process, a Cancel icon is displayed in this column. Statements that are not cancellable are transaction-related SQL (e.g. COMMIT, ROLLBACK), CLOSE cursor, and COPY-in SQL.
Filter the Process List
To make the process display even more useful, you can hide or show different processes by
entering filter criteria.
1. Click the Change Filter button.
2. Enter your filter criteria. This example shows only the CREATE statements that were submitted on the retail_sales table in the last 24 hours and took more than 5 minutes to complete.
Figure 18: Creating a Process Filter in AMC
3. To make this filter the default filter, click Make Default.
4. Click Go to view the filtered results. The current process filter terms are displayed above the Change Filter button, and only the requested processes are displayed in the list.
Warning! When you restart or upgrade your cluster, the settings of the Process Filter in AMC are lost. After each Aster
Database restart, you must re-create your filters.
Disable Automatic Refresh of Processes
By default, AMC uses auto-polling to refresh the information about processes.
To stop the auto-polling of processes:
1. Click Show new processes on manual refresh.
Figure 19: Disabling Auto-Polling in AMC
2. When you are ready to update the process list, click the Refresh Now button.
Query Timeline
The Query Timeline tab (Processes > Query Timeline) shows a graphical representation of
commands run in the past 24 hours. By using this bar graph view, you can more quickly spot
commands that are out of the ordinary in terms of processing time.
Figure 20: The Query Timeline tab in AMC
Each bar represents one SQL command. The bars are color-coded using the same status colors
described in the Status column of Table 3 - 1.
To display processes in the Query Timeline:
1. Navigate to Processes > Query Timeline.
2. To filter the display of processes in the Query Timeline, use the Change Filter button, as described in “Filter the Process List” on page 44.
3. To display details about a process, move the mouse over it. A popup message appears with additional information.
Monitor Sessions
The Sessions tab (Processes > Sessions) shows a list of the connected or closed user sessions on this cluster. You can use this list to monitor user activity and help troubleshoot user issues.
Figure 21: The Session tab in AMC
The User Sessions list displays session information that includes the session ID, the host the
user is coming from, the login time, and the session duration.
To display user sessions:
1. Click Query Sessions (Processes > Sessions).
2. To sort the list, click a column heading.
3. To display summary information about a connected process, move the mouse over the process ID.
4. To display detailed information about a connected process, click its ID. See “Process Information” on page 43 for more information.
Monitor Process Details
When you click the ID of a process, AMC creates a new tab displaying detailed information
about the process.
Figure 22: The Process Details tab in AMC
A Process Detail tab for that process is displayed. In addition to the columns displayed in the
process list (see “Process Information” on page 43), this tab shows the following additional
information.
Table 3 - 2: Process information columns

ID: The full unique identifying number for the process.

Statement: The full SQL statement being executed by the process.

Status Detail: Additional information, if available, which expands on the one-word status available in the process list tab.

Database: The database on which the statement is acting.

Session ID: The full unique identifying number of the user’s AMC session. This is the command interface session where the user has logged in and issued the SQL command to the cluster.

Progress: A bar that shows what proportion of the statement’s execution has been completed so far.

Execution Plan: A series of operations that show how Aster Database actually performed (or is performing) the statement. An SQL statement is typically broken down into component parts which are executed separately to efficiently achieve the final result. By default, the execution plan display omits routine or trivial operations, but you can display the entire plan by clicking Show All Steps.
Cancel SQL Statements
Sometimes, you may need to cancel a running process on the cluster. For example, suppose a
user runs the query SELECT * from events. If the events table is large, the query could
easily take far too long to complete. Another operation that can be time-consuming is a
CREATE TABLE that inserts a large number of rows.
Some SQL statements are not cancellable. These are transaction-related SQL statements,
such as COMMIT, ROLLBACK, CLOSE cursor, and COPY-in SQL.
To cancel a running process, do one of the following:
• In the Processes tab, if a Cancel icon is displayed for a process, click the icon in the Cancel column (right-most column), then click OK when prompted.
• In the Process Details tab, you can cancel the statement by clicking the Cancel Process button.
Either action will place the process in Cancelling mode, which indicates that the cancellation
request has been received. Statement cancellation in Aster Database is an asynchronous,
best-effort operation. While executing a statement, the Aster Database back-end checks
periodically to see whether a cancellation request has been issued. If requested, the back-end
acknowledges the cancellation and triggers a best-effort service to cancel the ongoing
execution.
Monitor SQL-MapReduce Statements
You can monitor the execution of SQL-MapReduce functions in the AMC using the same
general procedures outlined in this chapter.
CHAPTER 4
Manage Data and Nodes
The Nodes tab in the AMC gives a system-wide overview of the amount of data stored in Aster
Database. In particular, it shows information about the extent to which data is replicated to
tolerate node failures, and how overall node storage is utilized by data in the cluster. It
provides interfaces through which administrators can manage data and replication in the
system.
The Nodes tab is also used to monitor the operation of Aster Database—its virtual workers,
worker nodes, and loader nodes. In the Nodes tab, administrators can view information on
each of the nodes participating in the Aster Database, configure those nodes, and retrieve logs
and other information for debugging purposes.
Figure 23: The Nodes tab in AMC
Node Overview Tab
The Node Overview tab displays status and health information about worker and loader nodes.
The Node List
The Node List in the Nodes Panel contains a list of all nodes that have been registered with Aster
Database. This includes nodes that are active participants in the system, as well as nodes that
are not currently participating in Aster Database (for example, nodes that have failed and
nodes that have yet to be powered on).
Each node is listed with an icon that depicts the type of node (see below for node types), along
with a color representing its status. Nodes are identified by the IP address that has been
assigned by the system. The nodes in the Node List can be filtered based on node type, using the
Node Type drop-down menu above the list.
Node Types
There are three types of nodes in Aster Database: queen nodes, worker nodes, and loader
nodes.
Queen Nodes
The queen node (or coordinator) is the central node in the system and is represented in the
AMC by an icon with a ‘Q’:
This is the node on which the Aster Database management software (including the AMC) is
installed. It is the node responsible for all management of Aster Database, from node
management to statement execution management.
Worker Nodes
Worker nodes are the workhorses of Aster Database – they are the nodes where data resides
and where query processing occurs. They are represented in the AMC by an icon with a ‘W’:
In general, worker nodes represent the largest group of nodes in an Aster Database installation
and are the focal point of management and administration.
Loader Nodes
Loader nodes are optional nodes that can be added to Aster Database in order to increase the
load throughput of the system (the rate at which data can be loaded into the system). They are
represented in the AMC by an icon with an ‘L’:
By default, data is loaded into Aster Database through the queen node. However, if additional
throughput is required, dedicated loader nodes can be added to the system. These nodes can
also be used for bulk exporting of data from the system.
Please contact Teradata if you need additional loading capacity, so that we may help you plan
and configure your Aster Database optimally.
Node States
The node state or status indicates the operational health or condition of a physical node in
Aster Database. In the AMC’s Nodes > Node Overview tab, each node in the node list has a color
indicating its status. The possible node statuses are:
New
When a node is first added to Aster Database, or registered, it is considered to be a New node.
At this point, Aster Database is aware of the node’s existence, but the node has not yet
contacted the queen in order to be prepared, or loaded with the Aster Database software.
Nodes are also shown as New immediately following a restart of Aster Database, before their
state can be determined.
Preparing
After the node contacts the queen to be prepared, its status changes to Preparing. While in this
status, it is loading the Aster Database software and preparing itself to become a participant in
Aster Database.
Prepared
Once the node completes preparation, its status becomes Prepared. At this point, the node is
ready to be incorporated into Aster Database so that it can host vworkers.
Active
Active and Passive are the acceptable states for nodes in a running cluster. Active nodes are
nodes that are available immediately to process queries in Aster Database.
Passive
Active and Passive are the acceptable states for nodes in a running cluster. A Passive node is a
standby that holds frequently updated copies of vworkers’ data and later can be made Active to
take on query processing work as needed.
Suspect
Suspect nodes are nodes that have exhibited unusual behavior and are participating in the
Aster Database in a limited capacity while being investigated for potential failures by the
queen.
Failed
Failed nodes are nodes that are no longer participating in the Aster Database.
Disk Storage Utilization
In the AMC, there are two levels at which you can check your cluster’s data capacity and the
amount of disk space currently in use. You can check:
• Cluster-Wide Disk Capacity and Current Usage
• Per-Node Disk Capacity and Current Usage
Cluster-Wide Disk Capacity and Current Usage
The right side of the Nodes section of the AMC Dashboard tab contains the Data Payload Panel
showing a summary of your disk storage utilization in the cluster as a whole. For an
explanation of this panel, see “Cluster-Wide Disk Capacity/Usage” on page 37.
Per-Node Disk Capacity and Current Usage
To see detailed descriptions of how and to what extent the disks are being used on individual
nodes, click the Nodes tab and click the Node Overview tab.
Disk usage details appear in these columns:
• Uncompressed Active Data Size: This column shows the amount of data currently stored on the node. The term “active data” refers to the raw, uncompressed data size before it is stored on disk.
• Storage (GB): This column shows a graph of the current usage of the node’s disk, by type of data stored (user data, data on secondary, and free space), and lists the amount of disk space currently occupied by user data and data on secondary, expressed in GB. This shows the actual on-disk space that is used and free on the node. Hover your mouse cursor over the graph to see these statistics for the node:
  • User Data is the amount of space occupied by primary copies of your data on the node.
  • Data on Secondary is the amount of space occupied by the replica copies of your data on the node.
  • System represents the amount of the node’s disk space consumed by operating system files, Aster Database software files, and other files that do not contain your Aster Database-stored data.
  • Available represents the amount of unused storage currently available on the node.
  • Total Space shows the total amount of disk space on the node.
• % Full: This column indicates how much space has been used on this node. The graph turns orange to indicate that more than 70% of the node disk space has been used, and it turns red to indicate that more than 90% has been used. If this graph is displayed in orange or red, you must take action to free up disk space by calling Teradata Support.
  Note that the percentage full displayed in the AMC is computed differently from the percentage full displayed by the UNIX command df. The AMC disk full computation does the following:
  a Gets the free and total space in bytes via statvfs.
  b Computes the used bytes as total bytes - free bytes.
  c Computes the percentage as used bytes / total bytes.
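The three steps above can be sketched in Python with the standard library’s os.statvfs (a hedged illustration of the described arithmetic, not the AMC’s actual code):

```python
import os

def amc_percent_full(path="/"):
    """Sketch of the AMC disk-full computation described above:
    statvfs free/total, used = total - free, percent = used / total."""
    st = os.statvfs(path)                     # step (a): free and total space
    total_bytes = st.f_blocks * st.f_frsize
    free_bytes = st.f_bfree * st.f_frsize
    used_bytes = total_bytes - free_bytes     # step (b)
    return 100.0 * used_bytes / total_bytes   # step (c)
```

By contrast, df computes its Use% against the blocks available to unprivileged users (f_bavail, which excludes the filesystem’s reserved blocks), which is one reason its figure can differ from the AMC’s.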
Important! Always maintain at least 30% free disk space in Aster Database. This space is required for routine
aggregation and sorting operations.
Hover your mouse cursor over any cell in these columns for more information.
Figure 24: Hover for more information on Disk Capacity and Usage
Monitor Hardware
The Hardware Stats tab contains information on hardware in the cluster.
Figure 25: The Hardware Stats tab in AMC
Node Failures
The queen node in Aster Database actively monitors all nodes participating in the system. If it
observes a node behaving in an unexpected or inappropriate manner, it will consider that
node to be suspicious and change its status to Suspect, and the node will appear yellow in the
AMC.
A Suspect node status does not necessarily imply that the node has experienced a failure, only
that the queen is examining it in order to determine whether one has occurred. If the node
continues to demonstrate suspicious behavior while in Suspect status, the queen will consider
it to be Failed and change its status accordingly.
For instructions on addressing failed and suspect nodes, see “Address failed and suspect
nodes” on page 58.
What do I do with suspect nodes?
It is important to note that nodes that have been marked Suspect still participate in Aster
Database. They continue to store data and are active participants in statement execution. A
Suspect node is a node on which one or more of the vworker databases of the node reported an
error (disk errors are a frequent cause), and in response, the queen removed that vworker or
vworkers from active status. The other, error-free vworkers on the node remain up and
running in active status.
The presence of a Suspect node does not necessarily imply a decrease in performance, but it
typically means the cluster has fallen from RF=2 to RF=1, meaning that one or more vworkers
may not have a backup vworker. While a node is in Suspect state, the queen monitors the
node’s behavior and only considers it to be Failed if it continues to demonstrate such behavior.
If the behavior that was originally observed was a one-time event (e.g. a transient network
error between the queen and the node), the node will remain an active participant while being
considered Suspect.
In Aster Database, the queen will not automatically transition a node from Suspect to Active.
Instead, a node will be returned to Active status on the next activation or load balancing
activity.
If the system continues to operate for a reasonable length of time after the node was originally
marked as Suspect, Teradata Aster recommends that the node be returned to Active status by
clicking the Balance Data button in the AMC. Allowing a node that is performing normally
(e.g. one that has continued to operate for at least 24 hours without transitioning to Failed) to
remain in a Suspect status for a lengthy period of time increases the chance that the node will
eventually be considered Failed, triggered by an event such as an unrelated transient error.
Recover from failures
When the queen considers a node to be Failed, it will attempt to reboot that node. Aster
Database is designed to enable recovery from many different types of errors through reboot
operations.
When a node reboots, it may pass through the states of New to Preparing to Upgrading to
Prepared (see “Add New Nodes to the Cluster” on page 71). This is normal. After rebooting the
node, the queen will perform a number of checks on the node during the preparation phase. If
it successfully passes those checks, the node will eventually be returned to the Prepared status
and can be subsequently activated back into Aster Database.
If the checks fail, the node will transition from the Preparing status to a Failed status. The queen
may attempt an additional reboot, after which it will permanently consider the node to be
Failed if no progress is made. If a node is permanently considered Failed, it should be physically
removed from the Aster Database for investigation, as this is likely an indication of a hardware
failure (e.g. permanent CPU failure). For more information on hardware problems, see
“Hardware and Networking Problems on Workers” on page 718.
View Hardware Configuration
The Hardware Configuration tab shows information about the current hardware configuration.
Figure 26: The Hardware Configuration tab in AMC
View Partition Map
The Partition Map tab shows a graphical representation of the cluster with details for each node.
Figure 27: The Partition Map tab in AMC
Replication Factor
The replication factor (“RF”) is the number of copies of your data that are stored in Aster
Database to provide tolerance against failures. Maintaining an RF of two ensures Aster
Database is resilient to node and queen failures. While you can run Aster Database at an RF of
one, Teradata Aster strongly recommends that you run with an RF of two. If you have some
tables that should not be replicated, create those as analytic tables. During operation of the
cluster, hardware failures can cause the RF to fall below two, at which point you must take
action to restore the RF.
Figure 28: Replication Factor
Replication factor (“RF”) indicates the number of full copies of data in the cluster:
• RF=2: When Aster Database has two copies (an original and a copy) of every piece of data, we say that the cluster has a current RF of two. With an RF of two, Aster Database is able to tolerate the removal or failure of any single node while remaining available and ensuring data safety. With an RF of two, there is also a replica of the queen’s data, which ensures that you can perform queen replacement (“Queen Replacement” on page 82) if the queen fails.
• RF=1: When the AMC reports an RF of one, it means that at least one node has lost its replica. This occurs when a node fails or is removed. An RF of one means that one full copy of data remains in the cluster. With an RF of one, Aster Database remains available for querying and loading, but the loss of another node might result in data loss. When RF falls to one, you must restore it to two as soon as possible. This is described in “Restore the Replication Factor” on page 58.
Check the Current Replication Factor
To check the current RF:
1 Open the AMC in your browser.
2 Click the Nodes tab.
3 In the upper right corner, look in the Replication Factor field in the green status box.
Figure 29: The Replication Factor field
4 Click the Partition Map tab for details.
In the upper right corner of the Partition Map tab, the Replication Factor information box shows
the current RF and lists the number of virtual workers running at RF=2, and the number
running at RF=1. Inspect the partition map to find the failed and suspect nodes.
Figure 30: Replication Factor in the Partition Map tab
Restore the Replication Factor
Address failed and suspect nodes
In some cases, as described in the previous section, the replication factor (“RF”) in Aster
Database may fall to one, instead of the recommended RF of 2. If this happens, you should
restore the RF. Follow the steps below, in the order shown:
1 Check the node status in the AMC. Click the Nodes > Node Overview tab, find the node, and check its Status:
  a If the node is marked Suspect, check for and fix hardware problems as explained in “Address Hardware Problems on Workers” on page 719. If you wish to attempt to restore the RF now, without using the suspect node, proceed to Step 2. If no hardware problems are found, proceed to Step 3.
  b If the node is marked Failed, note that a node may temporarily display as Failed during the course of a regular soft or hard reboot. In that case, the cluster may just need time to come up completely. Do not attempt to reboot the node or take any other actions until you are certain the cluster has come up completely and that the node is still displaying as Failed.
  If a worker node goes into a Failed state after being added to the cluster:
  • Connect to the worker node and look at /var/log/installer.log to make sure the installation on the node succeeded.
  • If the node is going through a reboot cycle, wait until the node completely boots up, and then check the log file.
  • If there is no error message in the installation log, the state of the node should change from Failed to Preparing to Upgrading to Prepared after the node has gone through a reboot cycle.
  • If the node remains Failed, check for and fix hardware problems as explained in “Address Hardware Problems on Workers” on page 719. If you wish to attempt to restore the RF now, without using the failed node, proceed to Step 2. If no hardware problems are found, perform a node restart (type /etc/init.d/local restart) on the node and wait for it to show up as Prepared in the AMC. When a node reboots, it may pass through the states of New to Preparing to Upgrading to Prepared. This is normal. Once the node is Prepared, proceed to Step 3.
2 If hardware problems are found, you should fix them as soon as possible, but in the meantime you might be able to restore the RF to 2 using the existing set of nodes. To do this, you will perform Step 3 below, but first you must check whether the cluster has enough space to replicate the data that was stored on the failed node. Click the Nodes > Node Overview tab and, for each node, click its Data Stored graph to check its remaining free space. Make sure there is more free space than the total amount of data that was on the failed node. If there is enough space, proceed to Step 3. Otherwise, you must replace the failed node hardware before you can restore the RF to 2. (See “Hardware and Networking Problems on Workers” on page 718.)
3 In the AMC, click the Admin tab and click Balance Data. This balances the storage and brings the RF to 2. This is an online operation and does not interrupt currently running queries or other transactions. When this completes, the cluster will be in one of the following states:
  • Active state indicates all nodes are working. You have successfully restored the RF.
  • Imbalanced state indicates processing has not yet been balanced. Proceed to Step 4.
Warning! Before you invoke the Balance Process, be aware that this process will cancel any running queries. Before doing this operation, verify there are no running queries and no queries about to enter the system.
4 In the AMC, click the Admin tab and click the Balance Process button.
A dialog box appears with the following message:
“Are you sure you want to initiate a Balance Process operation? Doing this will cancel any running queries. Verify that there are no running queries on the system before doing this operation.” Click OK after you have verified there are no running queries and no new queries about to enter the system.
This balances the processing. This is a blocking operation (any running queries will be cancelled) and will take a few minutes to complete. At the end of this operation the cluster’s status changes to Active, indicating you have successfully restored the RF.
Note: When you click Balance Data in the AMC, the system balances the storage (the number of logical workers) across all worker nodes. This syncs each vworker with its replica. At the end of the balance data operation:
• If the number of active logical workers is balanced (that is, if the cluster is processing-balanced), you do not need to click the Balance Process button. The cluster status becomes Active. This happens if all the suspect vworkers happen to be passive vworkers.
• If the number of active logical workers is not balanced (that is, if the cluster is processing-imbalanced), the AMC shows a cluster status of Imbalanced and you must click the Balance Process button.
Change the Replication Factor
You may change the replication factor on your Aster Database cluster to one of these
supported values:
• RF=1 does not keep a secondary copy of your data.
• RF=2 keeps a secondary (or replica) copy of the data for all nodes. This enables automatic worker failover and helps prepare for a queen replacement, should it become necessary.
Changing the replication factor from RF=1 to RF=2 will create replica copies of the data on all nodes. Before increasing the RF, verify that there is enough available disk space to hold the new copy of the data. You will notice a slowdown in cluster operations while replication is being done.
Procedure
1 Open a console window and SSH or log into the console of the queen as the root user.
2 In a text editor, open the file /home/beehive/config/goalReplicationFactor.
3 The file contains a “1”. Change this to a “2”, and save the file.
4 Perform a soft restart on the queen:
# ncli system softrestart
5 Point your browser to the AMC on the queen, go to the Admin > Cluster Management tab, and click Activate Cluster.
Note: You can also reduce the RF from 2 to 1 using the procedure shown above, but Teradata Aster does not recommend doing this, because reducing the RF to 1 has the effect of deleting the backup copies of your data. If your cluster is running at RF=1, you will not be able to perform a Queen Replacement procedure.
Inspect Individual Nodes
To see more details about any specific node, navigate to the AMC Nodes > Node Overview tab
and, in the Node Name column, click the node’s name to display the Node Detail screen. This
page displays in a new tab identified by the node’s IP address.
The Node Data tab provides information about the virtual workers.
Figure 31: The Node Data tab in AMC
The Node Hardware Stats tab provides information about CPU, memory, network and disk
usage.
Figure 32: The Node Hardware Stats tab in AMC
Read Aster Database Logs in the AMC
The AMC provides administrators with easy access to both Aster Database-wide system logs
and system logs for individual nodes, including each node’s preparation log, system log and
kernel log. Logs can be retrieved via the Individual Node Inspection tab by following these
steps:
1 Navigate to the Nodes panel in the AMC.
2 In the Node Overview tab, click the name or IP address of the node whose logs you wish to view, or click the queen’s entry if you wish to view system logs.
3 An Individual Node Inspection tab appears as the right-most tab. In its upper right corner are the Logs links.
4 Click the desired Logs link, which is one of:
  • preparation log (the log of events related to the process of preparing a node for participation in Aster Database);
  • system log (the contents of the Linux syslog file /var/log/messages); and
  • kernel log (the contents of the Linux kernel buffer provided through dmesg).
5 The log appears, showing the latest 1000 lines. Click Refresh at any time to load the latest 1000 lines.
In the log window, you can search the log by typing a search term in the Enter terms field and
clicking Search.
Create Log Bundles for Support Inquiries
Rather than reading log files one by one as described in the previous section, you can request
the AMC to create a compressed file containing multiple logs. This is useful when you want to
send logs back to Teradata Global Technical Support (GTS) for troubleshooting. For more
information, see “Logs in Aster Database” on page 220.
Aster Database Log Format
The log format is as follows (fields shown in italics are optional fields; some Aster Database
components provide these fields and others do not):
timestamp severity-code PID source-filename:line-number event-id RCID ] message
• timestamp: the time the log entry was created, formatted according to the ISO-8601 standard, yyyy-mm-ddTHH:MM:SS.uuuuuu. Per the standard, a “T” time designator introduces the clock-time portion of the timestamp. The “uuuuuu” in the description above indicates the microseconds portion of the time. For example, a timestamp might look like: 2010-03-23T11:26:13.185081.
• severity-code: a four- or five-letter code indicating the importance of the event being logged. The codes are:
  INFO: Informational message; conveys useful information about regular, steady-state operation.
  WARN: Indicates unexpected behavior; should be investigated. The system continues to operate normally, but you may be suffering degraded performance.
  ERROR: Some operational error occurred. The operation will abort and the error will be user-visible.
  FATAL: A non-recoverable error happened. The component will abort.
• PID: Integer code that identifies the process that generated the event being logged.
• source-filename:line-number: the name of the executable source file that produced the log entry, followed by a colon and the line number of the application code line that produced the log entry. For example, StatServer.cpp:105.
• event-id: Optional. The event-id is present only if the event-producing component is one that uses event-ids. The event-id is used by the Aster Database Log Server in conjunction with the Aster Database Alerting Framework (Blackbird) to trigger alerts for Aster Database administrators. The event-id has the format XXnnnn, where XX is a two-letter code that identifies the Aster Database component (such as “BA” for Aster Database Backup), and nnnn is a four-digit, component-defined event type identifier. For example: BA0012.
• RCID: Optional. The request context associated with this log. If there is no request context, this field is omitted. For example, 3563712506369035985.
• The right square bracket (]) marks the start of the message portion of the log entry.
• message: an arbitrary-length message, in text format. For example, Disk full detected.
Log format example
Below, we show a sample log entry. This example is a single line, but, depending on the format
in which you’re reading this document, it may appear here split across multiple lines:
2010-03-23T11:26:13.185081 INFO 30459 StatServer.cpp:105 ST0012
3563712506369035985] Disk full detected
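As an illustration, the sample entry above can be parsed with a short Python sketch. The regular expression below is our own reading of the documented layout (with event-id and RCID treated as optional fields), not an official Aster parser:

```python
import re
from datetime import datetime

# Layout: timestamp severity PID file:line [event-id] [RCID] ] message
LOG_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+"
    r"(?P<severity>INFO|WARN|ERROR|FATAL)\s+"
    r"(?P<pid>\d+)\s+"
    r"(?P<source>[^\s:]+:\d+)\s+"
    r"(?:(?P<event_id>[A-Z]{2}\d{4})\s+)?"   # optional event-id, e.g. ST0012
    r"(?:(?P<rcid>\d+)\s*)?"                 # optional request context id
    r"\]\s*(?P<message>.*)$"
)

line = ("2010-03-23T11:26:13.185081 INFO 30459 StatServer.cpp:105 "
        "ST0012 3563712506369035985] Disk full detected")
m = LOG_RE.match(line)
# The timestamp field parses with the documented ISO-8601 layout:
when = datetime.strptime(m.group("timestamp"), "%Y-%m-%dT%H:%M:%S.%f")
```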
Set Aster Database to log more or fewer log entries
By default, Aster Database logs messages of all severities. If you wish to improve Aster
Database’s performance by having it log only the more severe events, contact Teradata Global
Technical Support (GTS) and have them configure your cluster to skip the logging of INFO
events (and optionally other lower-importance events). Support sets this via the
minLogSeverity flag, where a value of 4 shows all events, 5 shows WARN and above, 6 shows
ERROR and above, and 7 shows only FATAL events. Values lower than 4 are reserved for
future use.
Some log messages cannot be disabled by the minLogSeverity setting. To disable all INFO
messages, set minLogSeverity=5 and logInfoMaxVerbosity=-1. It is not currently
possible to disable all WARN, ERROR, or FATAL messages.
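The documented minLogSeverity values can be modeled with a small Python sketch (this only illustrates the flag semantics listed above; the flag itself is set by Teradata Support, not by code like this):

```python
SEVERITY_ORDER = ["INFO", "WARN", "ERROR", "FATAL"]

# Documented flag values: 4 shows all events, 5 shows WARN and above,
# 6 shows ERROR and above, 7 shows only FATAL; values below 4 are reserved.
MIN_SEVERITY_FOR_FLAG = {4: "INFO", 5: "WARN", 6: "ERROR", 7: "FATAL"}

def is_logged(event_severity, min_log_severity=4):
    """Return True if an event of the given severity would be logged."""
    threshold = MIN_SEVERITY_FOR_FLAG[min_log_severity]
    return (SEVERITY_ORDER.index(event_severity)
            >= SEVERITY_ORDER.index(threshold))
```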
Verbose logging for debugging
The logging system can be set to provide verbose logging to help Teradata Global Technical
Support (GTS) investigate problems on your cluster. Verbose logging is turned off by default.
You must contact Teradata Global Technical Support (GTS) to have it turned on. When
activated, it can be set to show the amount of verbosity that is needed (level 0, 1, or 2, with
level 0 producing the least logging text and level 2 producing the most). Support sets this via
the maxLogVerbosity flag.
Components not using this log format
As of version 4.5.1, the following Aster Database components do not use the standard Aster
Database logging format:
• ODBC, JDBC and OLEDB drivers
• Aster Database Loader
• AMC
Detect and Manage Skew
Data and processing skew is one of the biggest performance killers in an MPP environment.
Data skew is caused when your table’s distribution key column contains data with an uneven
distribution. Processing skew may be caused by data skew when joining tables, by an
imbalance of values in the distribution key, or by heterogeneous, slow, or malfunctioning
hardware or software.
65
Teradata Aster Big Analytics Appliance Database Administrator Guide
Manage Data and Nodes
Detect and Manage Skew
Use the SQL-MR function nc_relationstats to generate various reports for on-disk table size
and statistics for one or more tables. (See “nc_relationstats” on page 200 for more details.)
Table Skew (Data Skew)
First, validate the distribution of the distribution key for a partitioned table. In this example
we are going to query the table MyTable which has a distribution key on the userid column.
SELECT
userid,
COUNT(*) AS usercount
FROM mytable
GROUP BY 1
ORDER BY 2 DESC LIMIT 10;
Note that the userid at the top of the list has over 50X the number of rows of the next
nearest userid. This will definitely cause processing skew when joining to any other table via
the userid column, even if the matching rows are on the same worker.
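The imbalance the query reveals can be quantified by comparing the heaviest key against a typical one. A small sketch with invented counts (mirroring the 50X figure from the example):

```python
def skew_ratio(key_counts):
    """Ratio of the heaviest key's row count to the median key's count."""
    values = sorted(key_counts.values(), reverse=True)
    median = values[len(values) // 2]
    return values[0] / median

# hypothetical output of the GROUP BY query above: userid -> row count
key_counts = {7: 510_000, 12: 9_800, 33: 9_500, 54: 9_100, 81: 8_700}
print(f"top key holds {skew_ratio(key_counts):.0f}x the median")  # -> 54x
```

A ratio in the tens, as here, is a strong signal that joins on this column will serialize on one vworker.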
Possible causes of this condition:
• The application that populated the table inserted a DEFAULT value in the distribution key
column of too many records.
• Errors occurred in the ELT/ETL processing when you loaded the table.
How do we fix this?
First, is there possibly another column that would work as a distribution key? The userid
column was likely selected as it is the distribution key of other tables that this table is joined
with. So, we may not be able to choose another column. The next thought is to see what causes
this particular value to be inserted by the ETL/ELT process. Look for processing or logic errors
that will make this particular userid so prominent. If the logic is correct, then an alternative is
to apply a RANK value and make it a negative value so that rows with this userid can be easily
excluded from reporting logic but still available for JOIN operations and no data is lost.
For example:
If the current process does an INSERT/SELECT operation from a staging table where the
userid may be NULL…
INSERT
INTO MYTABLE (userid, ... other columns ... )
SELECT COALESCE(userid, 3089263269635597179)
...
FROM mytable_staging;
As you will note, this converts every NULL value in the userid column into the same value,
thus causing skew.
There are several algorithms that will work to create unique userid values. One would be the
use of a SEQUENCE. Another would be to use the RANK function with a negative multiplier,
for example:
Step 1:
INSERT
INTO MYTABLE (userid, ... other columns ... )
SELECT UserId,
...
FROM mytable_staging
WHERE userid IS NOT NULL;
Step 2:
INSERT
INTO MYTABLE (userid, ... other columns ... )
SELECT RANK() OVER (ORDER BY <another column>) * -1 AS userid
...
FROM mytable_staging
WHERE userid IS NULL;
There are other variations such as maintaining a physical table with the lowest value and doing
a cross join to get the starting point. It will require that the ETL be broken into multiple steps
and that related data be assigned the same userid, but the effort in the front end (ETL/ELT)
will be more than worthwhile when it comes to relieving skew during reporting (and ELT).
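The two-step load above can be mirrored in plain code to make the invariant explicit: rows with a real userid pass through unchanged, while NULL rows receive unique negative surrogates. This is a sketch under the same assumptions; the field names are invented for illustration:

```python
def assign_surrogates(staging_rows):
    """Pass through real userids; give NULL userids unique negative ranks."""
    loaded = [r for r in staging_rows if r["userid"] is not None]
    null_rows = [r for r in staging_rows if r["userid"] is None]
    # mimic RANK() OVER (ORDER BY <another column>) * -1
    for rank, row in enumerate(sorted(null_rows, key=lambda r: r["order_col"]), 1):
        loaded.append({**row, "userid": -rank})
    return loaded

rows = [
    {"userid": 42, "order_col": 1},
    {"userid": None, "order_col": 2},
    {"userid": None, "order_col": 3},
]
print(sorted(r["userid"] for r in assign_surrogates(rows)))  # -> [-2, -1, 42]
```

No rows are lost, the negative ids remain joinable, and reporting logic can exclude them with a simple userid > 0 filter.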
Partition Level Skew
Check vworker (partition) size:
To check the size of partitions and their distribution, run the nc_skew function from the Admin
> Executables tab of the AMC. See “nc_skew” on page 209. If any vworker is substantially larger
than the others, please contact Teradata Global Technical Support (GTS) to help identify the
table that is taking up the space and reduce its size.
Check the table size
Aborted data load operations can consume disk space on workers. To get the on-disk sizes of
tables, use the function “nc_relationstats” on page 200. The SQL-MR function
nc_relationstats allows you to generate various reports for on-disk table size and statistics for
one or more tables.
If a table appears to be larger than its row count would warrant, please contact Teradata
Global Technical Support (GTS) for help.
Process Skew
Information about worker node-level processing skew can be obtained using Ganglia. To find
out whether a particular user query is experiencing processing skew, follow the steps below:
• Given the user name, use system tables to find the start time of the transaction.
• Run ps -ef | grep postgres on the queen node to find the start time of the local
database session.
• Issue ps -ef | grep <username> | grep <starttime> | grep -v idle across all
workers using ClusterSSH. This tells us on which partitions and on which worker nodes
the query is still running.
• If only one partition is still running, then you might be suffering from processing skew. Go
to that node and monitor this Postgres process to see if it is heavy on CPU, I/O, or
memory.
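The final check — one partition still running long after the rest — amounts to a straggler test. The helper below is hypothetical (not an Aster tool); it assumes you have already collected a per-worker runtime, in seconds, for the query's postgres processes:

```python
def find_stragglers(runtimes, factor=3.0):
    """Return workers whose runtime exceeds `factor` times the median."""
    times = sorted(runtimes.values())
    median = times[len(times) // 2]
    return [w for w, t in runtimes.items() if t > factor * median]

# invented runtimes: worker-6 is still grinding while the others finished
runtimes = {"worker-3": 40, "worker-4": 35, "worker-5": 38, "worker-6": 610}
print(find_stragglers(runtimes))  # -> ['worker-6']
```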
CHAPTER 5
Cluster Expansion
Contents of this section:
• Convert the Secondary Queen to a Loader (page 68)
• Add New Nodes to the Cluster
• Activate Aster Database
• Incorporate the New Nodes
• Balance Process
• Split Partitions
Convert the Secondary Queen to a Loader
If desired, you may convert the secondary queen node into a loader node, so that it can be
used to load data into the cluster until it is needed as a secondary queen.
Procedure
The commands in this procedure use the node aster-13 as the secondary queen that will be
converted to a loader. Substitute the name and number of the node in your cluster that you
wish to convert.
1 Log in as root to the secondary queen node that you wish to convert to a loader.
2 Issue a local status command. A status of “unknown” is returned, because the node is not
an official member of the cluster:
# /etc/init.d/local status
* status: unknown
3 Perform a local stop. This command should fail, because the node is not a member of the
cluster:
# /etc/init.d/local stop
Shutting down nCluster Services:                                  failed
local.stop[13497]: Failed to stop Cluster Services Launcher
4 Create a directory structure to hold the Aster installer binary. This should be the same as
the directory structure on your primary queen. Create this directory if it does not exist, by
issuing a command like the following, substituting your installer directory path from the
primary queen:
# mkdir /var/opt/teradata/packages/aster3
5 Change directories to the installer directory you created:
# cd /var/opt/teradata/packages/aster3
6 Obtain the Aster installer binary, and place it in the directory you created. If you don't
have the installer on the node already, you can copy it from the primary queen node using
SFTP or scp -r. The name of the package might be AsterInstaller_<release>, or, if it was
recently updated from the TSS patch server, something like teradata-asterdb<release>.
7 Ensure that the installer version is the same version of Aster as is running in the cluster.
You can check the version that is running by:
• Going to the Admin: Cluster Management screen in the AMC and checking the Installed
Version displayed for each node, or
• Issuing ncli node showsummaryconfig on one of the working nodes in the cluster
to display the Aster version.
8 If necessary, change the file permissions to make the installer executable:
# chmod +x AsterInstaller_5-0-1_r29677.bin
9 Run the Aster installer binary with the -x option, which extracts the installation packages
without installing:
# ./AsterInstaller_5-0-1_r29677.bin -x
10 You should see the installer begin extracting the files:
Aster nCluster Installer
Copyright (c) Aster Data Systems, Inc. All Rights Reserved.
Extracting contents...
./acme-sw.tar.gz
./backup-sw.tar.gz
./centos_repos.tar.gz
./clients-linux32.tar.gz
...
aster-launch-appnexus-instance.sh
AsterCreateEC2Cluster.py
DeployNCluster.py
NClusterDataController.py
aster-launch-ec2-instance.py
osupdates_centos_6.0
decompress
osupdates_centos_5.6
osupdates_rhel_6.0
setup_install_env
AsterPreInstaller.py
version
ncluster.manifest
md5sum: unrecognized option '--quiet'
Try `md5sum --help' for more information.
Package checksum mismatch - Installation terminated
Removing temporary files...
Packages successfully extracted in /tmp/install
11 Find the file NClusterDataController.py under the directory /tmp/install.
12 After ensuring that you are working on the node that is to be converted to a loader, run the
NClusterDataController.py script with the --uninstall option, which will clean any existing
Aster software and data from the node:
# ./NClusterDataController.py --uninstall
Starting nCluster uninstall on this node..
Tearing down networking...
Killing nCluster processes...
Unmounting iointerceptor/mountdir...
Removing nCluster Services...
Removing nCluster files...
Deleting nCluster user accounts...
Verifying uninstallation...
Uninstall successfully completed on this node
13 Reboot the node.
14 Use the env command to make sure no Aster environment variables are set:
# cd /usr/bin
# ./env | egrep -i '(beehive)|(node)|(aster)'
If there are Aster environment variables set, this command will return a value similar to:
NODE_UID=a720385e4da0712fad25bed71c4bb764
You should remove any Aster variables, if found:
# unset NODE_UID
15 Log into the AMC.
16 Navigate to Admin: Cluster Management.
17 Choose Add Node(s) to add the node back to the main cluster as a loader.
18 Fill in the details, and check the Clean Node option. Click Ok.
19 Log back in to the primary queen node and check the status of all nodes in the cluster using
ncli node show:
# ncli node show
+------------+----------------------------------+--------+--------+
| Node IP    | Node ID                          | Type   | Status |
+------------+----------------------------------+--------+--------+
| 39.64.8.13 | d4fb2e8ed9e941ce0736323e4b36cca1 | loader | Active |
| 39.64.8.2 | 80fd104ff2987da0dc619ea5b2d5ce6c | queen | Active |
| 39.64.8.3 | bd6447bc132f46e5a2d46b9e2392d06e | worker | Active |
| 39.64.8.4 | 1d922381a636c3f14576228cea13b1fd | worker | Active |
| 39.64.8.5 | 26732e6354ad04089af8d3c188908435 | worker | Active |
| 39.64.8.6 | cd06a044869007b2ed24826ceaa7aba3 | worker | Active |
| 39.64.8.7 | ed9f33a613d27c2be719455e5768f90f | worker | Active |
| 39.64.8.8 | c4274a9eb6d125acebb01a371ce19865 | worker | Active |
| 39.64.8.9 | ae302230fef1f761fdd7a46ce3295345 | worker | Active |
+------------+----------------------------------+--------+--------+
9 rows
20 You should see the new loader node that you added displayed in the output. You are now
ready to use the node as a loader.
Add New Nodes to the Cluster
Perform the following steps to add new workers or loaders to your Aster Database cluster. This
procedure installs the Aster Database software on the worker and loader machines and adds
them to the cluster.
Warning! Be sure there is no data stored on the worker and loader machines. If you are using machines that have
seen previous service as Aster Database workers, re-install the operating system on them, or clean their filesystems
as explained in “Delete All Data to Re-Provision a Node” on page 71.
Warning! Be sure the worker and loader machines all have the same time zone setting as the queen.
Delete All Data to Re-Provision a Node
There are two ways to delete all data from a node before re-adding it as a new node in Aster
Database:
• Specify Clean Node when adding the new node through the AMC. The AMC’s Add Node(s)
button gives you the option to delete any existing data (by checking the Clean Node check
box), or
• Delete the data manually, by following the instructions below:
Before adding a machine as a worker or loader node, remove all user data from that machine.
This is particularly important if you wish to deploy a machine that has previously served as an
Aster Database node in your cluster.
Warning! If you wish to re-deploy a node that previously served as an Aster Database node, make sure the machine
does not contain any data you need, since you must delete all its Aster-stored data before you re-deploy it. As a
guideline, if your cluster is currently running at RF=2 (after removing the node that you will re-deploy), then it is probably safe to delete the node’s data as explained below.
1 To clean up an old Aster Database node for reuse in the cluster, delete the following files
and directories:
• /primary/w*z (where the asterisk represents the vworker number. For example, you
might see /primary/w5z (vworker number 5) and /primary/w12z (vworker number
12) here.)
• /primary/iointerceptor
• /primary/tmp/worker_status
• /primary/tmp
• /primary/.deleted
• /primary/.olddata
• /primary/upgradeState.*
• /primary/beehive_id
• /primary/initialPartitionCount
• /primary/queenDb* (if present)
2 Empty the contents of /etc/rc.local if the node was previously a part of an AMOS
install.
3 Delete the 'beehive' UNIX user and the 'beehive' UNIX group. To do this, SSH into the
machine as root, and run userdel beehive and groupdel beehive.
4 Reboot the machine. Rebooting ensures that data clean-up is completed.
5 The machine is now ready to be re-added as a new worker or loader node in Aster
Database.
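The cleanup list above can be reviewed safely before deleting anything. The sketch below only matches candidate paths against the documented patterns (a dry run); it deliberately does not remove any files:

```python
from fnmatch import fnmatch

# the cleanup targets listed in step 1, expressed as glob patterns
CLEANUP_PATTERNS = [
    "/primary/w*z", "/primary/iointerceptor", "/primary/tmp/worker_status",
    "/primary/tmp", "/primary/.deleted", "/primary/.olddata",
    "/primary/upgradeState.*", "/primary/beehive_id",
    "/primary/initialPartitionCount", "/primary/queenDb*",
]

def cleanup_targets(paths):
    """Return the subset of paths matched by any cleanup pattern."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in CLEANUP_PATTERNS)]

existing = ["/primary/w5z", "/primary/beehive_id", "/primary/keepme.dat"]
print(cleanup_targets(existing))  # -> ['/primary/w5z', '/primary/beehive_id']
```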
Add Nodes
This procedure installs the Aster Database software on the worker and loader machines and
adds them to the cluster.
Prerequisites
Before you add nodes, make sure you’ve:
1 Ensured that the operating system and any required patches are installed on the node;
2 Set up passwordless node-to-node SSH for the root user; and
3 Made a list of the IP addresses of the nodes you plan to add.
4 If the prospective node machine has been previously used as an Aster Database node, then
you may wish to clean its file system as explained in “Delete All Data to Re-Provision a
Node” on page 71. Alternatively, you can leave the old data in place and check the Clean Node
check box to allow Aster Database to delete the old data when adding the machine as a new
node.
Warning! Adding a node deletes all data on that node. Even if the node was previously an Aster Database node, re-
adding it will cause the deletion of all its data. If you want to replace or repair a node, please read the instructions in
“Address failed and suspect nodes” on page 58.
Procedure
1 Log in to the AMC.
2 Click Admin > Cluster Management.
Figure 34: Choosing Admin > Cluster Management in the AMC
3 Click Add Nodes.
Figure 35: Add Nodes Button in the AMC
The Add Nodes window appears.
4 In the Add New Node window, for each node you wish to add:
a Select a Node Type of worker (or loader if you want to add a loader).
b Choose IP to identify nodes by IP address.
c Type the IP Address of the node (the address of the node’s cluster-facing network
interface, which may be named em1 or p1p1 on some hardware/OS combinations).
d Choose a Display Name to identify the node in Aster Database.
e Optional: Type a Rack Id to indicate the hardware rack where the node resides.
f If you are re-provisioning a node machine that was previously used in Aster Database,
check the Clean Node check box.
If you are reusing hardware, then you may need to check the Clean Node box. This
setting removes all Aster Database-related data from the node. If you do not check this
option and Aster Database data is found on the node machine, the Add Node attempt
will fail. If you do check this option and Aster Database data or processes are found on
the node machine, they will be deleted or stopped and the Add Node operation will
proceed.
Figure 36: Add Node Settings in the AMC
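The Clean Node behavior just described reduces to a small decision table, sketched here for clarity (the outcome strings are illustrative, not actual AMC return values):

```python
def add_node_outcome(data_found, clean_node_checked):
    """Outcome of Add Node given leftover data and the Clean Node setting."""
    if data_found and not clean_node_checked:
        return "fail"               # old data present, not allowed to clean
    if data_found and clean_node_checked:
        return "clean and proceed"  # old data and processes removed first
    return "proceed"                # nothing to clean

print(add_node_outcome(data_found=True, clean_node_checked=False))  # -> fail
```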
5 To add more nodes at the same time, click the plus (+) sign button in the lower left of the
Add Nodes window. You can add all of the worker and loader nodes at the same time.
Warning! If your prospective node machine was previously deployed in Aster Database, the next step is likely to delete
the old Aster Database data on the prospective worker node. For information on what will be lost, please see “Delete All
Data to Re-Provision a Node” on page 71.
Figure 37: Add Nodes window in the AMC
6 Click OK to dismiss the Add Nodes window.
7 Aster Database adds the nodes. The nodes appear with a status of New, which changes to
Installing or Preparing, then Upgrading, and finally to Prepared. Note that sometimes the
state changes temporarily to Failed while the node is restarting.
Figure 38: Worker Node Statuses in the AMC
8 Wait for these indicators before proceeding:
a All worker and loader nodes change to Prepared.
Figure 39: Prepared Workers
b The Installed Version column shows the correct software version number.
c The message “Add node operation successful” appears in the upper-right corner of the
AMC.
Figure 40: Add Node Successful
Tip! It is normal for the nodes to reboot themselves a few minutes after being added. When you install Aster Database, and whenever you install a new Aster Database version, the installation might include operating system
updates that require a reboot. The nodes will automatically reboot themselves once to put the operating system
updates into effect.
Next Step:
Do one of the following:
• If the node appears as Failed, see “Node Failures” on page 54.
• If the node appears as Prepared, you may activate the cluster as explained in the next
section, “Activate Aster Database” on page 76.
Activate Aster Database
In this phase, you will activate the cluster, making it ready to load data and service queries.
Procedure:
1 Open the Aster Database Management Console (AMC) in a browser window. To do this,
navigate to http://<ip address of the queen>.
2 Click the Admin > Cluster Management > Nodes tab. Make sure all new worker and loader
nodes in the list show a Status of Prepared.
Tip! See “Node States” on page 51 for more details on node statuses. Worker and loader nodes should boot from the
queen over the network, go through various phases, and eventually reach the Prepared state.
3 Near the top of the Admin > Cluster Management > Nodes tab, click the Activate Cluster button.
4 A confirmation window pops up. Click OK.
Activation takes a couple of minutes. You can see the status in the status indicator.
You can watch the status of each node in the Admin > Cluster Management > Nodes tab.
The AMC shows cluster status in the upper left corner. The cluster status changes from
Stopped to Activating and finally to Active. Once the cluster is Active, it is ready to load data
and service queries.
Next Step:
Do one of the following:
• If the node appears as Failed, see “Node Failures” on page 54.
• If the node appears as Prepared, you must incorporate the node into the cluster as
explained in the next section, “Incorporate the New Nodes”.
Incorporate the New Nodes
In an active cluster, you can incorporate new or repaired nodes and bring them to a Passive
state (they host backup vworkers, which act as stand-bys but do not process queries) without
disrupting queries and loading operations. Bringing nodes to an Active state (they host active
vworkers, which process queries), by contrast, has the side-effect of briefly disrupting queries
and loading operations.
To incorporate one or more nodes:
1 Make sure the new nodes you wish to incorporate are in the Prepared state.
2 In the Nodes panel of the AMC, click the Activate Cluster button. For further information on
cluster activation, see “Activate Aster Database” on page 76.
After incorporation, the new node(s) may be in the Active or Passive state. Either Active or
Passive is an acceptable state for a node, but for performance you should strive to keep all
nodes in the Active state. When a node is Passive, it’s acting as a standby that holds copies of
vworkers’ data, but it is not contributing to query processing. You can make the node Active by
performing a Balance Process operation on it. For details about Passive nodes, see “Balance
Process” on page 78.
Balance Process
Balance Process is an Aster Database administrative action that load-balances the query
processing burden across all worker nodes in Aster Database. It will optimize performance
given the current data placement. The Balance Process step does not create new copies of data,
so it typically runs quickly. It also does some cleanup, deleting data that can no longer be used.
It briefly disrupts the cluster, aborting any in-progress transactions; this disruption can last
from a few seconds to a few minutes.
Important! Before you proceed to the Balance Process, be aware this process will cancel any running queries.
Verify there are no running queries on the system before doing this operation.
It is recommended that Balance Process be run at some point after Balance Data completes,
when a few minutes of downtime are acceptable, so that a new node’s processors are available
to the cluster.
You initiate Balance Process in the AMC by clicking the Balance Process button in the Admin >
Cluster Management tab. See the next section for instructions.
Procedure
The Balance Process procedure forces currently passive nodes to contribute to the handling
of database queries. This procedure temporarily interrupts the operation of the cluster. It
places active vworkers on all nodes that are able to host them.
Warning! While the Balance Process step is in progress, your cluster cannot process queries. Before you perform
the next step, make sure all running queries have finished successfully and that no new queries are allowed to
enter Aster Database. Already-running queries will be killed when you click Balance Process, and new queries
submitted after you click it will wait until the balancing is complete.
1 Log in to the AMC.
2 Navigate to the Admin > Cluster Management tab.
3 Click the Balance Process button.
A dialog box appears with the following message:
“Are you sure you want to initiate a Balance Process operation? Doing this will cancel any
running queries. Verify that there are no running queries on the system before doing this
operation.” Click OK after you have verified there are no running queries and no new
queries about to enter the system.
This action takes between a few seconds and a few minutes to complete. Once processing is
balanced, the cluster resumes handling queries and the new node(s) are part of the cluster.
You can check that the new nodes are participating in the cluster by looking for a status of
Active in the AMC’s Nodes panel.
Next Steps
If you are scaling out your cluster, you may want to perform a partition split now to increase
the number of vworkers in the cluster. See “Split Partitions” on page 79.
Split Partitions
Partition splitting is an Aster Database feature that helps you add vworkers so that you can
maintain an optimal ratio of CPU cores to vworkers as your cluster grows.
To scale out your cluster, you add worker nodes (as shown in “Add New Nodes to the Cluster”
on page 71). As you add worker nodes to the cluster, Aster Database does not automatically
increase the number of vworkers. In other words, the number of vworkers stays constant as
you add worker nodes (machines). This means that, as you add nodes to the cluster, the ratio
of CPU cores to vworkers will increase, and eventually your CPUs may become under-utilized.
If this happens, you can improve performance by increasing the number of vworkers (also
known as “splitting partitions”).
Teradata Aster recommends that you manage your cluster so that you have approximately two
CPU cores per vworker. For example, an 8-core node should typically host 4 to 6 vworkers. In
order to avoid having to split partitions, you may elect to set up your cluster with 6 vworkers
per 8-core node and then add nodes as your data grows, until your ratio falls below 4 vworkers
per 8-core node. Once the ratio falls below this point, it’s a good idea to split partitions to
make better use of the processing power of your nodes.
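The sizing guideline reduces to simple arithmetic: split when nodes host fewer vworkers than half their core count. A sketch of that rule of thumb (the threshold comes from this section's worked example, not from a product-enforced limit):

```python
def should_split(total_vworkers, worker_nodes, cores_per_node=8):
    """True when nodes host fewer vworkers than half their core count."""
    vworkers_per_node = total_vworkers / worker_nodes
    return vworkers_per_node < cores_per_node / 2

# 48 vworkers on 8 nodes = 6 per node: no split needed yet
print(should_split(48, 8))   # -> False
# the same 48 vworkers after growing to 14 nodes: time to split
print(should_split(48, 14))  # -> True
```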
Prepare for Partition Splitting
Before you split partitions, if you have not already expanded your physical cluster to the size
you need, do so now as shown below. (If you already have enough physical nodes, proceed
immediately to “Partition Splitting Procedure” on page 80.)
1 Determine how many vworkers you need. First, check the current partition count by
opening a command shell on the queen as the Aster Database administrator and viewing
the partition count file:
$ cat /home/beehive/config/totalPartitionCount
The count shown is the total, cluster-wide count of the number of primary vworkers.
(That is, the count is per cluster, not per-node.)
Tip! You can also find out the initial partition count that was configured when Aster Database was installed. To do this,
look at the file:
$ cat /home/beehive/config/initialPartitionCount
2 Add the desired number of new worker nodes to your cluster. See “Add New Nodes to the
Cluster” on page 71.
3 Activate the new nodes. See “Incorporate the New Nodes” on page 77.
Next Step
Proceed to “Partition Splitting Procedure”, below.
Partition Splitting Procedure
Partition splitting increases the number of vworkers in your cluster. We refer to the number of
active, primary vworkers in your cluster as the cluster’s “partition count.” You will typically
perform a partition split after you have added more physical machines to the cluster.
Warning! Your Aster Database system will be unavailable to users during the partition
splitting operation.
1 Log in to the AMC and check the following:
a Make sure Aster Database is Active.
b Make sure all clients and SQL users have logged out from the cluster.
c Make sure all queries have finished running.
Warning! All client sessions to Aster Database need to be terminated before partition splitting can be started. If any
client sessions remain, partition splitting will immediately encounter an error and display this message:
Partition splitting failed: Error notifying queen.
Will terminate online partition splitting.
If you see this message, disconnect all clients and run the partition split again.
2 After all queries have finished, open a SQL session in ACT and type:
COMMIT;
Issuing “COMMIT” causes remaining prepared transactions to run, if any are present on
the cluster. Check the AMC to verify that all transactions have finished running. After they
have finished, proceed to the next step.
3 Make a note of the concurrency threshold (the “QoS”). You will use this number later,
when you need to return the QoS to this setting. The default for QoS for normal cluster
operations is dependent on your hardware and is set automatically by the installer.
# ncli qos showconcurrency
Concurrency is 4
4 Set the concurrency to zero. This prevents all SQL users (even you) from logging in to the
system:
# ncli qos setconcurrency 0
Concurrency is 0
5 Run the ncli command changepartitioncount. Replace <newpartitioncount> with
your desired partition count:
$ ncli system changepartitioncount <newpartitioncount> {<parallelism>}
The <newpartitioncount> must be greater than the current partition count.
Optionally, you can also pass the argument <parallelism> to indicate that you wish the
splitting to be done in a parallel fashion. Replace <parallelism> with the integer
number of tables that should be split concurrently at any given moment. Typical values are
8 or 16. The default is 1.
The operation may take a number of hours, depending on the amount of data in your
cluster. To complete the operation, soft restart Aster Database. After the restart finishes,
the partition split is complete.
If the operation fails, you should restart it by re-running the changepartitioncount
command, using the same parameters you used the first time. If the second attempt fails,
please contact Teradata Global Technical Support (GTS).
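The optional <parallelism> argument can be pictured as a bounded worker pool over the tables to be split. This sketch only models the scheduling — the actual repartitioning is performed internally by ncli, and the helper name is invented:

```python
from concurrent.futures import ThreadPoolExecutor

def split_tables(tables, parallelism=1):
    """Split each table, with at most `parallelism` splits in flight at once."""
    def split_one(table):
        # stand-in for the per-table repartitioning work
        return f"{table}: split"
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(split_one, tables))

print(split_tables(["clicks", "users", "orders"], parallelism=2))
```

Higher parallelism (8 or 16) finishes sooner on large schemas at the cost of more concurrent load, which is why the default is a conservative 1.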
6 Log in to the AMC and go to the Node > Partition Map tab to monitor the progress of your
partition split. Green squares represent active vworkers. When the number of active,
primary vworkers reaches your desired partition count, the split is complete.
You can also verify your new number of partitions by logging in to the queen command
line and viewing the file, /home/beehive/config/totalPartitionCount.
7 Once the split operation is complete, you must restore the Aster Database concurrency
threshold (QoS) to its normal value:
# ncli qos setconcurrency 4
Concurrency is 4
8 Perform a Balance Data operation.
9 Perform a Balance Process operation.
10 Activate Aster Database.
CHAPTER 6
Queen Replacement
Aster Database provides a facility for replacing your queen node if it fails. This facility is called
queen replacement. Queen replacement requires that you use one of your nodes (either the
secondary queen node or a loader node, depending upon how your appliance is configured) as
the new queen. The failed queen can later be removed from the rack and replaced with a new
loader or secondary queen.
The contents of this chapter are:
• Introduction to Queen Replacement
• Replace a Failed Queen
• What is kept and what is lost during queen replacement?
• Best Practices for Ensuring Queen Recoverability
• Supporting Procedures for Queen Replacement
Introduction to Queen Replacement
Aster Database provides a facility for replacing the queen node with new hardware if the
queen fails. This facility, called queen replacement, requires you to install a secondary queen
on hardware identical to that of the queen. If the queen fails, you can run the queen
replacement script on the secondary queen, transforming it into your active queen.
To ensure that your queen replacement will go smoothly, follow the best practices listed in
Best Practices for Ensuring Queen Recoverability (page 84) at all times.
Replace a Failed Queen
In the instructions that follow, we use the name “secondary queen” to refer to the machine
that will act as the new queen, and we use the name “failed primary queen” to refer to the
queen that has failed.
Prerequisites
The following prerequisites must be satisfied before queen replacement can be attempted. If
not, the queen replacement procedure will terminate without modifying the cluster.
All the prerequisite requirements will be verified during the queen replacement process.
1 The cluster must have been activated and running at a replication factor of 2 (RF=2) when
it was last functional.
2 If you do not already have a secondary queen, install a secondary queen now, with the
same version of Aster Database as the primary queen. The procedure for this is shown in
Install the Secondary Queen Software (page 85).
3 Passwordless SSH must work in both directions between the replacement queen and all the
nodes of the cluster as the root user. To set this up, see “Set Up Passwordless Root SSH” on
page 85.
4 Make sure that all nodes of the cluster are running at the time of attempted queen
replacement.
5 Make sure that all nodes of the cluster are able to boot and that the replacement queen is
able to reach all nodes of the cluster over the network.
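The prerequisites amount to a pre-flight check. A hedged sketch of that check (the real replace-queen script performs its own verification; the node addresses here are invented):

```python
def can_replace_queen(replication_factor, node_reachable):
    """All prerequisites hold: RF=2 and every cluster node up and reachable."""
    if replication_factor != 2:
        return False  # prerequisite 1: cluster must have been at RF=2
    return all(node_reachable.values())  # prerequisites 4-5: all nodes up

nodes = {"39.64.8.3": True, "39.64.8.4": True, "39.64.8.5": True}
print(can_replace_queen(2, nodes))  # -> True
print(can_replace_queen(1, nodes))  # -> False
```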
Run the Queen Replacement Script
Now run the queen replacement script replace-queen to apply the failed primary queen’s
list of workers to the new queen. Queen replacement is invoked from a bash command line.
1
Log in to the secondary queen as root.
2
Change the working directory to the location of the replace-queen script:
# cd /home/beehive/bin/exec/
3
The script takes as an argument --workerIp which is the IP address of a worker reachable
using scp. This is used to locate the hosts file, which is used to identify the nodes in the
cluster.
On the appliance, the argument --keepSecondaryQueenIp is required to indicate that
the IP address of the secondary queen should not be changed to that of the primary queen:
# replace-queen --workerIp <Worker_IP> --ignoreWorkerList=queenDb-0
--keepSecondaryQueenIp
4
If you have not stopped the failed primary queen, the queen replacement script will
automatically power off the node at this point.
5
After the replace-queen script has completed, use ncli to perform a soft restart on the
queen. Because the queen process is not yet running, it is not possible to use the AMC for
the soft restart:
# ncli system softrestart
6
Verify that the queen replacement process completed successfully by inspecting this log
file: /home/beehive/data/logs/QueenReplacement.log. It should end with a
message similar to:
2013-03-21T14:49:34.989917 INFO 6349 RecoverUtil.py:32]
* Finished recovery
7
Point your browser to the AMC on the new queen and navigate to the Admin > Cluster
Management screen, and click Activate Cluster.
Your queen replacement is complete. The new queen is now your active queen. To ensure
recoverability, you should install a new secondary queen now.
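The log inspection in step 6 can be scripted. This is a sketch; it assumes only that the completion message contains the words “Finished recovery”, and check_recovery_log is a helper name invented here.

```shell
# Sketch: check that QueenReplacement.log ends with the recovery marker.
LOGFILE="/home/beehive/data/logs/QueenReplacement.log"

check_recovery_log() {
    # Succeeds only if the last few lines contain "Finished recovery".
    [ -f "$1" ] && tail -n 5 "$1" | grep -q "Finished recovery"
}

if check_recovery_log "$LOGFILE"; then
    echo "queen replacement completed"
else
    echo "recovery marker not found; inspect $LOGFILE" >&2
fi
```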
What is kept and what is lost during queen
replacement?
Reset:
The Aster Database replication factor (RF) is reset to 2 during the replacement.
Kept:
The following items survive the queen replacement:
•
All databases and their data
•
Aster Database statistics
Lost:
The following items are lost during queen replacement:
•
Aster Database logs
•
Scripts and code that you may have stored locally on the queen
Best Practices for Ensuring Queen
Recoverability
To be prepared to replace your queen, you should follow the best practices listed below.
•
Always run Aster Database with RF=2.
Warning! Always run Aster Database with RF=2 (that is, with a replication factor of two). If you try to replace the
queen of a cluster that has replication factor of 1, it will not work and you may lose your data.
•
If you have a secondary queen node in your appliance, keep it in service with its role as a
secondary queen node, as shipped. This is a server with the same version of Aster Database
software installed as your active queen, but not connected to any workers. If you convert it
to a loader node, it will not be available for use as a secondary queen until you remove the
loader software and install the queen software on it.
It is also possible to install a secondary queen after your queen has failed, and perform
queen replacement using that secondary queen. However, Teradata Aster urges you to
avoid this because racking, cabling, and installing a new queen takes time.
•
Upgrade your secondary queen whenever you upgrade your primary queen.
•
Make a note of your existing queen’s network settings. This includes the Public IP Address,
Public DNS, Private DNS, netmask, gateway, and NIC bonding settings, if any.
•
Keep an extra backup copy of any scripts or code that you store on the queen in a separate
location.
•
Optional: Set up a remote management interface, for example HP's Integrated Lights-Out
(iLO), for each worker node in Aster Database so that you are still able to reboot the node
even if you have lost your main network connection to it.
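The note-taking recommendation above can be partly automated. The following is a sketch; the output path is only an example, and the resulting file should be kept somewhere other than the queen itself.

```shell
# Sketch: snapshot the queen's network settings so they can be re-entered
# during queen replacement. OUT is an example path -- copy the file off
# the queen after running this.
OUT="${OUT:-/tmp/queen-network-settings.txt}"
{
    echo "== interfaces =="; ip addr show 2>/dev/null || true
    echo "== routes ==";     ip route show 2>/dev/null || true
    echo "== dns ==";        cat /etc/resolv.conf 2>/dev/null || true
    echo "== bonding ==";    cat /proc/net/bonding/* 2>/dev/null || true
} > "$OUT"
echo "settings saved to $OUT"
```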
Supporting Procedures for Queen Replacement
Set Up Passwordless Root SSH
The queen replacement script requires that passwordless SSH be set up among all nodes for
the user root. Use the same SSH key as your cluster was using before the primary queen failed.
This will be the SSH key from:
•
the primary queen, if it is still reachable, or
•
from one of the workers in the cluster, or
•
if you can’t do one of the above options, you may generate a new key.
Procedure to Copy the SSH Key to the Secondary Queen
Copy the existing SSH key from the node where it resides to your secondary queen:
# cd
# scp -pr .ssh root@<secondary-queen-IP>:
Then test that SSH is working among all nodes without a password, and continue with Run
the Queen Replacement Script.
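The “test that SSH is working” step can be done with a loop such as the sketch below; BatchMode makes ssh fail instead of prompting, so a link that still wants a password shows up as FAIL. NODES is a placeholder for your node IPs, and ssh_ok is a helper name invented here.

```shell
# Sketch: verify passwordless root SSH from this machine to every node.
# NODES is a placeholder -- substitute your cluster's node IPs.
NODES="${NODES:-}"

ssh_ok() {
    # BatchMode=yes fails immediately rather than prompting for a password.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$1" true 2>/dev/null
}

for node in $NODES; do
    if ssh_ok "$node"; then
        echo "OK: $node"
    else
        echo "FAIL: passwordless SSH to root@$node is not working" >&2
    fi
done
```

Remember that the queen replacement script needs passwordless SSH in both directions, so repeat the check from a worker back to the secondary queen.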
Install the Secondary Queen Software
To enable your cluster to recover from a queen failure, you must have a secondary queen
available. The following procedure should be used to turn one of your loader nodes into a
secondary queen under one of these circumstances:
•
You have appliance 2, which didn't come with a secondary queen, so this procedure is
required prior to queen replacement, or
•
You have appliance 3, and you previously converted your secondary queen to a loader
using the procedure in Convert the Secondary Queen to a Loader (page 72), but now you
want to convert it back into a secondary queen.
The version of the Aster Database software that you install on the secondary queen must
match the version of Aster Database that the rest of your cluster is running. Install the queen
software on the loader node as shown below. This transforms the loader into your secondary
queen. For clarity, we’ll call your existing, failed queen the “primary queen” in this discussion.
We use the name “secondary queen” to refer to the machine that will act as the new queen.
Note! This procedure converts your loader to be the new queen. The queen can handle loading tasks while acting as
queen, but you should contact Teradata support immediately for a replacement loader node.
1
If your primary queen is still operational, remove the loader from the cluster before
installing the secondary queen. This can be done by one of the two following methods:
•
Using the remove icon for the loader on the Admin tab of the AMC
•
A command from root on the queen:
ncli system removenode ip_address_of_loader
2
Get the Aster Database installer binary, AsterInstaller_5-10.bin, and copy it to the
/tmp directory on the loader node.
3
SSH or log in as user root on the loader node. In order to run properly, the Aster Database
installer requires that you be logged in as root.
4
Using a text editor, remove the failed queen’s IP address from the
/root/.ssh/known_hosts file.
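As an alternative to hand-editing, the entry can be stripped with a one-line filter. This is a sketch; remove_known_host is a helper name invented here, and the address shown is a placeholder. (If you prefer the OpenSSH tooling, ssh-keygen -R <failed-queen-IP> -f /root/.ssh/known_hosts does the same job and also handles hashed entries.)

```shell
# Sketch: remove the failed queen's host key from root's known_hosts.
remove_known_host() {
    # $1 = host/IP, $2 = known_hosts file. Deletes lines whose host field
    # begins with $1 (dots escaped so they match literally).
    local escaped
    escaped=$(printf '%s\n' "$1" | sed 's/\./\\./g')
    sed -i "/^${escaped}[ ,]/d" "$2"
}

FAILED_QUEEN_IP="10.10.60.100"   # placeholder -- use your queen's address
KNOWN_HOSTS="/root/.ssh/known_hosts"
if [ -f "$KNOWN_HOSTS" ]; then
    remove_known_host "$FAILED_QUEEN_IP" "$KNOWN_HOSTS"
fi
```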
5
Perform a local stop on the loader node:
# /etc/init.d/local stop
6
Make the installer executable (the installer file name below is just an example; replace it
with the appropriate name for use on your operating system):
# chmod +x AsterInstaller_5-10.bin
7
Run the Aster Database installer from a command shell on the loader:
# ./AsterInstaller_5-10.bin
8
In the Welcome screen, select OK and press <Enter>.
What keys do I use to navigate the installer?
The upper part of the installer window always lists the actions you can take or the information the installer wants you
to supply. Use these keys to navigate:
• Tab moves from field to field. Highlighting shows the currently active field or button.
• Enter executes the highlighted button or field.
• Esc exits the installer at any point. You’ll be asked to confirm before the installer aborts.
• The Help button provides context sensitive help/information.
• Ok and Back allow the user to navigate between screens. Keyboard shortcuts are indicated by the underlined
letter e.g. Ok can be “clicked” by pressing Shift+O.
Note: Shortcuts are not triggered by the Alt key in the installer.
9
In the Previous Installation screen, choose 2. No, perform a clean install and click OK. All Aster
Database-related data will be deleted from the node on which you are installing. Wait for
the cleanup to finish, after which the installation continues automatically.
If prompted to uninstall an earlier version of Aster, please do so.
10 The Manage node operating system window appears. Choose No, Node OS is pre-installed.
11 In the Installation Type screen, choose Production Install.
12 In the Select the network device screen, highlight bond0 and press <Enter> to choose the
primary interface used to communicate with the worker and loader nodes.
13 In the Verify Networking Information screen, check your network settings and, if needed, fix
them as explained below:
•
In the Enable NIC bonding field, type “y”.
•
The Slave Network Interfaces field lists the slave network interfaces to be bonded into the
connection. Specify these as a comma-separated list without spaces, as in: eth4,eth5.
•
In the Default gateway IP field, always use the address of an actual gateway. Do not use
the queen’s IP address as the gateway address. Use only one default gateway. Multiple
gateways are not supported.
•
Enter/edit the NTP server IP address (if needed). Aster Database nodes use the Network
Time Protocol (NTP) to synchronize their clocks. By default, the hardware clock of the
queen (at IP 127.127.1.0) is used as the NTP server (the synchronizing clock). You can
change this IP to that of another NTP server.
Note: Also shown in this window are the machine’s IP Address, Netmask, Default gateway
IP, Subnet prefix address, and Broadcast IP. The installer reads these settings from the
node’s existing network configuration, so you typically do not need to edit these fields.
•
To continue, Tab to OK and press <Enter>.
14 In the SSH private key for root user screen, choose No since you have already set up
passwordless SSH between all nodes in your cluster.
15 To select the queen node type, choose Secondary Queen.
16 In Please provide the following information about your cluster, specify the same information as
specified for the primary queen:
a
How many worker nodes (machines) will be in the cluster.
b
How many physical CPU cores each worker has.
c
How much memory each worker will have, in GB.
17 Tab to OK and press <Enter>.
18 Review the suggested values for the Number of primary virtual workers, Database replication
factor, and Maximum concurrency. Accept the default value for all the settings on this screen.
a
The Number of primary virtual workers is the number of primary vworker instances in the
cluster.
b
The Database replication factor is the number of copies of each vworker that should be
maintained (replication factor).
c
The Maximum concurrency (also called Quality of Service or QoS) is the number of
concurrent sessions allowed at any given time.
19 Click OK and wait for the installer to install the Aster Database queen software.
20 Once the Aster Database installation is complete, reboot the machine when prompted.
After the installer finishes, the machine reboots and the services are started. When
installing a new queen, the Aster Database Services may take ten or more minutes to start
because they must apply a number of SQL upgrades and other upgrades.
21 SSH back into the machine, and verify that the newly installed Aster Database software is
running:
# /etc/init.d/local status
Look for a status of “started.” (A status of “starting” means the queen is not yet ready.)
Tip: It is normal for the queen to reboot itself a few minutes after being added. When you install Aster Database, and
whenever you install a new Aster Database version, the installation might include operating system updates that
require a reboot. The queen will automatically reboot itself once to put the operating system updates into effect.
Your secondary queen software is now installed.
CHAPTER 7
Administrative Operations
This section explains how to start, activate, and manage your Aster Database using command-line tools and the Aster Management Console (AMC). The AMC is Aster Database’s browser-based cluster management tool.
•
Cluster Management
•
Manage Network Settings
•
Manage Backups
•
Configure Cluster Settings
•
Roles and Privileges
•
Configure Hosts
•
Restart Aster Database
•
Activate Aster Database
•
Balance Data
•
Balance Process
•
Cluster Management from the Command Line
See also:
•
Cluster Expansion
•
Administrative Operations
Cluster Management
The Cluster Management page (Admin > Cluster Management) lets you manage your Aster
Database cluster.
Figure 41: AMC Cluster Management page
The Cluster Management page lets you perform the following operations:
Table 7 - 1: Cluster Management Table Operations
Task: Restart all nodes in the cluster.
Section: Restart Aster Database.

Task: Register new hardware as a worker or loader node in the cluster.
Section: Add New Nodes to the Cluster.

Task: Bring the cluster online, incorporate newly added nodes into the cluster, or activate nodes in the cluster.
Section: Activate Aster Database.

Task: Ensure the data in the cluster is fully replicated.
Section: Balance Data.

Task: Balance the placement of vworkers to ensure data availability and efficient query execution.
Section: Balance Process.
Warning! Before you invoke the Balance Process, be aware that this process will cancel any running queries. Before doing this operation, verify there are no running queries and no queries about to enter the system.

Task: Upgrade the cluster to a newer version of the Aster Database software.
Section: Teradata Aster Big Analytics Appliance 3H Upgrade Guide.
Check Hardware Configuration
The Nodes > Hardware Config subpanel shows the hardware configuration detail for a selected
Aster Database node. The panel displays detailed compute, memory and storage information,
including a breakdown by processor, which is relevant in multiprocessor servers.
Figure 42: The Hardware Configuration Panel in AMC
Use this window to:
•
Find the MAC / IP address of a node. (Even if the node is currently down or unreachable,
provided it registered successfully in the past.)
•
Find the processor speed or processor type of a node.
•
Find the available memory of a node.
•
Find the disk capacity of a node.
Check Node Hardware Configuration
The Nodes > Node Name subpanel shows detailed information about a selected Aster Database
node. The panel displays detailed compute, memory and storage information, including a
breakdown by processor, which is relevant in multiprocessor servers.
Figure 43: The Node Inspection Panel in AMC
Use this window to:
•
Find the MAC / IP address of a node. (Even if the node is currently down or unreachable,
provided it registered successfully in the past.)
•
Find the processor speed or processor type of a node.
•
Find the available memory of a node.
•
Find the disk capacity of a node.
•
Find the location of the queen database replica.
•
Find which vworkers are on this node.
Click the Node Hardware Config tab to view the hardware configuration details, including
information on NIC bonding for the node.
Figure 44: Node Hardware Configuration in AMC
Remove Nodes
You can unregister a node to remove it from the list of nodes that Aster Database considers
part of the system. This is useful if a node has been permanently removed from the system,
such as in the case of a permanent node failure or the re-provisioning of a node.
It is highly recommended that node removal only be performed on nodes that have already
been physically removed from Aster Database. That is, it should only be used on nodes that
are shown as New or Failed in the AMC. Using the AMC to remove (unregister) an Active node
could cause Aster Database to transition to a stopped status.
Warning! Removing and re-adding a node is not the recommended way to address problems on a node, because
re-adding the node will delete the data stored on the node. Before you remove a node for node-maintenance purposes, please read the following:
If you want to repair a node, please read the instructions in “Address failed and suspect nodes” on page 58.
If you want to delete all data from a node and re-provision it as a new Aster Database node, see “Deleting All Data to
Re-Provision a Node” on page 19.
To remove a node from Aster Database, perform the following steps.
1
In the AMC, click Admin > Cluster Management.
2
In the Nodes panel, check that the targeted node is not currently Active (refer to the Status
sub-panel for that node).
3
In the Remove column, click the X button. A confirmation window appears.
4
Ensure the displayed address is the node you want to remove, and click OK to remove it.
5
Physically shut down or reboot the removed node machine now. Rebooting ensures that
data clean-up is done, so that the machine can later be re-added as a node in Aster
Database.
Warning! Once you have removed a node, you cannot immediately re-add that physical machine as a new node!
Attempting to add a previously used node will fail. If you wish to re-use a machine that has previously served as an
Aster Database node (or that has any data on it at all), you must clean the machine as explained in “Deleting All Data
to Re-Provision a Node” on page 19.
Tip! Normally, when you remove a node (by clicking on the blue X for the node), Aster Database will remove the node
from its list of nodes in the cluster, and will reboot the node. Rebooting the node will stop any current processes that
are executing on that node, and those processes will not be restarted after the node restarts.
If you are in the process of adding a new node, and if you tell AMC to remove the node (abort the add) while the node
is being cleaned, the cleanup may not finish completely and the node may not be rebooted. If cleanup does not finish
completely, you might still have some beehive processes running on the node. This is not normally a problem. You
may manually kill those processes, or you may reboot the node yourself. Similarly, you may manually remove any
remaining files, if necessary. (For more information about removing Aster Database-related files, see “Add node fails
with “user data directories are present” message” on page 232.)
If you later add the node back to a cluster by IP address using the Add Node button in the AMC, and if you put a check
in the "Clean Node" checkbox for that node (and don't remove the node again before the Add Node operation completes), the beehive processes, as well as data files and Aster Database program files, will be cleaned up completely
even if the previous clean-up was only partial.
Manage Network Settings
This section covers various network settings you can make through the AMC:
•
Multi-NIC Machines
•
NIC Bonding
•
Set up IP Pools in the AMC
Multi-NIC Machines
Multi-NIC machines enable the following capabilities in Aster Database:
1
NIC Bonding
2
Segment network traffic by function - For both AMOS and UMOS installations, you can
set up your multi-NIC nodes so that Aster Database traffic is segmented by function for
backup, loads and regular Aster Database traffic (called “default” traffic or “queries”). We
refer to this feature as the “network assignments” feature.
Below, we discuss network assignments set-up in the context of the AMC. However, you can
also configure networking for Multi-NIC machines using the commands found in the ncli
nsconfig section.
Segment network traffic by function
Aster Database enables you to use different subnets for different functions, in order to keep
network traffic separate for backup, loading and regular Aster Database traffic (queries). Some
reasons for designating a dedicated subnet for different Aster Database functions might
include:
•
Legal requirements
•
IT policies and restrictions, security
•
Resource allocation needs and performance
•
Access from outside the Aster Database subnet for specific functions (loading and backup)
You make the network configuration in two steps:
1
Create a configuration for each node,
2
Apply the configuration you created. Note that although this can be done without
restarting the cluster, any network operations that are already in progress will be
interrupted.
These steps are slightly different depending upon whether the cluster was installed as UMOS
or AMOS:
•
For AMOS, you can configure the network IP settings, manage NIC bonding, and assign a
subnet for each function.
•
For UMOS, you can assign each function to a NIC or a bond.
Tip! The IP addresses assigned to each function must be in the same subnet for all nodes. You may see the error
“The Loads and Backups Networks have not been configured on the queen and/or some nodes.” This error occurs if
one of the functions (loads/backups) is set up to use a particular subnet for the workers, but not for the queen (i.e. set
to the default IP).
But the AMC does not detect the problem if the queen has loads/backups assigned to use a specific subnet, but it is
not in the same network as the IPs assigned for those functions on the workers. So if you encounter errors when configuring network traffic by function, check to make sure the IPs assigned to each function are in the same subnet for all
nodes.
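The consistency rule in this tip can be checked mechanically. The sketch below compares two addresses under a netmask using per-octet arithmetic; same_subnet is a name invented here, and the addresses are illustrative.

```shell
# Sketch: report whether two IPv4 addresses fall in the same subnet.
same_subnet() {
    # $1, $2 = dotted-quad addresses; $3 = dotted-quad netmask.
    local IFS=.
    set -- $1 $2 $3   # expands to 12 octets: a1..a4 b1..b4 m1..m4
    [ $(($1 & $9))    -eq $(($5 & $9))    ] &&
    [ $(($2 & ${10})) -eq $(($6 & ${10})) ] &&
    [ $(($3 & ${11})) -eq $(($7 & ${11})) ] &&
    [ $(($4 & ${12})) -eq $(($8 & ${12})) ]
}

# Example: both addresses sit in 10.10.60.0/24.
same_subnet 10.10.60.5 10.10.60.200 255.255.255.0 && echo "same subnet"
```

Run it against the IP assigned to each function on the queen and on a worker; a mismatch is exactly the misconfiguration the AMC cannot detect.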
Configure network settings
You assign Aster Database functions to their own subnets by using the AMC network settings.
To view and/or edit the network settings:
1
In the AMC, select Admin > Configuration > Network from the drop-down options.
2
The AMC Network Overview screen will appear, showing each node and its current settings.
Warning! The settings displayed in the AMC Network Overview screen reflect the current state of the cluster.
The information is obtained by querying the system for bindings, and as such, may not reflect the same settings as
those in the configuration files. For example, if settings have been made but not applied, the settings displayed will be
those in effect currently, even though a restart of network services will apply the settings as they have been configured.
Figure 45: The AMC Network Overview screen
Note the following permissions will be applied to this list depending upon the type of
Aster Database installation (UMOS or AMOS).
Table 7 - 2: AMC network permissions for types of Aster Database installations
UMOS installation:
Bonding Configuration: View only
IP Settings: View only
Subnet Function Assignment: View and Edit

AMOS installation:
Bonding Configuration: View and Edit
IP Settings: View only for the primary IP assigned in the installer; View and Edit for additional IPs
Subnet Function Assignment: View and Edit
For each node, you can assign an IP address or NIC for each of the following functions.
•
Queries - for internal database communication between nodes (default).
•
Loads - applies to loaders only.
•
Backups - applies to backups.
Note that if you do not assign an IP address or NIC for backups or loads, the default
(queries) setting will be used.
3
Click the Configure button on the far right hand side for the node whose network settings
you want to configure. In the network configuration window for the node, you will see
three tabs for AMOS: Current State, Edit Configuration, and Network Assignments. For UMOS,
the Edit Configuration tab does not appear.
Figure 46: The Network Current State tab in AMC
4
Use the Current State tab to view details on the current network assignments and
configuration for this node.
5
Make your network configurations and subnet assignments for Aster Database functions:
For AMOS only:
a
Use the Edit Configuration tab to assign and configure network settings (IP Address,
Subnet Mask, and Gateway), and manage network interfaces. Drag interfaces into the
Interface(s) box of an existing network to configure the interface for that network. You
can drag interfaces to and from other networks or the Unmanaged Interfaces box.
However, each interface must be physically cabled such that it can route through the
assigned network. All network configurations must have an IP Address, Subnet Mask,
and at least one interface.
Figure 47: The Network Edit Configuration tab in AMC
b
To configure NIC Bonding for an AMOS installation, assign two or more interfaces
(NICs) to one IP address to create the bond.
c
To create a new network, click the Add Network button in the Edit Configuration tab.
Assign the new network an IP Address, Subnet Mask and Gateway. Then, drag and
drop an interface into the Interface(s) box for the new network.
If you see NIC names such as em1 or p1p1 (instead of eth0), do not be alarmed. These
network interfaces can be configured using the same procedure as used in the example.
Warning! Do not attempt to set up more than one gateway. Multiple gateways are not supported.
d
When finished, click the Save button to save your settings or Close to cancel. Note that
clicking Save & Apply will apply your changes by restarting the network, interrupting
any network activity that is in process, but it does not require a restart of Aster
Database.
e
Use the Network Assignments tab to assign Aster Database functions to a subnet for this
node. The default or “queries” subnet will use the primary IP Address. You can
optionally assign a different subnet for Loads and/or Backups using the drop-down
selector. Click the Save & Apply button to save your settings or Close to cancel.
Figure 48: The Network Assignments tab in AMC
For UMOS only:
a
You manage network settings (IP Address, Subnet Mask, and Gateway) in the OS, so
the Edit Configuration tab does not appear.
Tip! For UMOS, if you are using NIC Bonding, you will have to set it up manually in the OS before installing Aster
Database.
b
Use the Network Assignments tab to assign Aster Database functions to a NIC or a bond.
The default or “queries” subnet will use the primary IP address. You can optionally
assign a different subnet for Loads and/or Backups using the drop-down selector. Click
the Save & Apply button to save your settings or Close to cancel.
Apply network settings
To apply the settings you just made immediately, click Save & Apply on the screen where you
made the settings. If you choose to only save your settings and want to apply them later, click
Save. Your settings will be saved to the network configuration files, and applied automatically
when network services are restarted.
Warning! Applying the network settings is accomplished by restarting network services with the new settings.
Because of this, any operations that are currently running over the network will be interrupted. Be sure that there are
no active queries, loads, or backups before applying network settings.
Figure 49: Confirmation to Apply Network Settings in AMC
Example network configuration
Load Data from Outside the Aster Database Subnet
Loaders can exist on both the Aster Database subnet and a loading subnet (for example, an
outward facing subnet). The latter allows loading to be done from a machine not on the Aster
Database subnet. In this scenario, the loader can still perform its duties in the Aster Database
cluster, because the network configuration allows loading traffic from the “outside” loading
subnet.
Query network status
From the Admin > Configuration > Network panel, click the Query Network Status button to view the
status of the network. A screen similar to the following should appear indicating the health of
the network.
Figure 50: Query Network Status in AMC
NIC Bonding
Aster Database supports balance-alb (adaptive load balancing) NIC bonding to enable fault
tolerance (automatic failover of network links) and to improve cluster performance by
aggregating multiple network connections to form a single, virtual connection (“bandwidth
aggregation”).
Warning! Do not set up NIC bonding before performing an AMOS installation. If you wish to use NIC bonding with an
AMOS installation, you must set up bonding using the AMC or ncli after installing Aster Database. You must remove
any preexisting NIC bonding settings before beginning an AMOS installation of Aster Database.
Warning! Aster Database supports only mode 6 (balance-alb mode) of NIC bonding. No other NIC bonding modes
are supported. The 802.3ad link aggregation standard is not supported.
Warning! The type of bonding used in Aster Database (mode 6/balance-alb bonding) is not effective in a gatewayed
configuration. Therefore, for proper NIC bonding operation, you must place all Aster Database nodes in the same
subnet. You cannot place gateways between any two nodes.
NIC bonding benefits
NIC Bonding offers two advantages, described below.
•
Bandwidth Aggregation: NIC bonding allows a multi-NIC machine to use its multiple
network interfaces for expanded network throughput in most situations. Contrary to the
general assumption, bonding two NICs does not double throughput between any two
nodes. Instead, the overall throughput is doubled when a node is communicating with
multiple peers. This is because the bonding driver associates a peer with a particular
physical interface, based on the peer's MAC address. As a result, all traffic to that peer
travels over only that NIC. In Aster Database, since the queen communicates with multiple
workers and workers communicate among themselves, bonding two NICs tends to double
the overall throughput of the cluster.
•
Transparent failover in the event of any single network failure: If you use redundant
switches (for example, cable all eth0 NICs through one physical switch, and cable all eth1
NICs through a separate physical switch, and trunk the switches together), then if a single
network path fails (a NIC interface, cable, or switch), the link automatically fails over to
the remaining, active link(s). To achieve this, Aster Database uses MII monitoring at
100 ms intervals to monitor link availability.
In the event of a failover, the available bandwidth is reduced to that of the remaining links.
For example, if each node has four 1-GbE NICs with two network paths, each connected
through its own GbE switch, and one of the switches fails, the result is an automatic
failover to the remaining switch. When this happens, the maximum available bandwidth
shrinks from that of four 1-GbE links to that of two 1-GbE links.
NIC bonding requirements
Observe the following requirements in setting up your environment to support NIC bonding
in Aster Database:
•
Cabling: NIC bonding only works in a local network configuration. You must cable all
NICs which will be bonded so that they reside on the single Aster Database subnet (which
must contain all the worker and loader nodes). All nodes must be locally connected over
one switch or one set of trunked switches, so that each node is directly reachable from all
other nodes.
•
No gateways in the cluster! You cannot have a network gateway in the path between any
Aster Database nodes. This is because Aster Database uses mode 6 (balance-alb mode) of
NIC bonding, and mode 6 is not effective in a gatewayed configuration. (Mode 6 balances
incoming and outgoing traffic across interfaces based on the MAC addresses of connected
peers. Its logic avoids grouping all peers on a single NIC.)
• Use redundant switches: For a given machine, Teradata Aster recommends cabling each
NIC to a separate physical switch. The switches must share the same subnet and be
trunked together. This improves network availability.
• Routing entries: Make the appropriate routing entries at the gateway router for the Aster
Database subnet.
• Use the Aster Database installer and follow these instructions:
Teradata Aster Big Analytics Appliance Database Administrator Guide
102
Administrative Operations
Manage Network Settings
• For AMOS installations, do not configure network bonding before running the Aster installer. After installing Aster Database, you may set up NIC bonding using the AMC or ncli.
• For UMOS installations, set up NIC bonding manually on all nodes before installing Aster Database, as shown below.
Note that 10-Gigabit Ethernet is not supported on systems running CentOS 5.6 with NIC
bonding.
Manual configuration of NIC bonding (UMOS)
For a UMOS install, you must configure NIC bonding manually, using steps like the following. This example shows the steps for setting up bond0 using eth0 and eth1; your setup may differ:
1. For eth0, edit /etc/sysconfig/network-scripts/ifcfg-eth0 as follows:
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=NO
2. Then for eth1, edit /etc/sysconfig/network-scripts/ifcfg-eth1 as follows:
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=NO
3. Create /etc/sysconfig/network-scripts/ifcfg-bond0 with the following settings, but replace NETWORK, NETMASK, and IPADDR with your queen's values:
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
NETWORK=10.10.60.0
NETMASK=255.255.255.0
IPADDR=10.10.60.100
USERCTL=NO
4. Modify /etc/modprobe.conf to include:
alias bond0 bonding
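On systems where the bonding driver takes its parameters from modprobe.conf, the bonding mode and monitoring interval described earlier in this section can also be stated explicitly with an options line. The following excerpt is an illustration only, not something the Aster installer is documented to write; it spells out balance-alb (mode 6) with 100 ms MII monitoring, matching the mode and interval this guide describes. Confirm the correct location for these settings against your OS documentation:

```
# Illustrative /etc/modprobe.conf lines: the alias from step 4, plus module
# options selecting balance-alb (mode 6) and a 100 ms MII polling interval.
alias bond0 bonding
options bond0 mode=balance-alb miimon=100
```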
5. Reboot the queen.
6. To verify that NIC bonding is working, issue the command:
ifconfig -a
You should see bond0 in the output. Next, issue:
cat /proc/net/bonding/bond0
You should see that it is bonded. You should also see the slave interfaces.
NIC bonding advisories
To check that NIC bonding is operational, open the file /proc/net/bonding/bond0 on the node and inspect the Currently Active Slave field to see that one of the slave interfaces is in use.
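The same check can be scripted. The excerpt below is a hypothetical sample of /proc/net/bonding/bond0 contents (the field names follow the Linux bonding driver; your actual output will differ), parsed to pull out the active slave:

```shell
# Sample bonding status text (hypothetical; on a real node you would read
# /proc/net/bonding/bond0 instead of using this inline sample).
status='Bonding Mode: adaptive load balancing (alb)
MII Status: up
Currently Active Slave: eth0
Slave Interface: eth0
Slave Interface: eth1'

# Extract the "Currently Active Slave" field to confirm a slave is in use.
active=$(printf '%s\n' "$status" | awk -F': ' '/Currently Active Slave/ {print $2}')
echo "active slave: $active"
```

Running this against the sample text prints the interface currently carrying traffic for the bond.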
Activate NIC bonding, post-installation
Normally, you set up NIC bonding using the Aster Database installer. To activate NIC bonding
in Aster Database after installation, you can use the script ConfigureOS.py as follows:
1. Stop local services.
2. Run ConfigureOS.py with the --nic_bonding flag, as shown here:
/home/beehive/bin/lib/configure/ConfigureOS.py --nic_bonding=on
3. Start local services.
Check NIC bonding configuration
In the beehiveparams.cfg configuration file, the following parameter controls NIC
bonding:
• nic_bonding=<on|off>: Indicates if bonding is enabled (on) or disabled (off)
• slave_interfaces=<eth0,eth1,...>: List of slave interfaces that should be included
in the bond. These can differ based on your operating system and hardware, so use the
NIC names from your own environment. Changing this has no effect after the cluster has
been started for the first time.
• eth_intf=<ethX>: Name of the primary interface (configured during installation).
Changing this has no effect after the cluster has been started for the first time.
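Taken together, the bonding-related portion of beehiveparams.cfg might look like the sketch below. The parameter names come from this section; the interface values are examples only and must match your own hardware:

```
# Illustrative beehiveparams.cfg excerpt (values are examples only)
nic_bonding=on
slave_interfaces=eth0,eth1
eth_intf=eth0
```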
Set up IP Pools in the AMC
This section applies only to AMOS clusters. Aster Database automatically assigns an IP
address to new worker and loader nodes when they are added to the cluster. You specify the
range of IP addresses that will be used by setting up one or more “IP pools”. This step must be
done after installing Aster Database on the queen, but before adding new worker and loader
nodes.
Tip: For a User-managed installation (UMOS), IP pools are not used. Nodes are assigned an IP address through the OS
directly, and then added to the Aster Database cluster by IP address.
The queen’s IP address has no effect on the allocation scheme other than the fact that it is
already in use and cannot be allocated again. All new nodes will be allocated from the start of
the pool for that node type. Because of this, the pool should be modified prior to adding new
nodes to the system.
These rules govern the settings for IP pools:
• All IP addresses must be on the same network.
• There can be no overlap between the IP ranges.
• IP pools are not required to be contiguous.
• The IP pools must include all existing nodes, with the correct type:
  • shared (any type)
  • queen
  • worker
  • loader
• An IP pool may consist of only one IP address.
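The no-overlap rule above can be checked mechanically before you enter ranges in the AMC. The sketch below uses hypothetical pool boundaries (one "start end" pair per line): it converts each dotted quad to an integer, sorts the pools by starting address, and flags any pool whose start falls inside the previous pool:

```shell
# Hypothetical IP pool ranges, "start end" per line (example addresses only).
pools='10.10.60.10 10.10.60.50
10.10.60.51 10.10.60.90
10.10.60.200 10.10.60.210'

# Convert dotted quads to integers, sort by starting address, then flag any
# pool whose start is at or before the previous pool's ending address.
result=$(printf '%s\n' "$pools" | awk -F'[. ]' '
    { s = (($1*256+$2)*256+$3)*256+$4
      e = (($5*256+$6)*256+$7)*256+$8
      print s, e }' | sort -n | awk '
    NR > 1 && $1 <= prev { bad = 1 }
    { prev = $2 }
    END { print (bad ? "overlap detected" : "no overlap") }')
echo "$result"
```

For the three example ranges above, the script reports no overlap; swap in your own planned ranges to verify them.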
You can set IP pools in two different ways:
• by using the ncli ippool commands, or
• by using the AMC IP Pools tab.
Default IP Pool Behavior
Configuring the IP pool is not required. The default configuration is a shared pool spanning the entire network; if this suits your needs, you may use it unchanged. However, changing pools later can be difficult, because the changes must include all existing nodes in your cluster.
By default, the queen assumes it can allocate IP addresses for the entire subnet on which it resides. For example, if you have assigned an IP address and subnet mask of 192.168.10.10/255.255.255.0 during queen installation, then the queen will create a default IP address allocation pool of 192.168.10.1 through 192.168.10.254. In many networks, the first address is reserved for the gateway. The coordinator will omit any IP addresses that it detects are in use.
This default behavior may not be suitable for all installations, especially those where the Aster Database has been given a specific portion of IP addresses in a network or is part of a very large network (a subnet mask of 255.255.0.0, for example). In these cases, use the following procedure to change the IP pool settings.
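The arithmetic behind that default pool is simply the network and broadcast addresses implied by the address and netmask. This small sketch derives the pool endpoints for the example values above:

```shell
# Derive the default allocation pool for the example above:
# queen at 192.168.10.10 with netmask 255.255.255.0.
ip=192.168.10.10
mask=255.255.255.0
IFS=. read -r i1 i2 i3 i4 <<EOF
$ip
EOF
IFS=. read -r m1 m2 m3 m4 <<EOF
$mask
EOF
# Network address: ip AND mask; broadcast: network OR (NOT mask).
n1=$((i1 & m1)); n2=$((i2 & m2)); n3=$((i3 & m3)); n4=$((i4 & m4))
b1=$((n1 | (255 - m1))); b2=$((n2 | (255 - m2)))
b3=$((n3 | (255 - m3))); b4=$((n4 | (255 - m4)))
# The usable pool excludes the network and broadcast addresses themselves.
pool="$n1.$n2.$n3.$((n4 + 1)) through $b1.$b2.$b3.$((b4 - 1))"
echo "$pool"
```

For 192.168.10.10/255.255.255.0 this yields 192.168.10.1 through 192.168.10.254, the same range described in the paragraph above.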
Procedure
To change the IP pools settings using the AMC:
1. Log in to the AMC.
2. Select Admin > Configuration > Network from the drop-down multilevel menu. The AMC Network Overview screen will appear.
3. Click the IP Pools tab to see the current IP pools settings.
Figure 1: The IP Pools tab in AMC
4. Click the Show Assigned IPs button to view a list of assigned IP addresses.
Figure 2: Assigned IP Addresses in AMC
5. Click New IP Range to set up the first IP address range. All IP ranges must be on the same network, and one of them must include the IP address of the active queen.
Figure 3: Create a New IP Range in AMC
6. Set these options, remembering to take into account and include the IP addresses of any existing nodes with the correct Type assignment:
  a. Type of node that can use this IP range (Shared, Queens, Workers, or Loaders).
  b. The Starting IP address in the range.
  c. The Ending IP address in the range.
  d. Click OK.
7. If you wish to add more IP ranges, click New IP Range again and repeat the process to enter an IP range.
8. You will see a message in red saying "The configuration has changed. Click Save and Apply to apply these changes." Click the Save and Apply button to save your changes.
Figure 4: Network Save and Apply button in AMC
9. Make a note of all the IP Pools settings and keep it somewhere separate from your queen
node. If your queen fails, you will not have access to this information. The IP Pools settings
will need to be re-applied manually at the end of the queen replacement procedure, so you
will need to know the IP Pools settings to apply.
Manage Backups
In the AMC, select Admin > Backup to open the Backup panel, which is used for managing and monitoring backups of tables and databases. See "Backup and Restoration" in the Teradata Aster Big Analytics Appliance 3H Database User Guide for details on setting up Aster Backup and running backups.
Figure 5: The Backup panel in AMC
From the Backup panel, you can see an entry for each logical and physical backup, along with
information such as:
• backup ID
• status of the backup
• database and table (for logical backups)
• backup type (full or incremental)
• Backup Manager IP address
• start, end, and elapsed time
• controls to pause/resume or cancel a backup
Add a New Backup Manager to the AMC
In order to add a new Backup Manager to the AMC, you will need its IP address. Remember that the software versions of Aster Backup Manager and Aster Database must be the same.
To add the Backup Manager, perform the following steps.
1. Click the Add Manager button.
2. Enter the IP Address of the Backup Manager.
Figure 6: Adding a Backup Manager in AMC
3. Click OK.
Once the Backup Manager has been added, you will see a confirmation message stating Backup Node added successfully for IP address <IP Address of Backup Manager>. Note that this message is the only indication that the Backup Manager has been added successfully. The Cluster Backups table will not populate until a backup has been started.
Start a Backup
Backups are started using the Aster Database Backup CLI on the Backup Manager. See “Using
Aster Database Backup” in the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Monitor and Manage Backups
Once a backup has been started, it can be monitored and managed within the AMC Admin >
Backup tab.
Using the icons on the right-hand side of each backup listing, you can Pause or Cancel the backup.
Figure 7: Pausing a Backup in AMC
After pausing a backup, a message appears stating that the backup was successfully paused. To resume the backup, click the icon under Pause/Resume again. Similarly, after resuming a backup, a message displays to show that the resume was successful. Note that if you cancel a backup, it cannot be resumed.
Figure 8: Successful paused Backup in AMC
Configure Cluster Settings
In the AMC, select Admin > Configuration > Cluster Settings to open the Cluster Settings panel,
which allows you to set the basic operating parameters for the Aster Database. These settings
apply to the AMC installation; all users will use the settings that are defined and saved here.
Figure 9: Cluster Settings in AMC
The Cluster Settings panel appears. The next few sections explain the configuration settings
available on this panel.
• Cluster Settings
• Sparkline Graph Scale Units
• Graph Scaling
• Internet Access Settings
• Support Settings
• QoS Concurrency Threshold Configuration
Cluster Settings
The Cluster Settings section of the Cluster Settings panel provides a way to specify general
cluster-wide configuration options.
To set the cluster setting, perform the following steps.
1. Go to Admin > Configuration > Cluster Settings.
2. In the Cluster Settings section, enter the following:
• Company Name: Name of the company as you wish it to be displayed in the AMC. This name is required when you send diagnostic log bundles to Teradata Global Technical Support (GTS) (see "Logs in Aster Database" on page 220), so if you intend to send log bundles, do not leave this field blank.
• Cluster Name: Name of the cluster as you wish it to be displayed in the AMC. Useful for sites with more than one cluster. This name is required when you send diagnostic log bundles to Teradata Global Technical Support (GTS) (see "Logs in Aster Database" on page 220), so if you intend to send log bundles, do not leave this field blank.
• Process Log Cleanup Frequency: How long you want to keep process log data after each process completes. Increase the setting to obtain more debugging information, decrease it to 5 minutes if you do not want to keep logs for very long, or accept the default.
3. Click Save.
A confirmation message appears in the top right corner of the panel.
Sparkline Graph Scale Units
The Sparkline Graph Scale Units section of the Cluster Settings panel provides a way to specify
the sparkline unit for the display graphs for network, disk I/O, and memory activity.
To set the sparkline unit, perform the following steps.
1. Go to Admin > Configuration > Cluster Settings.
2. In the Sparkline Graph Scale Units section, configure the settings for:
• Network
• Disk I/O
• Memory
Graph Scaling
The Graph Scaling section of the Cluster Settings panel provides a way to control how the
graphical data displays in various AMC tabs are rendered in terms of their numerical scale.
The main Dashboard panel and the Nodes > Hardware Stats tab contain graphs of network, disk I/
O, and memory activity, which are affected by the configuration settings in Graph Scaling.
1. Select Admin > Configuration > Cluster Settings.
2. In the Graph Scaling box, enter one or more of the following figures. Use higher numbers to make sure the graphs show all high spikes in activity, lower numbers to magnify smaller fluctuations in activity:
• Network: The maximum number on the quantity axis in all AMC network graphs, in Kb/s
• Disk IO: The maximum number on the quantity axis in all AMC disk I/O graphs, in Kb/s
• Memory: The maximum number on the quantity axis in all AMC memory usage graphs, in MB
3. Click Save.
A confirmation message appears in the top right corner of the panel.
Internet Access Settings
The Internet Access Settings section of the AMC Cluster Settings panel is where you configure
any proxy settings that are needed to enable the queen to have outbound Internet access. The
queen needs to use the Internet when you send diagnostic log bundles to Teradata Global
Technical Support (GTS) (see “Logs in Aster Database” on page 220). Depending on your
network security policy, you might have outbound Internet access even if you do not fill out
these settings.
1. Select Admin > Configuration > Cluster Settings.
2. In the Internet Access Settings box, enter the following:
• Proxy Hostname or IP Address: Name or IP number of the proxy server which serves as an intermediary for Internet requests.
• Port: Number of the port on the proxy server that is available to receive Internet requests from the queen.
• Username and Password: Credentials that the queen can use to log in to the proxy server.
3. Click Save.
A confirmation message appears in the top right corner of the panel (you might have to
scroll up to see it).
Support Settings
The Support Settings section of the AMC Cluster Settings panel sets up the AMC’s access to
Teradata Global Technical Support (GTS) servers. The support center URL, username, and
password are required when you send diagnostic log bundles to Teradata Global Technical
Support (GTS) (see “Logs in Aster Database” on page 220). The resource center URL enables
the AMC to display the Teradata Global Technical Support (GTS) page of useful code and
information.
1. Select Admin > Configuration > Cluster Settings.
2. In the Aster Support Settings box, enter the following:
• Support Center URL: The address of the Teradata Global Technical Support (GTS) server.
This URL is required when you send diagnostic log bundles to Teradata Global
Technical Support (GTS) (see “Logs in Aster Database” on page 220), so if you intend
to send log bundles, do not leave this field blank. This URL is different for each
Teradata Aster customer. If you do not yet have your support URL, contact Teradata
Global Technical Support (GTS).
• Resource Center URL: The address of the Teradata Aster resource center, a web page
where you can find documentation, videos, and downloadable client software. This
URL provides the destination for the Resource Center link which appears at the top of
every AMC page. As with the Support Center URL, you should have received this URL from Teradata Aster.
When you click the Resource Center link, the page at the Resource Center URL appears
with links to documentation and videos.
Figure 10: Resource Center in AMC
• Username and Password: Your user credentials for logging in to the support center. This information is required when you send diagnostic log bundles to Teradata Global Technical Support (GTS) (see "Logs in Aster Database" on page 220), so if you intend to send log bundles, do not leave these fields blank. The user name and password are different for each Teradata Aster customer. If you do not yet have your credentials, contact Teradata Global Technical Support (GTS).
3. Click Test.
If the cluster can connect to the given URLs, a confirmation message appears in the top
right corner of the panel (you might have to scroll up to see it).
4. Click Save.
A confirmation message appears in the top right corner of the panel (you might have to
scroll up to see it).
QoS Concurrency Threshold Configuration
The QoS Concurrency Threshold Configuration section of the AMC Cluster Settings panel lets you specify the QoS Concurrency Threshold.
Roles and Privileges
To define the actions a user of the AMC can perform, use the AMC roles and privileges.
View the List of Available AMC User Privileges
1. Log into the AMC as an amc_admin user. This is typically the db_superuser account in a new Aster Database installation.
2. Go to Admin > Configuration > Roles & Privileges.
3. In the Roles & Privileges tab, the available AMC Roles (amc_admin, process_admin, process_viewer, process_runner, node_admin, and node_viewer) are listed on the horizontal axis of the table, and the individual privileges are listed on the vertical axis. Each privilege is a combination of a section of the AMC and an action the user can perform there.
Figure 11: Roles and Privileges in AMC
A user is typically granted only one of the roles listed on this page of the AMC, and a user can connect to Aster Database only if he or she has one of these roles.
Create an AMC User
The AMC authenticates all users against the beehive database. This means that any user who
will access the AMC must have CONNECT privileges on the beehive database.
1. In the Roles & Privileges tab of the AMC, review the list of available AMC user privileges, as explained in "View the List of Available AMC User Privileges" on page 113.
2. Find the AMC Role that has the privileges you want to grant to the new user. Note the role's name.
3. Start an ACT session and log in as an administrator (a user with db_admin privileges).
4. At the SQL prompt, use the CREATE USER command to create the user account, and use the GRANT command to give the user the AMC Role you chose earlier. For example, to create an account for Topper Headon (theadon) and make him a process_viewer user in the AMC, you would type this:
CREATE USER theadon IN ROLE process_viewer PASSWORD '5t4g0l33';
5. Attempt to log in to the AMC as the user you created. If you cannot log in as this user, make sure the user has privileges on the beehive database:
GRANT CONNECT ON DATABASE beehive TO theadon;
For more general information on creating and managing Aster Database users, see
“Managing Users” in the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Check Current AMC Privileges
To check the current privileges of users, perform the following steps.
1. Log into ACT as a db_admin user.
2. Run this query:
SELECT nc_users.username, nc_roles.rolename
FROM nc_users, nc_roles, nc_group_members
WHERE nc_users.userid = nc_group_members.memberid
  AND nc_roles.roleid = nc_group_members.groupid
GROUP BY username, rolename;
Edit AMC Privileges
To edit a user's AMC privileges, perform the following steps.
1. Assess the current state of user privileges:
• To find the user's current privileges, see "Check Current AMC Privileges" on page 115.
• To see the list of available AMC user privileges, see "View the List of Available AMC User Privileges" on page 113.
2. Find the AMC role that has the privilege(s) you want to grant to or revoke from the user. Note the role's name.
3. Start an ACT session and log in as an administrator (a user with db_admin privileges).
4. At the SQL prompt, use the GRANT or REVOKE command to give or remove the privileges. For example, to give Topper Headon the process_viewer privilege, you would type this:
GRANT process_viewer TO theadon;
The user's new AMC rights apply for all AMC sessions he or she starts in the future. If the user
is currently logged in, the current session will not be updated with the new rights until he or
she logs out and logs back in.
Configure Hosts
Set Up Host Entries for all Nodes
The Hosts screen allows the administrator to provide a mapping of host names to IP addresses and have that mapping applied to every node in the cluster. This mapping enables SQL/MR functions that make network connections to reach other machines in the enterprise using a hostname that is not defined in the DNS the cluster uses.
You can set up host entries on all the nodes of an Aster Database cluster by editing the /etc/hosts file on each Aster Database node manually (for UMOS clusters) or through the AMC (for AMOS and UMOS clusters) by performing the following steps.
1. Log into the AMC as an Administrator user.
2. Navigate to Admin > Configuration > Hosts.
3. Click the Hosts tab.
4. Click the New Host Entry button to display the Host dialog box.
5. Complete the following fields for each host entry, using English language characters only:
• IP Address - Required. The IP address of the host.
• Aliases - Required. A list of one or more hostnames, called Aliases.
• Comment - Optional. Helpful descriptions to differentiate among hostnames.
Note: More than one Alias can map to a single IP Address. There are situations where a single
machine has multiple resources which rely on the hostname in the request. An example of this
is an HTTP server with multiple virtual hosts configured.
Tip! If you are making host entries for Teradata nodes, make sure that when you enter the alias, you include "cop#" at the
end (e.g., if you will execute “... load_from_teradata( ... TDPID('dbc')...)”, then enter a
name like “dbccop1” as the alias.) For more information on working with the Teradata-Aster Database Connector and
Teradata nodes, see “Teradata-Aster Database Connector” on page 408.
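As an illustration of the tip above, host entries for a two-node Teradata system reachable via TDPID('dbc') might look like the following. The IP addresses are invented for this sketch; the point is the "cop#" suffix on each alias:

```
# Hypothetical /etc/hosts entries for Teradata nodes; the aliases carry the
# "cop#" suffix so that TDPID('dbc') resolves to the nodes dbccop1, dbccop2, ...
10.51.13.21    dbccop1
10.51.13.22    dbccop2
```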
6. To change the order of the IP address list, click and drag any row on this screen to a higher or lower position in the list. The numerical order of the list is then updated.
7. When you are finished adding entries for each node, click Save and Apply Changes. Your changes will be written to the hosts file on each Aster Database node.
Figure 12: AMC Hosts tab
Tip! Note that the AMC does not show any entries in /etc/hosts or /etc/resolv.conf that were not
added through the AMC or ncli. Therefore, it does not allow you to edit or remove entries that were not added through
the AMC or ncli.
Warning! Do not manually edit any of the entries in the /etc/hosts or /etc/resolv.conf files within
the sections enclosed in comments indicating that they were added by Aster Database. These are the changes made
through the AMC, and should only be edited through the AMC. Here is a sample of how these entries appear:
## Configured by NCluster. DO NOT EDIT!!! ##
10.51.13.100    dyu    # localhost
## End NCluster Configuration ##
Set up DNS entries for all Aster Database nodes
If the network is set up such that DNS servers are used to resolve the host or database names,
you must add the DNS server(s) to the /etc/resolv.conf file on each Aster Database node.
You can do this by editing the /etc/resolv.conf file on each node manually (for UMOS
clusters) or through the AMC (for AMOS clusters) as follows:
1. Log into the AMC as an administrator user.
2. Go to Admin > Configuration > Hosts.
3. Click the DNS Servers tab.
4. Click the New DNS Server button. The following screen appears.
5. Enter the IP address and Comment in the fields provided.
6. Click OK to save or Cancel to cancel the entry.
7. Repeat Steps 4–6 for each DNS entry.
8. Once all the DNS servers have been added, click Save and Apply Changes.
Your changes will be written to the resolv.conf file on each node.
Figure 13: DNS Servers tab in AMC
9. To edit an entry, click the Edit button.
10. To delete an entry, click the Delete button.
Restart Aster Database
Aster Database is designed to be resilient to many forms of failure. Many serious failures that
Aster Database may encounter can be resolved by restarting the system.
Restarting Aster Database involves a full restart of the system. During this time, queries cannot
be performed and most administrative functions will be unavailable. There are two options
for restarting Aster Database: Soft Restart and Hard Restart.
Figure 14: Soft and Hard Restart buttons in AMC
Procedure
To restart Aster Database:
1. In the AMC, go to the Admin > Cluster Management tab.
2. Click either Soft Restart or Hard Restart (described below).
3. After the restart has finished, you must click the Admin > Cluster Management tab and click Activate Cluster to make the cluster operational again. See "Activate Aster Database: The Procedure" on page 121.
Soft Restart
Clicking Soft Restart invokes a software-level restart of Aster Database. This process generally
takes one to three minutes and involves restarting the software on each node in the system.
During a Soft Restart, the AMC may show nodes as having a status of Upgrading even if the Soft
Restart was not part of an upgrade operation. This happens because upon a Soft Restart, Aster
Database always checks to see whether there are upgrade-related scripts to run, and displays
the Upgrading icon while it does the check. The status for the affected nodes will display as
Upgrading until the check is performed and any upgrade scripts are run. When a node reboots, it may pass through the states of New to Preparing to Upgrading to Prepared. This is normal; after this, the node's status displays as Prepared.
Most issues requiring a restart will be resolved with a soft restart. After you perform a soft
restart, you must click Activate to make the cluster operational again.
Because the outage period with the Soft Restart option is significantly shorter, it is recommended to always perform a Soft Restart before trying a Hard Restart. If a Soft Restart does not resolve the issue with Aster Database, a Hard Restart can be performed.
Note: Setting the QoS concurrency to zero will still allow any new queries that are part of an open transaction. Soft
Restart does not wait until open transactions are finished; it rolls back open transactions and then does the restart.
See also “Soft Shutdown” on page 125.
Backup interaction with soft-restart
Before running soft-restart on a cluster, ensure that there are no data backups or restorations
in progress. Interrupting backup operations in this manner can lead to errors during startup.
To find out how to check whether backup operations are in progress, see “Backup and
Restoration” in the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Warning! Never restart the cluster while Aster Database Backup is running a data backup or restoration.
Hard Restart
In very rare cases, there are errors (typically hardware-related) that require a hard restart.
Clicking Hard Restart will trigger a hardware-level restart of Aster Database. This process can
take 10 minutes or longer, depending on the time needed to reboot the physical servers used in
the system. A hard restart should be issued in cases where a Soft Restart fails to resolve the
problem. When a node reboots, it may pass through the states of New to Preparing to Upgrading
to Prepared. This is normal. After you perform a hard restart, you must wait for all nodes to
become Prepared before you click Activate to make the cluster operational again.
Soft Shutdown
To shut down Aster Database in preparation for upgrades or hardware moves, see “Soft
Shutdown” on page 125.
Activate Aster Database
Activating a new node into Aster Database allows you to scale disk storage and CPU
processing resources. Node activation is a two-step process. First, data storage is balanced by
shifting vworkers from currently-active worker nodes to newly added node(s). Second,
processing resources are balanced by activating the vworkers on the newly added node(s).
During the data re-balancing (invoked from the Admin > Cluster Management > Nodes tab of the
AMC by clicking the Balance Data button), currently-running queries will not be disrupted
(unlike Balance Process) but will continue to run only on previously-existing nodes. To utilize
the processing power of new nodes, a brief, predictable downtime is necessary (invoked by
clicking the Balance Process button in the AMC).
Warning! Before you invoke the Balance Process, be aware this process will cancel any running queries.
Before doing this operation, verify there are no running queries and no queries about to enter the system.
Table 7-3: Aster Database activation and balancing steps
• Step 1: Balance Data (balancing data placement): Completely online. Rebalancing time is dependent on data size.
• Step 2: Balance Process (optimally locating vworkers): Very brief (seconds to low minutes). Requires a short outage during this activation period. Warning! This process will cancel any running queries; before doing this operation, verify there are no running queries and no queries about to enter the system.
When an outage (e.g., query blocking operation) is needed for activation of compute
processing on the new node, it is brief and very predictable – typically a few seconds to a
couple minutes. The details of each of the two activation steps are described below, following
these typical use cases and best practices:
Situations that Require an Activation
Warning! Before invoking the Balance Process be aware this process will cancel any running
queries. Before doing this operation, verify there are no running queries and no queries about
to enter the system.
• NEW CLUSTER: The cluster has just started or rebooted. In the Admin > Cluster Management
Panel of the AMC, click the Activate button, and then a new window will pop open. Next,
click Activate Aster Database. This will bring the cluster to the target replication factor and
make all nodes available to help process queries.
•
EXISTING NODE REBOOTS: A node has just rebooted and is recognized as Prepared in
the AMC. Because a node went offline, the replication factor must have fallen below the
target. You can quickly restore the replication factor through the Balance Data feature by
viewing the appropriate node address and then clicking the Activate button and then click
Balance Process. This will make the rebooted node's processors available to the cluster and
move it to “active” state.
•
ADD NEW NODE FOR SCALE-OUT: A new node has been added to the cluster. In this
case, it will probably take a longer period to copy data from existing nodes to the new node
as part of the data re-balancing process; Balance Data will do this in the background and
leave the cluster fully available for loads and queries. First, click the Activate button and
then click Balance Data in the resulting pop-up to copy data over. Later, when a few seconds
to minutes of outage are acceptable, it is recommended that you balance process by
clicking the Activate button and then clicking Balance Process in the resulting popup. This
will make the new node's processors available to the cluster. Note that you can add one or
more nodes to an existing Aster Database and the data-rebalancing occurs in a parallel
manner for maximum performance.
Activate Aster Database: The Procedure
Follow the steps below to activate Aster Database after a restart or shutdown.
1 Make sure your queen node is running. If it is not, restart the queen machine now. The
Aster Database software and the AMC will be started automatically.
2 With your browser, navigate to the AMC.
3 In the AMC, click the Admin > Cluster Management tab. If the cluster is not already Active,
the Aster Database status lamp in the upper left corner will be red with a status of
STOPPED.
4 In the Admin > Cluster Management > Nodes panel, under the label Node Name, you will see a
list of the queen and nodes, with a Status for each. The next action you need to take
depends on what you see here. Do one of the following:
• If the worker nodes have a Status of Preparing, turn to Step 5.
• If the worker nodes have a Status of Prepared, turn to Step 6.
• If the worker nodes have a Status of New, turn to Step 6.
• If no worker nodes are displayed, it could be that you have never added nodes to the
cluster. See “Add New Nodes to the Cluster” on page 71. If you know that your cluster
has nodes, but they have not appeared in the Nodes tab, wait a few more minutes if you
have just restarted Aster Database; the worker nodes take a few minutes to reappear
after a hard restart. If you have not already performed a hard restart, you can do so now
as explained in “Restart Aster Database” on page 118. If, after restarting, the workers
fail to appear, contact Teradata Global Technical Support (GTS).
• If any nodes have a status of Failed, see “Address Hardware Problems on Workers” on
page 719.
5 While the worker nodes show a Current Status of Preparing, you must wait for them to
become Prepared. Once the status of all nodes is Prepared, turn to the next step.
6 When all worker nodes show a Current Status of Prepared or New, go to the Admin > Cluster
Management screen and click Activate Cluster.
Tip! The Activate Cluster button is also used, under certain circumstances, to incorporate new nodes into the cluster. See “Incorporate the New Nodes” on page 77.
7 If the Activate Nodes dialog appears, click Activate Aster Database again. (Note: If you are
activating from a hard restart, you will have already clicked this button a few minutes ago.
This is normal: the first activation prepares the nodes, and the second brings them
online.)
The green message box in the upper right of the AMC shows that the cluster is Activating.
When the queen has finished activating all nodes, the Aster Database status lamp shows
green with a status of Active. Your Aster Database is ready to use.
Balance Data
Balance Data is an Aster Database administrative action that balances data placement across all
the worker nodes and, if needed, adds vworkers to the cluster. Queries and loads are not
disrupted, though there may be some performance overhead as the activation process uses
system resources. Balance Data can be used to incrementally scale out a cluster, or to quickly
and non-disruptively restore full data replication after a hardware failure.
You initiate Balance Data in the AMC by clicking the Balance Data button in the Admin > Cluster
Management tab. For instructions, see the next section, “Balance Data: The Procedure” on
page 123.
After you run a Balance Data operation, some worker nodes (in particular, newly added worker
nodes) may be in the Passive (Blue) state in the AMC. At this point, storage of live and standby
data is balanced across all nodes. Queries continue to run only on the Active nodes, while the
Passive nodes act as up-to-date standbys that can be activated when an Active node fails. Your
Active nodes are hosting all the active vworkers, while your Passive nodes are hosting only
passive vworkers.
Balance Data runs in the background, and may run for a long time. Since it balances data across
all nodes, it may need to copy very large amounts of data. For example, suppose you have a
three-node cluster with 400 GB used per worker node and you use Balance Data to add one
node to the cluster. To achieve data balance, Aster Database will store roughly (400 * 3) / 4 = 300
GB per node once online activation is finished. This implies that 300 GB must be copied onto
the new worker node. Assuming this incorporation occurs over a 1 Gbps network that is
otherwise unused, this will take at least (300 GB * 8 bits per byte) / 1 Gbps = 2400 seconds =
40 minutes. Note that with the Network Aggregation feature, you have the option to
“bond” together multiple 1 Gbps NICs to offer trunked bandwidth – for example, if you
bonded 8x 1 Gbps links, you would make available the equivalent of 8 Gbps of aggregate
bandwidth, which could reduce the time required to rebalance data (assuming the network is
the bottleneck).
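The arithmetic above can be sketched as a quick shell calculation. The node counts, per-node usage, and link speed below are the example's assumptions, not fixed properties of the appliance:

```shell
# Rough rebalance-time estimate when adding nodes, assuming the network
# is the bottleneck and the link is otherwise unused (integer GB/Gbps).
used_gb_per_node=400   # GB used on each existing worker
old_nodes=3
new_nodes=4            # node count after the addition
link_gbps=1            # effective bandwidth in Gbit/s (8 with 8x bonded 1 Gbps NICs)

target_gb=$(( used_gb_per_node * old_nodes / new_nodes ))  # 300 GB per node
copy_gb=$(( target_gb * (new_nodes - old_nodes) ))         # data copied to the new node
seconds=$(( copy_gb * 8 / link_gbps ))                     # GB -> Gbit, then / Gbps
echo "~${target_gb} GB per node; copy takes ~$(( seconds / 60 )) minutes"
```

With bonded links as described above, setting link_gbps=8 drops the estimate from 40 minutes to about 5 minutes.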
It is recommended that at some point after Balance Data completes, you find an acceptable time
for a few minutes of downtime and run Balance Process on the cluster to balance the
computational burden across its nodes. Balance Data may create extra copies of some data in
order to achieve balance while not deleting the existing, in-use copies. These extra copies will
not be deleted until you perform Balance Process.
Warning! Before invoking the Balance Process, be aware this process will cancel any running
queries. Before doing this operation, verify there are no running queries and no queries about
to enter the system.
Balance Data: The Procedure
Follow this procedure to balance data placement across all the worker nodes:
1 In the Admin > Cluster Management tab, click Balance Data to balance data across all available
nodes. This step updates existing and new nodes with the data they need. The process
runs for at least a few minutes, and as long as a few hours in large Aster Databases,
depending on the amount of data that must be copied to the newly added node(s). As
mentioned above, Balance Data allows the storage rebalancing process to occur seamlessly
with no downtime for running queries.
Next Steps
If you are scaling out your cluster, you may want to perform a partition split now to increase
the number of vworkers in the cluster. See “Split Partitions” on page 79.
Balance Process
Balance Process is an Aster Database administrative action that load-balances the query
processing burden across all worker nodes in Aster Database. It will optimize performance
given the current data placement. The Balance Process step does not create new copies of data,
so it typically runs quickly. It also does some cleanup, deleting data that can no longer be used.
It will briefly disrupt the cluster, a period that can last from a few seconds to a few minutes.
Warning! Before you invoke the Balance Process, be aware this process will cancel all running
queries. Before doing this operation, verify there are no running queries and no queries about
to enter the system.
It is recommended that Balance Process be run at some point after Balance Data completes,
when a few minutes of downtime are acceptable, so that a new node’s processors are available
to the cluster.
You initiate Balance Process in the AMC by clicking the Balance Process button in the Admin >
Cluster Management tab. See the next section for instructions.
Balance Process: The Procedure
Warning! While the Balance Process step is in progress, your cluster cannot process queries. Before you
perform the next step, make sure all running queries have finished successfully and that no new queries are
allowed to enter Aster Database. Already-running queries will be killed when you click Balance Process, and
new queries submitted after you click it will wait until the balancing is complete.
1 Click the Balance Process button in the AMC’s Admin > Cluster Management tab to force
currently passive nodes to contribute to the handling of database queries. This interrupts
the operation of the cluster. This step places active vworkers on all nodes that are able to
host them.
This action takes at least a few seconds and as many as a few minutes to complete. Once
processing is balanced, the cluster resumes handling queries and the new node(s) are part
of the cluster, which you can check by looking for a status of Active in the AMC’s Nodes
panel.
Next Steps
If you are scaling out your cluster, you may want to perform a partition split now to increase
the number of vworkers in the cluster. See “Split Partitions” on page 79.
Cluster Management from the Command Line
You can manage many aspects of Aster Database from the command line. Most tasks are done
via the Aster Database Command Line Interface (ncli), a tool for inspecting and managing all
nodes in the cluster. Below, we explain the most common cluster management tasks. For more
detailed instructions, see “Command Line Interface (ncli)” on page 148.
To get started with the ncli, open a command shell on the queen, log in as root, and type ncli
to get basic help, and then type, for example, ncli system to show the help for the system
commands.
# ncli
# ncli system
Check Cluster Status
Use the system show and node show commands to check the Aster Database operational state.
To check the queen’s status, log into the queen as root user and, at the command line, run the
system show command:
# ncli system show
Next, use the node show command to check the status of all nodes in the cluster:
# ncli node show
Soft Restart
To restart Aster Database, use the softrestart command. Working as root user at the queen
command line, type the command:
# ncli system softrestart
Note: Setting the QoS concurrency to zero will still allow any new queries that are part of an open transaction. Soft
Restart does not wait until open transactions are finished; it rolls back open transactions and then does the restart.
See also: “Soft Restart” on page 119 and “Hard Restart” on page 119.
Soft Shutdown
To shut down Aster Database in preparation for upgrades or hardware moves, use the
softshutdown command. Working as root user at the queen command line, type the
command:
# ncli system softshutdown
If the shutdown attempt fails with the message, “Unable to grab exclusive lock for restart,” you
can use the SoftShutdownBeehive.py script with the --force flag to clean up leftover
processes and shut down. Contact Teradata Global Technical Support (GTS) for help using the
script.
See also: “Soft Shutdown” on page 120.
Soft Startup
To start Aster Database after a soft shutdown has been performed, use the softstartup
command. Working as root user at the queen command line, type the command:
# ncli system softstartup
If the startup attempt fails with the message, “Unable to grab exclusive lock for restart,” you
can use the SoftStartupBeehive.py script with the --force flag to clean up leftover
processes and start. Contact Teradata Global Technical Support (GTS) for help using the
script.
Next, you must activate the cluster (“Activate Aster Database” on page 120).
Free Space Occupied By Defunct VWorkers
When Aster Database deletes vworkers, the space is not freed for approximately 24 hours.
(This can occur, for example, if a replica vworker goes down, the system creates a new replica
vworker, and then the original replica vworker comes back up. In this case you have more
replica vworkers than you need, and the unneeded vworker will be deleted automatically.)
The 24-hour waiting period is a safety mechanism, but it can delay your work if you are
adding machines and wish to scale out immediately, because you must wait 24 hours for the
space occupied by the defunct workers to become available for new data.
To immediately reclaim space made free by vworker deletions, use the command-line utility
TrashmanUtil. You can find this tool on the Aster Database queen at
/home/beehive/bin/utils/support/TrashmanUtil. Please contact Teradata Global Technical
Support (GTS) before you attempt to use this tool.
CHAPTER 8
Security
This chapter explains how to make security settings in Aster Database.
Manage Users and Privileges
• Database Roles and Privileges: See (CREATE USER, CREATE ROLE, GRANT, ALTER
USER, ALTER ROLE) in the Teradata Aster Big Analytics Appliance 3H SQL and
Function Reference.
AMC
• AMC Roles: Create an AMC User (page 114)
• AMC Access: Internet Access Settings (page 112)
SQL-MapReduce
See the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Security between Aster Database and other systems
• SSL for ODBC and JDBC: See the Teradata Aster Big Analytics Appliance 3H Database
User Guide.
• Reporting Tools: See the Teradata Aster Big Analytics Appliance 3H Database User
Guide.
• Aster Database - Teradata Connector: See “Connector Argument Clauses” in the Teradata
Aster Big Analytics Appliance 3H Database User Guide.
Aster Database Firewall
The Aster Database firewall runs on all Aster Database nodes, blocking all non-Aster-related
external access to the nodes. The documented public interfaces to Aster Database remain
open. You can disable the firewall if desired, using the ConfigureNCluster.py command.
For installations with special requirements, Teradata Aster services can help you configure
custom firewall policies. Policies can be separately configured for each node.
Default Firewall Settings
Aster Express is a User-Managed Operating System (UMOS) installation, which means that
the operating system and Aster software were pre-installed on all nodes before they were added
to the cluster. Note that in the full version of Aster Database, you can choose an Aster-Managed
Operating System (AMOS) installation, where the OS on all nodes is installed,
configured, and managed by Aster Database. The Aster Database firewall is enabled or disabled
based on your Aster Database installation type:
AMOS
The firewall is enabled by default only on Aster-managed OS (AMOS) deployments.
UMOS
On user-managed OS (UMOS) deployments, the default firewall state is disabled. This is done
because in UMOS deployments, the worker/loader nodes are often placed on a different IP
subnet than the queen. In such a deployment, enabling the firewall would block
communication between the queen and workers since the default firewall policy blocks all
access from outside the queen’s subnet.
On UMOS deployments, you can enable the firewall if required, using the
ConfigureNCluster.py command, provided all nodes are in the same subnet.
Open TCP Ports for Aster Database
You may have firewalls that limit network access to the Aster Database machines (which reside
on the Aster Database subnet). To allow the cluster to operate, you must open the following
ports on the firewalls that control access between the Aster Database machines and your
network.
Tip! Each machine in Aster Database also runs its own Aster Database firewall software to prevent access to most
network ports. Within the Aster Database subnet, all ports are open for subnet-internal traffic, and the Aster Database
firewall prevents subnet-external connections for all ports except those listed below.
For traffic within the Aster Database subnet, all ports are open. For traffic to and from outside
the Aster Database subnet, the Aster Database firewall blocks connections on all ports except
those listed in this section.
Because all ports are accessible within the Aster Database subnet, you do not need to open
specific ports between workers and loaders. Your Aster Database firewall policy should only
unblock ports that must be kept open for subnet-external access.
Queen node
• 22 - SSH
• 80 - AMC (Apache web server). Accessing the AMC over port 80 redirects to port 443 for
secure access.
• 443 - AMC (over SSL/HTTPS)
• 2406 - ACT and other clients
• 2407 - AMC (HTTP)
• 2105, 1986, 2113 - Aster Database operations
• 1984, 2111, 3211, 1988, 10999 - Aster Database backup

Worker node
• 22 - SSH
• 1986, 2113 - Aster Database operations
• 1984, 2111, 3211, 11000:12000 - Aster Database backup

Loader node
• 22 - SSH
• 1711 - ACT and ncluster_loader
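One quick way to verify these rules from a machine outside the Aster Database subnet is to probe each documented port. The sketch below only prints the netcat commands to run; the queen hostname is a placeholder, and the `nc -z`/`-w` flags are assumed to be available on your client machine:

```shell
# Emit one netcat reachability probe per documented queen port.
# queen.example.com is a placeholder address; -z means "scan without
# sending data" and -w 3 sets a 3-second timeout.
probe_cmds() {
  local host="$1"; shift
  local port
  for port in "$@"; do
    echo "nc -z -w 3 $host $port"
  done
}
probe_cmds queen.example.com 22 80 443 2406 2407
```

Running the printed commands from outside the subnet should succeed only on the ports listed above; all other ports should be blocked by the Aster Database firewall.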
Enable or Disable the Aster Database Firewall
The Aster Database firewall state on a node is controlled by the “firewall” parameter in the
/home/beehive/config/beehiveparams.cfg file. This parameter takes the following values:
• on -- The Aster Database firewall will be started when you start Aster Database.
• off -- The Aster Database firewall will not be started when you start Aster Database.
Use this setting only if you want all Aster Database nodes to run without a firewall.
• disabled -- When you start Aster Database, no attempt will be made to start, stop, or
configure firewalls. This means that the firewall configuration routine will not run during
cluster startup. This is useful if you have set up your cluster’s firewalls manually, and you
do not want Aster Database to change your configuration.
You pass these parameters using the --firewall flag with the Aster Database configuration
script, ConfigureNCluster.py.
/home/beehive/bin/lib/configure/ConfigureNCluster.py
--firewall=[on|off|disabled]
Procedure
To modify the firewall activation setting on Aster Database, use the following steps:
1 Perform a soft shutdown of Aster Database.
2 Run ConfigureNCluster.py with the --firewall flag:
/home/beehive/bin/lib/configure/ConfigureNCluster.py
--firewall=<on|off|disabled>
3 Perform a soft startup, and activate Aster Database.
Worker and loader nodes will automatically get the configuration changes from the queen
during start-up.
CHAPTER 9
Monitor Events
This chapter contains two methods for monitoring events:
• Monitor Events with the Event Engine
• Monitor of Aster Database with SNMP
Monitor Events with the Event Engine
The Aster Database Event Engine assists in system maintenance and monitoring. The Event
Engine uses a subscription model to send notifications of various events within the system.
You can configure separate subscriptions to be notified of events based on various filters.
Some examples of filters you can create include:
• “Give me an email when a hardware alert happens”
• “Give me an email only when bad things happen”, or
• “Notify me when components change their state”.
The Event Engine resides on the queen. It monitors states and activities on each node and
generates notifications about them. You create subscriptions to specific types of events in
order to be notified when they occur. These subscriptions are created through ncli, and may
be viewed in ncli or the AMC. When certain events occur, Aster Database can be configured
to perform a remediation, such as a soft shutdown, automatically.
This section covers the following topics:
• Event Engine Overview
• Manage Event Subscriptions
• Upgrades of Event Engine
• View Event Subscriptions
• Supported Events
• Remediations
• Event Engine Best Practices/FAQs
• Test the Event Engine
• Troubleshoot Event Engine Issues
Event Engine Overview
Log messages and user actions on all the nodes in Aster Database generate events. When a
triggering event happens or a triggering log message is generated, the node on which the event
occurred notifies the Event Engine. The Event Engine responds by checking for subscriptions
that fit the event profile, and sending a notification to any subscribers.
To subscribe to a particular event or to all events that fit a certain profile, you’ll use the ncli.
For more on using ncli, see “Command Line Interface (ncli)” on page 148.
Manage Event Subscriptions
When you create a new event subscription, it automatically becomes active. You can view,
enable/disable, and modify subscriptions using the ncli events commands on the queen
(see the “ncli events Section” on page 159). These changes take place dynamically while the
Event Engine is running.
To view all existing subscriptions, you can issue:
# ncli events listsubscriptions
To view only one existing subscription, issue the following, specifying the appropriate
subscription id:
# ncli events listsubscriptions <subid>
The AMC also provides a read-only list of existing event subscriptions (see “View Event
Subscriptions” on page 137).
Event components
Events are made up of the following information:
Table 9 - 1: Event components

event id
  Possible values: See the table “Subscribable Events in Aster Database” on page 138 for a
  list of valid event ids.
  Description: The unique identifier for the event.

severity
  Possible values: INFO, WARN, ERROR, or FATAL
  Description: The severity of the event. All events have the default severity ‘INFO’.

message
  Possible values: various
  Description: The log message generated by the event.

priority
  Possible values: LOW, MEDIUM, or HIGH
  Description: The priority of the event. LOW is the default.

component
  Possible values: hardware, hardware.disk, software.aster, etc.
  Description: The component affected by the event.

node IP
  Possible values: The IP address of a non-queen node in the cluster.
  Description: The node affected by the event; this is only populated when the event affects
  a non-queen node.
Setting appropriate event severity and priority
While there is no requirement that a particular Event ID map to a certain “Priority” or
“Severity”, it is important to understand how to determine the right settings for your
particular events.
Table 9 - 2: Setting appropriate event severity and priority

severity INFO
  An informational (FYI) type of event.
severity WARN
  Means something may be bad, but nothing has failed yet.
severity ERROR
  Means something has failed or is about to fail.
severity FATAL
  Means something critical has failed, and the cluster will likely fail or shut down.
priority LOW
  Audits attempts to perform operations, regardless of outcome.
priority MEDIUM
  Most events are MEDIUM.
priority HIGH
  Most bad things (high severity) are HIGH, but many good things (low severity) are also
  HIGH simply because they are important.
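To make the “minimum severity” semantics concrete, the following sketch (illustrative only, not part of the product) shows how a subscription's --minSeverity threshold admits events at or above the configured level:

```shell
# Rank severities so "at or above the threshold" is a numeric check.
severity_rank() {
  case "$1" in
    INFO)  echo 0 ;;
    WARN)  echo 1 ;;
    ERROR) echo 2 ;;
    FATAL) echo 3 ;;
  esac
}

# passes_filter EVENT_SEVERITY MIN_SEVERITY -> prints yes or no
passes_filter() {
  if [ "$(severity_rank "$1")" -ge "$(severity_rank "$2")" ]; then
    echo yes
  else
    echo no
  fi
}

passes_filter ERROR WARN   # an ERROR event passes a WARN threshold
passes_filter INFO WARN    # an INFO event does not
```

The same at-or-above logic applies to --minPriority with the LOW/MEDIUM/HIGH scale.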
Event filters used in subscription definitions
The following table shows subscription definition filters.

Table 9 - 3: Event filters used in subscription definitions

--eventIds [event[,event...]]
  Optional; default: all. Filter specifying one or more event ids that will trigger a
  notification. Separate event ids by commas, with no spaces.

--minPriority [low | medium | high]
  Required; default: LOW. Filter specifying the minimum event priority that will trigger a
  notification.

--minSeverity [info | warn | error | fatal]
  Required; default: INFO. Filter specifying the minimum event severity that will trigger a
  notification.

--componentTypes filter[,filter...]
  Optional; default: all. Filter based on component type string. Matches as much of the
  name as given. Examples are:
  • "hardware" to get all hardware events
  • "hardware.disk" to get disk events
  • "software" to get all software events
  • "software.aster" to get just Aster Database software events
Other subscription definition settings
The following table shows other settings to use when adding or editing a subscription:

Table 9 - 4: Other subscription definition settings

--id [subid]
  Required for edit; default: the next unused id number. The subscription id. If specified,
  this should be an integer. If editing a subscription, the subscription id must be supplied.

--type [email | snmp]
  Required. Must be one of either:
  • email for email notifications, or
  • snmp for SNMP traps (see “Monitor of Aster Database with SNMP” on page 145).

--throttleSecs [secs]
  Optional; default: 0. Throttles the notifications for multiple occurrences of the same
  event. When set to the default value 0, messages are generated for each occurrence of the
  event. Use this setting to ensure that for a given event (same node, same event id)
  additional messages will be generated only after the specified time has elapsed. To set a
  subscription to only repeat an event every 30 minutes, you would specify
  --throttleSecs 1800 when creating it.

--to address[,address...]
  Required for email. The address of the email recipient(s). If supplying multiple
  addresses, separate them with a comma without spaces.

--from address
  Required for email. The address of the email sender.

--smtp host[:port]
  Required for email; port defaults to 25. The hostname or IP address (and optionally the
  port) of the email server.

--username username
  Optional. The username for the SMTP server.

--password password
  Optional. The password for the SMTP server.

--manager host[:port]
  Required for SNMP; port defaults to 162. Used for SNMP subscriptions only. Sets the
  target host and port number when using SNMP notification. See “Monitor of Aster
  Database with SNMP” on page 145.
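As an illustration of how these flags combine, the hypothetical wrapper below assembles (but does not run) an email-subscription command from its required pieces. The addresses and SMTP host are placeholders, and the wrapper itself is not part of the product:

```shell
# Build an `ncli events addsubscription` command line for an email
# subscription, defaulting the throttle to 0 as the product does.
build_email_subscription() {
  local to="$1" from="$2" smtp="$3" throttle="${4:-0}"
  echo "ncli events addsubscription --type email --minPriority low" \
       "--minSeverity info --to $to --from $from --smtp $smtp" \
       "--throttleSecs $throttle"
}

# Repeat a given event at most every 30 minutes: 30 * 60 = 1800 seconds.
build_email_subscription dba@example.com alerts@example.com smtp.example.com 1800
```

Running the printed command on the queen (as described in the next section) would create the subscription; omit the last argument to accept the default throttle of 0.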
Create an event subscription
To create a new subscription, do the following steps:
1 First, determine what kind of subscription you want to create. You can see a list of options
by looking at:
• the list of available events - “Subscribable Events in Aster Database” on page 138.
• the filters you may use when creating a subscription - “Event filters used in
subscription definitions” on page 134.
• other parameters that apply to subscriptions - “Other subscription definition settings”
on page 135.
2 Log in to the queen as the user ‘beehive’.
3 Issue the ncli events addsubscription command with the desired subscription filters and
other parameters, like the following example. The command will create the new event
subscription and return a table showing all existing subscriptions:
# ncli events addsubscription --type email --minPriority low
--minSeverity info --componentType "software." --to "[email protected]"
--from [email protected] --smtp smtp-server.teradata.com
Event Subscriptions
+--------+------------+--------------+--------------+
| Sub ID | Notif Type | Min Priority | Min Severity |
+--------+------------+--------------+--------------+
| 1      | email      | Low          | INFO         |
+--------+------------+--------------+--------------+
1 rows
table continued...
+----------------+-----------+---------------+
| Component Type | Event IDs | Throttle Secs |
+----------------+-----------+---------------+
| software.      |           | 0             |
+----------------+-----------+---------------+
4 Note that if you attempt to create a subscription without specifying all the necessary
parameters, or with invalid parameters, an error message will explain what is needed:
# ncli events addsubscription --type email --minPriority low
--minSeverity info --componentType "software."
Invalid Arguments: Email requires at least 'to', 'from' and 'smtp'
Edit an event subscription
To edit an existing event subscription, use the same parameters as for adding a subscription,
except that for the --subid, you supply the subscription id of the subscription you wish
to edit. The following command edits the subscription we created above, changing the
--minSeverity, --minPriority, and --componentType:
# ncli events editsubscription --subid 1 --type email
--minPriority medium --minSeverity warn --componentType "software.aster"
--to [email protected] --from [email protected]
--smtp smtp-server.teradata.com --username username --password password
Event Subscriptions
+--------+------------+--------------+--------------+
| Sub ID | Notif Type | Min Priority | Min Severity |
+--------+------------+--------------+--------------+
| 1      | email      | Medium       | WARN         |
+--------+------------+--------------+--------------+
1 rows
table continued...
+----------------+-----------+---------------+
| Component Type | Event IDs | Throttle Secs |
+----------------+-----------+---------------+
| software.aster |           | 0             |
+----------------+-----------+---------------+
Note that when editing a subscription, all parameters must be supplied again; change only
those that you want to edit.
Delete an event subscription
To delete an existing event subscription, issue the following command, specifying the
appropriate subscription id. In this example, we delete the subscription created above:
# ncli events deletesubscription 1
Deleted Subscription 1
Upgrades of Event Engine
Beginning in Aster Database version 5.0, the Event Engine works differently than in previous
versions. If you are upgrading from a pre-5.0 to a 5.0 or later version, the upgrade will attempt
to migrate settings from the legacy Event Engine to the subscription-based Event Engine. The
following modifications will be made:
• The upgrade does a best-effort migration of old settings to the new Event Engine. This
includes migrating email and SMTP alerts to the new subscription model.
The upgrade procedure will build a new subscription for every single SMTP alert that was
configured in the old Event Engine. Then it will attempt to consolidate them into
subscriptions wherever the notification parameters are the same. That is, if you have
configured ten different events in the old system to use the same email parameters, those
will be consolidated into a single subscription since the email parameters are the same.
• The upgrade then logs the changes that have been made, and any changes that could not
be made, in the log file:
/primary/logs/PlatformManager.log
View Event Subscriptions
The AMC provides read-only access to event subscriptions. To see a list of subscribed events:
1 Log in to the AMC as a user with the admin role.
2 Select Admin > Events from the menu.
3 In the Event Subscriptions tab, review the table of subscribed events.
On a newly installed cluster, there will be no event subscriptions. Before they can be viewed in
the AMC, you must first add event subscriptions through ncli, as described below.
Figure 15: AMC Admin > Events > Event Subscriptions tab with no subscriptions
See “ncli events Section” on page 159 for a discussion of how to add event subscriptions. The
AMC does not display an event until at least one subscription to it has been created. After
creating some event subscriptions, the Admin > Events tab will look more like the following:
Figure 16: AMC Admin > Events > Event Subscriptions tab showing subscriptions
Supported Events
To assist administrators in detecting and managing situations where the cluster is running out
of disk space, a node is suspect or failed, a user is initiating actions in the AMC, or replication
factor issues exist, Aster Database provides the following subscribable events.
Table 9 - 5: Subscribable Events in Aster Database
(Each entry lists the event ID and event alert text, followed by a description.)

MC0001  AMCAudit: A user attempted to cancel a statement
    Occurs when a user attempts to cancel a process from the AMC by clicking Cancel from the Processes list.
MC0002  AMCAudit: A user attempted a soft restart
    Occurs when a user attempts a soft restart from the AMC by clicking the Soft Restart button.
MC0003  AMCAudit: A user attempted a hard restart
    Occurs when a user attempts a hard restart from the AMC by pressing the Hard Restart button.
MC0004  AMCAudit: A user attempted to add a node
    Occurs when a user attempts to add one or more nodes in the AMC by pressing the Add Nodes button.
MC0005  AMCAudit: A user attempted to remove a node
    Occurs when a user attempts to remove a node in the AMC by pressing its X (remove) icon.
MC0006  AMCAudit: A user attempted to activate the cluster
    Occurs when a user attempts to activate the cluster from the AMC by pressing the Activate Cluster button.
MC0007  AMCAudit: A user attempted to balance data
    Occurs when a user attempts to balance data from the AMC by pressing the Balance Data button.
MC0008  AMCAudit: A user attempted to balance processes
    Occurs when a user attempts a process rebalance from the AMC by pressing the Balance Process button.
MC0009  AMCAudit: A user attempted to upload a software upgrade
    Occurs when a user attempts to upload a software upgrade from the AMC (by pressing Get File and Distribute in Step 1 of the Upgrade Software dialog box).
MC0010  AMCAudit: A user attempted to upgrade software
    Occurs when a user attempts to upgrade software from the AMC (by pressing Upgrade Cluster Now in Step 2 of the Upgrade Software dialog box).
MC0011  AMCAudit: A user attempted to run an executable
    Occurs when a user attempts to run an executable from the AMC by pressing Run Now in the Enter Script Variables dialog box.
MC0012  AMCAudit: A user attempted to modify an executable
    Occurs when a user attempts to modify an executable from the AMC by clicking its "pencil" (edit) icon in the Executables Library.
MC0013  AMCAudit: A user attempted to cancel a running executable
    Occurs when a user attempts to cancel a running executable from the AMC by clicking Cancel in the Executable Jobs list.
MC0014  AMCAudit: A user attempted to pause a backup
    Occurs when a user attempts to pause a running backup from the AMC by clicking Pause in the Cluster Backups list.
MC0015  AMCAudit: A user attempted to resume a backup
    Occurs when a user attempts to resume a paused backup from the AMC by clicking Resume in the Cluster Backups list.
MC0016  AMCAudit: A user attempted to cancel a backup
    Occurs when a user attempts to cancel a running backup from the AMC by clicking Cancel in the Cluster Backups list.
MC0017  AMCAudit: A user attempted to add a backup manager
    Occurs when a user attempts to add a backup manager from the AMC by clicking the Add Manager button.
MC0018  AMCAudit: A user attempted to remove a backup manager
    Occurs when a user attempts to remove a backup manager from the AMC by clicking the Remove Manager button.
MC0019  AMCAudit: A user attempted to change admin settings
    Occurs when a user attempts to change an administrative setting in the AMC by clicking the Save button for a setting in Admin > Configuration > Cluster Settings.
MC0020  AMCAudit: A user attempted to create a log bundle
    Occurs when a user attempts to create a log bundle in the AMC by pressing the Manually Initiate Diagnostic Bundle button on the Admin > Logs screen.
MC0021  AMCAudit: A user attempted to save a network configuration for a node
    Occurs when a user attempts to save a network configuration for a node by pressing the Save button in the Network Configuration > Edit Configuration dialog box.
MC0022  AMCAudit: A user attempted to apply a network configuration for a node
    Occurs when a user attempts to apply a network configuration for a node by pressing the Save & Apply button in the Network Configuration > Edit Configuration dialog box.
MC0023  AMCAudit: A user attempted to save a network assignment for a node
    Occurs when a user attempts to save a network assignment for a node by pressing the Save & Apply button in the Network Configuration > Network Assignments dialog box.
MC0024  AMCAudit: A user attempted to save IP ranges to the IP pool
    Occurs when a user attempts to save IP ranges to the IP pool by pressing the Save & Apply button in the Admin > Configuration > Network > IP Pools tab.
ST0001  Disk Full High > 90%
    Occurs when any worker node's used disk space passes 90%.
ST0002  Disk Full Medium > 80%
    Occurs when any worker node's used disk space passes 80%.
ST0003  Disk Full Low > 65%
    Occurs when any worker node's used disk space passes 65%.
SY0001  Node is Suspect
    Occurs when a node status changes to Suspect.
SY0002  Node is Failed
    Occurs when a node status changes to Failed.
SY0003  Node has changed state
    Occurs whenever a node changes state to any state other than Failed or Suspect.
SY0005  VWorker is Failed
    Occurs when a vworker status changes to Failed.
SY0006  VWorker has changed state
    Occurs whenever a vworker changes state to any state other than Failed.
SY0007  Beehive has started
    Occurs when Aster Database starts up on the queen node.
SY0008  Replication Factor is 0, system is unavailable
    System is unavailable.
SY0009  Replication Factor is below the target value
    Occurs when replication factor falls below target.
SY0010  Replication Factor is at or above the target value
    Occurs when replication is at the target or above it.
SY0011  Disk error detected on worker node
    Occurs when a worker node disk error is detected.
PM0001  Platform Manager notified of core dump
    Occurs when there is a core dump on a node.
QS0001  Query Canceled by Workload Management
    Occurs when Workload Management cancels a query because of contention for memory resources.
QS0002  Memory caches dropped
    Occurs when Workload Management drops memory caches associated with a canceled query.
Tip: Subscriptions to these events generate a message once each time an event is triggered. For
example, one message is sent whenever a node fails; if another node also fails, another message is sent.
However, the "Node is Failed" subscription does not continue to generate additional messages at intervals while the
node(s) are still down, so plan accordingly when responding to messages from event subscriptions.
Tip: Event subscription messages generated by AMC actions (those whose identifier begins
with "MC") are sent when the action is initiated within the AMC. The firing of an AMC event does not necessarily
indicate that the initiated action completed successfully.
Remediations
In special cases, Aster Database will take remedial actions to correct a condition reported by
the Event Engine. If you wish to implement further automated remediation (e.g., a node
removal) you can do so through SNMP management frameworks. For more information on
this, see “Monitor of Aster Database with SNMP” on page 145.
The following table shows remedial actions that will be taken automatically when the specified
event occurs:
Table 9 - 6: Automated Remedial Actions
(Each entry lists the event ID and description, followed by the automated action.)

ST0001  Disk full high
    Can be configured to do a soft shutdown. See "Trigger Cluster Shutdown on Disk Full Condition" on page 142.
SY0011  RF=0
    Soft shutdown
In the case where a soft shutdown has been issued, you will need to correct the problem that
prompted the shutdown, and then restart the cluster. To restart the cluster, log in as the root
user to the queen node and run "ncli system softrestart".
Trigger Cluster Shutdown on Disk Full Condition
You can configure Aster Database to do an automatic shutdown of the cluster when an ST0001
event (disk full high) is triggered. This feature is disabled by default. To enable the automatic
shutdown, follow these steps:
1 Log in to the queen as root.
2 Edit the file /home/beehive/config/procmgmtConfigs/coordinator.cfg
3 Find the section that begins:
"taskName": "PlatformManager"
4 Add the executableArgs line shown in the following listing:
{
"taskName": "PlatformManager",
"nodeIps": "REPLACE_NODE_IP",
"executableLocation": "/home/beehive/bin/lib/platformmanager/PlatformManager.py",
"executableArgs": "--shutdownOnDiskfull",
"maxTaskRestarts": -1
}
5 Restart the cluster for this change to take effect.
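Before restarting, a quick sanity check confirms that the flag actually landed in the file. The sketch below runs the check against a temporary copy of the stanza (the snippet and temp file are illustrative; on a real queen you would grep /home/beehive/config/procmgmtConfigs/coordinator.cfg directly):

```shell
# Stand-in for the edited PlatformManager stanza (illustrative copy).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
{
  "taskName": "PlatformManager",
  "nodeIps": "REPLACE_NODE_IP",
  "executableLocation": "/home/beehive/bin/lib/platformmanager/PlatformManager.py",
  "executableArgs": "--shutdownOnDiskfull",
  "maxTaskRestarts": -1
}
EOF

# Verify the shutdown-on-disk-full flag is present before restarting the cluster.
if grep -q '"executableArgs": "--shutdownOnDiskfull"' "$cfg"; then
  status="shutdownOnDiskfull enabled"
else
  status="flag missing: cluster will NOT shut down on disk full"
fi
echo "$status"
rm -f "$cfg"
```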
Event Engine Best Practices/FAQs
Tip: If you use the fully qualified domain name (FQDN) of the mail server in the Event Engine configuration, ensure
that the queen can correctly resolve that FQDN. If it cannot, edit /etc/resolv.conf on the queen accordingly.
• After an Aster Database shutdown, do event subscription emails still go out?
No. If the queen node is not running the Aster Database software stack, emails will not be
generated.
• Is there a way to be alerted when things are going well in the system (e.g., a worker is fixed)?
Yes. You can subscribe to events with a minSeverity setting of "INFO".
• What are some recommendations when using the Event Engine?
Don't manually edit the configuration file. (Manual editing was recommended in release
4.5.1, but beginning in release 4.6 you should use only ncli to modify event
subscriptions.) See "ncli events Section" on page 159.
Test the Event Engine
Once you have set up the Event Engine to trigger emails for specific events, it is useful to test
these to verify everything is working as expected. You don’t want to find out that a
subscription is misconfigured after the event occurs! There are two ways to do this: issuing
sample events, and testing for disk full events.
Issue sample events
You can send a sample alert by issuing the following command:
/home/beehive/bin/utils/SendLogEvent <eventID> <message>
For example, issuing the following logs the event SY0005 "VWorker is Failed":
/home/beehive/bin/utils/SendLogEvent SY0005 "VWorker is Failed."
This simulates an alert, adding the appropriate entry to generic.log and alerts.log. Log files are
found in the directory /home/beehive/data/logs. In fact, even ignored alerts (those
without subscriptions) should be logged. To determine which events have subscriptions, issue:
# ncli events listsubscriptions
Test for disk full events
Obviously, you want to test without actually creating a real disk 90% full event on a node!
Aster Database provides a method to validate the Event Engine settings by forcing lower
thresholds for the disk full events on a particular node. You can pass a configuration flag at the
command line to reset the threshold temporarily. This only lasts until Aster Database restarts.
When the cluster restarts, all disk full settings return to their default values.
Warning! Make sure you test these automated actions during a time of scheduled maintenance so that users are not
affected by the activity. Automated remediations such as Soft Shutdown will disrupt user activity on the cluster. See
“Remediations” on page 141 for a list of these.
Procedure
1 Select a worker node in the system and determine how much of its disk space is being used.
This is most easily done in the AMC, on the Nodes: Node Overview tab. Let's say you find
Node 1 is utilizing 32% of its total disk space.
2 To test the warning level disk threshold first (ST0003), we will change it to 30% from the
default of 65%. Open a browser and paste this URL, supplying the IP address of the node
you identified in step 1 (Node 1 in our example):
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdLow=30
You should see a message like the following in the browser:
Successfully set --diskfullThresholdLow to 30
Set the threshold to something lower than the node's current disk utilization. Since
the node's current utilization is 32%, the Warning Disk Full event will happen the next
time the threshold is checked. The new threshold is set only for this particular node, not
for the entire cluster.
Tip: If you are using the Aster Database firewall, you may not be able to connect to port 1953 on the worker node.
You’ll need to disable the firewall temporarily to set this configuration parameter. To do this, see “Enable or Disable the
Aster Database Firewall” on page 130.
3 Review the alerts.log file to validate that disk full event alerts are now being sent by the
worker node. The last few messages should contain a "Low diskfull alert":
# tail -10 /primary/logs/alerts.log
2011-04-18T21:41:21.344197 WARN 2179 StatsManager.cpp:1533 [ST0003] Low diskfull alert
2011-04-18T21:42:21.647056 WARN 2179 StatsManager.cpp:1533 [ST0003] Low diskfull alert
4 If you wish to review the log file, you can log in to the queen and examine the file
/primary/logs/PlatformManager.log
5 The cluster is now generating email for event subscriptions. If you have configured a
subscription that includes the "Warning disk full" event, the email recipient (or SNMP
server) should start to receive those messages.
6 Reset the warning level disk threshold to 65%. Open a browser and paste this URL,
supplying the IP address of the node (Node 1 in our example):
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdLow=65
This sets the Warning disk full threshold level back to 65%.
7 You can test the Error and Critical levels of disk full in this same way using the following
URLs:
• For Error Level Disk Full Threshold (ST0002):
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdMedium=[Threshold Value]
• For Critical Level Disk Full Threshold (ST0001):
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdHigh=[Threshold Value]
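While the test runs, you can filter the log on the event ID to watch for the specific alert level you lowered. This sketch applies the filter to two sample lines like those shown in step 3, rather than reading the live /primary/logs/alerts.log:

```shell
# Sample alert lines; on a live system replace this variable with:
#   log=$(tail -100 /primary/logs/alerts.log)
log='2011-04-18T21:41:21.344197 WARN 2179 StatsManager.cpp:1533 [ST0003] Low diskfull alert
2011-04-18T21:42:21.647056 WARN 2179 StatsManager.cpp:1533 [ST0003] Low diskfull alert'

# Count ST0003 (Disk Full Low) alerts observed so far.
count=$(printf '%s\n' "$log" | grep -c 'ST0003')
echo "ST0003 alerts seen: $count"
```

Substituting ST0002 or ST0001 in the pattern watches the Error and Critical levels instead.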
Return operations to normal after testing event subscriptions
You will need to restart the Aster Database cluster after testing any events that issue a soft
shutdown. For a list of these see “Remediations” on page 141.
Recover from softShutdown
Restart the cluster. To do this, log in as the 'root' user to the queen node. Run "ncli system softrestart".
Troubleshoot Event Engine Issues
Occasionally, the Event Engine may not work as you expect it to. Please check the following
items before contacting Teradata Global Technical Support (GTS).
Do the log files show the expected behavior?
There are two log files for the Event Engine.
One resides on the queen:
• /primary/logs/alerts.log - reports logs from the Event Engine service.
And one resides on the nodes:
• /primary/logs/PlatformManager.log - reports the events that have occurred on that
node.
If you encounter an issue with the Event Engine, the first thing to do is verify that it has started
properly. Check the last few messages in the /primary/logs/alerts.log file. They should
start with 'INFO' and state that the Blackbird service has started successfully. If the file
contains messages that start with 'WARN' or 'ERROR', a problem has occurred. Review the
messages to determine the cause.
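That startup check can be scripted along the following lines. The sample log content is invented for illustration (the exact wording of the startup message may differ); on a live queen you would read the tail of /primary/logs/alerts.log instead:

```shell
# Hypothetical tail of alerts.log; wording is illustrative, not verbatim.
tail_lines='INFO Blackbird service started successfully
WARN SMTP server unreachable'

# Count WARN/ERROR messages, which indicate an Event Engine problem.
problems=$(printf '%s\n' "$tail_lines" | grep -cE '^(WARN|ERROR)')
if [ "$problems" -gt 0 ]; then
  echo "event engine reported $problems problem message(s)"
else
  echo "event engine looks healthy"
fi
```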
Is Aster Database firing event alerts to trigger the configured actions?
Check the last few messages in the /primary/logs/alerts.log file. Check that the
expected disk full alerts are shown in the file. If not, the system is not firing event alerts.
Perhaps you don’t have any nodes with disk utilization high enough to trigger the alert. See
“Test the Event Engine” on page 142 for information on how to set thresholds lower to test
event alerts.
Monitor of Aster Database with SNMP
All Aster Database nodes can send SNMP traps and respond to SNMP read requests. Aster
Database’s SNMP service conforms to net-snmp version 5.4.2.1 and supports the values of the
UC Davis MIB. Follow these instructions to set up SNMP monitoring of Aster Database:
• Set Aster Database to send SNMP traps to an NMS (page 145)
• Set an NMS to perform SNMP reads on Aster Database (page 146)
Set Aster Database to send SNMP traps to an NMS
This section explains how to set up Aster Database to send SNMP traps to an SNMP network
management system (NMS) such as net-snmp, HP OpenView, or CA Unicenter.
Tip! If you don’t already have an NMS installed, the open source net-snmp tool may prove useful to you. For
instructions on setting up the tool, see the net-snmp FAQs on sending traps and receiving traps.
Procedure
1 Make a note of the following:
• IP address of your NMS.
• SNMP trap listener port number of your NMS. This is the port on which the NMS
listens for SNMP traps. By default, this is port 162.
2 Issue the ncli command to create a subscription, using the --type snmp and --manager
host[:port] flags as in the example:
# ncli events addsubscription --subid 1 --type snmp --minPriority medium --minSeverity warn --componentType "hardware" --manager targetNMS.teradata.com
3 Remember that the port defaults to 162, so you don't need to specify a port unless your
NMS listens for SNMP traps on a different port. If you wanted to specify another port
instead, you would use the following flag to refer to the NMS:
--manager targetNMS.teradata.com:port
Set an NMS to perform SNMP reads on Aster Database
You can point your network management system (NMS) to the queen, workers, and loaders in
Aster Database so that the NMS can perform SNMP reads on Aster Database. Each Aster
Database node runs an SNMP agent that listens on port number 19678 for SNMP reads from
the NMS. Consult the documentation for your NMS for instructions on setting it up to
perform reads.
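As a starting point for NMS configuration, a net-snmp client can read from a node's agent port as sketched below. The node IP and community string are placeholders (confirm the community with your SNMP configuration); the command is only printed here so you can review it before running it against a live node:

```shell
NODE_IP=10.60.111.51   # placeholder worker IP; substitute one of your nodes
COMMUNITY=public       # placeholder community string; confirm with your SNMP setup

# Build the read command; run "$cmd" directly once the placeholders are real.
cmd="snmpwalk -v 2c -c $COMMUNITY ${NODE_IP}:19678 system"
echo "$cmd"
```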
CHAPTER 10
Command Line Interface (ncli)
Aster Database Command Line Interface (ncli) is a command line tool that enables you to
gather operational information from all nodes in Aster Database and to take administrative
actions in a uniform manner throughout the cluster. ncli is functional even when the cluster is
down, at which time it may be used to repair the cluster.
ncli allows you to generate output (such as cluster system statistics) in a format that you can
later analyze. ncli functionality includes a way to look at node status, vworker configuration,
I/O configuration, replication status, and process management job status. Operations may be
performed on one, a group, or all of the nodes. Output may be formatted in tables for screen
viewing, piped to another UNIX command, or saved to a file.
This section explains how to use ncli. The following topics are covered here:
• ncli Installation and Setup
• ncli Usage
• ncli Command Reference
ncli Installation and Setup
Install ncli
ncli is available beginning in Aster Database version 4.6. When installing or upgrading to
version 4.6 or later, ncli is installed under /home/beehive/ncli.
To display the version of ncli currently installed, issue:
$ ncli --version
Required Privileges to Run ncli Commands
To run most ncli commands, you should log in as the UNIX user, beehive. However, to run
certain powerful commands (e.g., softrestart and softshutdown in the system section),
you must be logged in as root. If you attempt to run one of these commands as beehive, an
error message will display indicating that this command may only be run by root.
Set up passwordless SSH
Passwordless SSH must be set up among all nodes for the UNIX user who will issue ncli
commands. The Aster Database installer does this automatically for the beehive user on all
machines that are part of the cluster. As noted above, certain powerful commands (e.g.,
softrestart and softshutdown in the system section) must be run as root; if you find
that you cannot issue the ncli commands which require root access, you may need to set up
passwordless SSH for the root user again.
To simplify cluster configuration, Aster Database requires passwordless root SSH among the
queen and all worker nodes in all directions. This means that the queen, all worker nodes, and
any backup nodes must have the same SSH authorized key. This section shows you the
Aster-recommended way to set this up.
You also have the option of letting the installer set up Aster Database to accept a custom
pre-shared key. The custom pre-shared key approach is intended mainly for cloud-based
deployments. If you will be using that technique, you should skip this step.
Aster Database requires the use of a DSA or RSA key for passwordless SSH. This example uses
a DSA key. To set up passwordless SSH:
1 Log in as root.
2 If you don't already have SSH keys you want to use, generate new keys now by running the
following command:
# ssh-keygen -t dsa
3 You will be prompted to enter the directory where the key should be saved and a
passphrase:
Enter file in which to save the key (/root/.ssh/id_dsa): Press [Enter] key
Enter passphrase (empty for no passphrase): myPassword
Enter same passphrase again: myPassword
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
04:be:15:ca:1d:0a:1e:e2:a7:e5:de:98:4f:b1:a6:01 root@aavrach-queen
4 Change to the directory where your key file was created:
# cd .ssh
5 Copy your DSA key file to authorized_keys2 to make it an SSH authorized key:
# cat id_dsa.pub >> authorized_keys2
6 Change the permissions on the authorized key file:
# chmod 600 authorized_keys2
or
# chmod og-rwx authorized_keys2
7 To make sure the key file works, open an SSH session to localhost:
# ssh localhost
If you are asked to confirm, type yes and press Enter. The login should then succeed
without prompting for a password.
8 Exit the SSH session:
# exit
9 Make sure the worker and loader nodes are powered on and booted into their OS.
10 Back on the queen, change the working directory to your home directory:
# cd
11 Copy the key file to each of the Aster Database nodes by running the following command,
specifying the IP address of the node to which you are copying the key:
# scp -pr .ssh <target-machine-IP>:
If you are asked to confirm, type yes and press Enter. If you are prompted for a password,
enter it.
12 Verify that you can SSH to each worker and loader node without a password.
# ssh <target-machine-IP>
# exit
13 Test the passwordless connections among all nodes by connecting via SSH from each of
the nodes to all the other nodes.
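Steps 11 through 13 can be wrapped in a small loop over your node list. The IP addresses below are placeholders; the loop only prints each command so you can review it first, and you would remove the echo to execute:

```shell
# Placeholder node IPs; substitute your worker and loader addresses.
NODES='10.60.111.51 10.60.111.52 10.60.111.53'

prepared=0
for ip in $NODES; do
  # Remove 'echo' to actually copy the key directory and verify the login.
  echo "scp -pr .ssh ${ip}:"
  echo "ssh ${ip} exit"
  prepared=$((prepared + 1))
done
echo "prepared commands for $prepared node(s)"
```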
ncli Usage
Who Should Use ncli?
In most installations, your in-house power users, administrators, Teradata Global
Technical Support (GTS), and consultants will use ncli. Any administrator who prefers not
to use the AMC can use ncli. Some operations, like configuration of event
subscriptions, are only possible through ncli. Administrators will find ncli very useful because
many ncli commands work when the cluster is down (the AMC does not), so it can aid in
troubleshooting.
This is in contrast to the AMC (Aster Database Management Console), which is focused on
setting up, managing, and scaling out Aster Database. The AMC is used by your in-house
Aster Database administrators and DBAs.
Issue ncli Commands
You invoke ncli from the command line on the Aster Database queen by typing the command
“ncli” followed by the section name, command name, and any parameters. The capabilities
of ncli are divided into “sections”, which are groups of commands with related functions. Flags
may be added to commands to modify their actions, for example, by formatting the results or
limiting the action of the command to specific nodes.
To run a command, open a shell on any node in the cluster and type ncli followed by the
high level flags, the name of the section, the command, and finally any command arguments.
In other words, a typical command takes the form:
$ ncli [<highlevelflag>] <section> <command> [<commandflag>]
For example, to run the show command in the node section, while passing an argument to the
--hosttype flag so that you'll see only information about workers (and not the queen or
loaders), you would type:
$ ncli --hosttype=worker node show
A simpler example, which shows CPU configurations for all nodes, is this:
$ ncli node showcpuconfig
Command Line Conventions
You can use the standard UNIX command line conventions, such as piping your results
through another command. For example:
$ ncli vworker showconfigsignature | grep 8f39c1ddfa4762d81f4a5960a31491ff
ncli Help
To invoke ncli help, type:
$ ncli --help
See the table below for information on how to access detailed help for sections and
commands.
Table 10 - 1: ncli Help Commands

ncli --help
    Shows a list of help commands and high level flags.
ncli
    Shows a list of available command sections.
ncli <section>
    Shows available commands within the specified section.
ncli --help <section> <command>
    Shows detailed help text for the specified command.
ncli --helpshort
    Shows help only for the ncli module.
ncli Command Reference
ncli Command Sections
The capabilities of ncli are divided into sections, which are groups of commands with related
functions. The following table lists the sections:
Table 10 - 2: ncli Command Sections

apm
    Commands related to Aster Package Manager (apm)
database
    Commands for pre-upgrade database tasks
disk
    Commands related to disks
events
    Commands to configure events
ice
    Commands related to ICE (Inter Cluster Express) server
ippool
    Commands to configure the Aster Database pool of IP addresses
netconfig
    Commands to configure network interfaces by function
node
    Commands related to nodes
nsconfig
    Commands to configure name servers and hosts
process
    Commands related to running processes
procman
    Commands that retrieve status from the process management master
qos
    Commands related to workload management and admission limits
query
    Commands to view the state of running queries in the system
replication
    Commands related to replication
session
    Commands to view the state of running sessions in the system
sqlh
    Commands related to SQL-H
sqlmr
    Commands related to SQL-MR
statsserver
    Commands related to the StatServer
sysman
    Commands related to sysman, the Aster Database system management layer
system
    Commands related to Aster Database system status display and control
tables
    Commands related to table information
util
    Miscellaneous commands
vworker
    Commands related to vworkers
ncli apm Section
The commands in the apm section must be executed as the root user. The apm section
provides commands for installing and administering third-party packages on Aster Database
clusters. Currently, only the R programming language and environment is supported.
R is an open-source software package especially suitable for data analysis and graphical
representation (http://cran.r-project.org). For more information about installing and
administering R packages, see the Teradata Aster Big Analytics Appliance 3H Database
Administrator Guide.
Table 10 - 3: ncli apm section

ncli apm help R <options>
    Provides help information.
ncli apm show [R] <options>
    Shows information about third-party packages installed/available via apm. It also shows package-specific information such as version.
ncli apm install R <options>
    Installs the latest version of the software package/optional packages not already installed on the queen in the auxiliary root area. (For R, these could be optional R packages or optional RPM packages.)
ncli apm remove R <options>
    Removes the software package/optional package from the cluster. Tracks the uninstallation status on the queen and reports the installation failure, if any. Makes a best-effort uninstallation on each worker.
ncli apm administer R <options>
    Administers the package installation across the Aster Database cluster. Performs conditional package synchronization across the Aster Database cluster. Sets up the repository on the cluster or in an auxiliary root area. Removes the repository on the cluster or in an auxiliary root area.
You should execute ncli apm install/remove/administer commands only as root and only from the queen node.
ncli apm help R
Syntax
ncli apm help R [<command>] --full
Table 10 - 4: ncli apm help R

ncli apm help R <command>
    Gives full R-specific help corresponding to the specified command (install, remove, show, administer).
ncli apm help R
    Provides general help information about R.
ncli apm help R --full
    Provides detailed help information about R.
ncli apm show
Shows information about third-party packages installed/available via apm. It also shows
package-specific information such as version.
Syntax
ncli apm show [R] [<options>]
Table 10 - 5: ncli apm show R
Options
Description
--info
Displays Base R package information.
--packages=<packages>
Displays optional package version information.
Use a comma-separated list to specify the
packages.
--packages
Displays information about the installed and default
optional packages. If Priority is “base,” the package is
already installed and loaded, which means that all of its
functions are available upon running R. If Priority is
“recommended,” the package was installed with base
R, but was not loaded.
--localconfig
Displays the R installation status on all the nodes.
--rpmpackage=<package>
Displays the installation status of the specified RPM
package on all the nodes.
Examples
$ ncli apm show
+--------------+---------------------+
| Package Name | Installation Status |
+--------------+---------------------+
| R            | Installed           |
+--------------+---------------------+
1 rows
$ ncli apm show R --packages
Default R Packages found in LibPath /usr/lib64/R/library
+------------+---------+-------------+
| Package    | Version | Priority    |
+------------+---------+-------------+
| base       | 2.15.2  | base        |
| boot       | 1.3-7   | recommended |
| class      | 7.3-5   | recommended |
| cluster    | 1.14.3  | recommended |
| codetools  | 0.2-8   | recommended |
| compiler   | 2.15.2  | base        |
| datasets   | 2.15.2  | base        |
...
$ ncli apm show R --info
R version 2.15.3 (2013-03-01) -- "Security Blanket"
Copyright (C) 2013 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
...
$ ncli apm show R --rpmpackage=zip
zip Package Local Installation Status
+--------------+-----------+---------------------+
| Node Ip      | Node Type | Installation Status |
+--------------+-----------+---------------------+
| 10.60.111.50 | Queen     | Installed           |
| 10.60.111.51 | Worker    | Installed           |
| 10.60.111.52 | Worker    | Installed           |
| 10.60.111.53 | Worker    | Not Installed       |
+--------------+-----------+---------------------+
$ ncli apm show R --localconfig
R local configuration
+--------------+-----------+-----------+
| Node Ip      | Node Type | R Version |
+--------------+-----------+-----------+
| 10.60.111.50 | Queen     | 2.15.2    |
| 10.60.111.51 | Worker    | 2.15.2    |
| 10.60.111.52 | Worker    | 2.15.2    |
+--------------+-----------+-----------+
ncli apm install R
Installs the base R package or optional R packages. If you specify the --packages option and
R is already installed, this command installs the specified packages. If no option is specified,
this command installs the R package.
Syntax
ncli apm install R [<options>]
Table 10 - 6: ncli apm install R
Option
Description
If no option is specified, installs the base R package.
--packages=<packages>
Installs the specified comma-separated list of packages
on the Aster Database cluster.
--usedefaultrrepo={True|False}
Specifies whether to use the default repository for R
installation on Red Hat and SUSE Linux Enterprise
Server (SLES). By default, the value of this option is
True. To install R from a different repository that
contains R packages, set this option to False.
On Red Hat, the default repository points to:
• http://download.fedoraproject.org/pub/epel/6/x86_64 (for R packages)
• http://ftp1.scientificlinux.org/linux/scientific/6.0/x86_64//os/Packages (to resolve dependencies)
On SLES, the default repository points to:
http://download.opensuse.org/repositories/devel:/languages:/R:/base/SLE_11_SP1
--repo=<repoName>,<repoUrl>
Installs R on the cluster using the specified temporary
repository. You can specify multiple space-separated
repositories.
--rpmpackages=<packages>
Installs the RPM packages required for optional
package installation.
Example
$ ncli apm install R
--repo=sles,"http://mirror.asterdata.com/sles/11.sp1/x86_64"
--repo=sles-sdk,"http://mirror.asterdata.com/sles/11.sp1-sdk/x86_64"
Starting APM
R is not installed.
sles Repo created successfully.
sles-sdk Repo created successfully.
Starting APM
Installing package sles-release
Successfully installed sles-release package
…
Installing package R
Successfully installed R package
Installing package R-devel
Successfully installed R-devel package
Installing R on worker nodes
Successfully synchronized R across Aster cluster.
R installation complete.
sles Repo successfully removed.
sles-sdk Repo successfully removed.
$ ncli apm install R --packages=histogram,HMM
Installing R package histogram...
Installing R package HMM...
Successfully synchronized R packages across Aster cluster.
Successfully installed R packages histogram,HMM
$ ncli apm install R --rpmpackages=vim
Starting packages vim installation
Installing package vim
Successfully installed vim package
Successfully synchronized R packages across Aster cluster.
ncli apm remove R
Removes the base R and optional R packages from the cluster. Accepts zero or one parameter.
Syntax
ncli apm remove R [--packages=<packages>]
Table 10 - 7: ncli apm remove R
Option
Description
If no option is specified, removes R.
--packages=<packages>
Removes the specified comma-separated list of packages
from the Aster Database cluster.
Examples
$ ncli apm remove R --packages=nortest,e1071
Successfully uninstalled R packages nortest,e1071
$ ncli apm remove R
Starting APM
Successfully uninstalled R from Aster cluster
ncli apm administer R
Administers the package installation across the Aster cluster.
Syntax
ncli apm administer R <options>
Table 10 - 8: ncli apm administer R
Options
Description
--setuprepo=<repoName>,<repoURL>
Sets up a repository for the ncli apm command. If the
optional base package name is provided, this command
sets up the repository in an auxiliary root area.
If no package name is provided, this command sets up
the repository in the <auxiliary root>/R/etc/yum.repos.d
area for Red Hat and the <auxiliary root>/R/etc/zypp/repos.d/
area for SLES.
To add a local repository, you can use these options:
--setuprepo=<repoName>,file:<full repo path>
--setuprepo=<repoName>,file://<full repo path>
--removerepo=<repoName>
If the optional package name is provided, the
command removes, if present, the repository from the
auxiliary root area. If no package is specified, the
command removes the repository from the /etc/yum.repos.d
location.
--synchronize
Performs R synchronization on the queen.
Examples
$ ncli apm administer R --setuprepo=redhat,"http://mirror.example.com/
rhel/6.0/os/x86_64"
redhat Repo created successfully.
$ ncli apm administer R --synchronize
Successfully synchronized R across Aster cluster.
$ ncli apm administer R --removerepo=sles
sles Repo successfully removed.
$ ncli apm administer R --setuprepo=rrepo,file:/home/beehive/packages
ncli database Section
The database section provides commands related to the upgrade process. The syntax to run a
command in the database section looks like this example:
$ ncli database backupmetadata
Table 10 - 9: ncli database section
Command
Description
backupmetadata
Dumps all the Postgres metadata catalogs of all
the databases on all nodes. These dumps can be
used to re-create the pre-upgrade metadata
structures, if necessary.
checkpoint
Executes a CHECKPOINT on active vworkers.
steadystatechecks
Checks for prepared transactions and zombie
databases (databases that were previously
dropped, but some remnants remain).
storagesize <storagetypename>
Computes the size of the storage type(s) for all
vworkers in the cluster.
ncli disk Section
The disk section provides commands related to disks. The syntax to run a command in the
disk section looks like this example:
$ ncli disk showallconfig
which returns a result like:
IO Scheduler Configuration
+------------+----------------------------------+-------------+
| Node IP    | Node ID                          | Device Path |
+------------+----------------------------------+-------------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | /dev/sda1   |
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | /dev/sda2   |
| ...
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | /dev/sda7   |
+------------+----------------------------------+-------------+
18 rows
table continued...
+-----------+--------------+-----------------+
| Scheduler | Max Requests | Read Ahead (KB) |
+-----------+--------------+-----------------+
| deadline  | 4096         | 4096            |
| deadline  | 4096         | 4096            |
| ...
| deadline  | 4096         | 4096            |
+-----------+--------------+-----------------+
Table 10 - 10: ncli disk Section
Command
Description
showallconfig
Shows report on disk configuration.
showfsconfig [all]
Shows report on file system configuration.
showhpacucliconfig
Shows report on data generated by the hpacucli utility.
hpacucli (the Array Configuration Utility CLI) is a command
line-based disk configuration program for HP Smart Array
Controllers and RAID Array Controllers.
showioschedconfig
Shows report on IO scheduler configuration.
showmdconfig
Shows report on md configuration (multiple device i.e. RAID).
showmegacliconfig
[--pdonly|--ldonly]
Shows report on disk configuration for Dell machines.
Optionally view only PD (physical disks) or LD (virtual disks).
showmppconfig
For Teradata Global Technical Support (GTS) use only.
Displays information on the Redundant Disk Array Controller
Multi-Path Proxy (MPP) devices used on the Teradata Aster
MapReduce Appliance 2 only.
For all other platforms, this command does not return any
results.
ncli events Section
The events section provides commands to view and configure event subscriptions in the Aster
Database Event Engine. See “Monitor Events with the Event Engine” on page 132 for
information about event subscriptions.
When you set up event subscriptions, you’re setting up a subscription to be notified via SNMP
or email whenever events of a particular type occur. The ncli is the only way to add and
manage subscriptions.
The commands in the events section will run against the queen, even if executed from a
worker node. The syntax to run a command in the events section looks like this example:
$ ncli events listsubscriptions
Event Subscriptions
+--------+------------+--------------+--------------+----------------+
| Sub ID | Notif Type | Min Priority | Min Severity | Component Type |
+--------+------------+--------------+--------------+----------------+
| 9      | snmp       | High         | FATAL        |                |
| 8      | snmp       | Medium       | ERROR        |                |
| 7      | snmp       | High         | FATAL        |                |
| 6      | snmp       | High         | FATAL        |                |
+--------+------------+--------------+--------------+----------------+
4 rows
table continued...
+-----------+---------------+----------------------+
| Event IDs | Throttle Secs | Notification Details |
+-----------+---------------+----------------------+
| ST0001    | 0             | manager=10.60.11.5   |
| SY0002    | 0             | manager=10.60.11.5   |
| SY0001    | 0             | manager=10.60.11.5   |
| ST0002    | 0             | manager=10.60.11.5   |
+-----------+---------------+----------------------+
To add a new event subscription, issue a command like:
$ ncli events addsubscription --eventIds ST0003 --type snmp --manager
10.60.11.5 --minPriority high --minSeverity fatal
which displays the event subscription added, returning a result like:
Event Subscriptions
+--------+------------+--------------+--------------+----------------+-----------+
| Sub ID | Notif Type | Min Priority | Min Severity | Component Type | Event IDs |
+--------+------------+--------------+--------------+----------------+-----------+
| 5      | snmp       | High         | FATAL        |                | ST0003    |
+--------+------------+--------------+--------------+----------------+-----------+
table continued...
+---------------+----------------------+
| Throttle Secs | Notification Details |
+---------------+----------------------+
| 0             | manager=10.60.11.5   |
+---------------+----------------------+
1 rows
To see a list of required and optional parameters for an event subscription, issue the following
command.
$ ncli --help events addsubscription
ncli events addsubscription <subscription args> <notification args>
Add a new subscription
Add or Edit a subscription
<subscription args>
[--id id]: Subscription ID. Required for edit
--type email | snmp
--minPriority low | medium | high
--minSeverity info | warn | error | fatal
[--componentTypes filter[,filter...]] : Filter(s) based on component
type string.
[--eventIds event[,event...]] : Specific event ids
[--throttleSecs secs] : Throttle same events. 0 means don't throttle.
<email notification args>
--to address[,address..]
--from address
--smtp host[:port]
[--username username --password password]
<snmp notification args>
--manager host[:port]
For a list of valid event IDs, see “Supported Events” on page 138.
Table 10 - 11: ncli events Section
Command
Description
addsubscription <subscription args>
<notification args>
Adds a new subscription.
deletesubscription <sub id>
Deletes an existing subscription.
editsubscription <subscription
args> <notification args>
Edits an existing subscription.
listsubscriptions [sub id]
Lists existing subscriptions to events, optionally
filtered by subscription identifier.
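The --throttleSecs behavior described in the help output (suppress repeat notifications of the same event within a window; 0 means don't throttle) can be modeled with a short sketch. This is illustrative only, not the Event Engine's actual implementation; the class and method names are invented for the example.

```python
import time

class EventThrottler:
    """Sketch of per-event throttling in the spirit of --throttleSecs:
    a repeat of the same event ID within throttle_secs of the last
    notification is suppressed; 0 disables throttling entirely."""

    def __init__(self, throttle_secs, clock=time.monotonic):
        self.throttle_secs = throttle_secs
        self.clock = clock
        self._last_sent = {}  # event id -> time of last notification

    def should_notify(self, event_id):
        now = self.clock()
        last = self._last_sent.get(event_id)
        if self.throttle_secs > 0 and last is not None \
                and now - last < self.throttle_secs:
            return False  # same event seen too recently: throttle it
        self._last_sent[event_id] = now
        return True
```

An injectable clock makes the window logic easy to exercise without waiting in real time.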
ncli ice Section
ICE stands for inter-cluster exchange. The ICE section provides commands related to the ICE
server, which provides services to move data around in Aster Database. The syntax to run a
command in the ICE section looks like this example:
$ ncli ice showactivetransports
which returns a result like:
Active Transports
+------------+---------------------+--------------------+
| Node       | SessionId           | TransportId        |
+------------+---------------------+--------------------+
| 10.60.11.5 | 2327674903724048181 | 387487523833891928 |
| 10.60.11.6 | 2327674903724048181 | 387487523833891928 |
| 10.60.11.7 | 2327674903724048181 | 387487523833891928 |
+------------+---------------------+--------------------+
3 rows
Table 10 - 12: ncli ice Section
Command
Description
showactivetransports
Shows all active transports.
showicestats <sessionid>
Shows statistics of all ICE servers.
ncli ippool Section
The ippool section provides general tools for managing IP address allocations in the Aster
Database cluster. This section only applies to AMOS installations. For more detailed
information on IP pools, or to set them up using the AMC, refer to “Set up IP Pools in the
AMC” on page 104.
Syntax
The syntax to run a command in the ippool section looks like this example:
$ ncli ippool showranges
which displays results like:
IP Address Ranges
+--------+--------------+--------------+
| Type   | Start IP     | End IP       |
+--------+--------------+--------------+
| shared | 10.60.11.100 | 10.60.11.105 |
+--------+--------------+--------------+
1 rows
Table 10 - 13: ncli ippool section
Command
Description
setranges <start ip1>-<end ip1>
[shared|queens|workers|loaders]...
Sets the new IP pool ranges.
showallocations
Shows currently allocated IP addresses.
showranges
Shows currently configured IP ranges.
For command-line help on these commands, see:
# ncli --help ippool
You should issue the ippool showranges command before making any changes, in order to
see what the current settings are.
Set up IP pools
The pools can be reconfigured using the ncli ippool command or the AMC (“Set up IP Pools
in the AMC” on page 104). This should be done prior to adding any new nodes. In many
cases, leaving the default range alone will be suitable. You should change the allocation
scheme only if the network is not entirely owned by Aster Database, or if the range is so large
that you don’t want to give Aster Database the entire range (e.g. 2^16 addresses in the case of
a class-B subnet).
Tip: Make a note of all the IP Pools settings and keep it somewhere separate from your queen node. If your queen fails,
you will not have access to this information. The IP Pools settings will need to be re-applied manually at the end of the
queen replacement procedure, so you will need to know the IP Pools settings to apply.
The ippool setranges command
Make the pools as large as you expect the cluster to be, with additional space for growth. Also,
some features in Aster Database use IP addresses to do utility operations (like networking
auto-enslavement for bonding) and these operations require temporary use of some of the IP
addresses in the pool. Allowing two or three extra IP addresses in the worker and loader pool
will suffice.
You can use setranges to allocate a group of IP addresses that will be assigned to new
workers, queens, or loaders as they are added to the Aster Database cluster, or you can allocate
a pool of shared IP addresses that can be assigned to any new node that is added. There can be
no overlap between the ranges that are allocated, but you may use two or more IP pools that
are noncontiguous:
ncli ippool setranges 10.60.11.100-10.60.11.105 shared
IP Address Ranges
+--------+--------------+--------------+
| Type   | Start IP     | End IP       |
+--------+--------------+--------------+
| shared | 10.60.11.100 | 10.60.11.105 |
+--------+--------------+--------------+
1 rows
You can use setranges to resize the ranges at any time, but each new range must always
include all existing nodes of its type.
ncli IP pool examples
The following examples describe a few typical installation scenarios and the commands
needed to configure the queen for these installations.
Installation 1: Create a minimal class-C IP pool
Create a shared range for use by all nodes. Assume the coordinator is already installed at
192.168.10.10.
# ncli ippool setranges 192.168.10.10-192.168.10.199 shared
This means all new nodes will be allocated IP addresses from .11 through .199.
Installation 2: Allocate node specific ranges in a class-C network
To allow the queen a portion of the class-C network and to control how nodes are allocated,
the following command can be used:
# ncli ippool setranges 192.168.10.10-192.168.10.11 queens
192.168.10.100-192.168.10.199 workers 192.168.10.200-192.168.10.219
loaders
This command assumes the coordinator is already using 192.168.10.10 and leaves
192.168.10.11 for an additional queen later on. Workers and Loaders will be allocated IP
addresses from .100-.199 and .200-.219 respectively.
Installation 3: Allocate a portion of a class-B network
Suppose that your IT department has given you an allocation for your Aster Database nodes in
a class B network. You can restrict the node allocation to use just a portion of that network.
Note that the queen still assumes that it will be the only DHCP server reachable by the nodes
for use in PXE provisioning of the OS.
# ncli ippool setranges 172.20.0.10-172.20.0.249 shared
This would allocate 240 addresses in the 172.20.0.0/16 space.
Installation 4: Allocate non-contiguous IP pools
Pools are not required to be contiguous, but they must be in the same network, and they must
include all existing nodes. If you run out of IPs for a particular node type in an existing pool
and need to add another pool, you can do that instead of growing the existing pool. Pools can
also consist of only one IP address.
# ncli ippool setranges 192.168.10.10-192.168.10.10 queens
192.168.10.200-192.168.10.209 workers 192.168.10.230-192.168.10.239
workers 192.168.10.220-192.168.10.229 loaders
This would allocate a single IP address for the queen, ten addresses for workers and ten
addresses for loaders.
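The pool rules used throughout these examples (ranges must not overlap, must cover all existing nodes, and allocate a predictable number of addresses) can be checked with a short standalone sketch. ncli performs its own validation; the helper names here are invented for illustration.

```python
import ipaddress

def parse_range(spec):
    """Parse a range like "10.60.11.100-10.60.11.105" into (start, end)."""
    start, end = (ipaddress.ip_address(p) for p in spec.split("-"))
    if end < start:
        raise ValueError(f"end before start in {spec!r}")
    return start, end

def check_pools(specs, existing_nodes=()):
    """Validate a proposed set of pool ranges: ranges must not overlap,
    and every existing node address must fall inside some range.
    Returns the total number of addresses across all pools."""
    ranges = sorted(parse_range(s) for s in specs)
    # Adjacent ranges after sorting are the only possible overlaps.
    for (s1, e1), (s2, e2) in zip(ranges, ranges[1:]):
        if s2 <= e1:
            raise ValueError(f"ranges overlap: {s1}-{e1} and {s2}-{e2}")
    for node in map(ipaddress.ip_address, existing_nodes):
        if not any(s <= node <= e for s, e in ranges):
            raise ValueError(f"existing node {node} is not covered by any pool")
    return sum(int(e) - int(s) + 1 for s, e in ranges)
```

For example, the class-B range 172.20.0.10-172.20.0.249 from Installation 3 works out to the 240 addresses the text mentions.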
ncli netconfig Section
For clusters composed of multi-NIC machines, the network assignments feature gives you the
option of segregating data-backup traffic and data-loading traffic from your query network
traffic. Using this feature, you can cable a NIC interface of each node into a separate subnet
that you dedicate for backup or loading traffic.
You set up network assignments via the ncli netconfig command, as explained below. The
netconfig section provides commands for assigning the Aster Database functions on each
node to a network by IP address or network interface. Some of this functionality is also
available in the AMC’s Network Assignments panel (see “Multi-NIC Machines” on page 95 and
“NIC Bonding” on page 101).
Tip: Note that before using these commands to configure the network, all nodes must have the appropriate physical
cabling to support the configurations you will make. If you attempt to enslave an uncabled interface, ncli will experience
a long timeout while it attempts to configure the network.
The syntax to run a command in the netconfig section looks like this example:
$ ncli netconfig showsystem
which returns results like:
Current Network State
+------------+----------------------------------+------------+-------------+-----------+
| Node IP    | Node ID                          | IP Address | Netmask     | Gateway   |
+------------+----------------------------------+------------+-------------+-----------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | 10.60.11.5 | 255.255.0.0 | 10.60.0.1 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | 10.60.11.6 | 255.255.0.0 | 10.60.0.1 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | 10.60.11.7 | 255.255.0.0 | 10.60.0.1 |
+------------+----------------------------------+------------+-------------+-----------+
3 rows
table continued...
+-----------------------+------------+----------------+
| Bonding Enabled (Y/N) | Interfaces | Bonding Master |
+-----------------------+------------+----------------+
| N                     | eth1       | N/A            |
| N                     | eth0       | N/A            |
| N                     | eth0       | N/A            |
+-----------------------+------------+----------------+
or, to see IP addresses assigned to the various Aster Database functions, issue:
$ ncli netconfig showfunctionips
which returns results like:
IPs for each function
+------------+----------------------------------+----------+------------+
| Node IP    | Node ID                          | Function | IP Address |
+------------+----------------------------------+----------+------------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | queries  | 10.60.11.5 |
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | loads    | 10.60.11.5 |
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | backups  | 10.60.11.5 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | queries  | 10.60.11.6 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | loads    | 10.60.11.6 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | backups  | 10.60.11.6 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | queries  | 10.60.11.7 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | loads    | 10.60.11.7 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | backups  | 10.60.11.7 |
+------------+----------------------------------+----------+------------+
9 rows
Table 10 - 14: ncli netconfig Section
Command
Description
apply [--dryRun]
Applies the differences between the network configuration
and the system. Use --dryRun to test your configuration
before applying it.
fromsystem
Sets the network configuration from the system state.
inspect
Runs tests to see if there is a mismatch in function
assignments.
setconfig ip1 <1.2.3.4>
netmask1 <255.255.255.0>
interfaces1 <eth0,...ethn>
bonding1 <y|n>
Assigns network parameters for this node for one or more
interfaces.
All parameters take an integer index <N> to indicate how
the pairings align (i.e., to designate different interfaces,
use interfaces1, interfaces2, ... interfacesN; for IP addresses,
use ip1, ip2, ... ipN).
Parameters to configure are:
• ip<N>: the IP address
• netmask<N>: the network mask
• gateway<N>: the gateway (optional). Only one
gateway is supported.
• usebonding<N>: 'y' or 'n' to explicitly enable
bonding. If not specified, it defaults to 'y' when
more than one interface is specified.
• interfaces<N>: comma-separated list of Ethernet
interfaces to use. This is operating system and
hardware dependent, so check your network interfaces
to get the correct naming.
Note that setconfig only applies to AMOS installs. For
UMOS installs, these configurations are set through the
operating system.
setfunctions <eth*> loads
<1.2.3.4> backups
Assigns functions by interface or IP, or specify [-clear] to erase.
showconfig
Shows currently configured network parameters.
showfunctionips
Displays the IP addresses used for each function.
showfunctions
Shows currently configured functions.
showsystem
Shows current networking state.
ncli netconfig examples
Set up NIC bonding
In the following example, setconfig is used to set up bonding and assign NICs to IP
addresses. Note that this example applies only to AMOS installs. UMOS multi-NIC
installations require NIC bonding to be set up in the OS prior to installing Aster Database.
$ ncli netconfig setconfig ip1 192.168.60.100 netmask1 255.255.255.0 interfaces1
eth0,eth1,eth2 bonding1 y ip2 192.168.25.50 netmask2 255.255.255.0 gateway2
192.168.25.30 interfaces2 eth3
Network Configuration
+----------------+---------------+---------------+-------------+----------------+
| IP Address     | Netmask       | Gateway       | Bonding Y/N | Interfaces     |
+----------------+---------------+---------------+-------------+----------------+
| 192.168.60.100 | 255.255.255.0 | -             | Y           | eth0,eth1,eth2 |
| 192.168.25.50  | 255.255.255.0 | 192.168.25.30 | N           | eth3           |
+----------------+---------------+---------------+-------------+----------------+
2 rows
The configuration is then verified using --dryRun to find any issues before it is applied:
$ ncli netconfig apply --dryRun
The output supplies information on how the configuration will be applied:
Operations required to apply
+--------------------------------------------------------------------+
| Operation                                                          |
+--------------------------------------------------------------------+
| Clear ip settings on interface eth1                                |
| Take down interface eth1                                           |
| Create bond bond0                                                  |
| Set ip settings on interface bond0 to 192.168.60.100:255.255.255.0 |
| Add slaves eth0,eth1,eth2 to bond0                                 |
| Add default gateway 192.168.25.30 for eth3                         |
+--------------------------------------------------------------------+
6 rows
Warning! Applying the network settings is accomplished by restarting network services with the new settings.
Because of this, any operations that are currently running over the network will be interrupted. Be sure that there are
no active queries before applying network settings.
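Conceptually, the dry-run plan is a diff between the desired bonded configuration and each node's current interface state. A minimal sketch of that idea follows; the data model and field names are invented for illustration and are not ncli's internal logic.

```python
def plan_bond_setup(current_ifaces, desired):
    """Sketch of a dry-run style plan: compare the desired bonded
    configuration against the current per-interface state and emit an
    ordered list of operations, similar in spirit to what
    `ncli netconfig apply --dryRun` prints."""
    ops = []
    slaves = desired["interfaces"]
    # Interfaces that currently carry an address must be cleared and
    # taken down before they can be enslaved to the bond.
    for iface in slaves:
        if current_ifaces.get(iface, {}).get("ip"):
            ops.append(f"Clear ip settings on interface {iface}")
            ops.append(f"Take down interface {iface}")
    bond = desired["bond"]
    ops.append(f"Create bond {bond}")
    ops.append(f"Set ip settings on interface {bond} to "
               f"{desired['ip']}:{desired['netmask']}")
    ops.append(f"Add slaves {','.join(slaves)} to {bond}")
    return ops
```

Computing the plan first, then executing it, is what makes a safe preview like --dryRun possible.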
After any necessary preparations are made, the command to apply the network configuration
is issued:
$ ncli netconfig apply
which returns results like:
Current Network State
+----------------+---------------+---------------+-------------+----------------+
| IP Address     | Netmask       | Gateway       | Bonding Y/N | Interfaces     |
+----------------+---------------+---------------+-------------+----------------+
| 192.168.60.100 | 255.255.255.0 | -             | Y           | eth0,eth1,eth2 |
| 192.168.25.50  | 255.255.255.0 | 192.168.25.30 | N           | eth3           |
+----------------+---------------+---------------+-------------+----------------+
2 rows
Assign Aster Database functions to subnets
This example shows assignment of Aster Database functions to subnets. The example assigns
loads and backups to only use the IP address 192.168.25.50. Query traffic remains on the
default IP address 192.168.60.100.
$ ncli netconfig setfunctions 192.168.25.50 loads 192.168.25.50 backups
+----------+----------------+
| Function | IP Address     |
+----------+----------------+
| queries  | 192.168.60.100 |
| loads    | 192.168.25.50  |
| backups  | 192.168.25.50  |
+----------+----------------+
3 rows
ncli node Section
The most commonly used section is the node section, which provides general tools for
reporting and running UNIX commands on one or many nodes in the cluster. The syntax to
run a command in the node section looks like this example:
$ ncli node showsummaryconfig
which displays a result like this:
Node Configuration
+------------+----------------------------------+-----------+
| Node IP    | Node ID                          | Platform  |
+------------+----------------------------------+-----------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | DELL R710 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | DELL R710 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | DELL R710 |
+------------+----------------------------------+-----------+
3 rows
table continued...
+--------------------+----------------------------+
| Aster Version      | Kernel Version             |
+--------------------+----------------------------+
| beehivemain-r28783 | 2.6.32-131.21.1.el6.x86_64 |
| beehivemain-r28783 | 2.6.32-131.21.1.el6.x86_64 |
| beehivemain-r28783 | 2.6.32-131.21.1.el6.x86_64 |
+--------------------+----------------------------+
table continued...
+--------------------------------------------------------+------+---------------+
| Distribution                                           | CPUs | Free Mem (MB) |
+--------------------------------------------------------+------+---------------+
| Red Hat Enterprise Linux Server release 6.0 (Santiago) | 1    | 6179 / 7873   |
| Red Hat Enterprise Linux Server release 6.0 (Santiago) | 1    | 6791 / 7873   |
| Red Hat Enterprise Linux Server release 6.0 (Santiago) | 1    | 6793 / 7873   |
+--------------------------------------------------------+------+---------------+
table continued...
+----------------+----------------+
| Free Swap (MB) | Free Disk (GB) |
+----------------+----------------+
| 10049 / 10049  | 151 / 167      |
| 10049 / 10049  | 79 / 89        |
| 10049 / 10049  | 79 / 89        |
+----------------+----------------+
Here is another example using showcmd, which issues the specified command on every node
and displays results in a table. For example:
$ ncli node showcmd cat /proc/sys/fs/file-max
displays the results:
Command Output for cat /proc/sys/fs/file-max
+------------+----------------------------------+------+--------+--------+
| Node IP    | Node ID                          | exit | stdout | stderr |
+------------+----------------------------------+------+--------+--------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | 0    | 784168 |        |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | 0    | 784195 |        |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | 0    | 784196 |        |
+------------+----------------------------------+------+--------+--------+
3 rows
runonall and runonother
The ncli node runonall command may be used to run any executable on multiple nodes.
It can also be used to run a command from a file. The executable must exist on all nodes prior
to the command being run. For some commands (like df), the command already exists on all
nodes. If a user-written script is being executed, then it must be copied to all nodes using ncli
node clonefile or a similar mechanism. This effectively allows you to run commands in
parallel over SSH on the cluster. An example is:
$ ncli node runonall df
Similarly the runonother command is used to run the specified executable in parallel on all
nodes except the one from which the command is issued:
$ ncli node runonother cat /proc/sys/fs/file-max
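As a sketch of the clonefile-then-runonall workflow, a user-written script might look like the following. The script name, path, and check are illustrative, not from this guide; once copied with ncli node clonefile, the same path exists on every node:

```shell
#!/bin/sh
# Hypothetical health-check script to distribute and run cluster-wide:
#   ncli node clonefile /root/fscheck.sh   # copy it to every node first
#   ncli node runonall /root/fscheck.sh    # then run it everywhere
# The script body uses only standard tools, so it behaves the same on each node.
df -P / | awk 'NR==2 {print "root filesystem in use: " $5}'
```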
Table 10 - 15: ncli node Section

changeconfig <oldip> <newip> <newmac>
    Changes the configuration of a node's IP address and/or MAC address.
clonefile <filename>
    Copies the named file <filename> from this node to all other nodes in the cluster.
filesize <pattern>
    Computes the size of all files/directories matching the pattern(s), across all
    nodes in the cluster.
runonall <cmd>
    Runs the command <cmd> in parallel on all nodes.
runonother <cmd>
    Runs the command <cmd> in parallel on all nodes except this one.
show [summary]
    Shows cluster nodes and an optional summary table.
showcmd <cmd>
    Shows the output of command <cmd> for each of the nodes, in table format.
showcpuconfig [--ids]
    Shows the CPU configuration for all nodes, or just the nodes specified using the
    --ids flag.
showhwconfig
    Shows all node hardware commands at once.
showinterfaces
    Shows details for network interfaces.
showpci
    Shows the items on the PCI bus.
showstoragestats [--nodeids] [--aggregate]
    Shows storage statistics of the specified nodes, or an aggregate for all nodes.
showsummaryconfig
    Shows summarized node configurations.
showuid
    Shows node uid (unique identifier) information from config files.
showversion
    Shows detailed software version information.
ncli nsconfig Section
The nsconfig section provides commands for setting up nameservers and hosts on all nodes in
the cluster simultaneously. Some of this functionality is also available through the AMC (see
“Set Up Host Entries for all Nodes” on page 115 and “Set up DNS entries for all Aster
Database nodes” on page 117).
The syntax to run a command in the nsconfig section looks like this example:
$ ncli nsconfig show hosts
which shows all the entries in the /etc/hosts file that have been made through ncli and/
or the AMC. On a clean installation, it returns:
{
"hosts": []
}
After one host has been added through either the AMC or ncli, the result will be like:
{
"hosts": [
{
"comment": "Teradata server",
"ip": "10.31.120.100",
"aliases": [
"tdserver"
]
}
]
}
Note that the example above is also the format to use when creating a file of host entries to be
added through ncli. If there are multiple aliases for a particular host, separate them by
commas.
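For example, a hosts file defining two hosts, the second with multiple aliases, would look like the following (all addresses and names here are illustrative):

```json
{
  "hosts": [
    {
      "comment": "Teradata server",
      "ip": "10.31.120.100",
      "aliases": [ "tdserver" ]
    },
    {
      "comment": "hypothetical backup server",
      "ip": "10.31.120.101",
      "aliases": [ "bkserver", "backup01" ]
    }
  ]
}
```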
Add or modify hosts through ncli
You can modify the /etc/hosts for every node in the Aster Database cluster by creating a
“hosts file” and applying it through ncli. Note that doing so will not overwrite any existing
entries that were added manually to /etc/hosts. However, the file you apply will overwrite
any entries that were made through ncli or the AMC. So be sure the hosts file to be applied
contains not only any new entries you wish to add, but also any existing entries added through
the AMC or ncli that you wish to retain.
To apply a hosts file through ncli, issue the following command, substituting your own path to
the file with the entries to be added:
$ ncli nsconfig apply /home/beehive/myhostsfile hosts
The format for the hosts file is:
{
"hosts": [
{
"comment": "<comment>",
"ip": "<ipaddress>",
"aliases": [
"<alias1>"[, "<alias2>"[, ..."<aliasn>"]]
]
}
]
}
Add or modify nameservers through ncli
To add to or modify /etc/resolv.conf for nameservers, issue the following, substituting
your own path to the file with the entries to be added:
$ ncli nsconfig apply /home/beehive/mynameserversfile nameservers
For the nameservers file, the format is:
{
"nameservers": [
{
"comment": "<comment>",
"ip": "<ipaddress>"
}
]
}
Note that any existing entries in these files not made through ncli or the AMC will not be
overwritten, and only those entries created through ncli or the AMC may be replaced by the
ncli nsconfig apply <filePath> hosts|nameservers command.
Table 10 - 16: ncli nsconfig Section

apply <filePath> hosts|nameservers
    Applies the configuration stored at <filePath> to all nodes in the cluster, for
    either hosts or nameservers. See the section above this table for the format of the
    file to be applied.
show hosts|nameservers
    Shows the current configuration.
synccluster hosts|nameservers
    Synchronizes the configuration across the cluster. This command is useful when new
    nodes are added.
syncnode hosts|nameservers
    Synchronizes the configuration on the node where this is run. This command is
    useful when a new node has been added.
validate hosts|nameservers
    Validates the configuration.
ncli process Section
The process section provides commands related to running processes, specifically the
memcheck [<processnamefilter>] command, which reports on memory utilization. The
syntax to run a command in the process section looks like this example:
$ ncli process memcheck postgres
which displays results like:
Process Memory Usage
+------------+----------------------------------+--------------------+-------+
| Node IP    | Node ID                          | Process Name       | pid   |
+------------+----------------------------------+--------------------+-------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | postgres queenDb-0 | 3274  |
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | postgres queenDb-0 | 3270  |
| ...        |                                  |                    |       |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | postgres w6z       | 28799 |
+------------+----------------------------------+--------------------+-------+
27 rows
table continued...
+------------+----------+-------------+
| VSize (MB) | RSS (MB) | Shared (MB) |
+------------+----------+-------------+
| 720        | 62       | 56          |
| 720        | 59       | 53          |
| ...        |          |             |
| 139        | 6        | 0           |
+------------+----------+-------------+
Table 10 - 17: ncli process Section

memcheck [<processnamefilter>]
    Shows memory utilization for all processes. Optionally, provide a filter by process
    name or partial process name.
ncli procman Section
The procman section provides general tools for obtaining statuses from the process
manager. The syntax to run a command in the procman section looks like this
example:
$ ncli procman showjobs
which displays results like:
ProcMgmt nodes
+-------+-----------------------------------------------+--------------+
| JobId | Name                                          | Task Indices |
+-------+-----------------------------------------------+--------------+
| 8     | Txman on 10.50.129.100                        | 0 1          |
| 14    | sysmanExec on 10.50.129.100                   | 0            |
| 18    | AdmctlMonitorMgr on 10.50.129.100             | 0            |
| ...   |                                               |              |
| 232   | Net-SNMP on 10.50.129.101                     | 0            |
| 236   | HardwareStatCollectorExec on 10.50.129.101    | 0            |
| 240   | System SharedJVM on 10.50.129.101             | 0            |
+-------+-----------------------------------------------+--------------+
56 rows
Table 10 - 18: ncli procman Section

showjobs [<jobId> ...]
    Shows registered jobs. Optionally, shows only those registered jobs listed by
    <jobId>. Separate multiple jobIds by spaces.
shownodes
    Shows registered nodes.
showtasks [<jobId:taskIndex> ...]
    Shows registered tasks. Optionally, shows only those registered tasks listed by
    identifier <jobId:taskIndex>. The task index is the number assigned to each
    individual task that makes up a job. Task index assignment begins with zero (0) and
    proceeds until each task that makes up a job has a unique index. Separate multiple
    jobId:taskIndex references by spaces.
showusers
    Shows registered OS users.
ncli qos Section
The qos section allows you to view details related to Workload Management and admission
limits.
The Workload Management and Admission Limits commands connect to the QosManager to set,
edit, remove, or show the statistics, settings, and rules for concurrency, workload
management, and admission limits. For example, you can query the admission queue to see
why a particular task is still queued and not yet admitted. The available commands are
listed in Table 10 - 19.
The syntax to run a command in the qos section looks like this example:
$ ncli qos showconcurrency
which displays results like:
Concurrency is 100
NOTE: Escaping for command line arguments is required.
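For instance, because a setadmissionlimit predicate is a SQL WHERE clause containing spaces and quoted literals, it must be escaped so the shell passes it as a single argument. The limit name, value, and predicate below are hypothetical:

```shell
# Hypothetical call -- quote the predicate so it reaches ncli as one argument:
#   ncli qos setadmissionlimit batch-jobs 3 "username = 'etl'"
# Double quotes keep the WHERE clause intact while preserving the inner
# single quotes that SQL needs around the string literal:
printf '%s\n' "username = 'etl'"   # prints: username = 'etl'
```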
Table 10 - 19: ncli qos Section

cancel <sessionid>
    Cancels the specified query.
canceladmission <sessionid>
    Cancels the specified query from being admitted.
deleteadmissionlimit <name>
    Removes a currently configured admission limit.
setadmissionlimit <name> <limit> <predicate>
    Configures a new or existing admission limit with a unique name <name>, maximum
    concurrency limit <limit> (limit=0 implies deny/cancel), and predicate <predicate>.
    The predicate must be a valid SQL WHERE clause.
setconcurrency <concurrency>
    Sets and then displays the maximum query concurrency. This setting can be used as a
    Global Admission Threshold to hold all tasks under a certain concurrency limit.
showadmissioncontrols
    Shows the QoS admission controls.
showadmissionlimits
    Shows all the currently configured admission limits.
showadmissionqueue
    Shows all queued tasks, the reason each task is queued, and other statistics.
showadmissionstats
    Shows the current admission limits, statistics, and counters or charges against
    those limits. This shows the impact of the currently running tasks on each limit.
    For example, "Skipped because 'batch-jobs' has reached limit of 3".
showall
    Shows all of the QoS-related data.
showconcurrency
    Shows the maximum query concurrency.
showcpucgroups
    Shows all of the CPU cgroup tasks.
showevalstats
    Shows predicate evaluation statistics.
showmemcgroups
    Shows all of the memory cgroup tasks.
showprocesses <sessionid> ...
    Shows processes under QoS, optionally filtered by sessionid.
showrules
    Shows the QoS rules.
showserviceclasses
    Shows the QoS service classes.
showsessiondetails <sessionid> ...
    Shows the QoS session details, optionally filtered by sessionid.
showsessions <sessionid> ...
    Shows the QoS sessions, optionally filtered by sessionid.
showslavesessions <sessionid> ...
    Shows the QoS sessions from the QosSlave processes, optionally filtered by
    sessionid.
ncli query Section
The query section provides commands to obtain information about active queries, recent
queries, and usage statistics. The syntax to run a command in the query section looks like this
example:
$ ncli query showqueryexecutiontime
which returns a result like:
Process Execution Time Summary
+---------------+-------+
| Time          | Count |
+---------------+-------+
| < 10 sec      | 58    |
| 10 sec-10 min | 1     |
| 10 min-1 hr   | 1     |
| 1-4 hrs       | 0     |
| 4-8 hrs       | 0     |
| >8 hrs        | 0     |
+---------------+-------+
6 rows
Table 10 - 20: ncli query Section

cancelprocess <process_id>
    Sends a cancel request for the query with the given process id.
process_phase [--process_ids=<process_id>,<process_id>..]
              [--max=<num_of_processes (default:10)>]
    Shows the phase for the given process ids. Use --max=<num-of-processes> to specify
    the number to return. If not specified, the default is 10.
process_phase_statements [--process_ids=<process_id>,<process_id>..]
                         [--max=<num_of_processes (default:10)>]
    Shows the statement phase information for the given process ids. Use
    --max=<num-of-processes> to specify the number to return. If not specified, the
    default is 10.
process_statements [--process_ids=<process_id>,<process_id>..]
                   [--max_statement_len=<length>]
    Shows the result set of statements corresponding to the given list of process
    identifiers. You can limit the length of statements in characters by specifying
    --max_statement_len=<length>.
Table 10 - 20: ncli query Section (continued)

processes {<processfilter>}
    Shows queries that have run recently, filtered by the processfilter value. Any
    plural values you wish to specify for processfilter can be expressed in
    comma-delimited form. Valid values for processfilter are as follows:
    [--process_ids=<process or statement ids>]
    [--users=<users>]
    [--databases=<databases>]
    [--execution_time_operator=< ">" | "<" | "<=" >]
    [--query_text]
    [--verbose]
    [--summary] (If this is specified, the output will include a separate table with
    the counts of each process by status.)
    [--statuses=<completed, error, running, pending, canceled>]
showall
    Shows all queries in the system.
showlongestprocess
    Shows the longest running query within the last 24 hours.
showmostactiveuser [--verbose]
    Shows the most active users within the last 24 hours.
shownoderesourceusage <sessionid> ...
    Shows per-node query resource usage for the specified queries.
showprocessexecutiontime
    Shows process execution time within the last 24 hours.
showprocessresourceusage <sessionid> ...
    Shows per-process query resource usage for the specified queries.
showrecent <count>
    Shows the most recent queries.
showrunning
    Shows the running queries in the system.
showsystemresourceusage <sessionid> ...
    Shows system-wide query resource usage for the specified queries.
workload_policies
    Shows the workload policies defined in the system.
workload_service_classes
    Shows the workload service classes defined in the system.
Cancel a Running Query
To cancel a running query, do the following:
1. Find the statement identifier (statementid) of the query by issuing either ncli query
   showrecent or ncli query showrunning. For example:
   # ncli query showrecent 1
   +---------------------+---------------------+---------------------------+
   | sessionid           | statementid         | statement                 |
   +---------------------+---------------------+---------------------------+
   | 6484512000058641377 | 8004077366722303879 | select * from sales_fact; |
   +---------------------+---------------------+---------------------------+
   1 rows
   table continued...
   +---------------------+----------+----------+---------+
   | start_time          | end_time | duration | running |
   +---------------------+----------+----------+---------+
   | 2013-04-03 22:27:28 | None     | 0:06:50  | Y       |
   +---------------------+----------+----------+---------+
2. Make note of the statementid for the query you want to cancel.
3. Cancel the query:
   # ncli query cancelprocess 8004077366722303879
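The lookup-then-cancel steps above can be scripted. Since ncli needs a live cluster, the parsing stage is demonstrated here against a captured sample row; the awk field handling assumes the pipe-delimited layout shown above:

```shell
# In practice (requires a running cluster):
#   sid=$(ncli query showrecent 1 | awk -F'|' '/[0-9]/ {gsub(/ /,"",$3); print $3; exit}')
#   ncli query cancelprocess "$sid"
# Demonstrated on a captured sample row, where field 3 is the statementid:
sample='| 6484512000058641377 | 8004077366722303879 | select * from sales_fact;'
sid=$(printf '%s\n' "$sample" | awk -F'|' '{gsub(/ /,"",$3); print $3}')
echo "$sid"   # prints: 8004077366722303879
```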
ncli replication Section
The replication section provides commands related to Aster Database replication. The syntax
to run a command in the replication section looks like this example:
$ ncli replication showgoal
which returns a result such as:
Replication factor goal is 2
Table 10 - 21: ncli replication Section

showdetailedrpcstats
    Shows detailed replication RPC statistics.
showgoal
    Shows the goal replication factor.
showsummaryrpcstats
    Shows summarized replication RPC statistics.
ncli session Section
The session section provides commands related to Aster Database sessions. The syntax to run
a command in the session section looks like this example:
$ ncli session show
which returns a result such as:
Sessions
+---------------------+---------+-----------+-----------+--------------+
| session_id          | user_id | user_ip   | queen_pid | db_name      |
+---------------------+---------+-----------+-----------+--------------+
| 5047778509961726353 | beehive | 127.0.0.1 | 17438     | beehive      |
| 1914926430725305282 | beehive | 127.0.0.1 | 31349     | beehive      |
| 283128356804756573  | beehive | 127.0.0.1 | 29621     | beehive      |
| 1627772863387028473 | beehive | 127.0.0.1 | 29668     | retail_sales |
| 7450466840935701437 | beehive | 127.0.0.1 | 29765     | beehive      |
| 1365127267625199444 | beehive | 127.0.0.1 | 12173     | beehive      |
+---------------------+---------+-----------+-----------+--------------+
6 rows
table continued...
+---------------+---------------+---------------+----------------+--------------------+
| session_state | start_time    | end_time      | login_duration | running_process_id |
+---------------+---------------+---------------+----------------+--------------------+
| closed        | 1337538898000 | 1337540698000 | 1800000        |                    |
| closed        | 1337754018000 | 1337754022000 | 4000           |                    |
| closed        | 1337798303000 | 1337798318000 | 15000          |                    |
| closed        | 1337798318000 | 1337798332000 | 14000          |                    |
| closed        | 1337798380000 | 1337798473000 | 93000          |                    |
| idle          | 1336695793000 | None          | None           |                    |
+---------------+---------------+---------------+----------------+--------------------+
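The start_time and end_time values in this output are epoch timestamps in milliseconds, and login_duration is a duration in milliseconds (the first session's 1800000 ms is its 30-minute span). One way to make the timestamps readable, assuming GNU date is available:

```shell
# Convert an epoch-milliseconds start_time to a readable UTC timestamp:
ms=1337538898000
date -u -d "@$((ms / 1000))" +'%Y-%m-%d %H:%M:%S'   # prints: 2012-05-20 18:34:58
```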
Table 10 - 22: ncli session Section

show [--ids] [--status]
    Shows session information for the specified session ids and/or statuses.
ncli sqlh Section
The sqlh section provides tools for configuring SQL-H. Alternatively, you can use the AMC
to create, edit, and delete these configurations. The syntax to run a command in the sqlh
section looks like this example:
$ ncli sqlh showservers
which displays a result like:
Server config
+----------------+---------+------+---------+--------------+
| server         | version | port | network | comment      |
+----------------+---------+------+---------+--------------+
| hdp1.aster.com | HDP1.1  | 9083 | private | test cluster |
+----------------+---------+------+---------+--------------+
1 rows
Table 10 - 23: ncli sqlh Section

deleteserverconfig <server>
    Deletes the SQL-H configuration for a server.
setserverconfig --server --version [--port] [--network=<public|private>] [--comment]
    Sets the SQL-H configuration for a server. The server and version are required; the
    other values are optional. Use the hostname for the server. Valid versions are:
    HDP1.1.
showservers
    Shows the SQL-H server configurations.
showversions
    Shows supported Hadoop distributions and versions.
ncli sqlmr Section
The sqlmr command section is for monitoring the execution of SQL-MR. It provides
configuration, viewing, filtering, and downloading of the SQL-MR log files. The syntax to run
a command in the sqlmr section looks like this example:
$ ncli sqlmr logshowconfig
which returns results like:
Sqlmr Log Rotation Configuration
+---------------+---------------------------+---------------------------+
| Worker IP     | Minimum Duration(seconds) | Maximum Duration(seconds) |
+---------------+---------------------------+---------------------------+
| 141.206.66.26 | 86400                     | 259200                    |
| 141.206.66.27 | 86400                     | 259200                    |
+---------------+---------------------------+---------------------------+
2 rows
table continued...
+----------------+-------------------------------------------+
| Disk Quota(kB) | Status                                    |
+----------------+-------------------------------------------+
| 5242880        | Config file does not exist. Used defaults |
| 5242880        | Config file does not exist. Used defaults |
+----------------+-------------------------------------------+
Table 10 - 24: ncli sqlmr Section

logdownload [statementId [file]]
    Downloads the log files for the specified statements. The default is the latest
    statement.
logresetconfig [--minDuration=<X>] [--maxDuration=<Y>] [--quota=<Z>]
    Resets the SQL-MR log parameters. Specify the duration using one of the units:
    • S for seconds
    • M for minutes
    • H for hours
    • d for days
    • w for weeks
    • m for months
    • y for years
    The default is S (seconds).
    Specify the quota unit using one of:
    • B for bytes
    • K for kilobytes
    • M for megabytes
    • G for gigabytes
    • T for terabytes
    The default is B (bytes).
logshow [statementId]
    Shows the log files for the specified statement. The default is the latest
    statement.
logshowconfig
    Shows the current SQL-MR log rotation configuration.
logshowdonebetween <start time [end time (default:now)]>
    Shows the log files for the statements that finished between the specified start
    time and end time. Use the format yyyy-mm-dd-HH:MM:SS.
logshowrecentlydone [count]
    Shows the log files for recently finished statements. Optionally limit the number
    of statements by specifying a count. The default count is one.
ncli statsserver Section
Normally, you would use the AMC for this type of information. See “Administrative
Operations” on page 90. The ncli statsserver commands are good diagnostic tools to use if the
AMC is not available.
The syntax to run a command in the statsserver section looks like this example:
$ ncli statsserver showclusterstatus
which returns results like:
ClusterStatus
+--------------------------+--------+
| property                 | value  |
+--------------------------+--------+
| clusterStatus            | Up     |
| replicationFactor        | 2      |
| minimumReplicationFactor | 1      |
| goalReplicationFactor    | 2      |
| clusterType              | ec2    |
| distroName               | redhat |
+--------------------------+--------+
6 rows
IncorporationStatus
+----------------------+------------+
| property             | value      |
+----------------------+------------+
| activating           | False      |
| replicating          | False      |
| activationImbalanced | False      |
| dataImbalanced       | False      |
| timestamp            | 1338916002 |
+----------------------+------------+
5 rows
BackupStatus
+----------+-------+
| property | value |
+----------+-------+
+----------+-------+
0 rows
Table 10 - 25: ncli statsserver Section

showactivestatements [resolveRunningOrPending]
    Shows running statements in the StatsServer. Use resolveRunningOrPending if you
    wish to distinguish between statements that are running and those that are pending,
    for statements that have not yet ended.
showclusterstatus
    Shows the cluster status from the StatsServer.
showhwstats [--nodeids] [--metrics] [--aggregate] [--function=[latest|average]]
            [--sincetime]
    Shows hardware stats from the StatsServer. Use the flags to filter the results. See
    “Monitor Hardware” on page 54.
shownodes [--nodeids] [epochtime | utctime]
    Shows a list of nodes from the StatsServer. Use the flag --nodeids to filter by
    node and epochtime or utctime to specify the time format.
showphases <statementid>
    Shows statement phases from the StatsServer, optionally filtered by statementid.
showsessions
    Shows active sessions in the StatsServer.
showstatements [resolveRunningOrPending]
    Shows statements in the StatsServer. Use resolveRunningOrPending if you wish to
    distinguish between statements that are running and those that are pending, for
    statements that have not yet ended.
showvworkers
    Shows the vworkers from the StatsServer.
ncli sysman Section
The sysman section provides commands related to the system manager process. The syntax to
run a command in the sysman section looks like this example:
$ ncli sysman memcheck
which returns a result like:
------------------------------------------------
class  1 [     8 bytes ] :    496 objs;  0.0 MB;  0.0 cum
class  2 [    16 bytes ] :    233 objs;  0.0 MB;  0.0 cum
class  3 [    32 bytes ] :    167 objs;  0.0 MB;  0.0 cum
...
class 42 [  4608 bytes ] :      3 objs;  0.0 MB;  0.6 cum
class 47 [  8192 bytes ] :      1 objs;  0.0 MB;  0.7 cum
class 53 [ 16384 bytes ] :      2 objs;  0.0 MB;  0.7 cum
------------------------------------------------
PageHeap: 1 sizes; 0.8 MB free
------------------------------------------------
206 pages * 1 spans ~ 0.8 MB; 0.8 MB cum; unmapped: 0.0 MB; 0.0 MB cum
Normal large spans: ... MB
Unmapped large spans: ... MB
>255 large * 0 spans ~ 0.0 MB; 0.8 MB cum; unmapped: 0.0 MB; 0.0 MB cum
------------------------------------------------
DevMemSysAllocator: failed_=0
SbrkSysAllocator: failed_=0
MmapSysAllocator: failed_=0
------------------------------------------------
MALLOC:   3145728 (  3.0 MB) Heap size
MALLOC:   1581104 (  1.5 MB) Bytes in use by application
...
MALLOC:         1            Thread heaps in use
MALLOC:   5242880 (  5.0 MB) Metadata allocated
------------------------------------------------
Sysman Process Memory Statistics (PID 11535)
+---------------------+----+
| Virtual Memory Size | 92 |
| Resident Set Size   | 9  |
+---------------------+----+
Table 10 - 26: ncli sysman Section

demerits
    Reports on vworker/node demerits.
logclusterview
    Logs detailed information in the sysman log.
memcheck
    Shows a report on memory utilization.
ping
    Checks to determine if sysman is up.
showactivitystatus
    Shows sysman activity status.
showreplicas vworkerid [vworkerid ...]
    Shows vworker replicas, optionally filtered by vworkerid.
showrf
    Shows the cluster's current and target replication factor.
showversion
    Shows the sysman version string.
showvworkers
    Shows the cluster view of virtual workers.
ncli system Section
The system section provides commands related to Aster Database status display and control.
The syntax for commands in the system section looks like this example:
$ ncli system softrestart
Note: Setting the QoS concurrency to zero will still allow any new queries that are part of an open transaction. Soft
Restart does not wait until open transactions are finished; it rolls back open transactions and then does the restart.
Most of the commands in the system section duplicate some of the functionality available
through the AMC. Exposing them through ncli enables you to run those commands even if
the AMC is not running or you do not have access to the AMC for whatever reason. Because
they are so powerful, the commands softrestart and softshutdown must be run by the
root OS user. The softrestart command may be issued on a cluster after it has been shut
down, or to restart it while it is running. The softrestart command should be issued for
the first time on a new cluster only after the workers have attained a status of Prepared
(you can check the worker status in the AMC). When a node reboots, it may pass through the
states of New to Preparing to Upgrading before reaching Prepared. This is normal.
Tip: Using the commands softrestart and softshutdown through ncli is preferable to issuing them the
old way using the Python utilities SoftShutdownBeehive.py and SoftRestartBeehive.py
located by default in the directory /home/beehive/bin/utils/primitives/. If AMC is running, it is
even better to use the Soft Restart or Hard Restart buttons in the Admin > Cluster Management tab, because
they can be used without logging in to the system at the command line, and you can see the cluster status during the
restart.
Table 10 - 27: ncli system Section

activate
    Activates Aster Database. See “Activate Aster Database” on page 120.
addnode <worker or loader> <ip or mac> [--clean] [group <group name>]
        [display <display name>]
    Adds/registers a new node. See “Add New Nodes to the Cluster” on page 71.
balancedata
    Balances data in Aster Database. See “Balance Data” on page 122.
balanceprocess
    Balances Aster Database processes. See “Balance Process” on page 123.
    Note: Balance Process will cancel any running queries. Verify that there are no
    running queries on the system before doing this operation.
changepartitioncount <newpartitioncount> {<parallelism>}
    Changes the partition count and the parallelism to the values specified. See “Split
    Partitions” on page 79.
changepartitioncountstatus
    Shows the progress made in changing the partition count.
removenode <ip address> [--force] [--dryRun]
    Removes/unregisters the node specified by ip address. Use --dryRun to test whether
    the node is eligible to be removed. Use --force to force removal if necessary (if
    internal checks fail). See “Check Hardware Configuration” on page 91.
show
    Shows Aster Database queen status. Note that a status of Up means only that the
    queen is running; it does not tell you if the cluster is ready to accept queries.
    For that, use the ncli node show command as described in “ncli node Section” on
    page 167.
showpartitioncount
    Shows the system partition count.
softrestart
    Issues a soft restart to Aster Database. You must be root to issue this command.
    See also: “Soft Restart” on page 119.
softshutdown
    Issues a soft shutdown to Aster Database. You must be root to issue this command.
    See also: “Soft Shutdown” on page 120.
softstartup
    Performs a soft startup of Aster Database. You must be root to issue this command.
    See also: “Soft Startup” on page 125.
ncli tables Section
The tables section provides general tools for returning information about tables. The syntax to
run a command in the tables section looks like the examples shown below.
First, you must gather information by issuing:
$ ncli tables gathertableinfo --forcerun=true
which displays results like:
Table space information has been recorded in /home/beehive/data/tmp/
table_space_data_beehive
Table space information has been recorded in /home/beehive/data/tmp/
table_space_data_retail_sales
Then you can display the information gathered by issuing:
# ncli tables showtableinfo
which returns results like:
Table Spaces
+--------------+---------+-------------+----------------+-------------+
| dbname       | vworker | schema_name | table_name     | compression |
+--------------+---------+-------------+----------------+-------------+
| beehive      | w6z     | public      | employees      | none        |
| beehive      | w5z     | public      | employees      | none        |
| retail_sales | w5z     | public      | customer_index | none        |
| retail_sales | w5z     | public      | region_dim     | none        |
...
| retail_sales | w6z     | public      | geo_dim        | none        |
| retail_sales | w6z     | public      | date_dim       | none        |
+--------------+---------+-------------+----------------+-------------+
table continued...
+------------+------------+-----------------+------------------+
| table_type | total_size | total_disk_size | dead_tuple_count |
+------------+------------+-----------------+------------------+
| row        | 32768      | 32768           | 0                |
| row        | 0          | 0               | 0                |
| row        | 11698176   | 11698176        | 0                |
| row        | 32768      | 32768           | 0                |
...
| row        | 32768      | 32768           | 0                |
| row        | 32768      | 32768           | 0                |
+------------+------------+-----------------+------------------+
Table 10 - 28: ncli tables Section

gathertableinfo [--configfile] [--dbnames] [--forcerun]
    Gathers table information and saves it to temporary storage. The automatic process
    that gathers table information runs periodically; however, you can force it to run
    at any time by specifying the --forcerun=true flag. Use the --configfile flag to
    specify where the results should be written, and use --dbnames to supply the
    databases for which to gather information.
showtableinfo [--tables] [--dbnames] [--aggregate]
    Shows table information collected by ncli tables gathertableinfo. Use --dbnames or
    --tables to filter results by database and/or table. Use --aggregate to display
    logical partitioned table statistics in aggregate, rather than per partition.
ncli util Section
Typically, ncli commands generate output in the form of named tables. The util section allows
the output of other ncli commands to be used as table data sources in a SELECT SQL query.
This lets users combine the output of multiple ncli commands using JOINs, GROUP BYs, or
other constructs.
The basic syntax to run a command in the util section looks like this example:
$ ncli util sql <SELECT sql query>
Note that only SELECT statements are permitted.
You might issue the following to view the version of the cluster:
$ ncli util sql "SELECT distinct build_version FROM (ncli node showversion)"
which displays results like:
+----------------+
| build_version  |
+----------------+
| beehive-r28783 |
+----------------+
1 rows
For details about running more complex queries, review the online help by issuing the
following:
$ ncli --help util sql
Table 10 - 29: ncli util Section

sql <select sql query>
    Allows you to use an ncli command within an SQL query.
ncli vworker Section
The vworker section provides commands related to Aster Database vworkers. A vworker
represents a single instance of the Aster Database data management software on an Aster
Database node (machine). Each Aster Database node typically has a number of vworkers
running on it.
The syntax to run a command in the vworker section looks like this example:
$ ncli vworker show
which returns a result like the following, where Node is the IP address of the Aster Database
node machine, Status is the current operational status of one or more vworkers on the node,
and Count is the number of vworkers that currently have that status on that node:
vworkers
+---------------+-------------+---------------+
| Node          | Status      | VWorker Count |
+---------------+-------------+---------------+
| 10.50.129.100 | Deactivated | 0             |
| 10.50.129.100 | Active      | 1             |
| 10.50.129.101 | Active      | 1             |
| 10.50.129.101 | Deactivated | 1             |
| 10.50.129.102 | Active      | 1             |
| 10.50.129.102 | Deactivated | 2             |
+---------------+-------------+---------------+
6 rows
Tip: With any command in the vworker section, you can use flags to limit commands to only specific machine(s) or
type(s) of nodes. See the section ncli Flags (page 187) for information on using flags.
Table 10 - 30: ncli vworker Section

show
    Summarizes operational status of the vworkers.

showconfigsignature
    Shows vworkers’ configuration signatures.

showdetail [shownodeid]
    Shows vworkers’ details, with Node ID if shownodeid is specified.
ncli Flags
Flags in ncli allow you to modify the actions of commands. For example, you can constrain
commands to only worker nodes, only loader nodes, only the local machine, or only specific
IP addresses. You can also use flags to control output by formatting reports.
There are two types of flags:
• High level flags affect any command they are used with.
• Command related flags apply only to particular commands.
The online help for command related flags appears with the command they are used with. The
online help for high level flags appears when invoking the main online help by issuing:
$ ncli --help
Command related flags are listed with their commands in the “ncli Command Reference” on
page 152. High level flags are discussed in this section. The syntax for using the high level flags
is as follows:
$ ncli <flag>=<parameter1,parameter2,...parametern> <section> <command>
Limit Actions of ncli
To limit the actions of ncli to specific machines, use the --hosts, --hostsfile, or
--hosttype flag. These flags are used most commonly with the node section, but may be used
with any command. To limit the number of node threads, use the --nodethreads flag.
--hosts flag
The --hosts flag tells ncli to return only results related to the specified Aster Database node
or nodes. You may use IP address(es), hostname(s), “localhost”, or a mixture of these.
Some examples using the --hosts flag are:
$ ncli --hosts=localhost node showsummaryconfig
$ ncli --hosts=10.75.10.221,10.75.10.222 vworker showdetail
--hosttype flag
The --hosttype flag tells ncli to return only results related to the specified type of Aster
Database node. It works with the node section and accepts the argument worker, queen, or
loader. To specify more than one type, a comma separated list may be used. For example:
$ ncli --hosttype=worker,loader node showsummaryconfig
--hostsfile flag
Allows you to supply a file containing one host per line as input. The command issued will act
on the hosts listed in the file. An example using the --hostsfile flag is:
$ cat > /tmp/myhosts
10.50.129.100
10.50.129.101
10.50.129.102
Ctrl-D
$ ncli --hostsfile=/tmp/myhosts node showcmd cat /proc/loadavg
Command Output for cat /proc/loadavg
+-------------+------+------------------------------+--------+
| Node        | exit | stdout                       | stderr |
+-------------+------+------------------------------+--------+
| 10.75.10.23 | 0    | 6.83 12.24 13.37 4/954 23178 |        |
| 10.75.10.25 | 0    | 14.95 11.08 9.35 3/959 19246 |        |
+-------------+------+------------------------------+--------+
2 rows
--nodethreads
The --nodethreads flag allows you to specify the number of node threads to run in parallel.
For ncli operations that need to gather information from multiple worker nodes, ncli spawns
SSH processes to connect to the nodes via threads. By default, the size of the threadpool used
is 20. This means if you are operating on a cluster of 40 nodes, the initial communication will
be with only 20 nodes. As the threads complete their operation on the individual nodes, they
will start running the same operation on the nodes that remain. So the data is still fetched
from all 40 nodes, but will parallelize to only 20 nodes at any one time. The --nodethreads
flag provides the option to control this degree of parallelism.
An example is:
$ ncli --nodethreads=10 node runonall uptime
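The threadpool behavior described above can be sketched in Python (a simplified illustration with a hypothetical fetch function, not ncli's actual implementation): a bounded pool of threads drains the node list, so only the pool-size number of nodes is contacted at once.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the SSH call ncli makes to each node.
def fetch_load_average(node):
    return (node, "0.42 0.40 0.38")

nodes = ["10.50.129.%d" % i for i in range(100, 140)]  # 40 nodes

# With a pool of 20 threads, at most 20 nodes are contacted at once;
# as each thread finishes, it picks up one of the remaining nodes,
# so data is still fetched from all 40.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(fetch_load_average, nodes))

print(len(results))
```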
--vworkers
Specifies the vworkers to act upon.
Format and Sort Flags
The formatting and sorting flags allow you to change the way results are displayed.
They include --tablefilterregex, --tableformat, --tabletype, and --tablesortcolname.
--tablefilterregex
When this flag is set, it is treated as a regular expression that table titles are matched against.
Only tables with titles that match against the regular expression are displayed. This is useful if
a particular ncli command outputs multiple tables, but only a single table is desired.
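The effect of this flag can be illustrated with a short Python sketch (the table titles below are hypothetical; ncli itself does the matching internally):

```python
import re

# Hypothetical titles of the tables emitted by one ncli command.
titles = ["vworkers", "nodes", "vworker config"]

# The value that would be passed via --tablefilterregex.
title_filter = re.compile("vworker")

# Only tables whose titles match the regular expression are displayed.
kept = [t for t in titles if title_filter.search(t)]
print(kept)
```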
--tableformat flag
The --tableformat flag tells ncli to return the results in a particular table format. Available
format options are json or cli.
The cli option is intended to format results to display on screen in a table format, but may
also be written to a file. The default setting is cli and it does not need to be specified.
For example, issuing:
$ ncli node show
gives the following result, in cli format:
+---------------+--------+--------+
| Node          | Type   | Status |
+---------------+--------+--------+
| 10.50.129.100 | queen  | Active |
| 10.50.129.101 | worker | Active |
| 10.50.129.102 | worker | Active |
+---------------+--------+--------+
3 rows
The json option formats results in JSON (JavaScript Object Notation) format. For more
information on JSON, see the URL, http://www.json.org.
Use the json option when you plan to parse the output with a script: capture the output as a
string, parse it as JSON, and pass the resulting values to the script. You can also use the json
option to view the results on screen in order to check what is being passed.
Here is an example:
$ ncli --tableformat=json node show
gives the result:
{
"header": ["Node", "Type", "Status"] ,
"rows":
[
["10.50.129.100", "queen", "Active"] ,
["10.50.129.101", "worker", "Active"] ,
["10.50.129.102", "worker", "Active"]
]
}
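Because JSON output is machine-readable, a script can parse it directly. A minimal Python sketch using the sample output above (here the payload is pasted in as a string; in practice you would capture it from the ncli process):

```python
import json

# Output of: ncli --tableformat=json node show
payload = """
{
  "header": ["Node", "Type", "Status"],
  "rows": [
    ["10.50.129.100", "queen", "Active"],
    ["10.50.129.101", "worker", "Active"],
    ["10.50.129.102", "worker", "Active"]
  ]
}
"""

table = json.loads(payload)
# Zip each row with the header to get one dict per node.
nodes = [dict(zip(table["header"], row)) for row in table["rows"]]
workers = [n["Node"] for n in nodes if n["Type"] == "worker"]
print(workers)
```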
--tabletype flag
Options for the --tabletype flag are normal and diff. The --tabletype=normal flag
displays one row for each row of data returned. The --tabletype=diff flag tells ncli to
return only one row for each unique result, effectively grouping by like values. This is useful
when comparing the settings of many nodes.
Here are two code examples with their resulting reports, first with the default (normal)
tabletype:
$ ncli node show
+--------------+--------+--------+
| Node         | Type   | Status |
+--------------+--------+--------+
| 10.75.10.11  | queen  | Active |
| 10.75.10.12  | worker | Active |
| 10.75.10.13  | worker | Active |
| 10.75.10.14  | worker | Active |
| 10.75.10.15  | worker | Active |
. . .
| 10.75.10.23  | worker | Active |
| 10.75.10.240 | loader | Active |
| 10.75.10.241 | loader | Active |
| 10.75.10.243 | loader | Active |
| 10.75.10.25  | worker | Active |
| 10.75.10.26  | worker | Active |
+--------------+--------+--------+
18 rows
and then with --tabletype=diff :
$ ncli --tabletype=diff node show
+-------+--------------+--------+--------+
| Count | Sample Node  | Type   | Status |
+-------+--------------+--------+--------+
|     1 | 10.75.10.11  | queen  | Active |
|     3 | 10.75.10.240 | loader | Active |
|    14 | 10.75.10.13  | worker | Active |
+-------+--------------+--------+--------+
3 rows
Notice how the resulting table shows the results grouped by the column ‘Type’.
The --tabletype=diff flag is especially useful for detecting discrepancies in configuration
or performance among nodes. The following example shows the status of vworker processes
grouped by status and node. It gives you a quick look at how many active and inactive
vworkers exist on the nodes, and whether there is data skew, without having to weed through
a list of every single vworker’s status:
$ ncli --tabletype=diff vworker show
vworkers
+-------+-------------+-------------+---------------+
| Count | Sample Node | Status      | Vworker Count |
+-------+-------------+-------------+---------------+
|     1 | 10.75.10.11 | Deactivated | 0             |
|     1 | 10.75.10.13 | Deactivated | 4             |
|     1 | 10.75.10.11 | Active      | 1             |
|     2 | 10.75.10.17 | Active      | 6             |
|     4 | 10.75.10.12 | Deactivated | 6             |
|     9 | 10.75.10.25 | Deactivated | 5             |
|    12 | 10.75.10.26 | Active      | 5             |
+-------+-------------+-------------+---------------+
7 rows
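The grouping that --tabletype=diff performs amounts to counting unique value combinations. A rough Python sketch of the idea (not the actual ncli code), using the 18-node data from the earlier node show example:

```python
from collections import Counter

# (Type, Status) for each node, as one row per node in normal mode:
# 1 queen, 14 workers, and 3 loaders, all Active (18 rows total).
rows = ([("queen", "Active")]
        + [("worker", "Active")] * 14
        + [("loader", "Active")] * 3)

# diff mode: one output row per unique combination, with a count.
counts = Counter(rows)
for (node_type, status), count in sorted(counts.items()):
    print(count, node_type, status)
```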
--tablesortcolname flag
The --tablesortcolname flag tells ncli to return the results in a particular order by sorting
on the specified column. It takes as its argument the name of any column in the command’s
results. For example, to view details for vworkers sorted by partition, issue:
$ ncli --tablesortcolname='Partition' vworker showdetail
If you specify a column that does not exist, ncli returns an empty table with column headings
so you can see the available column names.
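The column-based sorting can be sketched in Python (illustrative only; the header and rows below are hypothetical):

```python
# Header and rows of a hypothetical ncli result table.
header = ["Node", "Partition", "Status"]
rows = [
    ["10.50.129.101", "2", "Active"],
    ["10.50.129.100", "0", "Active"],
    ["10.50.129.102", "1", "Active"],
]

# Look up the column named via --tablesortcolname, then sort on it.
col = header.index("Partition")
rows_sorted = sorted(rows, key=lambda r: r[col])
print(rows_sorted)
```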
Miscellaneous Flags
--help
Prints the online help for ncli.
--verbose
Increases verbosity by providing more detailed error messages when errors occur.
--version
Displays the version number of ncli.
--flagfile
Inserts flag definitions from the given file into the command line. This is useful when
specifying many flags for one command.
--minLogSeverity
Sets the minimum log severity for ncli to the supplied integer. Valid values are 2 (verbose)
through 7 (fatal). The default value is 4.
--undefok
Used to specify a comma-separated list of flag names that may be specified on the command
line, even if ncli does not define a flag with that name. It is important to note that flags in this
list that take arguments MUST use the --flag=value format. The default value for
--undefok is ''.
CHAPTER 11
Executables
The Aster Database Executables framework is a set of script management tools that allow
Aster Database administrators to create, manage, and run custom scripts on one or many
nodes in their cluster. Scripts can be shell scripts, SQL scripts, or can invoke SQL-MapReduce
functions. Scripts can be run on any node on the cluster, or they can be restricted to run on
only specified nodes.
AMC Executables provide an easier way to diagnose cluster issues, such as data skew, and
perform routine cluster maintenance. Prior to AMC Executables, you could create custom
scripts to provide these benefits, but they had to be run by a user logging in through a shell,
which could be inconvenient depending on security and IT policies. There was also no
provision for creating a library of scripts before AMC Executables.
This section explains how to store, run, and manage your Aster Database scripts using the
AMC. The following topics are covered:
• Executables Tab
• Running Scripts
• Creating Scripts
• Best practices for building scripts
• Upgrades
Executables Tab
You must be logged in as an administrator user to access the Executables tab. Access the
Executables tab in the AMC by clicking on the Admin tab and choosing Executables from the
submenu. The Executables tab has two views:
• Executables Library
• Executable Jobs
Executables Library
The Executables Library provides a list of all available scripts, both those that come preinstalled
and any custom scripts. Each script includes information on what variables are needed to run
it, who created it, and when it was created. There is a Run Now button to invoke the script, a
pencil icon to edit the script, and an X icon to delete it (if it is a custom script).
Tip: Some of the functions that used to appear in the AMC > Admin > Executables Library screen have been
replaced by newer functions beginning in Aster Database 5.10. After a clean install, you will not see these functions in
the executables library. But if you have upgraded from an older version, the function names still appear, with the term
"(Not Supported)" within the function descriptions. The function nc_relationstats replaces the functionality of these
older, unsupported executables. For more information on these, see Executables Not Supported from Prior Versions.
Figure 1: AMC Executables Library
Preinstalled Scripts
Aster provides several preinstalled scripts, which install automatically with a clean install or
upon upgrading. These scripts perform cluster administration tasks, such as finding data skew
and determining table information such as size. These scripts cannot be modified or deleted,
but they serve as a useful reference when creating your own custom scripts. Many of the
scripts cascade, which means that if they are acting on a parent table, they will automatically
act on all of its descendants as well.
You may view the code from these scripts by selecting the pencil icon. A window will appear
with information about the script and the code itself, all read-only. By clicking the Clone
button, you will create a copy of the script that may be edited for use in creating your own
scripts.
The Aster Database preinstalled scripts are:
• Data Skew Detector - Identifies any tables in a database with statistically significant skew.
• Relation statistics (for all relations in a database) - For all relations in a database, it gets the
  table size of all tables in a database (cascade).
• Relation statistics (for relations matching a pattern) - For relations matching a pattern, it gets
  the size of the relations (matching the pattern) and database on each vworker (cascade).
• Relation statistics (for specific relations in a database) - For specific relations in a database, it
  gets the size of the specified relations and database on each vworker (cascade).
• Table Info - Gets the table statistics of the specified table in the database (cascade).
Executable Jobs
To view the status of executable jobs, select the Executable Jobs tab. You will see a table listing
all jobs with information, such as their status (Running, Completed, or Error), who submitted
the job, start, end, and elapsed time, a link to see the output, and a Cancel link for jobs in
progress. Note that the list of jobs is automatically refreshed periodically. You can click on the
column headers to sort by a specific column. The history of script runs is maintained for a
week, with a maximum history of 1000 runs.
Figure 2: Executable Jobs in AMC
Viewing Output
To view the output of a job, click the Output link for that job. The following example shows
output from the preinstalled script, Data Skew Detector.
Figure 3: Viewing the output of a job in AMC
Running Scripts
To run scripts, you must be logged in to the AMC as an administrator. Each script runs
immediately when you click the Run Now button. The following list provides detailed steps for
running scripts.
1 Log into the AMC as db_superuser or another administrator user. The nc_skew script
  requires that you log in as db_superuser.
2 Using the navigation tabs at the top of the page, go to Admin > Executables. The Executable
  Library panel appears.
3 Find the script you wish to run in the list of available scripts, and click Run Now.
4 A window will prompt you to enter information (script variables) necessary to run the
  script. Note that not all scripts require the same variables, so this screen may look different
  depending upon which script you have chosen to run.
5 Enter the variables, and optionally select Save as Template to save a template that
  automatically uses the same variables. If saving as a template, give your template a name.
  This is the name that will be displayed in the Executables Library.
6 When finished, click Run Now.
Figure 4: Running an Executable Script in AMC
7 The script will run immediately. To view progress and output, select the Executable Jobs
  tab.
8 If you have chosen to save the variables entered when running the script as a template, you
  can access the template by finding it in the Executables Library. The following example
  shows a template created when running the preinstalled script, Relation statistics (for specific
  relation in a database).
Figure 5: Accessing an Executable Template in AMC
Best Practices for Running Scripts
The following information explains AMC Executables and their use in more detail:
Table Info Executable Limitations
The AMC executable Table Info may fail for logically partitioned tables. Most of its
functionality can be obtained by using one of these executables instead:
• Relation statistics (for specific relations in a database), or
• Relation statistics (for relations matching a pattern).
However, some statistics like updated tuple count are not available using these executables.
Memory limits
Aster Database enforces a 200MB memory usage limit for scripts you run. If your script
exceeds the limit, the script is cancelled and an error message is issued.
Disk space
Aster Database enforces a maximum disk usage of 5GB for scripts. If your script
exceeds the limit, the script is cancelled and an error message is issued.
Workload management
Your Aster Database workload management rules apply to all SQL jobs you run via the
Executable Jobs tab, but workload management rules do NOT apply to non-SQL scripts that
you run. This means that before you run shell scripts on the cluster, you should consider the
performance impact they will have on other Aster Database users.
Logging
Note that the Executable Jobs tab will show the script as having been run by the database user
who invoked it, but in the Aster Database logs, the scripts are logged as having been run
directly from the AMC (i.e., by the extensibility OS user).
If those scripts, in turn, run SQL scripts via ACT, then the SQL jobs will be logged under the
user name that was passed by the script to invoke ACT.
Capturing runtime information
AMC Executables are run in the context of a bash shell (bash -c <command>). Stdout and
stderr are redirected to temp files. AMC picks up stdout, stderr and the status code when a
worker thread has finished running the script. These are displayed in the AMC Executable
Jobs tab, and can be viewed by selecting Output for a specific job. You can use the information
displayed by stdout to troubleshoot your custom scripts.
Using Checkpoints
Use the SQL-MR function “nc_relationstats” on page 200 to generate various reports for
on-disk table size, tuple counts, and other statistics for one or more tables. But be aware that
this command relies on the storage system being up to date and accurate when reporting these
statistics.
Once a data manipulating operation (INSERT, DELETE, DROP, UPDATE, etc.) is executed on
a table, the changes are stored in the buffers of the vworkers. These get flushed to disk when a
CHECKPOINT is executed. Automatic CHECKPOINTs execute on the vworkers at intervals
of 5-10 minutes.
If a user runs the nc_relationstats function after a data manipulating operation is performed
and before a CHECKPOINT is executed, the storage system would not be accurate and the
data in the buffer would not be accounted for in the results shown by nc_relationstats,
although the database itself is up to date.
If the user wants to see accurate (and up to date) storage sizes without waiting for this interval,
the user must execute the command ncli database checkpoint on the coordinator node.
This will execute a CHECKPOINT on each vworker database instance and make sure that the
database storage system is accurate.
However, if the user only requires an estimate of the storage sizes or is aware that no data
manipulations occurred in the last 5-10 minutes, then the nc_relationstats may be executed
without running the ncli database checkpoint command. For more information on
using this command, see the “ncli database Section” on page 158.
To learn about gathering additional, unreported, storage space estimates, see “Estimating
Storage Size for Non-Leaf Partitions” on page 209.
Creating Scripts
Requirements
The following are requirements for creating scripts:
• The script can be created using Python, Perl, or a shell scripting language. It can also be a
  SQL script, and it can call a SQL-MR function.
• The script must specify its variables and their types. The following variable types are
  supported: Boolean, integer, ipaddress, multiline text, password, string. The AMC will do
  basic validation against these types at runtime. These variables are made available to the
  script as environment variables. During script definition, you will define the list of
  variables and their types.
• When loading a script, you assign it a category from the list provided. This list is supplied
  by Aster and cannot be edited.
• The AMC does not support non-English input when adding executables, for the following
  settings:
  • Executable Names
  • Descriptions
  • Variable Inputs
  • Labels
  • Variable Names
  • Help Text
Variables
These are the rules for using variables in your scripts:
1 Variable names can only consist of the following characters: a-z, A-Z, 0-9, and _
  (underscore).
2 Variables are identified in SQL statements by the '&' prefix.
3 To get a literal '&' in the output stream, use '&&' in the input stream.
4 Variable names terminate when an invalid variable character is encountered.
5 If a variable name is terminated by the '.' character, the '.' character is not output in the
  output stream. This permits strings to be appended to variable values without creating
  unwanted whitespace.
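The substitution rules above can be sketched in Python (an illustrative reimplementation, not the AMC's actual code; the table name and variable are hypothetical):

```python
import re

def substitute(text, variables):
    """Apply the '&' substitution rules (illustrative sketch)."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "&":
            # Rule 3: '&&' in the input stream becomes a literal '&'.
            if i + 1 < len(text) and text[i + 1] == "&":
                out.append("&")
                i += 2
                continue
            # Rules 1 and 4: a name is a run of a-z, A-Z, 0-9, _.
            m = re.match(r"[A-Za-z0-9_]+", text[i + 1:])
            if m:
                out.append(str(variables.get(m.group(), "")))
                i += 1 + m.end()
                # Rule 5: a terminating '.' is swallowed, so strings can
                # be appended to variable values without whitespace.
                if i < len(text) and text[i] == ".":
                    i += 1
                continue
        out.append(text[i])
        i += 1
    return "".join(out)

result = substitute("SELECT * FROM &tbl._hist WHERE x = '&&'", {"tbl": "sales"})
print(result)
```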
SQL Scripts
SQL scripts with AMC Executables are run using ACT (which is launched by the AMC using
the extensibility OS user). You enter the SQL script directly into the New Executable dialog
box, and AMC Executables will launch ACT to run the script at runtime. This example shows
how to create a SQL script that queries the system table nc_all_child_partitions to
retrieve a list of all partitions in the database:
1 Go to AMC > Admin > Executables.
2 From the Executables Library tab, click the New Executable button.
3 Enter the name of the executable and description in the provided fields.
4 Select a category from the drop-down list. Categories are supplied by Aster and cannot be
  added to or modified. For this example, select Database Executables.
5 Choose Yes or No for Is this language SQL? For this example, we will choose Yes, which will
  allow AMC to create the variables needed (database, user, and password) to connect to the
  database through ACT automatically.
6 Either upload your script file or enter the script directly into the source code text area.
  For this example enter the following SQL script directly:
  SELECT * FROM nc_all_child_partitions;
Figure 6: Creating a new Executable in AMC
7 Supply any additional variables needed to run the script in the Variable Inputs section by
  clicking the Define Variables button. The Executable Variables screen appears.
8 Click the Add Variable button.
9 Define the variable by specifying a label, name, type, help text, and whether it is required
  for the script to run.
10 Repeat steps 8–9 to define an additional variable.
11 Click Save.
Figure 7: Saving Variables for an Executable in AMC
Cluster Utility SQL-MapReduce Functions
In order to invoke any SQL-MapReduce functions via an Executable Job, those functions must
first be installed (using ACT). You run the function just as you would any SQL statement, that
is, following the instructions for “SQL Scripts” on page 198. The database user invoking ACT
must have the correct permissions to run the SQL-MR function. As with SQL scripts, AMC
automatically creates the variables for database access (database, user, and password), but you
may also create additional variables in your script.
The preinstalled scripts use the following SQL-MR functions. These functions are
automatically installed as part of the Aster Database installation. They are all system functions
that operate on partitions. These functions are meant to be invoked through AMC
Executables, and can only be run by an administrator user, but they can be executed on any
schema as long as the designated database user has the necessary permissions on that schema.
Note that if you type \dF in ACT, these preinstalled functions will not appear, as they are
internal-only functions. You can, however, use these in your own custom scripts.
Note that for these functions, the SELECT 1 and PARTITION BY 1 clauses are merely used to
invoke the functions. The functions will still work even if you operate on a table, instead.
The available preinstalled SQL-MR functions are:
• nc_relationstats
• nc_skew
Note: “nc_relationstats” on page 200 replaces the functionality of the following deprecated
functions: ncluster_storagestat, nc_recursive, and nc_tablesize.
Note: In a table where multiple updates and deletes have been performed, a side-effect can be file bloating. This may
cause the results of the compressed_size to be greater than the results of the uncompressed_size when running the
nc_relationstats function. If this scenario is encountered, you should run a VACUUM FULL on the table.
nc_relationstats
The SQL-MR function nc_relationstats enables an Administrator to generate various reports
for on-disk table size and statistics for one or more tables.
To learn about gathering accurate and up to date statistics, see “Using Checkpoints” on
page 196 within the “Best Practices for Running Scripts (page 196)” section.
To learn more about various operations to retrieve table size information and to gather
additional, unreported, storage space estimates, see the “Example nc_relationstats Calls” on
page 205 section and the following topic: “Estimating Storage Size for Non-Leaf Partitions” on
page 209.
Tip: Avoid running DDL statements concurrently with execution of the nc_relationstats function. Dropping tables and
databases while running this function may return inconsistent results across the cluster.
If Aster Database crashes and fails to shutdown normally during the execution of a DDL statement, orphan files may
be left behind. The space occupied by these orphan files is not accounted for in the results of nc_relationstats. It is
possible to determine if a discrepancy has occurred by comparing the following sets of results:
• the size of all databases reported by the nc_relationstats function
• the size of disk usage reported by the AMC
If a discrepancy is discovered it could be due to the existence of orphan files that need to be cleaned up. To fix this,
you need to contact Teradata Global Technical Support (GTS).
Warning!
The 'exact' mode should be used with caution as it is very expensive in terms of the amount of time and resources
it needs to complete, especially on compressed tables.
Teradata Aster Big Analytics Appliance Database Administrator Guide
200
Executables
Creating Scripts
Syntax
SELECT * FROM nc_relationstats(
ON (SELECT 1)
PARTITION BY 1
[DATABASES ('*'|'dbname1','dbname2',...)]
[RELATIONS ('[schema.]relation',...)]
[PATTERN (['schema%',]'relations%')]
[PARTITIONS ('p1.p2.p3','',...)]
[OWNERS ('owner1','owner2',...)]
[COMPRESS_FILTER (('none'|'high'|'medium'|'low')+)]
[INHERIT ('true' | 'false')]
[REPORT_SIZE ('compressed' | 'uncompressed' | 'all' | 'none')]
[REPORT_STATS_MODE ('estimated' | 'exact')]
[REPORT_STATS ('tuple_count' | 'tuple_space' | 'all' | 'none')]
[TOAST_SIZE ('combined' | 'separate' | 'none')]
[TOAST_STATS ('combined' | 'separate' | 'none')]
[PERSISTENCE_MODE ('all' | 'no_analytic' | 'only_analytic')]
[RELATIONS_SHOWN ('user' | 'only_catalog' | 'user_and_catalog')]
);
Note:
The Oversized-Attribute Storage Technique (TOAST) is a storage technique used by PostgreSQL. If a tuple size
exceeds the size of a page, the variable length attributes are compressed and/or broken up into multiple physical
rows. These extra rows are stored in a separate table called a TOAST table.
Arguments
The following optional arguments or parameters provide input options and filters on which
tables are analyzed for their space usage. These arguments affect the number of rows in the
output.
Please read the preceding “Warning!” about 'exact' mode and the “Note:” about TOAST.
DATABASES: Defaults to the beehive database. List of database names to include in the report.
Use * to include information about all databases.
Note: The nc_relationstats function does not allow database names (the databases() clause)
to include a comma or to be listed in double quotes. Use only single quotes.
RELATIONS: Defaults to include all relations, i.e., all tables and all indexes. List of relation
names, with an optional schema qualifier. Note: The user can specify either relations or
pattern but not both.
To look for a relation in a particular schema other than public (e.g., "rel_e1" in schema
"schm"), you must specify the schema-qualified name "schm.rel_e1". If no schema is specified,
it looks for that relation in the public schema. Tables whose names are not in the list are excluded.
If the "inherit" flag is set to "true" then the default will exclude tables (in order to avoid
redundancy) that inherit from another table and will only include the root. If you omit or do
not use the clause, then output is "all relations". Text in double quotes is case sensitive, so
'myschema."table"' and 'myschema."Table"' are different. Any text not in double quotes will be
treated as lower case. Escaping is required for special characters.
201
Teradata Aster Big Analytics Appliance Database Administrator Guide
Executables
Creating Scripts
PATTERN: Default is all relations. Note: You can specify either PATTERN or RELATIONS, but
not both. The clause takes either one or two inputs. If only one input is provided, it is treated
as an expression matching relation names, e.g., PATTERN('rel%'). If two inputs are provided,
the first input matches schema names and the second matches relation names, e.g.,
PATTERN('schema%','rel%'). The '%' wildcard matches zero or more characters of any type.
For example, 'nc%' matches all names starting with 'nc'; '%compress' matches all names
ending with 'compress'; and PATTERN('nc%','%') returns all relations in all schemas starting
with 'nc'.
Note: The text in the pattern is case sensitive, so 'nc%' and 'NC%' are not the same. Also,
underscores are treated as normal characters, unlike in PostgreSQL, where an underscore is a
single-character wildcard. In Aster, the only wildcard is '%'. Use wildcards with caution: a
pattern such as a schema name followed by a bare wildcard retrieves everything in the
database, which can be a very time-consuming process.
PARTITIONS: Default is all partitions. Lists the partition names to be included in the output.
Partition references are in the form "p1.p2.p3", where "p1" is the direct child of the root
relation. The partition reference needs to be a complete path to a leaf partition in the partition
hierarchy. Multiple partitions (or the root) can be specified from the same table. See example
11, "Get the size of two specific partitions," in Table 11 - 1 for more information.
OWNERS: Optional. Default is all owners. Lists the owner names to be included in the report.
COMPRESS_FILTER: Default is all compression types. To include a particular compress_filter
in the output, you must specify the compression level name, e.g., "low". Allows one or more of
the following compression level names: "none", "low", "medium", "high".
INHERIT: Default is 'false'. If set to 'true', the metrics for a given root table include all tables
that inherit from it, and the child tables do not appear as separate rows in the output; grouping
the child tables in this way also affects the values reported in the columns. If set to 'false',
metrics are reported for each table individually.
Output Column Arguments
The following optional arguments determine which metrics are reported. These arguments
affect the number of columns in the output.
REPORT_SIZE('compressed' | 'uncompressed' | 'all' | 'none') - Default is 'compressed'.
Determines how to report table size. If 'compressed', size in secondary storage (after
compression) is reported. If 'uncompressed', size before compression is reported. If 'all', both
compressed and uncompressed sizes are reported. If 'none', no table size is reported.
REPORT_STATS_MODE('estimated' | 'exact') - Default is 'estimated'. 'estimated' returns the
approximate values as stored by the local database. Please read the following “Warning!” about
'exact' mode.
REPORT_STATS('tuple_count' | 'tuple_space' | 'all' | 'none') - Default is 'none'. Enables
reporting of usage statistics for the space within a file. The 'tuple_space' option is disallowed in
'estimated' mode, because estimated space usage is not currently provided. If 'none', no space
statistics are reported.
TOAST_SIZE('combined' | 'separate' | 'none' ) - Default is 'combined'. Determines how to
include size of The Oversized-Attribute Storage Technique (TOAST) structures. If 'combined',
the sizes of the TOAST table and TOAST index are included with the table size. If 'separate',
the size of TOAST storage is reported in separate columns with "toast_" prefix. If 'none', the
TOAST table and index are ignored when reporting table size.
TOAST_STATS('combined' | 'separate' | 'none') - Default is 'none'. Determines how to include
stats for the tuples in the TOAST table. Ignored if report_stats('none') is given. If 'combined',
the statistics generated by report_stats will include the TOAST table, but not the toast_index.
If 'separate', then for each column generated by report_stats, there will be a similar column
with a "toast_" prefix which reports the statistics for the TOAST table. If 'none', no usage
statistics are reported for the TOAST table.
PERSISTENCE_MODE('all' | 'no_analytic' | 'only_analytic') - Default is 'all'. Determines how
to report analytic tables. If 'all', analytic tables are reported along with other tables. If
'no_analytic', analytic tables are ignored in the listing. If 'only_analytic', only analytic tables
are reported.
RELATIONS_SHOWN('user' | 'only_catalog' | 'user_and_catalog') - Default is 'user'.
Determines how to report user and catalog tables. This option acts as another layer of filtering
over the databases and relations. If 'user', catalog tables are not reported. If 'only_catalog',
only catalog tables are reported. If 'user_and_catalog', both user and catalog tables are
reported.
Output Column Arguments for Qualifying and Statistical Information
The following are output columns that contain qualifying and statistical information about
the relation.
Where a flag is required (in the list below) the arguments for that particular flag are listed. If a
flag is not required, the output column is generated automatically.
• ipaddress - IP address for the worker, uniquely determined by the vworker column.
• vworker - Identifier for the virtual worker.
• database - Name of the database with the table or index.
• schema - Schema name for the table or index.
• relation - Name of the table or index.
• partition - Partition reference as a dotted list of names. Only leaf partitions are shown in
  the report. Empty if this is a logically partitioned table with no child partitions. NULL if
  this is not a logically partitioned table. Partition skew can be observed from these partition
  size statistics.
• owner_name - Owner of the relation/partition.
• object_type - The object type is either 'relation' or 'index'.
• storage_type - The storage type is either 'row' or 'column' and is null for indexes.
• persistence_type - The persistence type is either 'permanent' or 'analytic'.
• compression_level - The compression level is either 'none', 'low', 'medium', or 'high'.
The following are output columns that contain statistics about the relation.
Please read the following “Warning!” about 'exact' mode and the “Note:” about TOAST.
• storage_size - Required flags: report_size('compressed' | 'all').
  Size of the item on disk, after compression. For tables, this includes the TOAST storage by
  default. It will not include TOAST storage if the toast_size argument has a value other than
  'combined'. Note: If report_size('uncompressed') is specified, this output column is not
  displayed.
• uncompressed_size - Required flags: report_size('uncompressed' | 'all').
  Similar to storage_size, but reports the size before applying file compression. Note that
  other data compression on individual values (such as wide varchar columns) may be
  performed before computing this value. This happens as part of the TOAST process.
• toast_storage_size - Required flags: report_size('compressed' | 'all') AND
  toast_size('separate').
  Size of TOAST storage on disk after compression.
• toast_uncompressed_size - Required flags: report_size('uncompressed' | 'all') AND
  toast_size('separate').
  Size of TOAST storage before applying file compression.
• tuple_count - Required flags: report_stats('tuple_count' | 'all').
  Number of live tuples that may be visible to some transaction. If
  report_stats_mode('estimated') is given, this is only an approximation, and is zero for
  indexes. If toast_stats('combined') is given, this includes the number of tuples in the
  TOAST table, which may differ from the main table.
• dead_tuple_count - Required flags: report_stats('tuple_count' | 'all').
  Number of dead tuples that are not visible to any transaction. If
  report_stats_mode('estimated') is given, this is only an approximation, and is zero for
  indexes. If toast_stats('combined') is given, this includes the number of dead tuples in the
  TOAST table, which may differ from the main table.
• tuple_space - Required flags: report_stats('tuple_space' | 'all') AND
  report_stats_mode('exact').
  Size of space occupied by live tuples. If toast_stats('combined') is given, this includes the
  space used by live tuples in the TOAST table.
• dead_tuple_space - Required flags: report_stats('tuple_space' | 'all') AND
  report_stats_mode('exact').
  Size of space occupied by dead tuples. If toast_stats('combined') is given, this includes the
  space used by dead tuples in the TOAST table.
• free_space - Required flags: report_stats('tuple_space' | 'all') AND
  report_stats_mode('exact').
  Size of free space present in the relation's allocated space. If toast_stats('combined') is
  given, this includes the free space in the TOAST table.
  Note: free_space is the amount of free space present in the currently allocated space for
  this table. This space can be used for new tuples, but it does not limit the space that can be
  allocated for new tuples.
• toast_tuple_count - Required flags: report_stats('tuple_count' | 'all') AND
  toast_stats('separate').
  Number of live tuples in the TOAST table. If report_stats_mode('estimated') is given, this
  is only an approximation, and is zero for indexes.
• toast_dead_tuple_count - Required flags: report_stats('tuple_count' | 'all') AND
  toast_stats('separate').
  Number of dead tuples in the TOAST table. If report_stats_mode('estimated') is given,
  this is only an approximation, and is zero for indexes.
• toast_tuple_space - Required flags: report_stats('tuple_space' | 'all') AND
  report_stats_mode('exact') AND toast_stats('separate').
  Size of space occupied by live tuples in the TOAST table.
• toast_dead_tuple_space - Required flags: report_stats('tuple_space' | 'all') AND
  report_stats_mode('exact') AND toast_stats('separate').
  Size of space occupied by dead tuples in the TOAST table.
• toast_free_space - Required flags: report_stats('tuple_space' | 'all') AND
  report_stats_mode('exact') AND toast_stats('separate').
  Size of free space present in the TOAST table's allocated space.
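The exact-mode space columns above can be combined in a single call. The following is a
sketch, not a prescribed recipe; the database name 'niray' is a placeholder matching the
examples later in this chapter. It reports live, dead, and free space per relation:

SELECT schema, relation,
       sum(tuple_space)      AS live_bytes,
       sum(dead_tuple_space) AS dead_bytes,
       sum(free_space)       AS free_bytes
FROM nc_relationstats(
    ON (SELECT 1)
    PARTITION BY 1
    DATABASES ('niray')
    REPORT_STATS('tuple_space')
    REPORT_STATS_MODE('exact')
)
GROUP BY 1,2;

As with any use of 'exact' mode, heed the "Warning!" referenced above before running this
against large tables.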
Note: The nc_relationstats function reports tuple counts as physically stored on each vworker. For a dimension
table containing n tuples, it reports n tuples per vworker. This is correct, because the data is replicated across all
the vworkers. However, the following query:
SELECT count(*) FROM dimensionTable;
reports only n tuples, treating the dimension table as a single entity, and does not count the replicated tuples on
each vworker.
Example nc_relationstats Calls
The table below contains sample code for various operations to retrieve table size information
and to gather additional, unreported, storage space estimates using nc_relationstats.
Note: In the Aster Database, the term "relation" refers to both "table" and "index".
Table 11 - 1: Sample Code for nc_relationstats

1. Get relation size for everything on a database (all schemas/relations):
   SELECT * FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
   );

2. Get relation size for everything that is uncompressed in a database. This is helpful in
   determining which tables you may need to compress, if disk space is an issue:
   SELECT * FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
       COMPRESS_FILTER ('none') -- Replace 'none' with 'high', 'low' or 'medium' to filter
                                -- by the desired compression level
   );

3. Get sizes for all relations in a user schema 'karthik':
   SELECT * FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
       PATTERN ('karthik','%') -- first entry is the schema, second entry is the relations
                               -- in that schema
   );

4. Get sizes of all relations in the public schema whose relation name matches a wildcard,
   for example, all tables that start with the prefix 'factTables':
   SELECT * FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
       PATTERN ('public','factTables%')
   );

5. Get sizes of all relations whose relation name contains the word 'part', in all schemas but
   only in the specified database:
   SELECT * FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
       PATTERN ('%part%')
   );

6. Get the top 10 uncompressed tables that could be compressed to save space. Partition
   skew can be observed from the partition size statistics:
   SELECT schema, relation, partition_name,
          round(sum(storage_size)/(1024*1024*1024), 2) GB
   FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
       COMPRESS_FILTER ('none')
   )
   GROUP BY 1,2,3
   ORDER BY 4 DESC LIMIT 10;

7. Get the top 10 tables that cause the most skew in terms of on-disk space:
   SELECT schema, relation, partition_name,
          max(storage_size) - min(storage_size) AS worst_skew
   FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
   )
   WHERE object_type='relation' -- avoids indexes
   GROUP BY 1,2,3
   ORDER BY 4 DESC
   LIMIT 10;

8. Use a GROUP BY to find the full tuple count and size of a fact table. Please read the
   "Warning!" regarding caution in the use of 'exact' mode:
   SELECT schema, relation, sum(storage_size) AS storage_size,
          sum(tuple_count) AS num_tuples
   FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
       RELATIONS ('schema."factTblName"')
       REPORT_STATS('tuple_count')
       REPORT_STATS_MODE('exact')
   )
   GROUP BY schema, relation;

9. Check the sizes of logical partitions separately:
   SELECT schema, relation, partition_name, sum(storage_size)
   FROM nc_relationstats(
       ON (SELECT 1)
       PARTITION BY 1
       DATABASES ('niray')
       RELATIONS ('schema."TblName"')
   )
   GROUP BY 1,2,3;

10. Find the exact dead tuple count. This is useful when deciding whether to perform a
    VACUUM. Please read the "Warning!" regarding caution in the use of 'exact' mode:
    SELECT schema, relation, sum(dead_tuple_count) AS dead_tuples,
           sum(tuple_count) AS total_tuples
    FROM nc_relationstats(
        ON (SELECT 1)
        PARTITION BY 1
        DATABASES ('niray')
        RELATIONS ('schema."TblName"')
        REPORT_STATS('tuple_count')
        REPORT_STATS_MODE('exact')
    )
    GROUP BY schema, relation;

11. Get the size of two specific partitions:
    SELECT * FROM nc_relationstats(
        ON (SELECT 1)
        PARTITION BY 1
        DATABASES ('niray')
        RELATIONS ('karthik."tableName"')
        PARTITIONS ('level1a.level2','level1b.level2')
    );

12. Find the storage sizes occupied by SQL-MR functions and installed files. The output
    contains the amount of space occupied per vworker. It does not include the space
    occupied by the SQL-MR functions packaged with the Aster Database:
    SELECT vworker, sum(storage_size) AS sqlmr_function_size
    FROM nc_relationstats(
        ON (SELECT 1)
        PARTITION BY 1
        DATABASES ('*')
        RELATIONS ('_bee_special.nc_installed_files')
        REPORT_STATS_MODE('estimated')
        REPORT_STATS('tuple_count')
        RELATIONS_SHOWN('user_and_catalog')
    )
    GROUP BY vworker
    ORDER BY vworker;
Example with Output
SELECT * FROM nc_relationstats(
    ON (SELECT 1)
    PARTITION BY 1
    DATABASES('beehive')
    RELATIONS('tab')
    REPORT_STATS_MODE('estimated')
    REPORT_SIZE('compressed')
    REPORT_STATS('tuple_count')
)
ORDER BY relation,vworker;
Returns results like:
+--------------+---------+----------+--------+----------+-----------------+--------------+
|  ipaddress   | vworker | database | schema | relation | partition_name  |  owner_name  |
+--------------+---------+----------+--------+----------+-----------------+--------------+
| 10.80.142.22 | w1005z  | beehive  | public | tab      | tab_partition_1 | db_superuser |
| 10.80.142.22 | w1005z  | beehive  | public | tab      | tab_partition_2 | db_superuser |
| 10.80.142.21 | w1006z  | beehive  | public | tab      | tab_partition_1 | db_superuser |
| 10.80.142.21 | w1006z  | beehive  | public | tab      | tab_partition_2 | db_superuser |
+--------------+---------+----------+--------+----------+-----------------+--------------+
(4 rows)
table continued...
+--------------+-------------+------------------+-------------------+
| storage_type | object_type | persistence_type | compression_level |
+--------------+-------------+------------------+-------------------+
| row          | relation    | permanent        | none              |
| row          | relation    | permanent        | none              |
| row          | relation    | permanent        | none              |
| row          | relation    | permanent        | none              |
+--------------+-------------+------------------+-------------------+
table continued...
+--------------+-------------+------------------+
| storage_size | tuple_count | dead_tuple_count |
+--------------+-------------+------------------+
|            0 |           0 |                0 |
|        32768 |           1 |                0 |
|        32768 |           1 |                0 |
|        32768 |           0 |                1 |
+--------------+-------------+------------------+
Using ANALYZE Before Estimating Size
The 'estimated' mode in nc_relationstats depends on the statistics gathered by the local
database instance on each node. The approximation is as accurate as possible, but after major
DML changes to a relation, the approximate values can drift far from the exact values. In this
scenario, you can use the ANALYZE command to refresh the statistics for the relation and
then obtain a better approximation through nc_relationstats.
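A minimal sketch of this refresh-then-measure sequence follows; the table name
myschema.sales_fact and the database name are placeholders, not objects that ship with the
product:

ANALYZE myschema.sales_fact;  -- refresh the local statistics for the relation

SELECT schema, relation, sum(tuple_count) AS est_tuples
FROM nc_relationstats(
    ON (SELECT 1)
    PARTITION BY 1
    DATABASES ('beehive')
    RELATIONS ('myschema.sales_fact')
    REPORT_STATS('tuple_count')
    REPORT_STATS_MODE('estimated')
)
GROUP BY 1,2;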
Tuple Count with Replicated Dimension Tables
When running nc_relationstats, tuple_count shows the number of tuples stored on each
vworker. When a GROUP BY is done on the relation, the function sums the tuple counts of all
vworkers. This can be confusing when working with replicated dimension tables: even though
the tuples are identical (replicated) on each vworker, the total number of tuples across all
vworkers is reported.
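To recover the logical row count of a replicated dimension table from nc_relationstats
output, one sketch is to take a per-vworker value instead of the sum. This assumes the
hypothetical table public.dim_products is fully replicated, so every vworker holds the same
count:

SELECT schema, relation, max(tuple_count) AS tuples_per_vworker
FROM nc_relationstats(
    ON (SELECT 1)
    PARTITION BY 1
    DATABASES ('beehive')
    RELATIONS ('public.dim_products')
    REPORT_STATS('tuple_count')
    REPORT_STATS_MODE('exact')
)
GROUP BY 1,2;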
Estimating Storage Size for Non-Leaf Partitions
In logically partitioned tables, the tuples reside only in the leaf nodes. However, the non-leaf
partitions also occupy storage space. The function nc_relationstats does not report storage
size for the non-leaf partitions in a logically partitioned table.
You can use the following information to estimate the storage size of non-leaf partitions. The
storage at non-leaf partitions comprises:
• One page allocated for TOAST storage, provided it exists for the relation. This applies to
  both column-based and row-based relations.
• Two or three pages allocated for each attribute if the relation is columnar (depending on
  the data type: two for int, three for larger types like numeric or varchar).
Note: When a COPY command or an ncluster_loader command is executed on a columnar table without
specifying any input data, space is reserved for each of the columnar files. Although this is intended as a
pre-allocation step, the resulting discrepancy is evident when creating logically partitioned tables, and also when
loading data (using ncluster_loader) into a fact table where one of the partitions receives no tuples due to hash
distribution.
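As a worked sketch of the estimate above, assume a hypothetical columnar non-leaf partition
with four int columns and two varchar columns, and assume a page size of 32768 bytes (the
page size here is an assumption, not a documented constant; verify it for your installation):

-- 1 TOAST page + (2 pages x 4 int columns) + (3 pages x 2 varchar columns) = 15 pages
SELECT (1 + 2*4 + 3*2) * 32768 AS estimated_nonleaf_bytes;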
nc_skew
The nc_skew function (table skew function) does a statistical test for skew across the cluster
on a set of data (tables) you supply at runtime. The function takes a distribution of some
metric (usually the table size or number of rows) of a table over many vworkers and
determines whether or not the data distribution is skewed. It tests the skewness of table
distribution by using a chi-square test. If there is no skew in a table, the test result for that table
will not be output to the screen.
Note: The nc_relationstats function, described earlier in this chapter, is another excellent tool for detecting skew.
Syntax
SELECT * FROM nc_skew(
ON input_table
PARTITION BY partition_by_columns
PARTITIONS('partition_by_columns')
[METRIC('metric_column_name')]
[PVALUE('p_value')]
[VWORKERCHECK('true | false')]
);
Arguments
PARTITIONS: Required. A list of column names that are used in the 'PARTITION BY' clause.
The order can be different than the order in which they occur in the 'PARTITION BY' clause.
METRIC: Optional. The name of the column containing the metric. If not specified, the
function uses the second column by default (since partition is often first).
PVALUE: Optional. The significance level for the chi-square test. Valid values are in the range
[0, 0.25]. Default value is 0.05.
VWORKERCHECK: Optional. If 'true', the function will compare the number of data
partitions to the number of function invocations to see if any expected data is missing. Default
value is 'false'.
Output
The function outputs a table that includes the tablename, p-value, chi-square result, and the
minimum, maximum and average values of the metric for each table where skew was detected.
Example
The table below shows example Input Data from table nc_data_skew_testdata.
Table 11 - 1: Input to nc_data_skew_testdata

ind | tablename | cnt
----+-----------+----
  1 | table1    | 106
  2 | table1    |  90
  3 | table1    | 105
  4 | table1    | 108
  5 | table1    | 123
  6 | table1    | 114
  7 | table1    |  78
  8 | table1    | 125
  9 | table1    | 112
 10 | table1    |  84
 11 | table1    |  92
 12 | table1    |  82
 13 | table1    | 105
 14 | table1    | 103
 15 | table1    | 136
 16 | table1    |  78
 17 | table1    |  74
 18 | table1    |  73
 19 | table1    | 127
 20 | table1    | 108
 21 | table2    |  15
 22 | table2    |  18
 23 | table2    |  16
 24 | table2    |  20
 25 | table2    |  20
 26 | table2    |  18
 27 | table2    |   5
 28 | table2    |   6
 29 | table2    |  15
 30 | table2    |  10
 31 | table2    |  14
 32 | table2    |   1
 33 | table2    |   1
 34 | table2    |   1
 35 | table2    |   4
 36 | table2    |  12
 37 | table2    |   2
 38 | table2    |  15
 39 | table2    |  11
 40 | table2    |  19
 41 | table3    | 100
 42 | table3    | 100
 43 | table3    | 100
 44 | table3    |  99
 45 | table3    |  99
 46 | table3    | 101
 47 | table3    |  98
 48 | table3    | 100
 49 | table3    |  99
 50 | table3    | 100
 51 | table3    |  99
 52 | table3    |  99
 53 | table3    | 100
 54 | table3    | 101
 55 | table3    |  99
 56 | table3    | 101
 57 | table3    | 101
 58 | table3    | 101
 59 | table3    | 101
 60 | table3    | 101
Example SQL-MR call
SELECT * FROM nc_skew(
ON nc_data_skew_testdata
PARTITION BY tablename
PARTITIONS('tablename')
METRIC('cnt')
)
ORDER BY tablename;
The table below shows sample output from nc_skew.
Table 11 - 2: Sample output from nc_skew

tablename | pvalue               | chisquare        | min_value | max_value | avg_value
----------+----------------------+------------------+-----------+-----------+----------
table1    | 2.33907705871061e-07 | 67.548690064261  |        73 |       136 |       101
table2    | 1.47257894766994e-09 | 80.5874439461883 |         1 |        20 |        11

table3 does not appear in the output because no skew was detected for it.
Best practices for building scripts
The following best practices will help you build successful scripts:
• Use the preinstalled scripts as a template whenever possible.
  These scripts have been tested, and can be relied upon to have proper syntax and logic.
  Note that preinstalled scripts may be examined, but not edited. You may view the code
  from these scripts by selecting the pencil icon. A window will appear with information
  about the script and the code itself, all read-only. By clicking the "Clone" button, you will
  create a copy of the script that may be edited for use in creating your own scripts.
• Issue several SQL commands in a single transaction. Although this is not always possible,
  doing so reduces the number of times ACT is launched, and therefore cuts down the
  potential points of failure for the script. It is also usually faster, not only because you don't
  need to load and run ACT multiple times, but also because there is substantially less
  overhead in running a single transaction than in running multiple transactions, especially
  in a distributed system like Aster Database.
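The single-transaction pattern can be sketched as follows; the table and statements are
placeholders, and DISTRIBUTE BY HASH is shown only as a typical Aster table definition:

BEGIN;
CREATE TABLE staging_example (id int, val varchar(32)) DISTRIBUTE BY HASH(id);
INSERT INTO staging_example VALUES (1, 'a');
INSERT INTO staging_example VALUES (2, 'b');
COMMIT;  -- everything above runs in one ACT invocation and one transaction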
• Invoke ACT via bash to perform checks before issuing SQL commands. If your SQL script
  contains many transactions, consider invoking ACT via bash instead of using the built-in
  SQL script functionality in the AMC. To invoke ACT via bash, first create a SQL script and
  save it on the queen. When invoking ACT, you must pass the username and password of an
  Aster Database user who has sufficient rights to run the SQL in the script. Then create a
  shell script like the following to call ACT and run your SQL script:
  #!/bin/bash
  #This script acts as a wrapper around act to launch an
  #SQL script called logical_partitioning_list. The basic
  #parts of the act call are the username/password
  #credentials, options (db, etc) and the reference
  #to the SQL script file to be run.
  #You may wish to specify the database in the script
  #with -d, or ask the user.
  act -u "$FLAG_username" -w "$FLAG_password" -c "SELECT 1" -f /home/beehive/scripts/logical_partitioning_list.sql
  If you invoke ACT via bash, you can examine the exit code of ACT, or the stdout/stderr
  data that ACT returns, to determine whether to run the next command in the sequence. If
  you invoke ACT via SQL scripts, the behavior is identical to invoking ACT with a -f option
  that points to a file containing a list of SQL statements, and there is no opportunity to
  perform checks before issuing additional commands.
Upgrades
Rebuilding Custom Scripts
The Aster Database Executables framework is very flexible and powerful, because you have full
access to the cluster at the command-line level. The flip side is that you are not dealing with an
API that guarantees a stable set of commands through subsequent upgrades. Scripting
interacts with many parts of Aster Database that could potentially change in each new version
(for example, CLI changes or SQL changes). However, SQL scripts are more likely to remain
functional after upgrading. To assist in troubleshooting any issues, each script is tagged with
the Aster Database version number it was created under, but it is possible that upgrading may
require rebuilding any scripts created under a prior version.
Executables Not Supported from Prior Versions
The function nc_relationstats replaces the functionality of all the (Not Supported) executables
in the following list. These will only appear in the list of executables if you have upgraded from
an older version of Aster Database. Even if they do appear in the list of executables, you should
use nc_relationstats instead:
• All Table Sizes - (Not Supported)
• Table Size - (Not Supported)
• Table Size (Details) - (Not Supported)
CHAPTER 12
Teradata Tools
Teradata provides a unified management environment to support both Teradata and Aster
Database. This chapter describes the tools available from Teradata for managing Aster
Database.
See these sections for details:
• Managing Aster Database with Teradata Viewpoint
Managing Aster Database with Teradata Viewpoint
Beginning with Aster Database version 5.0, you can use Teradata Viewpoint to manage Aster
Database on the Teradata Aster MapReduce Appliance. This gives database administrators one
unified platform from which to view information about and perform administrative tasks for
both Teradata database and Aster Database. Viewpoint is analogous to the Aster Database
AMC in its functionality.
Overview
Viewpoint communicates with Aster Database through a Web service API, using SSL and
HTTPS. The API enables administrators to view information about Aster Database securely
through Viewpoint, and to perform many administrative tasks.
Note that you must use the “db_superuser” credential when accessing Aster Database through
Viewpoint. This ensures that you will have the correct permissions to view information about
Aster Database and perform administrative functions.
Information available through Viewpoint
The information in this section is not meant to be exhaustive, but merely to provide an idea of
what kind of information about Aster Database is available through Viewpoint. Based on this
and the information about the AMC, administrators can decide which access portal makes
more sense for their purposes.
Most of the information Viewpoint displays about Aster Database is real time. The one
exception is table size information, which is updated daily.
Processes and Sessions
Table 12 - 1: Process and Session information available in Viewpoint

Information Type          Information Details
Processes                 Returns a list of all processes, optionally filtered by attributes.
Process Statements        Information about the statements that make up a process.
Process Phases            Information about process phases and their statuses.
Process Phase Statements  Information about the individual statements that make up a
                          process phase.
Workload Management       Information about workload policies and service classes
                          configured within Aster Database.
Sessions                  Information about sessions (connected and historical) and their
                          status.
Cluster and Nodes Resources
Table 12 - 2: Cluster and Node Resource information available in Viewpoint

Information Type        Information Details
Cluster Status          Information about the status of the cluster.
Replication Factor      Information about the replication factor.
Nodes                   Information about the nodes in a cluster.
Node Status             Status information about the nodes in a cluster.
Storage                 Storage information about the cluster.
Virtual Workers         Information about virtual workers in a cluster.
Component Statistics    Statistics about the components in a cluster.
Hardware Configuration  Information about a node's hardware configuration, including
                        CPU(s), RAM and CPU cache.
Tablespace              Information about tablespace compression, storage type, dead
                        tuples, and space used.
Administrative operations available through Viewpoint
The information in this section is not meant to be exhaustive, but merely to provide an idea of
what kind of administrative functions are available through Viewpoint. Based on this and the
information about the AMC, administrators can decide which access portal makes more sense
for their purposes.
The following tasks can be performed in Aster Database through Viewpoint:
Cluster Administration
Table 12 - 3: Cluster Administration functions available in Viewpoint

Task Name              Task Details
Hard Restart           Performs a hard restart of the cluster (reboots all nodes).
Soft Restart           Performs a soft restart of the cluster (restarts Aster Database
                       services).
Rebalance Data         Rebalances data among vworkers.
Rebalance Process      Rebalances processes among vworkers.
Upload and Distribute  Uploads a file to the cluster and distributes it to vworkers.
Workload Administration
Table 12 - 4: Workload Administration functions available in Viewpoint

Task Name               Task Details
Service Classes         Shows the service classes configured within Aster Database.
Workload Policies       Shows the workload policies configured within Aster Database.
Save Service Classes    Saves service class settings.
Save Workload Policies  Saves workload policy settings.
Network Administration
Table 12 - 5: Network Administration functions available in Viewpoint
• Node Network Configuration: Sets a network configuration for a node.
• Node Network Current Configuration: Shows the current network configuration.
• Node Network Function Assignments: Assigns an Aster Database function to a network on a node.
• Save Network Configuration: Saves the network configuration.
• Apply Network Configuration: Applies the network configuration.
• Save Network Function Assignments: Saves the network function assignments.
Executables Framework
Table 12 - 6: Executables Framework Administration functions available in Viewpoint
• Executables Job List: Lists executable jobs.
• Executables List: Lists the available executables.
• Start Executable: Starts the designated executable.
• Save Executable: Saves the executable settings.
Log Bundling
Table 12 - 7: Log Bundling functions available in Viewpoint
• Log Bundles: Shows existing log bundles.
• Create Log Bundles: Creates log bundles for transmission to Teradata Global Technical Support (GTS).
For more information, see the Viewpoint documentation, available from Teradata.
Configuring Aster Database for use with Viewpoint
Before you can access Aster Database through Viewpoint, you must edit a configuration file to
enable the integration. To do this:
1 Open the configuration file in your favorite editor:
/home/beehive/config/dbinfocollector_default.cfg
2 Set the disabled flag to false.
3 Modify the following parameters appropriately:
• credential - the username and password of the database user that runs the statistics collection query. This should be ‘db_superuser’ or the equivalent.
• schedule_day - the day of the week (0-6) on which to run the statistics collection, or 7 for every day of the week.
• schedule_time - the time of day to run the statistics collection. This should be a time when the cluster is not busy.
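For illustration, the finished edits might look like the following sketch. The key/value layout and the credential format shown here are assumptions, so follow the syntax you actually find in dbinfocollector_default.cfg:

```
# /home/beehive/config/dbinfocollector_default.cfg (illustrative layout only)
disabled = false
# database user that runs the statistics collection query
credential = db_superuser:<password>
# 0-6 = day of week; 7 = every day
schedule_day = 7
# pick an off-peak time
schedule_time = 02:00
```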
Troubleshooting the Viewpoint integration
If you notice that Viewpoint is not reporting accurate values in a timely manner, see the points
below:
• If you find that the information displayed in Viewpoint is more than about 20 seconds out of date, this is expected only for statistics related to disk consumption (disk used, free disk still available, etc.) when compression (compressed tables or compressed indexes) is involved.
• In clusters that are running near their maximum CPU or disk I/O capacity, or in which large amounts of new or changed data need HIGH compression, delays may be even longer than 30 minutes.
CHAPTER 13
Logs in Aster Database
Aster Database automatically tracks its activity in a variety of log files. The log files are useful
when you need to find the cause of an error or unexpected behavior, or when you just want to
confirm that an operation has taken place.
You can access Aster Database log files through the AMC.
• To view log files for individual worker or loader nodes or view the system logs stored on the queen, use the Node Details tab. To display this tab, click Nodes > Node Overview, then click the IP address or name of the node. Click the Prep, System, or Kernel link to view the desired log. See “Read Aster Database Logs in the AMC” on page 63.
• To view (or create) bundles containing multiple log files, which you can send to Teradata Global Technical Support (GTS) along with a request for troubleshooting assistance, use the Logs tab. To display this tab, click Admin > Logs, then click the Prepare, Download, or Send link.
Diagnostic Log Bundles
This section contains these topics:
• Overview
• View Diagnostic Bundle Jobs
• Send a Diagnostic Log Bundle
• Save a Diagnostic Log Bundle on Your Local Filesystem
• Include All Nodes in a Diagnostic Log Bundle
• Prepare a Custom Diagnostic Log Bundle
• Run Custom Commands in a Diagnostic Log Bundle Job
• “View Diagnostic Log Bundle Contents” on page 225
Overview
When an issue arises on a cluster, one of the first steps in finding the cause is to retrieve the
relevant log files. Aster Database is made up of a large array of distinct services, and it
produces more than 60 different logs spread across every node in the cluster. The AMC
provides an easy way for you to deal with all these different logs by creating diagnostic log
bundles. A diagnostic log bundle is a compressed tarball containing data used to determine the
system context and diagnose Aster Database issues. This data may include system logs from
the queen and subordinate nodes (worker and loader).
By using diagnostic log bundles, you can more easily send information to Teradata Global
Technical Support (GTS) for analysis, reducing the time and effort required to diagnose
system problems.
Only AMC users with administrative privileges can create, download, and send diagnostic log
bundles.
View Diagnostic Bundle Jobs
To display the diagnostic bundle jobs in AMC, choose Admin > Logs.
A list of diagnostic bundle jobs is displayed.
Figure 8: AMC list of diagnostic bundle jobs
For each job, the following information is shown:
Table 13 - 1: Diagnostic Bundle Job Information in AMC
• Job ID: System-generated unique number to identify the job.
• Type: Queen or Cluster. A queen-type bundle includes only log files and information from the queen. A cluster-type bundle includes log files and information from all nodes, including the queen.
• Status: Tells whether the job is currently running, completed, or failed.
• Submitted by: Tells what initiated the job. “System” means the job was run automatically by the AMC. If the job was manually initiated, the username of the person who submitted the job is displayed.
• Start Time: Start time of the log content; that is, the time of the first logged event included in the bundle.
• End Time: End time of the log content.
• Filename: Name of the log bundle file. The name indicates the time the bundle creation job was initiated.
• Filesize: Size of the log bundle file in MB.
• Prepare Cluster Bundle: Click Prepare to create a complete bundle that includes logs from the other nodes as well.
• Download: Click Download to download a diagnostic log bundle.
• Send to Aster Support: Click Send to send the log bundle to Teradata Global Technical Support (GTS).
Send a Diagnostic Log Bundle
The Diagnostic Bundle Jobs panel provides links that you can use to send log bundles directly
to Teradata Global Technical Support (GTS).
Note! Before you can send logs directly to Teradata Global Technical Support (GTS), you must configure the Cluster
Settings and Aster Support Settings (Admin > Configuration > Cluster Settings).
To send a log bundle, follow these steps:
1 Open the Diagnostic Bundle Jobs panel in AMC (Admin > Logs).
2 To send a log bundle to Teradata Global Technical Support (GTS), click the log’s corresponding Send link in the Send to Aster Support column.
Figure 9: Sending a log file to Teradata Aster
3 In the confirmation dialog, click OK.
While the bundle is being sent, a blue progress bar appears next to the Send link. If the
sending succeeds, the bar becomes green. If the sending fails, the bar becomes red. Move
the mouse over the bar to display status information.
Save a Diagnostic Log Bundle on Your Local Filesystem
You can download and save a diagnostic log bundle file on your local filesystem. This is useful
if you want to view the contents of the file or if you need to send the bundle to Teradata Global
Technical Support (GTS) but you cannot connect to the support server URL.
To download a diagnostic log bundle:
1 Open the Diagnostic Bundle Jobs panel in AMC (Admin > Logs).
2 Click the log’s corresponding Download link in the Download column.
3 Follow the instructions to save the bundle (a .gz file) on your system.
Include All Nodes in a Diagnostic Log Bundle
By default, a diagnostic log bundle contains only system logs from the queen. If you want to
create a complete bundle that includes logs from the other nodes as well, you can create what
is called a “cluster bundle” by clicking the Prepare link.
Another way to include all nodes in a bundle is to click the Manually Initiate Diagnostic Bundle
button. This displays a dialog that provides many more choices, including the choice to
include queen and cluster nodes in the bundle, set a time window, and add custom
commands. For more information, see the next section, “Prepare a Custom Diagnostic Log
Bundle”.
Prepare a Custom Diagnostic Log Bundle
You can start a log bundle job with custom settings such as a particular date range.
1 Display the Diagnostic Bundle Jobs panel as described in “View Diagnostic Bundle Jobs” on page 221.
2 Click Manually Initiate Diagnostic Bundle.
Figure 10: The Manually Initiate Diagnostic Bundle button
3 Modify the settings as desired. For example, set a custom start and end time for the contents of the log bundle.
Figure 11: Initiating Diagnostic Bundle
4 If desired, click Advanced to add custom commands. These are explained in the next section, “Run Custom Commands in a Diagnostic Log Bundle Job”.
5 Click Create Bundle Now.
The job starts immediately. You cannot save the settings and schedule the job to run later.
Run Custom Commands in a Diagnostic Log Bundle Job
When you manually initiate a diagnostic log bundle job, you can optionally run additional
commands and record the output of those commands as part of the log bundle. For example,
you might want to run Linux commands that will help diagnose a problem, such as ps or
vmstat.
The commands are run on each machine where a bundle is being created. If the bundle is a
queen-only bundle, the commands run on the queen. If the bundle is a queen and cluster
bundle, then the commands run on all of the nodes (queen, workers, and loaders). The
command must, of course, exist on every node where it will be run.
Warning! Custom commands run as user “beehive,” which is a fairly powerful user role. Be careful which commands you run, as “beehive” has broad permissions that might allow you to unintentionally disrupt the cluster.
To add custom commands to a diagnostic log bundle job, perform the following steps.
1 If the command is a custom program or shell script, copy it to every node on which it will be run, and put it in the same directory on each node.
2 Display the Diagnostic Bundle Jobs panel as described in “View Diagnostic Bundle Jobs” on page 221.
3 Click Manually Initiate Diagnostic Bundle.
4 Click Advanced.
5 Additional fields appear so you can enter your custom commands. The following example shows two types of commands: a custom script and a query file passed as an argument to ACT.
Figure 12: Initiate Diagnostic Bundle
Enter any one-line command that you could normally run at the Linux command line, such as:
• A standard Linux operation; for example, ps, vmstat, ls, and so on.
• Your own shell script or custom program. Include the full directory path as well as the command name.
• An SQL command, specified by running ACT from the command line and passing in the SQL as a parameter. When you pass in the SQL as a file, the SQL file must be present on every node where the command will run.
6 Click Create Bundle Now.
If the commands succeed, the output of the commands will be included in the bundle
(.tgz) file(s). If you request a queen and cluster bundle, you will get two separate .tgz files,
one for the queen and one for the rest of the cluster.
Troubleshooting: If the command(s) do not run successfully, the bundling operation will
usually complete anyway, but in the Diagnostic Bundle Jobs table on the Support tab you
will see that the Status column contains a yellow exclamation point rather than a green
checkmark. The word Completed will be underlined, and if you hover over the word or click
it, you will get a short description of the problems that occurred when the AMC tried to
run the command.
To see the output, use the steps in the next section, “View Diagnostic Log Bundle Contents”.
View Diagnostic Log Bundle Contents
To look at the contents of a diagnostic log bundle:
1 Find the .tgz file. Do one of the following:
• Download the file as described in “Save a Diagnostic Log Bundle on Your Local Filesystem” on page 223; or,
• Open an ssh session to the queen machine and look in the directory /primary/diagbundles. Look for a file with the same name shown in the list of diagnostic log bundle jobs. The name will look similar to YYYYMMDD_HH.MI.SS.tgz, where YYYY represents the year, MM represents the month, and so on.
2 Unzip the file. The result is a .tar file.
3 Untar the .tar file.
For a queen bundle, this will yield a set of directories with names like configs, cores, customcmds, logs, meta, and sysprofile.
For a cluster bundle, when you untar the bundle, you will get two directories, one named “meta” and one named “gather”. Inside the “gather” directory is a .tgz file for each worker node and loader node. If you unzip and untar one of these .tgz files, you will get the same directories shown above, with the files gathered from that node.
4 Use cd to change into the appropriate directory.
For example, to see the output of any custom commands that you passed in to the job, go to the customcmds directory. Use ls to make sure one file is displayed for each of the commands that you specified. The file names are the command numbers. In our example, there would be two files named 1 and 2.
5 Use your favorite text editor to view the contents of any file in the bundle directory.
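The unpacking steps above can be sketched in the shell. The bundle filename below is hypothetical (real names follow the YYYYMMDD_HH.MI.SS.tgz pattern), and for illustration the example builds a stand-in archive with the queen-bundle layout rather than using a real bundle:

```shell
# Work in a scratch directory; on the queen you would use /primary/diagbundles.
mkdir -p /tmp/bundle_demo && cd /tmp/bundle_demo

# Build a stand-in bundle with the directory layout described above.
mkdir -p stage/customcmds stage/logs
echo "output of custom command 1" > stage/customcmds/1
tar -czf 20130501_10.30.00.tgz -C stage customcmds logs

# Steps 2 and 3: a .tgz is a gzip-compressed tar, so one command unzips and untars.
tar -xzf 20130501_10.30.00.tgz

# Steps 4 and 5: inspect the output of custom command number 1.
cat customcmds/1    # prints "output of custom command 1"
```

Because a .tgz file is a gzip-compressed tar archive, tar -xzf performs the unzip and untar steps in one command; you can also run gunzip and then tar -xf separately, as the steps describe.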
Aster Glossary
This glossary lists terms you will encounter in building and using databases and applications
in Aster Database.
ACT
Aster Database Cluster Terminal (ACT) is the terminal-based SQL query client for Aster
Database.
AMC
Aster Management Console is a web-based administrative console that allows you to monitor
and control Aster Database.
Aster Database partitioning
See distribution (of rows).
Aster Database Data Validator
Discontinued utility in Aster Database that was used to check data.
Aster Database Loader
Also written as “ncluster_loader”, this is Aster Database’s command-line bulk loading utility.
Customers are encouraged to use this rather than Bulk Feeder. It provides an alternative to the
SQL INSERT statement and offers much better performance and error handling. Hint! Do not
confuse Aster Database Loader with a loader node.
Aster Database replication
To provide availability, Aster Database is designed to maintain multiple copies (usually two) of
your data. Maintaining these copies is called replication, and, when you create your cluster,
you specify a desired replication factor that tells the system how many copies to maintain.
Teradata Aster recommends running Aster Database at a replication factor of two, which
means the cluster stores two copies of your data at all times.
Replication is achieved by maintaining a copy of each Aster Database vworker. Recall that, in
the distributed architecture of Aster Database, your data is distributed across many vworkers
that do the work of retrieving data and performing calculations on the data.
For a given partition, we refer to the vworker holding the active copy of your data as the
“active vworker” and the one holding the backup copy as the “passive vworker.” If the active
vworker fails, the passive vworker takes over immediately.
automatic logical partitioning
The method of partitioning or splitting a large table into child partitions to optimize query
performance and simplify table administration. Automatic logical partitioning uses the
PARTITION BY RANGE or PARTITION BY LIST clause to create a partitioned table, and is
the preferred method of logical partitioning.
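The general shape of such a statement is sketched below. The table, the columns, and the partition-bound syntax are assumptions for illustration; check the Aster Database SQL reference for the exact clause forms your release accepts:

```sql
-- Hypothetical example of automatic logical partitioning by range;
-- names and partition-bound syntax are illustrative, not authoritative.
CREATE TABLE sales (
    sale_id   BIGINT,
    sale_date DATE,
    amount    NUMERIC(10,2)
)
DISTRIBUTE BY HASH (sale_id)
PARTITION BY RANGE (sale_date) (
    PARTITION sales_2008_01 START ('2008-01-01') END ('2008-02-01'),
    PARTITION sales_2008_02 START ('2008-02-01') END ('2008-03-01')
);
```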
_bee_stats database
See system tables.
backup queen
A second queen that you can activate if your queen fails. Usually it’s kept in a powered-down or STOPPED state with no workers connected, and only powered up when needed. Sometimes called a passive coordinator.
balance process
Also known as “balance processing,” this is the act of making sure active vworkers are evenly
distributed on your cluster's hardware, so that each worker node has about the same number
of active vworkers as all other worker nodes.
balance data
Also known as “balance storage,” this is the act of making sure your Aster Database contains
the required number of copies of your data (as specified by your replication factor; usually two
copies), and making sure that each replica copy is located on a separate physical worker node
from the primary copy.
BIT
Outdated name for ACT.
Bulk Feeder
Unsupported bulk-loading application; replaced with Aster Database Loader.
child partition
In an automatic logical partitioning schema, a child partition is one partition of the data in
the partitioned table.
child table
In a parent-child table inheritance schema, a child table is one partition of the data.
coordinator
Old term for the Aster Database queen.
CSV
Comma-separated value file format where commas are used as field separators.
CTAS
CREATE TABLE AS SELECT. This is a variant of a CREATE TABLE statement with a SELECT subquery that populates the new table with rows from an existing table. CTAS statements are used very frequently in the Aster Database context to manually repartition a table or to do the ‘transform’ part of ELT. See ELT and ETL.
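For example, a CTAS that manually repartitions a table on a new distribution key might look like this sketch (the clicks table and user_id column are invented for illustration):

```sql
-- Create a copy of the (hypothetical) clicks table, redistributed on user_id.
CREATE TABLE clicks_by_user
DISTRIBUTE BY HASH (user_id)
AS SELECT * FROM clicks;
```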
data locality
The state of having needed data local to (on the same machine as) an operation or other data.
Having data locality is a key factor affecting the efficiency of an operation in an MPP system.
data model
A database’s structure of tables and columns that determine what form and format the data
will be stored in.
DDL
Data definition language to create and alter database tables.
dimension table
One of the two main table types in a star schema-style database. A row in a dimension table
usually describes an item in detail. A dimension table stores unchanging or slowly changing
descriptions of the participants in the actions tracked by your database. (The details of each
action are recorded in the fact table.) For example, product names and descriptions usually
live in a dimension table. The volume of data in a dimension table typically grows slowly.
Often a dimension table enumerates the set of known values for a particular category.
distributed dimension table
In Aster Database you can optionally distribute a dimension table by declaring a distribution
key on it.
distributed query planning
The queen manages the distribution of data in the cluster, prepares top-level, partition-aware
query plans, issues queries to vworkers, and assembles the query results. The vworkers, in
turn, prepare local query plans and execute the queen's queries in parallel. The queen
structures top-level queries so that little or no data is shipped to the queen until the final
phase, when the query results are assembled and sent to the client.
distribution (of rows)
Distribution of rows (sometimes called “physical partitioning”) means splitting a table’s data
across many vworkers in Aster Database to allow scaling. This is a key Aster Database feature.
A physical partition is a subset or “slice” of rows stored on a vworker.
Don’t confuse Aster Database distribution with the common data modeling practice of logical
partitioning (a.k.a. parent-child table inheritance or automatic logical partitioning), which
you can also do in Aster Database. The difference is this: distribution happens automatically
based on the distribution key you declare using DISTRIBUTE BY when you create the table.
The distribution is automatic in the sense that you don’t have to declare the boundaries of
each partition. Instead, you just say which column (this is called the distribution key column)
provides the values that will be used to define the distribution, and Aster Database chooses
boundaries to split up the records. Logical partitioning, on the other hand, requires you to
explicitly declare the boundaries of each partition.
distribution key
When you create a fact (and optionally when you create a dimension table), you specify a
distribution key that determines how Aster Database will physically distribute that table’s data
across the cluster. The distribution key specifies which column's value will be evaluated to
determine its location in the cluster. If the table has a primary key defined, then the
distribution key must be one of the columns from the primary key. The column you choose as
your distribution key must be of a datatype allowed for use as a distribution key.
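A sketch of the two distribution choices (table and column names are invented; DISTRIBUTE BY HASH and DISTRIBUTE BY REPLICATION are the clauses this glossary describes):

```sql
-- Fact table: rows are spread across vworkers by the user_id distribution key.
CREATE FACT TABLE pageviews (
    user_id   BIGINT,
    view_time TIMESTAMP,
    url       VARCHAR(2048)
)
DISTRIBUTE BY HASH (user_id);

-- Small dimension table: the full contents are copied to every vworker.
CREATE TABLE countries (
    country_code CHAR(2),
    country_name VARCHAR(100)
)
DISTRIBUTE BY REPLICATION;
```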
ELT
Aster’s better alternative to the long-standing datawarehousing practice of ETL (extract,
transform, and load). In Aster Database it’s usually much better to extract, load, and only then
transform, because you can use the computing power of the cluster to carry out the data
transformations. The main tools for performing such transformations are the CTAS
command and SQL-MapReduce transformation functions that you write.
ETL
The long-standing datawarehousing practice for loading data into the warehouse. ETL stands
for extract, transform, and load. By ‘transform’, we mean the reformatting of the data that you
must do to ensure consistent and correct data representation in the warehouse. In Aster
Database, we prefer to follow the ELT approach to loading, rather than ETL.
fact table
One of the two main table types in a star schema-style database. In a star schema, the fact table
is usually the largest table in the database and records the minute-to-minute actions that your
database was built to track. (The job of storing the more detailed information about the
actions’ participants is delegated to a set of dimension tables.) In the fact table, each row
usually represents an action or movement, such as a sales transaction or a web pageview.
Because of this, the volume of data in a fact table tends to grow fast.
A fact table contains two types of columns: columns that contain facts (say, timestamp and
price of a sale) and columns that are foreign keys to the dimension tables (links to the rows,
for example, that describe the product sold and the customer who bought it). Note that Aster
Database does not enforce referential constraints. Foreign keys are used mainly for joining
tables. The effective primary key of a fact table is usually a composite of more than one
column.
You create a fact table in Aster Database with the CREATE FACT TABLE command, and you
must distribute each fact table by declaring one of its columns to be its distribution key, using
the keyword DISTRIBUTE BY.
foreign key
The column that is used to join a fact table with a dimension table. Aster Database does not
enforce referential constraints. Foreign keys are used mainly for joining tables.
Hadoop
Apache Hadoop is an open source platform for storing and managing big data. Teradata Aster
provides SQL-H to enable business users to access the Hadoop data from Aster Database
directly. Aster Database manages communication with Hadoop nodes through SQL-H to read
data for SQL queries and SQL-MR functions.
hash distribution
See physical partitioning.
HCatalog
HCatalog is the table and storage management service for data stored in Apache Hadoop.
Hive
An open-source SQL layer for Hadoop. Hive is not SQL-92 compliant and does not provide standard SQL guarantees.
ICE
The InterConnect Executable process in Aster Database. This is the Aster Database service
responsible for finding and shuffling partitions of data between vworkers. For example, if the
myusers table is distributed across many partitions and you run the query SELECT * FROM
myusers, then ICE collects the rows from all the partitions.
imbalanced
Undesirable cluster state that you should fix by running either a balance data or balance
process. This typically means you have one worker node with more than the desired number
of active vworkers running on it, or you have a single worker node hosting both an active
vworker and its corresponding replica vworker.
in-database applications
Aster Database’s in-cluster, in-database applications let you inject user functions
(applications) into the data flow at the lowest levels. For many applications, Aster Database is
superior to other distributed computing frameworks because Aster Database provides better
tools for data manipulation (partitioning, sorting, and the like), as well as process
management and workload management.
incorporate
Outdated term for balance data.
JDBC
Standard Java API that allows clients to access a database. Aster Database offers a JDBC driver.
list partitioning
See logical partitioning.
loader node
An optional node in Aster Database that is specialized in loading data. Normally, you can
route all loading directly through the queen, but for high-volume loading requirements, you
can deploy loader nodes to increase loading capacity. When using loader nodes, you initiate
the loading using the ncluster_loader utility, which communicates with the queen. The queen
then delegates loading to the loader nodes, which load data into the appropriate vworkers in
parallel. You can also force the use of a particular loader node.
logical partition
A child table or child partition and its data.
logical partitioning
Splitting one large table into smaller logical pieces for faster performance and easier
management. This is done via automatic logical partitioning (preferred) or parent-child table
inheritance (supported for backward compatibility) and is a common database practice as
well as a popular feature of Aster Database.
Each partition is created as a child partition of the single partitioned table. The top level table
is normally empty, there to represent the data set. Some logical partitioning designs contain
multiple generations of partitions. For example, you might have a schema in which table
sales_2008 has monthly child partitions sales_2008_01 through sales_2008_12,
and each monthly child partition has daily child partitions like sales_2008_01_01 through
sales_2008_01_31.
Don't confuse logical partitioning with the more automatic physical partitioning feature of
Aster Database.
For clarity when discussing logical partitioning in this document, we avoid the term “logical
partition,” and instead use the more explicit terms child table or child partition.
machine
See node.
materialized projection
A relatively narrow table that contains a copy of a group of columns that are commonly
accessed together. A materialized projection usually contains a subset of the columns of a
wider table and is created to allow queries to run faster.
nc_ tables
see system tables.
NIC bonding
Network link aggregation that allows you to combine multiple network interface cards to
support a common connection for better performance.
node
In the cluster, a node is a server machine that hosts vworkers, a loader, or a queen. Typically a
node is a physical machine, but if you’ve installed your cluster on VMware or in the cloud,
then it’s a virtual machine. In Aster Database, each node has a designated role as a queen node,
worker node, or loader node.
node splitting
See partition splitting.
ODBC
A standard API that allows clients to access a database. Aster Database offers an ODBC driver.
Optimized Transport
A massively parallel communication transport mechanism that enables dynamic
repartitioning of data.
parent-child table inheritance
An older method of splitting a large table into child tables to optimize query performance
using the INHERITS keyword. This approach has been replaced by the preferred automatic
logical partitioning.
partition
See physical partition or logical partition. For clarity, we avoid using the unqualified term
“partition” in this document and instead say “child table” or “child partition” for a logical
partition, or “physical partition” for a partition that Aster Database maintains automatically
based on a distribution key.
partition count
Each worker node in Aster Database contains a number of vworkers. The total number of
vworkers in the cluster is the “partition count” of the cluster.
partition splitting
The act of increasing the number of vworkers in your Aster Database. Having the appropriate
ratio of CPU cores to vworkers ensures efficient use of your workers’ computing power. As
your cluster grows and you add more worker machines, it eventually makes sense to increase
the total number of vworkers in order to maintain a good ratio. Contact Teradata support to
find out the proper CPU core/vworker ratio for your hardware.
Don't confuse repartitioning with partition splitting. Repartitioning happens to rows inside a
query, and does not involve changing the physical location of rows on disk. In repartitioning,
only the location of an in-memory copy of the row is changed. Partition splitting, on the other
hand, 'permanently' moves some rows to a different vworker for storage.
partitioning
See logical partitioning or distribution (of rows). The term “partitioning” is ambiguous.
passive coordinator
Outdated term for backup queen.
physical partition
See vworker.
physical partitioning
Outdated term. See distribution (of rows).
primary interface
The Ethernet NIC that the Aster Database administrator designated as the main networking
interface for cluster communications. This is specified by interface name and is often eth0.
primary queen
When discussing the queen and the backup queen, we refer to the currently operating queen
as the “primary queen”.
queen
The queen node is the Aster Database coordinator, distributed query planner, distributed
query coordinator, and keeper of the data dictionary and system tables. The queen is
responsible for cluster, transaction, and storage management. The queen handles software
delivery to all nodes. See also distributed query planning.
range partitioning
See logical partitioning.
repartitioning
The act of reshuffling the rows of a distributed table to nodes on the cluster where they are
needed for a join or aggregation. Repartitioning is frequently a prerequisite step for query
execution in which the data required for a join is laid out as though it were distributed by the
attribute/expression in the join or the aggregation. For example, when you run SELECT
column-a FROM foo GROUP BY column-a, if column-a is not the distribution key of foo,
then Aster Database must repartition foo so that, for the duration of this operation, it’s
distributed on column-a.
Don't confuse repartitioning with partition splitting. Repartitioning happens to rows inside a
query, and does not involve changing the physical location of rows on disk. In repartitioning,
only the location of an in-memory copy of the row is changed. Partition splitting, on the other
hand, 'permanently' moves some rows to a different vworker for storage.
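The GROUP BY example above can be sketched in SQL. The table foo and its columns are hypothetical (written here as valid identifiers, column_a and column_b, rather than the placeholder names column-a and column-b), and DISTRIBUTE BY HASH is shown as the usual Aster Database clause for declaring a distribution key:

```sql
-- Hypothetical table whose distribution key is column_b, not column_a.
CREATE TABLE foo (
    column_a INTEGER,
    column_b INTEGER
) DISTRIBUTE BY HASH (column_b);

-- Because column_a is not the distribution key, Aster Database must
-- repartition the in-memory rows of foo on column_a (for the duration
-- of this query only) before each vworker can compute its groups.
SELECT column_a
FROM foo
GROUP BY column_a;
```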
replicate
The act of updating the replica of a given piece of data when that piece of data changes. With
each change in a vworker's data, Aster Database ensures that the vworker's replica gets a record
of the change. See Aster Database replication.
Tip! The term “replica” also arises in the case of a replicated dimension table. Don’t confuse
the two.
replicated dimension table
A dimension table whose entire contents are copied to all vworkers for faster lookup. This is
the default behavior of a dimension table in Aster Database, or you can include the clause
DISTRIBUTE BY REPLICATION in your CREATE TABLE statement to create a replicated
dimension table.
Good to know: Don't confuse replicated dimension tables with Aster Database replication!
They are not closely related. What's being replicated in Aster Database replication are vworkers
(sometimes called “partitions”) whereas what's being replicated in a replicated dimension
table is the whole contents of the table.
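A minimal sketch of the CREATE TABLE form described above; the table and column names are hypothetical:

```sql
-- Hypothetical lookup table whose entire contents are copied to every
-- vworker, so joins against it never require repartitioning.
CREATE TABLE country_dim (
    country_code CHAR(2),
    country_name VARCHAR(64)
) DISTRIBUTE BY REPLICATION;
```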
replication
See Aster Database replication.
replication factor (goal)
Also written as “RF(g)”, this is the desired number of copies of data to be kept in Aster
Database. This is almost always 2. This is specified at installation time and can be changed.
This setting is stored on the queen as /home/beehive/config/goalReplicationFactor.
replication factor (current)
Also written as “RF(c)”, this is your cluster's current replication factor. RF(c) is the replication
degree of the partition with the lowest replication degree in the cluster. In other words, if one
partition in the cluster has lost its replica, meaning its current replication degree has fallen to
1, then the current replication factor of your cluster is 1. When RF(c) falls below RF(g), the
AMC alerts you that you need to take action to restore your cluster's replication factor.
RF
See replication factor (current) and replication factor (goal).
schema
A logical subdivision of a database. Typically, schemas are used to cordon off sections of the
database so that different groups of users have authority over the use of those sections.
Tip! In this document, we do not use the term “schema” to mean data model. We use “data
model” instead.
shared-nothing
A distributed computing architecture in which nodes are independent and do not share disk
or memory.
SMC
Outdated term for the AMC.
SQL-MapReduce
Aster’s programming framework and API for writing data analysis and manipulation
functions that you can run in a distributed manner.
SQL-MapReduce function
A function, usually invoked in a SELECT statement, that operates in Aster Database’s
SQL-MapReduce framework. You can write SQL-MapReduce functions yourself, or use Teradata
Aster’s functions.
standby queen
See backup queen.
star schema
The database design schema that DBAs most commonly use in Aster Database is the star
schema, consisting of (usually) one fact table surrounded by a set of dimension tables. Fact
tables store the running log of events or transactions. Dimension tables describe items in
detail. When you diagram the schema, it looks like a star, with the central fact table
surrounded by dimension tables.
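As an illustration only (the table names, columns, and DISTRIBUTE BY clauses below are hypothetical), a minimal star schema might be declared as:

```sql
-- Central fact table: the running log of transactions,
-- distributed across vworkers by a high-cardinality key.
CREATE TABLE sales_fact (
    sale_id     BIGINT,
    customer_id BIGINT,
    product_id  BIGINT,
    amount      NUMERIC(10,2)
) DISTRIBUTE BY HASH (customer_id);

-- A surrounding dimension table: detailed descriptions of items,
-- replicated in full for fast lookups during joins.
CREATE TABLE product_dim (
    product_id   BIGINT,
    product_name VARCHAR(128)
) DISTRIBUTE BY REPLICATION;
```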
stats db
See system tables.
system tables
Tables that hold Aster Database system information. Their names start with “nc_”. These
tables are often referred to as the “stats db” or the “_bee_stats db”.
tuple
A “tuple” is an ordered set of values that we think of as a single record. Rows are the elements
that make up a database table. At any given time, each row is represented by a specific tuple
of values, and a row can be updated over time to contain a different tuple. In the Aster
Database documentation, we use “row,” except on those rare occasions when we’re drawing
the distinction between a tuple and a row.
UDF
user-defined function
vworker
A virtual worker responsible for storing and operating on data in Aster Database.
Conceptually, a vworker is roughly equivalent to a physical data partition in Aster Database,
and as a result you will often hear people refer to a vworker as a “partition” or “physical
partition.”
The queen delegates work to vworkers, and query results are aggregated and returned via the
queen. In a typical installation, you'll have as many active vworkers per worker node as you
have CPU cores per worker node. See also partition count.
view
A stored query accessible as a virtual table. A view is composed of the result set of a query. A
view is not part of the physical schema, but is instead a dynamic, virtual table computed or
collated from data in the database.
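For example, assuming a hypothetical sales_fact table with customer_id and amount columns, a view might be defined as:

```sql
-- The view stores the query, not its results; the virtual table is
-- computed from sales_fact each time the view is read.
CREATE VIEW customer_totals AS
SELECT customer_id, SUM(amount) AS total_spent
FROM sales_fact
GROUP BY customer_id;
```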
virtual worker
See vworker.
WAL file
A Postgres write-ahead log file.
worker
In this document, we avoid this term. Instead, we say vworker to mean the basic Aster
Database unit that does work, or we say worker node to mean the physical or virtual server
that acts as a worker machine in the cluster.
worker node
An Aster Database node (machine) that contains vworkers.
Index
Symbols
/primary 71
A
about this book 13
Activate and Balance Processing 59
Activate and Balance Storage 122
to restore RF 58
activate Aster Database 121
activate nodes 121
Activating status 39
activation 120
about 120
instructions 121
Active (node state) 122
active node 52, 122
compared with passive node 122
Active status 39
add node 71
Add Node button 91
Admin 90, 192
Configuration tab 109
Roles and Privileges tab 113
admin console (ncli) 148
admin console URL 26
Admin tab in AMC 90, 192
administrative actions 90
allowed actions based on cluster status 40
allowed AMC actions based on user privileges 113
configure hosts 115
administrative console URL 26
administrator
command line controls 148
creating an AMC administrator 114
administrator role
allowed AMC actions based on user privileges 113
alerts 132
AMC 26
address to type in browser 26
admin actions 90, 192
admin actions allowed based on role 113
admin actions allowed based on status 40
Admin Tab 90, 192
Admin: Configuration tab 109
Admin: Roles and Privileges tab 113
certificate, managing 27
Config Panel 91
Dashboard 26
Data Panel 50
documentation link 34
executables
not supported 213
introduction 26
launching 26
Nodes Panel 50
opening the AMC in your browser 26
overview 32
process management 42
security warning 28
status lamp 38
troubleshooting 31
trusting the certificate 28
URL 26
user (creating an AMC user) 114
user roles, editing 115
user roles, list all 113
user roles, list current 115
window layout 33
analytic tables 21
architecture 16
architecture diagram
tiers 18
Aster Data, about 12
Aster Database
activate 121
activation, about 120
activation, instructions 121
checking status from the command line 124
command-line cluster controls 124
overview 16
restarting 118
starting from the command line 125
status of 124
stopping from the command line 125
Aster Database backup
use separate network for 95
Aster Database status 38
administrative action rights and 40
Aster Management Console 26
Aster Relational Compute Engine (ARC) 17
Aster support portal 12
availability 21
available space 52
cluster-wide 37
per-node 53
B
Backing Up status 40
backup
use separate network for 95
backup node
defined 19
backup queen 82
activating 82
restoring 82
balance data 122
defined 229
to restore RF 58
Balance Data button 122
balance process 78, 123
Balance Process button 59, 78, 123
BIT, the application 229
blackbird 132
blue light 39
build number 35
bulk export
Teradata Aster connector 116
bulk load
Teradata Aster connector 116
C
capacity 52
capacity gauge
cluster-wide 37
per-node 53
certificate
managing the AMC certificate 27
check status 124
child partition 229
Clean Node check box
background 71
UMOS 73
cleaning a node for re-use 71
cli for Aster 148
cluster
monitoring 132
monitoring events 132
securing 128
SNMP monitoring 145
cluster architecture 16
cluster status 124
administrative action rights and 40
command line interface, ncli 148
command-line cluster management tools 124
compression
viewing in the AMC 38
concurrency
maximum recommended 87
Config Panel of AMC 91
Configuration tab 109
configure DNSs 115
Configure Hosts
Aliases 116
IP Address 116
IP Address and Aliases 117
connect
Teradata Aster Connector 116
console
URL 26
console (ncli) 148
conventions 11
coordinator 229
coordinator reconstruction 82
copyright 13
cores per vworker
adding vworkers for better CPU utilization 79
CPU cores per vworker
adding vworkers for better CPU utilization 79
CREATE USER
creating AMC user 114
CSV 229
CTAS 230
customer support 12
D
Dashboard 26
Nodes section of the Dashboard 36
Processes section of the Dashboard 35
window layout 33
data
balance data 122
compressed and raw size of 38
skew 65
space used and remaining, cluster-wide 37
space used and remaining, per-node 53
Teradata Aster connector 116
data export
Teradata Aster connector 116
data loading
Teradata Aster connector 116
data locality 230
Data Panel in AMC 50
data storage 52
balancing in cluster 122
cluster-wide 37
per-node 53
restoring replication 58
utilization 52
date of publication 13
DDL 230
delete
all data from a node machine 71
disk
free space
cluster-wide 37
per-node 53
reclaiming vworker space 125
disk failure 59
disk space
cluster-wide 37
per-node 53
distribution key
defined 231
dmesg 63
documentation 34
opening from AMC 34
documentation conventions 11
documentation version and updates 13
documentation, about 11
drivers
Teradata Aster Connector 116
E
edition 13
ELT 231
ETL 231
event monitoring 132
executable jobs 192
executables 192
unsupported
All Table Sizes 213
Table Size 213
Table Size (Details) 213
export
Teradata Aster Connector 116
export data
Teradata Aster connector 116
exporter, defined 18
F
failed node 58
failover 21
queen 82
replication factor 56
failure recovery 82
firewall 128
disable 130
open ports 129
open ports on enterprise firewall 129
foreign key: declaration not supported in Aster Database 231
free space
cluster-wide 37
freeing space occupied by dead vworker 125
overview in AMC 37
per-node 53
viewing the amount of 52
G
get latest documentation 13
glossary 228
green light 39
H
H2 Head2
Check the Current Replication Factor 57
Node Statistics Summary 37
H3 Head3
ncli netconfig examples 166
HA 21
Hadoop 232
hard restart 119
hardware
monitoring 132
hardware failure 59
hash partitioning 232
HCatalog 232
help 12
Help link 34
high availability 21
history of jobs or queries run 42
HTTPS
allowing HTTP connections to the AMC 28
managing the AMC certificate 27
I
imbalanced 232
data imbalanced 39
processing imbalanced 39
incorporate 232
in-database applications 232
init.d/local status command 124
Is the cluster up? 124
IWT 232
J
job history 42
timeline 46
K
kernel log 63
L
lamp 38
launching the AMC 26
light greyed out 40
list nodes 51
list of statements run 42
list partitioning 232
load data
Teradata Aster connector 116
loader
defined 18
installing secondary queen software on 85
loader node
adding 71
use secondary queen as a 68
loading
Teradata Aster Connector 116
use separate network for 95
local restart 58
local status 124
log 63
format of Aster Database logs 64
retrieve in AMC 63
logging 63
alerts 132
format of Aster Database logs 64
logical partitioning
defined 233
logs
filtering display of 44
M
machine 233
management console
URL 26
massively parallel processing 16
monitor
SQL-MapReduce execution 49
monitoring 132
cluster 132
SNMP monitoring 145
mpp 16
overview 20
multi-NIC machines 95
N
nc_ tables
defined 233
nc_relationstats
detecting skew 209
nc_skew
detecting skew 209
ncli 148
preupgrade commands 153, 158
ncli qos 173
Admission Limits 173
Workload Management 173
network
NIC bonding requirements 102
open ports 129
open ports on enterprise firewall 129
Network Assignments feature (since 4.6.3) 95
network configuration on cluster 115
network topology 16
network traffic, segregating by function 95
networking
multi-NIC machines 95
new node 52
NIC
multi-NIC machines in Aster Database 95
NIC bonding
advisories 104
benefits of 102
defined 233
mode is balance-alb (mode 6) 101, 102
no support for 802.3ad 101
requirements 102
settings 104
node 50
activate 121
activation, about 120
activation, instructions 121
active 52
active node 122
adding 71
cleaning a node for re-use 71
data skew 65
defined 17, 233
failure 54
failure, fixing 58
list of nodes 51
logs for a node 63
new status 52
node state 51
node status, list of 51
passive 52
passive node 122
preparing status 52
reconstructing 82
reprovision old node 71
restarting 58
suspect 52
suspect node 58
types in AMC 51
node list 51
node splitting 234
node state 51
Nodes Panel in AMC 50
O
ODBC, defined 234
on-disk data size 38
online help, launching 34
open ports 129
on enterprise firewall 129
opening the AMC 26
optimization
split partitions 79
overview panel in AMC 32
P
parent-child table inheritance 234
partition 234
defined 234
partition count 79
current 79
defined 234
initial (UMOS) 87
partition splitting 79
defined 234
partitioning
defined 234
initial partition count (UMOS) 87
overview 19
split partitions 79
Passive (node state) 122
passive coordinator 234
passive node 52, 122
compared with active node 122
passwordless root SSH, setting up
for queen replacement 85
payload gauge
cluster-wide 37
per-node 53
performance tuning
adding vworkers for better CPU utilization 79
permissions
AMC user permissions 113
physical partition 234
physical partitioning
defined 234
physical worker 18
planner, distributed aspects of 18
portal 12
ports
open ports on enterprise firewall 129
PostgreSQL version 17
preparation log (Prep Log) 63
prepared node 52
preparing node 52
primary interface
defined 235
primary queen 235
privileges
AMC user permissions 113
Process Filter 44
process management
AMC 42
processing power, balancing in cluster 78, 123
processing skew 65
processing: balance 123
processor requirements
adding vworkers for better CPU utilization 79
prompt, ncli 148
provision node
reprovision old node 71
Q
QosManager 173
queen
backup 82
defined 18, 235
failure recovery 82
migration 82
reconstruction 82
recovery 82
replace failed queen 82
replacement 82
standby queen 82
queen reconstruction 82
queen recovery 82
queen replacement 82
procedure 82
query
cancel 49
list of statements run (UI) 42
time elapsed (UI) 46
query history
AMC query history page 42
timeline 46
query timeline 46
quiesce 125
R
range partitioning 235
ratio of cores to vworker
adding vworkers for better CPU utilization 79
rebalance data 122
rebalance processing 123
rebalance processing power 78, 123
rebalance storage 122
reboot
soft shutdown 125
reconstructing the queen node 82
recover queen 82
recovery 82
red light 40
release number 35
repartitioning 235
replace disk 59
replace queen 82
replicate
defined 235
replicated dimension table 235
Replicating status 39
replication
defined 228
replication factor 21, 37, 56
changing 59
checking 57
queen recovery and 84
recommended 87
restoring 58
setting initial replication factor 87
viewing summary of 37
replication factor, current 236
replication factor, goal 236
reprovision old node 71
restart 118
hard restart 119
soft 119
soft shutdown 125
restarting Aster Database 118
Restarting status 39
restore backup queen 82
restoring replication factor 58
Restoring status 40
revision number 35
RF 56
changing 59
setting initial RF 87
viewing summary of 37
role
AMC user role 113
edit roles of AMC user 115
list available AMC user roles 113
list roles of AMC user 115
Roles and Privileges tab 113
S
scale out 71
add vworkers 79
split partitions 79
schema, defined 236
script management 192
scripts 192
secondary queen
installing 85
secondary queen node
convert to a loader 68
security 128
firewall 128
separate networks by function 95
shutdown 125
command-line shutdown 125
soft shutdown 125
size of table, checking 67
skew 65
skew, finding 67
SMC 236
SNMP 145
SNMP read configuration 146
SNMP monitoring 145
soft restart 119
soft shutdown 125
soft startup 125
SoftShutdownBeehive.py 125
SoftStartupBeehive.py 125
space 52
reclaiming vworker space 125
space available
cluster-wide 37
per-node 53
split partitions 79
SQL-MapReduce
history of jobs run 42
monitoring 49
SSH
setting up passwordless SSH for new queen 85
standby node 122
standby queen 82
activating 82
restoring 82
start
command-line startup 125
start Aster Database 121
state
Aster Database state 38
node state 51
statistics
disk usage, cluster-wide 37
disk usage, per-node 53
stats db 237
status 38
administrative action rights and 40
Aster Database status 38
command-line status check 124
node status 51
status lamp 38
Stopped status 40
storage 50
viewing stored data size in the AMC 38
storage utilization 52
cluster-wide 37
per-node 53
storage, balancing 122
storage, restoring replication in 58
support 12
Suspect node
explained 54
suspect node 52, 58
fixing 58
system log 63
system overview 16
system statistics
disk usage cluster-wide 37
disk usage per-node 53
system status 38, 124
system tables, defined 237
T
table
size, checking 67
table size, checking 67
TCP ports 129
technical support 12
Teradata Aster connector 116
timeline of jobs run 46
troubleshooting
AMC Add Node dialog box displays unexpectedly 32
AMC certificates 31
AMC login window refuses to load 31
skew, detecting 67
tuning
adding vworkers for better CPU utilization 79
tuple 237
typeface conventions 11
U
Unavailable status 40
updated documentation 13
URL 12
AMC 26
Aster Support URL 12
Ganglia 27
old AMC 27
user
AMC user role 113
AMC user roles, editing 115
AMC user roles, list all 113
AMC user roles, list current 115
creating AMC user 114
permissions in AMC 113
permissions in AMC, editing 115
permissions in AMC, list all 113
permissions in AMC, list current 115
utilities
command-line cluster controls 124
firewall 128
init.d/local status command 124
initialPartitionCount 80
local restart 58
monitoring tools 132
SNMP monitoring 145
SoftShutdownBeehive.py 125
SoftStartupBeehive.py 125
Teradata Aster Connector 116
totalPartitionCount 79
V
var/log/messages 63
version 35
AMC version 35
Aster Database software version 35
checking from command line 35
documentation version 13
view
defined 237
virtual worker 18
adding more virtual workers 79
adding vworkers for better CPU utilization 79
defined 237
freeing space occupied by dead vworker 125
initial number of virtual workers 87
primary vs. replica 56
set-up 87
vworker: See virtual worker.
vworkers
number of 87
recommended settings 87
W
WAL file 237
worker
defined 238
worker node 18
adding 71
adding more vworkers per worker node 79
defined 18, 238
failed or suspect node 58
restarting 58
worker: See "worker node" or "virtual worker."
Y
yellow light 39, 40