Teradata Aster Big Analytics Appliance 3H Database Administrator Guide
Release 5.0.1
B035-5495-102K
Revised March 2013

The product or products described in this book are licensed products of Teradata Corporation or its affiliates.

Teradata, Active Enterprise Intelligence, Applications-Within, Aprimo, Aprimo Marketing Studio, Aster, BYNET, Claraview, DecisionCast, Gridscale, MyCommerce, Raising Intelligence, Smarter. Faster. Wins., SQL-MapReduce, Teradata Decision Experts, "Teradata Labs" logo, "Teradata Raising Intelligence" logo, Teradata ServiceConnect, Teradata Source Experts, "Teradata The Best Decision Possible" logo, The Best Decision Possible, WebAnalyst, and Xkoto are trademarks or registered trademarks of Teradata Corporation or its affiliates in the United States and other countries.

Adaptec and SCSISelect are trademarks or registered trademarks of Adaptec, Inc.
AMD Opteron and Opteron are trademarks of Advanced Micro Devices, Inc.
Apache, Apache Hadoop, Hadoop, and the yellow elephant logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.
Axeda is a registered trademark of Axeda Corporation. Axeda Agents, Axeda Applications, Axeda Policy Manager, Axeda Enterprise, Axeda Access, Axeda Software Management, Axeda Service, Axeda ServiceLink, and Firewall-Friendly are trademarks and Maximum Results and Maximum Support are servicemarks of Axeda Corporation.
Data Domain, EMC, PowerPath, SRDF, and Symmetrix are registered trademarks of EMC Corporation.
GoldenGate is a trademark of Oracle.
Hewlett-Packard and HP are registered trademarks of Hewlett-Packard Company.
Hortonworks, the Hortonworks logo and other Hortonworks trademarks are trademarks of Hortonworks Inc. in the United States and other countries.
Intel, Pentium, and XEON are registered trademarks of Intel Corporation.
IBM, CICS, RACF, Tivoli, and z/OS are registered trademarks of International Business Machines Corporation.
Linux is a registered trademark of Linus Torvalds.
LSI is a registered trademark of LSI Corporation.
Microsoft, Active Directory, Windows, Windows NT, and Windows Server are registered trademarks of Microsoft Corporation in the United States and other countries.
NetVault is a trademark or registered trademark of Quest Software, Inc. in the United States and/or other countries.
Novell and SUSE are registered trademarks of Novell, Inc., in the United States and other countries.
Oracle, Java, and Solaris are registered trademarks of Oracle and/or its affiliates.
QLogic and SANbox are trademarks or registered trademarks of QLogic Corporation.
Red Hat is a trademark of Red Hat, Inc., registered in the U.S. and other countries. Used under license.
SAS and SAS/C are trademarks or registered trademarks of SAS Institute Inc.
SPARC is a registered trademark of SPARC International, Inc.
Symantec, NetBackup, and VERITAS are trademarks or registered trademarks of Symantec Corporation or its affiliates in the United States and other countries.
Unicode is a registered trademark of Unicode, Inc. in the United States and other countries.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Other product and company names mentioned herein may be the trademarks of their respective owners.

THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN "AS-IS" BASIS, WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THE ABOVE EXCLUSION MAY NOT APPLY TO YOU. IN NO EVENT WILL TERADATA CORPORATION BE LIABLE FOR ANY INDIRECT, DIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS OR LOST SAVINGS, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

The information contained in this document may contain references or cross-references to features, functions, products, or services that are not announced or available in your country. Such references do not imply that Teradata Corporation intends to announce such features, functions, products, or services in your country. Please consult your local Teradata Corporation representative for those features, functions, products, or services available in your country.

Information contained in this document may contain technical inaccuracies or typographical errors. Information may be changed or updated without notice. Teradata Corporation may also make improvements or changes in the products or services described in this information at any time without notice.

To maintain the quality of our products and services, we would like your comments on the accuracy, clarity, organization, and value of this document. Please email: [email protected].

Any comments or materials (collectively referred to as "Feedback") sent to Teradata Corporation will be deemed non-confidential. Teradata Corporation will have no obligation of any kind with respect to Feedback and will be free to use, reproduce, disclose, exhibit, display, transform, create derivative works of, and distribute the Feedback and derivative works thereof without limitation on a royalty-free basis. Further, Teradata Corporation will be free to use any ideas, concepts, know-how, or techniques contained in such Feedback for any purpose whatsoever, including developing, manufacturing, or marketing products or services incorporating Feedback.

Copyright © 2000-2013 by Teradata Corporation. All Rights Reserved.

Table of Contents

Preface
    Conventions Used in This Guide
    Typefaces
    SQL Text Conventions
    Command Shell Text Conventions
    Contacting Technical Support
    About Teradata Aster
    About This Document

VOLUME 1 Aster Database Administrator Guide

Chapter 1: Overview of Cluster Management
    Architecture
    PostgreSQL and Aster Database
    Node Types in Aster Database
    Single-System View
    Data Partitioning
    Massively Parallel Processing (MPP)
    High Availability Overview
    User Interfaces to Aster Database
    Cluster Administration
    Database Administration
    Data Path Interfaces
    Bulk Data Utilities

Chapter 2: Dashboard: The AMC
    Overview of AMC
    Launching the AMC
    The AMC Dashboard
    Processes Section of the Dashboard
    Dashboard: Processes Section: Green Summary Box
    Query Statistics Summaries in the Processes Section
    Nodes Section of the Dashboard
    Dashboard: Nodes section: Green Summary Box
    Dashboard: Nodes section: Center Panel
    Dashboard: Nodes: Cluster-Wide Disk Capacity/Usage
    Configuring and Using the AMC
    AMC System Requirements
    Installing AMC
    Connecting to the AMC
    Managing AMC Certificates
    Disabling HTTPS access for AMC
    Troubleshooting the AMC
    Understanding Aster Database Status
    The Status Icon
    Aster Database Status Descriptions
    Allowed Administrative Actions

Chapter 3: Processes: Managing Activity
    Processes Tab
    Process Information
    Filtering the Process List: The Process Filter
    Disabling Auto-Polling
    Query Timeline Tab
    Sessions Tab
    Process Details Tab
    Cancelling SQL Statements
    SQL-MapReduce Statements

Chapter 4: Nodes: Managing Data and Nodes
    Node Overview Tab
    The Node List
    Node Types
    Understanding Node States
    Disk Storage Utilization
    Hardware Stats Tab
    Node Failures in Aster Database
    Addressing Hardware and Networking Problems on Workers
    Hardware Configuration Tab
    Partition Map Tab
    Replication Factor
    Checking the Current Replication Factor
    Restoring Replication Factor
    Changing the Replication Factor
    Individual Node Inspection Tab
    Reading Aster Database Logs
    Creating Log Bundles for Support Inquiries
    Aster Database Log Format
    Detecting and Managing Skew in Aster Database
    Table Skew (Data Skew)
    Partition Level Skew
    Processing Skew

Chapter 5: Cluster Expansion
    Add New Nodes to the Cluster
    Activate Aster Database
    Incorporate the New Nodes
    Balance Process
    Balance Process: The Procedure
    Splitting Partitions in Aster Database
    Appendix: Troubleshooting Cluster Expansions
    If Your “Add Node” Attempt Stalls or Fails
    Add Node Fails With “user data directories are present” Message
    Add Node Hangs at Installing OS Phase
    Deleting All Data to Re-Provision a Node

Chapter 6: Queen Replacement
    Best Practices for Ensuring Queen Recoverability
    Install the Queen Software on the Loader Node
    Replace the Failed Queen
    Outline of the Queen Replacement Procedure:
    Prerequisites
    Build the MAC Address File
    Removing the Failed Primary Queen
    Setting Up Passwordless Root SSH Among All Nodes
    Shutting Down Workers
    Shutting Down and Configuring the Backup Queen
    Swapping Old and New Queens’ Network Settings
    Backup Queen is Now Primary Queen
    Running the Queen Replacement Script
    What is kept and what is lost during queen replacement?

Chapter 7: Admin: Administrative Operations
    Admin: Cluster Management
    Using Multi-NIC Machines in Aster Database
    Remove Nodes from the Cluster
    The Hardware Configuration Panel
    The Node Inspection Panel
    Admin: Events
    Admin: Executables
    Admin: Backup
    Adding a New Backup Manager to the AMC
    Starting a Backup
    Monitoring and Managing Backups
    Admin: Configuration: Cluster Settings
    Cluster Settings
    Sparkline Graph Scale Units
    Graph Scaling
    Internet Access Settings
    Aster Support Settings
    QoS Concurrency Threshold Configuration
    Admin: Configuration: Workload
    Admin: Configuration: Roles and Privileges
    Viewing the list of available AMC user privileges
    Creating an AMC user in Aster Database
    Checking users’ current AMC privileges
    Editing users’ AMC privileges
    Admin: Configuration: Hosts
    Setting up Host entries for all Aster Database nodes
    Setting up DNS entries for all Aster Database nodes
    Admin: Logs
    Restarting Aster Database
    Procedure
    Soft Restart
    Backup interaction with soft-restart
    Hard Restart
    Soft Shutdown
    Activating Aster Database
    Situations that Require an Activation
    Activating Aster Database: The Procedure
    Balance Data
    Balance Data: The Procedure
    Balance Process
    Balance Process: The Procedure
    Cluster Management from the Command Line
    Checking Cluster Status
    Soft Restart
    Soft Shutdown
    Soft Startup
    Freeing Space Occupied By Defunct V-Workers
    Setting Up Passwordless Root SSH Between Nodes

Chapter 8: Securing Aster Database
    Aster Database Firewall
    Default Firewall Settings
    Open Ports
    Enabling and Disabling Aster Database Firewall

Chapter 9: Monitoring Aster Database
    Event Monitoring with the Event Engine
    Event Engine Overview
    Managing Event Subscriptions
    Upgrades of Event Engine
    Viewing Event Subscriptions
    Supported Events
    Remediations
    Event Engine Best Practices/FAQs
    Testing the Event Engine
    Troubleshooting Event Engine Issues
    SNMP Monitoring of Aster Database
    Setting Aster Database to send SNMP traps to an NMS
    Setting an NMS to perform SNMP reads on Aster Database

Chapter 10: Admin: ncli (Aster Database Command Line Interface)
    ncli Installation and Setup
    Installing ncli
    Setting up Passwordless SSH
    Using ncli
    Who Should Use ncli?
    Issuing ncli Commands
    Command Line Conventions
    ncli help
    ncli Command Reference
    ncli command sections
    ncli node section
    ncli tables Section
    ncli procman Section
    ncli qos Section
    ncli process Section
    ncli ippool Section
    ncli vworker Section
    ncli system Section
    ncli ice Section
    ncli disk Section
    ncli replication Section
    ncli session Section
    ncli nsconfig Section
    ncli query Section
    ncli netconfig Section
    ncli statsserver Section
    ncli events Section
    ncli util Section
    ncli sysman Section
    ncli database Section
    ncli Flags
    Limiting Actions of ncli
    Formatting and Sorting Flags
    Miscellaneous Flags

Chapter 11: Admin: Executables
    Executables Tab
    Executables Library
    Out-of-the-Box Scripts
    Executable Jobs
    Running Scripts
    Best Practices for Running Scripts
    Creating Scripts
    Requirements
. . . . 174 Variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 SQL Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 Cluster Utility SQL-MapReduce Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 Best practices for building scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 Upgrades . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 Chapter 12: Using Teradata Tools to Manage Aster Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 Managing Aster Database with Teradata Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 Information available through Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 Administrative operations available through Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . 190 Configuring Aster Database for use with Viewpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 Troubleshooting the Viewpoint integration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 Chapter 13: Aster Database Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 Overview of Diagnostic Log Bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 Using Diagnostic Log Bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 Displaying the Diagnostic Bundle Jobs Panel. 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 Sending a Diagnostic Log Bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 Saving a Diagnostic Log Bundle on Your Local Filesystem. . . . . . . . . . . . . . . . . . . . . . . . 197 Including All Nodes in a Diagnostic Log Bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 Making a Custom Diagnostic Log Bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 Running Custom Commands in a Diagnostic Log Bundle Job . . . . . . . . . . . . . . . . . . . . 198 Viewing Diagnostic Log Bundle Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 Teradata Aster Big Analytics Appliance 3H Database Administrator Guide 9 Aster Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 10 Teradata Aster Big Analytics Appliance 3H Database Administrator Guide Preface This guide explains the software tasks you will perform to manage your Aster Database cluster. You can download other useful tools and documents from http://tays.teradata.com/. In addition, the Teradata Aster Resource Center at https:// everest.asterdata.com:444/resourcecenter/index.php provides documents, videos, and downloadable client software for various operating systems. Figure 1: Teradata Aster Resource Center Conventions Used in This Guide This document assumes that the reader is comfortable working in Windows and Linux/UNIX environments. Many sections assume you are familiar with SQL. This document uses the following typographical conventions. Typefaces Command line input and output, commands, program code, filenames, directory names, and system variables are shown in a monospaced font. 
Words in italics indicate an example or placeholder value that you must replace with a real value. Bold type is intended to draw your attention to important or changed items.

SQL Text Conventions

In the SQL synopsis sections, we follow these conventions:
• Square brackets ([ and ]) indicate one or more optional items.
• Curly braces ({ and }) indicate that you must choose an item from the list inside the braces. Choices are separated by vertical lines (|).
• An ellipsis (...) means the preceding element can be repeated.
• A comma and an ellipsis (, ...) means the preceding element can be repeated in a comma-separated list.
• In command line instructions, SQL commands and shell commands are typically written with no preceding prompt, but where needed the default Aster Database SQL prompt is shown:
  beehive=>

Command Shell Text Conventions

For shell commands, the prompt is usually shown. The $ sign introduces a command that’s being run by a non-root user:
  $ ls
The # sign introduces a command that’s being run as root:
  # ls

Contacting Technical Support

If you need the latest documentation or client software, check the Teradata Aster Resource Center at https://everest.asterdata.com:444/resourcecenter/index.php. Here you will find the latest documents, videos, and downloadable client software for various operating systems.

Figure 2: Teradata Aster Resource Center

For further assistance, contact Teradata technical support.
Support Portal: http://tays.teradata.com/
Email: [email protected]
Telephone: +1-650-273-5599

About Teradata Aster

Teradata Aster provides data management and advanced analytics for diverse and big data, enabling the powerful combination of cost-effective storage and ultra-fast analysis of relational and non-relational data.
Teradata Aster is a division of Teradata and is headquartered in San Carlos, California. For more information, go to http://tays.teradata.com/

About This Document

This is the “Teradata Aster Big Analytics Appliance 3H Database Administrator Guide,” version 5.0.1, edition 1. This edition covers Aster Database version 5.0.1_r29677 and was originally published March 14, 2013 3:01 pm.

Get the latest edition of this guide! This document is updated very frequently. You can find the latest edition at http://tays.teradata.com/

Document revision history:
March, 2013: 5.0.1, with modifications to the queen replacement procedure.
September, 2012: 5.0.1

To obtain printed copies of this manual or editions of this manual in other electronic formats, contact Teradata Aster.

Aster Database Administrator Guide

This guide explains how to manage your Aster Database cluster. The subsections are:
• Overview of Cluster Management (page 16)
• Dashboard: The AMC (page 26)
• Processes: Managing Activity (page 38)
• Nodes: Managing Data and Nodes (page 46)
• Cluster Expansion (page 66)
• Queen Replacement (page 78)
• Admin: Administrative Operations (page 90)
• Securing Aster Database (page 118)
• Monitoring Aster Database (page 122)
• Admin: ncli (Aster Database Command Line Interface) (page 136)
• Admin: Executables (page 170)
• Using Teradata Tools to Manage Aster Database (page 188)
• Aster Database Logging (page 194)
• Aster Glossary (page 202)

CHAPTER 1 Overview of Cluster Management

Aster Database is a massively parallel processing (MPP) database for general-purpose and analytic data warehousing.
The Aster Database system is built on the foundations of cluster commodity computing and can glue thousands of commodity machines together into a single database that gives the user and administrator a single-system view. The notion of a single-system view is central to the architecture of Aster Database. Administrators, analysts, and software applications (business intelligence tools, management consoles, etc.) interact with a single machine that offers the speed and processing power of the entire cluster, but hides cluster management tasks from the end-user.

This overview of Aster Database cluster management includes the following sections:
• Architecture (page 16)
• High Availability Overview (page 21)
• User Interfaces to Aster Database (page 22)

Architecture

Aster Database forms a massively-parallel database that can scale from three nodes (servers) to thousands of nodes. The smallest Aster Database configuration, at three nodes, includes one queen and two worker nodes. A node in Aster Database is a standard, inexpensive commodity x86 server from vendors such as HP, IBM, or Dell, with locally-attached storage (also known as direct-attached storage), and networked with other nodes using commodity Gigabit Ethernet (GigE) technology. In larger Aster Database configurations that span multiple racks, individual nodes are internetworked using 1 GigE, while racks are hierarchically networked using 10 GigE ports or multiple-trunked 1 GigE ports.

The Aster Database system is based on a multi-tiered architecture that emphasizes a clean separation of roles for nodes to meet the challenges of massive-scale data warehousing and analytic processing. A tier is formed by a category of nodes that are dedicated to executing a particular type of warehousing task. The presence of multiple tiers helps isolate workloads that compete for different cluster resources.

Incremental scaling is a hallmark of Aster Database.
Each tier can be independently and incrementally scaled in response to workload demands. Traditional data warehouses require customers to plan out warehousing capacity requirements months or years in advance, necessitating large upfront capital investments. In stark contrast, Aster Database can start out as a small installation and scale out incrementally, one node at a time. You can add capacity (worker nodes) and loading/exporting bandwidth (loader/exporter nodes) on an as-needed basis.

Figure 3: Aster Database Cluster Architecture

Aster Database is built to scale on heterogeneous hardware. You can extend multiple generations of commodity hardware to scale easily over time. Teradata Aster does, however, recommend that all worker nodes use the same size disk.

PostgreSQL and Aster Database

Aster Database utilizes components from the best-in-breed open source database, PostgreSQL (or Postgres), to provide high performance local database processing on worker nodes. Aster Database version 5.0.1 runs a new database kernel on the nodes. This kernel incorporates components of version 8.4 of PostgreSQL.

Node Types in Aster Database

Aster Database divides the set of warehousing tasks among various classes of task-dedicated nodes. The basic classes include the queen, workers, loaders, and backup nodes. Different classes of nodes typically run on different classes of server hardware.

Figure 4: Aster Database Architecture

• Queen: The queen node is the cluster coordinator, top-level query planner/coordinator, and keeper of the data dictionary and other system tables. You can maintain an inactive queen as a backup.
  • Cluster coordination: The cluster logic that glues all nodes of the system together is hosted on the queen.
    This software component is responsible for all cluster, transaction, and storage management aspects of the system. In this role, the queen is also responsible for seamless software delivery to all other nodes in the cluster.
  • Distributed query planning: The queen manages the distribution of data in the cluster, prepares top-level, partition-aware query plans, issues queries to virtual workers, and assembles the query results. The virtual workers, in turn, prepare local query plans and execute the queen’s queries in parallel. The queen structures top-level queries so that little or no data is shipped to the queen until the final phase, when the query results are assembled and sent to the client.
  • System tables: The queen hosts the Aster Database system tables.
• Worker Nodes: As the name implies, worker nodes are the physical machines where the bulk of the data storage, analysis, and retrieval tasks get done in Aster Database. Actually doing these tasks is the responsibility of the virtual workers (vworkers) that reside on each worker node. There is usually more than one vworker per worker node. The number of virtual workers on each worker node is a function of the hardware configuration of the node: the number of CPU cores, memory, and direct-attached disk capacity. The queen communicates with vworkers via standard SQL, and the vworkers on various worker nodes communicate with each other via Teradata Aster’s mechanism.
• Loader Nodes (also called “Exporters”): These are CPU-heavy nodes that typically have little to no disk capacity and help in independent scaling of CPU and disk in the cluster. They are responsible for loading and exporting data into and from the cluster. These independent nodes also help isolate loads and exports from query processing.
• Backup Nodes: Individual tables or the contents of your entire Aster Database system can be backed up to an Aster Database backup cluster. The backup cluster is not an Aster Database. Instead, it is a set of disk-heavy Aster Database Backup Nodes designed to efficiently maintain copies of the data you store in Aster Database. This data can be restored to Aster Database.

Single-System View

All clients of Aster Database communicate with the cluster as if it is a single large system. The Aster Database Management Console (AMC), the Aster Database Terminal (ACT), the Aster Database JDBC and ODBC drivers, and the Aster Database Backup Terminal each interact with Aster Database as if it were a single database. More details on this client software follow in subsequent sections.

Data Partitioning

Aster Database achieves a massively-parallel database by exploiting the popular “divide-and-conquer” principle in computing. Data is partitioned (distributed) among various shared-nothing nodes (worker nodes), and within each worker node, among various shared-nothing vworkers.

• Distribution: Tables in Aster Database are created with an added SQL qualifier of FACT or DIMENSION. Fact (or large dimension) tables are distributed into individual vworkers that span across multiple worker nodes in the cluster. The key (column) to use for distribution is provided in the CREATE TABLE DDL (data definition language).
• Logical Partitioning: Physical partitions of fact (or large dimension) tables are further divided within each vworker via logical partitioning within a physical partition. Multilevel partition hierarchies provide a powerful mechanism to prune data required during query processing. These sub-partitions are called “child partitions” in Aster Database and they report into a “parent table” at the highest level of the hierarchy.
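The two levels described above, distribution across vworkers and logical partitioning within each vworker, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CRC32 hash, the count of four vworkers, the monthly child partitions, and the names vworker_for, distribute, and scan_month are all hypothetical and are not taken from Aster Database, whose actual hashing and partitioning mechanisms this guide does not describe.

```python
import zlib
from collections import defaultdict

NUM_VWORKERS = 4  # hypothetical: total vworkers across all worker nodes


def vworker_for(key, num_vworkers=NUM_VWORKERS):
    """Pick a vworker for a distribution-key value.

    CRC32 is a stand-in; the hash Aster Database actually uses is not
    described in this guide.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_vworkers


def distribute(rows, key_column, num_vworkers=NUM_VWORKERS):
    """Place each row on a vworker, then in a monthly child partition."""
    placement = defaultdict(lambda: defaultdict(list))
    for row in rows:
        vworker = vworker_for(row[key_column], num_vworkers)
        child = row["sale_date"][:7]  # "YYYY-MM" names the child partition
        placement[vworker][child].append(row)
    return placement


def scan_month(placement, month):
    """Read one child partition on every vworker and skip all the others."""
    matches = []
    for child_partitions in placement.values():
        matches.extend(child_partitions.get(month, []))
    return matches


if __name__ == "__main__":
    rows = [
        {"user_id": i % 5, "sale_date": f"2013-0{1 + i % 3}-15", "amount": i}
        for i in range(30)
    ]
    placement = distribute(rows, "user_id")
    for vworker in sorted(placement):
        sizes = {m: len(r) for m, r in sorted(placement[vworker].items())}
        print(f"vworker {vworker}: {sizes}")
    print("rows scanned for 2013-03:", len(scan_month(placement, "2013-03")))
```

In this sketch, rows that share a distribution-key value always land on the same vworker, and a query restricted to one month touches only that month's child partition on each vworker, which is the pruning effect the Logical Partitioning item refers to.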
Figure 5: A table showing both distribution and logical partitioning

Massively Parallel Processing (MPP)

Aster Database is a true massively-parallel database and has been built from the ground up with inexpensive commodity hardware in mind. Every operation in Aster Database – queries, loads, exports, index builds and rebuilds, upgrades, replication, backups, restores – is always executed in a massively parallel fashion.

• MPP at workers: Aster Database comprises numerous vworkers hosted at the worker nodes, as described earlier. Each individual query, planned at the queen, is executed in massively-parallel fashion across all partitions. Partitions communicate with each other during dynamic repartitioning of data – a concept in shared-nothing databases explored in later sections – via a massively parallel communication transport called Optimized Transport.
• MPP in ETL: The Aster Database Loader utility and loader nodes form the massively-parallel backbone of the Aster Database ETL pipeline. The Aster Database Loader utility communicates with loader nodes and acts as a landing zone for bulk data, both during loads and exports.
• MPP in Backup and Recovery: When a backup administrator starts a backup by interacting with the Aster Database Backup Terminal, massively parallel streams of backup data travel from each partition source (N) to the destination backup nodes (M). This NxM communication is true for both backups and for the reverse traffic during recoveries.
• MPP in Upgrades: During a normal cluster startup, software is delivered via the network in a massively-parallel way. Similarly, after an upgrade package is deployed on the queen, workers and loaders in Aster Database are upgraded in a massively-parallel fashion without any manual administrator intervention.
• MPP in Network: Once you add a worker or loader node, the Aster Database Network Aggregation feature automatically self-discovers all the other NIC IP addresses connected to that node. Once that node is activated, you have multiple network links aggregated together, a feature called Network Aggregation. This “bonding” offers two advantages, described below.
  • Bandwidth Aggregation: NIC bonding aggregates all the bandwidth of the individual 1GbE links for expanded network throughput. Before bonding, you have 1x1GbE link per node. After bonding, you have n times that. For example, if you have 4x1GbE NICs on a node, after bonding you have four times the bandwidth, or effectively 4Gb bandwidth.
  • Transparent failover in the event of any single network failure: In the event of any single network failure (NIC port, cable link, switch), there will be transparent failover to the remaining links. The only impact is that bandwidth will adjust down accordingly. If you have 4x1GbE NICs with two links connected to each of two separate redundant GbE switches and one of the switches fails, the result is an automatic failover to the redundant switch, and the bandwidth shrinks from 4x1GbE links to 2x1GbE links. There is no downtime and continuous availability is maintained.

High Availability Overview

Cluster commodity hardware fails in more ways than one. Disks, RAID controllers, chipsets, DIMM modules, CPU, network cards, and switches are all commodity components that have a distinct possibility of failing in a large cluster. A RAID-10 disk configuration does not provide sufficient reliability for the high availability demands of modern data users. Aster Database, therefore, has high availability built into the cluster and offers replication as a first-class feature.

• Replication Factor: Each vworker in Aster Database has zero or more replicas in the cluster.
  The Aster Database administrator sets the replication factor to two (recommended; Aster Database tries to ensure each vworker has a replica at all times and alerts the administrators if there is not) or one (no replication; not recommended), depending on the desired tradeoff between reliability and storage capacity. DML (data manipulation language) statements, loads, and queries go directly to the primary partition. To view your system’s Replication Factor, view the Dashboard tab of the AMC.
• Automatic Failover and Online Resync: On any failure of the primary partition, the Aster Database clusterware automatically fails workloads over to a chosen secondary partition. Stale secondary replicas catch up using delta replication, where “delta” signifies changes present at the primary but not at the secondary. This is particularly useful after transient (temporary) failures. For example, assume Node 2 suffers a transient failure and a partition fails over to Node 4. If Node 2 fully recovers after two minutes, there may have been small changes (e.g. additional inserts/updates/deletes). Delta replication enables “delta re-synchronization” between the partition on Node 4 (up-to-date) and the partition on Node 2 (slightly out-of-date). Online re-sync can save significant recovery time compared to conventional approaches that rely on full copy restoration techniques that take hours or even days to complete. Queries are transparently retried on a failover.

Figure 6: Replication Failover in Aster Database

• Balance Data: When a new worker is added or a failed worker is brought back into the system after a transient failure, the node is incorporated in a completely seamless manner without any perturbation to the existing workload. Any new replicas are created in the background, totally online.
• Spare Machines: Teradata Aster recommends keeping spare servers in the cluster so that such nodes can be quickly repurposed to replace a failed queen, loader, worker, or backup node.
• Network Aggregation: Once you add a worker or loader node, the Aster Database Network Aggregation feature automatically self-discovers all the other NIC IP addresses connected to that node. When that node is activated, you have multiple network links aggregated together. In the event of any single network failure (NIC port, cable link, switch), there is transparent failover to the remaining links.

User Interfaces to Aster Database

The tools you use to interact with Aster Database include data-path interfaces such as SQL, JDBC, and ODBC, bulk data path interfaces for loading/exporting, and tools for managing the cluster. The sections below explain the most common Aster Database-related tools.

Cluster Administration

Aster Database Management Console (AMC): The AMC is a graphical user interface that is the primary cluster management interface for Aster Database. For details, see Dashboard: The AMC (page 26).

Event Notification and System Monitoring: The Aster Database Event Engine lets you set up alerts that fire when certain events happen on the cluster. See “Event Monitoring with the Event Engine” on page 122. In addition, you can get help from Teradata Aster consulting services to implement node, service, and network monitoring tools using other frameworks such as the popular open-source Nagios API.

Command line: Most management tasks can be done in the AMC, but to run some less frequently used utilities, you must use the executables framework, the ncli (Aster Database Command Line Interface) or open a command line session on the Aster Database queen.
See “Admin: Executables” on page 170, “Admin: ncli (Aster Database Command Line Interface)” on page 136, and “Cluster Management from the Command Line” on page 115.

Database Administration

Aster Database Terminal (ACT): ACT is the basic command line interactive terminal to interact with the cluster. Administrators can run DDLs and other commands to administer the database, browse the catalog and create/modify objects, and run SQL statements and scripts.

AquaFold’s Aqua Data Studio (ADS): ADS lets you perform DDL operations and query data interactively. This is a third-party tool that you may purchase from AquaFold directly.

Data Path Interfaces

JDBC and ODBC: Aster Database provides JDBC and ODBC drivers.

SQL via ACT: Developers and administrators can also run SQL statements and scripts by using the interactive ACT terminal.

AquaFold’s Aqua Data Studio (ADS): ADS lets you query data interactively and provides tools that help you write and manage queries efficiently.

Bulk Data Utilities

Aster Database Loader Tool
The Aster Database Loader Tool, ncluster_loader, is a full-featured, high-speed bulk loading application.

Aster Database Backup
Aster Database Backup lets you back up individual tables or your entire Aster Database. Backups are online operations, meaning your cluster remains up and servicing queries while the backup runs. Backups can occur automatically and in an incremental fashion to save space and bandwidth.

CHAPTER 2 Dashboard: The AMC

The Aster Database Management Console (AMC) is the main administrative interface to Aster Database.
This chapter provides an overview of the AMC and describes the AMC Dashboard.
• Overview of AMC (page 26)
• Processes Section of the Dashboard (page 30)
• Nodes Section of the Dashboard (page 31)
• Configuring and Using the AMC (page 33)
• Understanding Aster Database Status (page 35)

Overview of AMC

The AMC is a web-based interface that lets you manage, configure, and monitor Aster Database activity. The AMC provides administrators with an authoritative view of the system and mechanisms for invoking administrative actions. AMC provides developers and other users with insight into Aster Database activity, such as details on currently executing SQL statements and statement histories.

Figure 7: The AMC Dashboard

Launching the AMC

To access the AMC:

1 Open a web browser and enter the IP address of the queen node in the URL field:
  https://Queen_IP_Address/chrysalis/login
  If the login window does not appear, see “Troubleshooting: AMC login window does not appear” on page 34.
  Tip! AMC uses browser cookies. You must enable your browser to accept cookies from the queen node in order to use AMC.
2 In the Login window, enter your username and password. When you log in for the first time, the default username/password is db_superuser/db_superuser.

Figure 8: AMC Login Screen

Role-based access

The AMC enforces role-based access privileges. A user sees only those sections of the AMC to which his or her user account has been granted access. See “Creating an AMC user in Aster Database” on page 106.

HTTPS and certificate warnings

AMC runs on HTTPS and runs with a default Aster Database self-signed certificate. As a result, your browser warns you and displays a certificate error.
See the Teradata Aster Big Analytics Appliance 3H Installation Guide for instructions on hiding these error messages.

Other useful URLs

The old, pre-4.5 AMC is still available, as well. To open it, navigate to http://<queen IP address>/amc.
Ganglia is accessible at this URL: http://<queen IP address>/amc/ganglia.

The AMC Dashboard

The AMC Dashboard is the main information center where you can view the condition of the cluster and the jobs currently running on it. Many field labels in this window are clickable. By clicking a label or message, you can usually see more details about the message or navigate to the commands related to it.

Figure 9: Navigation and Status Messages in AMC (callouts: Status lamp, Cluster name, Links to documentation and downloads, Message board, Login details, Status summary)

Top of the Dashboard window

As shown in the image above, the top of the Dashboard consists of the following items. Clockwise from the upper left, they are:
• Status Lamp: The status lamp lights green to show the cluster is running correctly. The legend next to the status lamp shows the name of the cluster and its status, and the current queen time, converted to browser-local time. The Aster Database statuses are discussed in more detail in “Understanding Aster Database Status” on page 35.
• Cluster Name: The name assigned to the cluster.
• Link to Docs and Downloads
  • Resource Center: Click this link to open the Teradata Aster Resource Center, a web page where you can find documentation, videos, and downloadable client software for various operating systems.
  • Help Link: Click this link to open an HTML page containing information about the AMC page you are currently viewing.
• Login Details: In the top right of the window is the Teradata Aster logo. Directly below that is your current, logged-in AMC user account name.
  Your user account determines what actions you can perform in the AMC.
• Status Summary: In the upper right of the Dashboard tab is the status box. This box is a fixture not only of the Dashboard, but of all AMC windows. The status box notifies you of important events in Aster Database.
• Message Board: In the upper left of the Dashboard tab is the message board. Here, you and other Aster Database administrators can post messages to all AMC users. To add a message, click the pencil icon, type the message in the dialog box that appears, and click OK to post it. All AMC users on this cluster will see your message immediately on the message board in their AMC session.

Navigation Tabs in the Dashboard Window

Below the Status Icon are Navigation Tabs that provide access to various types of tasks that are accessible through the AMC. Each tab provides details on a different aspect of Aster Database. For information about how to use these tabs, see:
• “Processes: Managing Activity” on page 38
• “Nodes: Managing Data and Nodes” on page 46
• “Admin: Administrative Operations” on page 90

The Processes Section of the Dashboard Window

Below the message board and information box is the Processes section of the Dashboard tab. The Processes section shows an overview of the current and recent jobs in the cluster, as well as statistics about queries and user activity. See “Processes Section of the Dashboard” on page 30.

The Nodes Section of the Dashboard Window

At the bottom of the Dashboard tab, below the Processes section, is the Nodes section. The Nodes section summarizes the operational status of the machines in your cluster, including the quantity of data stored and the remaining free space in the cluster. See “Nodes Section of the Dashboard” on page 31.

AMC Version Number and Aster Database Version Number
To find out the version of the AMC you are running, click the About the AMC link at the bottom of the AMC window.
Hint: To find out your Aster Database version number (release number) and build number (revision number) from the command line, view the file /home/beehive/.build on the queen.

Processes Section of the Dashboard
The Processes section of the dashboard shows an overview of the current and recent jobs in the cluster, as well as statistics including the Most Active Users’ rankings and the Process Execution Time graph. The Active Applications box shows currently installed applications that run on the cluster. The Processes section corresponds to the Processes tab, and clicking most labels in this section will take you to the Processes tab.

Figure 10: The Processes tab in AMC

Dashboard: Processes Section: Green Summary Box
The green summary box provides a quick overview of the queries running in the cluster. It lists the counts of the following states (click any label to show its details).
• Running: Count of currently running queries and processes.
• Pending: Count of queries queued for admission to the cluster.
• Active Sessions: Number of users and applications currently connected to Aster Database.
• Completed: Count of queries that finished running without error in the last 24 hours.
• Cancelled: Count of queries cancelled by an administrator or user in the last 24 hours.
• Error: Count of queries that failed and reported an error in the last 24 hours.
• Unknown: Count of queries that started in the last 24 hours, but whose status is now unknown.
• My Processes: Count of finished queries run by you (based on your AMC username) in the last 24 hours.
• SQL-MapReduce: Count of finished SQL-MapReduce queries that have run in the last 24 hours.

Query Statistics Summaries in the Processes Section
The Query Statistics Summaries area in the Processes Section provides an overview of the most active users and longest running queries. It shows:
• My Last 5 Processes
• Top 5 Longest Processes
• Process Execution Time
• Top 5 Most Active Users
• Active Applications

Nodes Section of the Dashboard
The lower part of the Dashboard shows the Nodes overview. This section summarizes the operational status of the machines in your cluster, including the quantity of data stored and the remaining free space in the cluster.

Figure 11: Nodes overview in AMC

Dashboard: Nodes section: Green Summary Box
The green summary box lists the counts of nodes in your cluster and summarizes the status of the nodes. This section shows the following (click any label to show its details):
• Queen(s): Count of queen nodes in this cluster. The Active count is the number of active queen nodes in this cluster. This can only be 1 or 0. The Passive count is the number of passive (backup) queens in this cluster.
• Loader(s): Count of the loader nodes in the cluster.
• Worker Nodes: Count of worker machines in the cluster. Note this is the count of worker machines, not the count of virtual workers. Below this are listed the counts of Active, New, Suspect, and Failed nodes. See “Understanding Node States” on page 47 for more details.

Dashboard: Nodes section: Center Panel
The center panel of the Nodes section shows the current replication factor of Aster Database.
If the current replication factor is below your target replication factor (your Aster Database administrator specified this when installing Aster Database), a warning appears at the top of this section. The Replication Factor section shows, first, the cluster-wide current replication factor. Below that, it shows how many virtual workers are at RF=2 (these are workers that have a valid backup worker stored in Aster Database) and how many are lacking a backup (RF=1). Teradata Aster’s recommended setting is to maintain the cluster at RF=2. The bottom of this section is the Hardware Statistics panel, showing current and recent CPU usage, memory usage, network bandwidth usage, and disk I/O usage. Click the Nodes: Hardware Stats tab for more hardware statistics. Dashboard: Nodes: Cluster-Wide Disk Capacity/Usage The right side of the Nodes panel of the AMC Dashboard shows the Data Payload Panel. This panel provides a cluster-wide view of the data capacity of your cluster and shows how much disk space is currently being occupied by data and other system files. (Note that you can also view disk capacity and available space for an individual node, as explained in “Per-Node Disk Capacity and Current Usage” on page 49.) This information can be used to quickly determine whether you have a sufficient data storage capacity in Aster Database or should begin planning to add storage to the cluster. The measures shown here include: • Total Size of Active Data Stored shows the amount of data currently stored in Aster Database. Active data refers to the raw, uncompressed data size before it is stored on disk. The graph’s colors indicate the degree of compression applied to different portions of the data. Tip! Hover your mouse pointer over the graph to see the amounts of data stored at each compression level. The darker the color, the greater the degree of compression applied. 
• Total Data Stored and Disk Capacity: Just below the Total Size of Active Data Stored field is the Total Data Stored and Disk Capacity graph and a breakdown of its contents. The horizontal bar graph represents your total available disk space in the cluster, and the colors represent used and unused portions of the disk space.
• The % Full icon provides a visual summary of the disk space remaining on your cluster. This graph turns orange to indicate that more than 70% of the cluster’s disk space has been used, and turns red to indicate that more than 90% has been used. If this graph is displayed in orange or red, you must take action by contacting Teradata Support.
Important! Always maintain at least 30% free disk space in Aster Database. This space is required for routine aggregation and sorting operations.
• User Data is shown in dark green. The amount shown here represents the amount of on-disk data in Aster Database. This is the size of data on disk, after (optionally) being compressed. To see the disk usage statistics for each node, click the Nodes Tab and then click the Node Overview tab.
• Replica Data is the amount of space occupied by the replica copies of your data.
• System represents the amount of disk space consumed by operating system files, Aster Database software files, and other files that do not contain your data.
• Available represents the amount of unused storage currently available in the cluster.
• Total Space shows the total amount of disk space in the cluster.
• Alert fields: If any nodes’ disks are full or nearly full, the AMC displays alerts just below the Total Space field. Click the alert text to display the Node Overview tab, where you can find the nodes that are running out of space or nearly out of space. See “Disk Storage Utilization” on page 48 for details.
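
The color thresholds described above for the % Full icon can be expressed as a small check. The following is a minimal sketch, assuming you have already obtained the used and total disk space figures by some other means. The function name `disk_fullness_color` and the label "normal" for usage at or below 70% are illustrative assumptions and are not part of the AMC.

```python
def disk_fullness_color(used_bytes: int, total_bytes: int) -> str:
    """Classify cluster disk usage using the 70% and 90% thresholds.

    Hypothetical helper: the AMC does not expose this function.
    Returns "red" above 90% used, "orange" above 70% used,
    and "normal" (an assumed label) otherwise.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if used_bytes < 0:
        raise ValueError("used_bytes must not be negative")

    # Integer comparison avoids floating-point rounding at the boundaries.
    if used_bytes * 100 > 90 * total_bytes:
        return "red"
    if used_bytes * 100 > 70 * total_bytes:
        return "orange"
    return "normal"
```

Because the guide asks you to keep at least 30% of disk space free, any result other than "normal" in this sketch corresponds to a cluster that already needs attention.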

Configuring and Using the AMC

AMC System Requirements
You can use the AMC from most common web browsers. See the Aster Database 5.0.1 Server Platform Guide for a list.

Installing AMC
No installation is required! The AMC is installed by default on your Aster Database queen.

Connecting to the AMC
To use the AMC, open a supported browser and navigate to the IP address of the Aster Database queen node. Note that the queen node must be powered on, and the Aster Database software must be active, for the AMC to be accessible.

Managing AMC Certificates

Installing an AMC certificate at a location of your choosing
The default location of the certificate is /home/beehive/certs/server.cert. You can install your own AMC certificate at a different location by performing the following steps.
1 Place the new certificate file in the desired location on the Aster Database queen.
2 In the file /home/beehive/apache/conf/conf.d/ssl.conf, change the line "SSLCertificateFile /home/beehive/server.cert" to "SSLCertificateFile <absolute_path_of_new_cert_file>"
3 Open a command-line session on the queen as user root and issue the following statement to restart apache.
/home/beehive/toolchain/x86_64-unknown-linux-gnu/httpd-2.2.15/bin/httpd -f /home/beehive/apache/conf/httpd.conf -k restart
After the restart, apache uses the certificate file from the new location.

Disabling HTTPS access for AMC
To allow users to connect to the AMC over an unencrypted HTTP connection, do this:
• In the file "/home/beehive/apache/conf/conf.d/jk.conf", comment out the line "LoadModule rewrite_module modules/mod_rewrite.so".
• Issue the following statement to restart apache.
/home/beehive/toolchain/x86_64-unknown-linux-gnu/httpd-2.2.15/bin/httpd -f /home/beehive/apache/conf/httpd.conf -k restart
This disables HTTPS. All the traffic will travel over HTTP.
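
Before restarting apache after a certificate change, it can help to confirm that the certificate file named in ssl.conf exists and is not empty, since apache refuses to start otherwise. The following is a minimal sketch of such a check. The helper names `find_certificate_path` and `certificate_is_usable` are hypothetical; the sketch only reads configuration text that you pass to it and does not restart anything.

```python
import os
from typing import Optional


def find_certificate_path(conf_text: str) -> Optional[str]:
    """Return the path from the first active SSLCertificateFile line, if any.

    Hypothetical helper for illustration; it does a simple line scan and
    is not a full Apache configuration parser.
    """
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out directives
        parts = line.split(None, 1)
        # Apache directive names are case-insensitive.
        if len(parts) == 2 and parts[0].lower() == "sslcertificatefile":
            return parts[1].strip().strip('"')
    return None


def certificate_is_usable(path: str) -> bool:
    """True if the file exists as a regular file and is not empty."""
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

A typical use would be to read the contents of ssl.conf, pass them to `find_certificate_path`, and restart apache only if `certificate_is_usable` returns True for the resulting path.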

Troubleshooting the AMC

Troubleshooting: Error message “/home/beehive/tmp/server.cert does not exist”
When you start the Aster Database queen, it prints this error if the AMC certificate is missing:
SSLCertificateFile: file '/home/beehive/tmp/server.cert' does not exist or is empty
This indicates the certificate was not found in the location specified in /home/beehive/apache/conf/conf.d/ssl.conf. Follow the instructions in “Installing an AMC certificate at a location of your choosing” on page 33 to fix the problem. The full text of the error message is:
root@<queen-machine>:/home/beehive/amc/webserver/webapps/ROOT# /home/beehive/toolchain/x86_64-unknown-linux-gnu/httpd-2.2.15/bin/httpd -f /home/beehive/apache/conf/httpd.conf -k restart
Syntax error on line 18 of /home/beehive/apache/conf/conf.d/ssl.conf:
SSLCertificateFile: file '/home/beehive/tmp/server.cert' does not exist or is empty

Troubleshooting: Certificate errors shown in browser
Your browser may display a certificate error when you try to connect to the AMC. This is expected.

Troubleshooting: AMC login window does not appear
If the AMC refuses to load in your browser, do one of the following:
• If you previously used the pre-version-4.5 AMC, then clear the browser’s cache.
• If the AMC login window still fails to appear after you have cleared the browser’s cache, there may be a certificate problem. See the instructions in “Installing an AMC certificate at a location of your choosing” on page 33.

Troubleshooting: AMC Add Node dialog box displays unexpectedly
The AMC requires that the browser be set to accept cookies from the queen machine. One symptom of blocked cookies is that the Add Node dialog box displays unexpectedly. This problem manifests as:
• When you go to the AMC Admin tab, part of the Add Node dialog box appears, even though you didn't click on the "Add Nodes" button.
• The Admin tab never finishes loading, and the cursor hangs, indicating "busy" for a long time.
If this happens, make sure to enable cookies from the queen machine in the browser.

Understanding Aster Database Status

The Status Icon
The Status Icon at the top left of the AMC shows the current overall status of Aster Database. The status icon will show one of five colors and a message describing the current status in more detail.
• Green: Aster Database is operating normally and is able to accept new connections and process statement requests.
• Blue: Aster Database is operating normally and is able to accept new connections and process statement requests; however, a current administrative activity may result in a decrease in performance.
• Yellow: Aster Database is able to accept new connections; however, it is unable to process statement requests due to an administrative activity.
• Red: Aster Database is currently stopped and cannot accept new connections or process statement requests.
• White/Clear: The browser client is no longer able to establish a connection to Aster Database.

Aster Database Status Descriptions
For more details on the current status, position your mouse pointer over the Status Icon. The table below details the various Aster Database statuses that may be displayed.

Table 2 - 1: Aster Database Status Descriptions
Status: Description
Active: Aster Database is currently active and operating normally. It has a replication factor of at least 1 and is able to receive and execute statement requests.
Activating: Aster Database is currently activating new nodes into the system. During this process, new nodes are brought into service and the data is redistributed across the workers. During this time, Aster Database is unable to execute statement requests, although clients can establish connections to it.
For details on node activation, please refer to “Node Overview Tab” on page 46.
Replicating: Aster Database is currently replicating online. This process involves restoring Aster Database's replication factor, but does not bring the processing capabilities of new workers into use. For details on replication, please refer to “Node Overview Tab” on page 46.
Data Imbalanced: The cluster currently has fewer than the required number of copies of your data (as specified by your replication factor; usually two copies are required), or at least one partition of data and its replica are currently residing on the same physical node. Generally, this occurs following a node failure or other administrative action. Either of these conditions is undesirable and should be resolved using the Balance Data command. See “Balance Data: The Procedure” on page 113 for details.
Processing Imbalanced: The active vworkers are not evenly distributed on your cluster's hardware, meaning that at least one worker node contains more active vworkers than it should. Generally, this occurs following a node failure or other administrative action. This condition is undesirable and should be resolved using the Balance Process command. See “Activating Aster Database” on page 111 for details.
Restarting: Aster Database is currently restarting. During this time, it is unable to execute statement requests. After a short period of time, the status will change to Unavailable (see below). Please refer to “Restarting Aster Database” on page 109.
Backing Up: Aster Database is currently making a data backup to an Aster Database backup cluster. During this process, Aster Database continues to execute statement requests and other activities normally.
However, there may be some performance overhead as a result of the backup activities. For details on backing up to an Aster Database backup cluster, see the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Restoring: Aster Database is currently restoring data from an Aster Database backup cluster. During this process, Aster Database is unable to execute statement requests, although clients can establish connections to it. For details on backing up to an Aster Database backup cluster, see the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Stopped: Aster Database is currently stopped. During this time, Aster Database is unable to execute statement requests, although clients may be able to establish connections to it. This status indicates that there is some issue preventing Aster Database from being able to operate normally. Such issues include having no active worker nodes in the system, having a replication factor below the required minimum, and the occurrence of a serious failure in Aster Database.
Unavailable: Aster Database is currently unavailable. This means that the AMC browser client is unable to establish a connection to Aster Database. In most situations, this will be the result of a network issue (that is, Aster Database is active, but the web browser cannot establish a connection to it). However, if there are no network problems, then it indicates that an AMC connection to the queen node could not be established, which may indicate a failure of the queen node (during a restart of Aster Database, the status will briefly be Unavailable as the queen node is rebooted).
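
The five Status Icon colors described earlier in this section can be summarized as a small lookup table, which may be handy if you record the color in your own monitoring notes or scripts. The following is a minimal sketch based only on the color descriptions in this guide. The names `StatusMeaning`, `STATUS_COLORS`, and `can_run_statements` are hypothetical, and the sketch does not query the AMC; for White/Clear the values are left as None because the browser cannot reach Aster Database and its actual state is not known.

```python
from typing import NamedTuple, Optional


class StatusMeaning(NamedTuple):
    # None means "not known from the browser's point of view".
    accepts_connections: Optional[bool]
    processes_statements: Optional[bool]
    note: str


# Hypothetical lookup table built from the Status Icon color descriptions.
STATUS_COLORS = {
    "green": StatusMeaning(True, True, "operating normally"),
    "blue": StatusMeaning(
        True, True, "administrative activity may decrease performance"
    ),
    "yellow": StatusMeaning(
        True, False, "administrative activity prevents statement processing"
    ),
    "red": StatusMeaning(False, False, "Aster Database is stopped"),
    "white": StatusMeaning(
        None, None, "browser cannot connect to Aster Database"
    ),
}


def can_run_statements(color: str) -> bool:
    """True only when the color indicates statements can be processed."""
    return STATUS_COLORS[color.lower()].processes_statements is True
```

In this sketch only green and blue report that statements can be processed, which matches the color descriptions: yellow accepts connections but cannot process statement requests.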

Allowed Administrative Actions
Whether or not you can perform an administrative action depends on the:
• rights that have been granted to you as an AMC user (see “Admin: Configuration: Roles and Privileges” on page 106); and
• current status of the cluster.
The table below shows which actions can be done in which Aster Database states. In this table, an “X” indicates the operation is allowed.

Table 2 - 2: Allowed administrative actions
Soft Restart Cluster Status Cluster Hard Restart Cluster Soft Restart Node Add Nodes Remove Nodes X X X X X Activate Cluster Balance Data Balance Processing Upgrade Software Unavailable Active X X Stopped X X Activating X X Restarting X X Replicating X X X X X Backing Up X X X X X Restoring X X Data Imbalanced X X X X X Processing Imbalanced X X X X X X X X X

CHAPTER 3 Processes: Managing Activity
The AMC Processes tab lets you monitor and track the SQL statements running in Aster Database. The filtering area at the top left is useful for showing and hiding different subsets of the processes, so you can focus on just the processes of interest to you. The green summary box at the top right shows counts of current and past statements, categorized by status.

Figure 12: The AMC Process tab

The Processes tab contains three sub-tabs:
• Processes shows a table with statistics and status for current and past commands. See “Processes Tab” on page 38.
• Query Timeline shows a graphical representation of commands run in the past 24 hours. See “Query Timeline Tab” on page 42.
• Sessions shows user sessions with the AMC. See “Sessions Tab” on page 42.

Processes Tab
By default, when you click the Processes tab, AMC displays the list of processes in the Processes sub-tab. The list displays information about running processes or processes that finished running on the Aster Database. Each process is a SQL command or a block of SQL statements (BEGIN ... END).
The statements can contain SQL-MapReduce functions.
The Processes list is useful for monitoring activity on your cluster, checking on the progress of queries you have submitted, and finding performance issues such as statements that take much longer to run than others.
To display processes:
1 Click Processes (Processes > Processes). Table 3 - 1 describes the type of information displayed for every process.
2 To filter the display of processes, use the Change Filter button, as described in “Filtering the Process List: The Process Filter” on page 40.
3 To display summary information about a process, move the mouse over the process ID.
4 To display detailed information about a connected process, click its ID. See “Process Details Tab” on page 43 for more information.

Process Information
The following table describes the type of information displayed for every process in the Processes List.

Table 3 - 1: The AMC Processes Tab Columns
Column: Description of Processes Tab Column
ID: Unique identifying number for the process. The number is truncated in the list view, but you can see the complete number (and other details, such as the database on which the statement is acting) by hovering the mouse cursor over the process ID. Click to display the process detail page, described in “Process Details Tab” on page 43.
Statement: The command being executed by the process. Can be any sort of SQL statement. The statement is truncated in the list view, but you can see the complete statement by hovering the mouse cursor over the process ID number or clicking it to display the Process Detail tab.
User: Account that issued the request to run the statement.
Status: A color-coded icon is displayed to indicate the current state of the process:
• Cancelled: the Administrator or user cancelled the statement.
• Cancelling: the Administrator or user has requested that the statement be cancelled, but the statement is still running while Aster Database makes a best-effort attempt to cancel it.
• Completed: the statement ran successfully.
• Pending: the user submitted the statement, and it is in a queue on Aster Database waiting to be run. Alternatively, the statement is blocked pending the release of a system resource and may be deadlocked on another concurrent statement. In the rare case that this happens, look through other concurrently executing statements, or check your Quality of Service parameters to ensure that they are functioning properly.
• Running: the statement has started and is underway.
• Unknown: Aster Database is not providing a status at the moment.
• Error: the statement could not finish normally. To see more details about the error, consult the log files; see “Aster Database Logging” on page 194.
Execution Time: How long the process ran.
Submit Time: Time when the user requested the statement to be run. At this time, the statement was queued up on the cluster, but it did not necessarily start running immediately.
Completion Time: Time when the process finished.
Type: Either SQL for an ordinary SQL statement, or SQL-MR for a statement that includes an SQL-MapReduce function.
Workload Policy: The name of a set of rules governing how the process is handled when the cluster allocates resources. See the Workload Management chapter in the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Priority: A number from 0 to 3 indicating how important the process is, where 3 is most important. Inherited from the workload policy.
See the “Priority” section in the Workload Management chapter in the Teradata Aster Big Analytics Appliance 3H Database User Guide.
Session ID: Unique identifying number of the user’s AMC session. This is the command interface session where the user has logged in and issued the SQL command to the cluster. The statement is truncated in the list view, but you can see the complete statement by clicking the process ID number to display the process detail page.
Cancel: If it is possible to cancel the process, a Cancel icon is displayed in this column. Statements that cannot be cancelled are transaction-related SQL (for example, COMMIT and ROLLBACK), CLOSE cursor, and COPY-in SQL.

Filtering the Process List: The Process Filter
To make the process display even more useful, you can hide or show different processes by entering filter criteria.
1 Click the Change Filter button.
2 Enter your filter criteria. This example shows only the CREATE statements that were submitted on the retail_sales table in the last 24 hours and took more than 5 minutes to complete.
Figure 13: Creating a Process Filter in AMC
3 To make this filter the default filter, click Make Default.
4 Click Go to view the filtered results. The current process filter terms are displayed above the Change Filter button, and only the requested processes are displayed in the list.
Warning! When you restart or upgrade your cluster, the settings of the Process Filter in AMC are lost. After each Aster Database restart, you must re-create your filters.

Disabling Auto-Polling
By default, AMC uses auto-polling to display information about processes. To stop the auto-polling of processes:
1 Click Show new processes on manual refresh.
Figure 14: Disabling Auto-Polling in AMC
2 When you are ready to update the process list, click the Refresh Now button.
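
The process filter example given earlier in this section (CREATE statements on the retail_sales table, submitted in the last 24 hours, that took more than 5 minutes) can also be written out as explicit conditions. The following is a minimal sketch of that logic, assuming process information has been copied into simple records. The `ProcessRecord` class, its field names, and `filter_processes` are hypothetical and do not represent an AMC interface or schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class ProcessRecord:
    # Hypothetical record; field names loosely mirror the Processes tab columns.
    statement: str
    submit_time: datetime
    execution_time: timedelta


def filter_processes(
    processes: List[ProcessRecord], now: datetime
) -> List[ProcessRecord]:
    """Keep CREATE statements on retail_sales from the last 24 hours
    whose execution time exceeded 5 minutes."""
    cutoff = now - timedelta(hours=24)
    return [
        p
        for p in processes
        if p.statement.lstrip().upper().startswith("CREATE")
        and "retail_sales" in p.statement.lower()
        and p.submit_time >= cutoff
        and p.execution_time > timedelta(minutes=5)
    ]
```

The substring match on the table name is deliberately simple; it would also match names such as retail_sales_2012, so treat it as an illustration of the criteria rather than a precise matcher.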

Query Timeline Tab
The Query Timeline tab (Processes > Query Timeline) shows a graphical representation of commands run in the past 24 hours. By using this bar graph view, you can more quickly spot commands that are out of the ordinary in terms of processing time.

Figure 15: The Query Timeline tab in AMC

Each bar represents one SQL command. The bars are color-coded using the same status colors described in the Status column of Table 3 - 1.
To display processes in the Query Timeline:
1 Click Query Timeline (Processes > Query Timeline).
2 To filter the display of processes in the Query Timeline, use the Change Filter button, as described in “Filtering the Process List: The Process Filter” on page 40.
3 To display details about a process, move the mouse over it. A popup message appears with additional information.

Sessions Tab
The Sessions tab (Processes > Sessions) shows a list of the connected or closed user sessions on this cluster. You can use this list to monitor user activity and help troubleshoot user issues.

Figure 16: The Session tab in AMC

The User Sessions list displays session information that includes the session ID, the host the user is coming from, the login time, and the session duration.
To display user sessions:
1 Click Query Sessions (Processes > Sessions).
2 To sort the list, click a column heading.
3 To display summary information about a connected process, move the mouse over the process ID.
4 To display detailed information about a connected process, click its ID. See “Process Details Tab” on page 43 for more information.

Process Details Tab
When you click the ID of a process, AMC creates a new tab displaying detailed information about the process.

Figure 17: The Process Details tab in AMC

A Process Detail tab for that process is displayed. In addition to the columns displayed in the process list (see “Process Information” on page 39), this tab shows the following additional information.

Table 3 - 2: Process information columns
Column: Description of Process Detail Tab Column
ID: The full unique identifying number for the process.
Statement: The full SQL statement being executed by the process.
Status Detail: Additional information, if available, which expands on the one-word status available in the process list tab.
Database: The database on which the statement is acting.
Session ID: The full unique identifying number of the user’s AMC session. This is the command interface session where the user has logged in and issued the SQL command to the cluster.
Progress: A bar that shows what proportion of the statement’s execution has been completed so far.
Execution Plan: A series of operations that show how Aster Database actually performed (or is performing) the statement. An SQL statement is typically broken down into component parts which are executed separately to efficiently achieve the final result. By default, the execution plan display omits routine or trivial operations, but you can display the entire plan by clicking Show All Steps.

Cancelling SQL Statements
Sometimes, you may need to cancel a running process on the cluster. For example, suppose a user runs the query SELECT * FROM events. If the events table is large, the query could easily take far too long to complete. Another operation that can be time-consuming is a CREATE TABLE that inserts a large number of rows.
Some SQL statements are not cancellable.
These are transaction-related SQL statements, such as COMMIT, ROLLBACK, CLOSE cursor, and COPY-in SQL.
To cancel a running process, do one of the following:
• In the Processes tab (Processes > Processes), if a Cancel icon is displayed for a process, click the icon in the Cancel column (right-most column), then click OK when prompted.
• In the Process Details tab, you can cancel the statement by clicking the Cancel Process button.
Either action will place the process in Cancelling mode, which indicates that the cancellation request has been received. Statement cancellation in Aster Database is an asynchronous, best-effort operation. While executing a statement, the Aster Database back-end checks periodically to see whether a cancellation request has been issued. If requested, the back-end acknowledges the cancellation and triggers a best-effort service to cancel the ongoing execution.

SQL-MapReduce Statements
You can monitor the execution of SQL-MapReduce functions in the AMC using the same general procedures outlined in this chapter.

CHAPTER 4 Nodes: Managing Data and Nodes
The Nodes tab in the AMC gives a system-wide overview of the amount of data stored in Aster Database. In particular, it shows information about the extent to which data is replicated to tolerate node failures, and how overall node storage is utilized by data in the cluster. It provides interfaces through which administrators can manage data and replication in the system. The Nodes tab is also used to monitor the operation of Aster Database: its virtual workers, worker nodes, and loader nodes. In the Nodes tab, administrators can view information on each of the nodes participating in the Aster Database, configure those nodes, and retrieve logs and other information for debugging purposes.

Figure 18: The Nodes tab in AMC

Node Overview Tab
The Node Overview tab displays status and health information about worker and loader nodes.

The Node List
The Node List in the Nodes Panel contains a list of all nodes that have been registered with Aster Database. This includes nodes that are active participants in the system, as well as nodes that are not currently participating in Aster Database (for example, nodes that have failed and nodes that have yet to be powered on). Each node is listed with an icon that depicts the type of node (see below for node types), along with a color representing its status. Nodes are identified by the IP address that has been assigned by the system. The nodes in the Node List can be filtered based on node type, using the “Node Type” drop-down menu above the list.

Node Types
There are three types of nodes in Aster Database: queen nodes, worker nodes, and loader nodes.

Queen Nodes
The queen node is the central node in the system and is represented in the AMC by an icon with a ‘Q’. This is the node on which the Aster Database management software (including the AMC) is installed. It is the node responsible for all management of Aster Database, from node management to statement execution management.

Worker Nodes
Worker nodes are the workhorses of Aster Database: they are the nodes where data resides and where query processing occurs. They are represented in the AMC by an icon with a ‘W’. In general, worker nodes represent the largest group of nodes in an Aster Database installation and are the focal point of management and administration. For instructions that show you how to add and manage nodes, see “Admin: Cluster Management” on page 90.
Loader Nodes
Loader nodes are optional nodes that can be added to Aster Database in order to increase the load throughput of the system (the rate at which data can be loaded into the system). They are represented in the AMC by an icon with an ‘L’. By default, data is loaded into Aster Database through the queen node. However, if additional throughput is required, dedicated loader nodes can be added to the system. These nodes can also be used for bulk exporting of data from the system. Please contact Teradata if you need additional loading capacity, so that we may help you plan and configure your Aster Database optimally.

Understanding Node States
Node status indicates the operational health or condition of a physical node in Aster Database. In the AMC’s Nodes: Node Overview tab, each node in the node list has a color indicating its status. Below, we list the node statuses of Aster Database.

New, Preparing, and Prepared Nodes
When a node is first added to Aster Database, or registered, it is considered to be a New node. At this point, Aster Database is aware of the node’s existence, but the node has not yet contacted the queen in order to be prepared, or loaded with the Aster Database software. Nodes are also shown as New immediately following a restart of Aster Database. After the node contacts the queen to be prepared, its status changes to Preparing. While in this status, it is loading the Aster Database software and preparing itself to become a participant in Aster Database. Once the node completes preparation, its status becomes Prepared. At this point, the node is ready to be incorporated into Aster Database so that it can host vworkers. See “Incorporate the New Nodes” on page 71.

Active
Active and Passive are the acceptable states for nodes in a running cluster.
Active nodes are nodes that are available immediately to process queries in Aster Database. For details, see “Balance Data” on page 113.

Passive
Active and Passive are the acceptable states for nodes in a running cluster. A Passive node is a standby that holds frequently updated copies of vworkers’ data and later can be made Active to take on query processing work as needed. For details, see “Balance Data” on page 113.

Suspect
Suspect nodes are nodes that have exhibited unusual behavior and are participating in Aster Database in a limited capacity while being investigated for potential failures by the queen. See “Node Failures in Aster Database” on page 50.

Failed
Failed nodes are nodes that are no longer participating in Aster Database. See “Node Failures in Aster Database” on page 50.

Disk Storage Utilization
In the AMC, there are two levels at which you can check your cluster’s data capacity and the amount of disk space currently in use:
• Cluster-Wide Disk Capacity and Current Usage
• Per-Node Disk Capacity and Current Usage

Cluster-Wide Disk Capacity and Current Usage
The right side of the Nodes section of the AMC Dashboard tab contains the Data Payload Panel, which shows a summary of disk storage utilization in the cluster as a whole. For an explanation of this panel, see “Dashboard: Nodes: Cluster-Wide Disk Capacity/Usage” on page 32.

Per-Node Disk Capacity and Current Usage
To see detailed descriptions of how and to what extent the disks are being used on individual nodes, click the Nodes tab and click the Node Overview tab. Disk usage details appear in these columns:
• Uncompressed Active Data Size: This column shows the amount of data currently stored on the node. The term “active data” refers to the raw, uncompressed data size before it is stored on disk.
• Storage (GB): This column shows a graph of the current usage of the node’s disk, by type of data stored (user data, replica data, and free space), and lists the amount of disk space currently occupied by user and replica data, expressed in GB. This shows the actual on-disk space that is used and free on the node. Hover your mouse cursor over the graph to see these statistics for the node:
   • User Data is the amount of space occupied by primary copies of your data on the node.
   • Replica Data is the amount of space occupied by the replica copies of your data on the node.
   • System represents the amount of the node’s disk space consumed by operating system files, Aster Database software files, and other files that do not contain your Aster Database-stored data.
   • Available represents the amount of unused storage currently available on the node.
   • Total Space shows the total amount of disk space on the node.
• % Full: This column indicates how much space has been used on this node. The graph turns orange to indicate that more than 70% of the node’s disk space has been used, and it turns red to indicate that more than 90% has been used. If this graph is displayed in orange or red, you must take action by calling Teradata Support.

Important! Always maintain at least 30% free disk space in Aster Database. This space is required for routine aggregation and sorting operations.

Hover your mouse cursor over any cell in these columns for more information.

Figure 19: Hover for more information on Disk Capacity and Usage

Hardware Stats Tab
The Hardware Stats tab contains information on hardware in the cluster.

Figure 20: The Hardware Stats tab in AMC

Node Failures in Aster Database
The queen node in Aster Database actively monitors all nodes participating in the system.
If it observes a node behaving in an unexpected or inappropriate manner, it will consider that node to be suspicious and change its status to Suspect, and the node will appear yellow in the AMC. A Suspect node status does not necessarily imply that the node has experienced a failure, only that the queen is examining it in order to determine whether one has occurred. If the node continues to demonstrate suspicious behavior while in Suspect status, the queen will consider it to be Failed and change its status accordingly. For instructions on addressing failed and suspect nodes, see “Addressing Failed and Suspect Nodes” on page 55.

What do I do with Suspect Nodes?
It is important to note that nodes that have been marked Suspect still participate in Aster Database. They continue to store data and are active participants in statement execution. A Suspect node is a node on which one or more of the node’s vworker databases reported an error (disk errors are a frequent cause), and in response, the queen removed that vworker or those vworkers from active status. The other, error-free vworkers on the node remain up and running in active status. The presence of a Suspect node does not necessarily imply a decrease in performance, but it typically means the cluster has fallen from RF=2 to RF=1, meaning that one or more vworkers may not have a backup vworker.

While a node is in Suspect state, the queen monitors the node’s behavior and considers it to be Failed only if it continues to demonstrate such behavior. If the behavior that was originally observed was a one-time event (e.g. a transient network error between the queen and the node), the node will remain an active participant while being considered Suspect. In Aster Database, the queen will not automatically transition a node from Suspect to Active.
Instead, a node will be returned to Active status on the next activation or load balancing activity. If the system continues to operate for a reasonable length of time after the node was originally marked as Suspect, Teradata Aster recommends that the node be returned to Active status by clicking the Balance Data button in the AMC. Allowing a node that is performing normally (e.g. one that has continued to operate for at least 24 hours without transitioning to Failed) to remain in Suspect status for a lengthy period of time increases the chance that the node will eventually be considered Failed, triggered by an event such as an unrelated transient error.

Recovering from Failures
When the queen considers a node to be Failed, it will attempt to reboot that node. Aster Database is designed to enable recovery from many different types of errors through reboot operations. After rebooting the node, the queen will perform a number of checks on the node during the preparation phase. If it successfully passes those checks, the node will be returned to the Prepared status and can be subsequently activated back into Aster Database. (See “Incorporate the New Nodes” on page 71.) If the checks fail, the node will transition from the Preparing status to a Failed status. The queen may attempt an additional reboot, after which it will permanently consider the node to be Failed if no progress is made. If a node is permanently considered Failed, it should be physically removed from Aster Database for investigation, as this is likely an indication of a hardware failure (e.g. permanent CPU failure).

Addressing Hardware and Networking Problems on Workers
If a worker node comes up as Failed or is missing from the AMC’s Cluster Management panel, you should check the node’s logs for hardware and networking failures. To do this, open an SSH session on the worker and check its Aster Database logs.
Check the following:
/data/ncluster/logs/dmesg.log
/var/log/messages
/var/log/clusterservices.log

Based on what you find in the logs, do one of the following:
• If you do not find networking error messages in the logs, the node may have a hardware problem. The same is true if SCSI errors or I/O errors are logged in a worker’s log files. In either case, see “Addressing Hardware Problems on Workers” on page 52.

Addressing Hardware Problems on Workers
If you see SCSI errors or I/O errors in the worker’s log files, this may indicate problems with the disk or RAID controller. In this case, do the following:
1 Have your data center administrator inspect the worker node hardware.
   • If the administrator finds a failed disk, the disk should be replaced.
   • If the administrator finds a failed RAID controller, the controller should be replaced and the disk reformatted.
   Important! When you replace the RAID controller, you must reformat the disk. A disk that has been operating with a failed RAID controller may contain corrupt data.
2 After the disk has been replaced, reboot the node. Upon rebooting, the worker node performs a number of checks and reformats the disk automatically if needed. The steps Aster Database takes are:
   a If the partition table is not found on the disk (this can happen when the node is new or one of the disks on the node has been replaced), then Aster Database automatically reformats the disk. Once reformatting is complete and the node shows as Prepared in the AMC, go to Step 3 below.
   b If the partition table exists on the node, then Aster Database compares the actual partition size to the size recorded in the partition table. If the actual size on the disk is the same as the size stated in the partition table, or within the tolerance threshold of the stated size, then the node is allowed to advance to Prepared state, and you can go to Step 3 below.
   (Aster Database 3.0.2 and later allows the tolerance to be specified.) If the actual size on the disk differs from the stated size by more than the tolerance, the node is marked as Failed. Reformat the disk manually, or increase the tolerance threshold on the cluster (this requires the cluster to be shut down before you increase the threshold), and run /etc/init.d/local start if local is not already started.
3 After the node is marked Prepared in the AMC, it is recommended that you run a sanity check that writes and reads a large amount of data to and from the disks on that node before the node is added back into the cluster. These tests can reveal remaining, unfixed disk issues. You can perform these checks using the following commands.
   Write test:
   time dd if=/dev/zero of=/primary/tmp/zero.out count=100 bs=1G
   Read test:
   time dd if=/primary/tmp/zero.out of=/dev/null count=100 bs=1G
   The first command creates a 100 GB file on /primary and the second one reads it. Delete /primary/tmp/zero.out after the tests are completed; otherwise it will continue occupying 100 GB of space on the node:
   rm /primary/tmp/zero.out
4 Restore the replication factor of your cluster as shown in Step 3.

Hardware Configuration Tab
The Hardware Configuration tab shows information about the current hardware configuration.

Figure 21: The Hardware Configuration tab in AMC

Partition Map Tab
The Partition Map tab shows a graphical representation of the cluster with details for each node.

Figure 22: The Partition Map tab in AMC

Replication Factor
The replication factor (“RF”) is the number of copies of your data that are stored in Aster Database to provide tolerance against failures.
Maintaining an RF of two ensures Aster Database is resilient to node and queen failures. While you can run Aster Database at an RF of one, Teradata Aster strongly recommends that you run with an RF of two. During operation of the cluster, hardware failures can cause the RF to fall below two, at which point you must take action to restore the RF.

Figure 23: Replication Factor in Aster Database

Replication factor (“RF”) indicates the number of full copies of data in the cluster:
• RF=2: When Aster Database has two copies (an original and a copy) of every piece of data, we say that the cluster has a current RF of two. With an RF of two, Aster Database is able to tolerate the removal or failure of any single node while remaining available and ensuring data safety. With an RF of two, there is also a replica of the queen’s data, which ensures that you can perform queen replacement (“Queen Replacement” on page 78) if the queen fails.
• RF=1: When the AMC reports an RF of one, it means that at least one node has lost its replica. This occurs when a node fails or is removed. An RF of one means that one full copy of data remains in the cluster. With an RF of one, Aster Database remains available for querying and loading, but the loss of another node might result in data loss. When RF falls to one, you must restore it to two as soon as possible. This is described in “Restoring Replication Factor” on page 55.

Checking the Current Replication Factor
To check the current RF:
1 Open the AMC in your browser.
2 Click the Nodes tab.
3 In the upper right corner, look in the Replication Factor field in the green status box.
   Figure 24: The Replication Factor field
4 Click the Partition Map tab for details.
   In the upper right corner of the Partition Map tab, the Replication Factor information box shows the current RF and lists the number of virtual workers running at RF=2 and the number running at RF=1. Inspect the partition map to find the failed and suspect nodes.
   Figure 25: Replication Factor in the Partition Map tab

Restoring Replication Factor
Addressing Failed and Suspect Nodes
In some cases, as described in the previous section, the replication factor (“RF”) in Aster Database may fall to one, instead of the recommended RF of two. If this happens, you should restore the RF. Follow the steps below, in the order shown:
1 Check the node status in the AMC. Click on the Nodes: Node Overview tab, find the node, and check its Status:
   a If the node is marked “Suspect,” check for and fix hardware problems as explained in “Addressing Hardware Problems on Workers” on page 52. If you wish to attempt to restore the RF now, without using the suspect node, proceed to Step 2. If no hardware problems are found, proceed to Step 3.
   b If the node is marked “Failed,” note that a node may temporarily display as Failed during the course of a regular soft or hard reboot. In that case, the cluster may just need time to come up completely. Do not attempt to reboot the node or take any other actions until you are certain the cluster has come up completely and that the node is still displaying as Failed. If a worker node goes into a “Failed” state after being added to the cluster:
      • Connect to the worker node and look at /var/log/installer.log to make sure the installation on the node succeeded.
      • If the node is going through a reboot cycle, wait until the node completely boots up, and then check the log file.
      • If there is no error message in the installation log, the state of the node should change from Failed to Preparing/Prepared after the node has gone through a reboot cycle.
      • If the node remains “Failed”, check for and fix hardware problems as explained in “Addressing Hardware Problems on Workers” on page 52. If you wish to attempt to restore the RF now, without using the failed node, proceed to Step 2. If no hardware problems are found, perform a node restart (type /etc/init.d/local restart) on the node and wait for it to show up as “Prepared” in the AMC. Once the node is prepared, proceed to Step 3.
2 If hardware problems are found, you should fix them as soon as possible, but in the meantime you might be able to restore the RF to 2 using the existing set of nodes. To do this, you will perform Step 3 below, but first you must check whether the cluster has enough space to replicate the data that was stored on the failed node. Click on the Nodes: Node Overview tab and, for each node, click its Data Stored graph to check its remaining free space. Make sure there is more free space than the total amount of data that was on the failed node. If there is enough space, proceed to Step 3. Otherwise, you must replace the failed node hardware before you can restore the RF to 2. (See “Addressing Hardware and Networking Problems on Workers” on page 51.)
3 In the AMC, click the Admin tab and click Balance Data. This balances the storage and brings the RF to 2. This is an online operation and does not interrupt currently running queries or other transactions. When this completes, the cluster will be in one of the following states:
   • Active state indicates all nodes are working. You have successfully restored the RF.
   • Imbalanced state indicates processing has not yet been balanced. Proceed to Step 4.
4 In the AMC, click the Admin tab and click Balance Process. This balances the processing.
   This is a blocking operation (any running queries will be cancelled) and will take a few minutes to complete. At the end of this operation the cluster’s status changes to Active, indicating you have successfully restored the RF.

Note: When you click Balance Data in the AMC, the system balances the storage (the number of logical workers) across all worker nodes. This syncs each vworker with its replica. At the end of the balance data operation:
• If the number of active logical workers is balanced (that is, if the cluster is processing-balanced), then you do not need to click the Balance Process button. The cluster status becomes Active. This happens if all the suspect vworkers happen to be passive vworkers.
• If the number of active logical workers is not balanced (that is, if the cluster is processing-imbalanced), then the AMC shows a cluster status of Imbalanced and you must click the Balance Process button.

Changing the Replication Factor
To change the replication factor of Aster Database from RF=1 to RF=2, do this:
1 Open a console window and SSH or log into the console of the queen as root user.
2 In a text editor, open the file /home/beehive/config/goalReplicationFactor
3 The file contains a “1”. Change this to a “2”, and save the file.
4 Perform a soft restart on the queen:
   # ncli system softrestart
5 Point your browser to the AMC on the new queen, go to the Admin: Cluster Management tab, and click Activate Cluster.

Note: You can also reduce the RF from 2 to 1 using the procedure shown above, but Teradata Aster does not recommend doing this, because reducing the RF to 1 has the effect of deleting the backup copies of your data.

Individual Node Inspection Tab
To see details about a node, click its name.
The Node Data tab provides information about the virtual workers.

Figure 26: The Node Data tab in AMC

The Node Hardware Stats tab provides information about CPU, memory, network and disk usage.

Figure 27: The Node Hardware Stats tab in AMC

The Node Data tab provides information about the virtual workers.

Figure 28: The Node Data tab in AMC

Reading Aster Database Logs
The AMC provides administrators with easy access to both Aster Database-wide system logs and system logs for individual nodes, including each node’s preparation log, system log and kernel log. Logs can be retrieved via the Individual Node Inspection tab by following these steps:
1 Navigate to the Nodes Panel in the AMC.
2 In the Node Overview tab, click the name or IP address of the node whose logs you wish to view, or click the queen’s entry if you wish to view system logs.
3 An Individual Node Inspection tab appears as the right-most tab. In its upper right corner are the Logs links.
4 Click the desired Logs link, which is one of:
   • preparation log (the log of events related to the process of preparing a node for participation in Aster Database);
   • system log (the contents of the Linux syslog file /var/log/messages); and
   • kernel log (the contents of the Linux kernel buffer provided through dmesg).
5 The log appears, showing the latest 1000 lines. Click Refresh at any time to load the latest 1000 lines.

In the log window, you can search the log by typing a search term in the Enter terms field and clicking Search.
Creating Log Bundles for Support Inquiries
Rather than reading log files one by one as described in the previous section, you can request the AMC to create a compressed file containing multiple logs. This is useful when you want to send logs back to Teradata’s support team for troubleshooting. For more information, see “Aster Database Logging” on page 194.

Aster Database Log Format
The log format is as follows (the event-id and RCID fields are optional; some Aster Database components provide these fields and others do not):

timestamp severity-code PID source-filename:line-number event-id RCID ] message

• timestamp: the time the log entry was created, formatted according to the ISO-8601 standard, yyyy-mm-ddTHH:MM:SS.uuuuuu. Per the standard, a “T” time designator introduces the clock-time portion of the timestamp. The “uuuuuu” in the description above indicates the microseconds portion of the time. For example, a timestamp might look like: 2010-03-23T11:26:13.185081.
• severity-code: a four- or five-letter code indicating the importance of the event being logged. The codes are:
   INFO: Informational message; conveys useful information about regular, steady state operation.
   WARN: Indicates unexpected behavior; should be investigated. The system continues to operate normally, but you may be suffering degraded performance.
   ERROR: Some operational error occurred. The operation will abort and the error will be user-visible.
   FATAL: A non-recoverable error happened. Component will abort.
• PID: Integer code that identifies the process that generated the event being logged.
• source-filename:line-number: the name of the executable source file that produced the log entry, followed by a colon and the line number of the application code line that produced the log entry. For example, StatServer.cpp:105.
• event-id: Optional. The event-id is present only if the event-producing component is one that uses event-ids.
   The event-id is used by the Aster Database Log Server in conjunction with the Aster Database Alerting Framework (Blackbird) to trigger alerts for Aster Database administrators. The event-id has the format XXnnnn, where XX is a two-letter code that identifies the Aster Database component (such as “BA” for Aster Database Backup), and nnnn is a four-digit, component-defined event type identifier. For example: BA0012.
• RCID: Optional. The request context associated with this log. If there is no request context, then this field is omitted. For example, 3563712506369035985.
• The right square bracket (]) marks the start of the message portion of the log entry.
• message: an arbitrary-length message, in text format. For example, Disk full detected.

Log format example
Below, we show a sample log entry. This example is a single line, but, depending on the format in which you are reading this document, it may appear here split across multiple lines:

2010-03-23T11:26:13.185081 INFO 30459 StatServer.cpp:105 ST0012 3563712506369035985] Disk full detected

Setting Aster Database to log more or fewer log entries
By default, Aster Database logs messages of all severities. If you wish to improve Aster Database’s performance by having it log only the more severe events, contact Teradata support and have them configure your cluster to skip the logging of INFO events (and optionally other lower-importance events). Our support personnel set this via the minLogSeverity flag, where a value of 4 shows all events, 5 shows WARN and above, 6 shows ERROR and above, and 7 shows only FATAL events. Values lower than 4 are reserved for future use. Some log messages cannot be disabled by the minLogSeverity setting. To disable all INFO messages, set minLogSeverity=5 and logInfoMaxVerbosity=-1.
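As an illustration of the log format and severity levels described above, the following Python sketch splits a log entry into its fields and filters entries by severity when reading logs offline. It is a minimal sketch, not part of Aster Database: the names LOG_PATTERN, SEVERITY_LEVEL, parse_log_line, and passes_min_severity are invented for this example, the regular expression is derived only from the field list above, and the mapping of INFO to 4 is an assumption inferred from the minLogSeverity values (4 shows all events, 7 shows only FATAL).

```python
import re

# Hypothetical pattern built from the documented field order:
# timestamp severity-code PID source-filename:line-number [event-id] [RCID] ] message
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<severity>INFO|WARN|ERROR|FATAL)\s+"
    r"(?P<pid>\d+)\s+"
    r"(?P<source>[^\s:]+):(?P<line>\d+)"
    r"(?:\s+(?P<event_id>[A-Z]{2}\d{4}))?"   # optional, format XXnnnn
    r"(?:\s+(?P<rcid>\d+))?"                 # optional request context
    r"\s*\]\s*(?P<message>.*)$"              # "]" introduces the message
)

# Assumed numeric levels, inferred from the minLogSeverity description.
SEVERITY_LEVEL = {"INFO": 4, "WARN": 5, "ERROR": 6, "FATAL": 7}


def parse_log_line(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["line"] = int(fields["line"])
    return fields


def passes_min_severity(fields, min_log_severity):
    """Offline filter: keep entries at or above the given numeric level."""
    return SEVERITY_LEVEL[fields["severity"]] >= min_log_severity


sample = ("2010-03-23T11:26:13.185081 INFO 30459 StatServer.cpp:105 "
          "ST0012 3563712506369035985] Disk full detected")
entry = parse_log_line(sample)
print(entry["severity"], entry["event_id"], entry["message"])
```

This filter only affects what you see when reading a log file that has already been written; it does not change what the cluster logs, which is controlled by Teradata support as described above.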
It is not currently possible to disable all WARN, ERROR, or FATAL messages.

Verbose logging for debugging
The logging system can be set to provide verbose logging to help Teradata support investigate problems on your cluster. Verbose logging is turned off by default. You must contact Teradata support to have it turned on. When activated, it can be set to show the amount of verbosity that is needed (level 0, 1, or 2, with level 0 producing the least logging text and the highest level producing the most). Our support personnel set this via the maxLogVerbosity flag.

Components not using this log format
As of version 4.5.1, the following Aster Database components do not use the standard Aster Database logging format:
• ODBC, JDBC and OLEDB drivers
• Aster Database Loader
• AMC

Detecting and Managing Skew in Aster Database
Data skew and processing skew are among the biggest performance killers in an MPP environment. Data skew is caused when your table’s distribution key column contains data with an uneven distribution. Processing skew may be caused by data skew when joining tables, an imbalance of values in the distribution key, heterogeneous hardware, slow or malfunctioning hardware or software, or a combination of these.

Table Skew (Data Skew)
First, validate the distribution of the distribution key for a partitioned table. In this example we query the table MyTable, which has a distribution key on the userid column.

SELECT userid, COUNT(*) AS usercount
FROM mytable
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;

Note that, in this example, the userid at the top of the list has over 50 times as many rows as the next userid. This will definitely cause processing skew when joining any other table via the userid column, even if it is on the same worker.
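To make this check repeatable, the top rows returned by the query can be reduced to a single ratio. The Python sketch below is illustrative only: the function names top_key_ratio and is_skewed, the sample counts, and the threshold of 10 are assumptions made for this example, not Aster Database features or recommended values. It compares the row count of the most frequent distribution key value with that of the next one.

```python
def top_key_ratio(counts):
    """Ratio of the largest row count to the second largest.

    counts: the usercount column in descending order, as returned by a
    GROUP BY ... ORDER BY 2 DESC query on the distribution key.
    """
    if len(counts) < 2 or counts[1] == 0:
        raise ValueError("need at least two non-zero counts to compare")
    return counts[0] / counts[1]


def is_skewed(counts, threshold=10.0):
    """Flag the table when one key value dominates the next by `threshold`x.

    The default threshold is an arbitrary value chosen for this sketch.
    """
    return top_key_ratio(counts) >= threshold


# Made-up counts for illustration; the first key dominates the rest.
usercounts = [5_200_000, 98_000, 97_500, 90_100]
if is_skewed(usercounts):
    print("distribution key looks skewed; ratio =",
          round(top_key_ratio(usercounts), 1))
```

A single dominant value usually points to a default or placeholder key, which is the situation discussed in the causes that follow.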
Possible causes of this condition:
• The application that populated the table inserted a DEFAULT value in the distribution key column of too many records.
• Errors occurred in the ELT/ETL processing when you loaded the table.

How do we fix this? First, is there another column that would work as a distribution key? The userid column was likely selected because it is the distribution key of other tables that this table is joined with, so we may not be able to choose another column. Next, find out what causes this particular value to be inserted by the ETL/ELT process. Look for processing or logic errors that make this particular userid so prominent. If the logic is correct, an alternative is to apply a RANK value and make it negative, so that rows with this userid can be easily excluded from reporting logic but remain available for JOIN operations, and no data is lost.

For example, suppose the current process does an INSERT/SELECT operation from a staging table where the userid may be NULL:

INSERT INTO MYTABLE (userid, ... other columns ... )
SELECT COALESCE(userid, 3089263269635597179) ...
FROM mytable_staging;

As you will note, this converts every NULL value in the userid column into the same value, thus causing skew. There are several algorithms that will work to create unique userid values. One is to use a SEQUENCE. Another is to use the RANK function with a negative multiplier, for example:

Step 1:
INSERT INTO MYTABLE (userid, ... other columns ... )
SELECT UserId, ...
FROM mytable_staging
WHERE userid IS NOT NULL;

Step 2:
INSERT INTO MYTABLE (userid, ... other columns ... )
SELECT RANK() OVER (ORDER BY another column) * -1 AS userid ...
FROM mytable_staging
WHERE userid IS NULL;

There are other variations, such as maintaining a physical table with the lowest value and doing a cross join to get the starting point. This requires that the ETL be broken into multiple steps and that related data be assigned the same userid, but the effort in the front end (ETL/ELT) will be more than worthwhile when it comes to relieving skew during reporting (and ELT).

Partition Level Skew
Checking vWorker (partition) size: To check the size of partitions and their distribution, run the nc_skew function from the Admin: Executables tab of the AMC. See “nc_skew” on page 180. If any vworker is substantially larger than the others, please contact Teradata support to help identify the table that is taking up the space and reduce its size.

Checking the Table Size
Aborted data load operations can consume disk space on workers. To get the on-disk sizes of tables, use the Table Size functions in the Admin: Executables tab of the AMC. See “nc_tablesize” on page 179. If a table appears to be larger than its row count would warrant, please contact Teradata support for help.

Processing Skew
Information about worker node-level processing skew can be obtained using Ganglia. To find out whether a particular user query is experiencing processing skew, follow the steps below:
• Given the user name, use system tables to find the start time of the transaction.
• Run ps -ef | grep postgres on the queen node to find the start time of the Postgres session.
• Issue ps -ef | grep <username>|grep <starttime>|grep -v idle across all workers using ClusterSSH. This tells us on which partitions and on which worker nodes the query is still running.
• If only one partition is still running, then you might be suffering from processing skew. Go to that node and monitor this Postgres process to see if it is heavy on CPU or I/O or memory.
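The last two steps above amount to finding the workers on which the query still has non-idle processes. A minimal Python sketch of that comparison follows. It assumes you have already collected the ps output from each worker (for example through ClusterSSH) into a dictionary keyed by worker name; the function names, the dictionary layout, the host names, and the sample lines are all hypothetical, and the sample lines are simplified stand-ins rather than real ps output.

```python
def running_workers(ps_output_by_worker, username, start_time):
    """Count matching non-idle process lines per worker.

    Mirrors the shell filter: grep <username> | grep <starttime> | grep -v idle.
    Workers with no matching lines are left out of the result.
    """
    result = {}
    for worker, text in ps_output_by_worker.items():
        matches = [
            line for line in text.splitlines()
            if username in line and start_time in line and "idle" not in line
        ]
        if matches:
            result[worker] = len(matches)
    return result


def possible_processing_skew(running):
    """True when exactly one process (partition) is still running cluster-wide."""
    return sum(running.values()) == 1


# Simplified, made-up lines standing in for collected ps output.
collected = {
    "worker-1": "alice 11:26 SELECT\nalice 11:26 idle",
    "worker-2": "alice 11:26 idle\nbob 11:26 SELECT",
}
still_running = running_workers(collected, "alice", "11:26")
if possible_processing_skew(still_running):
    print("only one partition still running on:", list(still_running))
```

A positive result is only a hint; as described above, the next step is to inspect the remaining process on that node for CPU, I/O, or memory pressure.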
CHAPTER 5 Cluster Expansion

Contents of this section:
• Add New Nodes to the Cluster
• Incorporate the New Nodes
• Balance Process
• Splitting Partitions in Aster Database
• Appendix: Troubleshooting Cluster Expansions

Add New Nodes to the Cluster

Perform the following steps to add a new worker or loader to your cluster. This procedure installs the Aster Database software on the worker and loader machines and adds them to the cluster.

Prerequisites: Before you add nodes, make sure you do the following:
• Set up the operating system on each node and apply any necessary patches.
• Set up passwordless node-to-node SSH for the root user.
• Make a list of the IP addresses for each node to be added.
• If the prospective node machine has been previously used as an Aster Database node, then you may wish to clean its file system as explained in “Deleting All Data to Re-Provision a Node” on page 76. Alternatively, you can leave the old data in place and tick the Clean Node checkbox to allow Aster Database to delete the old data when adding the machine as a new node.

Warning! Be sure there is no data stored on the worker and loader machines. If you are using machines that have seen previous service as Aster Database workers, re-install the operating system on them, or clean their filesystems as explained in “Deleting All Data to Re-Provision a Node” on page 76.

Warning! Be sure the worker and loader machines all have the same time zone setting as the queen.

Procedure:
1 Make a note of the IP addresses of all machines you will use as workers.
2 Open the Aster Database Management Console (AMC) in a browser window. To do this, navigate to http://<ip address of the queen>.
3 In the login window, type the username db_superuser and the password, db_superuser.
4 Click Admin: Cluster Management.
5 In the Admin: Cluster Management: Nodes tab, click the Add Nodes button. The Add Nodes window appears.
6 In the Add New Node window, for each node you wish to add:
• Select a Node Type of worker (or loader if you want to add a loader).
• Choose IP to identify nodes by IP address. (It is also permissible to choose MAC, but for a UMOS installation, you will typically use IP.)
• Type the IP Address of the node. Important: Ensure that the worker IP you type here is accessible from the queen box and that password-less SSH for user “root” is configured in both directions.
• Choose a Display Name to identify the node in Aster Database.
• Optionally, type a Rack Id to indicate which hardware rack the node resides in.
• The Clean Node check box is a convenience for administrators who are re-provisioning a node machine that was previously used in Aster Database. If you are not re-provisioning an old node, you may leave this box unchecked. If you are reusing hardware, then you may need to check this box. To remove all Aster Database-related data from the node, check the Clean Node check box. If you do not check this option and Aster Database data is found on the node machine, the Add Node attempt will fail. If you do check this option and Aster Database data or processes are found on the node machine, they will be deleted or stopped and the Add Node operation will proceed.
7 To add more nodes, click the plus sign (+) button in the lower left of the Add Nodes window. You can add all of the worker and loader nodes in parallel.
(That is, you don’t need to wait for a given node to become active or connected before you start adding the next one.)

Warning! If your prospective node machine was previously deployed in Aster Database, the next step is likely to delete the old Aster Database data on the prospective worker node. For information on what will be lost, please see “Deleting All Data to Re-Provision a Node” on page 76.

8 Click OK to dismiss the Add Nodes window. Aster Database adds the nodes. The nodes appear with a status of New, which automatically advances to Installing, Upgrading, and finally Prepared.
9 Wait for all worker and loader nodes to appear as Prepared in the Admin: Cluster Management: Nodes tab, with the correct software version displayed in the Installed Version column. When each node has become Prepared, the message “Add node operation successful” appears in the upper right corner. This does not mean the node is active. Rather, it is ready to be activated.

Tip! It is normal for the nodes to reboot themselves a few minutes after being added. When you install Aster Database, and whenever you install a new Aster Database version, the installation might include operating system updates that require a reboot. The nodes will automatically reboot themselves once to put the operating system updates into effect.

Next Step: Do one of the following:
• If the node appears as FAILED in the Admin: Cluster Management: Nodes tab, see “Node Failures in Aster Database” on page 50.
• If the node appears as PREPARED, proceed to the next section, “Activate Aster Database” on page 70.

Activate Aster Database

In this phase, you will activate the cluster, making it ready to load data and service queries.
Procedure:
1 Open the Aster Database Management Console (AMC) in a browser window. To do this, navigate to http://<ip address of the queen>.
2 Click on the Admin: Cluster Management: Nodes tab. Make sure all new worker and loader nodes in the list show a Status of Prepared.

Tip! See “Understanding Aster Database Status” on page 35 for more details on node statuses. Worker and loader nodes should boot from the queen over the network, go through various phases, and eventually reach the “Prepared” state.

3 Near the top of the Admin: Cluster Management: Nodes tab, click the Activate Cluster button. A confirmation window pops up. Click OK.

Activation takes a couple of minutes. You can watch the status of each node in the Admin: Cluster Management: Nodes tab. The AMC shows cluster status in the upper left corner. The cluster status changes from Stopped to Activating and finally to Active. Once the cluster is Active, it is ready to load data and service queries.

Incorporate the New Nodes

In an Active cluster, you can incorporate new or repaired nodes and bring them to a Passive state (they host backup v-workers, which act as stand-bys but do not process queries) without disrupting queries and loading operations. Bringing nodes to an Active state (they host active v-workers, which process queries), by contrast, has the side-effect of briefly disrupting queries and loading operations.

To incorporate one or more nodes:
1 Make sure the new nodes you wish to incorporate are in the Prepared state.
2 In the Nodes panel of the AMC, click the Activate Cluster button. For further information on cluster activation, see “Activating Aster Database” in the Aster Database Administrator Guide.

Next step: After incorporation, the new node(s) may be in the Active or Passive state.
Either Active or Passive is an acceptable state for a node, but for performance you should strive to keep all nodes in the Active state. When a node is Passive, it’s acting as a standby that holds copies of v-workers’ data, but it is not contributing to query processing. You can make the node Active by performing a Balance Process operation on it.

Balance Process

Balance Process is an Aster Database administrative action that load-balances the query processing burden across all worker nodes in Aster Database. It optimizes performance given the current data placement. The Balance Process step does not create new copies of data, so it typically runs quickly. It also does some cleanup, deleting data that can no longer be used. It briefly disrupts the cluster and aborts any in-progress transactions; the disruption can last from a few seconds to a few minutes. It is recommended that Balance Process be run at some point after Balance Data completes, when a few minutes of downtime are acceptable, so that a new node’s processors are available to the cluster.

You initiate Balance Process in the AMC by clicking the Balance Process button in the Admin: Cluster Management tab. See the next section for instructions.

Balance Process: The Procedure

Warning! While the Balance Process step is in progress, your cluster cannot process queries. Before you perform the next step, make sure all running queries have finished successfully and that no new queries are allowed to enter Aster Database. Already-running queries will be killed when you click Balance Process, and new queries submitted after you click it will wait until the balancing is complete.

1 Click the Balance Process button in the AMC’s Admin: Cluster Management tab to force currently passive nodes to contribute to the handling of database queries. This interrupts the operation of the cluster.
This step places active v-workers on all nodes that are able to host them. This action takes at least a few seconds and as many as a few minutes to complete. Once processing is balanced, the cluster resumes handling queries and the new node(s) are part of the cluster, which you can check by looking for a status of Active in the AMC’s Nodes panel.

Next Steps

If you are scaling out your cluster, you may want to perform a partition split now to increase the number of v-workers in the cluster. See “Splitting Partitions in Aster Database” on page 73.

Splitting Partitions in Aster Database

Partition splitting is an Aster Database feature that helps you add vworkers so that you can maintain an optimal ratio of CPU cores to vworkers as your cluster grows.

To scale out your cluster, you add worker nodes (as shown in “Add New Nodes to the Cluster” on page 66). As you add worker nodes to the cluster, Aster Database does not automatically increase the number of vworkers. In other words, the number of vworkers stays constant as you add worker nodes (machines). This means that, as you add nodes to the cluster, the ratio of CPU cores to vworkers will increase, and eventually your CPUs may become under-utilized. If this happens, you can improve performance by increasing the number of vworkers (also known as “splitting partitions”).

Teradata Aster recommends that you manage your cluster so that you have approximately two CPU cores per vworker. For example, an 8-core node should typically host 4 to 6 vworkers. In order to avoid having to split partitions, you may elect to set up your cluster with 6 vworkers per 8-core node and then add nodes as your data grows, until your ratio falls below 4 vworkers per 8-core node. Once the ratio falls below this point, it’s a good idea to split partitions to make better use of the processing power of your nodes.
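The guideline above can be checked with simple arithmetic. The following Python sketch is a hypothetical helper, not part of Aster Database; it assumes that all worker nodes have the same number of cores and that you already know the cluster-wide vworker count. It expresses the current ratio as vworkers per 8 cores and flags when it has dropped below the floor of 4 mentioned above.

```python
def vworkers_per_8_cores(total_vworkers, worker_nodes, cores_per_node):
    """Return the cluster-wide vworker count normalized to 8 CPU cores."""
    total_cores = worker_nodes * cores_per_node
    return 8 * total_vworkers / total_cores


def should_split(total_vworkers, worker_nodes, cores_per_node, floor=4.0):
    """True when the ratio has fallen below `floor` vworkers per 8 cores."""
    return vworkers_per_8_cores(total_vworkers, worker_nodes, cores_per_node) < floor


# A cluster installed with 6 vworkers on each of four 8-core nodes (24 in total).
print(vworkers_per_8_cores(24, 4, 8))  # 6.0
# Growing to six nodes gives exactly 4 vworkers per 8 cores: not yet below the floor.
print(should_split(24, 6, 8))          # False
# A seventh node drops the ratio below 4, so a partition split is worth considering.
print(should_split(24, 7, 8))          # True
```

The floor is a parameter because the guide gives a range (4 to 6 vworkers per 8-core node) rather than a single hard limit.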
Preparing to Split Partitions

Before you split partitions, if you have not already expanded your physical cluster to the size you need, do so now as shown below. (If you already have enough physical nodes, proceed immediately to “Partition Splitting Procedure” on page 73.)

1 Determine how many vworkers you need. First, check the current partition count by opening a command shell on the queen as Aster Database administrator and viewing the file:
$ cat /home/beehive/config/totalPartitionCount
The count shown is the total, cluster-wide count of partitions. (That is, the count is per cluster, not per node.)

Tip! You can also find out the initial partition count that was configured when Aster Database was installed. To do this, look at the file:
$ cat /home/beehive/config/initialPartitionCount

2 Add the desired number of new worker nodes to your cluster. See “Add New Nodes to the Cluster” on page 66.
3 Activate the new nodes. See “Incorporate the New Nodes” on page 71.

Proceed to “Partition Splitting Procedure”, below.

Partition Splitting Procedure

Partition splitting increases the number of vworkers in your cluster. We refer to the number of active vworkers in your cluster as the cluster’s “partition count.” You will typically perform a partition split after you have added more physical machines to the cluster.

Warning! Your Aster Database system will be unavailable to users during the partition splitting operation.

1 Open the AMC and check the following:
• Make sure Aster Database is Active.
• Make sure all clients and SQL users have logged out from the cluster.
• Make sure all queries have finished running.

Warning! All client sessions to Aster Database need to be terminated before partition splitting can be started.
If any client sessions remain, partition splitting immediately encounters an error and displays this message:
Partition splitting failed: Error notifying queen. Will terminate online partition splitting.
If you see this message, disconnect all clients and run the partition split again.

2 After all queries have finished, open a SQL session in ACT and type:
COMMIT;
Issuing “COMMIT” causes remaining prepared transactions to run, if any are present on the cluster. Check the AMC to verify that all transactions have finished running. After they have finished, proceed to the next step.

3 Set the Aster Database concurrency threshold (the “QoS”) to zero. This prevents all SQL users (even you) from logging in to the system. Use the SetConcurrency.py script at the queen command line:
# cd /home/beehive/bin/utils/support
# ./SetConcurrency.py --setConcurrency=0
Current concurrency threshold is 100.
Setting concurrency to 0...
The command prints the pre-edit threshold. Make a note of this number; later you must return the QoS to this setting. (The default for normal cluster operations is 100.)

4 Run the Change Partition Count utility, which requires at least one argument: the desired partition count. Replace <num> here with your desired count. You can run this as user beehive:
/home/beehive/bin/exec/changePartitionCountExec --desiredPartitionCount=<num>
The desiredPartitionCount must be greater than the current partition count. Optionally, you can also pass the argument -desiredParallelism <num> to indicate that you wish the splitting to be done in a parallel fashion. Replace <num> with the integer number of tables that should be split concurrently at any given moment. Typical values are 8 or 16. The default is 1. Type the --help option for help.
The operation may take a number of hours, depending on the amount of data in your cluster. To complete the operation, soft restart Aster Database. After the restart finishes, the partition split is complete.
If the operation fails, restart it by re-running the changePartitionCountExec utility with the same parameters you used the first time. If the second attempt fails, please contact Teradata support.

5 Log into the AMC and go to the Node: Partition Map tab to monitor the progress of your partition split. Green squares represent active vworkers. When the number of active vworkers reaches your desired partition count, the split is complete. You can also verify your new number of partitions by logging in to the queen command line and viewing the file /home/beehive/config/totalPartitionCount.

6 Once the split operation is complete, you must restore the Aster Database concurrency threshold (QoS) to its normal value. This is typically “100”. Use the SetConcurrency.py script at the queen command line:
# cd /home/beehive/bin/utils/support
# ./SetConcurrency.py --setConcurrency=100
Current concurrency threshold is 0.
Setting concurrency to 100...

Appendix: Troubleshooting Cluster Expansions

If Your “Add Node” Attempt Stalls or Fails

If an Add Node attempt fails or appears to be stalled, you will typically see an Install Failed message in the Admin: Cluster Management: Nodes tab. Additionally, a message will appear in the right-hand corner. You will see the following message if you are trying to add a node that already exists.

To check why the Add Node operation failed, you can call Aster support or log into the machine you tried to add, and check the /var/log/installer.log and /var/log/installerShim.log files for error messages.
• If you see the message, “user data directories are present,” see “Add Node Fails With “user data directories are present” Message” on page 75.
• If the Nodes tab shows “InstallingOS” for more than 15 minutes, see “Add Node Hangs at Installing OS Phase” on page 76.
Add Node Fails With “user data directories are present” Message

When an Add Node operation fails, the most common reason is that Aster Database found data on the machine being added. You should ensure the machine you wish to add as a node doesn’t contain data. If your attempt to add a node fails and returns the message, “user data directories are present,” it probably means the new node machine was previously used as an Aster Database node and was not cleaned up properly before you attempted to re-add it. In this case, follow this troubleshooting step:

Is your cluster currently running with a current replication factor (RF) of two (“RF=2”)? (You can find out by looking at the AMC Dashboard in the Nodes panel, in the Replication Factor section.)
• If current RF=2, then it is safe to delete the node’s data. Proceed to “Deleting All Data to Re-Provision a Node” on page 76.
• If current RF=1, do not delete data files from the node. Instead, call Teradata Support for assistance to ensure no data is lost.

Add Node Hangs at Installing OS Phase

If your cluster is an AMOS cluster that has been upgraded from 4.5.1, there is a known issue that causes node addition to become stuck at the “Installing OS” phase. Please follow the workaround below to fix this problem. This works on version 4.6 or later for AMOS clusters only.

1 Log into the queen as root and run the following command:
cp /home/beehive/installer/InstallerShim /home/beehive/ubuntu/aster/resources/InstallerShim
2 Open the AMC’s Admin: Cluster Management tab, find the node that is stuck at the “Installing OS” phase, and click the “X” in the Remove column to remove it.
3 Add the node again using its MAC address.
Deleting All Data to Re-Provision a Node

Aster Data recommends that before you add a machine as a worker or loader node in Aster Database, you should remove all user data from that machine. This is particularly important if you wish to deploy a machine that has previously served as an Aster Database node in your cluster. The AMC’s Add Node(s) button gives you the option to delete this data (by checking the Clean Node check box), but if you wish to delete the data manually, follow the instructions below.

Warning! If you wish to re-deploy a node that previously served as an Aster Database node, you should make sure the machine does not contain data you need, since you must delete all its Aster-stored data before you re-deploy it. As a guideline, if your cluster is currently running at RF=2 (after removing the node that you will re-deploy), then it is probably safe to delete the node’s data as explained below.

1 To clean up an old Aster Database node for reuse in the cluster, delete the following files and directories:
• /primary/w*z (where the asterisk represents the vworker number. For example, you might see /primary/w5z (vworker number 5) and /primary/w12z (vworker number 12) here.)
• /primary/iointerceptor
• /primary/tmp/worker_status
• /primary/tmp
• /primary/.deleted
• /primary/.olddata
• /primary/upgradeState.*
• /primary/beehive_id
• /primary/initialPartitionCount
• /primary/queenDb* (if present)
2 Empty the contents of /etc/rc.local if the node was previously a part of an AMOS install.
3 Delete the 'beehive' UNIX user and the 'beehive' UNIX group. To do this, SSH into the machine as root, and run userdel beehive and groupdel beehive.
4 Once the node has been cleaned by completing steps 1-3 above, reboot the machine.
Rebooting ensures that data clean-up is completed so that the machine can be re-added as a node in Aster Database at a later time.

Once you have finished the clean-up steps, you may re-add the machine as a new worker or loader node, or deploy it as a new Aster Database queen. Proceed to “Add New Nodes to the Cluster” on page 66 to re-add the machine as a new node, or turn to the installation chapters to install the queen software for your operating system.

CHAPTER 6 Queen Replacement

Aster Database provides a facility for replacing your queen node if it fails. This facility, called queen replacement, requires that you repurpose your loader node as the new queen. The failed queen can later be removed from the rack and replaced with a new loader.

The contents of this document are:
• Best Practices for Ensuring Queen Recoverability
• Install the Queen Software on the Loader Node
• Replace the Failed Queen
• What is kept and what is lost during queen replacement?

Best Practices for Ensuring Queen Recoverability

To be prepared to replace your queen, you should follow the best practices listed below.
• Always run Aster Database with RF=2.

Warning! Always run Aster Database with RF=2 (that is, with a replication factor of two). If you try to replace the queen of a cluster that has a replication factor of 1, it will not work and you may lose all your data.

• Install a backup queen when you install Aster Database. This is a server with the same version of Aster Database software installed as your active queen, but not connected to any workers. (It is also possible to install a backup queen after your queen has failed, and perform queen replacement using that backup queen, but Aster Data urges you to avoid this because racking, cabling, and installing a new queen takes time.)
To set this up: On a server with identical hardware to that of your queen, install the Aster Database software.
• Make a periodic backup of the /home/beehive/smc/.macs file from your active queen. This contains the MAC addresses of the workers. Create a directory /home/beehive/backup-smc on the backup queen and place the backup .macs file there.

Warning! Do not place the backup .macs file into the /home/beehive/smc directory of your backup queen until Step 3 instructs you to do so. Placing the .macs file in /home/beehive/smc causes the queen to assume control of the cluster!

• Make a note of your existing queen’s network settings including the IP address, netmask, gateway, and NIC bonding settings, if any.
• For all scripts and code that you store on the queen, make sure you always keep a copy of the latest version of that script or code in another location not on the queen.
• Set up out-of-band management (also known as “integrated lights-out” or ILO) for each worker node in Aster Database so that you are still able to reboot the node, even if you have lost your main network connection to it.

Install the Queen Software on the Loader Node

To make your cluster able to recover from a queen failure, you must install a backup queen. Install the queen software on the loader node as shown below. This transforms the loader into your backup queen.

For clarity, we’ll call your existing, failed queen the “primary queen” in this discussion. We use the name “backup queen” to refer to the machine that will act as the new queen.

Warning! This procedure converts your loader to be the new queen. The queen can handle loading tasks while acting as queen, but you should contact Teradata support immediately for a replacement loader node.
1 Get the Aster Database installer binary, AsterInstaller_5-0-1_r29677.bin, and copy it to the /tmp directory on the loader node.
2 SSH or log in as user root on the loader node. In order to run properly, the Aster Database installer requires that you be logged in as root.
3 Using a text editor, remove the failed queen’s IP address from the /root/.ssh/known_hosts file.
4 Perform a local stop on the loader node:
# /etc/init.d/local stop
5 Make the installer executable (the installer file name below is just an example; replace it with the appropriate name for use on your operating system):
# chmod +x AsterInstaller_5-0-1_r29677.bin
6 Run the Aster Database installer from a command shell on the loader:
# ./AsterInstaller_5-0-1_r29677.bin
7 In the Welcome screen, select OK and press <Enter>.

What keys do I use to navigate the installer? The upper part of the installer window always lists the actions you can take or the information the installer wants you to supply. Use these keys to navigate:
• Tab moves from field to field. Highlighting shows the currently active field or button.
• Enter executes the highlighted button or field.
• Esc exits the installer at any point. You’ll be asked to confirm before the installer aborts.
• The Help button provides context-sensitive help and information.
• Ok and Back allow the user to navigate between screens.
• Keyboard shortcuts are indicated by the underlined letter; for example, Ok can be “clicked” by pressing Shift+O. Note: Shortcuts are not triggered by the Alt key in the installer.

8 In the Previous Installation screen, choose 2. No, perform a clean install and click OK. All Aster Database-related data will be deleted from the node on which you are installing. Wait for the cleanup to finish, after which the installation continues automatically. If prompted to uninstall an earlier version of Aster, please do so.
9 The Manage node operating system window appears. Choose No, Node OS is pre-installed.
10 In the Installation Type screen, choose Production Install.
11 In the Select the network device screen, highlight bond0 and press <Enter> to choose the primary interface used to communicate with the worker and loader nodes.
12 In the Verify Networking Information screen, check your network settings and, if needed, fix them as explained below:
• In the Enable NIC bonding field, type “y”.
• The Slave Network Interfaces field lists the slave network interfaces to be bonded into the connection. Specify these as a comma-separated list without spaces, as in: eth4,eth5.
• In the Default gateway IP field, always use the address of an actual gateway. Do not use the queen’s IP address as the gateway address.
• Enter/edit the NTP server IP address (if needed). Aster Database nodes use the Network Time Protocol (NTP) to synchronize their clocks. By default, the hardware clock of the queen (at IP 127.127.1.0) is used as the NTP server (the synchronizing clock). You can change this IP to that of another NTP server.
Note: Also shown in this window are the machine’s IP Address, Netmask, Default gateway IP, Subnet prefix address, and Broadcast IP. The installer reads these settings from the node’s existing network configuration, so you typically do not need to edit these fields.
• To continue, Tab to OK and press <Enter>.
13 In the SSH private key for root user screen, choose No since you have already set up passwordless SSH between all nodes in your cluster.
14 To select the Queen node type, choose Backup Queen.
15 In the Please Provide... screen:
• In How many worker nodes, specify the number of worker nodes (machines) in the Aster Database cluster.
• In How many CPU cores, specify 2.
• Tab to OK and press <Enter>.
16 Accept the default for Number of primary virtual workers.
17 In the Database replication factor field, it is very important that you keep the default value of “2”. Running with a replication factor of two means that your data is likely to survive the loss of a worker node, and it ensures that if the queen fails, you can replace it.
18 Click OK and wait for the installer to install the Aster Database queen software.
19 Once the Aster Database installation is complete, reboot the machine when prompted. After the installer finishes, the machine reboots and the services are started. When installing a new queen, the Aster Database Services may take ten or more minutes to start because they must apply a number of SQL upgrades and other upgrades.
20 SSH back into the machine, and verify that the newly installed Aster Database software is running:
# /etc/init.d/local status
Look for a status of “started.” (A status of “starting” means the queen is not yet ready.)

Tip: It is normal for the queen to reboot itself a few minutes after being added. When you install Aster Database, and whenever you install a new Aster Database version, the installation might include operating system updates that require a reboot. The queen will automatically reboot itself once to put the operating system updates into effect.

Your backup queen software is installed.

A note about node names in these instructions

From this point on, we’ll refer to the machine that used to be your loader node as the “backup queen.” We’ll use the name “failed primary queen” to refer to the queen that has failed.

Replace the Failed Queen

These instructions assume you have a backup queen with an IP address whose final octet value is lower than that of the failed primary queen.
For example, if your queen has the address 10.61.2.100, then your backup queen should have an address of 10.61.2.99 (or one with a lower final octet value). Workers’ final octets are assumed to be .101, .102, and so on.

Outline of the Queen Replacement Procedure:
• Prerequisites
• Build the MAC Address File
• Removing the Failed Primary Queen
• Setting Up Passwordless Root SSH Among All Nodes
• Shutting Down Workers
• Shutting Down and Configuring the Backup Queen
• Swapping Old and New Queens’ Network Settings
• Backup Queen is Now Primary Queen
• Running the Queen Replacement Script

Prerequisites

1 Your cluster must have a replication factor of 2. If you try to replace the queen of a cluster that has a replication factor of 1, it will not work and you may lose all your data.
2 If you don’t already have a backup queen, install a backup queen now as shown in “Install the Queen Software on the Loader Node” on page 79.
3 Check your hardware and network: Make sure that all the worker nodes will be able to boot and that the replacement queen will be able to reach them over the network.

Build the MAC Address File

1 If a MAC address list file is available, get it now. This is the file that lists the MAC addresses of all the worker nodes in your cluster. Do not place this file in the /home/beehive/smc directory of your backup queen! Instead, keep it in a backup directory. In this example, we will save it as /home/beehive/backup-smc/.macs on the backup queen.
a Get the .macs file in one of these ways:
If the filesystem of the failed primary queen is still available, you can copy the file from there directly:
scp <OLDQUEEN>:/home/beehive/smc/.macs <STANDBYQUEEN>:/home/beehive/backup-smc/.macs
If that file or filesystem is unavailable, then you might have a backup copy you can use. The backup is usually saved as /home/beehive/backup-smc/.macs on your backup queen.
If you do not find a usable .macs file, continue to Step 2, below. If you do find a usable .macs file, proceed to Step b, below.
b Open the .macs file in a text editor and add the IP address of each worker to its line in the file. Format it as shown below in “Format of the .macs file.” Save the file as /home/beehive/backup-smc/.macs on your backup queen.
2 If no MAC address list file is available, create one now:
a First gather the MAC addresses of all worker nodes in the cluster. Log into each worker and use the ifconfig command to get the MAC (HWaddr) and IP (inet addr) addresses corresponding to the interface used for communication with Aster Database (typically this is the eth0 or bond0 interface). Make a note of the addresses for each worker.
b Using a text editor, edit or create the .macs file. Format it as shown below in “Format of the .macs file.” Save the file as /home/beehive/backup-smc/.macs on your backup queen.

Format of the .macs file:
The .macs file contains a list of MAC address / IP address pairs. Each pair occupies one line followed by a line break and an empty line terminated with another line break. In other words, there are two line breaks after each pair. Format each MAC address as numbers and lowercase letters only, without colons, dots, or other punctuation. Place one space character between the MAC address and the IP address. Format the IP address as four decimal octet values separated by dots. A file with two entries might look like what’s shown below. (Note the empty line between the two entries, which is required!)
00215c014b25 10.61.2.106

0024e8a0bd22 10.61.2.107

Removing the Failed Primary Queen
1 If your failed primary queen is still running, change its IP address so that the backup queen can assume the failed primary queen’s current IP address. To do this:
• Perform a soft shutdown on the failed primary queen:
# ncli system softshutdown
• Set the IP address of the failed primary queen in /etc/network/interfaces, giving it a lower final octet, such as .98.
• Shut down the failed queen machine with “halt”.

Setting Up Passwordless Root SSH Among All Nodes
The queen replacement script requires that passwordless SSH be set up among all nodes for user root and user beehive:
1 SSH into the backup queen as root. Unless noted otherwise, you will perform the following steps as root on the backup queen.
2 Copy the ssh configurations for the root and beehive users from one of the worker nodes to the backup queen:
# scp -r root@<worker_IP_address>:/root/.ssh/* /root/.ssh/.
If prompted, type the root password.
3 Become user beehive:
# su - beehive
4 Copy user beehive’s keys to all workers and loaders:
$ scp -r beehive@<worker_IP_address>:/home/beehive/.ssh/* /home/beehive/.ssh/.
If prompted, type user beehive’s password.
5 Type exit to become root user again.
$ exit
6 Verify passwordless access among nodes and users:
a As root, verify passwordless access from this node to the other. You should not be prompted for a password:
# ssh root@<node_IP_address>
# exit
b As beehive, verify passwordless access from this node to the other. You should not be prompted for a password:
# su - beehive
$ ssh beehive@<node_IP_address>
$ exit
c Type exit to become root user again.
$ exit
d As root, verify passwordless access from root at this node to beehive at the other node:
# ssh beehive@<node_IP_address>
# exit
7 Repeat the steps above to copy the backup queen’s keys to all nodes.

Shutting Down Workers
1 Shut down all workers:
a Log into each worker as root.
b Run “local stop”:
# /etc/init.d/local stop
c Shut the worker down with the halt command or an ILO power-off command:
# halt
d Repeat until all workers have been shut down.

Shutting Down and Configuring the Backup Queen
1 SSH into the backup queen as root. Unless noted otherwise, the steps after this one are done on the backup queen.
2 Soft Shutdown: Perform a soft shutdown of the backup queen by issuing the command:
# ncli system softshutdown
Note: If the soft shutdown attempt fails, open the AMC of the backup queen, go to the Admin: Cluster Management screen, click the Activate Cluster button, wait for the backup queen to become active, and then retry the soft shutdown.
3 Source the /etc/profile.d/asterenv.sh file by issuing:
# source /etc/profile.d/asterenv.sh
Warning! If passwordless SSH access is not set up between all nodes (both from the queen to the nodes and from the worker nodes back to the queen), the rest of this procedure will fail. If you haven’t checked your passwordless SSH configuration in all directions, check it now as explained in “Setting Up Passwordless Root SSH Among All Nodes” on page 84.
4 ConfigureOS.py: Run the ConfigureOS.py script to reset the backup queen's OS-level network settings to match those of the failed primary queen. To do this:
• Go to the configure directory:
# cd /home/beehive/bin/lib/configure
• Run the script.
Assuming a failed primary queen whose IP address (sysman_ip) is 10.61.2.100, you would type:
# ./ConfigureOS.py --restore_conf --configure_network --sysman_ip=10.61.2.100 --mirror_host=10.61.2.100
(Note: To see the list of arguments, type: ./ConfigureOS.py --help)
5 ConfigureNCluster.py: Run the ConfigureNCluster.py script to reset the backup queen’s internal Aster Database network-related settings to match those of the failed primary queen. To do this:
• Continue working in the /home/beehive/bin/lib/configure directory.
• Run the script. Assuming the network settings outlined above, you would type:
# ./ConfigureNCluster.py --configure_network --clear_hosts --sysman_ip=10.61.2.100

Swapping Old and New Queens’ Network Settings
1 Make sure the failed queen is unreachable: Before you proceed, make sure the failed queen is either offline or has had its address changed as shown in “Removing the Failed Primary Queen” on page 84.
2 Swap old and new queens’ network settings: Change the network settings on the secondary queen, so that it assumes the IP address of the failed queen. On appliances, manually change the following network configuration on the secondary queen. This example assumes the IP address of the failed queen is 39.64.8.14 and the IP address of the secondary queen is 39.64.8.28.
Warning! On appliances, do not use YaST to change the network configuration. Doing so can cause problems with Server Management.
a Replace each occurrence of the secondary queen’s IP address with that of the failed queen:
# cat /etc/sysconfig/network/ifcfg-byn0
# BYNET (byn0) configuration
BOOTPROTO=static
IPADDR=39.64.8.14
NETMASK=255.240.0.0
BROADCAST=39.79.255.255
MTU='65536'
STARTMODE=onboot
b Create the smainfo file by copying the /etc/opt/teradata/sm3g file to /var/opt/teradata/bynet/smainfo:
# cp /etc/opt/teradata/sm3g /var/opt/teradata/bynet/smainfo
c Edit the smainfo file to change the BYNETIP address to the secondary queen’s IP address:
# vi /var/opt/teradata/bynet/smainfo
PMA 111
BYNETIP 39.64.8.28
BYNETMASK 255.240.0.0
Important: You must leave the PMA setting as it is; change only the BYNETIP setting.
d Restart BYNET:
# /etc/init.d/bynet restart
e Run ifconfig byn0 and confirm that the IP address has changed:
# ifconfig byn0
f Using a text editor, edit the following files on the secondary queen. In each file, replace each occurrence of the secondary queen’s IP address with that of the failed queen:
/home/beehive/cluster-management/hosts
/home/beehive/config/beehive.cfg
/home/beehive/config/beehiveparams.cfg
/home/beehive/config/ganglia/gmond.conf
/home/beehive/config/queenDbReplicas
/home/beehive/config/snmpd.conf
/home/beehive/installer/config/installerConfig
If there is no file named /home/beehive/config/queenDbReplicas, this is normal, and you can move on to the next step.
3 Reboot: Reboot the secondary queen to force it to assume its new IP address. Restarting network services only is not sufficient; you must do a hard reboot.

Backup Queen is Now Primary Queen
The backup queen is now the new, primary queen.
In the steps that follow, we refer to it as the “new queen.” Perform these steps on the new queen:
1 When the new queen comes back online, SSH into it as root again, and use the status command to check the Aster Database operational state:
# /etc/init.d/local status
When you see this response, Aster Database is ready, and you can proceed:
* status: started
2 Check the SYSMAN_IP environment variable’s value:
# echo $SYSMAN_IP
Its value should be the IP address of the queen. (That is, you should see the IP address that was used by the old queen and is now used by the new queen.) If it’s not set to the IP address of the queen, contact Aster Support.
3 Save your .macs file as /home/beehive/smc/.macs:
# cp /home/beehive/backup-smc/.macs /home/beehive/smc/.macs

Running the Queen Replacement Script
Next you will run the queen replacement script (Main.py) to apply the failed primary queen’s list of workers to the new queen.
Tip! The Main.py script logs its errors to /home/beehive/data/logs/QueenReplacement.log.
1 Change to the directory of the queen replacement script:
# cd /home/beehive/bin/utils/queen-replacement
2 Run the queen replacement script, Main.py, using the Python runtime. The script takes as an argument the pathname of the .macs file. Typically, you will type it as shown here. The script normally requires less than 10 minutes to run.
$TC_PYTHON ./Main.py --macFile=/home/beehive/backup-smc/.macs --ignoreWorkerList=queenDb0
Wait until the script prompts you for continue/enter. Do not type anything yet!
3 Power on all worker node machines. Use the physical power switch or your ILO system to do this.
4 Wait for all workers to start.
Check the status of all workers by logging into each and running /etc/init.d/local status, looking for the response “Started.” Keep checking the status every minute or two until you see that the service has started.
5 Once all workers have started, go to the shell where you started Main.py, type “continue”, and press <Enter>. To verify that the queen replacement process completed successfully, check the /primary/logs/QueenReplacement.log file. It should end with a message similar to
2010-08-02 20:17:36 759893 INFO Will replicate the consistent queen replica queenDb-25 from 10.61.2.107 to the coordinator with new name queenDb-4881944 <-- pid=4960 main() <RecoverQueen.py:139>
(but with a current timestamp, the IP address of one of your worker nodes, and different numbers for the old and new queenDbs and PID). If you don't see this message, please contact Aster Support.
6 After the Main.py script has run, perform a soft restart on the queen:
# ncli system softrestart
7 Point your browser to the AMC on the new queen, go to the Admin: Cluster Management screen, and click Activate Cluster.
Your queen replacement is complete. The new queen is now your active queen. To ensure recoverability, you should install a new backup queen now.

What is kept and what is lost during queen replacement?
Reset: The Aster Database replication factor (RF) is reset to 2 during the replacement.
Kept: The following items survive the queen replacement:
• all databases and their data;
• Aster Database statistics; and
• the roster of worker nodes that you provide in the .macs list.
Lost: The following items are lost during queen replacement:
• Aster Database logs; and
• scripts that you may have stored locally on the queen.
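
The .macs formatting rules given under “Format of the .macs file” earlier in this chapter can be sketched in a few lines of Python. This is only an illustrative helper, not part of Aster Database: the function names normalize_mac and format_macs are hypothetical, and the sketch assumes you have already collected each worker's MAC and IP address by hand with ifconfig. It strips separators from each MAC address, lowercases it, validates the IP address, and writes one pair per line followed by an empty line.

```python
import ipaddress
import re

# A MAC address in .macs form: 12 hex digits, lowercase, no separators.
MAC_RE = re.compile(r"^[0-9a-f]{12}$")


def normalize_mac(raw):
    """Strip colons, dots, dashes and spaces, then lowercase the MAC address."""
    mac = re.sub(r"[:.\-\s]", "", raw).lower()
    if not MAC_RE.match(mac):
        raise ValueError(f"not a 48-bit MAC address: {raw!r}")
    return mac


def format_macs(pairs):
    """Render (mac, ip) pairs as .macs text.

    Each pair occupies one line and is followed by an empty line,
    so there are two line breaks after every pair.
    """
    chunks = []
    for raw_mac, raw_ip in pairs:
        ip = ipaddress.IPv4Address(raw_ip)  # raises ValueError if malformed
        chunks.append(f"{normalize_mac(raw_mac)} {ip}\n\n")
    return "".join(chunks)


if __name__ == "__main__":
    # Hypothetical worker list, noted by hand from ifconfig output.
    workers = [
        ("00:21:5C:01:4B:25", "10.61.2.106"),
        ("00:24:E8:A0:BD:22", "10.61.2.107"),
    ]
    print(format_macs(workers), end="")
```

If you use a helper like this, redirect its output to /home/beehive/backup-smc/.macs on the backup queen and review the file by eye before running the queen replacement script.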

CHAPTER 7 Admin: Administrative Operations
This section explains how to start, activate, and manage your Aster Database using command-line tools and the Aster Management Console (AMC). AMC is Aster Database’s browser-based cluster management tool.
• Admin: Cluster Management
• Admin: Events
• Admin: Executables
• Admin: Backup
• Admin: Configuration: Cluster Settings
• Admin: Configuration: Workload
• Admin: Configuration: Roles and Privileges
• Admin: Logs
• Restarting Aster Database
• Activating Aster Database
• Balance Data
• Balance Process
• Cluster Management from the Command Line
See also:
• Cluster Expansion
• Queen Replacement

Admin: Cluster Management
The Cluster Management page (Admin > Cluster Management) lets you manage your Aster Database cluster. The Cluster Management page lets you perform the following tasks:
Table 7 - 3: Cluster Management operations
Task | Section
Restart all nodes in the cluster. | “Restarting Aster Database” on page 109
Register new hardware as a worker or loader node in the cluster. | “Add New Nodes to the Cluster” on page 66
Bring the cluster online, incorporate newly added nodes into the cluster, or activate nodes in the cluster. | “Activating Aster Database” on page 111
Ensure the data in the cluster is fully replicated. | “Balance Data” on page 113
Balance the placement of vworkers to ensure data availability and efficient query execution. | “Balance Process” on page 114
Upgrade the cluster to a newer version of the Aster Database software. |
Teradata Aster Big Analytics Appliance 3H Upgrade Guide

Using Multi-NIC Machines in Aster Database
Multi-NIC machines enable two capabilities in Aster Database:
1 NIC Bonding - This is already set up on the Teradata Aster Big Analytics Appliance 3H.
2 Segmenting Network Traffic by Function - You can set up your multi-NIC nodes so that Aster Database traffic is segmented by function for backup, loads and regular Aster Database traffic (called “default” traffic or “queries”). We refer to this feature as the network assignments feature. Set-up instructions appear below.
Below, we discuss network assignments set-up in the context of the AMC. However, you can also configure networking for Multi-NIC machines using the commands found in the ncli nsconfig Section (page 153).

Segmenting Network Traffic by Function
Aster Database enables you to use different subnets for different functions, in order to keep network traffic separate for backup, loading and regular Aster Database traffic (queries). Some reasons for designating a dedicated subnet for different Aster Database functions might include:
• Legal requirements
• IT policies and restrictions, security
• Resource allocation needs and performance
• Access from outside the Aster Database subnet for specific functions (loading, backup)
You make the network configuration as follows:
1 Create a configuration for each node.
2 Apply the configuration you created. Note that although this can be done without restarting the cluster, any network operations that are already in progress will be interrupted.
3 Optional: You can assign each function to a NIC or a bond.
Tip! The IP addresses assigned to each function must be in the same subnet for all nodes.
You may see the error “The Loads and Backups Networks have not been configured on the Queen and/or some nodes.” This error occurs if one of the functions (loads or backups) is set up to use a particular subnet on the workers but not on the queen (that is, the queen is set to the default IP). However, the AMC does not detect the problem when the queen has loads or backups assigned to a specific subnet that is not in the same network as the IPs assigned for those functions on the workers. So if you encounter errors when configuring network traffic by function, check to make sure the IPs assigned to each function are in the same subnet for all nodes.

Configure Network Settings
You assign Aster Database functions to their own subnets by using the AMC Network settings. To view and/or edit the Network settings:
1 Select the Admin tab, and then choose Configuration and Network from the drop-down options.
2 The AMC Network Overview screen will appear, showing each node and its current settings.
Warning! The settings displayed in the AMC Network Overview screen reflect the current state of the cluster. The information is obtained by querying the system for bindings, and as such, may not match the settings in the configuration files. For example, if settings have been made but not applied, the settings displayed will be those currently in effect, even though a restart of network services will apply the settings as they have been configured.
The following permissions will be applied to this list:
Table 7 - 4: AMC Network Overview Tab Permissions
Aster Database Installation Type Bonding Configuration IP Settings Subnet Function Assignment
Appliance (UMOS) View only View and Edit View only
For each node, you can assign an IP address or NIC for each of the following functions.
• Queries - for internal database communication between nodes (default).
• Loads - applies to loaders only.
• Backups - applies to backups.
Note that if you do not assign an IP address or NIC for backups or loads, the default (queries) setting will be used.
3 Click the Configure button on the far right-hand side for the node whose network settings you want to configure. In the network configuration window for the node, you will see two tabs: Current State and Network Assignments.
4 Use the Current State tab to view details on the current network assignments and configuration for this node.
5 Use the Network Assignments tab to assign Aster Database functions to a NIC or a bond. The default or “queries” subnet will use the primary IP address. You can optionally assign a different subnet for Loads and/or Backups using the drop-down selector. Click the Save & Apply button to save your settings or Close to cancel.

Apply Network Settings
To apply your settings immediately, click Save & Apply on the screen where you made them. If you want to save your settings now and apply them later, click Save. Your settings will be saved to the network configuration files, and applied automatically when network services are restarted.
Warning! Applying the network settings is accomplished by restarting network services with the new settings. Because of this, any operations that are currently running over the network will be interrupted. Be sure that there are no active queries, loads, or backups before applying network settings.

Example Network Configuration
Loading Data from Outside the Aster Database Subnet
Loaders can exist on both the Aster Database subnet and a loading subnet (for example, an outward facing subnet).
The latter allows loading to be done from a machine not on the Aster Database subnet. In this scenario, the loader can still perform its duties in the Aster Database cluster, because the network configuration allows loading traffic from the “outside” loading subnet.

Querying Network Status
From the Admin: Configuration: Network panel, click the Query Network Status button to view the status of the network. A status screen appears, indicating the health of the network.

Remove Nodes from the Cluster
You can unregister a node to remove it from the list of nodes that Aster Database considers part of the system. This is useful if a node has been permanently removed from the system, such as in the case of a permanent node failure or the re-provisioning of a node.
It is highly recommended that node removal only be performed on nodes that have already been physically removed from Aster Database. That is, it should only be used on nodes that are shown as New or Failed in the AMC. Using the AMC to remove (unregister) an Active node could cause Aster Database to transition to a stopped status.
Warning! Removing and re-adding a node is not the recommended way to address problems on a node, because re-adding the node will delete the data stored on the node. Before you remove a node for node-maintenance purposes, please read the following:
• If you want to repair a node, please read the instructions in “Addressing Failed and Suspect Nodes” on page 55.
• If you want to delete all data from a node and re-provision it as a new Aster Database node, see “Deleting All Data to Re-Provision a Node” on page 76.
To remove a node from Aster Database, perform the following steps.
1 In the AMC, click Admin: Cluster Management.
2 In the Nodes panel, check that the targeted node is not currently Active (refer to the Status sub-panel for that node).
3 In the Remove column, click the X button. A confirmation window appears.
4 Ensure the displayed address is the node you want to remove, and click OK to remove it.
5 Physically shut down or reboot the removed node machine now. Rebooting ensures that data clean-up is done, so that the machine can later be re-added as a node in Aster Database.
Warning! Once you have removed a node, you cannot immediately re-add that physical machine as a new node! Attempting to add a previously used node will fail. If you wish to re-use a machine that has previously served as an Aster Database node (or that has any data on it at all), you must clean the machine as explained in “Deleting All Data to Re-Provision a Node” on page 76.
Tip! Normally, when you remove a node (by clicking on the blue X for the node), Aster Database will remove the node from its list of nodes in the cluster, and will reboot the node. Rebooting the node will stop any current processes that are executing on that node, and those processes will not be restarted after the node restarts.
If you are in the process of adding a new node, and if you tell AMC to remove the node (abort the add) while the node is being cleaned, the cleanup may not finish completely and the node may not be rebooted. If cleanup does not finish completely, you might still have some beehive processes running on the node. This is not normally a problem. You may manually kill those processes, or you may reboot the node yourself. Similarly, you may manually remove any remaining files, if necessary. (For more information about removing Aster Database-related files, see “Add Node Fails With “user data directories are present” Message” on page 75.)
If you later add the node back to a cluster by IP address using the Add Node button in the AMC, and if you put a check in the "Clean Node" checkbox for that node (and don't remove the node again before the Add Node operation completes), the beehive processes, as well as data files and Aster Database program files, will be cleaned up completely even if the previous clean-up was only partial.

The Hardware Configuration Panel
The Nodes: Hardware Config subpanel shows the hardware configuration detail for a selected Aster Database node. The panel displays detailed compute, memory and storage information, including a breakdown by processor, which is relevant in multiprocessor servers. Use this window to:
• Find the MAC / IP address of a node. (This works even if the node is currently down or unreachable, provided it registered successfully in the past.)
• Find the processor speed or processor type of a node.
• Find the available memory of a node.
• Find the disk capacity of a node.

The Node Inspection Panel
The Nodes: <Node Name> subpanel shows detailed information about a selected Aster Database node. The panel displays detailed compute, memory and storage information, including a breakdown by processor, which is relevant in multiprocessor servers. Use this window to:
• Find the MAC / IP address of a node. (This works even if the node is currently down or unreachable, provided it registered successfully in the past.)
• Find the processor speed or processor type of a node.
• Find the available memory of a node.
• Find the disk capacity of a node.
• Find out where the queen database replica is.
• See which v-workers are on this node.
Click the Node Hardware Config tab to view the hardware configuration details, including information on NIC bonding for the node.
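
Once you have collected node IP addresses from these panels, you can check the same-subnet requirement described under “Segmenting Network Traffic by Function” offline. The following is a minimal sketch using Python's standard ipaddress module; it is not an Aster Database tool. The function name nodes_outside_subnet, the node names, and the 10.80.0.0/24 loading subnet are all hypothetical values chosen for illustration.

```python
import ipaddress


def nodes_outside_subnet(assignments, subnet):
    """Return the names of nodes whose IP for one function is not in `subnet`.

    `assignments` maps a node name to the IP address assigned to a single
    function (for example, loads) on that node.
    """
    network = ipaddress.ip_network(subnet)
    return [
        node
        for node, ip in assignments.items()
        if ipaddress.ip_address(ip) not in network
    ]


if __name__ == "__main__":
    # Hypothetical "loads" assignments: the queen is still on its default IP
    # while the workers use a dedicated loading subnet.
    loads = {
        "queen": "10.61.2.100",
        "worker-1": "10.80.0.101",
        "worker-2": "10.80.0.102",
    }
    print(nodes_outside_subnet(loads, "10.80.0.0/24"))
```

An empty result means every listed node has its address for that function inside the given subnet. Any names returned are the nodes to review on the Network Assignments tab.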

Admin: Events
In the AMC, select Admin > Events to open the Events panel, which is used to inspect the event actions currently configured on your cluster. See “Event Monitoring with the Event Engine” for details.

Admin: Executables
In the AMC, select Admin > Executables to open the Executables panel, which is used for managing and running scripts on your cluster. See “Admin: Executables” for details.

Admin: Backup
In the AMC, select Admin > Backup to open the Backup panel, which is used for managing and monitoring backups of tables and databases. See the Teradata Aster Big Analytics Appliance 3H Database User Guide for details on setting up Aster Database Backup and running backups.
From the Backup panel, you can see an entry for each logical and physical backup, along with information such as:
• backup ID
• status of the backup
• database and table (for logical backups)
• backup type (full or incremental)
• Backup Manager IP address
• start, end, and elapsed time
• controls to pause/resume or cancel a backup

Adding a New Backup Manager to the AMC
To add a new Backup Manager to the AMC, you will need its IP address. Remember that the software versions of the Aster Database Backup Manager and Aster Database must be the same. To add the Backup Manager, perform the following steps.
1 Click the Add Manager button.
2 Enter the IP address of the Backup Manager.
3 Click OK.
Once the Backup Manager has been added, you will see a confirmation message stating “Backup Node added successfully for IP address <IP Address of Backup Manager>.” Note that this message is the only indication that the Backup Manager has been added successfully. The Cluster Backups table will not populate until a backup has been started.

Starting a Backup
Backups are started using the Aster Database Backup CLI on the Backup Manager. See “Using Aster Database Backup” in the Teradata Aster Big Analytics Appliance 3H Database User Guide.

Monitoring and Managing Backups
Once a backup has been started, it can be monitored and managed within the AMC Backup tab. Using the icons on the right-hand side of each backup listing, you can Pause or Cancel the backup. After you pause a backup, a message appears stating that the backup was successfully paused. To resume the backup, click the icon under Pause/Resume again. Similarly, after you resume a backup, a message appears to show that the resume was successful. Note that if you cancel a backup, it cannot be resumed.

Admin: Configuration: Cluster Settings
In the AMC, select Admin > Configuration > Cluster Settings to open the Cluster Settings panel, which allows you to set the basic operating parameters for Aster Database. These settings apply to the AMC installation; all users will use the settings that are defined and saved here.
The Cluster Settings panel appears. The next few sections explain the configuration settings available on this panel.
• “Cluster Settings” on page 102
• “Sparkline Graph Scale Units” on page 103
• “Graph Scaling” on page 103
• “Internet Access Settings” on page 104
• “Aster Support Settings” on page 104
• “QoS Concurrency Threshold Configuration” on page 105

Cluster Settings
The Cluster Settings section of the Cluster Settings panel provides a way to specify general cluster-wide configuration options. To set the cluster settings, perform the following steps.
1 Go to Admin > Configuration > Cluster Settings.
2 In the Cluster Settings section, enter the following:
• Company Name: Name of the company as you wish it to be displayed in the AMC. This name is required when you send diagnostic log bundles to Aster Data’s support team (see “Aster Database Logging” on page 194), so if you intend to send log bundles, do not leave this field blank.
• Cluster Name: Name of the cluster as you wish it to be displayed in the AMC. Useful for sites with more than one cluster. This name is required when you send diagnostic log bundles to Aster Data’s support team (see “Aster Database Logging” on page 194), so if you intend to send log bundles, do not leave this field blank.
• Process Log Cleanup Frequency: How long you want to keep process log data after each process completes. Increase the setting to obtain more debugging information, decrease to 5 minutes if you do not want to keep logs for very long, or accept the default.
3 Click Save. A confirmation message appears in the top right corner of the panel.

Sparkline Graph Scale Units
The Sparkline Graph Scale Units section of the Cluster Settings panel provides a way to specify the sparkline unit for the display graphs for network, disk I/O, and memory activity. To set the sparkline unit, perform the following steps.
1 Go to Admin > Configuration > Cluster Settings.
2 In the Sparkline Graph Scale Units section, configure the settings for:
• Network
• Disk I/O
• Memory

Graph Scaling
The Graph Scaling section of the Cluster Settings panel provides a way to control the numerical scale used to render the graphical data displays in various AMC tabs. The main Dashboard panel and the Nodes > Hardware Stats tab contain graphs of network, disk I/O, and memory activity, which are affected by the configuration settings in Graph Scaling.
1 Select Admin > Configuration > Cluster Settings.
2 In the Graph Scaling box, enter one or more of the following figures. Use higher numbers to make sure the graphs show all high spikes in activity, and lower numbers to magnify smaller fluctuations in activity:
• Network: The maximum number on the quantity axis in all AMC network graphs, in Kb/s
• Disk IO: The maximum number on the quantity axis in all AMC disk I/O graphs, in Kb/s
• Memory: The maximum number on the quantity axis in all AMC memory usage graphs, in MB
3 Click Save. A confirmation message appears in the top right corner of the panel.

Internet Access Settings
The Internet Access Settings section of the AMC Cluster Settings panel is where you configure any proxy settings that are needed to enable the queen to have outbound Internet access. The queen needs to use the Internet when you send diagnostic log bundles to Aster Data’s support team (see “Aster Database Logging” on page 194). Depending on your network security policy, you might have outbound Internet access even if you do not fill out these settings.
1 Select Admin: Configuration: Cluster Settings.
2 In the Internet Access Settings box, enter the following:
• Proxy Hostname or IP Address: Name or IP address of the proxy server which serves as an intermediary for Internet requests.
• Port: Number of the port on the proxy server that is available to receive Internet requests from the queen.
• Username and Password: Credentials that the queen can use to log in to the proxy server.
3 Click Save. A confirmation message appears in the top right corner of the panel (you might have to scroll up to see it).

Aster Support Settings
The Aster Support Settings section of the AMC Cluster Settings panel sets up the AMC’s access to Aster Data support servers.
The support center URL, username, and password are required when you send diagnostic log bundles to Aster Data’s support team (see “Aster Database Logging” on page 194). The resource center URL enables the AMC to display Aster Data support’s page of useful code and information.

1 Select Admin: Configuration: Cluster Settings.
2 In the Aster Support Settings box, enter the following:
• Support Center URL: The address of the Aster Data support server. This URL is required when you send diagnostic log bundles to Aster Data’s support team (see “Aster Database Logging” on page 194), so if you intend to send log bundles, do not leave this field blank. This URL is different for each Aster Data customer. If you do not yet have your support URL, contact the Aster Data support team.
• Resource Center URL: The address of the Aster Data resource center, a web page where you can find documentation, videos, and downloadable client software. This URL provides the destination for the Resource Center link that appears at the top of every AMC page. As with the Support Center URL, you should have received this URL from Aster Data. When you click the Resource Center link, the page at the Resource Center URL appears with links to documentation and videos.
• Username and Password: Your user credentials for logging in to the support center. This information is required when you send diagnostic log bundles to Aster Data’s support team (see “Aster Database Logging” on page 194), so if you intend to send log bundles, do not leave these fields blank. The user name and password are different for each Aster Data customer. If you do not yet have your credentials, contact the Aster Data support team.
3 Click Test.
If the cluster can connect to the given URLs, a confirmation message appears in the top right corner of the panel (you might have to scroll up to see it).
4 Click Save. A confirmation message appears in the top right corner of the panel (you might have to scroll up to see it).

QoS Concurrency Threshold Configuration

The QoS Concurrency Threshold Configuration section of the AMC Cluster Settings panel lets you specify the QoS Concurrency Threshold.

Admin: Configuration: Workload

The Admin: Configuration: Workload panel lets you control Aster Database’s workload management rules to ensure proper allocation of the cluster’s computing resources. To open the panel in the AMC, go to Admin > Configuration > Workload. In this panel, you create the rules that allow Aster Database to identify higher- and lower-importance jobs and run them with the right level of urgency. See “Workload Management” in the Teradata Aster Big Analytics Appliance 3H Database User Guide for instructions.

Admin: Configuration: Roles and Privileges

Viewing the list of available AMC user privileges

1 Log in to the AMC as an amc_admin user. This is typically the db_superuser account in a new Aster Database installation.
2 Go to Admin > Configuration > Roles & Privileges.
3 In the Roles & Privileges tab, the available AMC Roles (amc_admin, process_admin, process_viewer, process_runner, node_admin, and node_viewer) are listed on the horizontal axis of the table, and the individual privileges are listed on the vertical axis. Each privilege is a combination of a section of the AMC and an action the user can perform there.

A user is typically granted only one of the roles listed on this page of the AMC. A user can connect to Aster Database only if he or she has one of the roles listed on this page of the AMC.
Creating an AMC user in Aster Database

1 In the Roles & Privileges tab of the AMC, review the list of available AMC user privileges, as explained in “Viewing the list of available AMC user privileges” on page 106.
2 Find the AMC Role that has the privileges you want to grant to the new user. Note the role’s name.
3 Start an ACT session and log in as an administrator (a user with db_admin privileges).
4 At the SQL prompt, use the CREATE USER command to create the user account, and use the GRANT command to give the user the AMC Role you chose earlier. For example, to create an account for Topper Headon (theadon) and make him a process_viewer user in the AMC, you would type this:

    CREATE USER theadon IN ROLE process_viewer PASSWORD '5t4g0l33';

Checking users’ current AMC privileges

To check the current privileges of users, perform the following steps.

1 Log in to ACT as a db_admin user.
2 Run this query:

    SELECT nc_users.username, nc_roles.rolename
    FROM nc_users, nc_roles, nc_group_members
    WHERE nc_users.userid = nc_group_members.memberid
      AND nc_roles.roleid = nc_group_members.groupid
    GROUP BY username, rolename;

Editing users’ AMC privileges

To edit a user's AMC privileges, perform the following steps.

1 Assess the current state of user privileges:
• To find the user's current privileges, see “Checking users’ current AMC privileges” on page 107.
• To see the list of available AMC user privileges, see “Viewing the list of available AMC user privileges” on page 106.
2 Find the AMC role that has the privilege(s) you want to grant to or revoke from the user. Note the role's name.
3 Start an ACT session and log in as an administrator (a user with db_admin privileges).
4 At the SQL prompt, use the GRANT or REVOKE command to give or remove the privileges.
For example, to give Topper Headon the process_viewer privilege, you would type this:

    GRANT process_viewer TO theadon;

The user's new AMC rights apply to all AMC sessions he or she starts in the future. If the user is currently logged in, the current session will not be updated with the new rights until he or she logs out and logs back in.

Admin: Configuration: Hosts

Setting up Host entries for all Aster Database nodes

You can set up host entries on all the nodes of an Aster Database cluster either by manually editing the /etc/hosts file on each Aster Database node, or through the AMC by performing the following steps.

1 Log in to the AMC as an administrator user.
2 Go to Admin > Configuration > Hosts.
3 Click the Hosts tab.
4 Create a host entry for each host you want to add by clicking New Host Entry and filling in the web form with its IP address and alias.
Tip! If you are making host entries for Teradata nodes, make sure that the alias you enter ends with "cop#". For example, if you will execute “... load_from_teradata( ... TDPID('dbc')...”, then enter a name like “dbccop1” as the alias.
5 When you are finished adding entries for each node, click Save and Apply Changes.
6 Your changes will be written to the hosts file on each Aster Database node.

Tip! The AMC does not show any entries in /etc/hosts or /etc/resolv.conf that were not added through the AMC or ncli. Therefore, it does not allow you to edit or remove entries that were not added through the AMC or ncli.

Warning! Do not manually edit any of the entries in the /etc/hosts or /etc/resolv.conf files within the sections enclosed in comments indicating that they were added by Aster Database.
These are the changes made through the AMC, and should only be edited through the AMC. Here is a sample of how these entries appear:

    ## Configured by NCluster. DO NOT EDIT!!! ##
    10.51.13.100 dyu # localhost
    ## End NCluster Configuration ##

Setting up DNS entries for all Aster Database nodes

If the network is set up such that DNS servers are used to resolve the host or database names, you must add the DNS server(s) to the /etc/resolv.conf file on each Aster Database node. You can do this by editing the /etc/resolv.conf file on each node manually.

Admin: Logs

To view and manage the Aster Database logs from the AMC, click Admin: Logs. See “Aster Database Logging” on page 194 for details.

Restarting Aster Database

Aster Database is designed to be resilient to many forms of failure. Many serious failures that Aster Database may encounter can be resolved by restarting the system. Restarting Aster Database involves a full restart of the system. During this time, queries cannot be performed and most administrative functions will be unavailable. There are two options for restarting Aster Database: Soft Restart and Hard Restart.

Procedure

To restart Aster Database:

1 In the AMC, go to the Admin: Cluster Management tab.
2 Click either Soft Restart or Hard Restart (described below).
3 After the restart has finished, you must click the Admin: Cluster Management tab and click Activate Cluster to make the cluster operational again. See “Activating Aster Database: The Procedure” on page 112.

Soft Restart

Clicking Soft Restart invokes a software-level restart of Aster Database. This process generally takes one to three minutes and involves restarting the software on each node in the system. During a Soft Restart, the AMC may show nodes as having a status of “Upgrading” even if the Soft Restart was not part of an upgrade operation.
This happens because, upon a Soft Restart, Aster Database always checks whether there are upgrade-related scripts to run, and displays the Upgrading icon while it does the check. The status for the affected nodes is displayed as Upgrading until the check is performed and any upgrade scripts are run. After this, a status of Prepared appears.

Most issues requiring a restart will be resolved with a soft restart. After you perform a soft restart, you must click Activate to make the cluster operational again. Because the outage is significantly shorter with Soft Restart, always try a Soft Restart before a Hard Restart. If the issue with Aster Database is not resolved by a Soft Restart, a Hard Restart can be performed.

Note: Setting the QoS concurrency to zero will still allow any new queries that are part of an open transaction. Soft Restart does not wait until open transactions are finished; it rolls back open transactions and then does the restart.

See also “Soft Shutdown” on page 116.

Backup interaction with soft-restart

Before running soft-restart on a cluster, ensure that there are no data backups or restorations in progress. Interrupting backup operations in this manner can lead to errors during startup. To find out how to check whether backup operations are in progress, see “Backup and Restoration” in the Teradata Aster Big Analytics Appliance 3H Database User Guide.

Warning! Never restart the cluster while Aster Database Backup is running a data backup or restoration.

Hard Restart

In very rare cases, there are errors (typically hardware-related) that require a hard restart. Clicking Hard Restart triggers a hardware-level restart of Aster Database. This process can take 10 minutes or longer, depending on the time needed to reboot the physical servers used in the system. A hard restart should be issued in cases where a Soft Restart fails to resolve the problem.
After you perform a hard restart, you must wait for all nodes to become Prepared before you click Activate to make the cluster operational again.

Soft Shutdown

To shut down Aster Database in preparation for upgrades or hardware moves, see “Soft Shutdown” on page 116.

Activating Aster Database

Activating a new node into Aster Database allows you to scale disk storage and CPU processing resources. Node activation is a two-step process. First, data storage is balanced by shifting v-workers from currently active worker nodes to the newly added node(s). Second, processing resources are balanced by activating the v-workers on the newly added node(s).

During the data re-balancing (invoked from the Admin: Cluster Management: Nodes tab of the AMC by clicking the Balance Data button), currently running queries are not disrupted, but they continue to run only on previously existing nodes. To utilize the processing power of new nodes, a brief, predictable downtime is necessary (invoked by clicking the Balance Process button in the AMC).

Table 7-5: Aster Database activation and balancing steps

Activation Process                                        What Happens
Step 1: Balance Data (balancing data placement)           Completely online. Rebalancing time depends on data size.
Step 2: Balance Process (optimally locating v-workers)    Very brief (seconds to low minutes). Requires a short outage during this activation period.

When an outage (that is, a period during which queries are blocked) is needed to activate compute processing on the new node, it is brief and very predictable: typically a few seconds to a couple of minutes.

The details of each of the two activation steps are described below, following these typical use cases and best practices:

Situations that Require an Activation

• NEW CLUSTER: The cluster has just started or rebooted.
In the Admin: Cluster Management Panel of the AMC, click the Activate button; a new window opens. Next, click Activate Aster Database. This brings the cluster to the target replication factor and makes all nodes available to help process queries.
• EXISTING NODE REBOOTS: A node has just rebooted and is recognized as “Prepared” in the AMC. Because a node went offline, the replication factor must have fallen below the target. You can quickly restore the replication factor through the Balance Data feature: view the appropriate node address, click the Activate button, and then click Balance Process. This makes the rebooted node's processors available to the cluster and moves the node to the “active” state.
• ADD NEW NODE FOR SCALE-OUT: A new node has been added to the cluster. In this case, it will probably take longer to copy data from existing nodes to the new node as part of the data re-balancing process; Balance Data does this in the background and leaves the cluster fully available for loads and queries. First, click the Activate button and then click Balance Data in the resulting pop-up to copy data over. Later, when a few seconds to minutes of outage are acceptable, it is recommended that you balance processing by clicking the Activate button and then clicking Balance Process in the resulting pop-up. This will make the new node's processors available to the cluster. Note that you can add one or more nodes to an existing Aster Database, and the data re-balancing occurs in parallel for maximum performance.

Activating Aster Database: The Procedure

Follow the steps below to activate Aster Database after a restart or shutdown.

1 Make sure your queen node is running. If it is not, restart the queen machine now. The Aster Database software and the AMC will be started automatically.
2 With your browser, navigate to the AMC.
3 In the AMC, click the Admin: Cluster Management tab. If the cluster is not already Active, the Aster Database status lamp in the upper left corner will be red with a status of STOPPED.
4 In the Admin: Cluster Management: Nodes panel, under the label Node Name, you will see a list of the queen and nodes, with a Status for each. The next action you need to take depends on what you see here. Do one of the following:
• If the worker nodes have a Status of Preparing, turn to Step 5.
• If the worker nodes have a Status of Prepared, turn to Step 6.
• If the worker nodes have a Status of New, turn to Step 6.
• If no worker nodes are displayed, it could be that you have never added nodes to the cluster. See “Add New Nodes to the Cluster” on page 66. If you know that your cluster has nodes, but they have not appeared in the Nodes tab, then you should wait a few more minutes if you have just restarted Aster Database. The worker nodes take a few minutes to reappear after a hard restart. If you have not already performed a hard restart, you can do so now as explained in “Restarting Aster Database” on page 109. If, after restarting, the workers fail to appear, then contact Aster Support.
• If any nodes have a status of Failed, see “Addressing Hardware Problems on Workers” on page 52.
5 While the worker nodes show a Current Status of Preparing, you must wait for them to become Prepared. Once the status of all nodes is Prepared, turn to the next step.
6 When all worker nodes show a Current Status of Prepared or New, go to the Admin: Cluster Management screen and click Activate Cluster.
Tip! The Activate Cluster button is also used, under certain circumstances, to incorporate new nodes into the cluster. See “Incorporate the New Nodes” on page 71.
7 If the Activate Nodes dialog appears, click Activate Aster Database again. (Note: If you are activating from a hard restart, you will have already clicked this button a few minutes ago.
This is normal: The first activation prepares the nodes, and the second brings them online.)

The green message box in the upper right of the AMC shows that the cluster is Activating. When the queen has finished activating all nodes, the Aster Database status lamp lights green and shows a status of Active. Your Aster Database is ready to use.

Balance Data

Balance Data is an Aster Database administrative action that balances data placement across all the worker nodes and, if needed, adds v-workers to the cluster. Queries and loads are not disrupted, though there may be some performance overhead as the activation process uses system resources. Balance Data can be used to incrementally scale out a cluster, or to quickly and non-disruptively restore full data replication after a hardware failure.

You initiate Balance Data in the AMC by clicking the Balance Data button in the Admin: Cluster Management tab. For instructions, see the next section, “Balance Data: The Procedure” on page 113.

After you run a Balance Data operation, some worker nodes (in particular, newly added worker nodes) may be in the Passive (Blue) state in the AMC. At this point, storage of live and standby data is balanced across all nodes. Queries continue to run only on the Active nodes, while the Passive nodes act as up-to-date standbys that can be activated when an Active node fails. In Aster-speak, we say that your Active nodes are hosting all the active v-workers, while your Passive nodes are hosting only passive v-workers.

Balance Data runs in the background, and may run for a long time. Since it balances data across all nodes, it may need to copy very large amounts of data. For example, suppose you have a three-node cluster with 400 GB used per worker node and you use Balance Data to add one node to the cluster.
To achieve data balance, Aster Database will store roughly (400 * 3) / 4 = 300 GB per node once online activation is finished. This implies that 300 GB must be copied onto the new worker node. Assuming this incorporation occurs over a 1 Gbps network that is otherwise unused, this will take at least (300 * 8 gigabits) / 1 Gbps = 2400 seconds = 40 minutes. Note that with the Network Aggregation feature, you have the option to “bond” together multiple 1 Gbps NICs to offer trunked bandwidth. For example, if you bonded eight 1 Gbps links, you would make available the equivalent of 8 Gbps of aggregate bandwidth, which could reduce the time required to re-balance data (assuming the network is the bottleneck).

It is recommended that at some point after Balance Data completes, you find an acceptable time for a few minutes of downtime and run Balance Process on the cluster to balance the computational burden across its nodes.

Balance Data may create extra copies of some data in order to achieve balance while not deleting the existing, in-use copies. These extra copies will not be deleted until you perform Balance Process.

Balance Data: The Procedure

Follow this procedure to balance data placement across all the worker nodes:

1 In the Admin: Cluster Management tab, click Balance Data to balance data across all available nodes. This step updates existing and new nodes with the data they need. This process runs for at least a few minutes, and as long as a few hours in large Aster Databases, depending on the amount of data that must be copied to the newly added node(s). As mentioned above, Balance Data allows the storage re-balancing process to occur seamlessly with no downtime for running queries.

Next Steps

If you are scaling out your cluster, you may want to perform a partition split now to increase the number of v-workers in the cluster.
See “Splitting Partitions in Aster Database” on page 73.

Balance Process

Balance Process is an Aster Database administrative action that load-balances the query processing burden across all worker nodes in Aster Database. It optimizes performance given the current data placement. The Balance Process step does not create new copies of data, so it typically runs quickly. It also does some cleanup, deleting data that can no longer be used. It briefly disrupts the cluster, aborting any in-progress transactions; the disruption can last from a few seconds to a few minutes.

It is recommended that Balance Process be run at some point after Balance Data completes, when a few minutes of downtime are acceptable, so that a new node’s processors are available to the cluster.

You initiate Balance Process in the AMC by clicking the Balance Process button in the Admin: Cluster Management tab. See the next section for instructions.

Balance Process: The Procedure

Warning! While the Balance Process step is in progress, your cluster cannot process queries. Before you perform the next step, make sure all running queries have finished successfully and that no new queries are allowed to enter Aster Database. Already-running queries will be killed when you click Balance Process, and new queries submitted after you click it will wait until the balancing is complete.

1 Click the Balance Process button in the AMC’s Admin: Cluster Management tab to force currently passive nodes to contribute to the handling of database queries. This interrupts the operation of the cluster. This step places active v-workers on all nodes that are able to host them. This action takes at least a few seconds and as many as a few minutes to complete. Once processing is balanced, the cluster resumes handling queries and the new node(s) are part of the cluster, which you can check by looking for a status of Active in the AMC’s Nodes panel.
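The Balance Data sizing example given earlier (a three-node cluster with 400 GB used per worker node, growing by one node over a 1 Gbps link) can be worked through in a few lines of Python. This is only a rough sketch: the function name and parameters are hypothetical, it assumes the network is the only bottleneck, and it ignores protocol overhead and any extra copies that Balance Data may create, so the result is a lower bound rather than a prediction.

```python
def balance_data_estimate(nodes_before, gb_per_node, nodes_added, link_gbps):
    """Return (GB per node after balancing, minimum seconds to fill one new node).

    Hypothetical helper for back-of-the-envelope planning only.
    """
    total_gb = nodes_before * gb_per_node
    gb_per_node_after = total_gb / (nodes_before + nodes_added)
    # A new node must receive its whole balanced share over the network;
    # 1 GB is 8 gigabits, and link_gbps is the usable bandwidth in Gbps.
    gigabits_to_copy = gb_per_node_after * 8
    seconds = gigabits_to_copy / link_gbps
    return gb_per_node_after, seconds


# The example from the text: 3 nodes, 400 GB each, 1 new node, 1 Gbps link.
gb_after, seconds = balance_data_estimate(3, 400, 1, 1.0)
print(f"{gb_after:.0f} GB per node, at least {seconds / 60:.0f} minutes")

# The same cluster with eight bonded 1 Gbps links (8 Gbps aggregate).
_, bonded_seconds = balance_data_estimate(3, 400, 1, 8.0)
print(f"with 8 Gbps: at least {bonded_seconds / 60:.0f} minutes")
```

With the figures from the text this reproduces the 300 GB per node and 40 minute minimum; the bonded case shows how much aggregate bandwidth could shorten the copy if nothing else limits it.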
Next Steps

If you are scaling out your cluster, you may want to perform a partition split now to increase the number of v-workers in the cluster. See “Splitting Partitions in Aster Database” on page 73.

Cluster Management from the Command Line

You can manage many aspects of Aster Database from the command line. Most tasks are done via the Aster Database Command Line Interface (ncli), a tool for inspecting and managing all nodes in the cluster. Below, we explain the most common cluster management tasks. For more detailed instructions, see “Admin: ncli (Aster Database Command Line Interface)” on page 136.

To get started with the ncli, open a command shell on the queen, log in as root, and type ncli to get basic help, and then type, for example, ncli system to show the help for the system commands.

    # ncli
    # ncli system

Checking Cluster Status

Use the system show and node show commands to check the Aster Database operational state.

To check the queen’s status, log in to the queen as root user and, at the command line, run the system show command:

    # ncli system show

Next, use the node show command to check the status of all nodes in the cluster:

    # ncli node show

Soft Restart

To restart Aster Database, use the softrestart command. Working as root user at the queen command line, type the command:

    # ncli system softrestart

Note: Setting the QoS concurrency to zero will still allow any new queries that are part of an open transaction. Soft Restart does not wait until open transactions are finished; it rolls back open transactions and then does the restart.

See also: “Soft Restart” on page 110 and “Hard Restart” on page 110.
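If you run the status checks and the soft restart from a script rather than by hand, a thin wrapper can keep the command lines in one place. The following Python sketch is an illustration only: the dictionary keys and the run_ncli function are hypothetical names, and only the ncli subcommands themselves come from this guide. It defaults to a dry run that returns the command without executing it.

```python
import subprocess

# The ncli subcommands are the ones shown in this chapter.
# The dictionary keys are hypothetical labels chosen for this sketch.
NCLI_COMMANDS = {
    "cluster_status": ["ncli", "system", "show"],
    "node_status": ["ncli", "node", "show"],
    "soft_restart": ["ncli", "system", "softrestart"],
}


def run_ncli(action, dry_run=True):
    """Return the argv for an action; execute it only when dry_run is False."""
    argv = list(NCLI_COMMANDS[action])
    if not dry_run:
        # Intended to be run as root on the queen. subprocess.run raises
        # CalledProcessError if the command exits with a nonzero status.
        subprocess.run(argv, check=True)
    return argv


# Dry run: print the commands that would be issued for a status check.
for action in ("cluster_status", "node_status"):
    print(" ".join(run_ncli(action)))
```

Keeping the dry run as the default is a deliberate choice here, because a soft restart rolls back open transactions and should not be triggered by accident.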
Soft Shutdown

To shut down Aster Database in preparation for upgrades or hardware moves, use the softshutdown command. Working as root user at the queen command line, type the command:

    # ncli system softshutdown

If the shutdown attempt fails with the message, “Unable to grab exclusive lock for restart,” you can use the SoftShutdownBeehive.py script with the --force flag to clean up leftover processes and shut down. Contact Aster support for help using the script.

See also: “Soft Shutdown” on page 110.

Soft Startup

To start Aster Database after a soft shutdown has been performed, use the softstartup command. Working as root user at the queen command line, type the command:

    # ncli system softstartup

If the startup attempt fails with the message, “Unable to grab exclusive lock for restart,” you can use the SoftStartupBeehive.py script with the --force flag to clean up leftover processes and start. Contact Aster support for help using the script.

Next, you must activate the cluster (“Activating Aster Database” on page 111).

Freeing Space Occupied By Defunct V-Workers

When Aster Database deletes v-workers, the space is not freed for approximately 24 hours. (This can occur, for example, if a replica v-worker goes down, the system creates a new replica v-worker, and then the original replica v-worker comes back up. In this case you have more replica v-workers than you need, and the unneeded v-worker will be deleted automatically.) The 24-hour waiting period is a safety mechanism, but it can delay your work if you are adding machines and wish to scale out immediately, because you must wait 24 hours for the space occupied by the defunct workers to become available for new data.

To immediately reclaim space made free by v-worker deletions, use the command-line utility, TrashmanUtil.
You can find this tool on the Aster Database queen in /home/beehive/bin/utils/support/TrashmanUtil. Please contact Aster customer support before you attempt to use this tool.

Setting Up Passwordless Root SSH Between Nodes

The cluster requires that passwordless SSH be set up among nodes for various operations. Depending on the operation, passwordless SSH may be required for user root and/or user beehive. Normally, passwordless SSH is set up automatically. If you need to add passwordless SSH between any two nodes, do this:

1 Log in to the node that will provide the original key (usually the queen) as root.
2 Make sure the current keys work:
    # ssh root@localhost
    # exit
This login attempt should succeed without prompting for a password.
3 Copy the root keys so that they are also the beehive keys:
    # cp -r /root/.ssh/* /home/beehive/.ssh/.
Note! Make sure the ownership of /home/beehive/.ssh is “beehive:users”.
4 Copy the keys from this node to the other node(s) that will have a passwordless SSH relationship with this node:
    # scp -r /root/.ssh/* root@<node_IP_address>:/root/.ssh/.
5 Become user beehive:
    # su - beehive
6 Copy user beehive’s keys to the other node(s) that will have a passwordless SSH relationship with this node:
    $ scp -r /home/beehive/.ssh/* beehive@<node_IP_address>:/home/beehive/.ssh/.
7 Type exit to become root user again.
    $ exit
8 As root, verify passwordless access from this node to the other. You should not be prompted for a password:
    # ssh root@<node_IP_address>
    # exit
9 As beehive, verify passwordless access from this node to the other. You should not be prompted for a password:
    # su - beehive
    $ ssh beehive@<node_IP_address>
    $ exit
10 Type exit to become root user again.
    $ exit
11 As root, verify passwordless access from root at this node to beehive at the other node:
    # ssh beehive@<node_IP_address>
    # exit
12 Repeat the steps above to copy the keys to all nodes that will allow passwordless SSH.

CHAPTER 8 Securing Aster Database

This chapter explains how to configure security settings in Aster Database.

Managing Users and Privileges
• Database Roles and Privileges: See (CREATE USER, CREATE ROLE, GRANT, ALTER USER, ALTER ROLE) in the Teradata Aster Big Analytics Appliance 3H SQL and Function Reference.

AMC
• AMC Roles: Creating an AMC user in Aster Database (page 106)
• AMC Access: Internet Access Settings (page 104)

SQL-MapReduce
See the Teradata Aster Big Analytics Appliance 3H Database User Guide.

Security between Aster Database and other systems
• SSL for ODBC and JDBC: See the Teradata Aster Big Analytics Appliance 3H Database User Guide.
• Reporting Tools: See the Teradata Aster Big Analytics Appliance 3H Database User Guide.
• Aster Database - Teradata Connector: “Connector Argument Clauses” in the Teradata Aster Big Analytics Appliance 3H Database User Guide

Aster Database Firewall

The Aster Database firewall runs on all Aster Database nodes, blocking all non-Aster-related external access to the nodes. The documented public interfaces to Aster Database remain open. You can disable the firewall if desired, using the ConfigureNCluster.py command.

For installations with special requirements, Teradata Aster services can help you configure custom firewall policies. Policies can be separately configured for each node.

Default Firewall Settings

The Aster Database firewall is enabled or disabled based on your Aster Database installation type:

AMOS
The firewall is enabled by default only on Aster-managed OS (AMOS) deployments.
UMOS
On user-managed OS (UMOS) deployments, the firewall is disabled by default. This is because in UMOS deployments, the worker/loader nodes are often placed on a different IP subnet than the queen. In such a deployment, enabling the firewall would block communication between the queen and workers, since the default firewall policy blocks all access from outside the queen’s subnet. On UMOS deployments, you can enable the firewall if required, using the ConfigureNCluster.py command, provided all nodes are in the same subnet.

Open Ports

For traffic within the Aster Database subnet, all ports are open. For traffic to and from outside the Aster Database subnet, the Aster Database firewall blocks connections on all ports except those it needs to operate.

Because all ports are accessible within the Aster Database subnet, you do not need to open specific ports between workers and loaders. Your Aster Database firewall policy should only unblock ports that must be kept open for subnet-external access.

Enabling and Disabling Aster Database Firewall

The Aster Database firewall state on a node is controlled by the “firewall” parameter in the /home/beehive/config/beehiveparams.cfg file. The parameter takes one of the following values:
• on: The Aster Database firewall will be started when you start Aster Database.
• off: The Aster Database firewall will not be started when you start Aster Database. Use this setting only if you want all Aster Database nodes to run without a firewall.
• disabled: When you start Aster Database, no attempt will be made to start, stop, or configure firewalls. This means that the firewall configuration routine will not run during cluster startup. This is useful if you have set up your cluster’s firewalls manually, and you do not want Aster Database to change your configuration.

You pass these values using the --firewall flag with the Aster Database configuration script, ConfigureNCluster.py.
/home/beehive/bin/lib/configure/ConfigureNCluster.py --firewall=[on|off|disabled]

Procedure
To modify the firewall activation setting on Aster Database, use the following steps:
1 Perform a soft shutdown of Aster Database.
2 Run ConfigureNCluster.py with the --firewall flag:
/home/beehive/bin/lib/configure/ConfigureNCluster.py --firewall=<on|off|disabled>
3 Perform a soft startup, and activate Aster Database.
Worker and loader nodes will automatically get the configuration changes from the queen during start-up.

CHAPTER 9 Monitoring Aster Database
• Event Monitoring with the Event Engine (page 122)
• SNMP Monitoring of Aster Database (page 134)

Event Monitoring with the Event Engine
The Aster Database Event Engine assists in system maintenance and monitoring. The Event Engine uses a subscription model to send notifications of various events within the system. You can configure separate subscriptions to be notified of events based on various filters. Some examples of filters you can create include:
• “Give me an email when a hardware alert happens”
• “Give me an email only when bad things happen”, or
• “Notify me when components change their state”.

The Event Engine resides on the queen. It monitors states and activities on each node and generates notifications about them. You create subscriptions to specific types of events in order to be notified when they occur. These subscriptions are created through ncli, and may be viewed in ncli or the AMC. When certain events occur, Aster Database automatically performs a remediation, such as a soft shutdown.
This section covers the following topics:
• “Event Engine Overview” on page 123
• “Managing Event Subscriptions” on page 123
• “Upgrades of Event Engine” on page 126
• “Viewing Event Subscriptions” on page 127
• “Supported Events” on page 128
• “Remediations” on page 131
• “Event Engine Best Practices/FAQs” on page 131
• “Testing the Event Engine” on page 132
• “Troubleshooting Event Engine Issues” on page 134

Event Engine Overview
Log messages and user actions on all the nodes in Aster Database generate events. When a triggering event happens or a triggering log message is generated, the node on which the event occurred notifies the Event Engine. The Event Engine responds by checking for subscriptions that fit the event profile, and sending a notification to any subscribers.

To subscribe to a particular event or to all events that fit a certain profile, use ncli. For more on using ncli, see “Admin: ncli (Aster Database Command Line Interface)” on page 136.

Managing Event Subscriptions
A new event subscription becomes active as soon as you create it. You can view, enable/disable, and modify subscriptions using the ncli events commands on the queen (see the “ncli events Section” on page 161). These changes take effect dynamically while the Event Engine is running.

To view all existing subscriptions, issue:
# ncli events listsubscriptions

To view only one existing subscription, issue the following, specifying the appropriate subscription id:
# ncli events listsubscriptions <subid>

The AMC also provides a read-only list of existing event subscriptions (see “Viewing Event Subscriptions” on page 127).
Event components
Events are made up of the following information:

Table 9 - 1: Event components
Name | Possible values | Description
event id | See the table “Subscribable Events in Aster Database” on page 128 for a list of valid event ids. | The unique identifier for the event.
severity | INFO, WARN, ERROR, or FATAL | The severity of the event. All events have the default severity ‘INFO’.
message | various | The log message generated by the event.
priority | LOW, MEDIUM or HIGH | The priority of the event. LOW is the default.
component | hardware, hardware.disk, software.aster, etc. | The component affected by the event.
node IP | The IP address of a non-queen node in the cluster. | The node affected by the event. This is only populated when the event affects a non-queen node.

Event filters used in subscription definitions

Table 9 - 2: Event filters used in subscription definitions

--eventIds [event[,event...]]
  Required: Optional
  Default: all
  Description: Filter specifying one or more event ids that will trigger a notification. Separate event ids by commas, with no spaces.

--minPriority [low | medium | high]
  Required: Optional
  Default: LOW
  Description: Filter specifying the minimum event priority that will trigger a notification.

--minSeverity [info | warn | error | fatal]
  Required: Optional
  Default: INFO
  Description: Filter specifying the minimum event severity that will trigger a notification.

--componentTypes filter[,filter...]
  Required: Optional
  Default: all
  Description: Filter based on component type string. Matches as much of the name as given.
  Examples are:
  • "hardware" to get all hardware events
  • "hardware.disk" to get disk events
  • "software" to get all software events
  • "software.aster" to get just Aster Database software events

Other subscription definition settings
The following table shows other settings to use when adding or editing a subscription:

Table 9 - 3: Other subscription definition settings

--id [subid]
  Required: Required for edit
  Default: The next unused id number
  Description: The subscription id. If specified, this should be an integer. If editing a subscription, the subscription id must be supplied.

--type [email | snmp]
  Required: Required
  Default: (none)
  Description: Must be one of either:
  • email for email notifications or
  • snmp for SNMP traps (see “SNMP Monitoring of Aster Database” on page 134).

--throttleSecs [secs]
  Required: Optional
  Default: 0
  Description: Throttles the notifications for multiple occurrences of the same event. When set to the default value 0, messages are generated for each occurrence of the event. Use this setting to ensure that for a given event (same node, same event id) additional messages will be generated only after the specified time has elapsed. To set a subscription to only repeat an event every 30 minutes, you would specify --throttleSecs 1800 when creating it.

--to address[,address...]
  Required: Required for email
  Default: (none)
  Description: The address of the email recipient(s). If supplying multiple addresses, separate them with a comma without spaces.

--from address
  Required: Required for email
  Default: (none)
  Description: The address of the email sender.

--smtp host[:port]
  Required: Required for email
  Default: Port defaults to 25
  Description: The hostname or IP address (and optionally the port) of the email server.

--username username
  Required: Optional
  Default: (none)
  Description: The username for the SMTP server.

--password password
  Required: Optional
  Default: (none)
  Description: The password for the SMTP server.

--manager host[:port]
  Required: Required for SNMP
  Default: Port defaults to 162
  Description: Used for SNMP subscriptions only. Sets the target host and port number when using SNMP notification. See “SNMP Monitoring of Aster Database” on page 134.

Creating an Event Subscription
To create a new subscription, do the following steps:
1 First, determine what kind of subscription you want to create. You can see a list of options by looking at:
• the list of available events - “Subscribable Events in Aster Database” on page 128.
• the filters you may use when creating a subscription - “Event filters used in subscription definitions” on page 124.
• other parameters that apply to subscriptions - “Other subscription definition settings” on page 124
2 Log in to the queen as the user ‘beehive’.
3 Issue the ncli events addsubscription command with the desired subscription filters and other parameters, as in the following example. The command creates the new event subscription and returns a table showing all existing subscriptions:
# ncli events addsubscription --type email --minPriority low --minSeverity info --componentType "software." --to "[email protected]" --from [email protected] --smtp smtp-server.teradata.com
Event Subscriptions
+--------+------------+--------------+-------------
| Sub ID | Notif Type | Min Priority | Min Severity
+--------+------------+--------------+-------------
| 1      | email      | Low          | INFO
+--------+------------+--------------+-------------
1 rows
table continued...
+----------------+-----------+---------------+
| Component Type | Event IDs | Throttle Secs |
+----------------+-----------+---------------+
| software.      |           | 0             |
+----------------+-----------+---------------+
4 Note that if you attempt to create a subscription without specifying all the necessary parameters, or with invalid parameters, an error message will explain what is needed:
# ncli events addsubscription --type email --minPriority low --minSeverity info --componentType "software."
Invalid Arguments: Email requires at least 'to', 'from' and 'smtp'

Editing an Event Subscription
To edit an existing event subscription, use the same parameters as for adding a subscription, except that for --subid you supply the id of the subscription you wish to edit. The following command edits the subscription we created above, to change the --minSeverity, --minPriority and --componentType:
# ncli events editsubscription --subid 1 --type email --minPriority medium --minSeverity warn --componentType "software.aster" --to [email protected] --from [email protected] --smtp smtp-server.teradata.com --username username --password password
Event Subscriptions
+--------+------------+--------------+-------------
| Sub ID | Notif Type | Min Priority | Min Severity
+--------+------------+--------------+-------------
| 1      | email      | Medium       | WARN
+--------+------------+--------------+-------------
1 rows
table continued...
+----------------+-----------+---------------+
| Component Type | Event IDs | Throttle Secs |
+----------------+-----------+---------------+
| software.aster |           | 0             |
+----------------+-----------+---------------+
Note that when editing a subscription, all parameters must be supplied again, with changes made only to those you want to edit.

Deleting an Event Subscription
To delete an existing event subscription, issue the following command, specifying the appropriate subscription id.
In this example, we delete the subscription created above:
# ncli events deletesubscription 1
Deleted Subscription 1

Upgrades of Event Engine
Beginning in Aster Database version 5.0, the Event Engine works differently than in previous versions. If you are upgrading from a pre-5.0 to a 5.0 or later version, the upgrade will attempt to migrate settings from the legacy Event Engine to the subscription-based Event Engine. The following modifications will be made:
• The upgrade does a best-effort migration of old settings to the new Event Engine. This includes migrating email and SMTP alerts to the new subscription model. The upgrade procedure will build a new subscription for every single SMTP alert that was configured in the old Event Engine. Then it will attempt to consolidate them into subscriptions wherever the notification parameters are the same. That is, if you have configured ten different things in the old system to use the same email parameters, those will be consolidated into a single subscription since the email parameters are the same.
• The upgrade then logs changes that have been made, and changes that cannot be made, in the log file: /primary/logs/PlatformManager.log

Viewing Event Subscriptions
The AMC provides read-only access to event subscriptions. To see a list of subscribed events:
1 Log in to the AMC as a user with the admin role.
2 Select Admin: Events from the menu.
3 In the Event Subscriptions tab, review the table of subscribed events.

On a newly installed cluster, there will be no event subscriptions. Before they can be viewed in the AMC, you must first add event subscriptions through ncli, as described below.

Figure 29: AMC Admin: Events: Event Subscriptions tab with no subscriptions

See “ncli events Section” on page 161 for a discussion of how to add event subscriptions.
The AMC does not display an event until at least one subscription to it has been created. After creating some event subscriptions, the Admin: Events tab will look more like the following:

Figure 30: AMC Admin: Events: Event Subscriptions tab showing subscriptions

Supported Events
To assist administrators in detecting and managing situations where the cluster is running out of disk space, a node is suspect or failed, a user is initiating actions in the AMC, or replication factor issues exist, Aster Database provides the following subscribable events:

Table 9 - 4: Subscribable Events in Aster Database
Event ID | Event Alert Text | Description
MC0001 | AMCAudit: A user attempted to cancel a statement | Occurs when a user attempts to cancel a process from the AMC by clicking Cancel from the Processes list.
MC0002 | AMCAudit: A user attempted a soft restart | Occurs when a user attempts a soft restart from the AMC by clicking the Soft Restart button.
MC0003 | AMCAudit: A user attempted a hard restart | Occurs when a user attempts a hard restart from the AMC by pressing the Hard Restart button.
MC0004 | AMCAudit: A user attempted to add a node | Occurs when a user attempts to add one or more nodes in the AMC by pressing the Add Nodes button.
MC0005 | AMCAudit: A user attempted to remove a node | Occurs when a user attempts to remove a node in the AMC by pressing its “X” (remove) icon.
MC0006 | AMCAudit: A user attempted to activate the cluster | Occurs when a user attempts to activate the cluster from the AMC by pressing the Activate Cluster button.
MC0007 | AMCAudit: A user attempted to balance data | Occurs when a user attempts to balance data from the AMC by pressing the Balance Data button.
MC0008 | AMCAudit: A user attempted to balance processes | Occurs when a user attempts a process rebalance from the AMC by pressing the Balance Process button.
MC0009 | AMCAudit: A user attempted to upload a software upgrade | Occurs when a user attempts to upload a software upgrade from the AMC (by pressing Get File and Distribute in Step 1 of the Upgrade Software dialog box).
MC0010 | AMCAudit: A user attempted to upgrade software | Occurs when a user attempts to upgrade software from the AMC (by pressing Upgrade Cluster Now in Step 2 of the Upgrade Software dialog box).
MC0011 | AMCAudit: A user attempted to run an executable | Occurs when a user attempts to run an executable from the AMC by pressing Run Now in the Enter Script Variables dialog box.
MC0012 | AMCAudit: A user attempted to modify an executable | Occurs when a user attempts to modify an executable from the AMC by clicking its “pencil” (edit) icon in the Executables Library.
MC0013 | AMCAudit: A user attempted to cancel a running executable | Occurs when a user attempts to cancel a running executable from the AMC by clicking Cancel in the Executable Jobs list.
MC0014 | AMCAudit: A user attempted to pause a backup | Occurs when a user attempts to pause a running backup from the AMC by clicking Pause in the Cluster Backups list.
MC0015 | AMCAudit: A user attempted to resume a backup | Occurs when a user attempts to resume a paused backup from the AMC by clicking Resume in the Cluster Backups list.
MC0016 | AMCAudit: A user attempted to cancel a backup | Occurs when a user attempts to cancel a running backup from the AMC by clicking Cancel in the Cluster Backups list.
MC0017 | AMCAudit: A user attempted to add a backup manager | Occurs when a user attempts to add a backup manager from the AMC by clicking the Add Manager button.
MC0018 | AMCAudit: A user attempted to remove a backup manager | Occurs when a user attempts to remove a backup manager from the AMC by clicking the Remove Manager button.
MC0019 | AMCAudit: A user attempted to change admin settings | Occurs when a user attempts to change an administrative setting in the AMC by clicking the Save button for a setting in Admin>Configuration>Cluster Settings.
MC0020 | AMCAudit: A user attempted to create a log bundle | Occurs when a user attempts to create a log bundle in the AMC by pressing the Manually Initiate Diagnostic Bundle button on the Admin>Logs screen.
MC0021 | AMCAudit: A user attempted to save a network configuration for a node | Occurs when a user attempts to save a network configuration for a node by pressing the Save button in the Network Configuration>Edit Configuration dialog box.
MC0022 | AMCAudit: A user attempted to apply a network configuration for a node | Occurs when a user attempts to apply a network configuration for a node by pressing the Save & Apply button in the Network Configuration>Edit Configuration dialog box.
MC0023 | AMCAudit: A user attempted to save a network assignment for a node | Occurs when a user attempts to save a network assignment for a node by pressing the Save & Apply button in the Network Configuration>Network Assignments dialog box.
MC0024 | AMCAudit: A user attempted to save IP ranges to the IP pool | Occurs when a user attempts to save IP ranges to the IP pool by pressing the Save & Apply button in the Admin>Configuration>Network>IP Pools tab.
ST0001 | Disk Full High > 90% | Occurs when any worker node’s used disk space passes 90%.
ST0002 | Disk Full Medium > 80% | Occurs when any worker node’s used disk space passes 80%.
ST0003 | Disk Full Low > 65% | Occurs when any worker node’s used disk space passes 65%.
SY0001 | Node is Suspect | Occurs when a node status changes to “Suspect”.
SY0002 | Node is Failed | Occurs when a node status changes to “Failed”.
SY0003 | Node has changed state | Occurs whenever a node changes state to any state other than "Failed" or "Suspect".
SY0005 | VWorker is Failed | Occurs when a vworker status changes to “Failed”.
SY0006 | VWorker has changed state | Occurs whenever a vworker changes state to any state other than "Failed".
SY0007 | Beehive has started | Occurs when Aster Database starts up on the queen node.
SY0008 | Replication Factor is 0, system is unavailable | System is unavailable.
SY0009 | Replication Factor is below the target value. | Occurs when replication factor falls below target.
SY0010 | Replication Factor is at or above the target value. | Occurs when replication is at the target or above it.
SY0011 | Disk error detected on worker node. | Occurs when a worker node disk error is detected.
PM0001 | Platform Manager notified of core dump | Occurs when there is a core dump on a node.
QS0001 | Query Canceled by Workload Management | Occurs when Workload Management cancels a query because of contention for memory resources.
QS0002 | Memory caches dropped | Occurs when Workload Management drops memory caches associated with a canceled query.

Tip: Note that subscriptions to these events generate messages once when triggered by an event in the system. For example, one message will be sent whenever a node fails. If another node also fails, another message will be sent. However, the “Node is failed” subscription does not continue to generate additional messages at intervals while the node(s) are still down, so plan accordingly when responding to messages from event subscriptions.
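
As a rough illustration of how the three disk-full events in Table 9 - 4 relate to one another, the sketch below maps a used-disk percentage to the most severe event whose threshold it passes. The thresholds (65, 80, 90) come from the table; the function name and the strictly-greater-than comparison are assumptions made for this example, not a description of how Aster Database evaluates thresholds internally.

```python
# Hypothetical helper for illustration only; not part of Aster Database or ncli.
# Thresholds are the defaults listed for ST0001-ST0003 in Table 9 - 4.
DISK_FULL_EVENTS = [
    (90, "ST0001"),  # Disk Full High
    (80, "ST0002"),  # Disk Full Medium
    (65, "ST0003"),  # Disk Full Low
]


def disk_full_event(used_percent):
    """Return the most severe disk-full event id passed, or None."""
    for threshold, event_id in DISK_FULL_EVENTS:
        # Assumption: "passes" is read as strictly greater than the threshold.
        if used_percent > threshold:
            return event_id
    return None
```

Under these assumptions a worker at 85% used disk space maps to ST0002, and a worker at 32% maps to no event at all.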
Tip: Note that the event subscription messages generated by AMC actions (those whose identifier begins with “MC”) are sent when the action is initiated within the AMC. The firing of an AMC event does not necessarily indicate that the initiated action completed successfully.

Remediations
In special cases, Aster Database will take remedial actions to correct a condition reported by the Event Engine. If you wish to implement further automated remediation (e.g. a node removal), you can do so through SNMP management frameworks. For more information on this, see “SNMP Monitoring of Aster Database” on page 134.

The following table shows remedial actions that will be taken automatically when the specified event occurs:

Table 9 - 5: Automated Remedial Actions
EventID | Description | Action
ST0001 | Disk full high | Soft shutdown
SY0008 | RF=0 | Soft shutdown

When a soft shutdown has been issued, you will need to correct the problem that prompted the shutdown, and then restart the cluster. To restart the cluster, log in as the 'root' user to the queen node and run "ncli system softrestart".

Event Engine Best Practices/FAQs
Tip: If you use the fully qualified domain name (FQDN) of the mail server in the Event Engine configuration, ensure that the queen can correctly resolve that FQDN. If the queen cannot resolve it, you will need to edit /etc/resolv.conf accordingly.

• After an Aster Database shutdown, do event subscription emails still go out?
No. If the queen node is not running the Aster Database software stack, emails will not be generated.
• Is there a way to be alerted when things are going well in the system (i.e. a worker is fixed)?
Yes. You can subscribe to events with a minSeverity setting of “INFO”.
• What are some recommendations when using the Event Engine?
Don’t manually edit the configuration file. (This was recommended in release 4.5.1, but beginning in release 4.6, you should only use ncli to modify event subscriptions.) See “ncli events Section” on page 161.

Testing the Event Engine
Once you have set up the Event Engine to trigger emails for specific events, it is useful to test the subscriptions to verify that everything works as expected. You don’t want to find out that a subscription is misconfigured after the event occurs! There are two ways to do this: issuing sample events, and testing for disk full events.

Issuing sample events
You can send a sample alert by issuing the following command:
/home/beehive/bin/utils/SendLogEvent <eventID> <message>

For example, issuing the following logs the event SY0005 “VWorker is Failed”:
/home/beehive/bin/utils/SendLogEvent SY0005 "VWorker is Failed."

This simulates an alert, adding the appropriate entry to generic.log and alerts.log. Log files are found in the directory /home/beehive/data/logs. In fact, even ignored alerts (those without subscriptions) should be logged. To determine which events have subscriptions, issue:
# ncli events listsubscriptions

Testing for disk full events
You want to test without actually creating a real 90% disk full event on a node. Aster Database provides a method to validate the Event Engine settings by forcing lower thresholds for the disk full events on a particular node. You can pass a configuration flag at the command line to reset the threshold temporarily. This only lasts until Aster Database restarts. When the cluster restarts, all disk full settings return to their default values.

Warning! Make sure you test these automated actions during a time of scheduled maintenance so that users are not affected by the activity.
Automated remediations such as Soft Shutdown will disrupt user activity on the cluster. See “Remediations” on page 131 for a list of these.

Procedure
1 Select a worker node in the system and determine how much of its disk space is being used. This is most easily done in the AMC, on the Nodes: Node Overview tab. Let’s say you find Node 1 is utilizing 32% of its total disk space.
2 To test the warning level disk threshold first (ST0003), we will change it to 30% from the default of 65%. Open a browser and paste this URL, supplying the IP Address of the node you identified in step 1 (Node 1 in our example).
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdLow=30
You should see a message like the following in the browser:
Successfully set --diskfullThresholdLow to 30
You want to set the threshold to something lower than the current disk utilization. Since the node’s current utilization is 32%, the Warning Disk Full event will happen the next time the threshold is checked. The new threshold is set only for this particular node, not for the entire cluster.
Tip: If you are using the Aster Database firewall, you may not be able to connect to port 1953 on the worker node. You’ll need to disable the firewall temporarily to set this configuration parameter. To do this, see “Enabling and Disabling Aster Database Firewall” on page 119.
3 Review the alerts.log file to validate that disk full event alerts are now being sent by the worker node. The last few messages should contain a “Low diskfull alert”.
# tail -10 /primary/logs/alerts.log
2011-04-18T21:41:21.344197 WARN 2179 StatsManager.cpp:1533 ST0003] Low diskfull alert
2011-04-18T21:42:21.647056 WARN 2179 StatsManager.cpp:1533 ST0003] Low diskfull alert
4 If you wish to review the log file, you can log in to the queen and examine the file /primary/logs/PlatformManager.log
5 The cluster is now generating email for event subscriptions. If you have configured a subscription that includes the “Warning disk full” event, the email recipient (or SNMP server) should start to receive those messages.
6 Reset the warning level disk threshold to 65%. Open a browser and paste this URL, supplying the IP Address of the node (Node 1 in our example).
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdLow=65
This sets the Warning disk full threshold level back to 65%.
7 You can test the Error and Critical levels of disk full in this same way using the following URLs:
• For Error Level Disk Full Threshold (ST0002):
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdMedium=[Threshold Value]
• For Critical Level Disk Full Threshold (ST0001):
http://<IP Address of Node>:1953/std/configflags?diskfullThresholdHigh=[Threshold Value]

Returning Operations to Normal After Testing Event Subscriptions
You will need to restart the Aster Database cluster after testing any events that issue a soft shutdown. For a list of these see “Remediations” on page 131.

Recovering from softShutdown
Restart the cluster. To do this, log in as the 'root' user to the queen node and run "ncli system softrestart".

Troubleshooting Event Engine Issues
Occasionally, the Event Engine may not work as you expect it to. Please check the following items before contacting technical support.

Do the log files show the expected behavior?
There are two log files for the Event Engine.
One resides on the queen:
• /primary/logs/alerts.log - reports logs from the Event Engine service.
And one resides on the nodes:
• /primary/logs/PlatformManager.log - reports the events that have occurred on that node.

If you encounter an issue with the Event Engine, the first thing to do is verify that it has started properly. Check the last few messages in the /primary/logs/alerts.log file. They should start with ‘INFO’ and state that the Blackbird service has started successfully. If the file contains messages that start with ‘WARN’ or ‘ERROR’, a problem has occurred. Review the messages to determine the cause.

Is Aster Database firing event alerts to trigger the configured actions?
Check the last few messages in the /primary/logs/alerts.log file. Check that the expected disk full alerts are shown in the file. If not, the system is not firing event alerts. Perhaps you don’t have any nodes with disk utilization high enough to trigger the alert. See “Testing the Event Engine” on page 132 for information on how to set thresholds lower to test event alerts.

SNMP Monitoring of Aster Database
All Aster Database nodes can send SNMP traps and respond to SNMP read requests. Aster Database’s SNMP service conforms to net-snmp version 5.4.2.1 and supports the values of the UC Davis MIB.

Follow these instructions to set up SNMP monitoring of Aster Database:
• Setting Aster Database to send SNMP traps to an NMS (page 134)
• Setting an NMS to perform SNMP reads on Aster Database (page 135)

Setting Aster Database to send SNMP traps to an NMS
This section explains how to set up Aster Database to send SNMP traps to an SNMP network management system (NMS) such as net-snmp, HP Open View, or CA Unicenter.

Tip! If you don’t already have an NMS installed, the open source net-snmp tool may prove useful to you. For instructions on setting up the tool, see the net-snmp FAQs on sending traps and receiving traps.
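
The --manager host[:port] argument used for SNMP subscriptions names the NMS and, optionally, its trap listener port, which this guide states defaults to 162. The sketch below shows one way such a value could be split into a host and a port when preparing subscription settings in a script. The function name is hypothetical, IPv6 literals are not handled, and ncli performs its own parsing, which may differ.

```python
# Hypothetical helper for illustration only; ncli does its own argument parsing.
DEFAULT_TRAP_PORT = 162  # default SNMP trap port stated in this guide


def parse_manager(value, default_port=DEFAULT_TRAP_PORT):
    """Split a 'host[:port]' string into a (host, port) tuple."""
    host, sep, port = value.partition(":")
    if not host:
        raise ValueError("manager host must not be empty")
    # Use the default port only when no ':' separator was supplied.
    return (host, int(port) if sep else default_port)
```

Validating the value before passing it to ncli can catch a mistyped port early, although the authoritative check is whatever ncli itself reports.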
Instructions:
1 Make a note of the following:
• IP address of your NMS.
• SNMP trap listener port number of your NMS. This is the port on which the NMS listens for SNMP traps. By default, this is port 162.
2 Issue the ncli command to create a subscription, using the --type snmp and --manager host[:port] flags as in the example:
# ncli events addsubscription --subid 1 --type snmp --minPriority medium --minSeverity warn --componentType "hardware" --manager targetNMS.teradata.com
3 Remember that the port defaults to 162, so you don’t need to specify a port unless your NMS listens for SNMP traps on a different port. If you wanted to specify another port instead, you would use the following flag to refer to the NMS:
--manager targetNMS.teradata.com:port

Setting an NMS to perform SNMP reads on Aster Database
You can point your network management system (NMS) to the queen, workers, and loaders in Aster Database so that the NMS can perform SNMP reads on Aster Database. Each Aster Database node runs an SNMP agent that listens on port number 19678 for SNMP reads from the NMS. Consult the documentation for your NMS for instructions on setting it up to perform reads.

CHAPTER 10 Admin: ncli (Aster Database Command Line Interface)

Aster Database Command Line Interface (ncli) is a command line tool that enables you to gather operational information from all nodes in Aster Database and to take administrative actions in a uniform manner throughout the cluster. ncli is functional even if the cluster is down, at which time ncli may be used to repair the cluster. ncli allows you to generate output (such as cluster system statistics) in a format that you can later analyze.
ncli functionality includes a way to look at node status, vworker configuration, I/O configuration, replication status, and process management job status. Operations may be performed on one, a group, or all of the nodes. Output may be formatted in tables for screen viewing, piped to another UNIX command, or saved to a file.

This section explains how to use ncli. The following topics are covered here:
• ncli Installation and Setup (page 136)
• Using ncli (page 137)
• ncli Command Reference (page 138)

ncli Installation and Setup

Installing ncli
ncli is available beginning in Aster Database version 4.6. When installing or upgrading to version 4.6 or later, ncli is installed under /home/beehive/ncli. To display the version of ncli currently installed, issue:
$ ncli --version

Setting up Passwordless SSH
Passwordless SSH must be set up among all nodes for the OS user who will issue ncli commands. The Aster Database installer does this automatically for the beehive user on all machines that are part of the cluster.

To run most ncli commands, you should log in as the UNIX user, beehive. However, to run certain powerful commands (e.g., softrestart and softshutdown in the system section), you must be logged in as root. If you attempt to run one of these commands as beehive, an error message will indicate that the command may only be run by root.

Using ncli

Who Should Use ncli?
In most installations, your in-house power users and administrators, as well as Teradata Aster support representatives and consultants, will use ncli. Any administrator who prefers not to use the AMC can use ncli. Some operations, like configuration of event subscriptions, are only possible through ncli. Administrators will find ncli very useful because many ncli commands work when the cluster is down (when the AMC does not), and ncli can aid in troubleshooting. This is in contrast to the AMC (Aster Database Management Console), which is focused on setting up, managing, and scaling out Aster Database. The AMC is used by your in-house Aster Database administrators and DBAs.

Issuing ncli Commands
You invoke ncli from the command line on the Aster Database queen by typing the command “ncli” followed by the section name, command name, and any parameters. The capabilities of ncli are divided into “sections”, which are groups of commands with related functions. Flags may be added to commands to modify their actions, for example, by formatting the results or limiting the action of the command to specific nodes.

To run a command, open a shell on any node in the cluster and type ncli followed by the high level flags, the name of the section, the command, and finally any command arguments. In other words, a typical command takes the form:
$ ncli [<highlevelflag>] <section> <command> [<commandflag>]

For example, to run the show command in the node section, while passing an argument to the --hosttype flag so that you'll see only information about workers (and not the queen or loaders), you would type:
$ ncli --hosttype=worker node show

A simpler example, which shows CPU configurations for all nodes, is this:
$ ncli node showcpuconfig

Command Line Conventions
You can use the standard UNIX command line conventions, such as piping your results through another command. For example:
$ ncli vworker showconfigsignature | grep 8f39c1ddfa4762d81f4a5960a31491ff

ncli help
To invoke ncli help, type:
$ ncli --help

See the table below for information on how to access detailed help for sections and commands.

Table 10 - 1: ncli help commands
Flag | Description
ncli --help | Shows a list of help commands and high level flags.
ncli Shows a list of available command sections. ncli <section> Shows available commands within the specified section. ncli --help <section> <command> Shows detailed help text for the specified command. ncli --helpshort Shows help only for ncli module. ncli Command Reference ncli command sections The capabilities of ncli are divided into sections, which are groups of commands with related functions. The following table lists the sections: Table 10 - 2: ncli command sections Section Description node Commands related to nodes tables Commands related to table information procman Commands that retrieve status from the process management master qos Commands related to workload management process Commands related to running processes ippool Commands to configure the Aster Database pool of IP addresses vworker Commands related to vworkers system Commands related to Aster Database system status display and control ice Commands related to ICE (Inter Cluster Express) server Teradata Aster Big Analytics Appliance 3H Database Administrator Guide 138 Admin: ncli (Aster Database Command Line Interface) ncli Command Reference Table 10 - 2: ncli command sections (continued) Section Description disk Commands related to disks replication Commands related to replication session Commands to view the state of running sessions in the system nsconfig Commands to configure name servers and hosts query Commands to view the state of running queries in the system netconfig Commands to configure network interfaces by function statserver Commands related to the StatServer events Commands to configure events util Miscellaneous commands sysman Commands related to sysman, the Aster Database system management layer database Commands for pre-upgrade database tasks ncli node section The most commonly used section is the node section, which provides general tools for reporting and running UNIX commands on one or many nodes in the cluster. 
The syntax to run a command in the node section looks like this example:
$ ncli node showsummaryconfig
which displays a result like this:

Node Configuration
+------------+----------------------------------+-----------------------------------
| Node IP    | Node ID                          | Platform
+------------+----------------------------------+-----------------------------------
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | VMware, Inc.:VMware Virtual Platform
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | VMware, Inc.:VMware Virtual Platform
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | VMware, Inc.:VMware Virtual Platform
+------------+----------------------------------+-----------------------------------
3 rows
table continued...
+--------------------+---------------------------
| Aster Version      | Kernel Version
+--------------------+---------------------------
| beehivemain-r28783 | 2.6.32-131.21.1.el6.x86_64
| beehivemain-r28783 | 2.6.32-131.21.1.el6.x86_64
| beehivemain-r28783 | 2.6.32-131.21.1.el6.x86_64
+--------------------+---------------------------
table continued...
+--------------------------------------------------------+------+---------------
| Distribution                                           | CPUs | Free Mem (MB)
+--------------------------------------------------------+------+---------------
| Red Hat Enterprise Linux Server release 6.0 (Santiago) | 1    | 6179 / 7873
| Red Hat Enterprise Linux Server release 6.0 (Santiago) | 1    | 6791 / 7873
| Red Hat Enterprise Linux Server release 6.0 (Santiago) | 1    | 6793 / 7873
+--------------------------------------------------------+------+---------------
table continued...
+----------------+----------------+
| Free Swap (MB) | Free Disk (GB) |
+----------------+----------------+
| 10049 / 10049  | 151 / 167      |
| 10049 / 10049  | 79 / 89        |
| 10049 / 10049  | 79 / 89        |
+----------------+----------------+

Here is another example using showcmd, which issues the specified command on every node and displays results in a table. For example:
$ ncli node showcmd cat /proc/sys/fs/file-max
displays the results:

Command Output for cat /proc/sys/fs/file-max
+------------+----------------------------------+------+--------+--------+
| Node IP    | Node ID                          | exit | stdout | stderr |
+------------+----------------------------------+------+--------+--------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | 0    | 784168 |        |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | 0    | 784195 |        |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | 0    | 784196 |        |
+------------+----------------------------------+------+--------+--------+
3 rows

runonall and runonother

The ncli node runonall command may be used to run any executable on multiple nodes. It can also be used to run a command from a file. The executable must exist on all nodes prior to the command being run. For some commands (like df), the command already exists on all nodes. If a user-written script is being executed, then it must be copied to all nodes using ncli node clonefile or a similar mechanism. This effectively allows you to run commands in parallel over SSH on the cluster. An example is:
$ ncli node runonall df

Similarly, the runonother command is used to run the specified executable in parallel on all nodes except the one from which the command is issued:
$ ncli node runonother cat /proc/sys/fs/file-max

Table 10 - 3: ncli node section
Command                                       Description
changeconfig <oldip> <newip> <newmac>         Changes configuration of node IP address and/or MAC address.
clonefile <filename>                          Copies the named file <filename> from this node to all other nodes in the cluster.
runonall <cmd>                                Runs the command <cmd> in parallel on all nodes.
runonother <cmd>                              Runs the command <cmd> in parallel on all nodes except this one.
show [summary]                                Shows cluster nodes and optional summary table.
showcmd <cmd>                                 Shows the output of command <cmd> for each of the nodes, in table format.
showcpuconfig [--ids]                         Shows CPU configuration for all nodes or just the nodes specified using the --ids flag.
showdbinfo [epochtime]                        Shows node information stored in the Aster Database internal database, optionally using epochtime.
showhwconfig                                  Shows all node hardware commands at once.
showinterfaces                                Shows details for network interfaces.
showpci                                       Shows the items on the PCI bus.
showstoragestats [--nodeids], [--aggregate]   Shows storage statistics of specified nodes, or an aggregate for all nodes.
showsummaryconfig                             Shows summarized node configurations.
showuid                                       Shows node uid (unique identifier) information from config files.
showversion                                   Shows detailed software version information.

ncli tables Section

The tables section provides general tools for returning information about tables. The syntax to run a command in the tables section looks like the examples shown below.
First, you must gather information by issuing:
$ ncli tables gathertableinfo --forcerun=true
which displays results like:

Table space information has been recorded in /home/beehive/data/tmp/table_space_data_beehive
Table space information has been recorded in /home/beehive/data/tmp/table_space_data_retail_sales

Then you can display the information gathered by issuing:
# ncli tables showtableinfo
which returns results like:

Table Spaces
+--------------+---------+-------------+----------------+-------------
| dbname       | vworker | schema_name | table_name     | compression
+--------------+---------+-------------+----------------+-------------
| beehive      | w6z     | public      | employees      | none
| beehive      | w5z     | public      | employees      | none
| retail_sales | w5z     | public      | customer_index | none
| retail_sales | w5z     | public      | region_dim     | none
...
| retail_sales | w6z     | public      | geo_dim        | none
| retail_sales | w6z     | public      | date_dim       | none
+--------------+---------+-------------+----------------+-------------
table continued...
+------------+------------+-----------------+------------------+
| table_type | total_size | total_disk_size | dead_tuple_count |
+------------+------------+-----------------+------------------+
| row        | 32768      | 32768           | 0                |
| row        | 0          | 0               | 0                |
| row        | 11698176   | 11698176        | 0                |
| row        | 32768      | 32768           | 0                |
...
| row        | 32768      | 32768           | 0                |
| row        | 32768      | 32768           | 0                |
+------------+------------+-----------------+------------------+

Table 10 - 4: ncli tables section
Command                                                  Description
gathertableinfo [--configfile][--dbnames] [--forcerun]   Gathers table information and saves it to temporary storage. The automatic process that gathers table information runs periodically. However, you can force it to run at any time by specifying the --forcerun=true flag. Use the --configfile flag to specify where the results should be written, and use --dbnames to supply the databases for which to gather information.
showtableinfo [--tables] [--dbnames][--aggregate]        Shows table information collected by ncli tables gathertableinfo. Use --dbnames or --tables to filter results by database and/or table. Use --aggregate to display logical partitioned table statistics in aggregate, rather than per partition.

ncli procman Section

The procman section provides general tools for obtaining statuses from the process management manager. The syntax to run a command in the procman section looks like this example:
$ ncli procman showjobs
which displays results like:

ProcMgmt nodes
+-------+-----------------------------------------------+--------------+
| JobId | Name                                          | Task Indices |
+-------+-----------------------------------------------+--------------+
| 8     | Txman on 10.50.129.100                        | 0 1          |
| 14    | sysmanExec on 10.50.129.100                   | 0            |
| 18    | AdmctlMonitorMgr on 10.50.129.100             | 0            |
...
| 232   | Net-SNMP on 10.50.129.101                     | 0            |
| 236   | HardwareStatCollectorExec on 10.50.129.101    | 0            |
| 240   | System SharedJVM on 10.50.129.101             | 0            |
+-------+-----------------------------------------------+--------------+
56 rows

Table 10 - 5: ncli procman section
Command                             Description
showjobs [<jobId> ...]              Shows registered jobs. Optionally, shows only those registered jobs listed by <jobId>. Separate multiple jobIds by spaces.
shownodes                           Shows registered nodes.
showtasks [<jobId:taskIndex> ...]   Shows registered tasks. Optionally, shows only those registered tasks listed by identifier <jobId:taskIndex>. The task index is the number assigned to each individual task that makes up a job. Task index assignment begins with zero (0) and proceeds until each task that makes up a job has a unique index. Separate multiple jobId:taskIndex references by spaces.
showusers                           Shows registered OS users.

ncli qos Section

The qos section allows you to view details related to workload management. The syntax to run a command in the qos section looks like this example:
$ ncli qos showconcurrency
which displays results like:

Concurrency is 100

Table 10 - 6: ncli qos section
Command                              Description
cancel <sessionid>                   Cancels the specified query.
canceladmission <sessionid>          Cancels the specified query from being admitted.
setconcurrency <concurrency>         Sets the maximum query concurrency.
showadmissioncontrols                Shows the QoS admission controls.
showall                              Shows all of the QoS-related data.
showconcurrency                      Shows the maximum query concurrency.
showcpucgroups                       Shows all of the CPU cgroup tasks.
showmemcgroups                       Shows all of the memory cgroup tasks.
showprocesses <sessionid> ...        Shows processes under QoS, optionally filtered by sessionid.
showrules                            Shows the QoS rules.
showserviceclasses                   Shows the QoS service classes.
showsessiondetails <sessionid> ...   Shows the QoS session details, optionally filtered by sessionid.
showsessions <sessionid> ...         Shows the QoS sessions, optionally filtered by sessionid.
showslavesessions <sessionid> ...    Shows the QoS sessions from the QosSlave processes, optionally filtered by sessionid.

ncli process Section

The process section provides commands related to running processes, specifically the memcheck [<processnamefilter>] command, which reports on memory utilization.
The syntax to run a command in the process section looks like this example:
$ ncli process memcheck postgres
which displays results like:

Process Memory Usage
+------------+----------------------------------+--------------------+-------
| Node IP    | Node ID                          | Process Name       | pid
+------------+----------------------------------+--------------------+-------
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | postgres queenDb-0 | 3274
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | postgres queenDb-0 | 3270
| ...
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | postgres w6z       | 28799
+------------+----------------------------------+--------------------+-------
27 rows
table continued...
+------------+----------+-------------+
| VSize (MB) | RSS (MB) | Shared (MB) |
+------------+----------+-------------+
| 720        | 62       | 56          |
| 720        | 59       | 53          |
| ...
| 139        | 6        | 0           |
+------------+----------+-------------+

Table 10 - 7: ncli process section
Command                          Description
memcheck {<processnamefilter>}   Shows memory utilization for all processes. Optionally, provide a filter by process name or partial process name.

ncli ippool Section

The ippool section provides general tools for managing IP address allocations in the Aster Database cluster. In an Aster-managed installation (AMOS), all worker and loader nodes are allocated IP addresses from a pool managed by the queen (coordinator) node. Pools are not required to be contiguous, but they must be in the same network, and they must include all existing nodes. They can also consist of only one IP address.

Tip: Note that in a User-managed installation (UMOS), nodes are assigned an IP address through the OS directly, and then added to the Aster Database cluster by IP address, so the ncli ippool commands are not used.
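The pool rules above (ranges need not be contiguous, must sit in one network, must cover every existing node, and may be a single address) can be checked before any ranges are changed. The following is a minimal Python sketch, not part of ncli: the function names `parse_range` and `check_pools` are hypothetical, and the check ignores node types, so it is looser than the per-type rule that setranges applies. It also tests the no-overlap requirement that this chapter states for setranges.

```python
import ipaddress


def parse_range(text):
    """Parse 'start-end' (the form passed to setranges) into two addresses."""
    start, end = (ipaddress.IPv4Address(part) for part in text.split("-"))
    if start > end:
        raise ValueError(f"range {text} runs backwards")
    return start, end


def check_pools(ranges, network, existing_nodes):
    """Return a list of problems; an empty list means the plan looks consistent.

    Assumption: node types (queens/workers/loaders/shared) are ignored here.
    """
    net = ipaddress.ip_network(network)
    parsed = sorted(parse_range(r) for r in ranges)
    problems = []
    # Every range must lie inside the one network.
    for start, end in parsed:
        if start not in net or end not in net:
            problems.append(f"{start}-{end} is outside {net}")
    # Ranges may be non-contiguous but must not overlap.
    for (_, prev_end), (next_start, _) in zip(parsed, parsed[1:]):
        if next_start <= prev_end:
            problems.append(f"range starting at {next_start} overlaps the previous range")
    # Every node already in the cluster must be covered by some range.
    for node in map(ipaddress.IPv4Address, existing_nodes):
        if not any(start <= node <= end for start, end in parsed):
            problems.append(f"existing node {node} is not covered by any range")
    return problems
```

Calling `check_pools` with the ranges you intend to pass to ncli ippool setranges, the cluster network, and the addresses of the current nodes gives a quick sanity check; it does not replace the validation that ncli itself performs.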
Syntax

The syntax to run a command in the ippool section looks like this example:
$ ncli ippool showranges
which displays results like:

IP Address Ranges
+--------+--------------+--------------+
| Type   | Start IP     | End IP       |
+--------+--------------+--------------+
| shared | 10.60.11.100 | 10.60.11.105 |
+--------+--------------+--------------+
1 rows

Table 10 - 8: ncli ippool section
Command                                                              Description
setranges <start ip1>-<end ip1> [shared|queens|workers|loaders]...   Sets the new IP pool ranges.
showallocations                                                      Shows currently allocated IP addresses.
showranges                                                           Shows currently configured IP ranges.

Note that configuring the IP pool is not required. The default configuration will be a shared pool of the entire network and if this is suitable to your needs, you may use it without changing it. However, changing pools later can be difficult as the changes must include all existing nodes in your cluster.

For command-line help on these commands, see:
# ncli --help ippool

You should issue the ippool showranges command before making any changes, in order to see what the current settings are.

Default IP Pool Behavior

By default, the queen assumes it can allocate IP addresses for the entire subnet on which it resides. For example, if you have assigned an IP address and subnet mask of 192.168.10.10/255.255.255.0 during queen installation, then the queen will create a default IP address allocation pool of 192.168.10.1 through 192.168.10.254. In many networks, the first address is reserved for the gateway, and the coordinator will omit any IP addresses that it detects are in use. This default behavior may not be suitable for all installations, especially those where the Aster Database has been given a specific portion of IP addresses in a network or is part of a very large network (like a subnet mask 255.255.0.0, for example).
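To see how large the default pool would be for a given queen address and subnet mask, the arithmetic described above can be reproduced with Python's standard ipaddress module. This is only an illustrative sketch: `default_pool` is a hypothetical helper, and it does not model the queen skipping addresses that it detects are already in use.

```python
import ipaddress


def default_pool(queen_ip, netmask):
    """Return (first, last, count) of usable host addresses in the queen's subnet."""
    network = ipaddress.ip_interface(f"{queen_ip}/{netmask}").network
    first = network.network_address + 1    # address after the network address
    last = network.broadcast_address - 1   # address before the broadcast address
    return first, last, network.num_addresses - 2


first, last, count = default_pool("192.168.10.10", "255.255.255.0")
print(first, last, count)
```

For the 192.168.10.10/255.255.255.0 example this gives the 192.168.10.1 through 192.168.10.254 pool mentioned above, and for a 255.255.0.0 mask the count grows to tens of thousands of addresses, which is the situation where restricting the pool is worth considering.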
The IP pool management is flexible enough to handle various needs. Some examples will be shown below.

Warning! The default behavior for IP address assignment to new nodes is that the system allows allocations of all node types over the entire subnet. If the default configuration is not suited to your network, the IP pool must be reconfigured using ncli ippool setranges before adding any new nodes to the Aster Database cluster. If this is not done before adding new nodes, you will need to remove the nodes through the AMC, allocate the correct IP address pool, and then re-add the nodes through the AMC. If you cannot remove and re-add the nodes (for example, because they contain data), you should call Teradata Support for assistance.

Setting Up IP Pools

The pools can be reconfigured using the ncli ippool command or the AMC (“Setting up IP Pools in the AMC” on page 500). This should be done prior to adding nodes as the allocations may restrict the type of changes allowed. In many cases, leaving the default range alone will be suitable. It is only recommended to change the allocation scheme if the network is not entirely owned by Aster Database, or if the range is so large that the user does not wish to give Aster Database the entire range (e.g. 2^16 in the case of a class-B subnet). Keep in mind that the queen’s IP address has no effect on the allocation scheme other than the fact that it is already in use and cannot be allocated again.

All new nodes will be allocated from the start of the pool for that node type, and because of this, the pool should be modified prior to adding new nodes to the system. Previous versions of Aster Database would allocate new nodes incrementally from the coordinator IP (typically X.X.X.100), but this is no longer the default behavior. Even if there are existing nodes at .100 and higher, any new nodes will be allocated starting at the bottom of the available range(s).
Since the default behavior is a single range with a designated type of “shared” spanning the subnet, in many cases the second address (.2 in a class-C subnet) will be the next node assigned (since typically .1 is the gateway).

Restoring Pre-4.6.3 Behavior

If the old default behavior is desired (as in Aster Database version 4.6.2 and earlier), the following command can be issued. In this example, the queen has been installed with IP address 10.1.1.100:
ncli ippool setranges 10.1.1.100-10.1.1.100 queens 10.1.1.101-10.1.1.239 workers 10.1.1.240-10.1.1.254 loaders

This creates three ranges. The first is a single IP for the queen. The second is a range just for worker allocation. The third is a range for loader allocation. If there is a backup queen, it should be given an IP address outside the allocated ranges, such as 10.1.1.99 in this example.

Using the ippool setranges Command

It is recommended to make the pools as large as you expect the cluster to be, with additional space for growth. Also, some features in Aster Database use IP addresses to do utility operations (like networking auto-enslavement for bonding) and these operations require temporary use of some of the IP addresses in the pool. Allowing two or three extra IP addresses in the worker and loader pool will suffice.

You can use setranges to allocate a group of IP addresses that will be assigned to new workers, queens, or loaders as they are added to the Aster Database cluster, or you can allocate a pool of shared IP addresses that can be assigned to any new node that is added.
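The sizing advice above (pool at least as large as the planned cluster, plus two or three spare addresses for utility operations) comes down to counting the addresses in each range. A rough Python sketch follows; `range_size` and `pool_is_large_enough` are hypothetical names, and the default of three spare addresses is simply the upper figure from the recommendation, not a value enforced by Aster Database.

```python
import ipaddress


def range_size(text):
    """Number of addresses in an inclusive 'start-end' range."""
    start, end = (int(ipaddress.IPv4Address(part)) for part in text.split("-"))
    return end - start + 1


def pool_is_large_enough(ranges, planned_nodes, spare=3):
    """True if the ranges hold the planned nodes plus the spare addresses."""
    return sum(range_size(r) for r in ranges) >= planned_nodes + spare


# Worker and loader ranges from the pre-4.6.3 example above.
print(range_size("10.1.1.101-10.1.1.239"))
print(pool_is_large_enough(["10.1.1.240-10.1.1.254"], planned_nodes=12))
```

Several ranges of the same type can be passed in one list, so the same check works for non-contiguous pools.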
Note that there can be no overlap between the ranges that are allocated, but you may use two or more IP pools that are noncontiguous:
ncli ippool setranges 10.60.11.100-10.60.11.105 shared

IP Address Ranges
+--------+--------------+--------------+
| Type   | Start IP     | End IP       |
+--------+--------------+--------------+
| shared | 10.60.11.100 | 10.60.11.105 |
+--------+--------------+--------------+
1 rows

You can use setranges to resize the ranges at any time, but each new range must always include all existing nodes of its type.

ncli ippool Examples

The following examples describe a few typical installation scenarios and the commands needed to configure the queen for these installations.

Installation 1: Creating a minimal class-C IP Pool

Create a shared range for use by all nodes. Assume the coordinator is already installed at 192.168.10.10.
# ncli ippool setranges 192.168.10.10-192.168.10.199 shared
This means all new nodes will be allocated IP addresses from .11 until .199.

Installation 2: Allocation of node specific ranges in a class-C network

To allow the queen a portion of the class-C network and to control how nodes are allocated, the following command can be used:
# ncli ippool setranges 192.168.10.10-192.168.10.11 queens 192.168.10.100-192.168.10.199 workers 192.168.10.200-192.168.10.219 loaders
This command assumes the coordinator is already using 192.168.10.10 and leaves 192.168.10.11 for an additional queen later on. Workers and loaders will be allocated IP addresses from .100-.199 and .200-.219 respectively.

Installation 3: Allocation of a portion of a class-B network

Suppose that your IT department has given you an allocation for your Aster Database nodes in a class B network. You can restrict the node allocation to use just a portion of that network.
Note that the queen still assumes that it will be the only DHCP server reachable by the nodes for use in PXE provisioning of the OS.
# ncli ippool setranges 172.20.0.10-172.20.0.249 shared
This would allocate 240 addresses in the 172.20.0.0/16 space.

Installation 4: Allocation of non-contiguous IP pools

Pools are not required to be contiguous, but they must be in the same network, and they must include all existing nodes. If you run out of IPs for a particular node type in an existing pool and need to add another pool, you can do that instead of growing the existing pool. Pools can also consist of only one IP address.
# ncli ippool setranges 192.168.10.10-192.168.10.10 queens 192.168.10.200-192.168.10.209 workers 192.168.10.230-192.168.10.239 workers 192.168.10.220-192.168.10.229 loaders
This would allocate a single IP address for the queen, twenty addresses for workers (two non-contiguous ranges of ten each), and ten addresses for loaders.

ncli vworker Section

The vworker section provides commands related to Aster Database vworkers. A vworker represents a single instance of the Aster Database data management software on an Aster Database node (machine). Each Aster Database node typically has a number of vworkers running on it.
The syntax to run a command in the vworker section looks like this example:
$ ncli vworker show
which returns a result like the following, where Node is the IP address of the Aster Database node machine, Status is the current operational status of one or more vworkers on the node, and Count is the number of vworkers that currently have that status on that node:

vworkers
+---------------+-------------+---------------+
| Node          | Status      | VWorker Count |
+---------------+-------------+---------------+
| 10.50.129.100 | Deactivated | 0             |
| 10.50.129.100 | Active      | 1             |
| 10.50.129.101 | Active      | 1             |
| 10.50.129.101 | Deactivated | 1             |
| 10.50.129.102 | Active      | 1             |
| 10.50.129.102 | Deactivated | 2             |
+---------------+-------------+---------------+
6 rows

Tip: With any command in the node section, you can use flags to limit commands to only specific machine(s) or type(s) of nodes. See the section ncli sysman Section (page 163) for information on using flags.

Table 10 - 9: ncli vworker section
Command                   Description
show                      Summarizes operational status of the vworkers.
showconfigsignature       Shows vworkers’ configuration signatures.
showdetail [shownodeid]   Shows vworkers’ details, with Node ID if shownodeid is specified.

ncli system Section

The system section provides commands related to Aster Database status display and control. The syntax for commands in the system section looks like this example:
$ ncli system softrestart

Note: Setting the QoS concurrency to zero will still allow any new queries that are part of an open transaction. Soft Restart does not wait until open transactions are finished; it rolls back open transactions and then does the restart.

Most of the commands in the system section duplicate some of the functionality available through the AMC.
Exposing them through ncli enables you to run those commands even if the AMC is not running or you do not have access to the AMC for whatever reason. Because they are so powerful, the commands softrestart and softshutdown must be run by the root OS user. The softrestart command may be issued on a cluster after it has been shut down or to restart it when it is running. The softrestart command should be issued for the first time on a new cluster only after the workers have attained a status of “Prepared” (you can check the worker status in the AMC).

Tip: Using the commands softrestart and softshutdown through ncli is preferable to issuing them the old way using the Python utilities SoftShutdownBeehive.py and SoftRestartBeehive.py located by default in the directory /home/beehive/bin/utils/primitives/. If AMC is running, it is even better to use the “Soft Restart” or “Hard Restart” buttons in the Admin:Cluster Management tab, because they can be used without logging in to the system at the command line, and you can see the cluster status during the restart.

Table 10 - 10: ncli system section
Command                                                    Description
activate                                                   Activates Aster Database. See “Activating Aster Database” on page 531.
addnode <worker or loader> <ip or mac> [--clean] [group <group name>] [display <display name>]   Adds/registers a new node. See “Adding Nodes to Aster Database” on page 487.
balancedata                                                Balances data in Aster Database. See “Balance Data” on page 533.
balanceprocess                                             Balances Aster Database processes. See “Balance Process” on page 534.
changepartitioncount <newpartitioncount> {<parallelism>}   Changes the partition count and the parallelism to the values specified. See “Splitting Partitions in Aster Database” on page 475.
changepartitioncountstatus                                 Shows progress made in changing the partition count.
removenode <ip address> [--force] [--dryRun]               Removes/unregisters the node specified by ip address. Use --dryRun to test if the node is eligible to be removed. Use --force to force removal if necessary (if internal checks fail). See “Removing Nodes from Aster Database” on page 502.
show                                                       Shows Aster Database queen status. Note that a status of Up means only that the queen is running; it does not tell you if the cluster is ready to accept queries. For that, use the ncli node show command as described in “ncli node section” on page 139.
showpartitioncount                                         Shows system partition count.
softrestart                                                Issues a soft restart to Aster Database. You must be root to issue this command. See also: “Soft Restart” on page 530.
softshutdown                                               Issues a soft shutdown to Aster Database. You must be root to issue this command. See also: “Soft Shutdown” on page 531.
softstartup                                                Performs a soft startup of Aster Database. You must be root to issue this command. See also: “Soft Startup” on page 536.

ncli ice Section

The ice section provides commands related to the Ice server, which provides services to move data around in Aster Database. The syntax to run a command in the ice section looks like this example:
$ ncli ice showactivetransports
which returns a result like:

Active Transports
+------------+---------------------+--------------------+
| Node       | SessionId           | TransportId        |
+------------+---------------------+--------------------+
| 10.60.11.5 | 2327674903724048181 | 387487523833891928 |
| 10.60.11.6 | 2327674903724048181 | 387487523833891928 |
| 10.60.11.7 | 2327674903724048181 | 387487523833891928 |
+------------+---------------------+--------------------+
3 rows
Table 10 - 11: ncli ice section
Command                    Description
showactivetransports       Shows all active transports.
showicestats <sessionid>   Shows statistics of all Ice servers.

ncli disk Section

The disk section provides commands related to disks. The syntax to run a command in the disk section looks like this example:
$ ncli disk showallconfig
which returns a result like:

IO Scheduler Configuration
+------------+----------------------------------+-------------
| Node IP    | Node ID                          | Device Path
+------------+----------------------------------+-------------
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | /dev/sda1
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | /dev/sda2
| ...
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | /dev/sda7
+------------+----------------------------------+-------------
18 rows
table continued...
+-----------+--------------+-----------------+
| Scheduler | Max Requests | Read Ahead (KB) |
+-----------+--------------+-----------------+
| deadline  | 4096         | 4096            |
| deadline  | 4096         | 4096            |
| ...
| deadline  | 4096         | 4096            |
+-----------+--------------+-----------------+

Table 10 - 12: ncli disk section
Command                                 Description
showallconfig                           Shows report on disk configuration.
showfsconfig [all]                      Shows report on file system configuration.
showhpacucliconfig                      Shows report on data generated by the hpacucli utility. hpacucli (the Array Configuration Utility CLI) is a command line-based disk configuration program for HP Smart Array Controllers and RAID Array Controllers.
showioschedconfig                       Shows report on IO scheduler configuration.
showmdconfig                            Shows report on md configuration (multiple device, i.e. RAID).
showmegacliconfig [--pdonly|--ldonly]   Shows report on disk configuration for Dell machines. Optionally view only PD (physical disks) or LD (virtual disks).
showmppconfig                           Shows report on MPP configuration.
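Because ncli output can be piped or saved to a file, tabular reports such as the IO scheduler report in this section can be post-processed with a small script. The Python sketch below turns one table segment into a list of dictionaries. It is an illustration only: `parse_ncli_table` is a hypothetical helper, the sample text is modeled on the output shown in this chapter, and the exact spacing of real ncli output is an assumption.

```python
def parse_ncli_table(text):
    """Parse one '|'-delimited table segment into a list of row dictionaries."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip titles, +---+ borders, row counts, "table continued..."
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first '|' line holds the column names
        elif len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
        # lines with a different cell count (such as '| ...') are ignored
    return rows


sample = """\
+-----------+--------------+-----------------+
| Scheduler | Max Requests | Read Ahead (KB) |
+-----------+--------------+-----------------+
| deadline  | 4096         | 4096            |
| ...
| deadline  | 4096         | 4096            |
+-----------+--------------+-----------------+
"""

for row in parse_ncli_table(sample):
    print(row["Scheduler"], row["Read Ahead (KB)"])
```

Wide reports are printed in several segments separated by "table continued..."; under the assumptions above, each segment would need to be parsed separately and the rows joined by position.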
ncli replication Section

The replication section provides commands related to Aster Database replication. The syntax to run a command in the replication section looks like this example:

$ ncli replication showgoal

which returns a result such as:

Replication factor goal is 2

Table 10 - 13: ncli replication section
Command
    Description
showdetailedrpcstats
    Shows detailed replication RPC statistics.
showgoal
    Shows the goal replication factor.
showsummaryrpcstats
    Shows summarized replication RPC statistics.

ncli session Section

The session section provides commands related to Aster Database sessions. The syntax to run a command in the session section looks like this example:

$ ncli session show

which returns a result such as:

Sessions
+---------------------+---------+------------------+-----------+-------------
| session_id          | user_id | user_ip          | queen_pid | db_name
+---------------------+---------+------------------+-----------+-------------
| 5047778509961726353 | beehive | 127.0.0.1        | 17438     | beehive
| 1914926430725305282 | beehive | 127.0.0.1        | 31349     | beehive
| 283128356804756573  | beehive | 127.0.0.1        | 29621     | beehive
| 1627772863387028473 | beehive | 127.0.0.1        | 29668     | retail_sales
| 7450466840935701437 | beehive | 127.0.0.1        | 29765     | beehive
| 1365127267625199444 | beehive | 127.0.0.1        | 12173     | beehive
+---------------------+---------+------------------+-----------+-------------
6 rows

table continued...
+---------------+---------------+---------------+----------------+------------------+
| session_state | start_time    | end_time      | login_duration |running_process_id|
+---------------+---------------+---------------+----------------+------------------+
| closed        | 1337538898000 | 1337540698000 | 1800000        |                  |
| closed        | 1337754018000 | 1337754022000 | 4000           |                  |
| closed        | 1337798303000 | 1337798318000 | 15000          |                  |
| closed        | 1337798318000 | 1337798332000 | 14000          |                  |
| closed        | 1337798380000 | 1337798473000 | 93000          |                  |
| idle          | 1336695793000 | None          |                | None             |
+---------------+---------------+---------------+----------------+------------------+

Table 10 - 14: ncli session section
Command
    Description
show [--ids] [--status]
    Shows session information for specified session ids and/or statuses.
showactive
    Shows all active sessions.

ncli nsconfig Section

The nsconfig section provides commands for setting up nameservers and hosts on all nodes in the cluster simultaneously. Some of this functionality is also available through the AMC (see “Setting up Host entries for all Aster Database nodes” on page 514 and “Setting up DNS entries for all Aster Database nodes” on page 515). The syntax to run a command in the nsconfig section looks like this example:

$ ncli nsconfig show hosts

which shows all the entries in the /etc/hosts file that have been made through ncli and/or the AMC. On a clean installation, it returns:

{
  "hosts": []
}

After one host has been added through either the AMC or ncli, the result will be like:

{
  "hosts": [
    {
      "comment": "Teradata server",
      "ip": "10.31.120.100",
      "aliases": [ "tdserver" ]
    }
  ]
}

Note that the example above is also the format to use when creating a file of host entries to be added through ncli. If there are multiple aliases for a particular host, separate them by commas.
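Because the hosts file is plain JSON, it can be generated rather than typed by hand. The following is a minimal Python sketch, not part of ncli: the helper name build_hosts_config and the second host entry are hypothetical, and only the "hosts", "comment", "ip", and "aliases" keys come from the format shown above.

```python
import json


def build_hosts_config(entries):
    """Return a dict in the {"hosts": [...]} layout shown above.

    entries is an iterable of (comment, ip, aliases) tuples.
    """
    hosts = []
    for comment, ip, aliases in entries:
        hosts.append({"comment": comment, "ip": ip, "aliases": list(aliases)})
    return {"hosts": hosts}


config = build_hosts_config([
    ("Teradata server", "10.31.120.100", ["tdserver"]),
    # Hypothetical second entry, included only to show multiple aliases.
    ("Example staging host", "10.31.120.101", ["stage1", "stage1-alt"]),
])

# Serialize with indentation so the file stays readable for later edits.
hosts_json = json.dumps(config, indent=2)
print(hosts_json)
```

The printed text can be saved to a file of your choosing and reviewed before it is applied to the cluster.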
Adding or Modifying Hosts through ncli

You can modify /etc/hosts on every node in the Aster Database cluster by creating a “hosts file” and applying it through ncli. Doing so does not overwrite any existing entries that were added manually to /etc/hosts. However, the file you apply does overwrite any entries that were made through ncli or the AMC. Be sure that the hosts file to be applied contains not only the new entries you wish to add, but also any existing entries added through the AMC or ncli that you wish to retain.

To apply a hosts file through ncli, issue the following command, substituting your own path to the file with the entries to be added:

$ ncli nsconfig apply /home/beehive/myhostsfile hosts

The format for the hosts file is:

{
  "hosts": [
    {
      "comment": "<comment>",
      "ip": "<ipaddress>",
      "aliases": [ "<alias1>"[ ,"<alias2>"[ ,..."<aliasn>"]] ]
    }
  ]
}

Adding or Modifying Nameservers through ncli

To add to or modify /etc/resolv.conf for nameservers, issue the following, substituting your own path to the file with the entries to be added:

$ ncli nsconfig apply /home/beehive/mynameserversfile nameservers

For the nameservers file, the format is:

{
  "nameservers": [
    {
      "comment": "<comment>",
      "ip": "<ipaddress>"
    }
  ]
}

Note that any existing entries in these files not made through ncli or the AMC will not be overwritten, and only those entries created through ncli or the AMC may be replaced by the ncli nsconfig apply <filePath> hosts|nameservers command.

Table 10 - 15: ncli nsconfig section
Command
    Description
apply <filePath> hosts|nameservers
    Applies the configuration stored at <filePath> to all nodes in the cluster, for either hosts or nameservers. See the section above this table for the format of the file to be applied.
show hosts|nameservers
    Shows current configuration.
synccluster hosts|nameservers
    Synchronizes the configuration across the cluster. This command is useful when new nodes are added.
syncnode hosts|nameservers
    Synchronizes the configuration on the node where this is run. This command is useful when a new node has been added.
validate hosts|nameservers
    Validates the configuration.

ncli query Section

The query section provides commands to obtain information about active queries, recent queries, and usage statistics. The syntax to run a command in the query section looks like this example:

$ ncli query showrecent 3

which returns a result like:

+---------------------+----------------------------------+---------------------+
| sessionid           | statement                        | start_time          |
+---------------------+----------------------------------+---------------------+
| 2954808787919656620 | SELECT * FROM sales_fact;        | 2012-05-09 13:52:39 |
| 7977221054196701228 | SELECT COUNT(*) from sales_fact; | 2012-05-09 09:32:02 |
| 7977221054196701228 | SELECT * FROM product_dim;       | 2012-05-09 09:32:24 |
+---------------------+----------------------------------+---------------------+

(table continued...)
+---------------------+-------------+---------+
| end_time            | duration    | running |
+---------------------+-------------+---------+
| 2012-05-09 14:22:43 | 0:30:04     | N       |
| 2012-05-09 09:32:25 | 0:00:10     | N       |
| 2012-05-09 09:32:02 | 0:00:01     | N       |
+---------------------+-------------+---------+

Table 10 - 16: ncli query section
Command
    Description
process_phase [--process_ids=<process_id>, <process_id>..] [--max=<num_of_processes (default:10)>]
    Shows the phase for the given process ids. Use --max=<num_of_processes> to specify the number to return. If not specified, the default is 10.
process_phase_statements [--process_ids=<process_id>, <process_id>..]
[--max=<num_of_processes (default:10)>]
    Shows the statement phase information for the given process ids. Use --max=<num_of_processes> to specify the number to return. If not specified, the default is 10.
process_statements [--process_ids=<process_id>, <process_id>..] [--max_statement_len=<length>]
    Shows the result set of statements corresponding to the given list of process identifiers. You can limit the length of statements in characters by specifying --max_statement_len=<length>.
processes {<processfilter>}
    Shows queries that have run recently, filtered by processfilter value. Any plural values you wish to specify for processfilter can be expressed in comma-delimited form. Valid values for processfilter are as follows:
    [--process_ids=<process or statement ids>]
    [--users=<users>]
    [--databases=<databases>]
    [--execution_time_operator=< ">" | "<" | "<=" >]
    [--query_text]
    [--verbose]
    [--summary] (If this is specified, the output will include a separate table with the counts of each process by status.)
    [--statuses=<completed, error, running, pending, canceled>]
showactive
    Shows currently active queries.
showlongestprocess
    Shows the longest running query within the last 24 hours.
showmostactiveuser [--verbose]
    Shows the most active users within the last 24 hours.
shownoderesourceusage <sessionid> ...
    Shows per-node query resource usage for the specified queries.
showprocessexecutiontime
    Shows process execution time within the last 24 hours.
showprocessresourceusage <sessionid> ...
    Shows per-process query resource usage for the specified queries.
showrecent <count>
    Shows the most recent queries.
showsystemresourceusage <sessionid> ...
    Shows system-wide query resource usage for the specified queries.
workload_policies
    Shows the workload policies defined in the system.
workload_service_classes
    Shows the workload service classes defined in the system.

ncli netconfig Section

For clusters composed of multi-NIC machines, the network assignments feature gives you the option of segregating data-backup traffic and data-loading traffic from your query network traffic. Using this feature, you can cable a NIC interface of each node into a separate subnet that you dedicate to backup or loading traffic. You set up network assignments via the ncli netconfig command, as explained below.

The netconfig section provides commands for assigning the Aster Database functions on each node to a network by IP address or network interface. Some of this functionality is also available in the AMC’s Network Assignments panel.

Tip: Before using these commands to configure the network, all nodes must have the appropriate physical cabling to support the configurations you will make. If you attempt to enslave an uncabled interface, ncli will experience a long timeout while it attempts to configure the network.

The syntax to run a command in the netconfig section looks like this example:

$ ncli netconfig showsystem

which returns results like:

Current Network State
+------------+----------------------------------+------------+-------------+---------
| Node IP    | Node ID                          | IP Address | Netmask     | Gateway
+------------+----------------------------------+------------+-------------+---------
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | 10.60.11.5 | 255.255.0.0 | 10.60.0.1
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | 10.60.11.6 | 255.255.0.0 | 10.60.0.1
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | 10.60.11.7 | 255.255.0.0 | 10.60.0.1
+------------+----------------------------------+------------+-------------+---------
3 rows

table continued...
+-----------------------+------------+----------------+
| Bonding Enabled (Y/N) | Interfaces | Bonding Master |
+-----------------------+------------+----------------+
| N                     | eth1       | N/A            |
| N                     | eth0       | N/A            |
| N                     | eth0       | N/A            |
+-----------------------+------------+----------------+

or, to see IP addresses assigned to the various Aster Database functions, issue:

$ ncli netconfig showfunctionips

which returns results like:

IPs for each function
+------------+----------------------------------+----------+------------+
| Node IP    | Node ID                          | Function | IP Address |
+------------+----------------------------------+----------+------------+
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | queries  | 10.60.11.5 |
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | loads    | 10.60.11.5 |
| 10.60.11.5 | 840101ef035870509e8d5da5a49fcea9 | backups  | 10.60.11.5 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | queries  | 10.60.11.6 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | loads    | 10.60.11.6 |
| 10.60.11.6 | 9949ee3ebd5560f738add0635e39b40a | backups  | 10.60.11.6 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | queries  | 10.60.11.7 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | loads    | 10.60.11.7 |
| 10.60.11.7 | de5dcfbe8146aa3f03f16ec2637a373b | backups  | 10.60.11.7 |
+------------+----------------------------------+----------+------------+
9 rows

Table 10 - 17: ncli netconfig section
Command
    Description
apply [--dryRun]
    Applies the differences between the network configuration and the system. Use --dryRun to test your configuration before applying it.
fromsystem
    Sets the network configuration from the system state.
inspect
    Runs tests to see if there is a mismatch in function assignments.
setconfig ip1 <1.2.3.4> netmask1 <255.255.255.0> interfaces1 <eth0,...ethn> bonding1 <y|n>
    Assigns network parameters for this node for one or more interfaces. All parameters take an integer index <N> to indicate how the pairings align. For example, to designate different interfaces, use interfaces1, interfaces2, ... interfacesn; for IP addresses, use ip1, ip2, ... ipn. Parameters to configure are:
    • ip<N>: The IP address.
    • netmask<N>: The network mask.
    • gateway<N>: The gateway (optional).
    • usebonding<N>: 'y' or 'n' to explicitly enable bonding. If not specified, it defaults to 'y' when more than one interface is specified.
    • interfaces<N>: Comma-separated list of Ethernet interfaces to use.
    Note that setconfig applies only to AMOS installs. For UMOS installs, these configurations are set through the operating system.
setfunctions <eth*> loads <1.2.3.4> backups
    Assigns functions by interface or IP address, or specify [-clear] to erase them.
showconfig
    Shows currently configured network parameters.
showfunctionips
    Displays the IP addresses used for each function.
showfunctions
    Shows currently configured functions.
showsystem
    Shows current networking state.

ncli netconfig Examples

Setting up NIC bonding

In the following example, setconfig is used to set up bonding and assign NICs to IP addresses. Note that this example applies only to AMOS installs. UMOS multi-NIC installations require NIC bonding to be set up in the OS prior to installing Aster Database.
$ ncli netconfig setconfig ip1 192.168.60.100 netmask1 255.255.255.0 interfaces1 eth0,eth1,eth2 bonding1 y ip2 192.168.25.50 netmask2 255.255.255.0 gateway2 192.168.25.30 interfaces2 eth3

Network Configuration
+----------------+---------------+---------------+-------------+----------------+
| IP Address     | Netmask       | Gateway       | Bonding Y/N | Interfaces     |
+----------------+---------------+---------------+-------------+----------------+
| 192.168.60.100 | 255.255.255.0 | -             | Y           | eth0,eth1,eth2 |
| 192.168.25.50  | 255.255.255.0 | 192.168.25.30 | N           | eth3           |
+----------------+---------------+---------------+-------------+----------------+
2 rows

The configuration is then verified using --dryRun to find any issues before it is applied:

$ ncli netconfig apply --dryRun

The output supplies information on how the configuration will be applied:

Operations required to apply
+--------------------------------------------------------------------+
| Operation                                                          |
+--------------------------------------------------------------------+
| Clear ip settings on interface eth1                                |
| Take down interface eth1                                           |
| Create bond bond0                                                  |
| Set ip settings on interface bond0 to 192.168.60.100:255.255.255.0 |
| Add slaves eth0,eth1,eth2 to bond0                                 |
| Add default gateway 192.168.25.30 for eth3                         |
+--------------------------------------------------------------------+
6 rows

Warning! Applying the network settings is accomplished by restarting network services with the new settings. Because of this, any operations that are currently running over the network will be interrupted. Be sure that there are no active queries before applying network settings.
After any necessary preparations are made, the command to apply the network configuration is issued:

$ ncli netconfig apply

which returns results like:

Current Network State
+----------------+---------------+---------------+-------------+----------------+
| IP Address     | Netmask       | Gateway       | Bonding Y/N | Interfaces     |
+----------------+---------------+---------------+-------------+----------------+
| 192.168.60.100 | 255.255.255.0 | -             | Y           | eth0,eth1,eth2 |
| 192.168.25.50  | 255.255.255.0 | 192.168.25.30 | N           | eth3           |
+----------------+---------------+---------------+-------------+----------------+
2 rows

Assigning Aster Database functions to subnets

This example shows assignment of Aster Database functions to subnets. The example assigns loads and backups to use only the IP address 192.168.25.50. Query traffic remains on the default IP address 192.168.60.100.

$ ncli netconfig setfunctions 192.168.25.50 loads 192.168.25.50 backups

+----------+---------------+
| Function | IP Address    |
+----------+---------------+
| queries  | 192.168.60.100|
| loads    | 192.168.25.50 |
| backups  | 192.168.25.50 |
+----------+---------------+
3 rows

ncli statsserver Section

Normally, you would use the AMC for this type of information. See “Admin: Administrative Operations” on page 486. The ncli statsserver commands are good diagnostic tools to use if the AMC is not available.
The syntax to run a command in the statsserver section looks like this example:

$ ncli statsserver showclusterstatus

which returns results like:

ClusterStatus
+--------------------------+--------+
| property                 | value  |
+--------------------------+--------+
| clusterStatus            | Up     |
| replicationFactor        | 2      |
| minimumReplicationFactor | 1      |
| goalReplicationFactor    | 2      |
| clusterType              | ec2    |
| distroName               | redhat |
+--------------------------+--------+
6 rows

IncorporationStatus
+----------------------+------------+
| property             | value      |
+----------------------+------------+
| activating           | False      |
| replicating          | False      |
| activationImbalanced | False      |
| dataImbalanced       | False      |
| timestamp            | 1338916002 |
+----------------------+------------+
5 rows

BackupStatus
+----------+-------+
| property | value |
+----------+-------+
+----------+-------+
0 rows

Table 10 - 18: ncli statsserver section
Command
    Description
showactivestatements [resolveRunningOrPending]
    Shows running statements in the StatsServer. Use resolveRunningOrPending to distinguish, among statements that have not yet ended, between those that are running and those that are pending.
showclusterstatus
    Shows the cluster status from the StatsServer.
showhwstats [--nodeids] [--metrics] [--aggregate] [--function=[latest|average]] [--sincetime]
    Shows hardware stats from the StatsServer. Use the flags to filter the results. See “Hardware Stats Tab” on page 468.
shownodes [--nodeids] [epochtime | utctime]
    Shows a list of nodes from the StatsServer. Use the flag --nodeids to filter by node and epochtime or utctime to specify the time format.
showphases <statementid>
    Shows statement phases from the StatsServer, optionally filtered by statementid.
showsessions
    Shows active sessions in the StatsServer.
showstatements [resolveRunningOrPending]
    Shows statements in the StatsServer. Use resolveRunningOrPending to distinguish, among statements that have not yet ended, between those that are running and those that are pending.
showvworkers
    Shows the vworkers from the StatsServer.

ncli events Section

The events section provides commands to view and configure event subscriptions in the Aster Database Event Engine. See “Event Monitoring with the Event Engine” on page 122 for information about event subscriptions. When you set up an event subscription, you are subscribing to be notified via SNMP or email whenever events of a particular type occur. ncli is the only way to add and manage subscriptions. The commands in the events section run against the queen, even if executed from a worker node.

The syntax to run a command in the events section looks like this example:

$ ncli events listsubscriptions

Event Subscriptions
+--------+------------+--------------+--------------+---------------
| Sub ID | Notif Type | Min Priority | Min Severity | Component Type
+--------+------------+--------------+--------------+---------------
| 9      | snmp       | High         | FATAL        |
| 8      | snmp       | Medium       | ERROR        |
| 7      | snmp       | High         | FATAL        |
| 6      | snmp       | High         | FATAL        |
+--------+------------+--------------+--------------+---------------
4 rows

table continued...
+-----------+---------------+----------------------+
| Event IDs | Throttle Secs | Notification Details |
+-----------+---------------+----------------------+
| ST0001    | 0             | manager=10.60.11.5   |
| SY0002    | 0             | manager=10.60.11.5   |
| SY0001    | 0             | manager=10.60.11.5   |
| ST0002    | 0             | manager=10.60.11.5   |
+-----------+---------------+----------------------+

To add a new event subscription, issue a command like:

$ ncli events addsubscription --eventIds ST0003 --type snmp --manager 10.60.11.5 --minPriority high --minSeverity fatal

which displays the event subscription added, returning a result like:

Event Subscriptions
+--------+------------+--------------+--------------+----------------+-----------+
| Sub ID | Notif Type | Min Priority | Min Severity | Component Type | Event IDs |
+--------+------------+--------------+--------------+----------------+-----------+
| 5      | snmp       | High         | FATAL        |                | ST0003    |
+--------+------------+--------------+--------------+----------------+-----------+

table continued...
+---------------+----------------------+
| Throttle Secs | Notification Details |
+---------------+----------------------+
| 0             | manager=10.60.11.5   |
+---------------+----------------------+
1 rows

To see a list of required and optional parameters for an event subscription, issue the following command:

$ ncli --help events addsubscription
ncli events addsubscription <subscription args> <notification args>
Add a new subscription
Add or Edit a subscription
<subscription args>
[--id id]: Subscription ID. Required for edit
--type email | snmp
--minPriority low | medium | high
--minSeverity info | warn | error | fatal
[--componentTypes filter[,filter...]] : Filter(s) based on component type string.
[--eventIds event[,event...]] : Specific event ids
[--throttleSecs secs] : Throttle same events. 0 means don't throttle.
<email notification args>
--to address[,address..]
--from address
--smtp host[:port]
[--username username --password password]
<snmp notification args>
--manager host[:port]

For a list of valid event IDs, see “Supported Events” on page 128.

Table 10 - 19: ncli events section
Command
    Description
addsubscription <subscription args> <notification args>
    Adds a new subscription.
deletesubscription <sub id>
    Deletes an existing subscription.
editsubscription <subscription args> <notification args>
    Edits an existing subscription.
listsubscriptions [sub id]
    Lists existing subscriptions to events, optionally filtered by subscription identifier.

ncli util Section

Typically, ncli commands generate output in the form of named tables. The util section allows the output of other ncli commands to be used as table data sources in a SELECT SQL query. This lets users combine the output of multiple ncli commands using JOINs, GROUP BYs, or other constructs.

The basic syntax to run a command in the util section looks like this example:

$ ncli util sql <SELECT sql query>

Note that only SELECT statements are permitted. You might issue the following to view the version of the cluster:

$ ncli util sql "SELECT distinct build_version FROM (ncli node showversion)"

which displays results like:

+---------------+
| build_version |
+---------------+
| beehive-r28783|
+---------------+
1 rows

Review the help section for details about running more complex queries by issuing the following:

$ ncli --help util sql

Table 10 - 20: ncli util section
Command
    Description
sql <select sql query>
    Allows you to use an ncli command within an SQL query.

ncli sysman Section

The sysman section provides commands related to the system manager process.
The syntax to run a command in the sysman section looks like this example:

$ ncli sysman memcheck

which returns a result like:

------------------------------------------------
class 1 [ 8 bytes ] : 496 objs; 0.0 MB; 0.0 cum MB
class 2 [ 16 bytes ] : 233 objs; 0.0 MB; 0.0 cum MB
class 3 [ 32 bytes ] : 167 objs; 0.0 MB; 0.0 cum MB
...
class 42 [ 4608 bytes ] : 3 objs; 0.0 MB; 0.6 cum MB
class 47 [ 8192 bytes ] : 1 objs; 0.0 MB; 0.7 cum MB
class 53 [ 16384 bytes ] : 2 objs; 0.0 MB; 0.7 cum MB
------------------------------------------------
PageHeap: 1 sizes; 0.8 MB free
------------------------------------------------
206 pages * 1 spans ~ 0.8 MB; 0.8 MB cum; unmapped: 0.0 MB; 0.0 MB cum
Normal large spans:
Unmapped large spans:
>255 large * 0 spans ~ 0.0 MB; 0.8 MB cum; unmapped: 0.0 MB; 0.0 MB cum
------------------------------------------------
DevMemSysAllocator: failed_=0
SbrkSysAllocator: failed_=0
MmapSysAllocator: failed_=0
------------------------------------------------
MALLOC: 3145728 ( 3.0 MB) Heap size
MALLOC: 1581104 ( 1.5 MB) Bytes in use by application
...
MALLOC: 1 Thread heaps in use
MALLOC: 5242880 ( 5.0 MB) Metadata allocated
------------------------------------------------

Sysman Process Memory Statistics(PID 11535)
+---------------------+----+
| Virtual Memory Size | 92 |
| Resident Set Size   | 9  |
+---------------------+----+

Table 10 - 21: ncli sysman section
Command
    Description
demerits
    Reports on vworker/node demerits.
logclusterview
    Logs detailed information in the sysman log.
memcheck
    Shows report on memory utilization.
ping
    Checks to determine if sysman is up.
showactivitystatus
    Shows sysman activity status.
showreplicas vworkerid [vworkerid ...]
    Shows vworker replicas, optionally filtered by vworkerid.
showrf
    Shows cluster’s current and target replication factor.
showversion
    Shows sysman version string.
showvworkers
    Shows cluster view of virtual workers.

ncli database Section

The database section provides commands related to the upgrade process. The syntax to run a command in the database section looks like this example:

$ ncli database backupmetadata

Table 10 - 22: ncli database section
Command
    Description
backupmetadata
    Dumps all the Postgres metadata catalogs of all the databases on all nodes. These dumps can be used to re-create the preupgrade metadata structures, if necessary.
steadystatechecks
    Checks for prepared transactions and zombie databases (databases that were previously dropped, but of which some remnants remain).

ncli Flags

Flags in ncli allow you to modify the actions of commands. For example, you can constrain commands to only worker nodes, only loader nodes, only the local machine, or only specific IP addresses. You may also use flags to control output by formatting reports. There are two types of flags:

• High level flags affect any command they are used with.
• Command related flags apply only to particular commands.

The online help for command related flags appears with the command they are used with. The online help for high level flags appears when invoking the main online help by issuing:

$ ncli --help

Command related flags are listed with their commands in the “ncli Command Reference” on page 138. High level flags are discussed in this section. The syntax for using the high level flags is as follows:

$ ncli <flag>=<parameter1,parameter2,...parametern> <section> <command>

Limiting Actions of ncli

To limit the actions of ncli to specific machines, use the --hosts, --hostsfile, or --hosttype flag. These flags are used most commonly with the node section, but may be used with any command. To limit the number of node threads, use the --nodethreads flag.
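The high level flag syntax above (flag=comma-separated values, then section, then command) is regular enough to assemble in a script. The following Python sketch only builds the argument list and does not run ncli; the helper name build_ncli_command is hypothetical, and it assumes every flag passed in takes the --name=value form shown in the syntax line.

```python
import shlex


def build_ncli_command(section, command, flags=None, args=()):
    """Return an argv list of the form: ncli <flag>=<p1,p2,...> <section> <command>."""
    argv = ["ncli"]
    for name, values in (flags or {}).items():
        if isinstance(values, str):
            values = [values]
        # Multiple parameters for one flag are joined with commas.
        argv.append("--" + name + "=" + ",".join(values))
    argv.extend([section, command])
    argv.extend(args)
    return argv


argv = build_ncli_command(
    "node", "showsummaryconfig", flags={"hosttype": ["worker", "loader"]}
)
command_line = shlex.join(argv)
print(command_line)
```

Keeping the command as a list rather than a single string avoids quoting problems if it is later handed to a process launcher.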
--hosts flag

The --hosts flag tells ncli to return only results related to the specified Aster Database node or nodes. You may use IP address(es), hostname(s), “localhost”, or a mixture of these. Some examples using the --hosts flag are:

$ ncli --hosts=localhost node showsummaryconfig
$ ncli --hosts=10.75.10.221,10.75.10.222 vworker showdetail

--hosttype flag

The --hosttype flag tells ncli to return only results related to the specified type of Aster Database node. It works with the node section and accepts the argument worker, queen, or loader. To specify more than one type, a comma-separated list may be used. For example:

$ ncli --hosttype=worker,loader node showsummaryconfig

--hostsfile flag

Allows you to supply a file containing one host per line as input. The command issued will act on the hosts listed in the file. An example using the --hostsfile flag is:

$ cat > /tmp/myhosts
10.50.129.100
10.50.129.101
10.50.129.102
Ctrl-D
$ ncli --hostsfile=/tmp/myhosts node showcmd cat /proc/loadavg

Command Output for cat /proc/loadavg
+-------------+------+------------------------------+--------+
| Node        | exit | stdout                       | stderr |
+-------------+------+------------------------------+--------+
| 10.75.10.23 | 0    | 6.83 12.24 13.37 4/954 23178 |        |
| 10.75.10.25 | 0    | 14.95 11.08 9.35 3/959 19246 |        |
+-------------+------+------------------------------+--------+
2 rows

--nodethreads

The --nodethreads flag allows you to specify the number of node threads to run in parallel. For ncli operations that need to gather information from multiple worker nodes, ncli spawns SSH processes to connect to the nodes via threads. By default, the size of the thread pool used is 20. This means that if you are operating on a cluster of 40 nodes, the initial communication will be with only 20 nodes.
As the threads complete their operation on the individual nodes, they start running the same operation on the nodes that remain. The data is still fetched from all 40 nodes, but the work is parallelized across at most 20 nodes at any one time. The --nodethreads flag provides the option to control this degree of parallelism. An example is:

$ ncli --nodethreads=10 node runonall uptime

--vworkers

Specifies the vworkers to act upon.

Formatting and Sorting Flags

The formatting and sorting flags allow you to change the way results are displayed. They include --tablefilterregex, --tableformat, --tabletype, and --tablesortcolname.

--tablefilterregex

When this flag is set, it is treated as a regular expression that table titles are matched against. Only tables with titles that match the regular expression are displayed. This is useful if a particular ncli command outputs multiple tables, but only a single table is desired.

--tableformat flag

The --tableformat flag tells ncli to return the results in a particular table format. Available format options are json or cli. The cli option formats results as a table for on-screen display; the output may also be written to a file. The default setting is cli and it does not need to be specified. For example, issuing:

$ ncli node show

gives the following result, in cli format:

+---------------+--------+--------+
| Node          | Type   | Status |
+---------------+--------+--------+
| 10.50.129.100 | queen  | Active |
| 10.50.129.101 | worker | Active |
| 10.50.129.102 | worker | Active |
+---------------+--------+--------+
3 rows

The json option formats results in JSON (JavaScript Object Notation) format. For more information on JSON, see http://www.json.org. Use the json option if you are going to parse the output with a script, for example.
Capture the JSON output as a string and pass that string to the script for parsing. You can also use the json option to view the results on screen in order to check what is being passed. Here is an example:

    $ ncli --tableformat=json node show

gives the result:

    {
      "header": ["Node", "Type", "Status"],
      "rows": [
        ["10.50.129.100", "queen", "Active"],
        ["10.50.129.101", "worker", "Active"],
        ["10.50.129.102", "worker", "Active"]
      ]
    }

--tabletype flag
Options for the --tabletype flag are normal and diff. The --tabletype=normal flag displays one row for each row of data returned. The --tabletype=diff flag tells ncli to return only one row for each unique result, effectively grouping by like values. This is useful when comparing the settings of many nodes.

Here are two code examples with their resulting reports, first with the default (normal) tabletype:

    $ ncli node show
    +--------------+--------+--------+
    | Node         | Type   | Status |
    +--------------+--------+--------+
    | 10.75.10.11  | queen  | Active |
    | 10.75.10.12  | worker | Active |
    | 10.75.10.13  | worker | Active |
    | 10.75.10.14  | worker | Active |
    | 10.75.10.15  | worker | Active |
    . . .
    | 10.75.10.23  | worker | Active |
    | 10.75.10.240 | loader | Active |
    | 10.75.10.241 | loader | Active |
    | 10.75.10.243 | loader | Active |
    | 10.75.10.25  | worker | Active |
    | 10.75.10.26  | worker | Active |
    +--------------+--------+--------+
    18 rows

and then with --tabletype=diff:

    $ ncli --tabletype=diff node show
    +-------+--------------+--------+--------+
    | Count | Sample Node  | Type   | Status |
    +-------+--------------+--------+--------+
    | 1     | 10.75.10.11  | queen  | Active |
    | 3     | 10.75.10.240 | loader | Active |
    | 14    | 10.75.10.13  | worker | Active |
    +-------+--------------+--------+--------+
    3 rows

Notice how the resulting table shows the results grouped by the column ‘Type’.
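The JSON layout shown above (a "header" list plus a "rows" list) is easy to consume from a script. The following Python sketch parses that layout and then groups the rows in the spirit of --tabletype=diff. It is an illustration only: the function names load_table and diff_view are hypothetical, the sample text is copied from the three-node example, and picking the first node seen as the "Sample Node" is an assumption, since the guide does not say how ncli chooses it.

```python
import json

# Sample text in the shape produced by: ncli --tableformat=json node show
SAMPLE = """
{ "header": ["Node", "Type", "Status"],
  "rows": [ ["10.50.129.100", "queen", "Active"],
            ["10.50.129.101", "worker", "Active"],
            ["10.50.129.102", "worker", "Active"] ] }
"""

def load_table(text):
    """Turn ncli-style JSON output into a list of {column: value} dicts."""
    doc = json.loads(text)
    return [dict(zip(doc["header"], row)) for row in doc["rows"]]

def diff_view(records, key_columns, node_column="Node"):
    """Group records by key_columns, similar in spirit to --tabletype=diff.

    Returns (count, sample_node, *key_values) tuples sorted by count.
    Using the first node seen as the sample node is an assumption.
    """
    groups = {}
    for rec in records:
        key = tuple(rec[col] for col in key_columns)
        if key not in groups:
            groups[key] = [0, rec[node_column]]
        groups[key][0] += 1
    rows = [(count, sample) + key for key, (count, sample) in groups.items()]
    return sorted(rows, key=lambda row: row[0])

if __name__ == "__main__":
    for row in diff_view(load_table(SAMPLE), ["Type", "Status"]):
        print(row)
```

In practice the text would come from capturing the output of the ncli command rather than from a constant; the parsing step stays the same.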
The --tabletype=diff flag is especially useful for detecting discrepancies in configuration or performance among nodes. The following example shows the status of vworker processes grouped by status and node. It gives you a quick look at how many active and inactive vworkers exist on the nodes, and whether there is data skew, without having to wade through a list of every single vworker’s status:

    $ ncli --tabletype=diff vworker show vworkers
    +-------+-------------+-------------+---------------+
    | Count | Sample Node | Status      | Vworker Count |
    +-------+-------------+-------------+---------------+
    | 1     | 10.75.10.11 | Deactivated | 0             |
    | 1     | 10.75.10.13 | Deactivated | 4             |
    | 1     | 10.75.10.11 | Active      | 1             |
    | 2     | 10.75.10.17 | Active      | 6             |
    | 4     | 10.75.10.12 | Deactivated | 6             |
    | 9     | 10.75.10.25 | Deactivated | 5             |
    | 12    | 10.75.10.26 | Active      | 5             |
    +-------+-------------+-------------+---------------+
    7 rows

--tablesortcolname flag
The --tablesortcolname flag tells ncli to return the results in a particular order by sorting on the specified column. It takes as its argument the name of any column in the command’s results. For example, to view details for vworkers sorted by partition, issue:

    $ ncli --tablesortcolname='Partition' vworker showdetail

If you specify a column that does not exist, ncli returns an empty table with column headings so you can see the available column names.

Miscellaneous Flags

--help
Prints the online help for ncli.

--verbose
Increases verbosity by providing more detailed error messages in case of errors.

--version
Displays the version number of ncli.

--flagfile
Inserts flag definitions from the given file into the command line. This is useful when specifying many flags for one command.

--minLogSeverity
Sets the minimum log severity to the supplied integer for ncli.
Valid values are 2 (verbose) through 7 (fatal). The default value is 4.

--undefok
Used to specify a comma-separated list of flag names that may be specified on the command line, even if ncli does not define a flag with that name. Note that flags in this list that take arguments MUST use the --flag=value format. The default value for --undefok is ''.

CHAPTER 11 Admin: Executables

The Aster Database Executables framework is a set of script management tools that allow Aster Database administrators to create, manage, and run custom scripts on one or many nodes in their cluster. Scripts can be shell scripts or SQL scripts, or they can invoke SQL-MapReduce functions. Scripts can be run on any node in the cluster, or they can be restricted to run only on specified nodes.

AMC Executables provide an easier way to diagnose cluster issues, such as data skew, and perform routine cluster maintenance. Prior to AMC Executables, you could create custom scripts to provide these benefits, but they had to be run by a user logging in through a shell, which could be inconvenient depending on security and IT policies. There was also no provision for creating a library of scripts.

This section explains how to store, run, and manage your Aster Database scripts using the AMC. The following topics are covered:
• Executables Tab
• Running Scripts
• Creating Scripts
• Best practices for building scripts
• Upgrades

Executables Tab
You must be logged in as an admin user to access the Executables tab. Access the Executables tab in the AMC by clicking on the Admin tab and choosing Executables from the submenu. The Executables tab has two views:
• Executables Library
• Executable Jobs

Executables Library
The Executables Library provides a list of all available scripts, both out-of-the-box and custom.
Each script includes information on what variables are needed to run it, who created it, and when it was created. There is a “Run Now” button to invoke the script, a pencil icon to edit the script, and an “X” icon to delete it (if it is a custom script).

Figure 31: Executables Library in AMC

Out-of-the-Box Scripts
Aster provides five out-of-the-box scripts, which are installed automatically with a clean install or upon upgrading. These scripts perform cluster administration tasks, such as finding data skew and determining table information such as size. These scripts cannot be modified or deleted, but they serve as a useful reference when creating your own custom scripts. Many of the scripts “cascade”, which means that if they are acting on a parent table, they automatically act on all of its descendants as well. The Aster out-of-the-box scripts are as follows:
• All Table Sizes* - gets the table size of all tables in the database (cascade).
• Data Skew Detector - identifies any tables in a database with statistically significant skew.
• Table Info - gets the table statistics of the specified table in the database (cascade).
• Table Size* - gets the table size of the specified table and database (cascade). This script sums all the results across Aster Database. It displays results per vworker and supplies the total across the cluster.
• Table Size (Details)* - gets the table size of the specified table and database on each vworker (cascade).

* Note that table size information reflects the Postgres view of the data. If the table is compressed, the script does not report the compressed size; it reports the raw data size. However, you can determine whether the table uses no, low, medium, or high compression. The scripts also report whether a table is defined as using row or columnar storage.

You may view the code from these scripts by selecting the pencil icon.
A window will appear with information about the script and the code itself, all read-only. By clicking the “Clone” button, you will create a copy of the script that may be edited for use in creating your own scripts.

Executable Jobs
To view the status of executable jobs, select the Executable Jobs tab. You will see a table listing all jobs with information such as their status (Running, Completed, or Error), who submitted the job, the start, end, and elapsed time, a link to see the output, and a “Cancel” link for jobs in progress. Note that the list of jobs is automatically refreshed periodically. You can click on the column headers to sort by a specific column. The history of script runs is maintained for a week, with a maximum history of 1000 runs.

Figure 32: Executable Jobs in AMC

Viewing Output
To view the output of a job, click the Output link for that job. The following example shows output from the out-of-the-box script, Data Skew Detector.

Figure 33: Viewing the output of a job in AMC

Running Scripts
To run scripts, you must be logged in to the AMC as an administrator. Each script runs immediately when you click the “Run Now” button. The following list provides detailed steps for running scripts.
1 Log in to the AMC as an administrator.
2 Using the navigation tabs at the top of the page, go to Admin > Executables. The Executables Library panel should appear.
3 Find the script you wish to run in the list of available scripts, and click Run Now.
4 A window will prompt you to enter information (script variables) necessary to run the script. Note that not all scripts require the same variables, so this screen may look different depending upon which script you have chosen to run.
5 Enter the variables, and optionally select “Save as Template” to save a template that automatically uses the same variables. If saving as a template, give your template a name. This is the name that will be displayed in the Executables Library.
6 When finished, click Run Now.

Figure 34: Running an Executable Script in AMC

7 The script will run immediately. To view progress and output, select the Executable Jobs tab.
8 If you have chosen to save the variables entered when running the script as a template, you can access the template by finding it in the Executables Library. The following example shows a template created when running the out-of-the-box script, Data Skew Detector.

Figure 35: Accessing an Executable Template in AMC

Best Practices for Running Scripts
The following information explains AMC Executables and their use in more detail:

Memory limits
Aster Database enforces a memory usage limit of 200MB for scripts you run. If your script exceeds the limit, the script will be cancelled and an error message will be issued.

Disk space
Aster Database enforces a maximum disk usage of 5GB for scripts. If your script exceeds the limit, the script will be cancelled and an error message will be issued.

Workload management
Your Aster Database workload management rules apply to all SQL jobs you run via the Executable Jobs tab, but workload management rules do NOT apply to non-SQL scripts that you run. This means that before you run shell scripts on the cluster, you should consider the performance impact they will have on other Aster Database users.

Logging
Note that the Executable Jobs tab will show the script as having been run by the database user who invoked it, but in the Aster Database logs, the scripts are logged as having been run directly from the AMC (i.e. by the extensibility OS user).
If those scripts, in turn, run SQL scripts via ACT, then the SQL jobs will be logged under the user name that was passed by the script to invoke ACT.

Capturing runtime information
AMC Executables are run in the context of a bash shell (bash -c <command>). Stdout and stderr are redirected to temp files. AMC picks up stdout, stderr, and the status code when a worker thread has finished running the script. These are displayed in the AMC Executable Jobs tab, and can be viewed by selecting “Output” for a specific job. You can use the information displayed by stdout to troubleshoot your custom scripts.

Creating Scripts

Requirements
The following are requirements for creating scripts:
• The script can be a shell script (Perl, Python, bash) or a SQL script. It can also call a SQL-MR function.
• The script must specify its variables and their types. The following variable types are supported: Boolean, integer, ipaddress, multiline text, password, string. The AMC will do basic validation against these types at runtime. These variables are made available to the script as environment variables. During script definition, you will define the list of variables and their types.
• When loading a script, you assign it a category from the list provided. This list is supplied by Aster and cannot be edited.

Variables
These are the rules for using variables in your scripts:
1 Variable names can only consist of the following characters: a-z, A-Z, 0-9, and _ (underscore).
2 Variables are identified in SQL statements by the '&' prefix.
3 To get a literal '&' in the output stream, use '&&' in the input stream.
4 Variable names terminate when an invalid variable character is encountered.
5 If a variable name is terminated by the '.' character, the '.' character is not output in the output stream.
This permits strings to be appended to variable values without creating unwanted whitespace.

SQL Scripts
SQL scripts with AMC Executables are run using ACT (which is launched by the AMC using the extensibility OS user). You enter the SQL script directly into the New Executable dialog box, and AMC Executables will launch ACT to run the script at runtime.

This example shows how to create a SQL script that queries the system table nc_all_child_partitions to retrieve a list of all partitions in the database:
1 Create the SQL script. For this example, we will use the SQL script:

    SELECT * from nc_all_child_partitions;

2 Go to Admin > Executables.
3 From the Executables Library tab, click the New Executable button.
4 Enter the name of the executable and description in the provided fields.
5 Select a category from the drop-down list. Categories are supplied by Aster and cannot be added to or modified.
6 Choose Yes or No for “Is this language SQL?” For this example, we will choose Yes, which allows AMC to automatically create the variables needed (database, user, and password) to connect to the database through ACT.
7 Either upload your script file or enter the script directly into the source code text area. For this example, we will enter it directly.

Figure 36: Creating a new Executable in AMC

8 Supply any additional variables needed to run the script in the Variable Inputs section by clicking the Define Variables button. The Executable Variables screen appears.
9 Click the Add Variable button.
10 Define the variable by specifying a label, name, type, help text, and whether it is required for the script to run.
11 Repeat steps 9 and 10 to define each additional variable.
12 Click Save.
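When a SQL script such as the one created in the steps above references variables, the '&' rules from the Variables list apply. The following Python sketch shows one way those five rules could be implemented, purely to make their interaction concrete. It is not the AMC's code: the function name substitute is hypothetical, and raising KeyError for an undefined variable is an assumption, because the guide does not describe that case.

```python
import re

# '&&' is a literal ampersand; '&name' is a variable reference; a single '.'
# directly after the name acts as a terminator and is consumed.
_TOKEN = re.compile(r"&&|&([A-Za-z0-9_]+)(\.?)")

def substitute(script, variables):
    """Expand &variables in script text following the five rules above.

    Raising KeyError for an undefined variable is an assumption; the guide
    does not say what the AMC does in that case.
    """
    def replace(match):
        name = match.group(1)
        if name is None:  # the token was '&&'
            return "&"
        return str(variables[name])
    return _TOKEN.sub(replace, script)

if __name__ == "__main__":
    sql = "SELECT * FROM &schema..&table WHERE note = 'R&&D';"
    print(substitute(sql, {"schema": "public", "table": "sales"}))
    print(substitute("&prefix.2011", {"prefix": "pt_"}))
```

Note the doubled '.' in &schema..&table: the first '.' terminates the variable name and is dropped, and the second is the literal separator that remains in the output.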
Figure 37: Saving Variables for an Executable in AMC

Cluster Utility SQL-MapReduce Functions
In order to invoke any SQL-MapReduce functions via an Executable Job, those functions must first be installed (using ACT). You run the function just as you would any SQL statement, that is, following the instructions in “SQL Scripts”. The database user invoking ACT must have the correct permissions to run the SQL-MR function. As with SQL scripts, AMC automatically creates the variables for database access (database, user, and password), but you may also create additional variables in your script.

The out-of-the-box scripts use the following SQL-MR functions. These functions are automatically installed as part of the Aster Database installation. They are all system functions that operate on partitions. These functions are meant to be invoked through AMC Executables, and can only be run by an administrator user, but they can be executed on any schema as long as the designated database user has the necessary permissions on that schema. Note that if you type \dF in ACT, these out-of-the-box functions will not appear, as they are internal-only functions. You can, however, use these in your own custom scripts.

Note that for these functions, the "SELECT 1" and “PARTITION BY 1” clauses are merely used to invoke the functions. The functions will still work if you operate on a table instead.

The available out-of-the-box SQL-MR functions are:
• nc_genericlocalquery
• nc_tablesize
• nc_skew
• nc_recursive

nc_genericlocalquery
The nc_genericlocalquery function connects to the local Postgres instance and issues the SELECT query you give it. The function only allows SELECT queries.

Syntax
This function operates within each vworker and creates a SELECT clause as follows: "SELECT columns FROM from_clause."
    SELECT * FROM nc_genericlocalquery(
        ON (SELECT 1) PARTITION BY 1
        [DATABASE('database_name')]
        [COLUMNS('column1' [, ...])]
        FROMCLAUSE('from_clause')
    );

Arguments
DATABASE: Optional. You can specify a database, specify a list of databases, or omit this parameter. If you do not specify a database, the query is run on all databases. This works when querying pg_catalog tables, but if your query contains tables specific to a particular database, you will get an error message.
COLUMNS: Optional. This parameter specifies the names of the columns to be selected. If this parameter is not used, the SQL query constructed is "SELECT * FROM from_clause", which returns all columns.
FROMCLAUSE: Required. This parameter is either the table or subquery from which the columns are selected. Tables that are frequently used within the FROMCLAUSE include the Postgres catalog tables pg_class, pg_tablespace, pg_namespace, and pg_inherits.

Output
The output includes the vworker, IP address, and the columns specified in the COLUMNS clause.
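To make the relationship between the COLUMNS and FROMCLAUSE arguments concrete, here is a small Python sketch of how the per-vworker query text described above could be assembled. This is an illustration, not the function's actual implementation: the name build_local_query is hypothetical, and joining the column expressions with commas is an assumption based on the "SELECT columns FROM from_clause" description.

```python
def build_local_query(from_clause, columns=None):
    """Assemble the SELECT text described above (a sketch, not Aster's code).

    columns: optional list of column expressions, as passed to COLUMNS(...).
    Joining them with ', ' is an assumption based on the guide's description.
    """
    select_list = ", ".join(columns) if columns else "*"
    return f"SELECT {select_list} FROM {from_clause}"

if __name__ == "__main__":
    # With COLUMNS given, only those expressions are selected.
    print(build_local_query("pg_class", ["relname as table_name", "relpages"]))
    # With COLUMNS omitted, the query falls back to SELECT *.
    print(build_local_query("pg_class"))
```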
Example

    SELECT * FROM nc_genericlocalquery (
        ON (select 1) PARTITION BY 1
        COLUMNS (
            'relname as table_name',
            'nspname as schema_name',
            'table_type',
            'compression',
            'parent_name',
            'pg_total_relation_size(oid)::bigint as total_size'
        )
        DATABASE('beehive')
        FROMCLAUSE ('(SELECT a.oid, a.relname, b.nspname,
                d.inhparent::regclass as parent_name,
                CASE WHEN EXISTS (SELECT * FROM pg_rfile f WHERE f.rfrelid = a.oid)
                     THEN ''columnar'' ELSE ''row'' END as table_type,
                CASE WHEN c.spcname is null then ''none''
                     WHEN c.spcname = ''_bee_compress_high'' then ''high''
                     WHEN c.spcname = ''_bee_compress_medium'' then ''medium''
                     WHEN c.spcname = ''_bee_compress_low'' then ''low''
                END as compression
            FROM pg_class a
                LEFT OUTER JOIN pg_tablespace c ON (c.oid=a.reltablespace)
                INNER JOIN pg_namespace b ON (a.relnamespace=b.oid)
                LEFT OUTER JOIN pg_inherits d ON (a.oid=d.inhrelid)
            WHERE b.nspname not in (''pg_catalog'',''information_schema'',
                    ''pg_toast'', ''_bee_special'', ''nc_system'')
                AND b.nspname=''public''
            ) x'
        )
    );

Returns results like:

     ip        | vworker | table_name     | schema_name | table_type | compression | parent_name | total_size
    -----------+---------+----------------+-------------+------------+-------------+-------------+------------
     localhost | w4z     | to_be_packed   | public      | row        | none        |             | 65536
     localhost | w4z     | to_be_unpacked | public      | row        | none        |             | 65536
     ...
     localhost | w5z     | glm_output1    | public      | row        | none        |             | 65536

nc_tablesize
The nc_tablesize function (table size function) may be used to obtain the on-disk size of one or more tables.

Syntax

    SELECT * FROM nc_tablesize(
        ON (SELECT 1) PARTITION BY 1
        [info_database('database_name')]
        [password('password')]
        [info_relation('relation_name')]
    );

Arguments
INFO_DATABASE - Optional. Database or list of database names. For information about all databases, omit the clause or use *.
PASSWORD - Required. The password of the database user.
INFO_RELATION - Required.
Relation (table or index) name or list of relation names. For information about all relations, use *.

Output
The output includes the following columns:
• database - the database of the table.
• schema - the schema of the table.
• relname - the relation name of the table.
• object_type - type of the entity, either "relation" or "index".
• compression - compression level of the entity, either "None", "high", "medium" or "low".
• rootname - the name of the root in a logically partitioned table. If a table has no parent, it is considered its own root, so this column will never be empty. If the table is not a logically partitioned table, this value will be the same as relname.
• rootschema - the schema of the root in a logically partitioned table. If a table has no parent, it is considered its own root, so this column will never be empty. If the table is not a logically partitioned table, this value will be the same as schema.
• storage_type - storage type of the entity, either "column" or "row".
• size_on_disk - total number of bytes occupied by the entity on disk.
• data_size - number of bytes occupied by the entity as viewed by Postgres.
• dead_tuple_count - number of dead tuples in the entity.

Example

    SELECT database, schema, relname, size_on_disk
    FROM nc_tablesize(
        ON (SELECT 1) PARTITION BY 1
        info_database('beehive')
        info_relation('*')
    );

Returns results like:

     database | schema | relname      | size_on_disk
    ----------+--------+--------------+--------------
     beehive  | public | glm_test1    | 65536
     beehive  | public | to_be_packed | 131072
     ...
     beehive  | public | to_be_unpckd | 131072

nc_skew
The nc_skew function (table skew function) does a statistical test for skew across the cluster on a set of data (tables) you supply at runtime.
The function takes a distribution of some metric (usually the table size or number of rows) of a table over many vworkers and determines whether or not the data distribution is skewed. It tests the skewness of the table distribution by using a chi-square test. If there is no skew in a table, the test result for that table is not output.

Syntax

    SELECT * FROM nc_skew (
        ON 'input_table'
        PARTITION BY 'partition_by_columns'
        PARTITIONS('partition_by_columns')
        [ METRIC('metric_column_name') ]
        [ PVALUE('p_value') ]
        [ VWORKERCHECK('true | false') ]
    );

Arguments
PARTITIONS: Required. A list of column names that are used in the 'PARTITION BY' clause. The order can be different from the order in which they occur in the 'PARTITION BY' clause.
METRIC: Optional. The name of the column containing the metric. If not specified, the function uses the second column by default (since the partition column is often first).
PVALUE: Optional. The significance level for the chi-square test. Valid values are in the range [0,0.25]. Default value is 0.05.
VWORKERCHECK: Optional. If 'true', the function compares the number of data partitions to the number of function invocations to see if any expected data is missing. Default value is 'false'.

Output
The function outputs a table that includes the tablename, p-value, chi-square result, and the minimum, maximum, and average values of the metric for each table where skew was detected.

Example
The table below shows example Input Data from table nc_data_skew_testdata.
Table 11 - 1: Input Data from nc_data_skew_testdata

    ind  tablename  cnt
    1    table1     106
    2    table1     90
    3    table1     105
    4    table1     108
    5    table1     123
    6    table1     114
    7    table1     78
    8    table1     125
    9    table1     112
    10   table1     84
    11   table1     92
    12   table1     82
    13   table1     105
    14   table1     103
    15   table1     136
    16   table1     78
    17   table1     74
    18   table1     73
    19   table1     127
    20   table1     108
    21   table2     15
    22   table2     18
    23   table2     16
    24   table2     20
    25   table2     20
    26   table2     18
    27   table2     5
    28   table2     6
    29   table2     15
    30   table2     10
    31   table2     14
    32   table2     1
    33   table2     1
    34   table2     1
    35   table2     4
    36   table2     12
    37   table2     2
    38   table2     15
    39   table2     11
    40   table2     19
    41   table3     100
    42   table3     100
    43   table3     100
    44   table3     99
    45   table3     99
    46   table3     101
    47   table3     98
    48   table3     100
    49   table3     99
    50   table3     100
    51   table3     99
    52   table3     99
    53   table3     100
    54   table3     101
    55   table3     99
    56   table3     101
    57   table3     101
    58   table3     101
    59   table3     101
    60   table3     101

Example SQL-MR call

    SELECT * FROM nc_skew (
        ON nc_data_skew_testdata
        PARTITION BY tablename
        PARTITIONS('tablename')
        METRIC('cnt')
    ) ORDER BY tablename;

The table below shows sample output from nc_skew.

Table 11 - 2: Sample output from nc_skew

    tablename  pvalue                chisquare         min_value  max_value  avg_value
    table1     2.33907705871061e-07  67.548690064261   73         136        101
    table2     1.47257894766994e-09  80.5874439461883  1          20         11

nc_recursive
The nc_recursive function is used to do the cascading in the out-of-the-box executables. The function takes as input a set of rows representing a tree structure and processes the data to sum up all the metrics from the hierarchy at the root node. This is useful for viewing table sizes for parent tables that include the table sizes for all their children.
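As a check on the nc_skew sample output in Table 11 - 2, the chisquare column can be reproduced by hand from the cnt values in Table 11 - 1. The Python sketch below computes a Pearson chi-square statistic against a uniform expectation (every row of a table expected to hold the table's mean cnt). That nc_skew uses exactly this expectation is an inference from the fact that the formula reproduces the published chisquare values; the guide does not state it. The critical value 30.144 is the standard chi-square 0.95 quantile for 19 degrees of freedom (20 rows per table, default PVALUE of 0.05).

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic against a uniform expectation."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# cnt values per table, copied from Table 11 - 1.
TABLE1 = [106, 90, 105, 108, 123, 114, 78, 125, 112, 84,
          92, 82, 105, 103, 136, 78, 74, 73, 127, 108]
TABLE2 = [15, 18, 16, 20, 20, 18, 5, 6, 15, 10,
          14, 1, 1, 1, 4, 12, 2, 15, 11, 19]
TABLE3 = [100, 100, 100, 99, 99, 101, 98, 100, 99, 100,
          99, 99, 100, 101, 99, 101, 101, 101, 101, 101]

# Chi-square critical value for 19 degrees of freedom at the 0.05 level,
# taken from a standard chi-square table.
CRITICAL_DF19_P05 = 30.144

if __name__ == "__main__":
    for name, counts in (("table1", TABLE1), ("table2", TABLE2), ("table3", TABLE3)):
        stat = chi_square_uniform(counts)
        verdict = "skewed" if stat > CRITICAL_DF19_P05 else "not skewed"
        print(f"{name}: chisquare={stat:.4f} ({verdict})")
```

The statistics for table1 and table2 match the chisquare column of Table 11 - 2, while table3 falls far below the critical value, which is consistent with its absence from the sample output.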
Syntax

    SELECT * FROM nc_recursive (
        ON 'input_table'
        PARTITION BY partition_by_columns
        PARTITIONS('partition_by_columns')
        TABLE_NAME('table_name')
        PARENT_NAME('parent_name')
        SUM_OVER('metric1'[, ...])
        [ COMPRESSION('compression_type') ]
        [ TABLE_TYPE('table_type') ]
    );

Arguments
PARTITIONS: Required. A list of column names that are used in the 'PARTITION BY' clause. The order can be different from the order in which they occur in the 'PARTITION BY' clause.
TABLE_NAME: Required. The name of the column which contains the names of the tables.
PARENT_NAME: Required. The name of the column which contains the parent table name of each table.
SUM_OVER: Required. A list of column names which contain the metrics to be added up.
COMPRESSION: Optional. A string specifying the compression level.
TABLE_TYPE: Optional. A string specifying the type of the table.

Output
The output is the same as the data input, except that there is one row for each root level parent and the data is reported at the parent level.
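The roll-up that nc_recursive performs can be pictured with a short Python sketch: follow each table's parent link up to its root, then add the table's metrics to that root's totals. This is only a conceptual illustration for a single partition (for example, one database). The function name roll_up and the sample rows are hypothetical, and the handling of the COMPRESSION and TABLE_TYPE columns is left out.

```python
from collections import defaultdict

def roll_up(rows, sum_over):
    """Sum the given metrics of every table into its root ancestor.

    rows: dicts with 'table_name', 'parent_name' (None for a root) and
    optional metric values. Returns {root_name: {metric: total}}.
    """
    parent_of = {row["table_name"]: row["parent_name"] for row in rows}

    def root_of(name):
        seen = set()
        while parent_of.get(name) is not None:
            if name in seen:
                raise ValueError(f"cycle detected at {name!r}")
            seen.add(name)
            name = parent_of[name]
        return name

    totals = defaultdict(lambda: dict.fromkeys(sum_over, 0))
    for row in rows:
        root = root_of(row["table_name"])
        for metric in sum_over:
            totals[root][metric] += row.get(metric) or 0
    return dict(totals)

# Hypothetical rows for one database: a four-level partition hierarchy in
# which only the leaf tables carry data.
ROWS = [
    {"table_name": "pt", "parent_name": None},
    {"table_name": "pt_2011", "parent_name": "pt"},
    {"table_name": "pt_jan_2011", "parent_name": "pt_2011"},
    {"table_name": "pt_jan01_2011", "parent_name": "pt_jan_2011",
     "tablesize": 10, "another_metric": 100},
    {"table_name": "pt_jan02_2011", "parent_name": "pt_jan_2011",
     "tablesize": 21, "another_metric": 210},
]

if __name__ == "__main__":
    print(roll_up(ROWS, ["tablesize", "another_metric"]))
```

With these assumed rows, the single root pt ends up with a tablesize of 31 and another_metric of 310, one output row per root as described under Output.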
Example

Table 11 - 3:

    database tablename parent tablesize another_metric compression table_type
    beehive pt_jan02_2011 pt_jan_2011 21 210 none fact
    beehive pt_jan_2011 pt_2011 none fact
    beehive pt_2011 pt 1 fact
    beehive pt none fact
    beehive pt_jan01_2011 pt_jan_2011 none fact
    wikilogs pt_2011 pt none fact
    wikilogs pt_jan01_2011 pt_jan_2011 10 100 none fact
    wikilogs pt_jan02_2011 pt_jan_2011 21 210 none dimension
    wikilogs pt none fact
    wikilogs pt_jan_2011 none fact 10 100 pt_2011

Example SQL-MR call

    SELECT * FROM nc_recursive (
        ON nc_recursive_sample
        PARTITION BY database
        PARTITIONS('database')
        TABLE_NAME('tablename')
        PARENT_NAME('parent')
        SUM_OVER('tablesize', 'another_metric')
        COMPRESSION('compression')
        TABLE_TYPE('table_type')
    );

Table 11 - 4: Example Output from nc_recursive

    database  table_name  compression  table_type  tablesize  another_metric
    beehive   pt          hybrid       fact        31         310
    wikilogs  pt          none         hybrid      31         310

Best practices for building scripts
The following best practices will help you build successful scripts:
• Use the out-of-the-box scripts as a template whenever possible. These scripts have been tested, and can be relied upon to have proper syntax and logic. Note that out-of-the-box scripts may be examined, but not edited. You may view the code from these scripts by selecting the pencil icon. A window will appear with information about the script and the code itself, all read-only. By clicking the “Clone” button, you will create a copy of the script that may be edited for use in creating your own scripts.
• Issue several SQL commands in a single transaction. Although this is not always possible, doing so reduces the number of times ACT is launched, and therefore cuts down the potential points of failure for the script.
It is also usually faster, not only because you don't need to load and run ACT multiple times, but also because there is substantially less overhead in running a single transaction than in running multiple transactions, especially in a distributed system like Aster Database.
• Invoke ACT via bash to perform checks before issuing SQL commands. If your SQL script contains many transactions, consider invoking ACT via bash instead of using the built-in SQL script functionality in the AMC. To invoke ACT via bash, first create a SQL script and save it on the queen. When invoking ACT, you must pass the username and password of an Aster Database user who has sufficient rights to run the SQL in the script. Then create a shell script like the following to call ACT and run your SQL script:

    #!/bin/bash
    #This script acts as a wrapper around act to launch an
    #SQL script called logical_partitioning_list. The basic
    #parts of the act call are the username/password
    #credentials, options (db, etc) and the reference
    #to the SQL script file to be run.
    #You may wish to specify the database in the script
    #with -d, or ask the user.
    act -u "$FLAG_username" -w "$FLAG_password" -c "SELECT 1" -f /home/beehive/scripts/logical_partitioning_list.sql

If you are invoking ACT via bash, you can look at the exit code of ACT or the stdout/stderr data that ACT returns to determine whether to run the next command in the sequence. If you are invoking ACT via SQL scripts, then the behavior is identical to that of invoking ACT with a -f option that points to a file containing a list of SQL statements, and there is no opportunity to perform checks before issuing additional commands.

Upgrades
AMC Executables is very flexible and powerful, because you have full access to the cluster at the command-line level.
The flip side is that you are not dealing with an API that guarantees a stable set of commands through subsequent upgrades. Scripting interacts with many parts of Aster Database that could potentially change in each new version (for example, CLI changes or SQL changes). However, SQL scripts are more likely to remain functional after upgrading. To assist in troubleshooting any issues, each script is tagged with the Aster Database version number it was created under, but it is possible that upgrading may require rebuilding scripts created under a prior version.

CHAPTER 12 Using Teradata Tools to Manage Aster Database

Teradata provides a unified management environment to support both Teradata and Aster Database. This chapter describes the tools available from Teradata for managing Aster Database. See these sections for details:
• Managing Aster Database with Teradata Viewpoint

Managing Aster Database with Teradata Viewpoint
Beginning with Aster Database version 5.0, you can use Teradata Viewpoint to manage Aster Database on the Teradata Aster MapReduce Appliance. This gives database administrators one unified platform from which to view information about and perform administrative tasks for both Teradata database and Aster Database. Viewpoint is analogous to the Aster Database AMC in its functionality.

Overview
Viewpoint communicates with Aster Database through a Web service API, using SSL and HTTPS. The API enables administrators to view information about Aster Database securely through Viewpoint. It also enables performance of many administrative tasks through Viewpoint.

Note that you must use the “db_superuser” credential when accessing Aster Database through Viewpoint.
This ensures that you will have the correct permissions to view information about Aster Database and perform administrative functions.

Information available through Viewpoint
The information in this section is not meant to be exhaustive, but merely to provide an idea of what kind of information about Aster Database is available through Viewpoint. Based on this and the information about the AMC, administrators can decide which access portal makes more sense for their purposes. Most of the information Viewpoint displays about Aster Database is real time. The one exception is information about table size. Table size information is updated daily. The following information about Aster Database can be viewed in Viewpoint:

Processes and Sessions
Table 12 - 1: Process and Session information available in Viewpoint
Information Type | Information Details
Processes | Returns a list of all processes, optionally filtered by attributes.
Process Statements | Information about the statements that make up a process.
Process Phases | Information about process phases and their statuses.
Process Phase Statements | Information about the individual statements that make up a process phase.
Workload Management | Information about workload policies and service classes configured within Aster Database.
Sessions | Information about sessions (connected and historical) and their status.

Cluster and Nodes Resources
Table 12 - 2: Cluster and Node Resource information available in Viewpoint
Information Type | Information Details
Cluster Status | Information about the status of the cluster.
Replication Factor | Information about the replication factor.
Nodes | Information about the nodes in a cluster.
Node Status | Status information about the nodes in a cluster.
Storage | Storage information about the cluster.
Virtual Workers | Information about virtual workers in a cluster.
Component Statistics | Statistics about the components in a cluster.
Hardware Configuration | Information about a node’s hardware configuration, including CPU(s), RAM and CPU cache.
Tablespace | Information about tablespace compression, storage type, dead tuples, and space used.

Administrative operations available through Viewpoint
The information in this section is not meant to be exhaustive, but merely to provide an idea of what kind of administrative functions are available through Viewpoint. Based on this and the information about the AMC, administrators can decide which access portal makes more sense for their purposes. The following tasks can be performed in Aster Database through Viewpoint:

Cluster Administration
Table 12 - 3: Cluster Administration functions available in Viewpoint
Task Name | Task Details
Hard Restart | Performs a hard restart of the cluster (reboots all nodes).
Soft Restart | Performs a soft restart of the cluster (restarts Aster Database services).
Rebalance Data | Rebalances data among vworkers.
Rebalance Process | Rebalances processes among vworkers.
Upload and Distribute | Uploads a file to the cluster and distributes it to vworkers.

Workload Administration
Table 12 - 4: Workload Administration functions available in Viewpoint
Task Name | Task Details
Service Classes | Performs a hard restart of the cluster (reboots all nodes).
Workload Policies | Performs a soft restart of the cluster (restarts Aster Database services).
Save Service Classes | Rebalances data among vworkers.
Save Workload Policies | Rebalances processes among vworkers.
Network Administration
Table 12 - 5: Network Administration functions available in Viewpoint
Task Name | Task Details
Node Network Configuration | Sets a network configuration for a node.
Node Network Current Configuration | Shows the current network configuration.
Node Network Function Assignments | Assigns an Aster Database function to a network on a node.
Save Network Configuration | Saves the network configuration.
Apply Network Configuration | Applies the network configuration.
Save Network Function Assignments | Saves the network function assignments.

Executables Framework
Table 12 - 6: Executables Framework Administration functions available in Viewpoint
Task Name | Task Details
Executables Job List | Lists executable jobs.
Executables List | Lists available executable jobs.
Start Executable | Starts the designated executable.
Save Executable | Saves the executable settings.

Log Bundling
Table 12 - 7: Log Bundling functions available in Viewpoint
Task Name | Task Details
Log Bundles | Shows existing log bundles.
Create Log Bundles | Creates log bundles for transmission to Teradata Support.

For more information, see the Viewpoint documentation, available from Teradata.

Configuring Aster Database for use with Viewpoint
Before you can access Aster Database through Viewpoint, you must edit a configuration file to enable the integration. To do this:
1 Open the configuration file in your favorite editor: /home/beehive/config/dbinfocollector_default.cfg
2 Set the disabled flag to false.
3 Modify the following parameters appropriately:
• credential - the username and password of the database user that runs the statistics collection query.
This should be ‘db_superuser’ or the equivalent.
• schedule_day - the day of the week (0-6) to run the statistics collection, or 7 for every day of the week.
• schedule_time - the time of day to run the statistics collection. This should be a time when the cluster is not busy.

Troubleshooting the Viewpoint integration
If you notice that Viewpoint is not reporting accurate values in a timely manner, see the points below:
• If you find that the information displayed in Viewpoint is more than about 20 seconds out of date, this is expected only for statistics related to disk consumption (disk used, free disk still available, etc.) when compression (compressed tables or compressed indexes) is involved.
• In clusters that are running near their maximum CPU or disk I/O capacity, or in which large amounts of new or changed data need HIGH compression, delays may be even longer than 30 minutes.

CHAPTER 13
Aster Database Logging

Aster Database automatically tracks its activity in a variety of log files. The log files are useful when you need to find the cause of an error or unexpected behavior, or when you just want to confirm that an operation has taken place. You can access Aster Database log files through the AMC.
• To view log files for individual worker or loader nodes or view the system logs stored on the queen, use the Node Details tab. To display this tab, click Nodes: Node Overview, then click the IP address or name of the node. Click the Prep, System, or Kernel link to view the desired log. See “Reading Aster Database Logs” on page 480.
• To view (or create) bundles containing multiple log files, which you can send to the Teradata Aster support team along with a request for troubleshooting assistance, use the Logs tab.
To display this tab, click Admin: Logs, then click the Prepare, Download, or Send link.

The rest of this chapter provides more information about diagnostic log bundles; see “Overview of Diagnostic Log Bundles” on page 194 and “Using Diagnostic Log Bundles” on page 195.

Overview of Diagnostic Log Bundles
When an issue arises on a cluster, one of the first steps in finding the cause is to retrieve the relevant log files. Aster Database is made up of a large array of distinct services, and it produces more than 60 different logs spread across every node in the cluster. The AMC provides an easy way for you to deal with all these different logs by creating diagnostic log bundles. A diagnostic log bundle is a compressed tarball containing data used to determine the system context and diagnose Aster Database issues. This data may come in system logs from the queen and subordinate nodes (worker and loader). By using diagnostic log bundles, you can more easily send information to Teradata Aster tech support for analysis, reducing the time and effort required to diagnose system problems. Only AMC users with administrative privileges can create, download, and send diagnostic log bundles.

Using Diagnostic Log Bundles
This section explains how to do the following tasks:
• “Displaying the Diagnostic Bundle Jobs Panel” on page 195
• “Sending a Diagnostic Log Bundle” on page 196
• “Saving a Diagnostic Log Bundle on Your Local Filesystem” on page 197
• “Including All Nodes in a Diagnostic Log Bundle” on page 197
• “Making a Custom Diagnostic Log Bundle” on page 197
• “Running Custom Commands in a Diagnostic Log Bundle Job” on page 198
• “Viewing Diagnostic Log Bundle Contents” on page 199

Displaying the Diagnostic Bundle Jobs Panel
To display the diagnostic bundle jobs in AMC, choose Admin > Logs. A list of diagnostic bundle jobs is displayed.
Figure 38: AMC list of diagnostic bundle jobs

For each job, the following information is shown:

Table 13 - 1: Diagnostic Bundle Job Information in AMC
Field | Description
Job ID | System-generated unique number to identify the job.
Type | Queen or Cluster. A queen-type bundle includes only log files and information from the queen. A cluster-type bundle includes log files and information from all nodes, including the queen.
Status | Tells whether the job is currently running, completed, or failed.
Submitted by | Tells what initiated the job. “System” means the job was run automatically by the AMC. If the job was manually initiated, the username of the person who submitted the job is displayed.
Start Time | Start time of the log content. That is, the time of the first logged event included in the bundle.
End Time | End time of the log content.
Filename | Name of the log bundle file. The name indicates the time the bundle creation job was initiated.
Filesize | Size of the log bundle file in MB.
PrepareClusterBundle | Click Prepare to create a complete bundle that includes logs from the other nodes as well.
Download | Click Download to download a diagnostic log bundle.
Send to Aster Support | Click Send to send the log bundle to the support team at Teradata Aster.

Sending a Diagnostic Log Bundle
The Diagnostic Bundle Jobs panel provides links that you can use to send log bundles directly to the support team at Teradata Aster.

Note! Before you can send logs directly to Teradata Aster, you must configure the Cluster Settings and Aster Support Settings (Admin > Configuration > Cluster Settings).

To send a log bundle, follow these steps:
1 Open the Diagnostic Bundle Jobs panel in AMC (Admin > Logs).
2 To send a log bundle to the support team at Teradata Aster, click the log’s corresponding Send link in the Send to Aster Support column.
Figure 39: Sending a log file to Teradata Aster
3 In the confirmation dialog, click OK.

While the bundle is being sent, a blue progress bar appears next to the Send link. If the sending succeeds, the bar becomes green. If the sending fails, the bar becomes red. Move the mouse over the bar to display status information.

Saving a Diagnostic Log Bundle on Your Local Filesystem
You can download and save a diagnostic log bundle file on your local filesystem. This is useful if you want to view the contents of the file or if you need to send the bundle to Teradata Aster’s support team but you cannot connect to the support server URL. To download a diagnostic log bundle:
1 Open the Diagnostic Bundle Jobs panel in AMC (Admin > Logs).
2 Click the log’s corresponding Download link in the Download column.
3 Follow the instructions to save the bundle (a .gz file) on your system.

Including All Nodes in a Diagnostic Log Bundle
By default, a diagnostic log bundle contains only system logs from the queen. If you want to create a complete bundle that includes logs from the other nodes as well, you can create what is called a “cluster bundle” by clicking the Prepare link. Another way to include all nodes in a bundle is to click the Manually Initiate Diagnostic Bundle button. This displays a dialog that provides many more choices, including the choice to include queen and cluster nodes in the bundle, set a time window, and add custom commands. For more information, see the next section, “Making a Custom Diagnostic Log Bundle”.

Making a Custom Diagnostic Log Bundle
You can start a log bundle job with custom settings such as a particular date range.
1 Display the Diagnostic Bundle Jobs panel as described in “Displaying the Diagnostic Bundle Jobs Panel” on page 195.
2 Click Manually Initiate Diagnostic Bundle.
Figure 40: The Manually Initiate Diagnostic Bundle button
3 Modify the settings as desired. For example, set a custom start and end time for the contents of the log bundle.
Figure 41: Initiating Diagnostic Bundle
4 If desired, click Advanced to add custom commands. These are explained in the next section, “Running Custom Commands in a Diagnostic Log Bundle Job”.
5 Click Create Bundle Now. The job starts immediately. You cannot save the settings and schedule the job to run later.

Running Custom Commands in a Diagnostic Log Bundle Job
When you manually initiate a diagnostic log bundle job, you can optionally run additional commands and record the output of those commands as part of the log bundle. For example, you might want to run Linux commands that will help diagnose a problem, such as ps or vmstat. The commands are run on each machine where a bundle is being created. If the bundle is a queen-only bundle, the commands run on the queen. If the bundle is a queen and cluster bundle, then the commands run on all of the nodes (queen, workers, and loaders). The command must, of course, exist on every node where it will be run.

Warning! Custom commands run as user “beehive,” which is a fairly powerful user role. Be careful what commands you run, as “beehive” has broad permissions that might permit you to unintentionally disrupt the cluster.

To add custom commands to a diagnostic log bundle job, perform the following steps.
1 If the command is a custom program or shell script, copy it to every node on which it will be run, and put it in the same directory on each node.
2 Display the Diagnostic Bundle Jobs panel as described in “Displaying the Diagnostic Bundle Jobs Panel” on page 195.
3 Click Manually Initiate Diagnostic Bundle.
4 Click Advanced.
5 Additional fields appear so you can enter your custom commands. The following example shows two types of commands: a custom script and a query file passed as an argument to ACT.
Figure 42: Initiate Diagnostic Bundle
Enter any one-line command that you could normally run at the Linux command line, such as:
• A standard Linux operation; for example, ps, vmstat, ls, and so on.
• Your own shell script or custom program. Include the full directory path as well as the command name.
• An SQL command, specified by running ACT from the command line and passing in the SQL as a parameter. When you pass in the SQL as a file, the SQL file must be present on every node where the command will run.
6 Click Create Bundle Now.

If the commands succeed, the output of the commands will be included in the bundle (.tgz) file(s). If you request a queen and cluster bundle, you will get two separate .tgz files, one for the queen and one for the rest of the cluster.

Troubleshooting: If the command(s) do not run successfully, the bundling operation will usually complete anyway, but in the Diagnostic Bundle Jobs table on the Support tab you will see that the “Status” column contains a yellow exclamation point rather than a green checkmark. The word “Completed” will be underlined, and if you hover over the word or click it, you will get a short description of the problems that occurred when the AMC tried to run the command. To see the output, use the steps in the next section, “Viewing Diagnostic Log Bundle Contents”.
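As an illustration of the "own shell script" kind of custom command described above, the following is a minimal sketch of a diagnostic script that prints a process snapshot and memory statistics. The helper names run_if_present and collect_diagnostics, and the choice of tools, are assumptions made for this example and are not part of Aster Database. The sketch relies only on the statements above that a command's output is recorded in the bundle and that the command must exist on every node where it runs.

```shell
#!/bin/bash
# Hypothetical custom command for a diagnostic log bundle job.
# Everything it prints to stdout is what the bundle job would record.

# Run a tool only if it is installed on this node; otherwise print a
# notice and return non-zero so the caller can tell something was skipped.
run_if_present() {
    local tool="$1"
    shift
    echo "=== $tool $* ==="
    if command -v "$tool" >/dev/null 2>&1; then
        "$tool" "$@"
    else
        echo "$tool: not found on ${HOSTNAME:-this node}"
        return 1
    fi
}

# Collect the two tools the guide names as examples (ps and vmstat).
collect_diagnostics() {
    local status=0
    run_if_present ps aux || status=1
    run_if_present vmstat || status=1
    return "$status"
}

# Report skipped tools on stderr, but keep going so partial output is kept.
collect_diagnostics || echo "one or more tools were unavailable" >&2
```

To use a script like this, copy it to the same directory on every node and enter its full path in the custom command field, for example a hypothetical /home/beehive/scripts/collect_diag.sh. Remember that it runs as user “beehive”, so keep such scripts read-only in what they do.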
Viewing Diagnostic Log Bundle Contents
To look at the contents of a diagnostic log bundle:
1 Find the .tgz file. Do one of the following:
• Download the file as described in “Saving a Diagnostic Log Bundle on Your Local Filesystem” on page 197; or,
• Open an ssh session to the queen machine and look in the directory /primary/diagbundles. Look for a file with the same name shown in the list of diagnostic log bundle jobs. The name will look similar to YYYYMMDD_HH.MI.SS.tgz, where YYYY represents the year, MM represents the month, and so on.
2 Unzip the file. The result is a .tar file.
3 Untar the .tar file. For a queen bundle, this will yield a set of directories with names like configs, cores, customcmds, logs, meta, and sysprofile. For a cluster bundle, when you untar the bundle, you will get two directories, one named “meta” and one named “gather”. Inside the “gather” directory is a .tgz file for each worker node and loader node. If you unzip and untar one of these .tgz files, you will get the same directories shown above, with the files gathered from that node.
4 Use cd to change into the appropriate directory. For example, to see the output of any custom commands that you passed in to the job, go to the customcmds directory. Use ls to make sure one file is displayed for each of the commands that you specified. The file names are the command numbers. In our example, there would be two files named 1 and 2.
5 Use your favorite text editor to view the contents of any file in the bundle directory.

Aster Glossary
This glossary lists terms you will encounter in building and using databases and applications in Aster Database.
ACT
Aster Database Cluster Terminal (ACT) is the terminal-based SQL query client for Aster Database.

AMC
Aster Management Console is a web-based administrative console that allows you to monitor and control Aster Database.

Aster Database partitioning
See distribution (of rows).

Aster Database Data Validator
Discontinued utility in Aster Database that was used to check data.

Aster Database Loader
Also written as “ncluster_loader”, this is Aster Database’s command-line bulk loading utility. Customers are encouraged to use this rather than Bulk Feeder. It provides an alternative to the SQL INSERT statement and offers much better performance and error handling. Hint! Do not confuse Aster Database Loader with a loader node.

Aster Database replication
To provide availability, Aster Database is designed to maintain multiple copies (usually two) of your data. Maintaining these copies is called replication, and, when you create your cluster, you specify a desired replication factor that tells the system how many copies to maintain. Teradata Aster recommends running Aster Database at a replication factor of two, which means the cluster stores two copies of your data at all times. Replication is achieved by maintaining a copy of each Aster Database vworker. Recall that, in the distributed architecture of Aster Database, your data is distributed across many vworkers that do the work of retrieving data and performing calculations on the data. For a given partition, we refer to the vworker holding the active copy of your data as the “active vworker” and the one holding the backup copy as the “passive vworker.” If the active vworker fails, the passive vworker takes over immediately.

automatic logical partitioning
The method of partitioning or splitting a large table into child partitions to optimize query performance and simplify table administration.
Automatic logical partitioning uses the PARTITION BY RANGE or PARTITION BY LIST clause to create a partitioned table, and is the preferred method of logical partitioning.

_bee_stats database
See system tables.

backup queen
A second queen that you can activate if your queen fails. Usually it's kept in powered-down or STOPPED state with no workers connected, and only powered up when needed. Sometimes called a backup queen.

balance process
Also known as “balance processing,” this is the act of making sure active vworkers are evenly distributed on your cluster's hardware, so that each worker node has about the same number of active vworkers as all other worker nodes.

balance data
Also known as “balance storage,” this is the act of making sure your Aster Database contains the required number of copies of your data (as specified by your replication factor; usually two copies), and making sure that each replica copy is located on a separate physical worker node from the primary copy.

BIT
Outdated name for ACT.

Bulk Feeder
Unsupported bulk-loading application; replaced with Aster Database Loader.

child partition
In an automatic logical partitioning schema, a child partition is one partition of the data in the partitioned table.

child table
In a parent-child table inheritance schema, a child table is one partition of the data.

coordinator
Old term for the Aster Database queen.

CSV
Comma-separated value file format where commas are used as field separators.

CTAS
CREATE TABLE AS SELECT. This is a variant of the CREATE TABLE statement with a SELECT subquery that populates the new table with rows from an existing table. It is used very frequently in the Aster Database context to manually repartition a table or to do the ‘transform’ part of ELT. See ELT and ETL.

data locality
The state of having needed data local to (on the same machine as) an operation or other data.
Having data locality is a key factor affecting the efficiency of an operation in an MPP system.

data model
A database’s structure of tables and columns that determine what form and format the data will be stored in.

DDL
Data definition language to create and alter database tables.

dimension table
One of the two main table types in a star schema-style database. A row in a dimension table usually describes an item in detail. A dimension table stores unchanging or slowly changing descriptions of the participants in the actions tracked by your database. (The details of each action are recorded in the fact table.) For example, product names and descriptions usually live in a dimension table. The volume of data in a dimension table typically grows slowly. Often a dimension table enumerates the set of known values for a particular category.

distributed dimension table
In Aster Database you can optionally distribute a dimension table by declaring a distribution key on it.

distributed query planning
The queen manages the distribution of data in the cluster, prepares top-level, partition-aware query plans, issues queries to vworkers, and assembles the query results. The vworkers, in turn, prepare local query plans and execute the queen's queries in parallel. The queen structures top-level queries so that little or no data is shipped to the queen until the final phase, when the query results are assembled and sent to the client.

distribution (of rows)
Distribution of rows (sometimes called “physical partitioning”) means splitting a table’s data across many vworkers in Aster Database to allow scaling. This is a key Aster Database feature. A physical partition is a subset or “slice” of rows stored on a vworker. Don’t confuse Aster Database distribution with the common data modeling practice of logical partitioning (a.k.a. parent-child table inheritance or automatic logical partitioning), which you can also do in Aster Database.
The difference is this: distribution happens automatically based on the distribution key you declare using DISTRIBUTE BY when you create the table. The distribution is automatic in the sense that you don’t have to declare the boundaries of each partition. Instead, you just say which column (this is called the distribution key column) provides the values that will be used to define the distribution, and Aster Database chooses boundaries to split up the records. Logical partitioning, on the other hand, requires you to explicitly declare the boundaries of each partition.

distribution key
When you create a fact table (and optionally when you create a dimension table), you specify a distribution key that determines how Aster Database will physically distribute that table’s data across the cluster. The distribution key specifies which column's value will be evaluated to determine each row's location in the cluster. If the table has a primary key defined, then the distribution key must be one of the columns from the primary key. The column you choose as your distribution key must be of a datatype allowed for use as a distribution key.

ELT
Aster’s better alternative to the longstanding datawarehousing practice of ETL (extract, transform, and load). In Aster Database it’s usually much better to extract, load, and only then transform, because you can use the computing power of the cluster to carry out the data transformations. The main tools for performing such transformations are the CTAS command and SQL-MapReduce transformation functions that you write.

ETL
The longstanding datawarehousing practice for loading data into the warehouse. ETL stands for extract, transform, and load. By ‘transform’, we mean the reformatting of the data that you must do to ensure consistent and correct data representation in the warehouse. In Aster Database, we prefer to follow the ELT approach to loading, rather than ETL.
fact table
One of the two main table types in a star schema-style database. In a star schema, the fact table is usually the largest table in the database and records the minute-to-minute actions that your database was built to track. (The job of storing the more detailed information about the actions’ participants is delegated to a set of dimension tables.) In the fact table, each row usually represents an action or movement, such as a sales transaction or a web pageview. Because of this, the volume of data in a fact table tends to grow fast. A fact table contains two types of columns: columns that contain facts (say, timestamp and price of a sale) and columns that are foreign keys to the dimension tables (links to the rows, for example, that describe the product sold and the customer who bought it). Note that Aster Database does not enforce referential constraints. Foreign keys are used mainly for joining tables. The effective primary key of a fact table is usually a composite of more than one column. You create a fact table in Aster Database with the CREATE FACT TABLE command, and you must distribute each fact table by declaring one of its columns to be its distribution key, using the keyword DISTRIBUTE BY.

foreign key
The column that is used to join a fact table with a dimension table. Aster Database does not enforce referential constraints. Foreign keys are used mainly for joining tables.

Hadoop
Apache Hadoop is an open source platform for storing and managing big data. Teradata Aster provides SQL-H to enable business users to access the Hadoop data from Aster Database directly. Aster Database manages communication with Hadoop nodes through SQL-H to read data for SQL queries and SQL-MR functions.

hash distribution
See physical partitioning.

HCatalog
HCatalog is the table and storage management service for data stored in Apache Hadoop.

Hive
An open-source SQL layer for Hadoop.
It is not compliant with SQL-92 and supports none of the SQL guarantees.

ICE
The InterConnect Executable process in Aster Database. This is the Aster Database service responsible for finding and shuffling partitions of data between vworkers. For example, if the myusers table is distributed across many partitions and you run the query, SELECT * FROM myusers, then ICE collects the rows from all the partitions.

imbalanced
Undesirable cluster state that you should fix by running either a balance data or balance process. This typically means you have one worker node with more than the desired number of active vworkers running on it, or you have a single worker node hosting both an active vworker and its corresponding replica vworker.

in-database applications
Aster Database’s in-cluster, in-database applications let you inject user functions (applications) into the data flow at the lowest levels. For many applications, Aster Database is superior to other distributed computing frameworks because Aster Database provides better tools for data manipulation (partitioning, sorting, and the like), as well as process management and workload management.

incorporate
Outdated term for balance data.

JDBC
Standard Java API that allows clients to access a database. Aster Database offers a JDBC driver.

list partitioning
See logical partitioning.

loader node
An optional node in Aster Database that is specialized in loading data. Normally, you can route all loading directly through the queen, but for high-volume loading requirements, you can deploy loader nodes to increase loading capacity. When using loader nodes, you initiate the loading using the ncluster_loader utility, which communicates with the queen. The queen then delegates loading to the loader nodes, which load data into the appropriate vworkers in parallel. You can also force the use of a particular loader node.
logical partition
A child table or child partition and its data.

logical partitioning
Splitting one large table into smaller logical pieces for faster performance and easier management. This is done via automatic logical partitioning (preferred) or parent-child table inheritance (supported for backward compatibility) and is a common database practice as well as a popular feature of Aster Database. Each partition is created as a child partition of the single partitioned table. The top level table is normally empty, there to represent the data set. Some logical partitioning designs contain multiple generations of partitions. For example, you might have a schema in which table sales_2008 has monthly child partitions sales_2008_01 through sales_2008_12, and each monthly child partition has daily child partitions like sales_2008_01_01 through sales_2008_01_31. Don't confuse logical partitioning with the more automatic physical partitioning feature of Aster Database. For clarity when discussing logical partitioning in this document, we avoid the term “logical partition,” and instead use the more explicit terms child table or child partition.

machine
See node.

materialized projection
A relatively narrow table that contains a copy of a group of columns that are commonly accessed together. A materialized projection usually contains a subset of the columns of a wider table and is created to allow queries to run faster.

nc_ tables
See system tables.

NIC bonding
Network link aggregation that allows you to combine multiple network interface cards to support a common connection for better performance.

node
In the cluster, a node is a server machine that hosts vworkers, a loader, or a queen. Typically a node is a physical machine, but if you’ve installed your cluster on VMware or in the cloud, then it’s a virtual machine. In Aster Database, each node has a designated role as a queen node, worker node, or loader node.

node splitting
See partition splitting.

ODBC
A standard API that allows clients to access a database. Aster Database offers an ODBC driver.

Optimized Transport
A massively parallel communication transport mechanism that enables dynamic repartitioning of data.

parent-child table inheritance
An older method of splitting a large table into child tables, using the INHERITS keyword, to optimize query performance. This approach has been replaced by the preferred automatic logical partitioning.

partition
See physical partition or logical partition. For clarity, we avoid using the unqualified term “partition” in this document and instead say “child table” or “child partition” for a logical partition, or “physical partition” for a partition that Aster Database maintains automatically based on a distribution key.

partition count
Each worker node in Aster Database contains a number of vworkers. The total number of vworkers in the cluster is the “partition count” of the cluster.

partition splitting
The act of increasing the number of vworkers in your Aster Database. Having the appropriate ratio of CPU cores to vworkers ensures efficient use of your workers’ computing power. As your cluster grows and you add more worker machines, it eventually makes sense to increase the total number of vworkers in order to maintain a good ratio. Contact Teradata support to find out the proper CPU core/vworker ratio for your hardware. Don't confuse repartitioning with partition splitting. Repartitioning happens to rows inside a query, and does not involve changing the physical location of rows on disk. In repartitioning, only the location of an in-memory copy of the row is changed. Partition splitting, on the other hand, permanently moves some rows to a different vworker for storage.

partitioning
See logical partitioning or distribution (of rows). The term “partitioning” is ambiguous.
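
The distinction drawn above between repartitioning and partition splitting can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (vworker_for, distribute), the sample rows, and the use of a CRC32 hash are all hypothetical and do not describe how Aster Database actually hashes or stores rows.

```python
import zlib


def vworker_for(value, partition_count):
    """Map a value to a vworker index with a stable hash (illustrative only)."""
    return zlib.crc32(str(value).encode("utf-8")) % partition_count


def distribute(rows, key, partition_count):
    """Group rows into one list per vworker by hashing the given column."""
    layout = [[] for _ in range(partition_count)]
    for row in rows:
        layout[vworker_for(row[key], partition_count)].append(row)
    return layout


# Hypothetical sample data; user_id plays the role of the distribution key.
rows = [
    {"user_id": 1, "city": "Oslo"},
    {"user_id": 2, "city": "Lima"},
    {"user_id": 3, "city": "Oslo"},
    {"user_id": 4, "city": "Lima"},
]

# Stored layout: rows placed by the distribution key across 2 vworkers.
stored = distribute(rows, "user_id", 2)

# Repartitioning: a temporary, per-query regrouping on another column.
# The stored layout above is not modified.
in_memory = distribute(rows, "city", 2)

# Partition splitting: the stored layout itself is rebuilt for more vworkers.
split = distribute(rows, "user_id", 4)
```

In this sketch, a GROUP BY on city can run locally on each vworker once in_memory exists, because every row for a given city lands in the same group, while stored stays as it was.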

passive coordinator
Outdated term for backup queen.

physical partition
See vworker.

physical partitioning
Outdated term. See distribution (of rows).

primary interface
The Ethernet NIC that the Aster Database administrator designated as the main networking interface for cluster communications. This is specified by interface name and is often eth0.

primary queen
When discussing the queen and the backup queen, we refer to the currently operating queen as the “primary queen”.

queen
The queen node is the Aster Database coordinator, distributed query planner, distributed query coordinator, and keeper of the data dictionary and system tables. The queen is responsible for cluster, transaction, and storage management. The queen handles software delivery to all nodes. See also distributed query planning.

range partitioning
See logical partitioning.

repartitioning
The act of reshuffling the rows of a distributed table to nodes on the cluster where they are needed for a join or aggregation. Repartitioning is frequently a prerequisite step for query execution in which the data required for a join is laid out as though it were distributed by the attribute/expression in the join or the aggregation. For example, when you run SELECT column-a FROM foo GROUP BY column-a, if column-a is not the distribution key of foo, then Aster Database must repartition foo so that, for the duration of this operation, it’s distributed on column-a. Don't confuse repartitioning with partition splitting. Repartitioning happens to rows inside a query, and does not involve changing the physical location of rows on disk. In repartitioning, only the location of an in-memory copy of the row is changed. Partition splitting, on the other hand, permanently moves some rows to a different vworker for storage.

replicate
The act of updating the replica of a given piece of data when that piece of data changes.
With each change in a vworker's data, Aster Database ensures that the vworker's replica gets a record of the change. See Aster Database replication. Tip! The term “replica” also arises in the case of a replicated dimension table. Don’t confuse the two.

replicated dimension table
A dimension table whose entire contents are copied to all vworkers for faster lookup. This is the default behavior of a dimension table in Aster Database, or you can include the clause DISTRIBUTE BY REPLICATION in your CREATE TABLE statement to create a replicated dimension table. Good to know: Don't confuse replicated dimension tables with Aster Database replication! They are not closely related. What's being replicated in Aster Database replication are vworkers (sometimes called “partitions”), whereas what's being replicated in a replicated dimension table is the whole contents of the table.

replication
See Aster Database replication.

replication factor (goal)
Also written as “RF(g)”, this is the desired number of copies of data to be kept in Aster Database. This is almost always 2. This is specified at installation time and can be changed. This setting is stored on the queen as /home/beehive/config/goalReplicationFactor.

replication factor (current)
Also written as “RF(c)”, this is your cluster's current replication factor. RF(c) is the replication degree of the partition with the lowest replication degree in the cluster. In other words, if one partition in the cluster has lost its replica, meaning its current replication degree has fallen to 1, then the current replication factor of your cluster is 1. When RF(c) falls below RF(g), the AMC alerts you that you need to take action to restore your cluster's replication factor.

RF
See replication factor (current) and replication factor (goal).
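
The relationship between RF(c) and RF(g) described above amounts to taking a minimum and comparing it with the goal. A minimal sketch follows; the function names and the dictionary of per-partition replication degrees are hypothetical and are not part of any Aster Database tool or API.

```python
def current_replication_factor(degree_by_partition):
    """Return RF(c): the lowest replication degree among all partitions."""
    if not degree_by_partition:
        raise ValueError("at least one partition is required")
    return min(degree_by_partition.values())


def needs_restore(degree_by_partition, goal_rf=2):
    """Return True when RF(c) has fallen below RF(g)."""
    return current_replication_factor(degree_by_partition) < goal_rf


# Hypothetical cluster in which partition p2 has lost its replica.
degrees = {"p0": 2, "p1": 2, "p2": 1}
print(current_replication_factor(degrees))  # prints 1
print(needs_restore(degrees))               # prints True
```

A single under-replicated partition is enough to lower RF(c) for the whole cluster, which is why the minimum, not the average, is used.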

schema
Logical subdivision of a database. Typically, schemas are used to cordon off sections of the database so that different groups of users have authority over the use of those sections. Tip! In this document, we do not use the term “schema” to mean data model. We use “data model” instead.

shared-nothing
A distributed computing architecture in which nodes are independent and do not share disk or memory.

SMC
Outdated term for the AMC.

SQL-MapReduce
Aster’s programming framework and API for writing data analysis and manipulation functions that you can run in a distributed manner.

SQL-MapReduce function
A function, usually invoked in a SELECT statement, that operates in Aster Database’s SQL-MapReduce framework. You can write SQL-MapReduce functions yourself, or use Teradata Aster’s functions.

standby queen
See backup queen.

star schema
The database design schema that DBAs most commonly use in Aster Database is the star schema, consisting of (usually) one fact table surrounded by a set of dimension tables. Fact tables store the running log of events or transactions. Dimension tables describe items in detail. When you diagram the schema, it looks like a star, with the central fact table surrounded by dimension tables.

stats db
See system tables.

system tables
Tables that hold Aster Database system information. These tables’ names start with “nc_”. These tables are often referred to as the “stats db” or as the “_bee_stats db”.

tuple
A “tuple” is an ordered set of values that we think of as a single record. Rows are the elements that comprise a database table. At any given time, each row will be represented by a specific tuple of values. A row can be updated over time to contain a different tuple. In the Aster Database documentation, we use “row,” except on those rare occasions when we’re trying to show the distinction between a tuple and a row.

UDF
User-defined function.

vworker
A virtual worker responsible for storing and operating on data in Aster Database. Conceptually, a vworker is roughly equivalent to a physical data partition in Aster Database, and as a result you will often hear people refer to a vworker as a “partition” or “physical partition.” The queen delegates work to vworkers, and query results are aggregated and returned via the queen. In a typical installation, you'll have as many active vworkers per worker node as you have CPU cores per worker node. See also partition count.

view
A stored query accessible as a virtual table. A view is composed of the result set of a query. A view is not part of the physical schema, but is instead a dynamic, virtual table computed or collated from data in the database.

virtual worker
See vworker.

WAL file
A Postgres write-ahead log file.

worker
In this document, we avoid this term. Instead, we say vworker to mean the basic Aster Database unit that does work, or we say worker node to mean the physical or virtual server that acts as a worker machine in the cluster.

worker node
An Aster Database node (machine) that contains vworkers.
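
As a rough worked example of the vworker entry's rule of thumb (one active vworker per CPU core on each worker node), the expected number of active vworkers in a typical installation can be estimated as below. The function name and the sample figures are hypothetical, and the one-per-core ratio is only the typical case described above; the proper ratio for specific hardware may differ.

```python
def typical_active_vworkers(worker_nodes, cores_per_node):
    """Estimate active vworkers, assuming one active vworker per CPU core."""
    if worker_nodes < 0 or cores_per_node < 0:
        raise ValueError("counts must be non-negative")
    return worker_nodes * cores_per_node


# Hypothetical cluster: 6 worker nodes with 8 cores each.
print(typical_active_vworkers(6, 8))  # prints 48
```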