The Teradata Database Explained, Illustrated and Demystified
Accenture and Teradata
Teradata Architecture: Key Database Features
Accenture Confidential

Teradata Architecture: Table of Contents
• Introduction
• Platform Architecture: MPP and SMP
• Teradata Architecture: MPP and SMP
• Key Differentiators

Introduction: Purpose and Intended Audience
• The purpose of this deck is to familiarize Accenture practitioners with the Teradata Relational Database Management System (RDBMS)
• We focus on Teradata's unique architecture and features, occasionally contrasting them with Oracle, a much more familiar reference for most readers
• The reader need not be deeply technical to benefit from this deck: we have aimed for a high-level overview rather than deep (and sometimes tedious) detail
• Illustrations are provided to add clarity to certain concepts where it makes sense

Introduction: Unique Teradata Attributes
• Teradata is unique among commercial RDBMSs in a number of ways.
(If that weren't true, there probably would be no need for this deck.)
• Key Teradata differentiators:
– It is implemented on a Massively Parallel Processing (MPP) hardware architecture – and always has been
– It was originally implemented on proprietary hardware, with portions of the database embedded in the hardware/firmware (although this is no longer the case)
– The database software is unconditionally parallel
– It is linearly scalable, with hundreds of reference sites exceeding a terabyte (1,000 gigabytes) in size
– It virtually "owns" the Very Large Database (VLDB) market space – and has for twelve years
• Some of the above points are discussed further in the "Key Differentiators" section

Teradata Platform Architecture: Uni-processors, SMPs and MPPs
• Computers can be broadly categorized into one of three hardware architectures:
– Uni-processor
• The desktop PC is the canonical example
• Generally applied to client, not server, applications
• Not discussed further in this paper
– Symmetric Multi-Processing (SMP)
• A single computing system with multiple processing units, often microprocessors
– Massively Parallel Processing (MPP)
• A collection of computing systems – usually SMPs – that are interconnected and collaborate to solve a common task (or tasks)
• While there are significant differences between these architectures, the application programming model is essentially unchanged among them: the platform software deals with the hardware differences

Teradata Platform Architecture: A Closer Look at SMP Hardware
• Typical SMP hardware architectures have:
– Two to eight, up to as many as 64, processors
• Smaller SMPs often have Compaq or Intel motherboards and run MS Windows
• Larger SMPs are typically RISC machines running UNIX – Sun, H-P and IBM dominate this space
– All of the processors run from a common, shared memory, and they all access that memory via a common, shared memory bus
– All of the processors share the I/O slots, channels and associated peripheral devices, notably disk storage subsystems
• SMP examples:
– Low-end:
• Compaq ProLiant DL series (2-4 CPUs, desk side)
• NCR's Model 4455 or similar (1-4 CPUs, desk side)
– Midrange: HP's NetServer 6000 series (4-6 CPUs, rack mount)
– High-end: Sun's Enterprise 10000 (16-64 CPUs, free standing)
– Nearly all IBM and compatible mainframes

Teradata Platform Architecture: SMP Hardware
[Diagram: SMP hardware – 4 CPUs (in blue) with shared memory, memory bus, I/O bus and peripheral devices]

Teradata Platform Architecture: SMP Hardware Scalability
• Scalability options for SMPs include:
– Larger memories
– Faster CPUs
– More CPUs
– More I/O (slots and busses)
– More peripherals (usually disk arrays)
• Scalability limitations for SMPs:
– Every shared hardware subsystem is a potential bottleneck for an SMP
– The most common limiter to SMP scalability is the memory subsystem
• Each CPU must access the single memory via a common bus
• As the number of CPUs increases, there is added contention for memory accesses – CPUs begin waiting on the memory subsystem
• Eventually, a point of diminishing returns is reached, where the added expense of additional CPUs fails to provide a commensurate increase in performance

Teradata Platform Architecture: An Introduction to MPP Hardware
• Massively Parallel Processing (MPP) hardware systems consist of from two to perhaps hundreds of SMP systems called "nodes"
– Just like a stand-alone SMP, each node has its own memory and I/O subsystems, as well as its own copy of the operating system and application(s)
– The nodes are interconnected via a dedicated, very high-speed, often proprietary interconnect network
– Most MPP systems run under UNIX, though a few MPP Teradata installations run under Windows 2000
• MPP examples:
– IBM's pSeries (formerly RS/6000), including IBM's "Deep Blue," the chess-playing machine that defeated Grand Master Garry Kasparov in May 1997
– The NCR 5250 or 5255 (among other NCR servers), which has never played chess and probably never will

Platform Architecture: A 2-node MPP System
[Diagram: MPP hardware showing 2 nodes (Node 0, Node 1), their disk arrays (Disk Array 0, Disk Array 1) and the interconnect network]

Teradata Platform Architecture: More on MPP Hardware
• MPP hardware architectures are often called "shared nothing" or "loosely coupled" systems, since the nodes – the basic MPP building blocks – share no computing hardware
• The network that interconnects the nodes enables them to communicate and cooperate to solve a problem
– Exactly how the interconnect is used depends entirely on the application(s) running in the system
• So, why bother with the complexity of MPP hardware?
– One word: scalability – the ability to add processing nodes without "hitting the wall" before reaching a desired level of performance

Platform Architecture: Teradata's MPP Hardware
• For Teradata's MPP hardware, each node is:
– Made by Solectron to NCR's design and specifications
– Powered by a 4-CPU Intel Xeon board
– Connected to all the other nodes via NCR's BYNET interconnect
– Connected via SCSI to its own disk array(s)
– Optionally connected to the disk array(s) of another node in the complex for fault-tolerance purposes (more on this later)
[Diagram: NCR MPP hardware showing 4 nodes (Node 0-3), their disk arrays (Disk Array 0-3) and the dual BYNET interconnect]

Platform Architecture: Teradata's BYNET(tm) Interconnect
• NCR's node interconnect subsystem is called the BYNET
• The BYNET is fully scalable:
– When you add a node, you add bandwidth with it, so the total available bandwidth scales as the MPP complex grows
– Early Teradata machines did not have a scalable interconnect (the YNET)
• The network architecture is "folded banyan" – all nodes are directly connected to all other nodes
• There are always two BYNETs, for redundancy purposes
• The BYNET hardware is an ordinary PCI card designed by NCR
• The BYNET is fast: 120 megabytes per second per node per BYNET, in each direction
• It is patented by and proprietary to NCR

Platform Architecture: BYNET Node-to-Node Connections
• Every node has a dedicated bi-directional channel to every other node
• This architecture is duplicated – there are really two channels (one shown)
[Diagram: point-to-point messaging and broadcast messaging among nodes]

Platform Architecture: Teradata "Cliques"
• A Teradata clique provides high availability, and is a configuration option
• A clique is a group of nodes – four are shown below – that can access a common chunk of disk-array storage
• Cliques eliminate any single point of failure
[Diagram: four nodes on the BYNET interconnect with shared SCSI paths to sharable disk]

Platform Architecture: Why Have Cliques?
• Cliques add high availability via automatic failure detection and software reconfiguration in the event of a hardware failure
[Diagram: clique failover across the BYNET interconnect]

Platform Architecture: MPP Hardware Illustration
• Below is a medium-size Teradata MPP system:
– 16 nodes, each with its own busses, memory and backplane
– 8 cliques of 2 nodes
– 8 disk arrays, one for each clique
– 2 BYNETs, because there are always two BYNETs
– Total BYNET bandwidth is (2 x 2 x 16 x 120) = 7.68 GB/sec!
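The bandwidth arithmetic above generalizes to any node count: two BYNETs, two directions, and 120 MB/s per node per BYNET. A minimal sketch using the deck's figures; the function name and defaults are illustrative, not an NCR API:

```python
# Sketch of the deck's BYNET bandwidth formula:
# total = BYNETs x directions x nodes x per-node rate (MB/s).
# Parameter defaults reflect the deck's figures; the function itself
# is illustrative, not part of any NCR/Teradata software interface.

def total_bynet_bandwidth_mb(nodes, bynets=2, directions=2, mb_per_node_per_bynet=120):
    """Aggregate interconnect bandwidth in megabytes per second."""
    return bynets * directions * nodes * mb_per_node_per_bynet

# The slide's 16-node example: 2 x 2 x 16 x 120 = 7,680 MB/s = 7.68 GB/s
print(total_bynet_bandwidth_mb(16) / 1000)  # 7.68
```

Because bandwidth is proportional to the node count, adding a node adds capacity instead of diluting a fixed-size shared bus, which is the scalability point the slide is making.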
Platform Architecture: MPP Operating System Software
• Operating system software:
– For MPP Teradata, the choices are the same as for SMP:
• NCR's version of UNIX (MP-RAS), or
• Windows 2000
– For both OS options:
• The BYNET device driver is an ordinary (UNIX or Windows) driver
• Teradata doesn't use the native file system, for performance reasons; all Teradata database structures are managed by Teradata within raw disk

Teradata Architecture: Software "Units of Parallelism"
• Teradata software components are known as "virtual processors," or VPROCs
– VPROCs are software threads or processes
• There are two kinds of VPROCs:
– Access Module Processors (AMPs)
• An AMP reads, writes and manipulates all database rows in the partition that the AMP "owns"
– Parsing Engines (PEs)
• PEs parse SQL statements, reducing them to their component executable steps
• The number of VPROCs is configurable
• VPROCs run in every Teradata node
• VPROCs can migrate around the complex, as in the case of a failed node
• VPROCs provide parallelism within a node

Teradata Architecture: MPP Platform with AMPs and PEs
[Diagram: four-node MPP system on the BYNET interconnect, showing AMP and PE VPROCs in each node – Node 0 with "w" AMPs and partitions, Node 1 with "x", Node 2 with "y", Node 3 with "z"]

Teradata Architecture: Data Partitioning Explained
• Data is automatically distributed to all AMPs – and thus to all disks – via a proprietary hashing algorithm
– No partitioning or re-partitioning is ever required
• The file system architecture is fundamentally different:
– Rows are stored in blocks
– Space allocation is entirely dynamic
• Absolutely minimal DBA effort is required:
– No reorgs, repartitioning, space management or index rebuilds
– Minimal monitoring required

Teradata Architecture: Data Partitioning Illustrated
• The rows of each table are automatically and unconditionally distributed to all AMPs (and all available disk storage)
– This enables Teradata's automatic and unconditional parallelism
[Diagram: system tables and the CUSTOMER, ORDERS, LINEITEM, PART and SUPPLIERS tables spread across AMP1-AMP4 and their disks]

Teradata Architecture: Data Partitioning Explained
• Let's take a simple case:
– A four-node, eight-AMP Teradata MPP system
– A single database table of 100,000 rows
• The system will configure itself with two AMPs in each node
• Then, by hashing the Unique Primary Index, it will distribute all rows to all AMPs – giving each AMP about 12,500 rows, and each node 25,000 rows
• This is the ideal "flat" distribution across the whole system, and it will occur if the primary key is essentially random – like SSN
• In all processing, each node has to deal with only 1/4 of the total database
– The name of the game is simple: "divide and conquer"

Teradata Architecture: SMP Hardware
• In an SMP architecture, Teradata looks much the same as an ordinary database such as Oracle:
– A single SMP processor complex does it all
– A single software image can access the entire database

RDBMS Architecture: Teradata on SMP Hardware
• On SMP hardware, Teradata runs on:
– Windows 2000 on Intel microprocessors – a combination often called "Wintel"
• Almost all Wintel boxes use Compaq or Intel processor boards, typically populated with Pentium III or Pentium 4 CPUs
– NCR's SMP machines, on either Windows or UNIX (MP-RAS)
• The latter configuration – SMP/UNIX – is often used as a low-cost test platform for a production MPP system under MP-RAS
• Examples of Teradata SMP platforms:
– IBM
– HP
– Compaq
– Dell
– NCR (but only rarely, probably due to cost or client standards)

Teradata Architecture: SMP Hardware and Disk Array
[Diagram: SMP box with 4 CPUs connected to a disk array via dual-path SCSI interconnect]

Key Differentiators
• Ubiquitous, persistent parallelism
• Unrelenting partitioning
• A really, really mature query optimizer
• Together, the above yield the ability to handle very complex queries, large complex databases, and many concurrent users doing many different kinds of work
• Truly linear scalability
• Mainframe connection via direct FIPS-60 channel connect – ESCON or "bus and tag" media

Scalable, Parallel, High-Availability MPP Hardware
• A group of 1-4 nodes with connections to each other's storage keeps applications running when one or more nodes fail
• All critical components have redundant backups
• Nodes have (optional) LAN/WAN/mainframe connectivity
[Diagram: server management, BYNET MPP interconnect, SMP processing nodes (CPUs, data cache, memory), point-to-point SCSI or FibreChannel interconnect, disk-array controllers with cache, and LSI Logic or EMC disk arrays]

Shared Nothing Software Architecture
• The basis of Teradata parallelism and scalability:
– Divide the work evenly among many processing units
– No single point of vulnerability or chokepoint for any operation

Teradata Data Distribution
• Automatic, always on
• Rows are distributed evenly by hash partitioning
– Define the row; we'll do the rest
– Regardless of queries or demographics
• Shared nothing software
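The hash-distribution mechanism described in these slides – a 32-bit row hash on the Primary Index, a system-maintained map of 64K buckets to AMPs, and the same value always landing on the same AMP – can be sketched as follows. The real Teradata hash function is proprietary, so a generic CRC32 stands in, and the round-robin bucket map is illustrative, not the system's actual map:

```python
# A minimal sketch of Teradata-style hash distribution, assuming the
# deck's description: 32-bit hash codes, a 64K-entry bucket-to-AMP map,
# and deterministic placement. CRC32 is a stand-in for the proprietary
# hash; the round-robin HASH_MAP is illustrative only.
import zlib
from collections import Counter

NUM_AMPS = 8         # the deck's four-node, eight-AMP example
NUM_BUCKETS = 65536  # 64K hash buckets, per the Hash Distribution slide

# System-maintained hash map: bucket -> owning AMP (round-robin stand-in)
HASH_MAP = [bucket % NUM_AMPS for bucket in range(NUM_BUCKETS)]

def owning_amp(primary_index_value):
    """Return the AMP that owns the row with this Primary Index value."""
    row_hash = zlib.crc32(str(primary_index_value).encode())  # 32-bit hash code
    bucket = row_hash >> 16  # high-order 16 bits pick one of the 64K buckets
    return HASH_MAP[bucket]

# Distributing 100,000 rows keyed on a random-ish PI (like SSN) spreads
# them across all eight AMPs - roughly 12,500 rows each with a good hash.
counts = Counter(owning_amp(pi) for pi in range(100_000))
print(sorted(counts.items()))
```

Because placement is a pure function of the Primary Index value, no DBA-defined partitioning scheme is needed, and retrieving a row by its PI requires hashing once and asking exactly one AMP.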
[Diagram: Primary Index values flow through the Teradata parallel hash function to VAMP1 ... VAMPn, each with its own processing (P), memory (M) and disk (D)]

Key Data Warehousing Capabilities
• Technology:
– Fully automatic space management
– Automatic data distribution
– Always-on, automatic, integral, multi-level parallelism
– Continually improved cost-based optimizer
– Full ANSI SQL functionality, complex query optimization

Hash Distribution
• Data is automatically distributed to AMPs via hashing
• Even distribution results in scalable performance
• The hash map is defined and maintained by the system
– 2**32 hash codes; 64K buckets distributed to AMPs
• The Primary Index (PI) column(s) are hashed
• The hash is always the same for the same values
• No partitioning or repartitioning is required
[Diagram: row-hash values distributed across AMP and PE VPROCs]

Shared Nothing Software
• Delivers linear scalability:
– Maximizes utilization of SMP resources
– To any size configuration
– Allows flexible configurations
– Incremental upgrades
[Diagram: AMP VPROCs replicated across nodes in configurations of any size]

A Shared Nothing Database Architecture Enables Expansion with Balance
• The amount of parallelism grows at the same rate as the system expands
• Each parallel unit does an equal amount of work
[Diagram: hardware scalability (units of hardware power) matched by software scalability (units of parallelism), yielding proportional work accomplished]

Optimizer – Parallelization
• Cost-based optimizer
– Parallel-aware
• Rewrites are built in and cost-based
• Parallelism is automatic
• Parallelism is unconditional
• Each query step is fully parallelized

Shared Everything vs. Shared Nothing
• Shared Everything database architecture:
– A single database buffer used by all UoPs
– A single logical data store accessed by all UoPs
– Scalability limited by control bottlenecks and by the scalability of a single SMP platform
• Shared Nothing database architecture:
– Each UoP is assigned a data portion
– A query controller ships functions to the UoPs that own the data
– Locks, buffers, etc., are not shared
– Highly scalable data volumes
[Diagram: shared everything – buffers, locks and control blocks over a single data store; shared nothing – data partitions, each a unit of parallelism]

Q/A
Thank You