Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Comparison between Oracle and Teradata Center of Excellence Data Warehousing Wipro Technologies 1 Basic Database Architecture Differences: Oracle Oracle SMP DBMS Kernel Locking Logging Multi-Task Control Global Systems/Storage Area Global Buffer Pools I/O I/O I/O I/O Data Partition I/O I/O I/O I/O Data Partition I/O I/O Data Partition I/O I/O I/O I/O I/O Data Partition 2 Basic Database Architecture Differences: Teradata Teradata MPP SMP Node PE V-Proc PE V-Proc BYNET Connect B Y N E T SMP Node PE V-Proc PE V-Proc BYNET Connect AMP AMP AMP AMP AMP AMP AMP AMP M-T M-T M-T M-T M-T M-T M-T M-T Lock Lock Lock Lock Lock Lock Lock Lock Log Log Log Log Log Log Log Log BPool BPool BPool BPool BPool BPool BPool BPool I/O I/O I/O I/O I/O I/O I/O I/O Data Partition Data Partition Data Partition Data Partition Data Partition Data Partition Data Partition Data Partition @ Oracle SMP 3 Oracle Physical Storage Tablespaces are individual units of storage Tablespaces have associated data files Data files can be added to, extended or removed from a tablespace dynamically Only partitioned objects can span tablespaces Logical objects are stored in physical segments. Segments consist of extents Object space is allocated one extent at a time. May cause fragmentation. 4 Disk Space and the Teradata Database SMP Node PE V-Proc SMP Node PE V-Proc BYNET Connect AM P M-T AM P M-T AM P M-T AM P M-T Loc k Log Loc k Log Loc k Log BPo ol I/O BPo ol I/O BPo ol I/O PE V-Proc B Y N E T PE V-Proc BYNET Connect AM P M-T AM P M-T AM P M-T AM P M-T Loc k Log Loc k Log Loc k Log Loc k Log Loc k Log BPo ol I/O BPo ol I/O BPo ol I/O BPo ol I/O BPo ol I/O Data Data Data Data Partition Partition Partition Partition Data Data Data Data Partition Partition Partition Partition DBC Stores TABLE INDEX Products INDEX TABLE 5 Sample Large Disk Space Allocation DDL Oracle9i drop tablespace ts_o1 including contents; create tablespace ts_o1 datafile ’/export/home/oracle/oracle817/dbs/datafiles/o_1’ size 8190m reuse extent management local autoallocate ; Repeat this for every tablespace . . . Teradata Execute Once CREATE USER tpcd3000g AS PERM= 5400E9, PASSWORD= tpcd3000g; 6 Comparing Database Creation Database Creation Task Oracle Teradata Script(s) to Create Catalog/Data Dictionary Tables and Views Catalog.sql, Catclust.sql, Catproc.sql DIP Manage “In-Flight Transactional Data Rollback/Undo Segments Transient Journal Database Logs Redo Logs Permanent Journal Temporary Space/Work Space Temp Space Spool Space Default Permanent Space Table Block Size Default Tablespace Set at Table Level for Each but Requires DBA Setup of Bufferpools for Each Block Size Default Database Set at Table Level for Each Teradata Memory Management for Different Blocksizes is Automatic XCTL, Vconfig.out Database Startup Information INIT.ORA and Control File 7 Create Table Statements Oracle Teradata CREATE TABLE Table1 CREATE TABLE Table1 ,FALLBACK , ( Col1 NUMBER, NO BEFORE JOURNAL, NO AFTER JOURNAL (Col1 INTEGER, Col2 INTEGER, Col3 INTEGER) UNIQUE PRIMARY INDEX ( Col1 ); Col2 NUMBER, Col3 NUMBER ) TABLESPACE Tablespace1 STORAGE (INITIAL 6144 NEXT 6144 MINEXTENTS 1 MAXEXTENTS 5 ); or... CREATE TABLE <expl|impl column specification> AS SELECT <any query>; 8 Creating Tables: Similarities All tables have names Tables have columns with data types You can define: Constraints Referential integrity Global temporary tables Triggers 9 Creating Tables: Differences Teradata tables are created in the database or user DBC Product TABLE s Teradata provides several convenient options: Fallback Permanent journaling Table block size Primary index Built into the Table MULTISET 10 Partitioning Partitioning tables and indexes allow Oracle and Teradata to store lots of data With Oracle, the process of choosing partitioning methods and partitioning keys is the balancing of query access path, performance, and data load requirements You specifically manage the partitioning constraints and their relationship to disk storage With Teradata, the hash partitioning algorithm is very good at evenly distributing (loading) data partitions and is the basis for high performance data access and ease of user access Provide reasonable partitioning columns when defining the table and Teradata does the rest Partitioning Columns are Chosen for Even Data Distribution in Both Teradata and Oracle 11 Partitioning Comparisons - Oracle Oracle Partitioning Decisions - Decide on Methods as well as partitioning columns, set up the disk environment Hash1 99Q4 California 99Q3 Florida 99Q2 New York 99Q1 Texas Range Partitioning List Partitioning 99Q4 99Q3 99Q2 99Q1 Hash2 Hash3 Hash3 Hash2 Hash4 Hash3 Hash2 Hash1 Hash1 Composite Range Partitioning Hash Partitioning 12 Partitioning Comparisons-Teradata Teradata Partitioning is a Fact of the System with Hash Data Distribution based on Primary Index (Partitioning) Columns and system managed disk AMP1 AMP2 AMP3 AMP4 Table 1 Table 2 Table 1 Table 2 Table 1 Table 2 Table 1 Table 2 Table 3 Table 3 Table 3 Table 3 13 Examples of DDL to Create Partitioned Tables Oracle9i create table orders( o_orderdate date , o_orderkey number NOT NULL, o_custkey number NOT NULL, o_orderpriority char(15) , o_shippriority number , o_clerk char(15) , o_orderstatus char(1) , o_totalprice number , o_comment varchar(79) ) pctfree 1 pctused 99 initrans 10 storage (freelists 99 freelist groups 2) parallel nologging partition by range (o_orderdate) subpartition by hash(o_custkey) subpartitions 16 ( partition ord1 values less than (to_date(’1992-0101’,’YYYY-MM-DD’)) tablespace ts_o1 . . . partition ord84 values less than (MAXVALUE) tablespace ts_o84 ) ; Teradata CREATE MULTISET TABLE ORDERTBL, DATABLOCKSIZE= 29.5 KILOBYTES ( O_ ORDERKEY DECIMAL (15,0) not null ,O_ CUSTKEY INTEGER not null ,O_ ORDERSTATUS CHAR( 1) CASESPECIFIC not null ,O_ TOTALPRICE DECIMAL( 15,2) not null ,O_ ORDERDATE DATE FORMAT ’yyyy- mm- dd’ not null ,O_ ORDERPRIORITY CHAR( 15) CASESPECIFIC not null ,O_ CLERK CHAR( 15) CASESPECIFIC not null ,O_ SHIPPRIORITY INTEGER not null ,O_ COMMENT VARCHAR( 79) CASESPECIFIC not null ) UNIQUE PRIMARY INDEX( O_ ORDERKEY ); Repeat 82 Times with DBA Defined Ranges 14 Oracle Parallelism Oracle parallelism is not directly related to table partitioning because of its shared disk architecture Dynamically splits data over parallel processing units for Selects/Inserts means 1 or more parallel processing units per partition Considers partitioning when distributing data to parallel processing units One parallel processing unit applied to each partition for Updates/Deletes Each user/query can get varying amounts of parallelism or run serially depending upon the resources available at query run time May need to manually control parallelism to improve system throughput and to ensure fair distribution of parallel resources Parallel processes may funnel down to serial processing for final 15 sort/merge, aggregate activity Teradata Parallelism Teradata parallelism is directly related to its shared nothing architecture Automatically applied by the database Architecture ensures that each major unit of parallelism (the VAMP) has similar amounts of data and memory Pipelining and query step parallelism is performed within the VAMP Utility Parallelism and Query/Data Manipulation Language parallelism (Select, Insert, Update, Delete) are all the same All system parallelism is available to ALL operations Teradata parallelism is automatic, pervasive, and database managed All users/queries take advantage of all the system parallel resources You do not manage and control parallelism 16 Data Types Teradata Oracle CHAR VARCHAR2 NCHAR NCHAR2 NUMBER LONG LONGRAW RAW DATE BLOB CLOB NCLOB BFILE ROWID UROWID CHAR VARCHAR CHAR VARYING LONG VARCHAR NUMERIC DECIMAL DOUBLE PRECISION FLOAT INTEGER SMALLINT BYTEINT BYTE VARBYTE GRAPHIC VARGRAPHIC LONG VARGRAPHIC DATE REAL 17 Data Type Translations Oracle Teradata Char Char Varchar2 Varchar Number(m,n) DECIMAL(m,n) or NUMERIC(m,n) Date (includes time) Date (time portion) Date (does not include time) Date Timestamp Time 18 Datatypes In Oracle the maximum precision "m" for number is 38. In Teradata it is 19 NUMBER (without precision) has no direct counterpart in Teradata. Determine migration by contents ROWID has no counterpart in Teradata 19 Constraints Like Teradata’s Constraints Defined on columns and tables Oracle is different from Teradata on the follow: Default mandatory is NULL Case sensitivity is fixed at ON Constraints can be disabled and enabled 20 Oracle Specific Columns and Table ROWNUM will number retrieved rows ROWID indicates physical address of row SELECT must have FROM list DUAL is a pseudo-table with 1 row 21 Creating Indexes Similarities between Teradata and Oracle: Indexes take up space on disk Indexes can be unique and non-unique Indexes and secondary indexes provide alternate ways to access data Differences: Teradata indexes are not in B-tree structure Hash Subtables Teradata automatically partitions indexes across the AMPs Teradata uses a Primary Index for each table 22 Index Usage Comparisons Oracle solutions traditionally have relied heavily on indexes OLTP workloads required fast access paths to few rows Decision support solutions continue Oracle’s use of indexes where tactical queries with OLTP-like response time requirements are given more emphasis than throughput performance Teradata solutions have traditionally not used lots of indexes Teradata’s efficient parallel architecture emphasizes throughput performance requirements - a result of it’s DSS background 23 Index Usage Comparisons Contd. Teradata typically uses fewer indexes than the equivalent Oracle system Saves on disk storage Reduces table maintenance windows where affected by existing indexes Fewer database objects to manage and monitor Most of the indexes found in Oracle may not be used on Teradata Indexes can provide clues to ad hoc query support requirements No Bit-Map indexes for Teradata Add indexes to Teradata only as workloads (or anticipated workloads) require them Monitor their usage or lack of usage Statistics are key - you WANT to collect statistics in Teradata! 24 Principles Behind the Use of Indexes Maximize the use of Single or Few AMP Operations: Drive towards using single or few AMP operations for queries Minimizing the number of AMPs in an operation, all other AMPs are freed to perform other tasks. Scalability is increased because the freed AMPs can execute more single or few AMP operations creating greater throughput by increasing the number of tactical queries executed in parallel Create Efficient All AMP Operations: Reduce the resource consumption on each AMP for all AMP operations. Since all AMP operations are virtually impossible to remove from a data warehousing environment, reducing the impact on each AMP is important Gain scalability for two reasons: Individual queries execute faster freeing the AMPs to execute other queries 25 Decrease in resource consumption allows more queries to use the shared resources, such as spool space, or CPU Miscellaneous DDL Triggers Triggers function just the same (pre- and post-, insert / update / delete) as in Teradata Oracle Options not in Teradata Database Links Reference (use) objects in another instance Synonyms Named references to objects Sequences Number generators E Often used to generate surrogate keys Hierarchies K L 26 ANALYZE/Collect Statistics ANALYZE/DBMS_STATS package are intended for collecting database object statistics for Oracle’s Cost Based Optimizer (CBO) Goal is to collect statistics give queries good access paths Once Plans are good and stable, stop analyzing tables to preserve plans Teradata’s Collect Statistics command collects database object statistics Optimizer reacts to changing demographics (growing tables, changing column value cardinality, etc.) Keeping statistics up to date ensures good plans Don’t freeze statistics - old statistics encourage old access plans that may not be effective as the database changes 27 Data Control Language Data Control Language (DCL) is used in both Oracle and Teradata to assign user access Oracle users may be granted: Object Privileges Easily relate to Teradata System Privileges Can be simulated within Teradata 28 User Access Rights Teradata enables you to assign access rights: To individual users To logical groups of users On individual objects On groups of objects Tables Views Macros 29 Object Privileges Granting Oracle Privileges Granting Teradata Privileges Grant Alter Grant Drop Grant Delete Grant Delete Grant Index Grant Index or Grant Drop Table Grant Select Grant Select Grant Insert Grant Insert Grant Update Grant Update Grant Execute Grant Execute Procedure/ Grant Execute 30 System Privileges Granting Oracle Privileges Granting Teradata Privileges Grant Create Table Grant Create Table or Grant Table Grant Alter Any Table Grant Create Table or Grant Table Grant Delete On Any Table Grant Delete on Database Grant Drop Any Table Grant Drop Table or Grant Table Grant Insert Any Table Grant Insert on Database Grant Update Any Table Grant Update on Database/Grant Update (column list, all but column list) Grant Select Any Table Grant Select on Database Grant Create View Grant Create View or Grant View on Database Grant Create View or Grant View on Database Grant Create Any View Grant Drop Any View Grant Drop View or Grant View on Database 31 Questions ? 32