A TPC Benchmark for EII Servers
Dina Bitton
CALLIXA
An EII (Enterprise Information Integration) Platform
To a business end-user, an application developer, or an application program, the enterprise data sources appear to be a single, integrated, relational database.
[Figure: the EII SERVER exposes a VIRTUAL DATABASE SCHEMA (virtual CUSTOMER, TRADES, ACCOUNTS and CREDIT tables) over the underlying sources, labeled ORACLE, ORACLE and XML.]
LET THE RACE BEGIN!
IBM Information Integrator
Oracle Data Hub
Callixa, Composite Software, Metamatrix
Benchmark wars anyone?
Benchmarking an EII Server
Model, unify and deliver information from across the enterprise, irrespective of data type, ownership or location.
Server must provide:
• Dynamic integration (not copy the data)
• Shared-nothing Parallelism
• Load Balancing & High Availability
• Non-intrusive operation
[Figure: chart placing ETL and DB Gateways on two axes, Federated Functionality and Performance / Scalability, each running from Weak to Strong.]
Modeling EII Capabilities
• Ability to model a virtual database defined
by “federating” heterogeneous databases
• Federation primitives to be modeled:
– Distributed, heterogeneous environment
– Metadata:
• Data transforms
• Replication
• Fragmentation
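The federation primitives listed above can be pictured as metadata attached to each virtual table. What follows is a minimal sketch, not Callixa's actual catalog format: the class names `Source` and `VirtualTable`, the `kind` values, and the example sites, table names and column mapping are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """One physical table at one site (hypothetical structure)."""
    site: str
    table: str
    # Data transform: virtual column name -> physical column name.
    column_map: dict = field(default_factory=dict)


@dataclass
class VirtualTable:
    """A table in the virtual schema, described purely by metadata."""
    name: str
    kind: str      # "single", "fragmented", or "replicated"
    sources: list

    def sites_to_read(self):
        """Sites a full scan must contact under each federation primitive."""
        if self.kind == "replicated":
            # Every replica holds all rows, so any one site is enough.
            return [self.sources[0].site]
        # A single-site table has one source; a fragmented table needs all.
        return [s.site for s in self.sources]


# Illustrative entries only; sites and column names are made up.
orders = VirtualTable("orders_all", "fragmented", [
    Source("site1", "orders"),
    Source("site2", "orders", column_map={"order_key": "o_orderkey"}),
])
part = VirtualTable("part_any", "replicated", [
    Source("site1", "part"),
    Source("site2", "part"),
])

print(orders.sites_to_read())  # ['site1', 'site2']
print(part.sites_to_read())    # ['site1']
```

Keeping the primitives in metadata rather than in copied data matches the "dynamic integration" requirement: the sources stay where they are, and only the description of how to combine them lives in the integration server.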
Integration Server Architecture
[Figure: architecture diagram with five clients (Client 1 to Client 5), three hubs (HUB 1 to HUB 3) and five data agents (Data Agent 1 to Data Agent 5).]
Shared-nothing Distributed Architecture for Performance, Scalability, and High Availability
• Process intercommunication - process-to-process socket communication
• Multithreading
• Pipelining
A TPC Benchmark for EII
• Test database: federation of 3 TPC-D test databases
distributed across Oracle, UDB and MS SQL Server
• Schema translation/Data transformation
• Replication
• Fragmentation
• 17 TPC-H queries including
– Single table with fragment elimination
– Single table with multiple fragments needed + Aggregate functions
– Simple cross-join with two single-site tables
– More complicated cross-join with all 3 fragments joined without filtering to a single-site table (large-volume data shipping)
– A 4-way join involving (Master/Detail) fragmented tables, a single-site table, and a replicate (or redundant) table
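The first two query classes above (fragment elimination, and aggregates over several fragments) can be illustrated with a small sketch. It assumes range fragmentation on a numeric key, which the slides do not state; the `Fragment` class, the key ranges and the sample rows are hypothetical stand-ins for remote data.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    site: str
    low: int     # inclusive lower bound of the fragmentation key
    high: int    # inclusive upper bound of the fragmentation key
    rows: list   # (key, amount) pairs standing in for remote data


def eliminate(fragments, lo, hi):
    """Keep only fragments whose key range overlaps the query range [lo, hi]."""
    return [f for f in fragments if f.low <= hi and f.high >= lo]


def federated_sum(fragments, lo, hi):
    """SUM(amount) WHERE key BETWEEN lo AND hi, computed per fragment and merged."""
    partials = []
    for f in eliminate(fragments, lo, hi):
        # In a real deployment this filter and partial sum would run at the site.
        partials.append(sum(a for k, a in f.rows if lo <= k <= hi))
    return sum(partials)


# Hypothetical key ranges and data; the slides do not give the predicates.
fragments = [
    Fragment("site1", 1, 100, [(10, 5.0), (90, 7.0)]),
    Fragment("site2", 101, 200, [(150, 11.0)]),
    Fragment("site3", 201, 300, [(250, 13.0), (300, 17.0)]),
]

print([f.site for f in eliminate(fragments, 20, 80)])  # ['site1']
print(federated_sum(fragments, 80, 260))               # 31.0
```

When the query range touches a single fragment, only one site is contacted. When it spans several, each site returns one partial aggregate instead of its rows, which is the opposite extreme from the unfiltered cross-join case that ships large volumes of data.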
3-Site Integration Server Deployment
[Figure: deployment diagram with one front-end host and three data sites.]
• Front-end host: 4 CPU, 2 GB RAM, Solaris 2.7; Callixa Clients and Query Server
• Site 1: 4 CPU, 3 GB RAM, AIX 4.3.3; Data Agent (two boxes shown); UDB 7.2; X GB TPCD
• Site 2: 4 CPU, 3 GB RAM, Solaris 2.8; Data Agent; Sybase 12.0; X GB TPCD
• Site 3: 4 CPU, 3 GB RAM, Solaris 2.8; Data Agent; Oracle 8.1.7; X GB TPCD
Source Data Distribution
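Sites: Site 1 (UDB V7.1), Site 2 (Sybase), Site 3 (Oracle)
Federated tables and their source tables (row counts):
• order3: orders at Site 1 (250,000), Site 2 (750,000) and Site 3 (500,000)
• lineitem3: lineitem at Site 1 (1,000,000), Site 2 (3,000,000) and Site 3 (2,000,000)
• customer2: customer fragments of 25,000 and 125,000 rows at two sites
• supplier: supplier at Site 1 (10,000)
• partsupp: partsupp at Site 3 (800,000)
• part_r: part at Site 1 (200,000) and Site 2 (200,000)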
Federated Database Setup:
1. Order3 is a fragmented orders table from 3 sites
2. Lineitem3 is a fragmented lineitem table from 3 sites
3. Customer2 is a fragmented customer table from 2 sites
4. Supplier is a supplier table at site 1
5. Partsupp is a part supplier table at site 3
6. Part_r is a redundant part table from site 1 and site 2
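The setup above can be emulated on a single machine to see what the federated tables mean to a client. This is only a sketch under stated assumptions: one in-memory SQLite database stands in for all three sites, the `s1_`/`s2_`/`s3_` table prefixes are hypothetical, only two TPC-H columns per table are kept, and the handful of rows is invented. It does not show how the Callixa server itself defines these tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-ins for the per-site source tables (prefix s1_/s2_/s3_ is hypothetical).
for site in ("s1", "s2", "s3"):
    cur.execute(f"CREATE TABLE {site}_orders (o_orderkey INTEGER, o_totalprice REAL)")
cur.execute("CREATE TABLE s1_part (p_partkey INTEGER, p_name TEXT)")
cur.execute("CREATE TABLE s2_part (p_partkey INTEGER, p_name TEXT)")

cur.executemany("INSERT INTO s1_orders VALUES (?, ?)", [(1, 10.0)])
cur.executemany("INSERT INTO s2_orders VALUES (?, ?)", [(2, 20.0), (3, 30.0)])
cur.executemany("INSERT INTO s3_orders VALUES (?, ?)", [(4, 40.0)])
for site in ("s1", "s2"):
    cur.execute(f"INSERT INTO {site}_part VALUES (1, 'bolt')")

# order3: fragmented table, so the virtual table is the union of all fragments.
cur.execute("""
    CREATE VIEW order3 AS
    SELECT * FROM s1_orders
    UNION ALL SELECT * FROM s2_orders
    UNION ALL SELECT * FROM s3_orders
""")
# part_r: redundant table, so reading either copy returns the full table.
cur.execute("CREATE VIEW part_r AS SELECT * FROM s1_part")

order_count, order_total = cur.execute(
    "SELECT COUNT(*), SUM(o_totalprice) FROM order3"
).fetchone()
part_count = cur.execute("SELECT COUNT(*) FROM part_r").fetchone()[0]
print(order_count, order_total, part_count)  # 4 100.0 1
```

The client queries `order3` and `part_r` as ordinary relational tables; whether the rows come from three fragments or from one of two copies is hidden behind the view definition.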