Performance Measurements of a User-Space DAFS Server with a Database Workload
Samuel A. Fineberg
Don Wilson
NonStop Labs
NICELI Presentation, August 27, 2003
© 2003 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice.
Outline
• Background on DAFS and ODM
• Prototype client and server
• I/O tests performed
• Raw benchmark results
• Oracle TPC-H results
• Summary and conclusions
What is the Direct Access File System (DAFS)?
• Created by the DAFS Collaborative
  – Group consisting of over 80 members from industry, government, and academic institutions
  – DAFS 1.0 spec was approved in September 2001
• DAFS is a distributed file access protocol
  – Data requested from files, not blocks
  – Based loosely on NFSv4
• Optimized for local file sharing environments
  – Systems are in relatively close proximity
  – Connected by a high-speed, low-latency network
• Built on top of direct-access transport networks
  – Initially targeted at Virtual Interface Architecture (VIA) networks
  – Direct Access Transport (DAT) API was later generalized and ported to other networks (e.g., InfiniBand, iWARP)
Characteristics of a Direct Access Transport
• Connected model, i.e., VIs must be connected before communication can occur
• Two forms of data transport
  – Send/receive – two-sided
  – RDMA read and write – one-sided
• Both transports support direct data placement
  – Receives must be pre-posted
• Memory regions must be “registered” before they can be transferred through a DAT
  – Pins data in physical memory
  – Establishes VM translation tables for the NIC
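These constraints impose a fixed ordering on any DAT-style transfer: register memory, pre-post receives, connect the VI, then send or issue RDMA operations. The C sketch below illustrates that ordering only; the dat_*-prefixed names and types are invented stand-ins, not the actual VIPL or DAT API.

```c
/* Ordering illustration only; the dat_* names are invented stand-ins,
 * not the real VIPL/DAT interfaces. Error handling omitted. */
void dat_ordering_sketch(struct dat_vi *vi, void *buf, size_t len)
{
    /* Registration pins the buffer in physical memory and sets up the
     * NIC's virtual-to-physical translation entries. */
    dat_mem_handle_t mh = dat_register_memory(vi->nic, buf, len);

    /* Receives must be pre-posted before the peer can send to us. */
    dat_post_recv(vi, buf, len, mh);

    /* The VI must be connected before any communication can occur. */
    dat_connect(vi, vi->peer_address);

    /* Two-sided transfer: a send is matched by a pre-posted receive. */
    dat_post_send(vi, buf, len, mh);

    /* One-sided transfer: RDMA write places data directly into a
     * registered buffer previously advertised by the peer. */
    dat_rdma_write(vi, buf, len, mh, vi->remote_address, vi->remote_handle);

    /* Deregistration unpins the memory when it is no longer needed. */
    dat_deregister_memory(vi->nic, mh);
}
```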
DAFS Details
• Session based
  – DAFS clients initiate sessions with a server
  – DAT/VIA connection is associated with a session
• RPC-like command format
  – Implemented with send/receive
  – Server “receives” requests “sent” from clients
  – Server “sends” responses to be “received” by client
• Open/Close
  – Unlike NFSv2, files must be opened and closed (not stateless)
• Read/Write I/O “modes”
  – Inline: limited amount of data included in request/response
  – Direct: server initiates RDMA read or write to move data
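The two modes differ mainly in who moves the bulk data. The sketch below shows how a server might act on a read request in each mode; the message layouts and helper names are invented for illustration and are not the DAFS SDK interfaces.

```c
/* Sketch only: request/response types and helpers are invented. */
void serve_read(struct session *s, const struct read_req *req)
{
    struct read_resp resp = { .status = DAFS_OK, .length = req->length };

    if (req->mode == IO_INLINE) {
        /* Inline: the data rides in the response message itself, so a
         * single send completes the operation (small I/Os). */
        disk_read(req->file, req->offset, resp.inline_data, req->length);
        send_response(s, &resp, sizeof(resp) + req->length);
    } else {
        /* Direct: the server pushes the data straight into the client's
         * pre-registered buffer with an RDMA write, then sends a short
         * response to signal completion. */
        disk_read(req->file, req->offset, s->io_buf, req->length);
        rdma_write(s->vi, s->io_buf, req->length,
                   req->client_address, req->client_mem_handle);
        send_response(s, &resp, sizeof(resp));
    }
}
```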
Inline vs. Direct I/O
[Diagram: request/response timelines between client and server.
 Inline: client sends request; server performs disk read or write; server sends response (data carried inline).
 Direct write: client sends request; server RDMA-reads the data from the client, performs the disk write, then sends the response.
 Direct read: client sends request; server performs the disk read, RDMA-writes the data to the client, then sends the response.]
Oracle Disk Manager (ODM)
• File access interface spec for the Oracle database
  – Supported as a standard feature in Oracle 9i
  – Implemented as a vendor-supplied DLL
  – Files that cannot be opened using ODM use standard APIs
• Basic commands
  – Files are created and pre-allocated, then committed
  – Files are then “identified” (opened) and “unidentified” (closed)
  – All read/write I/O uses an asynchronous “odm_io” command
    • I/Os specified as descriptors, multiple per odm_io call
  – Multiple waiting mechanisms: wait for specific, wait for any
  – Other commands are synchronous, e.g., resizing
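The command names above (identify, unidentify, odm_io) come from the slide; the sketch below only illustrates that call sequence, using invented C types and signatures rather than Oracle's actual ODM interface.

```c
/* Call-sequence sketch; the odm_* signatures and types here are invented
 * for illustration and are not Oracle's ODM specification. */
void odm_sequence_sketch(const char *path, void *buf0, void *buf1)
{
    odm_fid_t fid;
    odm_identify(path, &fid);                  /* "identify" == open */

    /* Asynchronous I/O: one odm_io call carries multiple descriptors. */
    odm_ioreq_t reqs[2] = {
        { .fid = fid, .op = ODM_READ,  .offset = 0,     .len = 16384, .buf = buf0 },
        { .fid = fid, .op = ODM_WRITE, .offset = 16384, .len = 16384, .buf = buf1 },
    };
    odm_io(reqs, 2);                           /* returns before the I/O completes */

    /* Completion: wait for specific requests (a wait-for-any variant
     * also exists, per the slide). */
    odm_wait_specific(&reqs[0]);
    odm_wait_specific(&reqs[1]);

    odm_unidentify(fid);                       /* "unidentify" == close */
}
```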
Prototype Client/Server
• DAFS Server
  – Implemented for Windows 2000 and Linux (all testing was on Windows)
  – Built on VIPL 1.0 using DAFS 1.0 SDK protocol stubs
  – All data buffers are pre-allocated and pre-registered
  – Data-driven multithreaded design
• ODM Client
  – Implemented as a Windows 2000 DLL for Oracle 9i
  – Multithreaded to enable decoupling of asynchronous I/O from Oracle threads
  – Inline buffers are copied; direct buffers are registered/deregistered as part of the I/O
  – Inline/direct threshold (set when the library is initialized)
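A minimal sketch of the client-side decision the last two bullets describe, with invented helper names: transfers below the threshold are copied into a pre-registered inline buffer, while larger transfers are registered only for the duration of the I/O.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Invented helper names; illustrates only the inline/direct split. */
static size_t inline_threshold;   /* fixed when the library is initialized */

void submit_io(struct dafs_session *s, void *user_buf, size_t len,
               uint64_t offset, int is_write)
{
    if (len < inline_threshold) {
        /* Inline: copy into a pre-registered bounce buffer; costs a
         * memory copy but avoids per-I/O registration. */
        struct inline_buf *ib = acquire_inline_buf(s);
        if (is_write)
            memcpy(ib->data, user_buf, len);
        send_inline_request(s, ib, len, offset, is_write);
    } else {
        /* Direct: register the caller's buffer so the server can RDMA
         * into or out of it; deregister in the completion path. */
        mem_handle_t mh = register_memory(s->nic, user_buf, len);
        send_direct_request(s, user_buf, len, offset, is_write, mh);
    }
}
```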
Test System Configuration
• Goal was to compare local I/O with DAFS
• Local I/O configuration
  – Single system running Oracle on locally attached disks
• DAFS/ODM I/O configuration
  – One system running DAFS server software with locally attached disks
  – Second system running Oracle and the ODM client; files on the DAFS server accessed using ODM over a network
• 4-processor Windows 2000 Server based systems
  – 500 MHz Xeon, 3 GB RAM, dual-bus PCI 64/33
  – ServerNet II (VIA 1.0 based) system area network
  – 15K RPM disks attached by two PCI RAID controllers, configured for RAID 1/0 (mirrored-striped)
Experiments
• Raw I/O tests
  – Odmblast – streaming I/O test
  – Odmlat – I/O latency test
  – DAFS tests used the ODM DLL to access files on the DAFS server
  – Local tests used a special local ODM library built on Windows unbuffered I/O
• Oracle database test
  – Standard TPC-H benchmark
  – SQL-based decision support code
  – DAFS tests used the ODM DLL to access files on the DAFS server
  – Local tests ran without ODM (Oracle uses Windows unbuffered I/O directly)
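For reference, the usual Win32 unbuffered-I/O pattern looks like the fragment below. This is a generic example, not the authors' local ODM library: FILE_FLAG_NO_BUFFERING bypasses the file-system cache but requires sector-aligned buffers, offsets, and lengths, which is also why the Oracle configuration keeps every I/O 512-byte aligned.

```c
#include <windows.h>
#include <malloc.h>

/* Generic Win32 unbuffered, overlapped I/O pattern (error handling
 * omitted). With FILE_FLAG_NO_BUFFERING the buffer address, file
 * offset, and transfer length must all be multiples of the sector
 * size (512 bytes here). */
void unbuffered_read_example(void)
{
    HANDLE h = CreateFileA("datafile.dbf", GENERIC_READ, 0, NULL,
                           OPEN_EXISTING,
                           FILE_FLAG_NO_BUFFERING | FILE_FLAG_OVERLAPPED,
                           NULL);

    void *buf = _aligned_malloc(16384, 512);      /* sector-aligned buffer */

    OVERLAPPED ov = {0};                          /* offset 0 is aligned */
    ReadFile(h, buf, 16384, NULL, &ov);           /* asynchronous read */

    DWORD bytes;
    GetOverlappedResult(h, &ov, &bytes, TRUE);    /* wait for completion */

    _aligned_free(buf);
    CloseHandle(h);
}
```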
Odmblast
• ODM-based I/O stress test
  – Intended to present a continuous load to the I/O system
  – Issues many simultaneous I/Os (to allow for pipelining)
• In our experiments, Odmblast streams 32 I/Os to the server (see the loop sketch below)
  – 16 I/Os per odm_io call
  – Wait for the I/Os from the previous odm_io call before issuing more
• I/Os can be reads, writes, or a random mix
• I/Os can be at sequential or random offsets
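A simplified reconstruction of that issue pattern (with invented descriptor and wait helpers, not the actual Odmblast source): two 16-descriptor batches alternate, and each new batch is issued before the previous one is reaped, keeping roughly 32 I/Os outstanding.

```c
/* Simplified reconstruction of the Odmblast issue loop; fill_batch and
 * wait_for_batch are invented helpers. Keeps two 16-descriptor batches
 * in flight, i.e., ~32 outstanding I/Os. */
#define BATCH 16

void odmblast_loop(odm_fid_t fid, int iterations)
{
    odm_ioreq_t batch[2][BATCH];
    int cur = 0;

    fill_batch(batch[cur], BATCH, fid);      /* reads, writes, or a random mix */
    odm_io(batch[cur], BATCH);               /* first batch: nothing to wait on */

    for (int i = 1; i < iterations; i++) {
        int next = 1 - cur;
        fill_batch(batch[next], BATCH, fid);
        odm_io(batch[next], BATCH);          /* issue the new batch ...          */
        wait_for_batch(batch[cur], BATCH);   /* ... then reap the previous batch */
        cur = next;
    }
    wait_for_batch(batch[cur], BATCH);       /* drain the final batch */
}
```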
Odmblast read comparison
[Chart: bandwidth (MB/sec, 0–250) vs. I/O size (bytes, 0–1,000,000) for local sequential read, local random read, DAFS sequential read, and DAFS random read.]
Odmblast write comparison
[Chart: bandwidth (MB/sec, 0–100) vs. I/O size (bytes, 0–1,000,000) for local sequential write, local random write, DAFS sequential write, and DAFS random write.]
Odmlat
• I/O latency test
  – How long does a single I/O take
    • (not necessarily related to aggregate I/O rate)
  – For these experiments, <16 KB = inline, ≥16 KB = direct
  – Derived the components that make up I/O time using linear regression
  – More details in the paper
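One way to read such a regression (the paper gives the actual decomposition): model the per-operation time as T(s) ≈ T0 + s/B, where the intercept T0 collects the fixed per-operation costs (request/response messaging, and registration for direct I/Os) and the slope 1/B is the inverse of the sustained transfer rate. This is a generic interpretation of a linear latency fit, not necessarily the paper's exact parameterization.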
Odmlat performance
[Chart: time per operation (microseconds, 0–3000) vs. bytes per I/O operation (0–65536), plotting read time and write time.]
Oracle-based results
• Standard database benchmark – TPC-H
  – Written in SQL
  – Decision support benchmark
  – Multiple ad-hoc query streams with an “update thread”
  – 30 GB database size
• Oracle configuration
  – All I/Os 512-byte aligned (required for unbuffered I/O)
  – 16 KB database block size
  – Database files distributed across two NTFS file systems
• Measurements
  – Compared average runtime for local vs. DAFS-based I/O
  – Skipped official “TPC-H power” metric
  – Varied inline/direct threshold for DAFS-based I/O
Oracle TPC-H Performance
[Chart: elapsed time (hrs:min) – local: 13:23, DAFS 16K-inline: 14:39, DAFS 16K-direct: 17:13.]
Oracle TPC-H Operation Distribution
[Chart: distribution of I/O operations – 16 KB reads 79.4%, >16 KB reads 19.1%, >16 KB writes 1.1%, 16 KB writes 0.3%.]
Oracle TPC-H CPU Utilization
[Chart: % CPU used vs. elapsed time (0–1200 minutes) for local I/O, DAFS client (inline I/O), DAFS server (inline I/O), DAFS client (direct I/O), and DAFS server (direct I/O).]
TPC-H Summary
• Local I/O still faster
  – Limited ServerNet II bandwidth
  – Memory registration or copying overhead
  – Windows unbuffered I/O is already very efficient
• DAFS still has more capabilities than local I/O
  – Capable of cluster I/O (RAC)
• Memory registration is still a problem with DATs
  – Registration caching can be problematic
    • Cannot guarantee address mappings will not change
    • ODM has no means for notifying the NIC of mapping changes
  – Need either better integration of the I/O library with Oracle or better integration of the OS with the DAT
    • Transparency is expensive!
Conclusions
• DAFS Server/ODM Client achieved performance close to the limits of our network
  – Local SCSI I/O was still faster
• Running a database benchmark, DAFS TPC-H performance was within 10% of local I/O
  – Also provides the advantages of a network file system (i.e., clustering support)
• Limitations of our tests
  – ServerNet II bandwidth was inadequate – no support for multiple NICs
  – Needed to do client-side registration for all direct I/Os
• TPC-H benchmark was not optimally tuned
  – Needed to bring client CPU closer to 100%
    • More disks, fewer CPUs, other tuning
  – CPU offload is not a benefit if I/O is the bottleneck