Download Performance Analysis of Applying Replica Selection Technology for

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Bus (computing) wikipedia , lookup

Internet protocol suite wikipedia , lookup

Low-voltage differential signaling wikipedia , lookup

Airborne Networking wikipedia , lookup

Recursive InterNetwork Architecture (RINA) wikipedia , lookup

IEEE 1355 wikipedia , lookup

UniPro protocol stack wikipedia , lookup

Transcript
Performance Analysis of Applying Replica Selection
Technology for Data Grid Environments*
Chao-Tung Yang1,†, Chun-Hsiang Chen1, Kuan-Ching Li2, and Ching-Hsien Hsu3
1
High-Performance Computing Laboratory,
Department of Computer Science and Information Engineering,
Tunghai University, Taichung 40704, Taiwan
[email protected]
2
Parallel and Distributed Processing Center,
Department of Computer Science and Information Management,
Providence University, Taichung 43301, Taiwan
[email protected]
3
Department of Computer Science and Information Engineering,
Chung Hua University, Hsinchu 300, Taiwan
[email protected]
Abstract. The Data Grid enables the sharing, selection, and connection of a
wide variety of geographically distributed computational and storage resources
for solving large-scale data intensive scientific applications. Such technology
efficiently manage and transfer terabytes or even petabytes of data for dataintensive, high-performance computing applications in wide-area, distributed
computing environments. Replica selection process allows an application to
choose a replica from replica catalog, based on its performance and data access
features. In this paper, we build a Grid environment based on three existing PC
Cluster environments and perform performance analysis of data transfers using
GridFTP protocol over these systems. In addition, based on experimental
results, it is proposed a cost model to pick the best replica, in real and dynamic
network situations.
Keywords: Grid computing, Data Grid, Replica selection, Globus, GridFTP.
1 Introduction
Grid computing is utilization of many computers’ resources in a network to a single
problem at the same time - usually to a scientific or technical problem that requires a
great number of computer processing cycles or access to large amounts of data. A
Grid computing environment provides a platform for scientific applications and
physical experiments. A Grid is a large-scale virtual organization which resources are
shared in order to solve problems [4, 7, 9, 10, 11 12]. Grid computing is distributed
computing taken to the next evolutionary level. The goal is to create the vision of
*
This paper is supported in part by NSC Taiwan (National Science Council), under grants no.
NSC92-2213-E-029-025, NSC92-2119-M-002-024, NSC 93-2119-M-002-004 and NSC932213-E-029-026.
†
The corresponding author.
V. Malyshkin (Ed.): PaCT 2005, LNCS 3606, pp. 278 – 287, 2005.
© Springer-Verlag Berlin Heidelberg 2005
Performance Analysis of Applying Replica Selection Technology
279
large and powerful self-managing virtual computer, which is a huge collection of
connected heterogeneous systems. The emerging mechanism is resources sharing
through the availability of high bandwidth network. The “computational Grid” is a
term used to provider the users a better performance, especially in terms of speed and
throughput. The term “Data Grid” aggregate distributed resources to produce results
for large size problems. Most of these Data Grid applications are executed
simultaneously and access a large number of shared data files in Grid.
In certain data intensive scientific applications, such as high-energy physics,
bioinformatics applications and astrophysical virtual observatory, we confront with
huge amount of data. A Data Grid provides two essential basic services, which are a
secure, reliable, efficient data transport protocol and replica management [2]. The
high-speed transport protocol, GridFTP, extends the popular FTP protocol with some
new features required for Data Grid applications, such as partial file transfer and
third-party transfer [5]. The replica management service take advantage of replica
catalog with GridFTP transfer to provide for the creation, registration, location and
management of data replicas [1].
In this paper, we build a Grid environment based on three existing PC Cluster
environments and perform performance analysis of data transfers using GridFTP
protocol over these systems. In addition, based on experimental results, it is proposed
a cost model to pick the best replica, in real and dynamic network situations. In this
paper, we propose a cost model according to the three significant parameters: network
bandwidth, CPU load and I/O state. Although the network situation is constantly
changing and the storage equipments are busy or idle, we can use our cost model to
determine the best replica immediately. The replica selection can be conducted
accurately because our cost model is based on the system monitoring information that
update continuously.
2 Background Review
2.1 Globus Toolkit
The Globus Project [10, 11, 12] provides software tools that make it easier to build
computational Grids and Grid-based applications. These tools are collectively called
The Globus Toolkit. The Globus Toolkit is used by many organizations to build
computational Grids that can support their applications. The composition of the
Globus Toolkit can be pictured as three pillars: Resource Management, Information
Services, and Data Management. Each pillar represents a primary component of the
Globus Toolkit and makes use of a common foundation of security. GRAM
implements a resource management protocol, MDS implements an information
services protocol, and GridFTP implements a data transfer protocol. They all use the
GSI security protocol at the connection layer [8, 11, 12, 13].
2.2 NWS
The Network Weather Service (NWS) [16] is a generalized and distributed
monitoring system for producing short-term performance forecasts based on historical
performance measurements. The goal of the system is to dynamically characterize and
280
C.-T. Yang et al.
forecast the performance deliverable at the application level from a set of network and
computational resources. It is composed of three component processes:
− nws_nameserver: implements a naming and discovery service used to manage a
system of nws_sensor and nws_memory,
− nws_memory: provides persistent storage for the measurement data collected by the
NWS deployment,
− nws_sensor: gathers performance measurements from a specified resource and
communicates it to a set of nws_memory specified on the command line.
A typical installation would involve one nws_nameserver, one or more
nws_memory (which may reside on different machines), and a nws_sensor running on
each machine for which resources are to be monitored. The system includes sensors
for end-to-end TCP/IP performance (bandwidth and latency), available CPU
percentage, and available non-paged memory.
2.3 Sysstat Utilities
The Sysstat [15] utilities are a collection of performance monitoring tools for Linux
OS, which sysstat package contains the sar, mpstat, and iostat commands. The sar
command collects and reports system activity information. This information can also
be saved in a system activity file for future inspection. The iostat command reports
CPU statistics and I/O statistics for tty devices and disks. The statistics reported by
sar concern I/O transfer rates, paging activity, process-related activities, interrupts,
network activity, memory and swap space utilization, CPU utilization, kernel
activities, and tty statistics, among others. Both uniprocessor (UP) and Symmetric
multiprocessor (SMP) machines are fully supported.
3 Replica Selection
3.1 Replica Selection Scenario
The system established in this research used the following architecture. Figure 1
shows our proposed replica selection model, to show how a client identifies the best
location for a desired replica transfer.
At first, the client login at the site local site and execute parallel applications in the
Data Grid platform. This application checks the files are located in local site or not. If
they are present at the local site, the application accesses them immediately.
Otherwise, the application passes the logical file names to replica catalog server,
which returns a list of physical locations for all registered copies. The application
passes this list of replica locations to a replica selection server, which identifies the
destination locations of storage system for all candidate data transfer operations.
The replica selection server sends the possible destination locations to information
server, which provides the performance of measurements and predictions of three
system factors, as described in next section. According to these estimates, the replica
selection server chooses the best replica location and returns location information to
the parallel application, which receives the replica through GridFTP. Once finished
the application’s computation, the application returns the results to user.
Performance Analysis of Applying Replica Selection Technology
281
Fig. 1. Replica selection scenario
3.2 System Factors
We propose a replication selection model for Data Grid environments. In this
environment, we can treat a biological database as a replica of Data Grid. When we
execute large-scale data intensive applications in these environments, a site has both
data stores and computational capabilities. To determine the best database from many
of same replications is a significant problem. In our model, we consider three system
factors that affect the replica selection:
− Network bandwidth: Network bandwidth is one of the most significant factors in
Data Grid, since the size of a data file in Data Grid environment is usually very
large. In other words, the data file transfer time is tightly dependent on network
bandwidth situations. Because network bandwidth is unstable and dynamic factor,
we should often measure and predict it as most accurate as possible. NWS
(Network Weather Service) is a powerful toolkit for such purpose,
− CPU load: a Grid platform consists of a number of heterogeneous systems, built
with different system architectures, e.g., cluster platforms, supercomputers, PCs.
CPU load is a dynamic system factor, and if the CPU load of a system is heavy, it
will certainly affect the data file download process from this site. The measurement
of CPU status is done through the Globus Toolkit / MDS,
− I/O state: Data Grid nodes consist of different heterogeneous storage systems. The
size of data in Data Grid is huge. If I/O state of the site that we would like to
download file from is very busy, it will directly affect the data transfer
performance. We measure the I/O state using sysstat utilities.
3.3 Replica Selection Cost Model
The target function of a cost model for distributed and replicated data storage is the
score of information from information service. We listed different influencing factors
282
C.-T. Yang et al.
for our cost model in the previous section. However, we have to express these factors
within a mathematical notation for further analysis. We assume node I is the local site
which the user or application is logged in, while node j possesses the replica which
the user or application wanted. The seven system parameters in our replica selection
cost model are:
− Scorei − j : The score high or low represents the user or application acquiring the
replica effectively or not is from node I to node j,
− Pi −BWj : The percentage of bandwidth from node I to node j. In other words, the
current bandwidth divided the highest theoretical bandwidth,
− W BW : The weight of the network bandwidth defined by the administrator of the
Data Grid,
− PjCPU : The percentage of CPU idles of node j,
− W CPU : The weight of the CPU load defined by the administrator of the Data Grid,
− PjI / O : The percentage of I/O idles of node j,
− W I / O : The weight of the I/O state defined by the administrator of the Data Grid,
According to the given three system factors, we define the following general
formula as:
(1)
Scorei− j = Pi −BWj ⋅ W BW + PjCPU ⋅ W CPU + PjI / O ⋅ W I / O
In this formula, three influencing factors: W BW , W CPU , and W I / O , described as the
weights of network bandwidth, CPU, and I/O. These weights can be determined by
the administrator of the Data Grid organization. According to different attributes of
storage systems in Data Grid nodes, administrator can decide for different weights,
because some storage equipment does not affect CPU load. After several
experimental measurements, we consider that network bandwidth is the most
significant factor, influencing directly the data transfer time. When we perform data
transfer using GridFTP protocol, we discover that the CPU and I/O statuses slightly
affect the performance of data transfer. In our Data Grid environment, we define the
values as 80%, 10%, and 10%, respectively.
4 Experimental Environments and Results
In this section, there are experimental results using GridFTP protocol. First, we
measure and compare the FTP with GridFTP, as their file transfer time. Secondly, we
focused in the parallel data transfer in this paper, measuring and comparing the
GridFTP with 1, 2, 4, 8 and 16 TCP streams of file transfer time.
The Data Grid testbed consisting of three Linux PC clusters is built as:
− THU site: four PCs with dual AMD AthlonMP 2.0GHz processors, 1GB DDR
memory, 60GB HD, 1Gbps network bandwidth,
− Li-Zen site: four PCs with Intel Celeron 900MHz processor, 256MB DDR
memory, 10GB HD, 30 Mbps network bandwidth,
− HIT site: four PCs with Intel P4 2.8GHz processors, 512MB DDR memory, 80GB
HD, 1Gbps network bandwidth.
Performance Analysis of Applying Replica Selection Technology
283
Figure 2 shows the hardware and network configuration of our Data Grid testbed.
The THU site is located in Tunghai University, Taichung City; Li-Zen site is located
at Li-Zen High School, Taichung County, while HIT site is located in Hsiuping
Institute of Technology, Taichung County, all in Taiwan.
Fig. 2. Our Data Grid testbed
4.1 FTP Versus GridFTP
The Globus Project surveyed available protocols and technologies, implemented some
prototypes, and settled on using FTP and its existing extensions as a base, and then
extending it again to add missing required functionality. The Globus alliance propose
a common data transfer and access protocol named GridFTP that provides secure,
efficient data movement in Grid environments. This protocol, which extends the
standard FTP protocol, provides a superset of the features offered by the various Grid
storage systems currently in use.
In Grid environments, access to distributed data is typically as important as access
to distributed computational resources. Distributed scientific and engineering
applications require transfers of large amounts of data between storage systems, and
access to large amounts of data by many geographically distributed applications and
users for analyzing and visualization. We note that GridFTP protocol is extended
from FTP protocol, and suitable for Grid environments. Figure 3 shows the
performance of FTP and GridFTP by transferring four different file sizes. We
transferred these files (256, 512, 1024 and 2048 megabytes) from THU site alpha01 to
HIT site gridhit3 in our first experiment.
4.2 GridFTP with Parallel Data Transfer
Using multiple TCP streams can improve aggregate bandwidth over using a single
TCP stream in WAN environments. We apply this feature of GridFTP protocol to
transfer different sizes files in Data Grid environments. GridFTP (as well as normal
FTP) defines multiple wire protocols, or MODES, for the data channel. Most normal
284
C.-T. Yang et al.
FTP servers only implement stream mode, i.e., the bytes flow in order over a single
TCP connection. GridFTP defaults to this mode so that it is compatible with normal
FTP servers.
254.61
100
50
FTP
GridFTP
109.57
63.69
32.07
150
58.67
200
125.77
250
220.84
300
30.31
File Transfer Time (sec)
FTP versus GridFTP
0
256
512
1024
2048
File Sizes (MB)
Fig. 3. FTP versus GridFTP
However, GridFTP has another mode, called Extended Block Mode, or MODE E.
This mode sends the data over the data channel in blocks. Each block consists of 8
bits of flags, a 64 bit integer indicating the offset from the start of the transfer, and a
64 bit integer indicating the length of the block in bytes, followed by a payload of
length bytes. Because the offset and length are provided, out of order arrival is
acceptable, i.e., the 10th block could arrive before the 9th because you know explicitly
where it belongs. This allows us to use multiple TCP channels. If you use the
parallelism option, globus-url-copy automatically puts the servers into MODE E.
Note that parallel data transfer with one TCP stream is not the same as no parallel
data transfer at all. Both will use a single stream, but the default will use stream mode
and the parallel data transfer with one TCP stream will use mode E [12].
268.46
253.53
238.59
232.24
228.00
220.80
GridFTP with Parallel Data Transfer
200
100
50
66.89
62.28
60.34
58.77
56.54
56.32
150
GridFTP with no Parallel Data Transfer
GridFTP with 1 TCP Stream
GridFTP with 2 TCP Streams
GridFTP with 4 TCP Streams
140.50
134.51
130.09
126.24
116.58
111.17
250
29.08
28.52
28.08
28.89
28.83
28.99
File Transfer Time (sec)
300
GridFTP with 8 TCP Streams
GridFTP with 16 TCP Streams
0
256
512
1024
2048
File Sizes (MB)
Fig. 4. GridFTP with parallel data transfer
The parallelism option is used by the source data note to control how many parallel
data connections may be established to each destination data node. Figure 4 shows the
Performance Analysis of Applying Replica Selection Technology
285
performance of GridFTP transferring 256, 512, 1024 and 2048 megabytes files with 1,
2, 4, 8 and 16 TCP streams from THU site alpha02 to Li-Zen site lz04. According to
the experiment result, we observed that parallel data transfer technique showed better
performance for larger file sizes. Parallel data transfer really improves aggregate
bandwidth, with the establishment of multiple data channels.
4.3 Replica Selection Cost Model
According to the replica selection scenario in 3.1, a user logins the local site THU site
alpha1, and specifies the characteristics of the desired data and passes this attribute
description to replica catalog server. The replica catalog server queries its database
and produces a list of logical files that contain data with the specified characteristics.
The replica catalog server returns the information of physical locations for all
registered replicas of the desired logical files. In this experiment, there is only one
logical file, file-a, conform to user’s request, and the size of file-a is 1024 megabytes.
Table 1. The value of replica selection cost model and file transfer time
alpha1
Alpha4
hit0
lz02
Pi −BWj
88.25
29.09
20.91
PjCPU
98.67
99.56
98.33
2.88
100.00
3.78
80.76
101.9
43.228
128.09
26.939
164.99
I /O
j
P
Replica Selection Cost model
Practical Data transfer time
(a)
(b)
Fig. 5. GUI of replica selection cost model program
286
C.-T. Yang et al.
Next, the user passes this list of replica locations to the replica selection server,
which identifies the destination storage system locations for all candidate data transfer
operations. There are three replicas mapping to the logical file file-a. These three
replicas are individually located at different sites, alpha4, hit0, and lz02. The replica
selection server sends the candidate destination locations to the information server
[17], which provide the three system factors mentioned in 3.2. Based on the replica
cost model referred in 3.3, the replica selection server chooses the best replica and
transfers it to the local site alpha1 by GridFTP. Table 1 shows the values of system
factors and the scores of the replica selection cost model, and the physical file transfer
time. According to discussions given in 3.3, we implemented a replica selection cost
model computer program. We also executed the program in our Data Grid testbed.
Because the program is developed using Java programming language, we can execute
it in any computing platform with JVM. Fig. 5(a) shows costs that are calculated
based on the three system factors (the percentage of CPU idle, I/O idle and bandwidth
from other sites) to alpha1. Figure 5(b) displays the average value based on the
selected time scale, which is adjustable on the top scroll bar. We also can get the sort
list of the costs by clicking the “Cost” button.
5 Conclusions and Future Work
In this paper, we have presented the design and implementation of two fundamental
services. The GridFTP protocol was extended from FTP protocol, and it provides
beneficial features. In this research paper, we focused in parallel data transfer issues.
After measuring the performance of GridFTP with parallel data transfer feature, we
confirm that such technology improves data transfer. After measuring the
performance of FTP and GridFTP with four different file sizes, we could observe that
even file size is 2 gigabytes; the data transfer time is similar. However, we measured
the performance of GridFTP with 1, 2, 4, 8 and 16 TCP streams. We are sure that the
parallel data transfer technology efficiently saves data transfer time. After calculating
the score of replica selection cost model, we can sort a list of replicas from the most
efficient replica to worst one. Therefore, our cost model can provide users or
applications the best choice mechanism for replica selection.
As future work, there are three investigations will be carried out from this research.
First, although we have employed the parallel data transfer feature to improve the
performance of data transfer, there is another striped data transfer feature that can
improve aggregate bandwidth. Second, we will consider how to determine the system
factors weight and refer to more system factors in the replica selection cost model.
Third and last one, we will extend our Data Grid testbed for analyzing the
performance of replica selection in a dynamic and larger number of sites environment.
References
1. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V.
Nefedova, D. Quesnal, S. Tuecke, “Data Management and Transfer in High Performance
Computational Grid Environments,” Parallel Computing, Vol. 28 (5), pp. 749-771, May
2002.
Performance Analysis of Applying Replica Selection Technology
287
2. B. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V.
Nefedova, D. Quesnel, S. Tuecke, “Secure, Efficient Data Transport and Replica
Management for High-Performance Data-Intensive Computing,” IEEE Mass Storage
Conference, 2001.
3. B. Allcock, S. Tuecke, I. Foster, A. Chervenak, and C. Kesselman, “Protocols and Services
for Distributed Data-Intensive Science,” ACAT2000 Proceedings, pp. 161-163, 2000.
4. K. Czajkowski, S. Fitzgerald, I. Foster and C. Kesselman, “Grid Information Services for
Distributed Resource Sharing,” Proceedings of the Tenth IEEE International Symposium
on High-Performance Distributed Computing (HPDC-10), IEEE CS Press, August 2001.
5. K. Czajkowski, I. Foster, N. Karonis, C. Kesselman, S. Martin, W. Smith and S. Tuecke,
“A Resource Management Architecture for Metacomputing Systems,” Proc. IPPS/SPDP
‘98 Workshop on Job Scheduling Strategies for Parallel Processing, pp. 62-82, 1998.
6. R. L. De, C. Costa and S. Lifschitz, “Database Allocation Strategies for Parallel BLAST
Evaluation on Clusters”, Proceedings of the Distributed and Parallel Databases, Vol. 13,
Issue1, pp. 99-127, Hingham, MA, USA, January 2003.
7. I. Foster, “The Grid: A New Infrastructure for 21st Century Science,” Physics Today,
55(2):42-47, 2002.
8. I. Foster, C. Kesselman, “Globus: A Metacomputing Infrastructure Toolkit,” Intl J.
Supercomputer Applications, 11(2):115-128, 1997.
9. I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure,
Morgan-Kaufmann, 1999.
10. I. Foster, C. Kesselman and S. Tuecke, “The Anatomy of the Grid: Enabling Scalable
Virtual Organizations,” Intl J. Supercomputer Applications, 15(3), 2001.
11. Global Grid Forum, http://www.ggf.org/
12. The Globus Project, http://www.globus.org/
13. Introduction to Grid Computing with Globus, http://www.ibm.com/redbooks/
14. SETI@home: Search for Extraterrestrial Intelligence at home, http://setiathome.ssl.
berkeley. edu/
15. SYSSTAT utilities home page, http://perso.wanadoo.fr/sebastien.godard/
16. R. Wolski, N. Spring and J. Hayes, “The Network Weather Service: A Distributed
Resource Performance Forecasting Service for Metacomputing,” Journal of Future
Generation Computing Systems, Vol. 15, No. 5-6, pp. 757-768, October 1999.
17. X. Zhang, J. Freschl, and J. Schopf, “A Performance Study of Monitoring and Information
Services for Distributed Systems,” Proceedings of HPDC, August 2003.