Investigating Network Performance – A Case Study
Ralph Spencer, Richard Hughes-Jones, Matt Strong and Simon Casey
The University of Manchester
G2 Technical Workshop, Cambridge, January 2006
Very Long Baseline Interferometry
eVLBI – using the Internet for data transfer
GRS 1915+105: a 15 solar-mass black hole in an X-ray binary
[MERLIN image showing the receding jet component; 600 mas = 6000 A.U. at 10 kpc]
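As a quick check of the scale quoted in that caption, the small-angle relation gives size [AU] = angle [arcsec] × distance [pc]; a minimal sketch using only the slide's own figures:

# Worked check of the angular scale on the GRS 1915+105 slide.
# Small-angle relation: linear size [AU] = angle [arcsec] * distance [pc].
angle_mas = 600.0        # angular extent quoted from the MERLIN image
distance_kpc = 10.0      # adopted distance to GRS 1915+105

size_au = (angle_mas / 1000.0) * (distance_kpc * 1000.0)
print(f"{angle_mas:.0f} mas at {distance_kpc:.0f} kpc = {size_au:.0f} AU")  # -> 6000 AU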
Sensitivity in Radio Astronomy
• Noise level ∝ 1/√(B·t), where B = bandwidth and t = integration time
• High sensitivity requires large bandwidths as well as large collecting area, e.g. Lovell, GBT, Effelsberg, Cambridge 32-m
• Aperture synthesis needs signals from individual antennas to be correlated together at a central site
• Need for interconnection data rates of many Gbit/s (a short worked example of the 1/√(Bt) scaling follows)
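To make the bandwidth argument concrete, here is a minimal sketch of the 1/√(Bt) scaling. The 512 Mb/s and 1 Gb/s figures are the recorded data rates quoted elsewhere in the talk; treating data rate as a proxy for bandwidth is a simplifying assumption made only for this illustration.

import math

# Radiometer-style scaling from the slide: noise level is proportional to 1/sqrt(B * t).
def relative_noise(bandwidth_hz, integration_s):
    """Noise level up to an instrument-dependent constant."""
    return 1.0 / math.sqrt(bandwidth_hz * integration_s)

t_hour = 3600.0                                   # one hour of integration
noise_512 = relative_noise(512e6, t_hour)         # proxy: 512 Mb/s disk recording
noise_1g = relative_noise(1024e6, t_hour)         # proxy: ~1 Gb/s eVLBI link
print(f"noise ratio (1 Gb/s vs 512 Mb/s): {noise_1g / noise_512:.3f}")   # ~0.707

# Equivalently, the 512 Mb/s system needs twice the integration time to reach
# the same noise level:
print(f"equivalent time at 512 Mb/s: {t_hour * 1024 / 512 / 3600:.1f} h")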
New instruments are making the best use of bandwidth:
• eMERLIN 30 Gbps
• Atacama Large mm Array (ALMA) 120 Gbps
• EVLA 120 Gbps
• Upgrade to European VLBI: eVLBI 1 Gbps
• Square Km Array (SKA) many Tbps
The European VLBI Network (EVN)
• Detailed radio imaging uses antenna networks over 100s–1000s of km
• Currently use disk recording at 512 Mb/s (Mk5)
• Real-time connection allows greater
  – response
  – reliability
  – sensitivity
  – Need Internet
eVLBI: EVN – NREN connections
[Map: Gbit link to Onsala, Sweden (Chalmers University of Technology, Gothenburg); Gbit link to Torun, Poland; Jodrell Bank, UK and MERLIN Cambridge, UK on a dedicated Gbit link; Medicina, Italy; DWDM link to Dwingeloo and Westerbork, Netherlands]
Testing the Network for eVLBI
The aim is to obtain the maximum bandwidth compatible with VLBI observing systems in Europe and the USA.
First sustained data flow tests in Europe: iGRID 2002, 24–26 September 2002, Amsterdam Science and Technology Centre (WTCW), The Netherlands.
"We hereby challenge the international research community to demonstrate applications that benefit from huge amounts of bandwidth!"
iGRID2002 Radio Astronomy VLBI Demo
• Web-based demonstration sending VLBI data
  – A controlled stream of UDP packets
  – 256–500 Mbit/s
• Production network: Manchester – SuperJANET – GÉANT – Amsterdam
• Dedicated lambda: Amsterdam – Dwingeloo
The Works:
[Diagram: a web interface and a TCP control channel manage the transfer; data are read from a RAID0 disc into a ring buffer, sent as a paced UDP data stream (n bytes per packet, a fixed wait time between packets), received into a ring buffer and written to a RAID0 disc at the far end. A sketch of the paced-send loop follows.]
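The "n bytes / wait time" loop above is the heart of the UDP tests: send a fixed-size packet, wait a chosen spacing, repeat. A minimal sketch follows; it is only an illustration of the idea, not the UDPmon or vlbiUDP code, and the address, packet size and spacing are placeholder assumptions.

import socket
import time

DEST = ("192.0.2.10", 5001)      # hypothetical receiver address and port
PACKET_BYTES = 1472              # largest UDP payload fitting a 1500-byte MTU
SPACING_US = 15.0                # wait time between frames, microseconds
NUM_PACKETS = 10000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = bytearray(PACKET_BYTES)

for seq in range(NUM_PACKETS):
    # Carry a sequence number so the receiver can detect loss and re-ordering.
    payload[0:4] = seq.to_bytes(4, "big")
    sock.sendto(payload, DEST)
    # Busy-wait pacing; real tools use finer-grained timing than time.sleep().
    deadline = time.perf_counter() + SPACING_US * 1e-6
    while time.perf_counter() < deadline:
        pass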
UDP Throughput on the Production WAN
[Plots: received wire rate (Mbit/s) vs transmit time per frame (µs) for packet sizes from 50 to 1472 bytes]
• UDP Manchester – UvA Gig, 19 May 2002: Manchester – UvA (SARA) 750 Mbit/s over SuperJANET4 + GÉANT + SURFnet; 75% of the Manchester access link
• UDP Manchester – UvA Gig, 28 April 2002: Manchester – UvA (SARA) 825 Mbit/s
How do we test the network?
• Simple connectivity test from telescope site to correlator (at JIVE, Dwingeloo, The Netherlands, or MIT Haystack Observatory, Massachusetts): traceroute, bwctl
• Performance of link and end hosts: UDPmon, iperf and vlbiUDP (under development)
• Sustained data tests
• True eVLBI data from Mk5 recorder: pre-recorded (Disk2Net) or real time (Out2Net)
[Note: the Mk5's are 1.2 GHz P3's with StreamStor cards and 8-pack exchangeable disks (1.3 Tbytes storage), capable of 1 Gbps continuous recording and playback. Made by Conduant, Haystack design.]
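On the receive side, the quantities reported throughout these tests (received wire rate, % packet loss, re-ordering) can be derived from the sequence numbers carried in each packet. Below is a minimal companion sketch to the sender above, again only an illustration; the port and per-frame wire overhead are assumptions.

import socket
import time

PORT = 5001
WIRE_OVERHEAD = 8 + 20 + 14 + 4 + 8 + 12   # UDP, IP, Eth header, FCS, preamble, gap

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", PORT))
sock.settimeout(2.0)                        # stop once the stream goes quiet

received = reordered = 0
highest_seq = -1
wire_bytes = 0
first = last = None

try:
    while True:
        data, _ = sock.recvfrom(65536)
        last = time.perf_counter()
        first = first if first is not None else last
        seq = int.from_bytes(data[:4], "big")
        received += 1
        wire_bytes += len(data) + WIRE_OVERHEAD
        if seq < highest_seq:
            reordered += 1                  # packet arrived after a later one
        highest_seq = max(highest_seq, seq)
except socket.timeout:
    pass

if received:
    sent_estimate = highest_seq + 1
    lost = max(sent_estimate - received, 0)
    elapsed = max(last - first, 1e-9)
    print(f"recv wire rate {wire_bytes * 8 / elapsed / 1e6:.0f} Mbit/s, "
          f"packet loss {100 * lost / sent_estimate:.2f} %, re-ordered {reordered}")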
Telescope connections
[Map of telescope links into JIVE: Westerbork (Netherlands) 1 Gb/s; Onsala (Sweden) 1 Gb/s; Effelsberg (Germany) expected end 2006; Jodrell Bank (UK) 2 x 1G; eMERLIN Cambridge (UK) 155 Mb/s; Torun (Poland) 1 Gb/s, lit now; Medicina (Italy) 1 Gb/s; some links marked "??" on the original slide]
eVLBI Milestones
• January 2004: disk-buffered eVLBI session
  – Three telescopes at 128 Mb/s for the first eVLBI image
  – Onsala – Westerbork fringes at 256 Mb/s
• April 2004: three-telescope, real-time eVLBI session
  – Fringes at 64 Mb/s
  – First real-time EVN image at 32 Mb/s
• September 2004: four-telescope real-time eVLBI
  – Fringes to Torun and Arecibo
  – First EVN eVLBI science session
• January 2005: first "dedicated light-path" eVLBI
  – ??Gbyte of data from the Huygens descent transferred from Australia to JIVE
  – Data rate ~450 Mb/s
• 20 December 2004
  – Connection of JBO to Manchester by 2 x 1 GE
  – eVLBI tests between Poland, Sweden, the UK and the Netherlands at 256 Mb/s
• February 2005
  – TCP and UDP memory-to-memory tests at rates up to 450 Mb/s (TCP) and 650 Mb/s (UDP)
  – Tests showed inconsistencies between Red Hat kernels; rates of only 128 Mb/s obtained on 10 Feb
  – Haystack (US) – Onsala (Sweden) runs at 256 Mb/s
• 11 March 2005: science demo
  – JBO telescope winded off; a short run on a calibrator source was done
Summary of EVN eVLBI tests
• Regular tests with eVLBI Mk5 data every ~6 weeks
  – 128 Mbps OK, 256 Mbps often
  – 512 Mbps Onsala – JIVE occasionally
  – but not JBO at 512 Mbps – WHY NOT?
  (NB using jumbo packets of 4470 or 9000 bytes)
• Note the correlator can cope with large error rates
  – up to ~1%
  – but need high throughput for sensitivity
  – implications for protocols, since TCP throughput is very sensitive to packet loss (see the worked estimate below)
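To quantify how sensitive TCP is to loss, the standard Mathis et al. approximation for a long-lived flow, rate ≈ (MSS/RTT) · C/√p with C ≈ 1.22, can be evaluated for numbers like those in these tests. The MSS, RTT and loss values below are illustrative assumptions, not measurements:

import math

def mathis_rate_mbps(mss_bytes, rtt_s, loss_prob, c=1.22):
    # Long-lived TCP throughput estimate: (MSS/RTT) * C / sqrt(p), in Mbit/s.
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_prob) / 1e6

MSS = 4470 - 40            # jumbo frame minus IP and TCP headers (no options)
RTT = 0.015                # ~15 ms, of the order of Jodrell Bank - JIVE
for loss in (1e-6, 1e-4, 1e-2):
    print(f"loss {loss:.0e}: ~{mathis_rate_mbps(MSS, RTT, loss):.0f} Mbit/s")

# Even a loss rate the correlator could tolerate (~1 %) caps a single TCP
# stream at a few tens of Mbit/s here, far below the 512 Mbit/s target,
# whereas a paced UDP stream keeps flowing.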
UDP Throughput Oct–Nov 2003: Manchester – Dwingeloo production network
• Throughput vs packet spacing (UDPmon); Manchester: 2.0 GHz Xeon, Dwingeloo: 1.2 GHz PIII
• Near wire rate, 950 Mbps
[Plots, Gnt5-DwMk5 11Nov03 and DwMk5-Gnt5 13Nov03, 1472-byte frames: received wire rate (Mbit/s), % packet loss, and % CPU kernel load on sender and receiver, each vs spacing between frames (µs)]
• 4th year project: Adam Mathews, Steve O'Toole
A back-of-envelope check of the wire-rate plateau follows.
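For 1472-byte UDP payloads on gigabit Ethernet the achievable rate follows directly from the frame size and the spacing between frames. The overhead accounting below is an assumption about what counts towards the wire rate and may differ in detail from UDPmon's definition:

UDP_PAYLOAD = 1472                        # bytes, fills a 1500-byte MTU
OVERHEAD = 8 + 20 + 14 + 4 + 8 + 12       # UDP, IP, Eth header, FCS, preamble, gap
ON_WIRE = UDP_PAYLOAD + OVERHEAD          # 1538 bytes occupy the link per frame

serialise_us = ON_WIRE * 8 / 1000.0       # time one frame occupies a 1 Gbit/s link
print(f"back-to-back frame spacing: {serialise_us:.1f} us")

for spacing_us in (serialise_us, 15.0, 20.0, 30.0):
    payload_rate = UDP_PAYLOAD * 8 / spacing_us     # Mbit/s of UDP payload
    print(f"spacing {spacing_us:5.1f} us -> {payload_rate:5.0f} Mbit/s")
# At the minimum spacing (~12.3 us) the payload rate is ~957 Mbit/s, consistent
# with the ~950 Mbit/s plateau of the throughput curves.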
ESLEA
• Packet loss will cause low throughput in TCP/IP
• Congestion will result in routers dropping packets: use switched light paths!
• Tests with the MB-NG network Jan–Jun 2005
• JBO connected to JIVE via UKLight in June (thanks to John Graham, UKERNA)
• Comparison tests between UKLight connections JBO – JIVE and production (SJ4 – GÉANT)
Project partners and collaborators include The Council for the Central Laboratory of the Research Councils
Funded by EPSRC GR/T04465/01: £1.1M, 11.5 FTE
www.eslea.uklight.ac.uk
UKLight switched light path
Tests on the UKLight switched lightpath, Manchester – Dwingeloo (gig03-jiveg1_UKL_25Jun05):
• Throughput as a function of inter-packet spacing (2.4 GHz dual Xeon machines)
[Plot: received wire rate (Mbit/s) vs spacing between frames (µs), packet sizes 50–1472 bytes]
• Packet loss for small packet sizes
[Plot: % packet loss (log scale) vs spacing between frames (µs), packet sizes 50–1472 bytes]
• Maximum-size packets can reach full line rate with no loss, and there was no re-ordering (plot not shown)
Tests on the production network, Manchester – Dwingeloo (gig6-jivegig1_31May05):
• Throughput
• Small (0.2%) packet loss was seen
• Re-ordering of packets was significant
[Plot: % packet loss (log scale) vs spacing between frames (µs), packet sizes 50–1472 bytes]
UKLight using Mk5 recording terminals
e-VLBI at the GÉANT2 launch, June 2005
[Map: Jodrell Bank, UK; Medicina, Italy; Torun, Poland; DWDM link to Dwingeloo]
UDP Performance: 3 Flows on GÉANT
• Throughput: 5-hour run, 1500-byte MTU
• Jodrell – JIVE: 2.0 GHz dual Xeon to 2.4 GHz dual Xeon, 670–840 Mbit/s
• Medicina (Bologna) – JIVE: 800 MHz PIII to Mk5 (623) 1.2 GHz PIII, 330 Mbit/s, limited by the sending PC
• Torun – JIVE: 2.4 GHz dual Xeon to Mk5 (575) 1.2 GHz PIII, 245–325 Mbit/s, limited by security policing (>600 Mbit/s → 20 Mbit/s)?
• Throughput over a 50-minute period shows a periodicity of ~17 min
[Plots, BW 14Jun05: received wire rate (Mbit/s) vs time (10 s steps) for the Jodrell, Medicina and Torun flows; full 5-hour run plus a 50-minute zoom]
18-Hour Flows on UKLight: Jodrell – JIVE, 26 June 2005
• Throughput: Jodrell – JIVE, 2.4 GHz dual Xeon to 2.4 GHz dual Xeon, 960–980 Mbit/s
• Traffic through SURFnet
• Packet loss: only 3 groups with 10–150 lost packets each; no packets lost the rest of the time
• Packet re-ordering: none
[Plots, man03-jivegig1_26Jun05: received wire rate (Mbit/s) and packet loss vs time (10 s steps) over the full run, with a zoom showing the rate holding at 960–980 Mbit/s]
Recent Results 1:
• iGRID 2005 and SC 2005
  – Global eVLBI demonstration
  – Achieved 1.5 Gbps across the Atlantic using UKLight
  – 3 VC-3-13c ~700 Mbps SDH links carrying data across the Atlantic from the Onsala, JBO and Westerbork telescopes
  – 512 Mbps K4 – Mk5 data from Japan to the USA
  – 512 Mbps Mk5 real-time interferometry between the Onsala, Westford and Maryland Point antennas, correlated at Haystack Observatory
  – Used VLSR technology from the DRAGON project in the US to set up light paths
[Images: JBO Mk2, Westerbork array, Onsala 20-m, Kashima 34-m]
Recent Results 2:
• Why can Onsala achieve 512 Mbps from Mk5 to Mk5, even transatlantic?
  – Identical Mk5 to JBO's
  – Longer link
• iperf TCP, JBO Mk5 to Manchester, rtt ~1 ms, 4420-byte packets: 960 Mbit/s, with 94.7% kernel usage and 1.5% idle
• iperf TCP, JBO Mk5 to JIVE, rtt ~15 ms, 4420-byte packets: 777 Mbit/s, with 96.3% kernel usage and 0.06% idle – no CPU left!
• Likelihood is that the Onsala Mk5 has a marginally faster CPU – at the critical point for 512 Mbps transmission (see the per-packet budget sketch below)
• Solution: better motherboards for the Mk5's – about 40 machines to upgrade!
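A rough per-packet budget makes the "no CPU left" point concrete. Using the iperf figures quoted above (4420-byte packets), the time available to handle each packet can be set against the reported kernel usage; the assumption that per-packet kernel cost dominates is made here only for illustration:

PACKET_BYTES = 4420

def per_packet_budget_us(rate_mbps):
    # Time between packets at a given throughput, in microseconds.
    packets_per_s = rate_mbps * 1e6 / (PACKET_BYTES * 8)
    return 1e6 / packets_per_s

for label, rate_mbps, kernel_pct in (("JBO -> Manchester", 960, 94.7),
                                     ("JBO -> JIVE", 777, 96.3)):
    budget = per_packet_budget_us(rate_mbps)
    print(f"{label}: {rate_mbps} Mbit/s -> one packet every {budget:.0f} us, "
          f"kernel already at {kernel_pct}%")
# At ~777 Mbit/s the host must process a 4420-byte packet roughly every 45 us
# with essentially no idle CPU, so a marginally faster CPU (as at Onsala) can
# be the difference between sustaining 512 Mbit/s or not.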
[Plots, mk5-606-jive_9Dec05 and mk5-606-g7_10Dec05: throughput (Mbit/s) and % CPU in each mode (kernel, user, nice, idle) versus trial number; annotations: "no CPU load", "nice large value – low priority"]
Not much wrong with the networks!
The Future:
• Regular eVLBI tests in the EVN continue
• Testing the Mk5 StreamStor interface <-> network interaction
• Test upgraded Mk5 recording devices
• Investigate alternatives to TCP/UDP – DCCP, vlbiUDP, tsunami, etc.
• ESLEA comparing UKLight with the production network
• EU's EXPReS eVLBI project starts March 2006
  – Connection of the 100-m Effelsberg telescope in 2006
  – Protocols for distributed processing
  – Onsala – JBO correlator test link at 4 Gbps in 2007
• eVLBI will become routine in 2006!
VLBI Correlation: a GRID computation task
[Diagram: controller/data concentrator feeding processing nodes]
Questions?