A Centralized “Zero-Queue” Datacenter Network
Jonathan Perry, Amy Ousterhout, Hari Balakrishnan, Devavrat Shah, Hans Fugal
M.I.T. Computer Science & Artificial Intelligence Lab
Facebook
1
Current Situation
• Current datacenter architectures
1. distribute packet transmission decisions among the endpoints
(“congestion control”)
2. distribute path selection among the network’s switches (“routing”)
• Advantage: fault tolerance and scalability
• Disadvantage: loss of control over packet delays and the paths packets take
2
Fastpass
• Fastpass: a datacenter network architecture in which each packet’s timing is
controlled by a logically centralized arbiter, which also determines the
packet’s path.
• Objectives
1. no in-network queues and low latency
2. high throughput
3. fairness and quick convergence
3
Fastpass Components
1. Timeslot allocation algorithm: runs at the arbiter to determine when each
endpoint’s packets should be sent.
2. Path assignment algorithm: runs at the arbiter to assign a path to each packet.
3. Fastpass Control Protocol (FCP): a reliable control protocol between
endpoints and the arbiter, together with a replication strategy that lets the
central arbiter handle network and arbiter failures.
4
Fastpass Architecture
• In Fastpass, a logically centralized arbiter controls all network transfers.
• The arbiter processes each request, performing two functions (Fig. 2):
1. Timeslot allocation
2. Path selection
5
Timeslot Allocation
• Goal: choose a matching of endpoints in each timeslot
• The network is organized into tiers:
1. top-of-rack switches: connect servers in a rack
2. aggregation switches: connect racks into clusters
3. core routers: connect clusters
• The RNB (rearrangeably non-blocking) property allows the arbiter to perform
timeslot allocation separately from path selection
6
Timeslot Allocation
Allocation algorithm
• How fast must allocation be?
1. in heavily-loaded networks: fast enough to keep the network fully utilized
2. in lightly-loaded networks: fast enough that the added latency stays small
• Algorithm: the arbiter processes demands in order and greedily allocates a
(source, destination) pair if allocating the pair does not violate the
bandwidth constraints. When it finishes processing all demands, it has a
maximal matching: a matching to which none of the unallocated demands can be
added while maintaining the bandwidth constraints (sketched below).
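A minimal sketch of this greedy pass, assuming demands arrive as (source, destination) pairs and that each endpoint can send at most one packet and receive at most one packet per timeslot; this is an illustration of the idea, not the arbiter’s actual implementation:

```python
# Illustrative greedy timeslot allocator (not the arbiter's production code).
# Assumption: each endpoint may transmit at most one packet and receive at
# most one packet per timeslot (the per-timeslot bandwidth constraint).

def allocate_timeslot(demands):
    """Greedily build a maximal matching for one timeslot.

    demands: (src, dst) pairs in the order the arbiter processes them.
    Returns the pairs allocated in this timeslot.
    """
    busy_src, busy_dst = set(), set()
    allocated = []
    for src, dst in demands:
        # Allocate the pair only if neither endpoint is already used this timeslot.
        if src not in busy_src and dst not in busy_dst:
            busy_src.add(src)
            busy_dst.add(dst)
            allocated.append((src, dst))
    return allocated

# Endpoint 1 cannot send to both 2 and 3 in the same timeslot:
print(allocate_timeslot([(1, 2), (1, 3), (4, 3), (4, 5)]))  # [(1, 2), (4, 3)]
```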
7
Timeslot Allocation
A pipelined allocator
• A pipelined allocator (Fig. 3)
• Max-min fairness (Fig. 4)
• Reordering problem
• Reduce the overhead of processing demands
8
Path Selection
• Goal
1. No link is assigned multiple packets in a single timeslot
2. No queueing within the network
• Algorithm
1. Edge-coloring (Fig. 5)
2. Fast edge-coloring
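The edge-coloring idea can be illustrated with a small sketch: each allocated (source rack, destination rack) packet is an edge, and each color corresponds to one path through the core, so a proper coloring means no link carries two packets in the same timeslot. The greedy version below is only an illustration under those assumptions; the paper relies on a fast exact algorithm for bipartite multigraphs, whereas a greedy pass may need extra colors.

```python
# Illustrative greedy edge coloring for path assignment (not the paper's
# exact bipartite-multigraph algorithm; a greedy pass may need more colors).

def color_edges(edges, num_colors):
    """Assign a color (core/path index) to each (src_rack, dst_rack) packet."""
    used_up = {}    # source rack -> colors already used on its uplinks
    used_down = {}  # dest rack   -> colors already used on its downlinks
    assignment = []
    for src, dst in edges:
        up = used_up.setdefault(src, set())
        down = used_down.setdefault(dst, set())
        # Pick the first color free at both the source and the destination rack.
        color = next(c for c in range(num_colors) if c not in up and c not in down)
        up.add(color)
        down.add(color)
        assignment.append(((src, dst), color))
    return assignment

# Four packets in one timeslot, at most four core paths:
print(color_edges([("rackA", "rackB"), ("rackA", "rackC"),
                   ("rackD", "rackB"), ("rackA", "rackB")], num_colors=4))
```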
9
Handling Faults
1. failures of the Fastpass arbiter
2. failures of in-network components (nodes, switches and links)
3. packet losses on the communication path between endpoints and the
arbiter
10
Implementation
• Arbiter: uses Intel DPDK (Data Plane Development Kit), a framework that
allows direct access to NIC queues from user space
• Endpoints: use a Linux kernel module that queues packets while
requests are sent to the arbiter
• FCP: implemented as a Linux transport protocol
• A Fastpass qdisc (queue discipline)
• Obviates the need for TCP congestion control; TCP itself is not modified
• Greedily sends packets in a timeslot
11
Implementation
• Multicore Arbiter
1. comm-cores: communicate with endpoints; each handles a subset of endpoints
2. alloc-cores: perform timeslot allocation, work concurrently using pipeline
parallelism
3. pathsel-cores: assign paths
• Process: comm-cores receive endpoint demands and pass them to alloc-cores.
Once a timeslot is completely allocated, it is promptly passed to a
pathsel-core. The assigned paths are handed back to comm-cores, which notify
each endpoint of its allocations (see the sketch below).
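A toy sketch of this flow, using Python threads and queues purely to show how demands move from comm-cores through alloc-cores to pathsel-cores and back; the real arbiter pins each stage to dedicated cores with DPDK, and the demands and path indices below are invented for illustration:

```python
# Toy pipeline sketch: comm-core -> alloc-core -> pathsel-core -> comm-core.
# The real arbiter uses dedicated cores and DPDK; threads/queues are stand-ins.
import queue
import threading

demands = queue.Queue()        # comm-core  -> alloc-core
timeslots = queue.Queue()      # alloc-core -> pathsel-core
notifications = queue.Queue()  # pathsel-core -> comm-core

def comm_core_rx():
    # Receives endpoint demands (hard-coded here) and forwards them.
    for d in [(1, 2), (1, 3), (4, 3)]:
        demands.put(d)
    demands.put(None)  # end-of-input marker for this sketch

def alloc_core():
    # Builds one timeslot's matching, then hands it to path selection.
    busy_src, busy_dst, matching = set(), set(), []
    while (d := demands.get()) is not None:
        src, dst = d
        if src not in busy_src and dst not in busy_dst:
            busy_src.add(src)
            busy_dst.add(dst)
            matching.append(d)
    timeslots.put(matching)

def pathsel_core():
    # Assigns a path index to each allocated packet (trivially here).
    matching = timeslots.get()
    notifications.put([(src, dst, path) for path, (src, dst) in enumerate(matching)])

workers = [threading.Thread(target=f) for f in (comm_core_rx, alloc_core, pathsel_core)]
for t in workers:
    t.start()
for t in workers:
    t.join()
print(notifications.get())  # the comm-core would report these allocations to endpoints
```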
12
Evaluation
• Goal:
1. eliminate in-network queueing
2. achieve high throughput
3. support various inter-flow or inter-application resource allocation
objectives in a real-world datacenter network.
13
Evaluation
Experimental environment
[Topology figure: ToR switch connected to a cluster switch over 10 Gbit/s links]
1. a single rack of 32 servers
2. each server connects to a ToR switch with a main 10 Gbit/s Ethernet network
interface card (NIC) and a 1 Gbit/s NIC meant for out-of-band communication
3. each server has 2 Intel CPUs with 8 cores and 148 GB RAM
4. one server is set aside for running the arbiter
14
Evaluation
Experiment A: throughput and queueing
Under a bulk transfer workload involving
multiple machines, Fastpass reduces
median switch queue length to 18 KB from
4351 KB, with a 1.6% throughput penalty.
15
Evaluation
Experiment B: latency
Interactivity: under the same workload,
Fastpass’s median ping time is 0.23 ms
vs. the baseline’s 3.56 ms, 15.5× lower
with Fastpass.
16
Evaluation
Experiment C: fairness and convergence
Fairness: Fastpass reduces standard
deviations of per-sender throughput over 1s
intervals by over 5200× for 5 connections.
17
Evaluation
Experiment D: request queueing
Experiment E: communication overhead
D: each comm-core supports 130 Gbit/s of network traffic with 1 µs of NIC
queueing (Fig. 10).
E: arbiter traffic imposes a 0.3% throughput overhead (Fig. 11).
18
Evaluation
Experiment F: timeslot allocation cores
Experiment G: path selection cores
F: 8 alloc-cores support 2.2
Terabits/s of network traffic.
G: 10 pathsel-cores support > 5
Terabits/s of network traffic.
19
Evaluation
Production experiments at Facebook
Experiment H: production results
20
Discussion
• A core arbiter would have to handle a large volume of traffic, so allocating
at MTU-size granularity would not be computationally feasible.
→ Solution: the use of specialized hardware
• Packets must follow the paths allocated by the arbiter. Routers typically
support IP source routing only in software, rendering it too slow for
practical use.
→ Solution: ECMP spoofing
21
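To illustrate the ECMP-spoofing idea on the slide above: rather than source-routing each packet, the sender can choose a header field that switches feed into their ECMP hash (for example, the source port) so that the hash selects the arbiter’s chosen path. The sketch below is hypothetical: real switches use vendor-specific hash functions, and the CRC32 stand-in, port range, and addresses are assumptions.

```python
# Hypothetical illustration of ECMP spoofing: pick a source port whose ECMP
# hash steers the packet onto the arbiter-chosen uplink. Real switches use
# vendor-specific hashes; zlib.crc32 here is only a stand-in.
import zlib

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, num_uplinks):
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % num_uplinks

def spoof_source_port(src_ip, dst_ip, dst_port, desired_uplink, num_uplinks):
    """Search the ephemeral port range for a port that hashes to the desired uplink."""
    for port in range(49152, 65536):
        if ecmp_next_hop(src_ip, dst_ip, port, dst_port, num_uplinks) == desired_uplink:
            return port
    raise RuntimeError("no suitable source port found")

port = spoof_source_port("10.0.0.1", "10.0.1.7", 5001, desired_uplink=2, num_uplinks=4)
print(port, ecmp_next_hop("10.0.0.1", "10.0.1.7", port, 5001, 4))  # prints a port and 2
```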
Limitation
• The granularity of timeslots: if an endpoint has less than a full timeslot’s
worth of data to send, network bandwidth is left unused.
• A possible mitigation: the arbiter divides some timeslots into smaller
fragments and allocates these fragments to the smaller packets.
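As a rough, back-of-the-envelope illustration (the 1500-byte MTU-sized timeslot and 10 Gbit/s link rate are assumptions based on the evaluation setup, not figures from this slide):

```python
# Back-of-the-envelope: how much of a timeslot a small packet leaves unused.
# Assumes MTU-sized (1500-byte) timeslots on a 10 Gbit/s endpoint link.
link_rate_bps = 10e9
mtu_bytes = 1500

timeslot_us = mtu_bytes * 8 / link_rate_bps * 1e6
print(f"one timeslot lasts about {timeslot_us:.2f} us")  # ~1.20 us

payload_bytes = 200  # e.g. a small request/response packet
unused = 1 - payload_bytes / mtu_bytes
print(f"a {payload_bytes}-byte packet leaves {unused:.0%} of its timeslot unused")  # ~87%
```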
22
Conclusion
• Fastpass
1. High throughput with nearly-zero queues
2. Consistent (fair) throughput allocation and fast convergence
3. Scalability
4. Fine-grained timing
5. Reduced retransmissions
• Practicality
1. reducing the tail of the packet delay distribution
2. avoiding false congestion
3. eliminating incast
4. better sharing for heterogeneous workloads with different performance objectives
23