Dynamic Interconnection Networks
Overview
• Network properties
• Switches
• Single and multistage Interconnection networks
• Crossbar
Network properties
• Node degree d - the number of edges incident on a node.
– In degree
– Out degree
• Diameter D - the maximum, over all pairs of nodes, of the
shortest path length between them.
• The network is symmetric if it looks the same from any node.
• The network is scalable if it is expandable, with performance
that scales as the machine resources are increased.
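The degree and diameter definitions above can be checked on a small graph. The following is a minimal sketch, assuming an undirected, connected network stored as an adjacency list; the 8-node ring and the helper names (`node_degrees`, `diameter`, `ring`) are illustrative choices, not part of the slides.

```python
from collections import deque

def node_degrees(adj):
    """Return {node: number of incident edges} for an undirected graph."""
    return {node: len(neighbors) for node, neighbors in adj.items()}

def diameter(adj):
    """Return the largest shortest-path length (in hops) over all node pairs.

    Assumes the graph is connected; runs one breadth-first search per node.
    """
    longest = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        longest = max(longest, max(dist.values()))
    return longest

def ring(n):
    """Hypothetical example topology: n nodes connected in a cycle."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

if __name__ == "__main__":
    net = ring(8)
    print(node_degrees(net))  # every node has degree 2
    print(diameter(net))      # 4 hops for an 8-node ring
```

The ring is also symmetric in the sense used above: the breadth-first search produces the same distance profile from every starting node.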
Bisection width
• Bisection width is the minimum number of wires that must be cut to
divide the network into two equal halves.
– A small bisection width -> low bandwidth
– A large bisection width -> a lot of extra wires
• A cut of a network C(N1,N2) is a set of channels that partitions the set
of all nodes into two disjoint sets N1 and N2. Each element of
C(N1,N2) is a channel with a source in N1 and destination in N2 or
vice versa.
• A bisection of a network is a cut that partitions the entire network
nearly in half, such that |N2| ≤ |N1| ≤ |N2| + 1. Here |N2| means the
number of nodes that belong to the partition N2.
• The channel bisection of a network is the minimum channel count
over all bisections of the network:
Bc  min | C ( N1, N 2) |
bi sec tions
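For very small networks the channel bisection can be found by trying every bisection and counting the channels that cross it. The sketch below assumes undirected channels given as node pairs; the function name `channel_bisection` and the two example topologies are illustrative. The search is exponential in the number of nodes, so it is only meant to make the definition concrete.

```python
from itertools import combinations

def channel_bisection(nodes, channels):
    """Minimum number of channels crossing any bisection (brute force).

    nodes    -- iterable of node identifiers
    channels -- iterable of (a, b) pairs, one per channel
    """
    nodes = list(nodes)
    channels = list(channels)
    half = (len(nodes) + 1) // 2  # |N1|, so that |N2| <= |N1| <= |N2| + 1
    best = None
    for n1 in combinations(nodes, half):
        n1 = set(n1)
        # A channel belongs to the cut when its endpoints lie on opposite sides.
        cut = sum(1 for a, b in channels if (a in n1) != (b in n1))
        if best is None or cut < best:
            best = cut
    return best

if __name__ == "__main__":
    ring8 = [(i, (i + 1) % 8) for i in range(8)]
    k4 = [(a, b) for a in range(4) for b in range(a + 1, 4)]
    print(channel_bisection(range(8), ring8))  # 2
    print(channel_bisection(range(4), k4))     # 4
```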
Factors Affecting Performance
• Functionality – how the network supports data routing,
interrupt handling, synchronization, request/message
combining, and coherence
• Network latency – worst-case time for a unit message to
be transferred
• Bandwidth – maximum data rate
• Hardware complexity – implementation costs for wire,
logic, switches, connectors, etc.
2 × 2 Switches
*From Advanced Computer Architectures, K. Hwang, 1993.
Switches
Module size   Legitimate states   Permutation connections
2 × 2         4                   2
4 × 4         256                 24
8 × 8         16,777,216          40,320
N × N         N^N                 N!
• Permutation function: each input is connected to exactly one
output, and no two inputs share an output (one-to-one).
• Legitimate state: each input can be connected to
multiple outputs, but each output can only be connected
to a single input.
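The counts in the table follow from the two definitions: in a legitimate state each of the N outputs independently picks one of N inputs, giving N^N states, while a permutation is a one-to-one assignment, giving N! connections. A small sketch (the function names are illustrative) reproduces the table rows:

```python
from math import factorial

def legitimate_states(n):
    """Each of the n outputs independently selects one of the n inputs."""
    return n ** n

def permutation_connections(n):
    """One-to-one input/output assignments of an n x n switch."""
    return factorial(n)

if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, legitimate_states(n), permutation_connections(n))
```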
Single-stage networks
• Single stage Shuffle-Exchange IN (left)
• Perfect shuffle mapping function (right)
• Perfect shuffle operation: cyclic shift 1 place left, e.g. 101 --> 011
• Exchange operation: invert least significant bit, e.g. 101 --> 100
*From Ben Macey at http://www.ee.uwa.edu.au/~maceyb/aca319-2003
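Both mapping functions are simple bit operations on the node address. A minimal sketch, assuming addresses are non-negative integers of a fixed bit width (the names `perfect_shuffle` and `exchange` are illustrative):

```python
def perfect_shuffle(addr, bits):
    """Cyclically shift a bits-wide address one place to the left."""
    msb = (addr >> (bits - 1)) & 1
    return ((addr << 1) & ((1 << bits) - 1)) | msb

def exchange(addr):
    """Invert the least significant bit of the address."""
    return addr ^ 1

if __name__ == "__main__":
    print(format(perfect_shuffle(0b101, 3), "03b"))  # 011
    print(format(exchange(0b101), "03b"))            # 100
```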
Multistage Interconnection Networks
• The capability of single-stage networks is limited, but if we cascade enough
of them together, they form a completely connected MIN (Multistage
Interconnection Network).
• Switches can perform their own routing or can be controlled by a central
router.
• This type of network can be classified into the following three categories:
• Nonblocking
– A network is called strictly nonblocking if it can connect any idle input to any idle
output regardless of what other connections are currently in progress.
• Rearrangeable nonblocking
– In this case a network should be able to establish all possible connections
between inputs and outputs by rearranging its existing connections.
• Blocking interconnection
– A network is said to be blocking if it can perform many, but not all, possible
connections between terminals.
– Example: the Omega network
Omega networks
• A multi-stage IN using 2 × 2 switch boxes and a perfect shuffle
interconnect pattern between the stages
• In the Omega MIN there is one unique path from each input to each
output.
• No redundant paths → no fault tolerance and the possibility of
blocking
Example:
• Connect input 101 to output
001
• Use the bits of the
destination address, 001, for
dynamically selecting a path
• Routing:
- 0 means use upper output
- 1 means use lower output
*From Ben Macey at http://www.ee.uwa.edu.au/~maceyb/aca319-2003
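The destination-tag routing in the example can be traced in a few lines. This is a sketch under stated assumptions: each stage is modelled as a perfect shuffle followed by a 2 × 2 switch, the switch in stage k looks at bit k of the destination address counted from the most significant bit, and switches within a stage are numbered from the top. The function name `omega_route` and the numbering are illustrative.

```python
def omega_route(source, dest, bits):
    """Trace one message through an Omega network with 2**bits inputs.

    Returns (final_position, path), where path lists
    (stage, switch_index, output_port) for every stage.
    """
    mask = (1 << bits) - 1
    pos = source
    path = []
    for stage in range(bits):
        # Perfect shuffle between stages: cyclic left shift of the position.
        pos = ((pos << 1) & mask) | (pos >> (bits - 1))
        # Destination bit for this stage, most significant bit first.
        bit = (dest >> (bits - 1 - stage)) & 1
        # 0 selects the upper switch output, 1 the lower one.
        pos = (pos & ~1) | bit
        path.append((stage, pos >> 1, "lower" if bit else "upper"))
    return pos, path

if __name__ == "__main__":
    final, path = omega_route(0b101, 0b001, 3)
    print(format(final, "03b"))  # 001
    for stage, switch, port in path:
        print(stage, switch, port)
```

For input 101 and destination 001 the trace uses the upper output in the first two stages and the lower output in the last stage, matching the destination bits 0, 0, 1.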
Omega networks
• log2 N stages of 2 × 2 switches
• N/2 switches per stage
• S = (N/2) log2 N switches
• Number of permutations in an Omega network: 2^S
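The switch count and the number of realizable permutations can be cross-checked by brute force for small N. The sketch below assumes each 2 × 2 switch is either straight or crossed (no broadcast) and uses destination-tag routing; a permutation is counted as realizable when no two messages ever occupy the same link. The function names are illustrative.

```python
from itertools import permutations

def omega_switch_count(n):
    """S = (N/2) * log2(N) for an N-input Omega network (N a power of two)."""
    stages = n.bit_length() - 1
    return stages * (n // 2)

def is_conflict_free(perm, bits):
    """True if routing source i to perm[i] for all i never shares a link."""
    mask = (1 << bits) - 1
    positions = list(range(len(perm)))
    for stage in range(bits):
        nxt = []
        for src, pos in enumerate(positions):
            pos = ((pos << 1) & mask) | (pos >> (bits - 1))  # perfect shuffle
            bit = (perm[src] >> (bits - 1 - stage)) & 1      # destination tag
            nxt.append((pos & ~1) | bit)
        if len(set(nxt)) < len(nxt):  # two messages on the same switch output
            return False
        positions = nxt
    return True

def count_realizable(n):
    """Count the permutations of n inputs that pass without blocking."""
    bits = n.bit_length() - 1
    return sum(is_conflict_free(p, bits) for p in permutations(range(n)))

if __name__ == "__main__":
    n = 4
    print(omega_switch_count(n), count_realizable(n))  # 4 switches, 16 of 24
```

For N = 4 only 16 of the 24 permutations pass, which is why the Omega network is classified as blocking.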
Baseline networks
• The network can be generated recursively.
• The first stage contains one N × N block, the second stage two
(N/2) × (N/2) sub-blocks, and so on.
• Networks are topologically equivalent if one network can be
reproduced from the other by simply rearranging nodes at
each stage.
*From Advanced Computer Architectures, K. Hwang, 1993.
Crossbar Network
• Each junction is a switching component – connecting the
row to the column.
• Can only have one connection in each column
*From Advanced Computer Architectures, K. Hwang, 1993.
Crossbar Network
• The major advantage of the cross-bar switch is its
potential for speed.
• In one clock, a connection can be made between source
and destination.
• The diameter of the cross-bar is one.
• A request is blocked only if the destination is already in use.
• Because of its complexity, the cost of the cross-bar
switch can become the dominant factor for a large
multiprocessor system.
• Crossbars can be used to implement the a×b switches
used in MINs. In this case each crossbar is small, so
costs are kept down.
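The blocking behaviour of a crossbar can be illustrated with a very small behavioural model. This is a sketch only, assuming rows are sources, columns are destinations, and each column accepts at most one connection; the class and method names are hypothetical.

```python
class Crossbar:
    """Behavioural model of an n x n crossbar (rows = sources, columns = destinations)."""

    def __init__(self, n):
        self.n = n
        self.column_owner = [None] * n  # row currently connected to each column

    def crosspoints(self):
        """Number of switching components, one per row/column junction."""
        return self.n * self.n

    def connect(self, src, dst):
        """Close crosspoint (src, dst); return False if the column is in use."""
        if self.column_owner[dst] is not None:
            return False
        self.column_owner[dst] = src
        return True

    def release(self, dst):
        """Open whichever crosspoint is driving column dst."""
        self.column_owner[dst] = None

if __name__ == "__main__":
    xb = Crossbar(4)
    print(xb.connect(0, 2))  # True: column 2 was idle
    print(xb.connect(1, 2))  # False: destination 2 is in use
    print(xb.connect(1, 3))  # True: any idle destination can be reached
    print(xb.crosspoints())  # 16
```

The N^2 crosspoint count is what makes a large crossbar expensive, while any request to an idle destination succeeds in a single step.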
Problem
A) Use two-input AND and OR gates to construct an N × N
crossbar switch network between N processors and N
memory modules. Use the signal c_ij as the enable signal for
the switch in the ith row and jth column. Let the width of each
crosspoint be w bits.
B) Estimate the total number of AND and OR gates
needed as a function of N and w.
Problem (cont.)
[Figure: N × N crossbar with processors P1 ... Pn on the rows, memory modules M1 ... Mn on the columns, and a crosspoint Cij at each row-column junction (C11 ... Cnn).]
Problem (cont.)
[Figure: the same N × N crossbar (P1 ... Pn, M1 ... Mn, C11 ... Cnn) with a detail view of crosspoint C11 between P1 and M1.]
Problem (cont.)
[Figure: P1 and P2 each feed an address decoder (Decoder 1, Decoder 2) with outputs 1 and 2; the enable signals shown are C11, C12, C21 and C22.]
Performance Comparison
Network    Latency          Switching complexity   Wiring complexity   Blocking
Bus        Constant, O(1)   O(N)                   O(w)                yes
MIN        O(log2 N)        O(N log2 N)            O(Nw log2 N)        yes
Crossbar   O(1)             O(N^2)                 O(N^2 w)            no
Some Commercial Solutions [3]
• System-on-chip crossbar networks:
– Nexus from Fulcrum Microsystems
• The core is used in PMC-Sierra dual MIPS processor RM9000
References
1. H. El-Rewini and M. Abd-El-Barr, Advanced Computer Architecture and Parallel Processing, John Wiley and Sons, 2005.
2. K. Hwang, Advanced Computer Architecture: Parallelism, Scalability, Programmability, McGraw-Hill, 1993.
3. A. Lines, “Nexus: an asynchronous crossbar interconnect for synchronous system-on-chip designs”, Proc. of High Performance Interconnects, pp. 2-7, 2003.