Download Framing - NDSU Computer Science

Computer Networks (CS 778)
Chapter 2, Direct Link Networks
This chapter examines issues in the OSI DataLink layer (and, to a limited extent, the Physical layer), or equivalently the TCP/IP Host-to-Network layer.

Five low-level issues are considered (all five functions are implemented on the Network Adaptor or Network Interface Card (NIC)):
- Encoding (getting bits on and off the wire/fiber/air)
- Framing (delineating frames, send/receive frames)
- Error_detection (detecting corrupted frames)
- Link_reliability (correcting detected frame-errors)
- Access_mediation (if the link is shared, who has access? when? how long?...)

Considered with respect to four network (not internet) technologies:
- point-to-point links
- CSMA networks (AKA: Ethernet) (IEEE 802.3)
- Token Ring networks (e.g., FDDI) (IEEE 802.5)
- Wireless networks (IEEE 802.11)
First we examine the building blocks: nodes and links.
Nodes
Assume nodes are general-purpose computers (workstations), although internal nodes (switches) are usually special purpose.
- Finite memory (implies limited buffer space)
- Connects to the network via a network adaptor
- Fast processor, slow memory
Three key features of a workstation (for networking):
1. Memory: a scarce resource in switches/routers (the other is bandwidth)
2. Network adaptor (on the I/O bus; delivers data to the network link)
   device driver: software on the workstation that issues commands to the adaptor
3. CPU (capacity increasing rapidly, which is not true of memory)
CACHE: level-1 is on chip (holds instructions, parameters...; ~64KB); level-2 is SRAM (~512KB)
MAIN MEMORY: DRAM; sizes range 64MB - 128MB - 512MB - 1GB - 10GB ...
- Random access = any byte has the same access time
- Working memory of most computers. The design is unchanged, but in 10 years chip capacity has increased from 256 Kb to 256 Mb.
- Speed of DRAM has not kept up: processor speeds are doubling every 18 months, while memory speeds are increasing at 7% per year.
Thus a node runs at memory speeds, not processor speeds.
Thus network software must care about memory access: how many times memory is accessed per message is important.
Links
If you install your own: if nodes are in the same room, building, or site (campus), buy cable and physically string it between nodes. What type of cable?
Cable                   | Typical bandwidth | Distance
========================|===================|=========
Category 5 twisted pair | 10-100 Mbps       | 100 m
50-ohm coax (ThinNet)   | 10-100 Mbps       | 200 m
75-ohm coax (ThickNet)  | 10-100 Mbps       | 500 m
Multimode fiber         | 100 Mbps          | 2 km
Single-mode fiber       | 100-2400 Mbps     | 40 km
Sometimes links are leased from the phone company (STS is also denoted OC).
Service to ask for | Bandwidth you get
===================|==================
ISDN               | 64 Kbps
T1                 | 1.544 Mbps
T3                 | 44.736 Mbps
STS-1              | 51.840 Mbps
STS-3              | 155.520 Mbps
STS-12             | 622.080 Mbps
STS-24             | 1.244160 Gbps
STS-48             | 2.488320 Gbps
CABLE:
Twisted Pair
Coaxial Cable
Optical Fiber
Twisted Pair - Transmission Characteristics
Limited distance / bandwidth / data rate
Susceptible to interference and noise






Analog (Amplifiers every 5km to 6km)
Digital (Use either analog or digital signals, repeater every 2km or 3km)
Unshielded Twisted Pair (UTP)

Ordinary telephone wire

Cheapest

Easiest to install

Suffers from external EM interference

Category 3 (up to 16MHz; Voice grade found in most offices; Twist length 7.5 cm to 10 cm)

Category 4 (up to 20 MHz)

Category 5 (up to 100 MHz; commonly pre-installed in new office buildings; twist length 0.6-0.85 cm)
Shielded Twisted Pair (STP)

Metal braid or sheathing that reduces interference

More expensive

Harder to handle (thick, heavy)
Coaxial Cable Applications and characteristics


Most versatile medium
Television distribution
Antenna to TV

Cable TV
Long distance telephone transmission

Can carry 10,000 voice calls simultaneously

Being replaced by fiber optic
Short distance computer systems links
Local area networks
Analog

Amplifiers every few km

Closer if higher frequency

Up to 500MHz
Digital

Repeater every 1km

Closer for higher data rates






Optical Fiber




Benefits and Applications
Greater capacity (Data rates of hundreds of Gbps)
Smaller size & weight; Lower attenuation; Electromagnetic isolation
Greater repeater spacing (10s of kms at least)
Applications ( Long-haul / Metro / Rural-exchange Trunks; Subscriber loops; LANs)
(Varied index of refraction of the core so laser
beams don’t interfere with each other as much )
ElectroMagnetic Waves (EM)
Signals use electromagnetic (EM) waves traveling at the speed of light
(medium-dependent: copper and fiber about 2/3 of that in a vacuum)
[Figure: the electromagnetic spectrum. Top axis: frequency from 10^0 to 10^24 Hz, spanning radio, microwave, infrared, visible light, UV, X-ray, and gamma ray. Bottom axis: expanded view from 10^4 to 10^16 Hz showing the bands used by coax, AM radio, FM radio, TV, terrestrial microwave, satellite, and fiber.]



Binary data is encoded on an EM signal through modulation.
Signals propagate over a physical medium by modulating electromagnetic waves (e.g., varying voltage).
Modulation = varying a signal's frequency, amplitude, or phase to effect the transmission of information,
e.g., vary the power (amplitude) of the signal (turn it high/low).
Microwave

Terrestrial (Parabolic dish,
Focused beam, Line of sight,
Long haul telecommunications, Higher frequencies give higher data rates)

Satellite



Satellite is relay station - receives on one frequency, amplifies or repeats signal
and transmits on another frequency
Requires geostationary equatorial orbit (height of 35,784 km, about 22,235 mi)
USES: Television, Long distance telephone, Private business networks
BROADCAST RADIO



Omnidirectional
FM radio
UHF and VHF television
Infrared



Line of sight (or reflection)
Blocked by walls
e.g., TV remote control, IrDA port
Services
For pt-pt links, two bit streams may be transmitted concurrently in opposite directions (full duplex), or in one direction at a time (half duplex).
Assume links are full-duplex unless stated otherwise.
Common services to the home (last mile):
Service | Bandwidth
========|===================
POTS    | 28.8 - 56 Kbps (POTS uses a modem (modulator/demodulator) for data)
ISDN    | 64 - 128 Kbps
xDSL    | 16 Kbps - 55.2 Mbps
CATV    | 20 - 40 Mbps
Shannon's theorem limits the modem rate over analog phone lines.
C = B*log2(1+S/N)
- C = achievable channel capacity in bits per second
- B = bandwidth in Hz (3300 Hz - 300 Hz = 3000 Hz)
- S = average signal power; N = average noise power
For current POTS lines, S/N = 1000.
Thus, C = 3000 * log2(1001) =~ 30 Kbps
Why are 56 Kbps modems available then?
1. Line quality is improving (N is lower)
2. 3300 Hz-limited analog lines are being upgraded
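The Shannon limit for the voice band can be checked numerically. This is a minimal sketch; the function name `shannon_capacity` is an illustrative choice, and the inputs are the 3000 Hz band and S/N = 1000 from the notes.

```python
import math

def shannon_capacity(bandwidth_hz: float, snr: float) -> float:
    """Upper bound on channel capacity in bits per second: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr)

voice_band_hz = 3300 - 300          # usable POTS band from the notes
capacity_bps = shannon_capacity(voice_band_hz, 1000)
print(f"{capacity_bps / 1000:.1f} Kbps")  # just under 30 Kbps
```

Lowering N (raising S/N) or widening B raises the bound, which matches the two reasons given for faster modems.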
ISDN (Integrated Service Digital Network)
- 2 64-Kbps channels (1 = digitized voice, 1 = data)
- CODEC (coder/decoder) en/de-codes voice <--> digital
xDSL (Digital Subscriber Line)
Collection of technologies able to transmit data at high speeds over twisted pair copper found in homes.
Services (cont.)






ADSL (Asymmetric Digital Subscriber Line)
- Asymmetric: different upstream (phone-to-CO) and downstream (CO-to-phone) rates.
- Rates depend on the length of the phone-to-CO link (local loop):
  downstream: 1.544 Mbps (3.4 mi.) to 8.448 Mbps (1.7 mi.); upstream: 16 Kbps to 640 Kbps
VDSL (Very-high-data-rate DSL) will be symmetric
- 12.96 Mbps - 55.2 Mbps (1000 - 4500 ft.)
- Won't reach from home to CO! The phone company must run STS-n fiber from the neighborhood to the CO ("fiber to the home", or "fiber to the curb" serving several homes).
CATV reaches ~95% of US homes (~65% subscribe)
- Some subset of CATV channels (each 6 MHz wide) is used for digital data.
- CATV cable modems are used asymmetrically (40-100 Mbps downstream, 20-50 Mbps upstream on 1 channel).
- Bandwidth is shared by all users in the neighborhood, requiring some MAC protocol (CSMA/CD or another).
Wireless Links (all 3 use towers)
- AMPS (Advanced Mobile Phone System): standard for US cell phones (analog).
- PCS (Personal Communication Service): digital cellular, gaining in US.
- GSM (Global System for Mobile Communications): digital cellular in the rest of the world.
LEO/MEO constellations (Low/Medium Earth Orbit). Most are voice; potential for a 2 Mbps link.
Project    | Orbit (km) | Sats | Uplink Freq (MHz) | Downlink (MHz)
===========|============|======|===================|===============
ICO        | 10,355     | 10   | 2170-2200         | 1980-2010
Globalstar | 1,410      | 48   | L-band            | S-band
Iridium    | 780        | 66   | L-band            | L-band
Teledesic  | 1,350      | 288  | Ka-band           | Ka-band
Each satellite will support 1440 16-Kbps satellite-to-earth channels, which can be aggregated in groups of 128 to provide 2.048-Mbps inter-satellite channels.
Services (wireless)
Wireless Links

Radio (RF) and IR can be used for short links (e.g., office buildings, malls, campuses).
- IR (850-950 nm) provides 1 Mbps over 10 meters (does not require line-of-sight).
- RF bands are being made available for data communication: 5.2 and 17 GHz for HIPERLAN (High Performance European Radio LAN), 2.4 GHz for IEEE 802.11 wireless LANs.
- Bluetooth RF (Ericsson, Nokia, IBM, Toshiba, Intel) at 2.45 GHz (distance ~10 m), 1 Mbps, for, e.g., printers, workstations, laptops, projectors, PDAs, and mobile phones; eliminates wires and cabling in the office. Networks of such devices are called piconets.
Iridium satellites form 6 necklaces around the earth; 1628 moving cells cover the earth.

Encoding
Encode binary data onto signals.
Non-Return to Zero (NRZ): 0 as low signal and 1 as high.
[Figure: NRZ waveform for the bit sequence 0 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0]
2 problems with long strings of consecutive 1s or 0s:
- Baseline wander: the receiver keeps an average of the signal to distinguish high from low, and a long run at the same level shifts that average.
- Unable to recover the clock: the clock is not transmitted over a separate wire but is integrated into the data signal, so signal transitions are needed to re-synchronize clocks.
A link attribute = number of bit streams that can be concurrently encoded on it.
If just 1, then nodes must share access to the link (e.g., CSMA/CD, Token-Ring multiple access protocols).
An Aside on SHARED RESOURCE MANAGEMENT

WAITING POLICY: if a needed resource is unavailable, the requester waits until it becomes available.
- This is how print jobs are managed by an OS.
RESTART POLICY: if a needed resource is unavailable, the requester terminates and retries later.
- This is one way network channels are managed: Ethernet (unswitched) CSMA/CD.
Encoding (NRZ, NRZI, Manchester, 4B/5B)
Assume 2 discrete signals: high and low (ignoring modulation concepts and issues).
Most functions are performed by the network adaptor, which encodes/decodes bits in signals.
Alternative Encodings
Non-Return to Zero Inverted (NRZI)
- Make a transition from the current signal to encode a 1; stay at the current signal to encode a 0.
- Solves the problem of consecutive 1s (but not consecutive 0s).
Manchester: transmit the XOR of the NRZ-encoded data and the clock.
- Doubles the rate of transitions, so the receiver has half as much time to detect each one.
- The rate of signal change is the baud rate; in Manchester, bit rate = 1/2 baud rate (50% efficient).
[Figure: Bits, NRZ, Clock, Manchester, and NRZI waveforms for the bit sequence 0 0 1 0 1 1 1 1 0 1 0 0 0 0 1 0]
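The three encodings can be compared on the same bit sequence with a short sketch. Signal levels are modeled as 0 (low) and 1 (high); the function names, the NRZI starting level of low, and the clock phase (low then high within each bit time) are assumptions made for illustration.

```python
def nrz(bits):
    """NRZ: 1 is sent as high, 0 as low (one level per bit)."""
    return list(bits)

def nrzi(bits, start_level=0):
    """NRZI: toggle the level to send a 1, hold the level to send a 0."""
    level = start_level
    out = []
    for b in bits:
        if b == 1:
            level ^= 1
        out.append(level)
    return out

def manchester(bits):
    """Manchester: XOR each bit with a clock that is low then high,
    producing two half-bit levels per data bit."""
    out = []
    for b in bits:
        out.extend([b ^ 0, b ^ 1])
    return out

bits = [0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0]
print(nrz(bits))
print(nrzi(bits))
print(manchester(bits))
```

A run of 1s alternates levels under NRZI, while a run of 0s still produces no transitions; Manchester has a transition in every bit time at the cost of two signal elements per bit.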
Encodings (cont)
4B/5B
- every 4 bits of data are encoded in a 5-bit code
- 5-bit codes are selected to have no more than one leading 0 and no more than two trailing 0s
- thus, never get more than three consecutive 0s
- resulting 5-bit codes are transmitted using NRZI
- achieves 80% efficiency
4-bit Data | 5-bit Code | 4-bit Data | 5-bit Code
===========|============|============|===========
0000       | 11110      | 1000       | 10010
0001       | 01001      | 1001       | 10011
0010       | 10100      | 1010       | 10110
0011       | 10101      | 1011       | 10111
0100       | 01010      | 1100       | 11010
0101       | 01011      | 1101       | 11011
0110       | 01110      | 1110       | 11100
0111       | 01111      | 1111       | 11101
There are 16 codes left over: 11111 = idle line; 00000 = dead line; 00100 = halt; of the remaining 13, 7 violate the rules and 6 are control symbols (e.g., in FDDI).
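A small sketch of the 4B/5B table lookup, using the sixteen data codes listed above. The helper name `encode_4b5b` and the choice to return a bit string are illustrative assumptions; a real adaptor would then send the result using NRZI.

```python
# 4-bit data nibble -> 5-bit code, as listed in the table above.
FOUR_B_FIVE_B = {
    "0000": "11110", "0001": "01001", "0010": "10100", "0011": "10101",
    "0100": "01010", "0101": "01011", "0110": "01110", "0111": "01111",
    "1000": "10010", "1001": "10011", "1010": "10110", "1011": "10111",
    "1100": "11010", "1101": "11011", "1110": "11100", "1111": "11101",
}

def encode_4b5b(data: bytes) -> str:
    """Encode each byte as two 5-bit codes (high nibble first)."""
    out = []
    for byte in data:
        bits = f"{byte:08b}"
        out.append(FOUR_B_FIVE_B[bits[:4]])
        out.append(FOUR_B_FIVE_B[bits[4:]])
    return "".join(out)

print(encode_4b5b(b"\x00"))  # 1111011110
```

Every 8 data bits become 10 code bits, which is the 80% efficiency quoted above.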
Framing (break the sequence of bits into frames; typically implemented by the NIC (Network Interface Card, AKA Network Adaptor))
[Figure: Node A and Node B, each with an adaptor; bits flow between the adaptors, frames between the nodes]
Now that we know how to transmit bit sequences over pt-pt links (NIC to NIC), we consider transmission at the "frame" level.
("Frame" usually refers to a logical group of bits sent over a "link" (connecting two nodes), whereas "packet" usually refers to a logical unit sent over an internet or the Internet. However, the terms are often used interchangeably.)
When node A wants to transmit a frame to node B:
- Node A tells NIC-A to get the frame from memory.
- NIC-B collects the bits and deposits the frame in memory (so it must determine where the frame starts and ends).
Framing
Approaches:
- Sentinel-based (as opposed to byte-count-based): delineate the frame with a special pattern, e.g., BISYNC, PPP, HDLC, SDLC
- Byte-oriented (as opposed to bit-oriented): frames are collections of characters (bytes), not bits, e.g., BISYNC, PPP, DDCMP
- Clock-based (SONET)
BISYNC (Binary Synchronous Communication, IBM 1960) (DataLink protocol)
Frame format (field widths in bits):
SYN (8) | SYN (8) | SOH (8) | Header | STX (8) | Body | ETX (8) | CRC (16)
Sentinel characters used in BISYNC (sentinel-based, byte-oriented):
- SYN = synchronization character (start of frame)
- SOH = "Start of Header" character
- STX/ETX = Start/End-of-Text characters. (What if ETX occurs in the Body? Character stuffing: prefix it with a DataLink Escape (DLE) character.)
- CRC (Cyclic Redundancy Check) field to detect transmission errors.
- Header: for the link-level reliable delivery algorithm.
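Character stuffing for a BISYNC-style body can be sketched as below. The byte values are the standard ASCII control codes (ETX = 0x03, DLE = 0x10); the rule that a DLE in the body is itself escaped with another DLE is an assumption of this sketch, and the function names are illustrative.

```python
DLE = 0x10  # ASCII Data Link Escape
ETX = 0x03  # ASCII End of Text

def stuff(body: bytes) -> bytes:
    """Prefix every ETX or DLE in the body with a DLE so the receiver
    does not mistake body data for the end-of-frame sentinel."""
    out = bytearray()
    for b in body:
        if b in (DLE, ETX):
            out.append(DLE)
        out.append(b)
    return bytes(out)

def unstuff(stuffed: bytes) -> bytes:
    """Drop each escaping DLE and keep the byte that follows it."""
    out = bytearray()
    it = iter(stuffed)
    for b in it:
        if b == DLE:
            b = next(it)  # the escaped byte is taken literally
        out.append(b)
    return bytes(out)

body = bytes([0x41, ETX, 0x42, DLE])
assert unstuff(stuff(body)) == body
```

After stuffing, an unescaped ETX can only be the real end-of-text marker that the sender appends after the body.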
Framing (continued)
PPP (typically run over dialup networks) (DataLink protocol)
Frame format (field widths in bits):
Flag (8) | Adr (8) | Ctrl (8) | Protocol (16) | Payload | Checksum (16) | Flag (8)
- Flag = 01111110 (sentinel character)
- Adr/Ctrl usually hold default values (unused)
- Protocol identifies the high-level protocol (IP, IPX...)
- Payload (default = 1500 B, or negotiated by LCP)
- Checksum field is 2 or 4 bytes (2 by default)
- LCP (Link Control Protocol): sends control messages encapsulated in PPP frames
PPP also uses character stuffing when the sentinel occurs in the Payload.
Counter-based approach (DDCMP (DEC): byte-counting, byte-oriented framing protocol)
- include the payload length in the header, e.g., DDCMP
- problem: count field corrupted
- solution: catch when CRC fails
Frame format (field widths in bits):
SYN (8) | SYN (8) | Class (8) | Count (14) | Header (42) | Body | CRC (16)
The number of bytes in the frame is given in the Count sub-field of the header.
If the Count field gets corrupted, the receiver accumulates as many bytes as the corrupted Count indicates, then uses the error detection field (e.g., CRC) to determine that the frame is bad (a framing error).
Bit-Oriented Protocols (HDLC/SDLC)
HDLC: High-Level Data Link Control.
- SDLC (Synchronous Data Link Control) (IBM) was standardized by ISO as HDLC. We discuss HDLC only.
- Delineate the frame with a special beginning/ending bit sequence: 01111110
- Problem: the special pattern may appear in the payload.
- Solution: bit stuffing. The sender inserts a 0 after five consecutive 1s; the receiver deletes the 0 that follows five consecutive 1s.
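The bit-stuffing rule can be sketched on strings of '0'/'1' characters. This is a simplified illustration: it handles only the payload between flags, and the receiver side assumes its input was produced by the matching sender (a real HDLC receiver would also treat a 1 after five 1s as a flag or an error). The function names are illustrative.

```python
def bit_stuff(bits: str) -> str:
    """Sender: insert a 0 after every run of five consecutive 1s."""
    out = []
    run = 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")
            run = 0
    return "".join(out)

def bit_unstuff(bits: str) -> str:
    """Receiver: delete the bit that follows every run of five 1s."""
    out = []
    run = 0
    i = 0
    while i < len(bits):
        b = bits[i]
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            i += 1  # skip the stuffed 0
            run = 0
        i += 1
    return "".join(out)

print(bit_stuff("01111110"))  # 011111010
```

Because six consecutive 1s can no longer occur inside the stuffed payload, the flag 01111110 is unambiguous on the wire.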
Approaches (clock-based)
e.g., SONET: Synchronous Optical Network (first proposed by Bellcore, then ANSI)
- Fixed-size frames, each 125 us long. Dominant standard for long-distance optical transmission.
- ATM Physical layer protocol (ISO: roughly Datalink + physical layers?)
- STS-1 (STS = Synchronous Transport Signal): 51.84 Mbps, 810-byte frames (9 rows, 90 columns)
[Figure: 2 back-to-back SONET frames (SPE = Synchronous Payload Envelope)]
- The first 3 bytes of each row are overhead. The first 2 overhead bytes are a special Frame Start Pattern (FSP) that marks the start of the frame.
- The FSP recurs every 810 bytes, which the receiver uses for synchronization. (Other occurrences of the pattern are OK since the FSP is positional; no bit stuffing is needed.)
- Rates up to STS-48 at 2488.32 Mbps are all multiples of STS-1. An STS-3 SONET frame is 2430 bytes, so 3 STS-1 frames fit exactly in one STS-3 frame.
- An STS-N frame can be thought of as N STS-1 frames byte-interleaved. The bytes of each STS-1 frame are evenly paced: they show up at the receiver spread across the 125 us, not bunched up in one 1/N segment.
- STS-Nc: c is for concatenated. The user can view it as one N*51.84 Mbps pipe, rather than N separate 51.84 Mbps pipes that happen to share the fiber.
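The STS-N rates follow directly from the frame geometry: every frame lasts 125 us, and an STS-N frame carries N times the 9 x 90 bytes of an STS-1 frame. A short arithmetic sketch (function names are illustrative):

```python
FRAME_TIME_S = 125e-6   # every SONET frame lasts 125 microseconds
ROWS, COLS = 9, 90      # STS-1 frame layout in bytes

def sts_frame_bytes(n: int) -> int:
    """Bytes in one STS-n frame (n byte-interleaved STS-1 frames)."""
    return ROWS * COLS * n

def sts_rate_mbps(n: int) -> float:
    """Line rate of STS-n in Mbps: bits per frame divided by frame time."""
    return sts_frame_bytes(n) * 8 / FRAME_TIME_S / 1e6

for n in (1, 3, 12, 48):
    print(f"STS-{n}: {sts_frame_bytes(n)} bytes/frame, {sts_rate_mbps(n):.2f} Mbps")
```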
Frame Error Detection: 2-D Parity (even)
- Errors are rare in optical fiber. Bit errors can be handled by detection plus retransmission or by error-correcting codes. Since error-correcting codes are not advanced enough, detection/retransmission is always used.
- CRC (Cyclic Redundancy Check) is used in nearly all link protocols (HDLC, DDCMP, CSMA, Token Ring...)
- 2-D parity is used, e.g., in BISYNC when carrying ASCII
- 1-D parity adds 1 bit to a 7-bit code to balance the number of 1s:
  Odd parity adds a bit so the number of 1-bits is odd.
  Even parity adds a bit so the number of 1-bits is even.
- 2-D parity does 1-D parity on each byte and then the same across each bit position of all the bytes in the frame.
[Figure: 2-D (even) parity for a 6-byte frame]
- 2-D parity catches all 1-, 2-, and 3-bit errors and most 4-bit errors.
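A minimal sketch of 2-D even parity over a frame of 7-bit characters. The six sample rows are arbitrary illustration data, and the function names are assumptions of this sketch.

```python
def parity_bit(bits):
    """Even parity: the bit that makes the total number of 1s even."""
    return sum(bits) % 2

def two_d_parity(rows):
    """Append a parity bit to each row, then append a parity row that
    covers every column (including the column of row-parity bits)."""
    with_row_parity = [row + [parity_bit(row)] for row in rows]
    parity_row = [parity_bit(col) for col in zip(*with_row_parity)]
    return with_row_parity + [parity_row]

def check(block):
    """A received block is consistent if every row and column has even parity."""
    rows_ok = all(sum(r) % 2 == 0 for r in block)
    cols_ok = all(sum(c) % 2 == 0 for c in zip(*block))
    return rows_ok and cols_ok

frame = [
    [0, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1],
]
block = two_d_parity(frame)
print(check(block))   # consistent as sent

block[2][3] ^= 1      # flip one bit in transit
print(check(block))   # the row and column parities no longer agree
```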
Internet Checksum Algorithm
Idea: view message as a sequence of 16-bit integers.
Add these integers together using 16-bit ones complement
arithmetic, and then take the ones complement of the result.
That 16-bit number is checksum.
Receiver recalculates checksum and compares.
Misses pairs of errors.
(Why 1’s complement? Easy to implement in hardware).
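The checksum described above can be sketched in a few lines. Padding an odd-length message with a zero byte is an assumption of this sketch, and the sample bytes are an IPv4-style header chosen only for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement sum of the data, then complemented."""
    if len(data) % 2:
        data += b"\x00"          # assumption: pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return ~total & 0xFFFF

# Illustrative header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
csum = internet_checksum(header)
print(hex(csum))

# Receiver side: with the checksum filled in, the recomputed value is 0.
filled = header[:10] + csum.to_bytes(2, "big") + header[12:]
print(internet_checksum(filled))
```

Because addition is commutative, reordering 16-bit words or making two errors that cancel in the sum leaves the checksum unchanged, which is how pairs of errors can be missed.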
Cyclic Redundancy Check
Add k bits of redundant data to an n-bit message
- want k << n, e.g., k = 32 and n = 12,000 (1500 B)
Represent an n-bit message as a polynomial of degree n-1
- e.g., MSG = 1001 1010 as M(x) = x^7 + x^4 + x^3 + x^1
k is the degree of some specified divisor polynomial
- e.g., C(x) = x^3 + x^2 + 1
Based on mod-2 polynomial arithmetic (finite fields), so the coding/checking algorithm can be implemented in hardware.
Let P, C be mod-2 polynomials (identified with their coefficient bit-sequences).
(Note: if deg P >= deg C, then C divides P - rem{P/C} evenly.)
- Sender and receiver agree on a divisor, C, of degree k.
- To send a message, M, the sender appends k zeros (on the right) to form T, and transmits P = T - rem{T/C} (which is divisible by C).
- The receiver checks that P divides evenly by C to detect errors.
- Mod-2 arithmetic is used, so subtraction is the same as XOR.
CRC (cont)
Transmit a polynomial P(x) that is evenly divisible by C(x):
- shift M(x) left k bits, i.e., form M(x)x^k
- subtract the remainder of M(x)x^k / C(x) from M(x)x^k
The receiver gets the polynomial P(x) + E(x).
- E(x) = 0 implies no errors
- Divide (P(x) + E(x)) by C(x); the remainder is zero if:
  E(x) was zero (meaning there is no error), or
  E(x) is nonzero and exactly divisible by C(x) (undetected error)
E.g., C = 1101, M = 1001 1010, T = 1001 1010 000. Computing T/C:

              11111001        (quotient)
C = 1101 ) 10011010000        (T)
           1101
            1001
            1101
             1000
             1101
              1011
              1101
               1100
               1101
                  1000
                  1101
                   101        (remainder)

P = T - remainder = 1001 1010 101 is transmitted.
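The worked example (M = 1001 1010, C = 1101) can be reproduced with a short mod-2 long division. Bit strings are used for readability; the function names are illustrative, and a real adaptor would do this in hardware with a shift register.

```python
def mod2_remainder(bits: str, divisor: str) -> str:
    """Remainder of mod-2 (XOR) polynomial division, as a k-bit string."""
    work = [int(b) for b in bits]
    div = [int(b) for b in divisor]
    k = len(div) - 1
    for i in range(len(work) - k):
        if work[i]:                       # leading bit is 1: "subtract" (XOR) the divisor
            for j, d in enumerate(div):
                work[i + j] ^= d
    return "".join(str(b) for b in work[-k:])

def crc_encode(message: str, divisor: str) -> str:
    """Append k zeros to form T, then replace them with rem{T/C} to get P."""
    k = len(divisor) - 1
    return message + mod2_remainder(message + "0" * k, divisor)

P = crc_encode("10011010", "1101")
print(P)                                  # 10011010101
print(mod2_remainder(P, "1101"))          # 000, so the receiver accepts the frame
```

Appending the remainder is the same as subtracting it from T, because the low k bits of T are zero and mod-2 subtraction is XOR.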
Selecting C(x). The method detects:
- All single-bit errors, as long as the x^k and x^0 terms have nonzero coefficients.
- All double-bit errors, as long as C(x) has a factor with at least 3 terms.
- Any odd number of errors, as long as C(x) contains the factor (x + 1).
- Any "burst" error (i.e., sequence of consecutive error bits) with length < k bits. Most burst errors longer than k bits can also be detected.
Since C divides the transmitted P evenly, if it divides P + E it must divide E evenly also. So an error goes undetected only when C divides E.
Common C(x):
CRC       | C(x)
==========|=====
CRC-8     | x^8 + x^2 + x^1 + 1
CRC-10    | x^10 + x^9 + x^5 + x^4 + x^1 + 1
CRC-12    | x^12 + x^11 + x^3 + x^2 + 1
CRC-CCITT | x^16 + x^12 + x^5 + 1
CRC-32    | x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + 1
Why:
- Single-bit error: E(x) = x^i. C contains the terms x^k and 1, so it has at least two terms and cannot divide x^i evenly.
- Double-bit errors correspond to E = x^i + x^j. If C has a factor with at least 3 terms, it cannot divide E evenly.
- An odd number of errors corresponds to E with an odd number of terms. If C contains x+1 (i.e., C = D * (x+1)) and divides E evenly, then E must also contain x+1, i.e., E = Q * (x+1). Evaluating at x = 1 in mod-2 arithmetic gives E(1) = Q(1)*(1+1) = Q(1)*0 = 0, but a polynomial with an odd number of terms evaluates to 1 at x = 1, a contradiction.
Reliable Transmission




Some error codes are strong enough to detect and also correct errors.
However error correcting code is not used (theory not yet advanced enough?)
Therefore errors detected trigger retransmission of the frame.
This is usually accomplished using acknowledgements and timeouts
Automatic Repeat reQuest (ARQ)
Acks & Timeouts
- An Ack is a control frame one peer sends to another (a header with no data, or piggybacked on a data frame).
- If the sender gets no Ack before the timeout, it retransmits.
There are three standard ARQ protocols: Stop-and-Wait, Sliding Window, Concurrent Logical Channels.
STOP AND WAIT
- Sender waits for an ack after each frame.
- If a timeout occurs, it retransmits the frame.
[Figure: four Sender/Receiver timelines, (a) through (d), showing Frame, ACK, and Timeout events]
(a) The Ack is received before the timeout expires.
(b) The original frame is lost. Retransmission occurs.
(c) The Ack is lost. A timeout occurs, with retransmission.
(d) The timeout fires too soon. Retransmission occurs unnecessarily.
In (c) and (d) the receiver needs to know that the second frame is a retransmission, to distinguish it from the next frame. The header carries a 1-bit seq# (if the seq# does not change, it's a retransmit).
Stop-and-Wait problem?
Problem: keeping the pipe full (link utilization; only 1 frame per RTT).
Example:
- 1.5 Mbps (= 1500 Kbps) link x 45 ms RTT = 67.5 Kb (~8 KB)
- 1 KB frames: bits/frame / time/frame = 1024*8 / 0.045 = 182 Kbps = 1/8th of link capacity
Recall delay*bandwidth = amount of data that could be in transit.
Would like to be able to send this much data without waiting for the 1st ack.
(keeping the pipe full principle: the following two algorithms do better at that)
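The stop-and-wait numbers can be reproduced with a few lines of arithmetic. As in the example, frame transmission time is ignored, so one frame is delivered per RTT; the variable names are illustrative.

```python
link_bps = 1.5e6          # 1.5 Mbps link
rtt_s = 0.045             # 45 ms round-trip time
frame_bits = 1024 * 8     # 1 KB frames

delay_bandwidth_bits = link_bps * rtt_s          # data that fits "in the pipe"
stop_and_wait_bps = frame_bits / rtt_s           # one frame per RTT
utilization = stop_and_wait_bps / link_bps
frames_to_fill_pipe = delay_bandwidth_bits / frame_bits

print(f"delay x bandwidth: {delay_bandwidth_bits / 1000:.1f} Kb")
print(f"stop-and-wait throughput: {stop_and_wait_bps / 1000:.0f} Kbps")
print(f"utilization: {utilization:.3f} (about 1/8)")
print(f"frames needed in flight: {frames_to_fill_pipe:.1f}")
```

Roughly eight frames must be outstanding to keep the link busy, which is why the sender should be ready to send the 9th frame about when the 1st ack returns.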
Sliding Window
In the Stop & Wait example, we would like the sender to be ready to send the 9th frame at about the time the 1st ack returns.
- Allow multiple outstanding (un-ACKed) frames
- Upper bound on the number of un-ACKed frames, called the window
[Figure: Sender/Receiver timeline with multiple frames in flight]
SW: Sender
The sender assigns a seq# (SeqNum) to each frame (assume the SeqNum space is unlimited for now).
Maintain three state variables:
- send window size (SWS)
- last acknowledgment received (LAR)
- last frame sent (LFS)
Maintain the invariant LFS - LAR <= SWS.
When an ack arrives, the sender moves LAR right (allowing 1 more frame to be sent).
The sender has a timer for each frame, retransmitting when it times out.
The sender needs to be willing and able to buffer up to SWS frames.
The receiver needs to decide whether or not to send an ack:
- Let SeqNumToAck be the largest SeqNum not yet ack'ed such that all frames with SeqNums up to and including it have been received.
- The receiver acks receipt of SeqNumToAck even if higher SeqNums have been received. This ack is said to be cumulative.
- The receiver then sets LFR = SeqNumToAck and adjusts LAF = LFR + RWS (LFR = last frame received, LAF = largest acceptable frame, RWS = receive window size).
Sequence Number Space
- The SeqNum field is finite; sequence numbers wrap around.
- The SeqNum space must be larger than the number of outstanding frames.
- SWS <= MaxSeqNum-1 is not sufficient (here MaxSeqNum is the number of available sequence numbers):
  suppose a 3-bit SeqNum field (0..7), so MaxSeqNum = 8
  SWS = RWS = 7
  sender transmits frames 0..6
  they arrive successfully, but the ACKs are lost
  sender retransmits 0..6
  receiver is expecting 7, 0..5, but receives the second incarnation of 0..5
- SWS < (MaxSeqNum+1)/2 is the correct rule.
- Intuitively, SeqNum "slides" between the two halves of the sequence number space.
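The lost-ACK scenario can be checked for every window size with a small sketch. It assumes RWS = SWS and that `seq_space` is the number of distinct sequence numbers (8 for a 3-bit field); the function name is illustrative.

```python
def retransmission_is_ambiguous(seq_space: int, sws: int) -> bool:
    """Replay the scenario: one full window arrives, every ACK is lost,
    and the sender retransmits. Returns True if the receiver could
    mistake a retransmitted frame for a new one."""
    # First window: frames 0..sws-1, numbered modulo the sequence space.
    sent = [n % seq_space for n in range(sws)]
    # The receiver got them all, so it now accepts the next sws frames.
    expected = {n % seq_space for n in range(sws, 2 * sws)}
    # The sender times out and resends the first window.
    return any(n in expected for n in sent)

for sws in range(1, 8):
    safe = not retransmission_is_ambiguous(8, sws)
    print(f"SWS={sws}: {'safe' if safe else 'ambiguous'}")
```

With 8 sequence numbers the largest safe window is 4, in agreement with SWS < (MaxSeqNum+1)/2.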
Concurrent Logical Channels (used by ARPANET)
- Multiplex 8 logical channels over a single link
- Run stop-and-wait on each logical channel (the channels together keep the pipe full)
- Maintain three state bits per channel:
  channel busy/not_busy
  next sequence number in
  current sequence number out
- Header: 3-bit channel number, 1-bit sequence number (4 bits total, the same as the sliding window protocol)
- Separates reliability from order (does not keep frames in order, and provides no flow control)
Are there active networking variations of this algorithm that might improve it?
Shared Access Networks
Outline
- Bus (Ethernet 802.3)
- Token ring (FDDI 802.5)
- Wireless (802.11)
Ethernet Overview
History
- developed by Xerox PARC in mid-1970s
- roots in Aloha packet-radio network
- standardized by Xerox, DEC, and Intel in 1978
- IEEE 802.3 is similar (wider set of media, up to 10 Gbps)
CSMA/CD (carrier sense, multiple access, collision detect)
Frame Format (min frame is 512 bits: 64 B = 14 B header + 46 B data + 4 B CRC); field widths in bits:
Preamble (64) | Dest addr (48) | Src addr (48) | Type (16) | Body | CRC (32)
- 64-bit Preamble: allows the receiver to synchronize with the signal (alternating 0s and 1s: 0101010..1)
- Both Dest and Source addresses are 48-bit addresses.
- Packet Type field is the demultiplexing key (identifies the high-level protocol to which the frame should be delivered)
- CRC is 32 bits
- Bit-oriented framing protocol (like HDLC)
CSMA/CD Basics
Carrier Sense (CS part): check the line.
If the line is idle...
- send immediately
- upper bound on message size of 1500 bytes
- must wait 9.6 us between back-to-back frames
If the line is busy...
- wait until idle and transmit immediately (called persistent; a non-persistent alternative exists)
- Non-persistent: the station doesn't continue to monitor the busy line for the first moment it goes idle. Instead, it waits a random period, then repeats.
Collision Detection (CD part): continue to listen after sending, for 2 wire-traversal times, for a collision (terminators absorb the signal at each end).
If collision...
- Delay a random period and try again (back-off). E.g., choose the delay period as follows:
  1st time: 0 or 51.2 us (randomly chosen)
  2nd time: 0, 51.2, 102.4, or 153.6 us
  3rd time: k x 51.2 us, for randomly selected k = 0..7
  nth time: k x 51.2 us, for randomly selected k = 0..2^n - 1
- Give up after several tries (usually 16).
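The back-off rule can be sketched as a single function. It follows the rule as stated here (k drawn uniformly from 0..2^n - 1 after the nth collision, give up at 16); the function name, the constant names, and raising an exception to signal "give up" are assumptions of this sketch.

```python
import random

SLOT_US = 51.2        # back-off slot time in microseconds
MAX_ATTEMPTS = 16     # give up after this many collisions

def backoff_delay_us(collisions: int, rng=random) -> float:
    """Delay before the next attempt, after the given number of collisions
    on the same frame: k * 51.2 us with k drawn from 0 .. 2^n - 1."""
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions, giving up on this frame")
    k = rng.randrange(2 ** collisions)
    return k * SLOT_US

for n in (1, 2, 3):
    print(n, backoff_delay_us(n))
```

Doubling the range after each collision spreads contending stations over more slots, so repeated collisions become less likely as load grows.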

Each NIC has a unique 48-bit Ethernet address burned into ROM by the vendor, e.g., 8:0:2b:e4:b1:2.
Each vendor is issued a range by prefix (e.g., AMD has 8:0:20). Broadcast = all 1s; multicast = first bit is 1.
Bandwidth: 10 Mbps, 100 Mbps, 1 Gbps, 10 Gbps? MaxLen: 2500 m (five 500 m segments with 4 repeaters).
Problem: a distributed algorithm that provides fair access.
We concentrate on 10 Mbps since 100, 1000, and 10,000 Mbps Ethernets use full-duplex, pt-pt configurations (switched networks with one (or a very few) station on each link).
Collision detection may take up to two line-traversal times (2τ, where τ is the one-way traversal time).
Ethernet
[Figure: coaxial cable cross-section showing center conductor, dielectric material, braided outer conductor, and outer cover]
(a) Thickwire Ethernet or 10Base5
- Original implementation on 50-ohm coaxial cable (CATV = 75 ohm) of up to 500 m.
- Transceiver separate from the NIC, with a "vampire" tap into the link (hosts >= 2.5 m apart)
- 10 Mbps; baseband; 500 m max segment length.
(b) Thinwire Ethernet or 10Base2 (200 m max segment length).
(c) Twisted pair or 10BaseT (T for twisted pair): Cat-5 cable with RJ-45 jacks, usually used with a hub and/or switch.
Ethernet
Multiple Ethernet segments can be joined by a repeater, which forwards the signal.
- No more than 4 repeaters can be used altogether (total = 2500 m for the 10Base series).
- Maximum of 1024 hosts.
Any signal placed on the Ethernet by a host is broadcast over the entire network:
- The signal propagates in both directions, and repeaters forward the signal on all outgoing segments.
- Terminators attached to the ends of segments absorb the signal (eliminating bounce-back).
Uses the Manchester encoding scheme.
100BaseT, aka Fast Ethernet: very similar to 10BaseT.
1000BaseT, aka Gigabit Ethernet: also similar, but cannot connect multiple segments by a hub.
It is important to understand that whether an Ethernet
- spans a single segment,
- is a linear sequence of segments connected by repeaters, or
- is multiple segments connected in a star configuration by a hub,
data transmitted by any one host reaches all other hosts. They all compete for access to the medium, i.e., they are in the same collision domain.
Switched Ethernet: multiple separate Ethernet segments, with messages interchanged between segments by a switch.
Experience with Ethernet
- Works best under light loads (e.g., <= 30% utilization).
- Most have fewer than 200 hosts (not 1024, which is the maximum).
- Most are far shorter than 2500 meters (the maximum).
- Most have an RTT of ~5 microseconds (not 51.2, which is the maximum).
Why have Ethernets been so successful?
- easy to administer and maintain (in straight Ethernet, no switches/routers/configuration tables...)
- easy to add a new host
- inexpensive (cable/adaptors are cheap)
Why does Ethernet not work in some settings (e.g., real-time process control)?
- The probabilistic MAC protocol means that, with bad luck, a host may wait arbitrarily long.
- Token bus solves that.
Token Bus (IEEE 802.4)
For real-time systems (e.g., factory automation), some (e.g., General Motors) prefer Token Bus.
- Bounded worst-case wait for hosts
- Priorities can be implemented
- Hosts take turns (if there are n hosts and it takes T sec to send a frame, no host waits more than nT sec)
- Uses a bus (linear or tree-shaped) rather than a ring, since it fits the layout of an assembly line better.
- Hosts are organized into a unidirectional logical ring, and hosts are numbered.
- The highest-numbered host sends first, then passes the token (a special permission-to-send frame) to the next host.
- Very complex protocol.
Token Ring Overview (IEEE 802.5)
Examples
- 16 Mbps IEEE 802.5 (based on IBM ring)
- 100 Mbps Fiber Distributed Data Interface (FDDI)
Many different token ring technologies exist.
- IBM Token Ring (like Xerox Ethernet) is the most prevalent; it is nearly identical to the IEEE 802.5 standard (we will focus on it).
- Next to Ethernet, Token Ring is the other significant class of shared-media networks.
- FDDI (Fiber Distributed Data Interface) is newer and faster, and deserves some attention.
A token ring consists of a set of nodes connected in a ring.
- Ring = 1 shared medium
- Like Ethernet, it requires a MAC algorithm (who gets to transmit, and when).
- Like Ethernet, each node sees all frames (when the destination address matches, it copies the frame as it flows by).
Token Ring (cont)
- Frames flow in one direction.
- The token is a 3-byte pattern (SD, AC, FC below); each node receives it and forwards it.
- A node wishing to send captures the token (flips a single bit in AC to change SD, AC, FC from that of a token to that of a 3-byte data frame header), then inserts a frame (attaches the rest of the data frame after the header), then releases the token (by flipping the AC bit again and sending it on):
  immediate release (token immediately reinserted)
  delayed release (token not released until the frame is stripped)
- The sender removes the data frame as it comes back around (receivers do not remove it).
- Stations get round-robin service.
- Manchester coding is used.
Frame format (field widths in bits):
SD (8) | AC (8) | FC (8) | Dest addr (48) | Src addr (48) | Body | CRC (32) | ED (8) | Status (24)
- Illegal Manchester codes are used in the Start Delimiter (SD) and End Delimiter (ED)
- AccessCtrl (AC): frame priority, reservation priority, token/data frame
- FrameCtrl (FC): for demultiplexing
- Addresses are 48 bits (same scheme and interpretation as Ethernet)
- Body: no size limit imposed by IEEE 802.5
- CRC-32 is used for error detection
- Status includes bits for reliable delivery (details several slides ahead).
Token Ring
ring interface in


NIC contains transmitter/receiver and at least 1 bit storage between them
While no station wants to transmit, token circulates



Station with data to send



Ring must have sufficient storage capacity in total to hold token (3 bytes)
problem is often avoided by using a "monitor" station with more storage
Siezes token: modifies 1 bit in 2nd byte)
begins sending with dest/multicast/broadcast addr.
Each node checks if destination-address matches

copies packet into a buffer (does not remove packet).

Since packet can be longer than the ring, sender drains packet while sending rest of it.
ring interface in
Token
Ring

How much data is the sender allowed to transmit? (Token Holding Time, or THT)
802.5 THT = 10 milliseconds.
Station must check the remaining THT before sending each packet.
TRT (Token Rotation Time) = time for the token to traverse the ring.
TRT ≤ ActiveNodes × THT + RingLatency
ActiveNodes = number of active nodes
RingLatency = time for the token to circulate the ring un-seized.
802.5 protocol: reliable delivery is accomplished with 2 Status bits, A and C (both initially 0).
When a receiver sees a packet destined for it, it sets A to 1.
When the receiver copies the packet to its buffer, it sets C to 1.
This tells the sender exactly what happened.
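The way a sender might read the A and C bits of a returning frame can be sketched as a small truth table. This is a minimal sketch, assuming the usual reading of the three meaningful combinations; the function name `delivery_status` and the returned strings are illustrative and not taken from the 802.5 standard.

```python
def delivery_status(a_bit: int, c_bit: int) -> str:
    """Interpret the A and C status bits of a frame that came back to its sender."""
    if a_bit == 0 and c_bit == 0:
        # Nobody recognized the destination address.
        return "destination not present or not functioning"
    if a_bit == 1 and c_bit == 0:
        # Address recognized, but the frame was not copied (e.g., no buffer space).
        return "destination saw the frame but did not copy it"
    if a_bit == 1 and c_bit == 1:
        return "destination copied the frame"
    # A=0 with C=1 should not occur: a frame cannot be copied without being recognized.
    return "invalid combination"
```

A sender would typically retransmit in the second case and give up or report an error in the first.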
802.5 supports priorities:




Token contains 3-bit priority field.
Each device that wants to send assigns a priority to that packet
Device can only seize token if its packet priority is at least as high as the priority of the token.
Token priority is changed by 3 reservation bits in the frame header.


A station waiting to send a priority-n packet sets the reservation bits to n as a frame passes (unless the reservation is already ≥ its packet priority).
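The two priority rules above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and it models just the comparison rules stated on the slide (3-bit priorities, so values 0 to 7), not the full 802.5 priority stacking mechanism.

```python
def can_seize(token_priority: int, packet_priority: int) -> bool:
    """A station may seize the token only if its packet priority is at least the token's."""
    return packet_priority >= token_priority


def reserve(reservation_bits: int, packet_priority: int) -> int:
    """Reservation field after a waiting station has seen the frame pass.

    The station raises the reservation to its own priority unless the
    reservation is already at least that high.
    """
    for value in (reservation_bits, packet_priority):
        if not 0 <= value <= 7:
            raise ValueError("priorities are 3-bit values (0..7)")
    return max(reservation_bits, packet_priority)
```

For example, a station holding a priority-5 packet can seize a priority-3 token, and it would raise a reservation of 2 to 5 but leave a reservation of 6 untouched.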
Token Ring Maintenance




Token Rings have designated "monitor" stations (elected initially or on failure of current monitor).
Monitor ensures health of ring.
Healthy monitor announces health periodically with special ctrl message.
If a station doesn't see the monitor's health announcement in time, it assumes failure and tries to become monitor by transmitting a claim token.
If its claim token gets back, it wins the monitor job.
If it sees another "claim token" first, the tie is broken by, e.g., highest node address.
Same procedure for initial election of monitor.
Monitor may need to:






Insert additional delay into the ring (making it long enough to hold a token).
Detect a missing token (using a timer set to MaxTRT = NumStations × THT + RingLatency) and create a new token.
Check for corrupt frames (checksum/format errors), which could otherwise circulate forever.
Check for orphan frames (correctly transmitted, then the parent station died).
These are detected using the header "monitor" bit (initially 0; the monitor flips it to 1).
If the monitor sees bit = 1, it knows the packet is going by a 2nd time and drains it.
Token Ring Multi-Station Access Unit (MSAU) or Wiring Center
[Figure: stations A, B, C, D, and E attached to an MSAU (wiring center)]
Dead station maintenance (MSAU relays can be set to bypass powered down
stations but may not detect more subtle problems)


Station suspecting failure can send beacon frame to suspected destination.
Based on how far it gets, status can be known and MSAU relays can be closed
FDDI (Fiber Distributed Data Interface)
Dual Ring Configuration
(transmitting in opposite directions,
2nd used if 1st fails – loop back )



Runs on fiber, not copper (actually it can run on coax and twisted pair as well).
Dual Attachment Stations (DAS) are usual.
Single Attachment Stations (SAS) are used to reduce expense.
A concentrator is used to attach > 1 SAS (optical bypass if an SAS fails).
Each NIC buffers between 9 and 80 bits (9 ≤ bits ≤ 80).
Station can transmit bits out of the buffer before it's full.
100 Mbps (each bit is 10 nsec wide).
If there is a 10-bit buffer and the station waits
until the buffer is half full to transmit, it
introduces a 50 nsec delay in ring rotation time.
Number of stations is limited to 500.
Maximum distance of 2 km between any adjacent pair of stations.
Overall limit of 200 km of fiber (100 km limit to the ring, since the ring is dual).
Uses 4B/5B encoding.
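The 50 nsec figure above is just the number of buffered bits times the bit time. A minimal sketch of that arithmetic follows; the function names are illustrative.

```python
def bit_time_ns(rate_bps: float) -> float:
    """Duration of one bit on the link, in nanoseconds."""
    return 1e9 / rate_bps


def station_delay_ns(bits_held: int, rate_bps: float) -> float:
    """Delay a station adds to the ring if it holds `bits_held` bits before forwarding."""
    return bits_held * bit_time_ns(rate_bps)


# 100 Mbps FDDI, 10-bit buffer, transmit when half full (5 bits held).
delay = station_delay_ns(bits_held=5, rate_bps=100e6)
```

With every station adding a delay of this kind, the total contributes to the ring latency term used in the token rotation time bound.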
Timed Token Algorithm

Token Holding Time (THT): upper limit on how long a station can hold the token.
Token Rotation Time (TRT): how long it takes the token to traverse the ring.
TRT ≤ ActiveNodes × THT + RingLatency
Target Token Rotation Time (TTRT): agreed-upon upper bound on TRT.
Each node measures TRT between successive tokens:
if measured-TRT > TTRT: token is late, so don't send
if measured-TRT < TTRT: token is early, so OK to send
A node concerned with sending frames with bounded delay uses FDDI traffic classes:
Synchronous: node with the token can send synchronously whether the token is early or late
(for delay-sensitive traffic, e.g., voice or video).
Asynchronous: node can send asynchronously only when the token is early
(for throughput-sensitive traffic, e.g., file transfer).
Synchronous traffic transmits whether the token is early or late. If each node had a sizable amount of synchronous data to send, TTRT would be meaningless.
To account for this, the total amount of synchronous data that can be sent in one token rotation is also bounded by TTRT
(worst case: asynchronous traffic first uses 1 TTRT, then nodes with synchronous traffic use another TTRT, which means TRT at any node ≤ 2*TTRT).
Note: if synchronous traffic consumed 1 TTRT, asynchronous traffic won't send (token is late);
thus if one TRT takes 2*TTRT, the next one can't (no back-to-back 2*TTRT rotations).
Asynchronous traffic can send if measured TRT < TTRT. If the two are nearly equal, asynchronous traffic still sends, so the actual bound for TRT is TTRT + the time to send
a full FDDI frame.
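The send decision of the timed token algorithm can be sketched as follows. This is a minimal sketch of the rules stated above, not an FDDI implementation; the function names and the string labels for the traffic classes are illustrative, and time units are arbitrary as long as they are consistent.

```python
def may_send(measured_trt: float, ttrt: float, traffic_class: str) -> bool:
    """Decide whether a node holding the token may send a frame of the given class."""
    if traffic_class == "synchronous":
        # Synchronous traffic may be sent whether the token is early or late.
        return True
    if traffic_class == "asynchronous":
        # Asynchronous traffic may be sent only when the token is early.
        return measured_trt < ttrt
    raise ValueError(f"unknown traffic class: {traffic_class!r}")


def trt_upper_bound(active_nodes: int, tht: float, ring_latency: float) -> float:
    """TRT <= ActiveNodes * THT + RingLatency."""
    return active_nodes * tht + ring_latency
```

For example, with TTRT = 10 a node that measured TRT = 12 sees a late token, so it may send only synchronous traffic.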
Token Maintenance
FDDI ensures a valid token is always in circulation:
All nodes monitor the ring to be sure the token is not lost.
Each node sets a timer and expects to see a valid transmission every 2.5 ms.
Upon timeout, the node sends a claim frame; when a valid transmission is seen, it resets the timer.
A node can send a claim frame without holding the token and does so when failure is suspected.
Claim frames of FDDI differ from those of 802.5 because they contain the node's TTRT "bid".
bid = the token rotation time the node needs so the applications running there can meet their timing constraints.
When a node receives a claim frame, it checks whether the TTRT bid in the frame is less than its own:
if less, it resets its local definition of TTRT and forwards the claim frame;
if more, the claim frame is removed and the node enters the bidding (puts its own claim frame out);
if equal, the node compares the claim frame's sender address with its own, and the higher wins.
If its claim frame makes it all the way around, the sender knows its TTRT bid was the lowest (it now holds the token).
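The claim-frame bidding rules can be sketched as one decision function. This is a minimal sketch under stated assumptions: addresses and bids are plain integers, the function name and the action strings are hypothetical, and a claim frame that returns to its own sender is treated as a successful claim.

```python
def handle_claim(frame_bid: float, frame_addr: int, own_bid: float, own_addr: int):
    """Return (action, local_ttrt) for a node that receives a claim frame."""
    if frame_addr == own_addr:
        # Our own claim frame came all the way around: our bid was the lowest.
        return ("claim succeeded", own_bid)
    if frame_bid < own_bid:
        # Lower bid wins: adopt it as the local TTRT and pass the frame on.
        return ("forward", frame_bid)
    if frame_bid > own_bid:
        # Our bid is lower: remove the frame and put out our own claim.
        return ("send own claim", own_bid)
    # Equal bids: the higher address wins.
    if frame_addr > own_addr:
        return ("forward", frame_bid)
    return ("send own claim", own_bid)
```

After the bidding settles, every node on the ring has adopted the lowest TTRT bid as its local TTRT.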
Frame Format:
- Uses 4B/5B encoding (and control symbols, not the illegal Manchester codes of 802.5)
- other difference from 802.5 is a bit in header (StartOfFrame) for synch/asynch.
Wireless LANs

Bandwidth: 1-100 Mbps
Physical media: spread spectrum radio (2.4 GHz) or diffused infrared (10 m)
Possibilities are endless (IR within buildings to LEO constellations)
IEEE 802.11 (see www.ansi.org)
802.11 is designed for limited geographic areas (homes, office buildings, campuses)
primary challenge: mediate access to the shared communication medium (signals propagating in space)
supports additional features (time-bounded services, power management, security mechanisms, ...)
Physical Properties

802.11 is designed to run over 3 different media:
2 based on Spread Spectrum Radio technology
(frequency hopping and direct sequence)
1 based on diffused IR
Spread Spectrum
Idea
spread the signal over a wider frequency band than is required
originally designed to thwart jamming (in military uses)
Frequency Hopping
transmit over a random sequence of frequencies
sender and receiver share a pseudorandom number generator and a seed
802.11 uses 79 1-MHz-wide frequency bands
Spread Spectrum (cont)


Direct sequence achieves the same effect by representing each frame bit by multiple bits in the transmitted signal.
Sender sends the XOR of each data bit with n random bits, produced by a pseudorandom number generator known to both sender and receiver.
The transmitted values are known as an "n-bit chipping code".
This spreads the signal across a frequency band that is n times wider than the frame would otherwise require.
802.11 defines 1 physical layer using frequency hopping (over 79 1-MHz-wide frequency bands)
and a 2nd using direct sequence (11-bit chipping sequence).
Both run in the 2.4-GHz frequency band of the EM spectrum.
In both, spread spectrum also makes the signal look like noise to a receiver that doesn't know the pseudorandom sequence.
[Figure: example of a 4-bit chipping sequence]
Data stream: 1010
Random sequence: 0100101101011001
XOR of the two: 1011101110101001
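The 4-bit chipping example above can be reproduced in a few lines. This is a sketch for illustration only: bits are represented as strings of '0' and '1', the function names are hypothetical, and the majority vote used when despreading is one simple way to recover the data, not something the slides specify.

```python
def spread(data_bits: str, chip_sequence: str) -> str:
    """XOR each data bit with its share of the pseudorandom chip sequence."""
    n, rem = divmod(len(chip_sequence), len(data_bits))
    if rem:
        raise ValueError("chip sequence length must be a multiple of the data length")
    out = []
    for i, bit in enumerate(data_bits):
        chips = chip_sequence[i * n:(i + 1) * n]  # the n chips for this data bit
        out.extend(str(int(bit) ^ int(c)) for c in chips)
    return "".join(out)


def despread(signal: str, chip_sequence: str, n: int) -> str:
    """Recover data bits: XOR with the same chips, then majority-vote each group of n."""
    if len(signal) != len(chip_sequence) or len(signal) % n:
        raise ValueError("signal and chip sequence must have equal length, a multiple of n")
    recovered = [int(s) ^ int(c) for s, c in zip(signal, chip_sequence)]
    bits = []
    for i in range(0, len(recovered), n):
        ones = sum(recovered[i:i + n])
        bits.append("1" if 2 * ones > n else "0")
    return "".join(bits)


signal = spread("1010", "0100101101011001")
```

A receiver that does not know the random sequence sees only the XOR output, which is why the signal looks like noise to it.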
Collision Avoidance
Same as Ethernet (CSMA/CD)? Sort of!
More complicated, since not all node pairs are within reach of each other.
Consider:
[Figure: nodes A, B, C, and D, each with a limited radius of transmission]
Hidden node problem:
Suppose A & C both want to communicate with B.
A & C are unaware of each other (hidden nodes with respect to each other).
The 2 frames collide at B.
And there is another problem, the exposed node problem:
B is sending to A,
C hears B's transmission,
C should still be able to transmit to D.
MACA


802.11 addresses both the Hidden Node and Exposed Node problems with MACA
(Multiple Access with Collision Avoidance):
sender and receiver exchange control frames before the sender transmits data, informing nearby nodes.
Sender sends an RTS (RequestToSend) frame (includes a field saying how long the sender wants to hold the medium, e.g., the frame length).
Receiver replies with a CTS (ClearToSend) frame (echoes the length).
Any node seeing the CTS knows it cannot send during that period.
Any node seeing the RTS, but not the CTS, knows it's not close enough to the receiver to interfere, and therefore can transmit.
Receiver sends an Ack (not part of MACA but of the later MACAW; W for Wireless LANs);
all nodes hearing the CTS must wait for the Ack before transmitting.
Should 2 nodes send RTS concurrently, the RTS frames collide;
when the senders realize there is no CTS,
after a random wait, they try again (usually the same type of back-off as Ethernet).
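The neighbor rules of MACA can be sketched with a toy model. Everything here is an assumption for illustration: nodes sit on a line at hypothetical positions, a node hears a frame if it lies within a fixed radius of the transmitter, and the function and action names are not taken from 802.11.

```python
def in_range(pos: dict, a: str, b: str, radius: float) -> bool:
    """True if nodes a and b are within transmission radius of each other."""
    return abs(pos[a] - pos[b]) <= radius


def maca_neighbor_actions(pos: dict, radius: float, sender: str, receiver: str) -> dict:
    """What each bystander does after the RTS (from sender) and CTS (from receiver)."""
    actions = {}
    for node in pos:
        if node in (sender, receiver):
            continue
        hears_rts = in_range(pos, node, sender, radius)
        hears_cts = in_range(pos, node, receiver, radius)
        if hears_cts:
            actions[node] = "defer"          # close enough to the receiver to interfere
        elif hears_rts:
            actions[node] = "may transmit"   # exposed node: hears RTS but not CTS
        else:
            actions[node] = "unaffected"
    return actions


# Hypothetical layout: A, B, C, D in a row, each reaching only its immediate neighbors.
positions = {"A": 0, "B": 1, "C": 2, "D": 3}
exposed_case = maca_neighbor_actions(positions, 1, sender="B", receiver="A")
hidden_case = maca_neighbor_actions(positions, 1, sender="A", receiver="B")
```

In the exposed case C hears B's RTS but not A's CTS, so it may still transmit to D; in the hidden case C hears B's CTS and defers even though it never heard A.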
Sender transmits RequestToSend (RTS) frame
Receiver replies with ClearToSend (CTS) frame
Neighbors:
see CTS: keep quiet
see RTS but not CTS: ok to transmit
Receiver sends ACK when it has the frame
neighbors stay silent until they see the ACK
Supporting Mobility


Access points (APs): not all nodes are equal. Some are allowed to roam; others are connected
to a wired "distribution system".
APs are tethered nodes (connected to the wired distribution system).
Distribution System
APs are like base stations in a cell phone system, and roaming nodes are like cell phones.
Each mobile node associates itself with one AP.
[Figure: distribution system connecting AP-1, AP-2, and AP-3, with mobile nodes A through H]

Mobility (cont)
Scanning (selecting an AP):
node sends a Probe frame
all APs within reach reply with a ProbeResponse frame
node selects 1 AP and sends it an AssociateRequest frame
AP replies with an AssociationResponse frame
new AP informs the old AP via the tethered network
When does this exchange take place?

Active scanning: done by a node when joining the system or moving
(as described above); it actively seeks an AP.

Passive scanning: an AP can periodically send a Beacon frame to advertise its
capabilities (rates, etc.). A node can respond to a Beacon to improve its situation.
Frame Format

Field widths in bits:
| Ctrl (16) | Duration (16) | Adr-1 (48) | Adr-2 (48) | Adr-3 (48) | SeqCtrl (16) | Adr-4 (48) | Payload (0-18,496) | CRC (32) |

Payload is up to 2312 bytes (18,496 bits)

Ctrl contains 3 subfields:
6-bit Type subfield (indicates whether the frame carries data, is an RTS/CTS frame, or is used for scanning)
pair of 1-bit fields called ToDS and FromDS
both = 0 if 1 node is sending directly to another
both = 1 if the message went through the DS, in which case:
Adr-1 identifies the ultimate destination
Adr-2 identifies the immediate sender (the one that forwarded the frame from the DS to the ultimate destination)
Adr-3 identifies the intermediate destination (the one that accepted the frame from a wireless node and forwarded it across the DS)
Adr-4 identifies the original source
Update on Wireless
(from the literature)
The letter after IEEE 802.11 tells the time order in which the standard was 1st proposed
- Main problems are compatibility and security!
802.11a was actually proposed first, tho 11b (WiFi or Wireless Ethernet) clearly got the jump
802.11b (WiFi or Wireless Ethernet) (~50 million installations today?)
802.11d aims to produce versions of 802.11b that work at other frequencies, making it
suitable for other parts of the world.
802.11e will attempt to add QoS to 802.11
802.11f will attempt to improve on the “handover” mechanism of 802.11
802.11h attempts to add better control over transmission power
802.11i aims to add security to 802.11 using the Advanced Encryption Standard (AES), the US
government's official encryption algorithm.
802.11j proposed to cover how 802.11a and HiperLAN2 coexist in same airwaves.
HiperLAN1 European/Japanese standard (ETSI)
HiperLAN2 proposed European/Japanese standard to extend HiperLAN1 to
multimedia.
Wireless update continued
Current WLAN standards:
WLAN System | Capacity (actual) | Max. Range | Frequency | QoS | Ship
802.11b | 11 Mbps (6 actual) | 100 meters | 2.4 GHz | No | Now
802.11a | 54 Mbps (31 actual) | 80 meters | 5 GHz | No | Now
802.11g | 54 Mbps (12 actual) | 150 meters | 2.4 GHz | No | Soon
HomeRF2 | 10 Mbps (6 actual) | 50 meters | 2.4 GHz | Yes | Now
HiperLAN2 | 54 Mbps (31 actual) | 80 meters | 5 GHz | Yes | 2003
5-UP | 108 Mbps (72 actual) | 80 meters | 5 GHz | Yes | 2003
- HomeRF was supposed to be cheaper and more secure than 802.11b but it is more expensive
and only slightly more secure.
- 5-UP (5 GHz Unified Protocol) is a joint venture of IEEE and ETSI.
Network Adaptor Overview
Typically where data link functionality is implemented (Framing, Error Detection, MAC)


Nearly all functionality described so far is implemented in the NIC (framing, error detection, MAC),
except point-to-point automatic repeat request (ARQ) schemes (Stop & Wait, Sliding Window, Concurrent Logical Channels),
which are typically implemented in the lowest-level protocol running on the host.
Generic NIC and device driver software (though there is much variation in small details):
Components
link interface (speaks the correct protocol to the network)
bus interface (understands how to communicate with the host; NICs are always designed for a specific bus)
Each bus defines the protocol used:
for the host CPU to program the NIC,
for the NIC to interrupt the host's CPU,
for the NIC to read and write memory on the host.
Bus supports a data transfer rate: e.g., a bus with a 32-bit data path running at 25 MHz
(cycle time 40 ns) has a peak rate of 800 Mbps (enough for a unidirectional STS-12 link at 622 Mbps).
Link half of NIC implements the link-level protocol:
old protocols, on a chipset;
new protocols, in software on a microprocessor
or in programmable hardware (FPGAs).
Because the host bus and the network link run
at different speeds, buffering is required
(usually simple FIFO queues).
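The bus figures quoted above follow from simple arithmetic: one transfer of the full data-path width per clock cycle. A minimal sketch, with illustrative function names:

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Length of one bus cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def peak_rate_mbps(width_bits: int, clock_mhz: float) -> float:
    """Peak transfer rate if the bus moves `width_bits` bits on every cycle."""
    return width_bits * clock_mhz


bus_rate = peak_rate_mbps(32, 25)   # 32-bit data path at 25 MHz
sts12_rate = 622                    # Mbps, as quoted above
```

This is a peak rate; arbitration and other per-transfer overhead mean a real bus sustains less.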
Host Perspective
NIC is programmed by software running on the host



from the CPU's perspective the NIC exports a control status register (CSR)
(typically some address in memory) that is read/write-able from the CPU.
CPU writes to the CSR to instruct NIC to transmit/receive a frame or to learn current state
of NIC.

Interrupts

Host could sit in tight loop reading CSR until something happens then take action
( polling) but that busy waiting wastes CPU resources.

Instead, most CPUs pay attention to the NIC only when interrupted.

OS "interrupt handler" procedure is invoked, inspects the CSR, and takes action.

OS disables additional interrupts while taking action (servicing the interrupt), so handlers are kept
very short.
Moving Frames Host <-> Adaptor
Two Move Modes: DMA and PIO


Direct Memory Access (DMA) (for frame transfer between NIC and host)
DMA: NIC reads/writes the host's memory without CPU involvement;
host simply gives the NIC a memory address.
PIO (programmed I/O): CPU is directly responsible
for moving data between the NIC and host memory.

To send a frame, the CPU sits in a tight loop that first reads
a word from host memory, then writes it to the NIC.

To receive a frame, the CPU reads words from the NIC and writes them to memory.

Device Drivers

OS routines that anchor protocol graph to network hardware.

routines to:
initialize the device,
transmit frames on the link,
field interrupts (some pseudocode in text).
WUGS (Washington University Gigabit Switch)
Design Goals
Design/implement a low-cost, single-chip, high-speed ATM host-network interface:
2 ATM ports: bi-directional, 1.2 Gbps each (OC-24) (transmit/receive)
1 PCI bus: 32/64-bit bus, 33 MHz (1.05/2.11 Gbps)
AAL-0 and AAL-5 frames
Support 256 (or 1024 for future MM servers) ATM VCs in each direction
Capable of building a DAN (Desk Area Network).
Support LAN applications with low latency requirements.
Support applications with different QoS requirements.
APIC: a high-performance host-network interface chip. High performance means
wide bandwidth and low latency. In a gigabit environment, the performance bottleneck
moves to the end systems.
A Traditional Host Architecture: Hardware
[Figure: three CPUs, each with MMU & cache, and main memory on the C-M bus; a bus adapter connects to the I/O bus, which hosts the network interface, disk, keyboard, monitor, and camera]
A Traditional Host Architecture: Software
[Figure: application processes with user buffers above the kernel; kernel layers labeled Session (sockets, kernel buffers), Transport (UDP, TCP), Network (IP v4/v6), and NIC drivers (Ethernet, ATM0, ATM1) above the NICs]
Problems to solve
Data moving between I/O devices needs to pass over the C-M (CPU-Memory) bus
C-M bus: read bandwidth 1.5 Gbps,
write bandwidth 0.6 Gbps
System call latency
Interrupt livelock
If the processor had to field one interrupt for every packet that is sent or received, no useful
work would be done: all processor cycles would be dedicated to servicing interrupts. This is the interrupt
livelock problem, which plagues many high-performance network adapters, because servicing even
a null interrupt takes longer than the packet inter-arrival time in high-speed networks.
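A rough feel for the livelock condition comes from comparing the cost of one interrupt with the time between arrivals. The sketch below assumes back-to-back 53-byte ATM cells on a 1.2 Gbps port (the port rate quoted in these slides); the interrupt costs used in the example are hypothetical numbers, not measurements.

```python
ATM_CELL_BYTES = 53  # standard ATM cell size: 5-byte header + 48-byte payload


def cell_time_ns(link_bps: float, cell_bytes: int = ATM_CELL_BYTES) -> float:
    """Time between back-to-back cell arrivals on a fully loaded link."""
    return cell_bytes * 8 * 1e9 / link_bps


def livelocked(interrupt_cost_ns: float, link_bps: float) -> bool:
    """True if servicing one interrupt per cell takes at least the inter-arrival time."""
    return interrupt_cost_ns >= cell_time_ns(link_bps)


arrival_gap = cell_time_ns(1.2e9)  # roughly 353 ns per cell
```

With one interrupt per cell, any per-interrupt cost above that gap leaves no cycles for useful work, which is why the design avoids raising an interrupt for every cell.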

APIC SOLUTION:




2 ATM ports for the ATM network interface (in red below)
one PCI port for the host/device interface (in purple below)
Protected I/O and Protected DMA (Control path and data path)
Orchestrated Interrupt and Interrupt demultiplexing
Data move between I/O devices with APICs
[Figure: CPUs with MMU & cache, main memory, C-M bus, bus adapter, and I/O bus; APICs connect devices such as disks/RAID, video jukeboxes, monitor/HDTV, and video camera]
APIC Solution (Cont)
[Figure: processes, OS kernel, and network interface, shown next to processes, OS kernel, and APIC]
Cell/Frame format
[Figure: ATM cell layout (GFC, VPI, VCI, PTI, C, HEC, Payload) and a second cell layout with fields labeled ChanID, C, L, pIn, pOut]
+-------------------------------+
|               .               |
|               .               |
|        CPCS-PDU Payload       |
|    (up to 2^16 - 1 octets)    |
|               .               |
|               .               |
+-------------------------------+
|      PAD (0 - 47 octets)      |
+-------------------------------+ -------
|       CPCS-UU (1 octet)       |
+-------------------------------+
|         CPI (1 octet)         | CPCS-PDU
+-------------------------------+ Trailer
|       Length (2 octets)       |
+-------------------------------+
|        CRC (4 octets)         |
+-------------------------------+ -------
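The PAD field in the layout above exists so that payload, pad, and the 8-octet trailer together fill a whole number of 48-octet ATM cell payloads. A minimal sketch of that calculation follows; the constant and function names are illustrative.

```python
CELL_PAYLOAD = 48   # octets of payload carried by one ATM cell
TRAILER_LEN = 8     # CPCS-UU (1) + CPI (1) + Length (2) + CRC (4)


def aal5_pad_len(payload_len: int) -> int:
    """Number of PAD octets (0..47) needed after a payload of the given length."""
    if not 0 <= payload_len <= 2**16 - 1:
        raise ValueError("CPCS-PDU payload is limited to 2^16 - 1 octets")
    return -(payload_len + TRAILER_LEN) % CELL_PAYLOAD


def aal5_cell_count(payload_len: int) -> int:
    """Number of ATM cells needed to carry the padded CPCS-PDU."""
    total = payload_len + aal5_pad_len(payload_len) + TRAILER_LEN
    return total // CELL_PAYLOAD
```

A 40-octet payload fits exactly in one cell with no padding, while a 41-octet payload needs 47 octets of padding and a second cell.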
Controlling APIC

Control cell
Response cell
Interrupt cell



Pin-configured 16-bit address for APIC ID
Dedicated control VCs for sending control cells.
Each control cell causes a response cell.
Pre-specify a VC within a control cell for receiving
its corresponding response cell.
Interrupt cell is sent in configured interrupt channel
with a very low pace rate. It is used to report
asynchronous events
User-Space Control




User-space driver and kernel driver
Protected Memory-mapped I/O accesses to
on-chip registers of APIC (Protected I/O)
Protected accesses to shared-data structures
in main memory (Protected DMA)
The degree of protection depends on the
policies defined by the kernel driver.
Protected I/O
AMR: Access mask register
[Figure: global registers (kernel access) and per-channel registers mapped for kernel access and for user access; an example AMR with bits 4..0 = 1 0 1 1 0 and the resulting R/W or R/O rights on registers R #1 to R #4]
Virtual Memory Management
[Figure: CPU, mapper, and main memory; logical pages mapped to physical frames, with an access code per mapping]
Shared data access - DMA
Three modes are shown: Simple DMA, Pool DMA, and Protected DMA.
APIC compares a pair of kernel descriptor and user descriptor for DMA protection.
[Figure: APIC in each of the three DMA modes]
Packet Splitting
Header-Data boundary: header length field
Data-Trailer boundary: length field in AAL-5 trailer
APIC provides for 4 global pool chains
[Figure: APIC with header pool, data pool, and trailer pool; page table mappings place Data 1, Data 2, and Data 3 in the application buffer in the application's VA space]
Interrupt De-multiplexing and Orchestrated Interrupts
APIC implemented a one-bit flag in the state for every channel. When
an interrupt is issued, in response to an event on that channel, the APIC
automatically disables more interrupts from occurring for that channel
by clearing that bit, which remains cleared until the driver sets it again at
some future time.
Orchestrated interrupts are interrupts that are issued in response to an
event that is expected and to which the processor assigns special
significance. In the APIC context, this manifests itself as interrupts that
would be issued when the APIC reads in a specially marked descriptor.
A notification list is used to allow the driver to quickly identify what interrupt
events have occurred and the channels that caused those events to occur. Every
entry in the notification list contains the channel ID of an active channel, and a
bit vector of the different kinds of events that have occurred on that channel.
Each time this notification register is read, a new entry from the notification list
is returned, and that entry is deleted from the list.
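The per-channel interrupt flag and the notification list can be modeled in software to see how they cut down the interrupt count. This is a toy model under stated assumptions, not the APIC's actual register interface: the class and method names are hypothetical, event kinds are numbered bit positions, and entries are returned in the order channels first became active.

```python
from collections import OrderedDict


class InterruptNotifier:
    """Toy model of per-channel interrupt enable flags plus a notification list."""

    def __init__(self):
        self.intr_enabled = {}          # channel ID -> one-bit enable flag
        self.pending = OrderedDict()    # channel ID -> bit vector of event kinds
        self.interrupts_raised = 0

    def open_channel(self, chan: int) -> None:
        self.intr_enabled[chan] = True

    def event(self, chan: int, event_bit: int) -> None:
        # Record the event in the channel's notification entry.
        self.pending[chan] = self.pending.get(chan, 0) | (1 << event_bit)
        # Raise an interrupt only if the flag is set, then clear the flag.
        if self.intr_enabled[chan]:
            self.intr_enabled[chan] = False
            self.interrupts_raised += 1

    def read_notification(self):
        """Return and delete the next (channel ID, event bit vector) entry, or None."""
        if not self.pending:
            return None
        return self.pending.popitem(last=False)

    def reenable(self, chan: int) -> None:
        """The driver sets the flag again at some later time."""
        self.intr_enabled[chan] = True
```

Two events on the same channel raise only one interrupt, and the driver learns about both from a single notification entry.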
Other Features
TCP checksum assist: APIC computes the TCP checksum over the entire AAL-5 frame
(implements the TCP checksum algorithm in hardware). The value is made
available to the software by writing it into the last descriptor for the frame. The
software computes the checksum over the portions of the frame that are not part of
the TCP packet, and “subtracts” the result from the value over the entire frame
to obtain the TCP checksum.
Transport CRC assist: CRC-32 algorithm (the same one used for AAL-5) for
application-specific, customized user-space transport protocols. The trick is
the same as in TCP checksum assist.
Flow control: a hardware-level flow control as defined by the UTOPIA specification, and
a generic flow control (GFC) that works at the ATM layer. GFC has to be enabled by a
configuration pin on the chip. When GFC is enabled, the APIC includes a bit in the GFC
field of every cell sent to the upstream APIC, which signals to that APIC whether the flow
control grant is asserted or not. If there are no cells to be sent to the upstream APIC, the
APIC sends cells anyway on a special flow control VC (VPI/VCI=0xff/0xff21); the
upstream APIC knows to extract the flow control bit from cells received on this VC and
then discard these cells.
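The "subtract" trick used by the TCP checksum assist relies on the Internet checksum being a ones'-complement sum, so a partial sum can be removed from a total. The sketch below illustrates only that arithmetic, under stated assumptions: it works on the raw 16-bit ones'-complement sum (no pseudo-header and no final complement), the function names are hypothetical, and the non-TCP portion is assumed to start at the beginning of the frame with an even length.

```python
def oc_add(a: int, b: int) -> int:
    """16-bit ones'-complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def oc_sum(data: bytes) -> int:
    """Ones'-complement sum of the data taken as big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total = oc_add(total, (data[i] << 8) | data[i + 1])
    return total


def oc_sub(a: int, b: int) -> int:
    """Ones'-complement subtraction: add the bitwise complement of b."""
    return oc_add(a, ~b & 0xFFFF)


def tcp_sum_from_frame_sum(frame_sum: int, non_tcp_bytes: bytes) -> int:
    """Remove the contribution of the non-TCP bytes from the whole-frame sum."""
    return oc_sub(frame_sum, oc_sum(non_tcp_bytes))
```

Ones'-complement arithmetic has two representations of zero (0x0000 and 0xFFFF), so results are best compared modulo 0xFFFF.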
Functional Block Diagram of APIC Internal Modules
[Figure: block diagram of the APIC. Labeled blocks: UTOPIA ports, Input Sync, VCXT, Cell Store, Output Sync, Rx Sync, Tx Sync, Pacer, Request, ReqMgr, DataPath, IntrNfyMgr, BusInterface, PCI-32/64 Bus, Intr; callouts A through F]
[Figures: the same block diagram repeated once for each of the following paths]
The Control and Response Cell Path
The Cell Transit Path
The Cell Receive Path
The Cell Transmit Path
The Loopback Path
The Multi-point Receive Path
The Multipoint Transmit Path
The Functions of Internal Modules









Input/Output ports: strip/add the HEC byte, react to flow control signals (both UTOPIA and GFC).
BusInterface: implements the PCI bus protocol. Forwards register access requests to the
RegisterManager. Acts as bus master for transaction requests from the DataPath module.
RegisterManager: handles accesses to all on-chip control/status registers except for the PCI
configuration registers.
VCXT: VC translation table module. Adds an internal header to each incoming cell.
Cell store: 256 cells. All transit cells are automatically categorized by the VCXT as low delay (by setting a
bit in the internal header). There is a low delay and a normal delay queue for each ATM port, and a low delay
queue and up to 256 normal delay queues (one normal queue per connection) for the PCI port. A
busy VCindex list is used to keep track of the busy normal delay queues for the PCI port. Service
discipline for normal delay queues: drain all the cells in a queue before moving to the next
connection in the VCindex list. (Jitter is limited by the 256-cell capacity. Better performance requires
using the low delay queue.)
RxSync: synchronization. A store of 8 cells for batch processing.
Requestor: contains most of the per-channel state of the transmit and receive channels
(when, where, how; arbitration, DMA, size, interrupts, etc.).
DataPath: moves data; computes CRC and checksum.
IntrNfyMgr: decides whether or not to raise an actual interrupt line.
VC Translation
[Figure: VC translation. Inputs from the incoming ATM cell: VPI <8> and VCI <16>. The VC Table (entries 0 to 255) holds VCopen <1>, VCtag <16>, VCindex <8>, and VCXdata <17>. A comparison (=?) and an AND produce the translation success indication and the translated VCXdata.]
If an incoming cell is not successfully
translated, then the VCXT treats the
cell as a transit cell. Such a cell should
be forwarded to the “other” ATM port.
Addresses and Formats of Registers
Address formats (field widths in bits):
Global registers: 00 (2) | 00000000000000 (14) | RegID (9) | 00 (2)
Kernel-access per-channel registers: 10 (2) | t (1) | 00000000 (8) | CID (8) | RegID (6) | 00 (2)
User-access per-channel registers: 11 (2) | t (1) | CID (8) | 00000000 (8) | RegID (6) | 00 (2)
0x400nnf8 AMR: kernel-access for Rx
0x500nnf8 AMR: kernel-access for Tx
0x6nn00f8 AMR: user-access for Rx
0x7nn00f8 AMR: user-access for Tx
0x500nn10 Channel ATM Header Register: kernel-access
0x7nn0010 Channel ATM Header Register: user-access
[Figure: register bit layouts, labeled "AMR: 31" and "AMR: 2"; fields shown include GFC, VPI, VCI, and C]
Addresses and Formats of Registers (cont’)
0x400nnD0 Connection Setup Register (Rx): kernel-access
0x6nn00D0 Connection Setup Register (Rx): user-access
0x400nnD4 Connection Multicast Vector Register (Rx): kernel-access
0x6nn00D4 Connection Multicast Vector Register (Rx): user-access
[Figure: register bit layouts, each labeled "AMR: 26"; fields shown include V, L, R, O, AuxChanID, VCtag, MV1, and MV0]
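The per-channel register addresses listed on these two slides can be rebuilt from their fields. The sketch below is an interpretation, not a documented API: the field positions are read off the address-format figure (2-bit prefix, 1-bit t, 8-bit CID, 6-bit RegID, two low zero bits), the reading of t as 0 for Rx and 1 for Tx is inferred from the 0x4/0x5 and 0x6/0x7 address pairs, and the function names are hypothetical.

```python
def _check(cid: int, reg_id: int) -> None:
    if not 0 <= cid <= 0xFF:
        raise ValueError("CID is an 8-bit field")
    if not 0 <= reg_id <= 0x3F:
        raise ValueError("per-channel RegID is a 6-bit field")


def kernel_channel_addr(tx: bool, cid: int, reg_id: int) -> int:
    """Kernel-access layout: 10 | t | 00000000 | CID | RegID | 00."""
    _check(cid, reg_id)
    return (0b10 << 25) | (int(tx) << 24) | (cid << 8) | (reg_id << 2)


def user_channel_addr(tx: bool, cid: int, reg_id: int) -> int:
    """User-access layout: 11 | t | CID | 00000000 | RegID | 00."""
    _check(cid, reg_id)
    return (0b11 << 25) | (int(tx) << 24) | (cid << 16) | (reg_id << 2)


AMR_REG_ID = 0xF8 >> 2  # RegID implied by the ...f8 addresses of the AMR
```

With channel ID 0x12 standing in for nn, these functions reproduce the patterns 0x400nnf8, 0x500nnf8, 0x6nn00f8, and 0x7nn00f8 listed for the AMR.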
Functional Block Diagram of Smart Port Card
[Figure: blocks labeled Intel Embedded Module, CPU, Cache, Main Memory, PCI Bus, System FPGA, Experimental FPGA, APIC, G-link port 1, and G-link port 2]