Week 8:
Multicasting; Socket Options;
Camelia Zlatea, PhD
Email: [email protected]
 W. Richard Stevens, UNIX Network Programming, Volume 1: Networking
APIs: Sockets and XTI, 2nd edition, 1998 (ISBN 0-13-490012-X)
– Chap. 7, 11, 19, 21, 22
Addressing in the Internet
 Addressing tied to reachability
– Every host interface has its own IP address
– Router interfaces usually have their own IP addresses
 IP is version 4 (IPv4 addresses)
– 4 bytes long
– two part hierarchy
» network number and host number
– different types of boundary indicator
» class, subnet mask, prefix
– Goal of boundaries is address aggregation
Address classes
 Historical first choice
– fixed network-host partition, with 8 bits of network number
 Generalization
– Class A addresses have 8 bits of network number
– Class B addresses have 16 bits of network number
– Class C addresses have 24 bits of network number
 Distinguished by leading bits of address
leading 0 => class A (first byte < 128)
leading 10 => class B (first byte in the range 128-191)
leading 110 => class C (first byte in the range 192-223)
leading 1110 => class D (multicast)
leading 1111 => Class E (reserved)
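The leading-bit rules above can be sketched as a small C function. The name addr_class is hypothetical, and the address is assumed to be passed in host byte order.

```c
#include <stdint.h>

/* Return 'A'..'E' for an IPv4 address given in host byte order,
 * looking only at the leading bits of the first byte. */
char addr_class(uint32_t host_order_addr)
{
    uint8_t first = (uint8_t)(host_order_addr >> 24);

    if ((first & 0x80) == 0x00) return 'A';   /* 0xxxxxxx:   0-127 */
    if ((first & 0xC0) == 0x80) return 'B';   /* 10xxxxxx: 128-191 */
    if ((first & 0xE0) == 0xC0) return 'C';   /* 110xxxxx: 192-223 */
    if ((first & 0xF0) == 0xE0) return 'D';   /* 1110xxxx: 224-239 */
    return 'E';                               /* 1111xxxx: 240-255 */
}
```

For an address held in network byte order (as in struct in_addr), convert with ntohl() before calling.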
Address evolution
 Class based scheme was too inflexible
 Two problems
– Too many routes
– Too few addresses
 Four extensions
– Subnetting (flexible boundaries within network)
– CIDR (Classless Inter-Domain Routing: flexible grouping of networks)
– Dynamic host configuration (reuse of addresses)
– A bigger address (IPv6)
 One issue
– Network address translation
What is Multicast?
 Multicast is a communication paradigm
– 1 source, multiple destination
 Applications:
– bulk-data distribution to subscribers
» (e.g., newspaper, software, and video tapes distribution),
– connection-time-based charging data distribution
» (e.g., financial data, stock market information, and news tickers),
– streaming (e.g., video/audio real-time distribution),
– push applications, web-casting,
– distance learning, conferencing, collaborative work, distributed
simulation, and interactive games.
The Internet group model
– multicast/group communications means...
» 1 → n as well as n → m
– a group is identified by a class D IP address
(224.0.0.0 to 239.255.255.255)
» an abstract notion that does not identify any host
[Figure: a multicast group seen from the logical view (one multicast group) and from the physical view (sites connected by multicast routers).]
IP Multicast: Basic Idea
 Multicast groups: abstract “rendez-vous” points.
 Set up an optimal spanning tree connecting the participants of
each group.
 Make it cheap by not providing strong guarantees: send
out packets and hope for the best.
The Internet group model (cont’)
 the group model is an open model
– anybody can belong to a multicast group
» no authorization is required
– a host can belong to many different groups
» no restriction
– a source can send to a group, no matter whether it
belongs to the group or not
» membership not required
– the group is dynamic, a host can subscribe to or leave
at any time
– a host (source/receiver) does not know the
number/identity of members of the group
Mapping IP Multicast onto Ethernet Multicast
 IP Multicast (class D IP address):
Class D: 224.x.x.x-239.x.x.x (in hex: Ex.xx.xx.xx); the leading 1110 is followed by a 28-bit group identifier
No further structure (unlike Class A, B, or C)
Not addresses but identifiers of groups
Some of them are assigned by the IANA to permanent host groups
 Mapping a class D IP address into an Ethernet multicast address
– The low-order 23 bits of the Class D address are copied into the low-order 23 bits of the
Ethernet multicast address
– Many-to-one mapping: the other 5 bits of the group identifier are not used, so 32 IP groups share each Ethernet address
– More filtering has to be done at the IP level
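As a minimal sketch of this mapping, the function below builds the Ethernet address from a group address. The name mcast_ip_to_mac is hypothetical; the group is assumed to be in host byte order, and the fixed prefix 01:00:5e is the one used for IP multicast on Ethernet.

```c
#include <stdint.h>

/* Fill mac[6] with the Ethernet multicast address that corresponds to
 * an IPv4 group address given in host byte order. */
void mcast_ip_to_mac(uint32_t group, uint8_t mac[6])
{
    mac[0] = 0x01;                              /* fixed prefix 01:00:5e */
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (uint8_t)((group >> 16) & 0x7f);   /* top bit of this byte stays 0 */
    mac[4] = (uint8_t)((group >> 8) & 0xff);
    mac[5] = (uint8_t)(group & 0xff);
}
```

Because only 23 bits are copied, 224.0.0.1 and 224.128.0.1 produce the same Ethernet address, which is why the IP layer still has to filter.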
Ethernet Multicast
 Ethernet is a broadcast medium
– Every frame can potentially be seen by every host
 Ethernet cards have a unique Ethernet address
 Broadcast address:
– ff:ff:ff:ff:ff:ff
 Ethernet Multicast address range for IP:
– 01:00:5e:00:00:00 -to- 01:00:5e:7f:ff:ff
 Mapping IP multicast onto Ethernet multicast
The Internet group model (cont’)
 local-area multicast
» use the potential diffusion capabilities of the physical
layer (e.g. Ethernet)
» efficient and straightforward
 wide-area multicast
» requires going through multicast routers, using
IGMP, multicast routing, ...
» routing within a single administrative domain is simple
» inter-domain routing is complex and not fully operational
Multicast and the TCP/IP layered model
[Figure: protocol stack. Applications and other higher-level building blocks run in user space above the socket layer; IP / IP multicast and the device drivers run in kernel space.]
What is Multicast?
 Several applications need efficient means to transmit data
to multiple destinations with:
less bandwidth
higher throughput
lower delay
higher reliability
 Classification
– Data dissemination
– Transactions
– Large Scale Virtual Environments
 Build on top of the existing Internet and take into account
group communication constraints
– Manage groups
– Create and maintain multicast routes
– Efficient end-to-end delay (reliability, flow control, time constraints)
Ideal Multicast
 Senders (S) and Receivers (R) not aware of each other’s
position in the network.
 Scalable.
 Low latency (join, data propagation).
 Low bandwidth and processing overhead.
 “Reliable”, if this is cheap (“end-to-end”?)
 Easy to join/leave.
Why IP multicast?
 scalability...
– scales to an unlimited number of users
 reduced costs...
– cheaper equipment and access line
 increased speed...
– increases the delivery speed
[Figure: a server reaching its receivers through an access line and the ISP/Internet, once using unicast and once using multicast.]
Multicast Features: Multicast Scope Control
 Who gets which packets?
– Send everything to everybody ..
 TTL scope
– Keeps multicast traffic within an administrative domain by setting
TTL thresholds on the interfaces of the border router
 Administratively scoped addresses
– A multicast boundary can be set up
on the borders for addresses in the range 239.0.0.0 to 239.255.255.255
– Better than TTL scope
Multicasting: Receiving multicast message
 For a process to receive multicast messages it needs to
perform the following steps:
1. Create a UDP socket msd:
msd = socket(AF_INET, SOCK_DGRAM, 0);
2. Bind it to a UDP port, e.g., 1234.
All processes must bind to the same port in order
to receive the multicast messages.
struct sockaddr_in groupHost;
memset(&groupHost, 0, sizeof(groupHost));
groupHost.sin_family = AF_INET;
groupHost.sin_port = htons(UDPport);
groupHost.sin_addr.s_addr = htonl(INADDR_ANY);
bind(msd, (struct sockaddr *) &groupHost, sizeof(groupHost));
Multicasting: Receiving multicast message (cont'd)
3. Join the multicast group address GroupIPaddress:
joinGroup(msd, GroupIPaddress);
4. Use recv or recvfrom to read the messages, e.g.,
nbytes = recv(msd, recvBuf, BufLen, 0);
Multicast Groups and Addresses
 Every IP multicast group has a group address.
 IP multicast provides only open groups
– it is not necessary to be a member of a group in order to send
datagrams to the group.
 Multicast addresses are like the IP addresses used for single
hosts and are written in the same way: A.B.C.D.
– Multicast addresses will never clash with host addresses because
a portion of the IP address space is specifically reserved for
multicast: 224.0.0.0 to 239.255.255.255.
– Multicast addresses from 224.0.0.0 to 224.0.0.255 are reserved for
multicast routing information;
– Application programs should use multicast addresses outside this
range.
Multicasting: Receiving multicast message
/* This function sets the socket option that makes the local host join the
   multicast group.
   Needs <stdio.h>, <stdlib.h>, <arpa/inet.h>, <netinet/in.h>, <sys/socket.h>. */
void joinGroup(int s, char *group)
{
    struct sockaddr_in groupStruct;
    struct ip_mreq mreq; /* multicast group info structure */

    if ((groupStruct.sin_addr.s_addr = inet_addr(group)) == INADDR_NONE) {
        printf("error in inet_addr\n");
        exit(-1);
    }
    /* check that the group address is indeed a Class D address */
    if (!IN_MULTICAST(ntohl(groupStruct.sin_addr.s_addr))) {
        printf("%s is not a Class D address\n", group);
        exit(-1);
    }
    mreq.imr_multiaddr = groupStruct.sin_addr;
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if (setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, (char *) &mreq,
                   sizeof(mreq)) == -1) {
        printf("error in joining group\n");
        exit(-1);
    }
}
Receiving Multicast Datagrams
 Join a particular multicast group. This is done using
another call to setsockopt:
struct ip_mreq mreq;
setsockopt(msd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
 The definition of struct ip_mreq is as follows:
struct ip_mreq {
    struct in_addr imr_multiaddr; /* multicast group to join */
    struct in_addr imr_interface; /* interface to join on */
};
Multicasting: Receiving multicast message
/* This function removes the process from the group */
void leaveGroup(int recvSock, char *group)
{
    struct sockaddr_in groupStruct;
    struct ip_mreq dreq; /* multicast group info structure */

    if ((groupStruct.sin_addr.s_addr = inet_addr(group)) == INADDR_NONE) {
        printf("error in inet_addr\n");
        return;
    }
    dreq.imr_multiaddr = groupStruct.sin_addr;
    dreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if (setsockopt(recvSock, IPPROTO_IP, IP_DROP_MEMBERSHIP,
                   (char *) &dreq, sizeof(dreq)) == -1) {
        printf("error in leaving group\n");
        return;
    }
    printf("process quitting multicast group %s\n", group);
}
Multicasting: Sending multicast message
For a process to send multicast messages it needs to
perform the following:
1. Use the UDP socket msd to send to the group address and port:
struct sockaddr_in dest;
memset(&dest, 0, sizeof(dest));
dest.sin_family = AF_INET;
dest.sin_port = htons(UDPport);
dest.sin_addr.s_addr = inet_addr(GroupIPaddress);
sendto(msd, sendBuf, BufLen, 0, (struct sockaddr *) &dest,
sizeof(dest));
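The sending step can be wrapped in two small helpers. This is a sketch: make_group_dest and send_to_group are hypothetical names, and the port is assumed to be passed in host byte order and converted with htons().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Fill `dest` with the group address and port. Returns 0 on success,
 * -1 if `group` is not a valid Class D address. */
int make_group_dest(const char *group, unsigned short port,
                    struct sockaddr_in *dest)
{
    memset(dest, 0, sizeof(*dest));
    dest->sin_family = AF_INET;
    dest->sin_port = htons(port);            /* network byte order */
    if (inet_pton(AF_INET, group, &dest->sin_addr) != 1 ||
        !IN_MULTICAST(ntohl(dest->sin_addr.s_addr)))
        return -1;
    return 0;
}

/* Send one datagram to the group on UDP socket sd.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_to_group(int sd, const char *group, unsigned short port,
                      const void *buf, size_t len)
{
    struct sockaddr_in dest;

    if (make_group_dest(group, port, &dest) == -1)
        return -1;
    return sendto(sd, buf, len, 0, (struct sockaddr *) &dest, sizeof(dest));
}
```

The sender does not have to join the group first, since IP multicast groups are open.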
Multicasting: Sending multicast message (cont'd)
2. If the sender also wants to receive the group's messages, join the multicast group address GroupIPaddress:
joinGroup(msd, GroupIPaddress);
3. Use recv or recvfrom to read the messages, e.g.,
nbytes = recv(msd, recvBuf, BufLen, 0);
 Time-to-live
– controls how far the messages can go: every router that forwards the
datagram decrements the TTL, and a datagram whose TTL has run out is not
forwarded. (The default is 1, which results in multicast packets
going only to other hosts on the local network.)
u_char TimeToLive;
TimeToLive = 2;
setTTLvalue(s, &TimeToLive);
/* This function sets the Time-To-Live value */
void setTTLvalue(int s, u_char *ttl_value)
{
    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, (char *) ttl_value,
                   sizeof(u_char)) == -1)
        printf("error in setting TTL value\n");
}
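To check that the TTL took effect, the option can be read back with getsockopt. The sketch below assumes a UDP socket; the name set_and_get_mcast_ttl is hypothetical, and the option value is passed as an unsigned char, the type traditionally used for this option.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Set the multicast TTL on socket sd, then read it back.
 * Returns the TTL now in effect, or -1 on error. */
int set_and_get_mcast_ttl(int sd, unsigned char ttl)
{
    unsigned char current = 0;
    socklen_t len = sizeof(current);

    if (setsockopt(sd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) == -1)
        return -1;
    if (getsockopt(sd, IPPROTO_IP, IP_MULTICAST_TTL, &current, &len) == -1)
        return -1;
    return current;
}
```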
 Time-to-live
– To provide meaningful scope control, multicast routers enforce the
following "thresholds" on forwarding based on the TTL field:
TTL     Scope
0       restricted to the same host
1       restricted to the same subnet
< 32    restricted to the same site
< 64    restricted to the same region
< 128   restricted to the same continent
< 255   unrestricted
 Loop-back
– controls whether the process gets a copy of its own transmissions.
By default, messages sent to the multicast group are looped back to
the local host; calling this function with loop = 0 disables that.
u_char loop;
loop = 1; /* 1 means enable loopback (default), 0 means disable loopback */
setLoopback(s, loop);
/* This function sets the loopback option */
void setLoopback(int s, u_char loop)
{
    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, (char *) &loop,
                   sizeof(u_char)) == -1)
        printf("error in setting loopback\n");
}
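The default can be observed by reading IP_MULTICAST_LOOP on a fresh socket before changing it. A minimal sketch, with get_mcast_loop and set_mcast_loop as hypothetical helper names:

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Return the current loopback setting (1 or 0), or -1 on error. */
int get_mcast_loop(int sd)
{
    unsigned char loop = 0;
    socklen_t len = sizeof(loop);

    if (getsockopt(sd, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, &len) == -1)
        return -1;
    return loop;
}

/* Enable (1) or disable (0) loopback; returns 0 on success, -1 on error. */
int set_mcast_loop(int sd, unsigned char loop)
{
    return setsockopt(sd, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop));
}
```

Disabling loopback is mainly useful when only one process on the host takes part in the group, so that it does not have to discard its own messages.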
 Reuse-port
– allows multiple multicast processes to run on the same host; call it
before bind():
reusePort(s);
/* This function sets a socket option that allows multiple processes
   to bind to the same port */
void reusePort(int s)
{
    int one = 1;
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *) &one, sizeof(one)) == -1)
        printf("error in setsockopt, SO_REUSEADDR\n");
}
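The order of the calls matters: the option has to be set before bind. The sketch below shows that ordering; bind_shared_udp is a hypothetical name, and passing port 0 lets the kernel pick a free port. Some systems additionally need SO_REUSEPORT for several processes to share one port, which is not shown here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket with SO_REUSEADDR set, then bind it to `port`.
 * Returns the socket descriptor, or -1 on error. */
int bind_shared_udp(unsigned short port)
{
    struct sockaddr_in local;
    int one = 1;
    int sd = socket(AF_INET, SOCK_DGRAM, 0);

    if (sd == -1)
        return -1;
    /* Must come before bind() to have any effect on the bind. */
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) == -1) {
        close(sd);
        return -1;
    }
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sd, (struct sockaddr *) &local, sizeof(local)) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}
```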
Multicasting - Example
 multicast.h
 multicastUtilities.c
 multicastChat.c
Reliable One-One Communication
 Use reliable transport protocols (TCP) or handle at the application layer
 Client/Server semantics in the presence of failures
 Possibilities
Client unable to locate server
Lost request messages
Server crashes after receiving request
Lost reply messages
Client crashes after sending request
Reliable One-Many Communication
 Reliable multicast
– Lost messages => need to retransmit
 Possibilities
– ACK-based schemes
» Sender can become a bottleneck (ACK implosion)
– NACK-based schemes
Atomic Multicast
Reliable Group Communication
– Processes can fail
– Atomicity of Multicast is required
» Atomicity?
Group Membership
– Multicast and a corresponding group of recipients
– Failures of processes can be viewed as changes to group membership.
System Model
– Separates receiving a message from delivering it to an application
– Group View: the list of processes associated with a message
View Change
– A special multicast message (vc)
– Race between a message m and a view change vc
– Either m is delivered to all processes before any process is delivered the new vc
– Or, m is not delivered at all.
Atomic Multicast
Atomic multicast: a guarantee that all
processes receive the message or none
at all
– Replicated database example
Problem: how to handle process failures
Solution: group view
– Each message is uniquely
associated with a group of processes
» View of the process group when
message was sent
» All processes in the group should
have the same view (and agree on it)
Reliable Mcast Transport Protocol
• A smart “session manager” elects the DRs and sets the parameters. How? Just like that...
• S and R use windows
• Designated Receivers (DRs) eliminate ACK implosion
• ACKs are sent to the DRs
• DRs and S cache data and retransmit it when needed.
 After set up S starts sending data. Receivers send periodic
ACK’s after first packet received.
 If no ACK’s for a long time, connection terminates.
 DR’s or S retransmit info using unicast or multicast, depending
on number of errors.
 Immediate TX request sent to DR’s, for receivers that join the
 Sender window advance determined by slowest receiver.
 ACK’s must not be repeated too often. Measure RTT to AP.
 S adjusts (decreases) send window to 1 if many errors; then
increases linearly.
 DR’s are fixed, but each R chooses its DR. (DR sends
SND_ACK_TOME with TTL fixed to a known value).
Socket Options
 Various attributes that are used to determine the
behavior of sockets.
 Setting options tells the OS/Protocol Stack the
behavior we want.
 Support for generic options (apply to all sockets)
and protocol specific options.
Option types
 Many socket options are Boolean flags indicating
whether some feature is enabled (1) or disabled (0).
 Other options are associated with more complex
types including int, timeval, in_addr,
sockaddr, etc.
 Read-Only Socket Options
– Some options are readable only (we can’t set the value).
Setting and Getting option values
getsockopt() gets the current value of a socket option.
setsockopt() is used to set the value of a socket option.
#include <sys/socket.h>
int getsockopt(int sockfd,
               int level,
               int optname,
               void *optval,
               socklen_t *optlen);
int setsockopt(int sockfd,
               int level,
               int optname,
               const void *optval,
               socklen_t optlen);
level specifies whether the option is a general option or a
protocol specific option (what level of code should
interpret the option).
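For options whose value is an int, the two calls can be wrapped as follows. This is a sketch; get_int_option and set_int_option are hypothetical helper names, not part of the sockets API.

```c
#include <sys/socket.h>

/* Read an int-valued option into *out; returns 0 on success, -1 on error. */
int get_int_option(int sd, int level, int optname, int *out)
{
    socklen_t len = sizeof(*out);   /* in: buffer size, out: bytes written */
    return getsockopt(sd, level, optname, out, &len);
}

/* Set an int-valued option; returns 0 on success, -1 on error. */
int set_int_option(int sd, int level, int optname, int value)
{
    return setsockopt(sd, level, optname, &value, sizeof(value));
}
```

For example, get_int_option(sd, SOL_SOCKET, SO_TYPE, &t) reads the socket type, a read-only general option, and set_int_option(sd, SOL_SOCKET, SO_KEEPALIVE, 1) turns on a Boolean one.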
General Options
 Protocol independent options.
 Handled by the generic socket system code.
 Some general options are supported only by specific
types of sockets (SOCK_DGRAM, SOCK_STREAM).
Some Generic Options
SO_BROADCAST
 Boolean option: enables/disables sending of
broadcast messages.
 Underlying DL layer must support broadcasting!
 Applies only to SOCK_DGRAM sockets.
 Prevents applications from inadvertently sending
broadcasts (OS looks for this flag when
broadcast address is specified).
SO_DONTROUTE
 Boolean option: enables bypassing of normal routing.
 Used by routing daemons.
SO_ERROR
 Integer value option.
 The value is an error indicator value (similar to errno).
 Readable only
 Reading (by calling getsockopt()) clears any
pending error.
SO_KEEPALIVE
 Boolean option: enabled means that STREAM
sockets should send a probe to the peer if there is no data flow
for a “long time”.
 Used by TCP - allows a process to determine
whether peer process/host has crashed.
 Consider what would happen to an open telnet
connection without keepalive.
SO_LINGER
Value is of type:
struct linger {
    int l_onoff;  /* 0 = off */
    int l_linger; /* time in seconds */
};
 Used to control whether and how long a call to close
will wait for pending ACKS.
 connection-oriented sockets only.
 By default, calling close() on a TCP socket will
return immediately.
 The closing process has no way of knowing whether
or not the peer received all data.
 Setting SO_LINGER means the closing process can
determine that the peer machine has received the
data (but not that the data has been read() !).
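Setting the option looks like this in practice. A minimal sketch: set_linger is a hypothetical name, and the timeout is whatever the application considers acceptable.

```c
#include <sys/socket.h>

/* Make close() on sd wait up to `seconds` for queued data to be
 * acknowledged by the peer. Returns 0 on success, -1 on error. */
int set_linger(int sd, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;          /* non-zero turns lingering on */
    lg.l_linger = seconds;   /* how long close() may block */
    return setsockopt(sd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

With the option on, the return value of close() should be checked, since it can now report that the linger time expired before the data was acknowledged.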
shutdown() vs SO_LINGER
 How you can use shutdown() to find out when the peer
process has read all the sent data [R.Stevens, 7.5]
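The shutdown() technique can be sketched as below. It assumes the peer closes its end only after it has read everything, so that a read() returning 0 tells us the peer is done. The name close_when_peer_done is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Half-close sd, wait until the peer closes its side, then close sd.
 * Returns 0 on success, -1 on error. */
int close_when_peer_done(int sd)
{
    char buf[256];
    ssize_t n;

    if (shutdown(sd, SHUT_WR) == -1)     /* send FIN, keep reading */
        return -1;
    while ((n = read(sd, buf, sizeof(buf))) > 0)
        ;                                /* discard anything still arriving */
    if (n == -1)
        return -1;
    return close(sd);                    /* n == 0: peer has closed */
}
```

Unlike SO_LINGER, this confirms that the peer application has consumed the data, not just that the peer TCP has acknowledged it.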
TCP Connection Termination
[Figure: timeline of a default close(), with the events "close returns", "data queued", and "application reads queued data and FIN".]
TCP Connection Termination close w/ SO_LINGER
[Figure: timeline of close() with SO_LINGER set, with the events "data queued", "application reads queued data and FIN", and "close returns".]
TCP Connection Termination w/ shutdown
[Figure: timeline using shutdown(), with the events "shutdown WR", "read blocks", "data queued", "application reads queued data and FIN", and "read returns 0".]
SO_RCVBUF, SO_SNDBUF
 Integer value options - change the receive and send
buffer sizes.
 Can be used with STREAM and DGRAM sockets.
 With TCP, this option affects the window size used for flow
control - must be established before connection is made.
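A short sketch of requesting a receive buffer size and checking what the system actually granted. The name set_rcvbuf is hypothetical. The kernel is free to round, scale, or clamp the requested value, so the result may differ from the request.

```c
#include <sys/socket.h>

/* Request a receive buffer of `bytes` on sd and return the size the
 * system reports afterwards, or -1 on error. For TCP, call this before
 * connect() or listen(). */
int set_rcvbuf(int sd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(sd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    if (getsockopt(sd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == -1)
        return -1;
    return actual;
}
```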
SO_REUSEADDR
 Boolean option: enables binding to an address (port)
that is already in use.
 Used by servers that are transient - allows binding a
passive socket to a port currently in use (with active
sockets) by other processes.
 Can be used to establish separate servers for the
same service on different interfaces (or different IP
addresses on the same interface).
 Virtual Web Servers can work this way.
IP Options (IPv4)
 IP_HDRINCL: used on raw IP sockets when we want to
build the IP header ourselves.
 IP_TOS: allows us to set the “Type-of-service” field in an
IP header.
 IP_TTL: allows us to set the “Time-to-live” field in an IP
header.
TCP socket options
 TCP_KEEPALIVE: set the idle time used when
SO_KEEPALIVE is enabled.
 TCP_MAXSEG: set the maximum segment size sent by a
TCP socket.
 TCP_NODELAY: can disable TCP’s Nagle algorithm, which
delays sending small packets while there is unACKed data outstanding.
 TCP_NODELAY does not turn off delayed ACKs at the peer; the Nagle
algorithm and delayed ACKs (TCP ACKs are cumulative) can interact to add latency.
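Protocol-specific options use the protocol's level instead of SOL_SOCKET. As a minimal sketch, the hypothetical helper disable_nagle sets TCP_NODELAY at level IPPROTO_TCP:

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn off the Nagle algorithm on TCP socket sd.
 * Returns 0 on success, -1 on error. */
int disable_nagle(int sd)
{
    int on = 1;   /* non-zero sets TCP_NODELAY */
    return setsockopt(sd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```

This is typically done for interactive or request/response traffic that sends many small messages and cannot afford the added delay.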
