Video Over Wireless
Shilpa Pamidimukkala
• Introduction to Video over Wireless
• Definition of Issues
• Solution to the Issues
• Conclusion
• Reference
Components of a Wireless Video System
Transport + Network Layer
Tradeoff: Throughput, Reliability, Delay
• At the sender side, video packets are first generated by a video
encoder, which performs compression.
After passing through the network protocol stack (e.g., RTP/UDP/IP),
transport packets are generated and then transmitted over a wireless
channel that is lossy in nature.
• Therefore, the video sequence must be encoded in an error-resilient
way that minimizes the effects of losses on the decoded video quality.
In addition, at the physical layer, the modulation mode and transmitter
power can be adjusted according to the changing channel conditions.
• At the receiver, the demodulated bitstream is processed by the
channel decoder, which performs error detection and/or correction.
• Corrupt packets are usually discarded by the receiver, and are
therefore considered lost.
• In addition, packets that arrive at the receiver beyond their display
deadlines are also treated as lost.
• This strict delay constraint is another important difference between
video communications and many other data transmission applications.
The video decoder then decompresses video packets and displays the
resulting video frames in real-time.
Compression is absolutely necessary to fit digital video within affordable
storage capacities and network communications bandwidths.
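The receiver-side loss rule described above (corrupt packets are discarded, late packets are treated as lost) can be sketched as follows. This is a minimal illustration; the `Packet` fields and the millisecond time unit are assumptions made for the example, not part of the original system.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int               # sequence number (hypothetical field)
    arrival_ms: float      # arrival time at the receiver
    deadline_ms: float     # display deadline of the frame the packet belongs to
    corrupt: bool = False  # True if error detection failed after channel decoding


def is_lost(packet: Packet) -> bool:
    """A packet counts as lost if it is corrupt or arrives after its deadline."""
    return packet.corrupt or packet.arrival_ms > packet.deadline_ms


def effective_loss_rate(packets: list[Packet]) -> float:
    """Fraction of packets the video decoder cannot use."""
    if not packets:
        return 0.0
    return sum(is_lost(p) for p in packets) / len(packets)


if __name__ == "__main__":
    stream = [
        Packet(0, arrival_ms=10, deadline_ms=40),
        Packet(1, arrival_ms=55, deadline_ms=40),                # late
        Packet(2, arrival_ms=20, deadline_ms=40, corrupt=True),  # corrupt
        Packet(3, arrival_ms=30, deadline_ms=40),
    ]
    print(effective_loss_rate(stream))
```

Note that a late packet is counted as lost even though it arrived intact, which is what distinguishes video delivery from ordinary data transfer.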
Video Streaming
• Video streaming is a server/client technology that allows
multimedia data to be transmitted and consumed.
• Streaming applications include e-learning, video
conferencing, video on demand etc.
• The main goal of streaming is that the stream should
arrive and play out continuously without interruption.
• Real-time streaming can be delivered by either peer-to-peer
(unicast) or broadcast (multicast).
Characteristics of a Wireless Video System
• The capacity of a wireless channel is limited by the available
bandwidth of the radio spectrum and by various types of
noise and interference.
• The wireless channel is the weakest link of multimedia
networks: mobility causes fading and error bursts.
• The resulting transmission errors require error control
techniques such as forward error correction (FEC) and
automatic repeat request (ARQ).
Video Processing
• For a very long time, video processing dealt
exclusively with fixed-rate sequences of rectangular
images (first-generation video coding); interest has
recently moved toward a more flexible concept
(second-generation video coding).
• In this case, the subject of the processing and encoding
operations is a set of visual regions/objects organized in
both time and space in a flexible and arbitrarily complex
way.
• The ISO MPEG-4 [6] was the first international standard
supporting this new advanced concept of visual
information for a wide range of rates and applications.
• MPEG-4 defines a framework for the joint description,
compression, storage, and transmission of arbitrarily shaped
Video Objects (VOs).
• A frame of a VO is called a Video Object Plane (VOP). All of a
VO’s information (i.e., motion, texture, and shape) is
transmitted within one bitstream.
• The bitstreams of several VOs can be multiplexed so
that the decoder receives all the information needed to decode
the VOs and arrange them into one video scene.
Wireless networks are characterized by a large number of
packet losses because of fading communication channels.
Thus, loss recovery mechanisms must be added to
prevent video degradation.
Solution: a video-object-based unequal error protection
mechanism, which allocates an optimal FEC redundancy
ratio to each object.
• This object-based manipulation results in a remarkable
improvement in functionality, such as the
possibility for the source to choose the best coding
strategy independently for each object.
• For instance, consider a video-conference system
with “speaker” and “background” as different objects.
• The only interest of end users is to obtain the best
possible quality for the most relevant object (i.e., “the
speaker”). We can then allocate more bandwidth to encoding
the video object representing the “speaker”, so the
received perceptual video quality can be significantly enhanced.
Error-free frame
Example 1: The extra insertion bit causing the loss of
the first GOB
Example 2: Corrupted group number
causing a GOB misplacement
Example 3: Corruption of the group quantizer parameter that
resulted in employing the wrong quantizer in decoder
Issue-1 Error Recovery
• Two different approaches can be used to deal
with network transmission errors. The first is Automatic Repeat Request (ARQ), and the second is
Forward Error Correction (FEC).
• Here, the FEC allocation is made according to the relevance of each video
object and the packet loss rate of the wireless network.
• To provide this object-based unequal error protection, we
assign a specific FEC redundancy ratio to each object.
FEC Model
• Forward Error Correction (FEC) is the
most commonly adopted error-control scheme for interactive
video applications such as video conferencing systems.
• In an FEC scheme, each block of k data packets is protected by
(n−k) FEC packets.
• If at least k out of the n packets are correctly received,
the entire block of data can be recovered
at the receiver. Otherwise, none of the lost packets can be
recovered by the receiver.
• The redundancy ratio δ is the ratio of the redundancy to the data
block plus redundancy: δ = (n−k)/n.
• With this FEC scheme, the loss rate perceived at the
receiver side will be lower than the loss rate observed on
the global packet stream within the network.
• Here ε denotes the packet loss
probability induced by the
network and εr denotes the packet
loss probability perceived at
the receiver when applying a
redundancy ratio δ.
• The loss-cost performance
for various values of δ and ε
is plotted
(e.g., for δ = {0.01, ..., 0.2}
and ε = {0.01, ..., 0.1}).
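The FEC model above can be explored numerically with a short sketch. It assumes that packets are lost independently with probability ε (`eps` in the code); that assumption is a simplification for illustration, since the fading channels described earlier produce bursty losses. The function names are hypothetical.

```python
from math import comb


def redundancy_ratio(n: int, k: int) -> float:
    """delta = (n - k) / n for k data packets protected by (n - k) FEC packets."""
    return (n - k) / n


def block_recovery_probability(n: int, k: int, eps: float) -> float:
    """Probability that at least k of the n packets arrive.

    Assumes independent packet losses with probability eps.
    """
    return sum(
        comb(n, j) * (1 - eps) ** j * eps ** (n - j)
        for j in range(k, n + 1)
    )


if __name__ == "__main__":
    k, eps = 8, 0.05
    for n in (8, 9, 10):
        print(
            f"n={n} delta={redundancy_ratio(n, k):.2f} "
            f"P(recover)={block_recovery_probability(n, k, eps):.4f}"
        )
```

Under this model, adding FEC packets to a fixed block of k data packets raises the chance that the whole block is recovered, at the cost of a larger redundancy ratio δ.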
At the source
• A cost function called Weighted Distortion is used. Its purpose is
to minimize the weighted average of the distortions of the different
video objects.
• The distortion is defined as the mean squared value of the difference
between the pixels of the original and decoded pictures, usually termed
mean squared error (MSE).
• In the following, we assume that our video source can generate
multi-object video. This video is composed of a set of objects
O = {Oi}, i ∈ {1, ..., N}. In this case, the cost function can be
written as the weighted sum D = Σ αi Di over i = 1, ..., N, where αi is
the priority of object Oi and Di is its distortion.
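A minimal sketch of the two quantities used at the source: the MSE of one object and a priority-weighted sum over objects. The weighted-sum form is assumed here from the description of the cost function as a weighted average of object distortions; the function names are illustrative.

```python
def mse(original, decoded):
    """Mean squared error between two equally sized sequences of pixel values."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


def weighted_distortion(distortions, priorities):
    """Sum of per-object distortions, each scaled by its priority (assumed form)."""
    if len(distortions) != len(priorities):
        raise ValueError("one priority per object is required")
    return sum(a * d for a, d in zip(priorities, distortions))


if __name__ == "__main__":
    speaker = mse([10, 20, 30], [12, 20, 27])
    background = mse([50, 60], [40, 70])
    # Hypothetical priorities: the speaker matters more than the background.
    print(weighted_distortion([speaker, background], [0.75, 0.25]))
```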
At the receiver
The cost function above expresses the user quality
satisfaction at the source. At the receiver, however, this quality may change
because of network packet losses. So, to determine the cost function
at the receiver we must take the network loss rate into consideration.
The probability of correctly receiving a VOP is equal to the probability of
correctly receiving at least ki of the ni packets transmitted by
the source. Assuming independent packet losses with probability ε, this
probability is Pi = Σ C(ni, j) (1 − ε)^j ε^(ni − j), summed over j = ki, ..., ni.
We then denote by Di(δi) the distortion perceived for Oi at the receiver
when a redundancy ratio equal to δi is used.
In this case, the global video distortion is equal to the sum of all the
Di(δi), each multiplied by its associated priority αi, for i = 1, ..., N:
D = Σ αi Di(δi).
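The object-based unequal error protection idea can be illustrated with a small exhaustive search. Everything specific in this sketch is an assumption made for the example: independent losses with probability `eps`, a two-level distortion model (`d_ok` if the VOP is recovered, `d_lost` otherwise), and a fixed budget of FEC packets shared among the objects. It is not the allocation algorithm of the original work.

```python
from itertools import product
from math import comb


def recovery_probability(n, k, eps):
    """P(at least k of n packets arrive) under independent losses."""
    return sum(comb(n, j) * (1 - eps) ** j * eps ** (n - j) for j in range(k, n + 1))


def object_distortion(k, parity, eps, d_ok, d_lost):
    """Expected distortion of one object protected by `parity` FEC packets."""
    p = recovery_probability(k + parity, k, eps)
    return p * d_ok + (1 - p) * d_lost


def allocate_parity(objects, budget, eps):
    """Try every split of `budget` FEC packets and keep the cheapest one.

    `objects` is a list of dicts with keys k, alpha, d_ok, d_lost.
    Returns (allocation tuple, weighted expected distortion).
    """
    best_alloc, best_cost = None, float("inf")
    for alloc in product(range(budget + 1), repeat=len(objects)):
        if sum(alloc) > budget:
            continue
        cost = sum(
            o["alpha"] * object_distortion(o["k"], r, eps, o["d_ok"], o["d_lost"])
            for o, r in zip(objects, alloc)
        )
        if cost < best_cost:
            best_alloc, best_cost = alloc, cost
    return best_alloc, best_cost


if __name__ == "__main__":
    objects = [
        {"k": 4, "alpha": 0.8, "d_ok": 10.0, "d_lost": 200.0},  # "speaker"
        {"k": 4, "alpha": 0.2, "d_ok": 10.0, "d_lost": 200.0},  # "background"
    ]
    print(allocate_parity(objects, budget=3, eps=0.1))
```

With these hypothetical numbers, the higher-priority object receives more of the FEC budget, which is the behaviour the unequal error protection scheme aims for. Exhaustive search is only practical for a handful of objects.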
Energy Efficient
• Transmitting video over wireless channels from mobile devices has
gained popularity in a wide range of applications. A major
obstacle is the limited energy supply of mobile device batteries.
For this reason, using energy efficiently is a critical issue in designing
wireless video communication systems.
Solution: A general framework is presented that takes into account
multiple factors, including source coding, channel resource allocation,
and error concealment, for the design of energy-efficient wireless video
communication systems. This framework can take various forms
and be applied to achieve the optimal trade-off between energy
consumption and video delivery quality during wireless video
transmission.
Generally speaking, energy in mobile devices is mainly used for computation,
transmission, and display.
Among those, computation and transmission are the two largest energy
consumers.
During computation, energy is used to run the operating system software
and to encode and decode the audio and video signals.
During transmission, energy is used to transmit and receive the radio
frequency (RF) audio and video signals.
Advances in very large-scale integration (VLSI) design and integrated circuit
(IC) manufacturing technologies have led to ICs with ever higher
integration densities that use ever less power. According to Moore’s Law, the
number of transistors on an IC doubles every 1.5 years. As a consequence, the
energy consumed in computation is expected to become a less significant
fraction of the total energy consumption.
• The goal is to minimize the amount of distortion at the receiver given
a limited amount of transmission energy or, vice versa, to minimize
the energy consumption while achieving a targeted video delivery
quality.
• This requires a “cross-layer” perspective where the source and
network layers are jointly considered.
• Specifically, the lower layers in a protocol stack, which directly control
transmitter power, need to obtain knowledge of the importance level
of each video packet from the video encoder, which is located at the
application layer.
• On the other hand, it is also beneficial if the source encoder is
aware of the estimated channel state information (CSI) passed up from
the lower layers, and of which channel parameters at the lower layers
can be controlled, so that it can make smart decisions when selecting
the source coding parameters to achieve the best video delivery
quality.
• For this reason, joint consideration of video encoding and power
control is a natural way to achieve the highest efficiency in
transmission energy consumption.
• Factors affecting transmission energy consumption include the power
used for transmitting each bit, the modulation mode, and channel
coding rate at the link layer or physical layer.
• The controller block indicates the component of the video
transmission system responsible for adapting the source coding
parameters, S, and the channel parameters, C, based on knowledge
of the concealment strategy, the source content, and any available
CSI.
Problem Formulation:
We consider techniques that efficiently adapt the source parameters, S,
and channel parameters, C, in order to minimize the end-to-end
distortion while meeting the energy and delay constraints.
This problem can be formally stated as
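One way to write the constrained problem described here is given below. The symbols E_0 and T_0 are assumed names for the energy budget and the delay bound; they do not appear in the text above.

```latex
\min_{S,\,C} \; D_{tot}(S, C)
\quad \text{subject to} \quad
E_{tot}(S, C) \le E_0, \qquad T_{tot}(S, C) \le T_0
```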
The selection of S and C affects the end-to-end distortion Dtot, the
end-to-end delay Ttot, and the total energy Etot for delivering the video
sequence to the receiver.
• The energy consumption Etot is caused by a variety of channel
• The end-to-end delay Ttot is the time between when a video
frame is captured at the transmitter and when it is displayed at the
receiver.
• Ttot depends in part on the number of bits used to encode the
sequence, the transmission rate, and any scheduling decisions made by
the transmitter.
Source Coding Adaptation
• For video delivery over a lossy channel, the distortion at the receiver
is a random variable from the sender’s point of view.
• Thus, the expected end-to-end distortion (averaged over the
probability of loss) is usually used to characterize the received video
quality, and guide the source coding and transmission strategies at
the sender.
• The expected distortion for the kth packet can be written as
E[Dk] = (1 − rk) E[DR,k] + rk E[DL,k],
where rk is the probability of loss for the kth packet,
E[DR,k] is the expected distortion if the packet is received correctly, and
E[DL,k] is the expected distortion if the packet is lost.
E[DR,k] accounts for the distortion due to source coding as well as
error propagation caused by interframe coding.
• E[DL,k] accounts for the distortion due to concealment. The
probability of packet loss depends on the CSI, transmitter power, and
channel coding used.
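The per-packet expected distortion described above is a two-outcome average, which the following sketch makes concrete. Summing over the packets of a frame is an assumption made for illustration, and the function names are hypothetical.

```python
def expected_packet_distortion(loss_prob, d_received, d_lost):
    """Average of the received and lost cases, weighted by the loss probability."""
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must lie in [0, 1]")
    return (1.0 - loss_prob) * d_received + loss_prob * d_lost


def expected_frame_distortion(packets):
    """Sum of per-packet expected distortions.

    `packets` is an iterable of (loss_prob, d_received, d_lost) tuples.
    """
    return sum(expected_packet_distortion(*p) for p in packets)


if __name__ == "__main__":
    # Hypothetical values: concealment distortion is much larger than coding distortion.
    print(expected_packet_distortion(0.25, 4.0, 100.0))
    print(expected_frame_distortion([(0.1, 4.0, 100.0), (0.3, 6.0, 80.0)]))
```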
A novel approach called variance-aware per-pixel optimal resource
allocation (VAPOR) aims to improve the reliability of video transmission systems by
making it more likely that what the receiver sees closely resembles
the mean end-to-end distortion calculated at the transmitter.
Channel Adaptation
Transmission Energy
The energy needed to send a packet of L bits with transmission power P
is given by E = PL/R,
where R is the transmission rate in source bits
per second.
These three quantities can be adapted in a variety of ways in an actual
system. For example, power adaptation can be implemented
by power control at the physical layer. The transmission
rate R can be changed by selecting different modulation modes
or channel coding rates, or by allowing a waiting time for each packet before
transmission.
In an energy-efficient wireless video transmission system, transmission
power needs to be balanced against delay to achieve the best video
quality.
For example, for a fixed transmission power, increasing the transmission
rate will increase the bit error rate (BER) but decrease the transmission delay needed
for a given amount of data (or allow more data to be sent within a
given time period).
Furthermore, the amount of transmission energy required to achieve a
certain level of distortion typically decreases with increased delay.
Therefore, in order to efficiently utilize resources such as energy and
bandwidth, those two adaptation components should be jointly designed.
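The relation E = PL/R from this section, together with the transmission time L/R, can be checked with a few lines of code. Units (watts, bits, bits per second) and the example values are assumptions for illustration; the BER effect of raising the rate is not modelled.

```python
def transmission_energy(power_w, length_bits, rate_bps):
    """Energy in joules to send length_bits at rate_bps with power power_w."""
    return power_w * length_bits / rate_bps


def transmission_delay(length_bits, rate_bps):
    """Time in seconds the packet occupies the channel."""
    return length_bits / rate_bps


if __name__ == "__main__":
    packet_bits = 8000  # hypothetical 1000-byte packet
    for rate in (1e6, 2e6):
        print(
            rate,
            transmission_energy(0.1, packet_bits, rate),
            transmission_delay(packet_bits, rate),
        )
```

At a fixed power, doubling the rate halves both the energy and the delay per packet. In a real system the higher BER would offset part of that gain, which is why power and rate are adapted together.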
• Joint source coding and power allocation techniques deal with the
varying error sensitivity of video packets by adapting the transmission
power per packet based on the source content and the CSI.
• Here we compare a joint source coding and transmission power
allocation (JSCPA) approach with an independent source coding and
power allocation (ISCPA) approach in which S and C are
independently adapted.
• The JSCPA approach needs nearly 60 percent less energy to transmit
this frame than the ISCPA approach. Figures 4c and 4d show the
probability of loss for each packet in the frame for the JSCPA and ISCPA
approaches, respectively.
• Darker macroblocks (MBs) correspond to a smaller probability of packet loss; MBs
that are not transmitted are marked in white. As seen in Fig. 4c,
more protection is given to the region of the frame that corresponds
to the foreman’s head. Therefore, more power is used to transmit this
region than the background.
• As shown in Fig. 4d, however, the ISCPA approach has a fixed
probability of loss, which means that the power used to transmit the
region corresponding to the foreman’s head is the same as the power
used to transmit the background.
• Therefore, the ISCPA approach wastes energy by transmitting MBs in
the background with the same power as MBs in the high activity
region. As for the source coding, in the ISCPA approach the video
encoder may allocate more bits to packets in high activity regions, as
shown in Fig. 4f.
• Because the transmission power is fixed in this approach, more
energy is used to transmit packets with more bits, as shown in Fig.
4h. Therefore, in the ISCPA approach, more energy may be allocated
to high activity regions, but the likelihood of these regions being
correctly received is the same as the background.
• In the JSCPA approach, the bit and power allocations are done
jointly. Thus, the JSCPA approach is able to adapt the power per
packet, making the probability of loss dependent on the relative
importance of each packet, as shown in Figs. 4e
and 4g.
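The difference between fixed and content-aware power allocation can be illustrated with a toy model. The assumptions are loud ones: the loss probability falls exponentially with transmit power (an invented channel model), all packets have equal length and rate so that energy is proportional to power, and `importance` is a hypothetical per-packet distortion penalty. This is not the JSCPA algorithm itself, only a sketch of why adapting power per packet helps.

```python
import math


def loss_probability(power, p0=1.0):
    """Toy channel model (assumption): loss decays exponentially with power."""
    return math.exp(-power / p0)


def expected_distortion(powers, importance):
    """Sum of importance-weighted loss probabilities."""
    return sum(w * loss_probability(p) for p, w in zip(powers, importance))


def joint_power_allocation(importance, total_power, step=0.01):
    """Greedy allocation: give each power increment to the packet that gains most."""
    powers = [0.0] * len(importance)
    for _ in range(round(total_power / step)):
        gains = [
            w * (loss_probability(p) - loss_probability(p + step))
            for p, w in zip(powers, importance)
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        powers[best] += step
    return powers


if __name__ == "__main__":
    importance = [10.0, 1.0, 1.0, 1.0]  # first packet: high-activity region
    total = 4.0
    equal = [total / len(importance)] * len(importance)
    joint = joint_power_allocation(importance, total)
    print("equal power:", expected_distortion(equal, importance))
    print("joint      :", expected_distortion(joint, importance))
```

With the same total power, the greedy allocation spends more on the important packet and yields a lower expected distortion than equal power per packet in this toy setting.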
• The dynamic nature of wireless networks, with fluctuating
bandwidth and time-varying delays, makes it difficult to provide good
quality streaming.
There is a trade-off between the capacity of the wireless network and
the quality of the multimedia streaming application.
Here we investigate the effect that the background traffic load has on
unicast streaming video sessions. Above a certain load value, the
video streaming session is slowly starved of bandwidth. The
performance of the system is measured using a WLAN probe.
• There are two open-source streaming servers available: Helix from
Real and Darwin Streaming Server (DSS) from Apple.
• Here, we have chosen DSS as the streaming server for our
experiments since it is a typical streaming system that does not
employ sophisticated adaptation techniques.
• DSS is an open-source, standards-based streaming server that is
compliant with MPEG-4 standard profiles, ISMA streaming standards, and
the relevant IETF protocols.
• The DSS streaming system has a client-server architecture in which
both client and server implement the RTP/UDP/IP stack, with
TCP/UDP/IP used to relay feedback messages between the client and
server.
At the wireless side, a WLAN resource monitoring application was used to
measure and record the resource utilisation of the video streams.
This application non-intrusively monitors and records the busy and idle
intervals on the wireless medium and, by analysing the temporal
characteristics of these intervals, infers the resource usage on a per-station
basis.
• The WLAN resource utilisation is characterised in terms of MAC
bandwidth components that are related to the line rate.
• Specifically, three MAC bandwidth components are defined: a
load bandwidth (BWLOAD), which is associated with the transport of the
traffic stream and is related to the throughput; an access
bandwidth requirement (BWACCESS), which represents the “cost” of
accessing the wireless medium; and a free bandwidth (BWFREE).
• An access efficiency may be defined as the ratio of the BWLOAD to
the BWACCESS and gives an indication of how efficiently a station
accesses the medium.
• The intervals during which the medium is busy correspond to the
intervals during which frames are being transmitted on the medium
(i.e., data and management frames) and are associated with the
transport of the traffic load.
• The busy bandwidth (BWBUSY) is the portion of the transmission rate
used for the transport of the total traffic load and is the sum of the
BWLOAD over all stations.
• Similarly, when the medium is not busy, it is said to be idle. The idle
bandwidth (BWIDLE) represents the portion of the transmission rate
that is idle and may be used by any station to win access
opportunities for its load. The sum of BWBUSY and BWIDLE must
equal the line rate, i.e., 11 Mbps in IEEE 802.11b.
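The bookkeeping behind these bandwidth components can be sketched as below. The mapping from measured busy time to BWBUSY (line rate scaled by the busy fraction of the observation window) is an assumed interpretation for illustration, not the monitoring application's documented method; the function names are hypothetical.

```python
LINE_RATE_MBPS = 11.0  # IEEE 802.11b line rate quoted in the text


def busy_idle_bandwidth(busy_intervals_s, window_s, line_rate_mbps=LINE_RATE_MBPS):
    """Split the line rate into (BWBUSY, BWIDLE) from measured busy intervals.

    Assumption: each component is the line rate scaled by its time fraction.
    """
    busy_time = sum(busy_intervals_s)
    if window_s <= 0 or busy_time > window_s:
        raise ValueError("busy time must fit inside a positive window")
    bw_busy = line_rate_mbps * busy_time / window_s
    return bw_busy, line_rate_mbps - bw_busy


def access_efficiency(bw_load, bw_access):
    """Ratio of BWLOAD to BWACCESS for one station."""
    return bw_load / bw_access


if __name__ == "__main__":
    busy, idle = busy_idle_bandwidth([0.1, 0.2, 0.2], window_s=1.0)
    print(busy, idle, busy + idle)      # the two parts add up to the line rate
    print(access_efficiency(2.0, 4.0))  # hypothetical per-station values in Mbps
```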
• Table 1: The first column indicates the number of sources contributing
to the background traffic load. The second and third columns
indicate the characteristics of the background traffic and show
the packet size used to achieve the target background load, which
in turn affects the number of packets per second.
Figure 3(a) shows how the offered load per station is increased over
time whilst Figure 3(b) shows how the access requirements vary over
time to send the same background traffic load.
• Figure 4(a) shows the variations over time of the BWLOAD measured
at the AP whilst streaming the clip ‘EL’. The tests indicated by the thin
black lines, <C0 EL MTU 1024B> and <C0 EL MTU 512B>, show the
variations in the BWLOAD when there is no background traffic.
• The pattern repeating every 300 seconds represents each loop of the
video stream. In addition, it can be seen that there is a difference
between the BWLOAD measured using the different hint track
MTU settings.
• We found that using a hint track MTU setting of 512 B increases the
BWLOAD by approximately 20% due to the additional packet header
overhead and the increased number of ACKs
needed to acknowledge each packet.
• Through experimentation, we have found that the packet size
and packet rate of the traffic in the network have a large impact on
the video streaming session.
• In the experiments a video streaming session was established
between the video client and server and the traffic load was increased
steadily over time.
• The background traffic load was varied in terms of the packet size and
the number of sources contributing to the load. As the load is
increased, the throughput reaches a maximum and the AP becomes
saturated.
• At this point, the video client is slowly starved of bandwidth until the
streaming session can no longer be supported and the streaming
session is finally terminated.
• Katsaggelos, A.K.; Fan Zhai; Eisenberg, Y.; Berry, R.;Wireless
Communications, IEEE [see also IEEE Personal Communications]
Volume 12, Issue 4, Aug. 2005 Page(s):24 - 30
Digital Object Identifier 10.1109/MWC.2005.1497855
• Wang, J.; Majumdar, A.; Ramchandran, K.;
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP
'05). IEEE International Conference on
Volume 5, 18-23 March 2005 Page(s):v/1101 - v/1104 Vol. 5
Digital Object Identifier 10.1109/ICASSP.2005.1416500
• Jon Gretarsson, Feng Li, Mingzhe Li, Ashish Samant, Huahui Wu,
Mark Claypool, Robert Kinicki October 2005 Proceedings of the 1st
ACM workshop on Wireless multimedia networking and
performance modeling WMuNeP '05
• Robson Eisinger, Rudinei Goularte December 2005 Proceedings of
the 11th Brazilian Symposium on Multimedia and the web
WebMedia '05