Teleconferencing Applications: A Survey
Dimitris Thanos
Constantin Arapis
Abstract
Recent technological improvements in the domain of networks, multimedia hardware
and compression algorithms have brought teleconferencing and other distributed
multimedia applications to the desktop. An example of a multimedia application
whose popularity is continuously increasing is teleconferencing. This paper is a survey of teleconferencing packages/tools, presenting an overview of their functionality
and their features.
1 Introduction
Computer hardware and networks have evolved dramatically in recent years. Multimedia-enabled hardware, such as CPUs with multimedia instructions, sound cards, video digitizing hardware and cameras, is becoming more powerful and more affordable, reaching a large portion of desktop computers. In addition, the continuous evolution of computer networks, from high-end ATM and Gigabit Ethernet to ISDN and lower-end modems, has increased bandwidth availability and reduced its cost.
These two technologies combined together gave rise to applications that can share and
exchange multimedia data such as text, animation, graphics, images, sound and video. An
example of a multimedia application whose popularity is continuously increasing is teleconferencing. Teleconferencing software packages/tools allow people who are geographically
dispersed to hold conferences by means of sending and receiving multimedia data over networks. Examples of teleconferencing applications include videoconference [1] [12][18], teleteaching [2] [15], telepresentation [3], tele-musical rehearsal [4], video phone [19] and audio
chat [16][21].
Several teleconferencing software tools/packages exist today. This paper gives an overview of these software packages, their features and their functionality. Our survey is intentionally limited to software packages that are either commercial products or freeware. Teleconferencing software packages that are described in papers but fulfill neither condition are therefore not included in this survey. Other surveys exist in this domain [5][6], but they either treat only platform-specific applications or do not include the latest developments.
This paper is structured in the following way. In the second section we describe a number of features and services relevant to teleconferencing software packages. The existence or
absence of certain features and the implementation of services if any, will help us build a
comparative table of the tested applications. In the third section we evaluate in detail a number of teleconferencing environments. Finally we state our concluding remarks.
2 Teleconferencing features/services
2.1 Broadcast / interactive teleconferencing systems
We classify each teleconferencing environment into one of the two categories: broadcast
teleconferencing system or interactive teleconferencing system. In the broadcast systems
category we classify systems which allow a user to broadcast multimedia data (audio, video,
images, transparencies) to a set of users. Interactive systems allow each user to both send and
receive multimedia data to and from other users. Systems belonging to the interactive category are more suitable for real-time interactive teleconferences. We chose to include broadcast teleconferencing systems in the survey because they can be used in specific situations,
such as event broadcast and one-way tele-presentations [7]. Furthermore, such tools incorporate technology and features that could be interesting for a video-conferencing scenario, such
as off-line redistribution of the logged conference.
2.2 H.323 compatibility
The H.323 protocol [8] was developed to standardize multimedia teleconferencing. It applies
to multi-point and point-to-point teleconference sessions over packet switched networks. It
specifies a number of audio and video codecs as well as several intercommunication protocols. H.323 compliant applications [12][18] developed by different vendors and for different
platforms, can intercommunicate and exchange audio and video data with each other. In addition H.323 includes protocols defining the connection of compliant applications to POTS
(Plain Old Telephone System) through specialized gateways. Table 1 shows a list of the
families of protocols contained in H.323.
Protocol   Description
H.225      Signaling, registration, packetization and synchronization control.
H.245      Control for opening/closing channels, other.
H.261      Video codec for audiovisual services at P x 64 Kbps.
H.263      Video codec for video over POTS.
G.711      Audio codec, 3.1 KHz at 48, 56, and 64 Kbps.
G.722      Audio codec, 7 KHz at 48, 56, and 64 Kbps.
G.728      Audio codec, 3.1 KHz at 16 Kbps.
G.723      Audio codec, 5.3 and 6.3 Kbps.
G.729      Audio codec, 8 Kbps.
Table 1 H.323 families of protocols
2.3 Application sharing
Application sharing is the ability to share a program, running on one of the computers, with other participants in the conference. This allows participants to view the same data or information, as well as to control the application as if it were running on their own computer.
Typical shared applications in teleconferencing environments include shared whiteboard¹ and collaborative browsing². Application sharing is useful for scenarios such as videoconferencing for technical support and tele-learning [9]. For example, participants in a teleconference could visit certain Internet pages using a shared WWW browser while communicating with each other for comments or explanations on the visited pages.
2.4 Text chat
Text chat is an application where users can communicate with each other by writing short lines of text. The text is transmitted either to all the conference participants or to a specific subset. This is particularly useful in teleconferencing applications for configuration purposes, when audio and video are unavailable.
2.5 Caller ID
In teleconferencing environments where the end-user application waits for connections from other users, caller ID is the ability to identify the calling party before answering the call. The user can therefore choose whether or not to answer the call. Some applications allow for automated actions that depend on the caller ID.
2.6 Parental controls
Parental controls allow one to restrict certain incoming and outgoing calls, and they prevent children from viewing unsuitable material. Parental controls are usually configurable with a password.
2.7 Multicast support
Multicast [10] is an IP protocol specifying a lightweight routing method for delivering time-critical application data (such as audio and video) to a subset of destinations on the network, without congesting the network nodes. It is based on a hierarchical distribution of data packets with no unnecessary replication. To set up a multicast-based system, not only the applications but also the network routers themselves must support it. If the routers of a network do not support multicast, a multicast network can be simulated: some computers in the network act as multicast routers, forwarding the IP packets to other stations. The current IP protocol (IPv4) supports multicast as an option which not all routers implement. The next generation IP protocol, IPv6 [11], will support multicast as a basic feature (all IPv6 routers will implement it).
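As an illustration of what application-level multicast support involves, the following minimal Python sketch shows a receiver asking its host to join an IPv4 multicast group through the standard socket interface. The group address, port and function names are hypothetical and chosen only for the example; none of the surveyed tools is claimed to be implemented this way.

```python
import ipaddress
import socket

GROUP = "239.1.2.3"   # hypothetical group address, for illustration only
PORT = 5004           # hypothetical UDP port, for illustration only


def is_multicast(address: str) -> bool:
    """Return True if the address lies in the IP multicast range."""
    return ipaddress.ip_address(address).is_multicast


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Build the membership request: group address followed by local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(group: str = GROUP, port: int = PORT) -> socket.socket:
    """Create a UDP socket and ask the host to join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

Calling `open_receiver()` only succeeds in delivering data if the routers between sender and receiver forward multicast traffic, which is exactly the dependency described above.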
¹ Shared whiteboard is a tool allowing the conference participants to draw on the same area, for demonstrating and developing ideas.
² Collaborative browsing is a tool allowing the conference participants to browse WWW pages concurrently.
2.8 Security
By security we refer to the protection of the data transmitted during a teleconference session
from everyone other than the intended receiver. This data could be audio (voice), video, text
chat, application sharing or any other data transmitted through the network during the teleconference session. Examples of teleconference applications requiring privacy include medical teleconferences and industrial tele-meetings.
2.9 Payment
In certain scenarios, namely teaching and technical support, the information shared during the
session can be of some value that can only be revealed to those people (customers) who are
willing to pay for it. An example is a foreign language course organized by a private educational organization. People who want to follow the course subscribe and pay for it. In a similar teleconference scenario, the system could be conceived in such a way that users who connect to view the content can use an electronic payment method to pay for the content. Payment is tightly bound to security as the content has to be protected in order to preserve its
value. Preserving the value of the content is necessary in a commerce scenario in order to
guarantee to the content provider that the service is only accessible to people who pay.
2.10 Logging and off-line information retrieval
In a videoconference session, there are various events that can be logged in order to consult
and/or summarize the session. These include the following types of information:
• Control events, the information that shows the way the videoconference evolved, such as the time the conference started and ended, when each participant joined and left the conference and other similar events.
• Organizational data, such as what subjects were discussed in the conference including which participant spoke and for which subject.
• Various data of the conference, such as the complete or partial video data of the conference, audio of each speaker and data presented during the session.
By logging such events, a videoconferencing session can either be replayed in its entirety, or specific parts of the session can be consulted off-line when the session is over.
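A log of control events of the kind listed above could be represented as in the following minimal Python sketch. The class names, field names and the "join"/"leave" event kinds are hypothetical and serve only to show how such a log can be consulted off-line; they do not describe the format of any surveyed tool.

```python
from dataclasses import dataclass, field


@dataclass
class ControlEvent:
    time: float        # seconds since the start of the conference
    participant: str
    kind: str          # "join" or "leave" (hypothetical event names)


@dataclass
class SessionLog:
    events: list = field(default_factory=list)

    def record(self, time, participant, kind):
        """Append one control event to the log."""
        self.events.append(ControlEvent(time, participant, kind))

    def present_at(self, time):
        """Return the set of participants in the conference at a given time."""
        present = set()
        for event in sorted(self.events, key=lambda e: e.time):
            if event.time > time:
                break
            if event.kind == "join":
                present.add(event.participant)
            elif event.kind == "leave":
                present.discard(event.participant)
        return present
```

For example, after recording that one participant joined at 0 s and left at 20 s while another joined at 5 s, `present_at(10)` returns both names and `present_at(25)` returns only the second.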
2.11 Other features
Other features characterizing each teleconference application are the following:
• Number of teleconference participants. Some applications are limited to one-to-one communication whereas others allow multiple users to take active part in the teleconference. The number of participants supported by each tool is an important factor for characterization.
• Audio mixing. In environments allowing more than two teleconference participants, situations can occur where more than one participant speaks at the same time. Different tools implement different policies for handling such situations. Some tools implement audio mixing [16] whereas others implement a ‘speak on demand’ mechanism [18]. The policy used to handle audio of multiple users can be considered as a characterization factor.
• Number of simultaneous video streams. For multi-party teleconferences, each tool has a maximum number of simultaneous video streams that it can send and receive. This number can be used to characterize each tool.
3 Teleconferencing environments
In this section we examine existing teleconferencing applications and classify them based on
their features and their target domain. Due to the large number of existing videoconferencing
applications at the moment not all of them can be described here. The choice of applications
to evaluate is made based on their features, availability and their targeting domain.
At the end of this section the reader may find a comparative table (Table 7) of all the
teleconferencing applications/tools tested, based on their features.
3.1 CU-SeeMe and CU-SeeMe VR
CU-SeeMe [12] is an interactive videoconferencing application providing audio and video
interconnection of two or more participants of a teleconference. It was originally developed at
Cornell University for educational purposes. Currently, an enhanced commercial version is
provided by White Pine Software [13].
CU-SeeMe provides two conference modes. The first mode is intended for a one-to-one
conference, allowing two parties to directly communicate with each other by means of audio
and video data exchange over the network. The second mode is intended for a centralized
multi-party conference in which more than two parties can join. For the second mode, a centralized server (‘reflector’ in the CU-SeeMe terminology) acts as a meeting point for all participants and distributes audio and video between the participants.
In the one-to-one conference mode, the CU-SeeMe client waits on a known port for a
connection from another user. Users and their addresses can be obtained by connecting to and
querying a specialized directory server running on a known machine. Once the connection
between the two parties is established, each client captures audio and video using the multimedia hardware, compresses it, with one of the algorithms shown in Table 2, and sends it to
the other party through the network. The inverse operation is performed on the other user’s
station, to display the received video in a window of the screen and send the received audio to
the speakers or headphones. Figure 1 shows the connections and data flow schema of the one-to-one conference mode.
[Figure 1 diagram: Client 1 and Client 2 each exchange address data with the Directory Server and exchange audio/video data directly with each other.]
Figure 1 CU-SeeMe one-to-one mode
In the multi-party conference mode, users connect to a mirror server where the reflector
software is running. Each instance of the reflector program represents an open conference
where users can join-in (connect) at any time in order to participate. Once connected to the
server, each client sends the compressed audio and video data directly to the reflector server.
The reflector gathers the video data received from each party, multiplexes it in one stream
and broadcasts it to all the participants of the conference. Each participant de-multiplexes the
received data and displays the video of each participant in a separate window on the screen.
Up to 12 video streams can be viewed simultaneously. For the audio a ‘first-speak’ policy is
implemented, allowing broadcast of only one audio stream at a time. Figure 2 shows a
schema of the CU-SeeMe video multiplexing mechanism.
[Figure 2 diagram: Client 1, Client 2, ..., Client n each send client data to the mirror server, which returns multiplexed data to every client.]
Figure 2 CU-SeeMe video multiplexing schema
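The multiplexing performed by the reflector and the de-multiplexing performed by each client can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, frames are opaque byte strings, and the rule for choosing which 12 streams are shown when more are present is an arbitrary choice made for the example, not CU-SeeMe's actual behaviour.

```python
MAX_VIDEO_STREAMS = 12  # viewing limit stated in the text


def multiplex(frames):
    """Reflector side: combine one frame per client into a single tagged stream.

    `frames` maps a client name to that client's latest compressed frame.
    """
    return [(client, frames[client]) for client in sorted(frames)]


def demultiplex(stream, limit=MAX_VIDEO_STREAMS):
    """Client side: split the stream into one frame per sender.

    Only the first `limit` senders are kept (an assumption made for this
    sketch; the real selection rule is not described in the text).
    """
    return {client: frame for client, frame in stream[:limit]}
```

Each entry of the dictionary returned by `demultiplex` corresponds to one video window on the receiving participant's screen.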
Figure 3 shows the main CU-SeeMe user interface for the multi-party conference mode.
On the left hand side the participant list of the current conference is displayed along with
each participant’s attributes (camera availability, microphone availability, whether the participant is currently viewing your video and whether he or she is listening). On the right hand side the
videos of some (up to 12) participants are displayed. One or more videos can be moved into
separate windows of the screen, where they can be enlarged or reduced. Below the videos
pane, the text chat pane is displayed. Users can write small lines of text to all or some of the
other participants, who appear in this pane. On the lower left side of the window CU-SeeMe
displays the user’s speaker and microphone controls as well as controls to visualize the recording and playback gain.
[Figure 3 screenshot: CU-SeeMe window showing the participants list, the participants' videos, the text chat pane and the speaker/mic controls.]
Figure 3 CU-SeeMe conference
Cornell University has developed an extension to the CU-SeeMe application called CU-SeeMe VR [14]. CU-SeeMe VR adds spatial information to the CU-SeeMe environment to provide a virtual 3D chat/conference environment. Each conference is considered to be held in a virtual room where participants can move around. To achieve this, VR displays the video of each participant projected on a virtual wall which can move in a 3D plane of the room. Spatial information is also used for multiplexing the audio of the participants. For example, when a user ‘moves’ closer to another user, he hears the voice of the other user louder than when he moves away.
The commercial version of CU-SeeMe client is available for Windows® 95 & NT 4.0
and Mac® OS platforms and the reflector program is available for Windows NT® 4.0 and
UNIX platforms. It conforms to the H.323 videoconferencing protocol and can therefore
communicate with other conforming applications including Microsoft® NetMeeting™ and
Intel® ProShare®. CU-SeeMe implements whiteboard and text-chat services for multi-user
collaboration during conferences, parental controls to restrict certain incoming and outgoing
calls as well as caller ID for screening incoming calls.
Audio Compression    Bandwidth (Kbps)
Intel DVI            32
Delt-Mod             16
DigiTalk             8.5
G.723.1              6.4
G.723.1              5.3
Voxware              2.4

Video Compression    Bandwidth (Kbps)
White Pine M-JPEG
White Pine H.263

Table 2 CU-SeeMe compression algorithms
3.2 Remote Language Teaching (ReLaTe)
Remote Language Teaching (ReLaTe) [15] is an interactive videoconference environment for
small multi-party conferences using the Internet multicast technology. It was developed at the
Department of Computer Science of the University College London as a part of the umbrella
Esprit project MICE (Multimedia Integrated Conferencing for Europe).
ReLaTe comprises two basic sub-components, the Robust-Audio Tool (RAT) [16] and
the Video Conferencing Tool (VIC) [17]. Figure 4 shows ReLaTe’s integrated user interface.
Figure 4 ReLaTe integrated user interface
ReLaTe’s multi-party and two-party conference modes are very similar in implementation due to the use of multicast technology. In the multi-party conference, each party initiates
the connection to a multicast address. All the data is sent to this address, and is distributed to
all the parties connected to the same multicast address. Audio, video and other data of each
party reaches therefore all the other connected parties. Similarly for the two-way conference,
each party sends its data to a specified address and port (unicast) of the other party’s system.
The two sub-components of ReLaTe, the VIC and the RAT, work in the following way.
VIC captures the video data of each person’s camera equipment, compresses it and sends it to
the other party(ies). On the receiver’s side, the data are decompressed and are displayed into
a window of the screen. RAT captures the audio data from the sender’s microphone, compresses it, with one of the algorithms shown in Table 3, and sends it to the other party(ies). At
the receiver’s site the data is decompressed and fed to the audio device.
When the network is characterized by a random loss of packets³, audio quality suffers from the gaps in the audio stream. RAT uses the following piggy-back technique for recovering the lost packets: each packet, say Pn, carries the audio data of two consecutive time periods Tn-1 and Tn. The audio data of Tn is compressed with a greater compression ratio than that of Tn-1 and therefore is smaller and carries sound of inferior quality. The receiver buffers in a FIFO queue the audio buffers Tn-1 and Tn. In case the following packet Pn+1, which carries the normal-quality copy of Tn, is lost, lower quality audio is still played using the Tn audio data of the previous packet Pn. Of course, even with this method, if two consecutive data packets are lost, there is still a gap in the audio stream. A disadvantage of this method is that more audio data is transmitted and therefore more bandwidth is required.
As an alternative to this method, recent versions of RAT provide the option to send interleaved audio. The idea of interleaved audio is based on the following observation: audio intelligibility suffers more when the audio stream is interrupted once for a long period than when it is interrupted many times for shorter periods. RAT uses the schema in Figure 5 to multiplex the audio data into the packets. In case of a packet loss, the reconstructed stream has smaller gaps than the audio included in the lost packet. Although interleaving does not reduce the amount of loss observed, it does significantly improve the perceived quality of an audio stream. The obvious disadvantage of interleaving is that it increases latency. This limits the use of this technique for interactive applications, although it could be used for non-interactive scenarios. The major advantage of interleaving is that it does not increase the bandwidth requirements of a stream.
³ The Internet is an example of such a network, with packet loss varying from 0-5% inside local networks to an average of 20% or more for international connections.
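The piggy-back recovery described above can be illustrated with a small Python simulation. It is a sketch under stated assumptions rather than RAT's implementation: the function names are hypothetical, audio is represented only by a quality label per time period, and packet Pn is modelled as carrying the normal-quality copy of Tn-1 together with the more heavily compressed copy of Tn, following the description in the text.

```python
def build_packets(num_periods):
    """Return packets P0..PN; Pn holds primary T(n-1) and redundant Tn."""
    packets = []
    for n in range(num_periods + 1):
        packets.append({
            "primary": n - 1 if n >= 1 else None,        # normal quality
            "redundant": n if n < num_periods else None,  # lower quality
        })
    return packets


def play_out(packets, lost, num_periods):
    """Return the quality played for each period given the lost packet indices."""
    quality = ["gap"] * num_periods
    for n, packet in enumerate(packets):
        if n in lost:
            continue
        redundant = packet["redundant"]
        if redundant is not None and quality[redundant] == "gap":
            quality[redundant] = "low"
        if packet["primary"] is not None:
            quality[packet["primary"]] = "high"
    return quality
```

With five periods, losing a single packet degrades one period to low quality, whereas losing two consecutive packets leaves one period with no audio at all, which matches the limitation noted in the text.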
Original audio stream:          1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Interleaved stream in packets:  [1 5 9 13] [2 6 10 14] [3 7 11 15] [4 8 12 16]   (one packet lost)
Reconstructed audio stream:     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16           (units of the lost packet missing)
Figure 5 RAT’s interleaved audio schema
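The interleaving layout of Figure 5 can be reproduced with the following minimal Python sketch. The function names are hypothetical and audio units are represented by their sequence numbers; the sketch only illustrates the packet layout and the effect of one lost packet, not RAT's actual code.

```python
def interleave(units, num_packets):
    """Distribute consecutive audio units round-robin over the packets."""
    return [units[i::num_packets] for i in range(num_packets)]


def reconstruct(packets, lost, total):
    """Rebuild the stream; units carried by lost packets become None."""
    num_packets = len(packets)
    stream = [None] * total
    for i, packet in enumerate(packets):
        if i in lost:
            continue
        for j, unit in enumerate(packet):
            stream[i + j * num_packets] = unit
    return stream
```

With 16 units and 4 packets, `interleave` yields the packets [1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15] and [4, 8, 12, 16]. Losing the second packet removes units 2, 6, 10 and 14, so the reconstructed stream has four gaps of one unit each instead of one gap of four units.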
If many users speak at the same time, RAT mixes the audio signals into one, and feeds
the mixed stream to the audio card. The mixing is done on the receiver’s side and therefore
the audio streams of all the concurrent speakers arrive at each conference node. The disadvantage of this method is that the bandwidth requirements for each connection increase linearly with respect to the number of speakers.
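Receiver-side mixing and its bandwidth cost can be sketched as follows. The text does not say how RAT combines the signals, so summing sample-aligned 16-bit samples and clipping the result is an assumption made for this example, and both function names are hypothetical.

```python
def mix(streams, sample_min=-32768, sample_max=32767):
    """Sum sample-aligned 16-bit streams and clip to the valid sample range."""
    if not streams:
        return []
    length = max(len(stream) for stream in streams)
    mixed = []
    for i in range(length):
        total = sum(stream[i] for stream in streams if i < len(stream))
        mixed.append(max(sample_min, min(sample_max, total)))
    return mixed


def receive_bandwidth_kbps(num_speakers, codec_kbps):
    """Incoming audio bandwidth grows linearly with the number of speakers."""
    return num_speakers * codec_kbps
```

For instance, three concurrent speakers using a 32 Kbps codec require `receive_bandwidth_kbps(3, 32)`, i.e. 96 Kbps, of incoming audio at every conference node.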
Audio Compression    Bandwidth (Kbps)
16 bit linear        128
PCM µ-law            64
DVI                  32
GSM                  13.2
LPC                  5.8
Table 3 ReLaTe’s compression algorithms.
ReLaTe incorporates the following additional features: it secures the data sent over the network using DES encryption, and it offers a shared whiteboard application integrated in its interface.
The audio component, RAT, uses a silence-detection method to reduce the traffic on the
network: When the energy of the audio signal is below a certain threshold value, indicating
that the participant is not speaking, RAT does not send any audio data, reducing therefore the
network traffic.
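An energy-threshold silence detector of the kind described above can be sketched as follows. The text does not give RAT's energy measure or threshold, so the mean squared amplitude and the function names used here are assumptions made for the example.

```python
def frame_energy(samples):
    """Mean squared amplitude of one audio frame (assumed energy measure)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


def frames_to_send(frames, threshold):
    """Keep only the frames whose energy reaches the threshold."""
    return [frame for frame in frames if frame_energy(frame) >= threshold]
```

Frames whose energy falls below the threshold are simply never handed to the network layer, which is how the traffic reduction arises.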
ReLaTe’s sub-components RAT and VIC exist as individual freeware tools. Their source code is also available under copyright restrictions from University College London (UCL). ReLaTe as well as its sub-components are available for Sun UNIX and SGI workstations and are being ported to Windows 95 and NT 4.0 platforms.
3.3 MS NetMeeting
NetMeeting [18] is a freeware interactive videoconferencing application by Microsoft Corporation, that also targets multi-party conferences using audio and video. It is very widely used
due to its integration in Microsoft Internet Explorer and its freeware status.
NetMeeting is very similar in operation to the CU-SeeMe application for one-to-one
conferences. It uses the multimedia hardware of each party to capture audio and video data,
compresses it and sends it over the network to the other side. NetMeeting users can log on to one
of many directory servers owned by Microsoft (‘Internet Locator Servers’ (ILS) in the NetMeeting terminology) to find other connected NetMeeting users (Figure 6).
Figure 6 NetMeeting directory server
Multi-party conferences in NetMeeting differ from those of the CU-SeeMe application. Once a conference is initiated by two users, other users can ask permission to join the meeting. If accepted, they can send audio and video data to the other parties, but each user can listen to and view the video of only one party at a time. The audio and video data is sent directly to the other party or parties and does not pass through a server (unlike the multi-user conference mode of CU-SeeMe).
NetMeeting offers additional features such as application sharing, shared whiteboard,
shared clipboard, file sharing capabilities and text based chat. NetMeeting runs only on the
Microsoft Windows platforms, but offers interoperability with other H.323 compatible products.
3.4 Internet Phone
Internet Phone [19] from VocalTec [20] is commercial, interactive videoconferencing software providing audio and video interconnection of two or more parties. Figure 7 illustrates the user interface of Internet Phone. As in NetMeeting, directory servers owned by VocalTec and specific to Internet Phone allow users to browse for existing meetings and join one of them if they have permission. Internet Phone has two operation modes: one-to-one and multi-party mode.
Figure 7 Internet Phone and its statistics dialog
In the one-to-one mode the client waits for a connection from another party. Once the connection is established, it behaves similarly to the NetMeeting client, showing the video of the other party in a window of the screen and playing the received audio through the speakers.
In the multi-party mode, the parties join a specific conference at a specified server. Users can only speak to other conference attendants, but cannot use video. Internet Phone uses a
simple token mechanism for users to speak: only one user can speak at a time, by pressing a
button on the interface. The button is available only when no other person is speaking. When
a person is speaking, the voice packets are compressed and sent to the server, which broadcasts
them to all the other clients.
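The token mechanism described above can be sketched in Python as follows. The class, method and function names are hypothetical, and the sketch models only the rule that one user holds the floor at a time and that the server forwards the holder's voice packets to the other clients; it is not VocalTec's implementation.

```python
class SpeakToken:
    """Single token: only the user holding it may speak."""

    def __init__(self):
        self.holder = None

    def button_enabled(self, user):
        """The speak button is available only when nobody else is speaking."""
        return self.holder is None or self.holder == user

    def press(self, user):
        """Try to take the token; return True if the user now holds it."""
        if self.holder is None:
            self.holder = user
        return self.holder == user

    def release(self, user):
        """Give the token back when the holder stops speaking."""
        if self.holder == user:
            self.holder = None


def broadcast(token, sender, packet, clients):
    """Server side: forward the holder's voice packet to all other clients."""
    if token.holder != sender:
        return {}
    return {client: packet for client in clients if client != sender}
```

A packet from a user who does not hold the token is dropped by `broadcast`, so at most one audio stream reaches the participants at any time.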
Internet Phone offers the possibility to call and speak to users on the regular phone network. Internet telephone service providers act as gateways from the Internet to the telephone
network. When a user dials a regular phone number, a connection to such a gateway is established. The gateway calls the specified number and passes the sound to-and-from the telephone network. Internet telephone is a commercial service and therefore users who want to
use it have to pay for it.
Internet Phone can work with full or half duplex audio hardware and supports caller ID.
It can be configured to comply with the H.323 protocol and therefore communicate with
other H.323 compliant applications. It offers a shared whiteboard application, text chat and
voice mail capabilities.
3.5 Speak Freely
SpeakFreely [21] is an interactive conferencing tool limited to the exchange of audio data
(voice) between users. It is widely used due to its availability on various platforms, namely
Windows 95, NT, and UNIX for SUN and Silicon Graphics computers.
In its basic mode of operation, one user initiates a network connection to a known port
on the other party’s computer. Other parties ready to communicate can be retrieved from one
of the directory servers. Once the connection is established, SpeakFreely captures audio from
one computer’s audio card, compresses it and sends it over the network to the other party. On
the remote side, SpeakFreely decompresses the received data and plays it to the sound card.
Both half and full duplex transmission is supported depending on the audio hardware and
drivers.
SpeakFreely supports several audio compression algorithms that can be manually selected depending on the available bandwidth. Table 4 shows the various compression algorithms used by SpeakFreely along with their bandwidth requirement.
Audio Compression    Bandwidth (Kbps)
Simple               40
ADPCM                40
Simple + ADPCM       20
GSM                  16.5
Simple + GSM         8.25
LPC                  6.5
LPC-10               3.46
Table 4 Speak Freely compression algorithms
SpeakFreely implements multicast allowing multi-party discussion groups on networks
supporting it. For networks not supporting multicast, audio packets are sent to each user to
allow for transmission of an audio feed to multiple hosts.
A feature distinguishing SpeakFreely from other applications is its option to encrypt the audio data before sending it to the other side. This allows for secure/private conversations in two-party or multi-party conferences. In its secure mode of operation, SpeakFreely encrypts the compressed data in a way that only the other party can decrypt (e.g. using the PGP encryption algorithm [22]). This prevents others from intercepting the network traffic and listening to the audio data transmitted. The reverse operation is performed on the other side in order to listen to the data: the received stream is decrypted and then decompressed.
Table 5 shows the encryption algorithms used by SpeakFreely for secure communications.
Encryption algorithm
PGP - Pretty Good Privacy
IDEA - International Data Encryption Algorithm
DES - Data Encryption Standard
Table 5 SpeakFreely encryption algorithms
3.6 Real Video and Real Audio
Real Video is a commercial package by Real Networks [23] targeting the broadcast of audio
and video over the Internet and corporate Intranets. Real Audio is a similar application, but
limited to the distribution of audio only. Since Real Audio is a subset of Real Video we will
restrict our presentation to Real Video only.
Real Video is intended to provide streaming video and/or audio for both on-demand
content and real-time live broadcasts. It is a one-way tool i.e. multimedia data are broadcast
in one direction only, from the content provider to the clients (receivers). Due to its intended
use, i.e. streaming audio/video and one-way tele-presentations, Real Video inserts a significant delay (~8 sec.) to compensate for data losses during the transmission. This delay makes
Real Video inappropriate for real-time interactive conferences. However it can be used for
real-time non-interactive teleconferences where the delay does not constitute a drawback. An
example is a large conference over the Internet where participants can only listen or view the
talks. Introducing an eight second delay in such a scenario does not affect the service.
Real Video consists of three main sub-components. The first is the RealPlayer (or RealPlayer Plus), the end-user application, used as a client for viewing streaming video and audio. The other two components, the Real Video Encoder and the Real Video Server, are used on the provider side to produce and distribute Real Video content respectively. Real Video serves two scenarios. The first is streaming prerecorded audio and video data. The second is broadcasting real-time live events.
In the first scenario the provider prepares a Real Video media file. This file can be created either by using the Real Video Encoder, to capture and store the media file, or by converting a prerecorded video file (QuickTime, AVI and MPEG) to the Real Video format.
Files in Real Video format can be streamed to the clients from the Real Video Server. The
provider creates WWW pages containing hyperlinks to the required media file⁴. When a user
clicks on such a hyperlink the MIME protocol is used to start the client application: Real
Player or Real Player Plus. Real Player connects to the machine running the Real Video
Server and requests the specified file. The Real Video Server streams the media file to the
Real Player, which plays it back at the client’s computer.
In the second scenario the Real Video media stream is not prerecorded, but is created in real time at the provider’s side. The Real Video Encoder is used to capture video (and audio) from the provider’s camera and microphone and convert it on the fly to the Real Video media format. The encoder sends the stream to the Real Video Server⁵, which can serve it to the clients requesting it, in the same way it serves the prerecorded media file. All the clients requesting this live media receive the same live stream from the time they connect to the server. Figure 8 shows Real Video’s schema for the broadcast of live media scenario.
[Figure 8 diagram: the Real Video Encoder sends the Real Video stream to the Real Video Server, which sends streaming data to several Real Player clients.]
Figure 8 Real Video live broadcast schema
For both the live broadcast and the on-demand service, the provider can serve the same
stream in different formats, (each format requires different bandwidth) and users can choose
the one which is best suited to the content and their bandwidth availability. Real Video also
supports a bandwidth negotiation mechanism to allow automatic selection of the format to
use. The available Real Video media stream formats as well as the intended content and
bandwidth requirements are shown in Table 6.
4 More precisely, the media file contains information on the streaming protocol and the IP address of the server.
5 Notice that the Real Video Encoder and the Real Video Server do not necessarily run on the same machine.
Audio compression                              Bandwidth (Kbps)
Real Video                                     5 - 80 (*)

Video compression                              Bandwidth (Kbps)
Real Video (standard), Real Video (fractal)    1 - 420 (*)

(*) user specified, depending on the quality
Table 6 Real Video compression algorithms
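The automatic format selection mentioned above can be sketched as follows. The source does not describe the negotiation algorithm, so the rule used here (pick the highest-bandwidth format that still fits the client's available bandwidth) is an assumption, and the format names and bit rates are hypothetical values chosen within the ranges of Table 6.

```python
def choose_format(formats, available_kbps):
    """Return the name of the highest-bitrate format that fits, or None.

    `formats` maps a format name to its required bandwidth in Kbps.
    The selection rule is an assumption, not Real Video's documented behaviour.
    """
    fitting = {name: kbps for name, kbps in formats.items() if kbps <= available_kbps}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


# Hypothetical formats offered by a provider for one stream.
offered = {"modem": 20, "isdn": 56, "lan": 300}

print(choose_format(offered, 28.8))  # modem
print(choose_format(offered, 128))   # isdn
print(choose_format(offered, 10))    # None
```

Returning `None` when nothing fits leaves the decision (refuse the connection, or fall back to audio only) to the caller.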
Real Video also offers a feature called ‘synchronized multimedia’ where the Real Player
application is connected to the WWW browser and directs it to display specific URLs while
viewing the media stream. The provider attaches the URLs to specific intervals of the media
stream and the Real Video Server sends both to the clients. When the Real Player receives a
URL, it instructs the connected WWW browser to display the page at that URL.
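A minimal sketch of the ‘synchronized multimedia’ idea follows: URLs are attached to intervals of the stream, and the player looks up which page to show for the current playback time. The event list, the example URLs and the `url_at` function are hypothetical; the actual format in which Real Video stores these associations is not described in the source.

```python
from bisect import bisect_right

# (start time in seconds, URL) pairs, sorted by start time. Hypothetical values.
url_events = [
    (0, "http://example.com/intro.html"),
    (30, "http://example.com/slide1.html"),
    (90, "http://example.com/slide2.html"),
]
start_times = [start for start, _ in url_events]


def url_at(seconds):
    """Return the URL the browser should display at the given playback time."""
    # Index of the last event whose start time is <= seconds.
    index = bisect_right(start_times, seconds) - 1
    return url_events[index][1] if index >= 0 else None


print(url_at(45))  # http://example.com/slide1.html
print(url_at(90))  # http://example.com/slide2.html
```

Each URL stays active until the next event starts, so an interval is defined implicitly by two consecutive start times.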
Real Video uses proprietary network protocols to deliver the media stream to the clients.
The network technology used is a combination of the HTTP, TCP and UDP protocols as well
as multicast technology. It is not compliant with the H.323 protocol. It provides an autoconfiguration tool to choose the best protocol for each user, as well as connection statistics
information for the UDP mode (Figure 9).
Figure 9 Real Player’s statistics window
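The protocol autoconfiguration mentioned above can be pictured as trying the candidate transports in order of preference and keeping the first one that works. The sketch below is only an illustration under stated assumptions: the preference order and the `works` probe are hypothetical, since the source does not say how the Real Video tool ranks or tests the protocols.

```python
# Preference order is an assumption: lower-overhead transports first,
# HTTP last as the fallback most likely to pass through firewalls.
PREFERENCE = ["multicast", "udp", "tcp", "http"]


def pick_transport(works):
    """Return the first transport in PREFERENCE for which works(name) is true."""
    for name in PREFERENCE:
        if works(name):
            return name
    return None


# Simulated user behind a firewall that blocks multicast and UDP.
reachable = {"tcp", "http"}
print(pick_transport(lambda name: name in reachable))  # tcp
```

A real tool would replace the lambda with an actual connection attempt to the server for each transport.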
The end-user client, RealPlayer, is available for free for Windows® 95 & NT 4.0, Mac®
OS and most UNIX platforms. Real Networks provides an enhanced commercial version of
the client, RealPlayer Plus, which offers some additional functionality: it can store a broadcast stream on the local disk and play it offline, it offers better sound quality for
28.8 Kbps modems, and it has buttons for preset stations. The RealVideo Server and the RealVideo Encoder are available for Windows® 95 & NT 4.0 and UNIX platforms. They are sold in
several commercial packages depending on the intended use.
3.7 Other videoconferencing tools
WebPhone [24] from NetSpeak Corporation [25] is a commercial videoconferencing application similar to Internet Phone and NetMeeting. It is an interactive application compliant with the
H.323 protocol and offers encryption, full duplex audio, caller ID, centralized directory server
support, voice mail and text chat.
VDOPhone [26] from VDOnet Corporation [27] is a commercial interactive videoconferencing tool offering H.323 compatibility, text based chat, parental control options and application sharing compatible with MS NetMeeting.
OnLive! Traveler [28] by OnLive! Technologies is a 3D Virtual World software that
provides real-time communication by allowing groups of people to meet in a virtual environment and talk with their own voices through animated sprites (avatars). It uses the VRML
1.0 protocol for displaying the avatars in a virtual 3D space. It mixes audio of all users in order to provide real-time group communications.
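Mixing the audio of several users into one stream, as OnLive! Traveler is described as doing, is commonly implemented by summing the samples of all streams and clipping the result to the sample range. The sketch below shows that general technique for 16-bit PCM samples; it is not claimed to be the method OnLive! Traveler actually uses, and the `mix` function and sample values are hypothetical.

```python
def mix(streams):
    """Mix equal-length lists of 16-bit PCM samples by summing and clipping."""
    mixed = []
    for samples in zip(*streams):
        total = sum(samples)
        # Clip to the signed 16-bit range to avoid wrap-around distortion.
        mixed.append(max(-32768, min(32767, total)))
    return mixed


alice = [1000, -2000, 30000]
bob = [500, -500, 10000]
print(mix([alice, bob]))  # [1500, -2500, 32767]
```

The last sample shows the clipping step: 30000 + 10000 exceeds the 16-bit maximum and is limited to 32767.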
PowWow [29] is a freeware interactive voice-based communications tool featuring
multi-user conferences (up to 8 parties can talk at the same time), as well as text-to-speech
synthesizer technology. PowWow users can connect to a central server to join a conference,
use a shared whiteboard as well as use text chat features.
FreeTel [30] is an interactive application for voice-based and text-based communication, available as freeware for non-commercial use.
Net2Phone [31] by IDT Corporation provides both the software and the gateway servers
needed to communicate over the Internet with the conventional telephone network. As with Internet Phone,
the user can place any domestic or international call, and Net2Phone forwards the
voice data to a gateway that passes the call to the normal telephone network. Net2Phone
allows users to credit an account and use that money to pay the gateways for the calls.
Vosaic [32] offers a broadcast teleconference application suite similar to Real Video.
The suite comprises three components. The Vosaic Studio is the application used to prepare the media stream, either live or by converting an existing video/audio file to the Vosaic format. The
Vosaic MediaServer is the application that transmits prepared multimedia
streams to multiple clients. Finally, the MediaClient is the end-user application that plays
the audiovisual streams on the client computers. The MediaClient application exists
both as a Web browser plug-in and as a Java applet that can be viewed inside any Web
browser without the need for a specific plug-in. The MPEG1, MPEG2 and H.263 video
codecs are currently supported for the transmission of video streams, as well as the MPEG
audio (32 Kbps), GSM (13 Kbps), half-rate GSM (8.3 Kbps) and half-rate 723 (3.3 Kbps)
audio codecs for the transmission of audio streams.
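To give a feel for the audio bit rates quoted above, the following short calculation converts each rate into the amount of data carried per minute of audio. It assumes that 1 Kbps means 1000 bits per second and that 1 KB means 1000 bytes; the codec names are copied from the text, and the helper function is hypothetical.

```python
# Audio codec bit rates quoted in the text, in Kbps (1 Kbps taken as 1000 bits/s).
codecs = {"MPEG audio": 32, "GSM": 13, "half-rate GSM": 8.3, "half-rate 723": 3.3}


def kilobytes_per_minute(kbps):
    """Convert a bit rate in Kbps to kilobytes (1000 bytes) per minute of audio."""
    bits_per_minute = kbps * 1000 * 60
    return bits_per_minute / 8 / 1000


for name, kbps in codecs.items():
    print(f"{name}: {kilobytes_per_minute(kbps):.2f} KB per minute")
```

Under these assumptions the 32 Kbps MPEG audio stream carries 240 KB per minute, while the 13 Kbps GSM stream carries 97.5 KB per minute.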
VDO Phone
yes
yes
no
yes
yes
2
1
yes
yes
no
yes
yes
yes
yes
yes
yes
yes
no
no
Vosaic
Web Phone
yes
no
no
no
yes
no
yes
n/a
yes
yes
no
Real Video
Speak Freely
yes
yes
no
yes
yes
yes
yes
2
1
yes
no
no
Net2Phone
Internet Phone
yes
yes
yes
yes
yes
yes
no
1
no
yes
no
no
FreeTel
MS NetMeeting
yes
no
no
no
no
no
yes
yes
no
yes
no
PowWow
ReLaTe
yes
yes
no
yes
yes
yes
yes
20
12
yes
no
no
OnLive! Traveler
CU-SeeMe
Two way
H.323 compat.
Appl. sharing
Text chat
Caller ID
Parental controls
Multicast
NB users
NB video streams
Silence detection
Directory server
Security
Payment
yes yes yes yes no no
no no
n/a n/a
yes yes
n/a n/a
n/a n/a
no no
yes yes
8
n/a n/a n/a n/a
n/a n/a
yes yes
n/a n/a
no no no
no
no no no yes yes
-
Table 7 Comparative table of teleconferencing tools
4 Conclusion
In this paper we have presented a survey of teleconferencing packages/tools which are either
freeware or commercial products. We have classified them into two main categories: broadcast teleconference systems and interactive teleconference systems. Broadcast teleconference
systems are essentially one-way transmission systems: during the whole teleconference session a user broadcasts data and media streams to the teleconference participants. Interactive
teleconference systems are two-way transmission systems: a user may transmit data and media streams to other participants and receive data and media streams from any other participant.
Currently, limited support of multicast technology and lack of network bandwidth, especially for users connected over the Internet, limit both the number of teleconference participants
and the types of media streams that can be exchanged, for example animation and video at
25 frames/sec. However, this situation should change in the near future. The next generation IPv6 protocol fully supports multicast. The advent of new types of networks such as ATM, ISDN and
Gigabit Ethernet will substantially increase bandwidth availability. The new types of networks will also provide new types of services more appropriate for teleconference applications, for example the ability to tune quality of service parameters. The impact of the above
improvements in the teleconference area will be important. Teleconference technology will
be used in new application domains requiring extra services in addition to audio/video communication and application sharing. Examples of services include the ability to securely exchange data, the ability to set access rights depending on the user profile and methods to incorporate payment services. Furthermore the number of participants in a teleconferencing
session will substantially increase, requiring the adaptation of user interfaces and satisfactory solutions to new challenges such as initial set-up procedures.
Finally, current teleconferencing software will have to be integrated in teleconference
environments assisting users during the whole lifetime of teleconferences. In particular, assistance
is needed before and after the teleconference session per se. Before the teleconference session
the teleconference administrator would define the teleconference agenda. Constraints specified in the agenda would be verified during the teleconference session. Whenever a constraint
is violated, the teleconference administrator would be prompted to take the appropriate action.
The teleconference session would be stored in a teleconference database. After the teleconference, the teleconference database would allow users to issue queries not only on a single teleconference recording, but also on collections of teleconferences. The selected teleconferences, or parts of them, could then be ‘replayed’: for example, users could listen to and view the audio and video
streams, and view the actions of users on the shared application in use.
References
[1] C. Breiteneder, S. Gibbs and C. Arapis. “TELEPORT-An Augmented Reality Teleconferencing Environment”, 3rd Eurographics Workshop on Virtual Environments, Monte Carlo, Monaco, February 1996.
[2] C. Arapis, D. Konstantas and T. Pilioura. “Design Issues and Alternatives for Setting up Real-time Interactive Telelectures”. Proceedings of the 1998 ACM Symposium on Applied Computing (SAC 98), Atlanta, Georgia, February-March 1998, pp. 104-111.
[3] D.J. Gemmell and C.G. Bell. “Noncollaborative Telepresentations Come of Age”, Communications of the ACM, Vol. 40, No. 4, April 1997, pp. 79-89.
[4] D. Konstantas, Y. Orlarey, S. Gibbs and O. Carbonel. “Distributed Musical Rehearsal”, Proceedings of the International Computer Music Conference 97, Thessaloniki, Greece, September 1997.
[5] Stroud’s reviews of communications clients, http://cws.internet.com/32phone-reviews.html
[6] North Carolina State University’s Desktop Videoconferencing Product Survey, http://www3.ncsu.edu/dox/video/products.html
[7] D.J. Gemmell and C.G. Bell. “Noncollaborative Telepresentations Come of Age”, Communications of the ACM, Vol. 40, No. 4, April 1997, pp. 79-89.
[8] H.323 ITU-T Recommendation H.323 - Packet-based multimedia communications systems, http://www.itu.int/itudoc/itu-t/approved/h.html
[9] C. Arapis, D. Konstantas and T. Pilioura. “Design Issues and Alternatives for Setting up Real-time Interactive Telelectures”. Proceedings of the 1998 ACM Symposium on Applied Computing (SAC 98), Atlanta, Georgia, February-March 1998, pp. 104-111.
[10] Mbone (IP Multicast) information, http://www.mbone.com/
[11] IPv6: Next generation IP protocol, The Internet Engineering Task Force, RFC 1883.
[12] Dorcey, T. CU-SeeMe Desktop VideoConferencing Software, in Connexions 9, 3 (March 1995)
[13] White Pine Software, http://www.wpine.com
[14] CU-SeeMe VR application. Immersive Desktop Teleconferencing, in the Proceedings of the 4th ACM International Multimedia Conference (Nov. 1996)
[15] Remote Language Teaching (ReLaTe), http://www.ex.ac.uk/pallas/relate/
[16] Robust-Audio Tool (RAT), http://www-mice.cs.ucl.ac.uk/mice/rat/
[17] VIdeo Conferencing tool (VIC), http://www-nrg.ee.lbl.gov/vic/
[18] MS NetMeeting application, http://www.microsoft.com/netmeeting/
[19] Internet Phone v.5.0, http://www.vocaltec.com/products/iphone5/index.html
[20] VocalTec Communications, http://www.vocaltec.com
[21] SpeakFreely, http://www.fourmilab.ch/speakfree/windows/
[22] PGP: Pretty Good Privacy by Simson Garfinkel, O’Reilly & Associates, 1994
[23] Real Networks, http://www.realnetworks.com
[24] WebPhone, http://connect.netspeak.com/product/webphone/index.html
[25] NetSpeak Corporation, http://connect.netspeak.com/
[26] VDOPhone, http://www.vdo.net/vdostore/vdophone.html
[27] VDOnet Corporation, http://www.vdo.net/corporate/
[28] OnLive! Traveler, http://www.onlive.com/prod/trav/about.html
[29] PowWow application, http://www.powwow.com/
[30] FreeTel, http://www.freetel.com/
[31] Net2Phone, http://www.net2phone.com/
[32] Vosaic streaming multimedia products, http://www.vosaic.com/products/index.html