Teleconferencing Applications: A Survey

Dimitris Thanos
Constantin Arapis

Abstract

Recent technological improvements in the domain of networks, multimedia hardware and compression algorithms have brought teleconferencing and other distributed multimedia applications to the desktop. An example of a multimedia application whose popularity is continuously increasing is teleconferencing. This paper is a survey of teleconferencing packages/tools, presenting an overview of their functionality and their features.

1 Introduction

Computer hardware and networks have evolved dramatically in recent years. Multimedia-enabled hardware, such as CPUs with multimedia instructions, sound cards, video digitizing hardware and cameras, is becoming more powerful and more affordable, and now reaches a large portion of desktop computers. In addition, the continuous evolution of computer networks, from high-end ATM and Gigabit Ethernet to ISDN and low-end modems, has increased bandwidth availability and reduced its cost. Combined, these two technologies gave rise to applications that can share and exchange multimedia data such as text, animation, graphics, images, sound and video.

An example of a multimedia application whose popularity is continuously increasing is teleconferencing. Teleconferencing software packages/tools allow people who are geographically dispersed to hold conferences by sending and receiving multimedia data over networks. Examples of teleconferencing applications include videoconference [1][12][18], teleteaching [2][15], telepresentation [3], tele-musical rehearsal [4], video phone [19] and audio chat [16][21].

Several teleconferencing software tools/packages exist today. This paper gives an overview of these software packages, their features and their functionality.
Our survey is intentionally limited to software packages that are either commercial products or freeware. Teleconferencing software packages that are described in papers but satisfy neither condition are therefore not included in this survey. Other surveys exist in this domain [5][6], but they either treat only platform-specific applications or do not cover the latest developments.

This paper is structured as follows. In the second section we describe a number of features and services relevant to teleconferencing software packages. The presence or absence of certain features, and the way services are implemented, if at all, will help us build a comparative table of the tested applications. In the third section we evaluate in detail a number of teleconferencing environments. Finally, we state our concluding remarks.

2 Teleconferencing features/services

2.1 Broadcast / interactive teleconferencing systems

We classify each teleconferencing environment into one of two categories: broadcast teleconferencing systems or interactive teleconferencing systems. Broadcast systems allow a user to broadcast multimedia data (audio, video, images, transparencies) to a set of users. Interactive systems allow each user to both send multimedia data to and receive it from other users. Systems belonging to the interactive category are more suitable for real-time interactive teleconferences.

We chose to include broadcast teleconferencing systems in the survey because they can be used in specific situations, such as event broadcast and one-way tele-presentations [7]. Furthermore, such tools incorporate technology and features that could be interesting for a videoconferencing scenario, such as off-line redistribution of the logged conference.
2.2 H.323 compatibility

The H.323 protocol [8] was developed to standardize multimedia teleconferencing. It applies to multi-point and point-to-point teleconference sessions over packet-switched networks. It specifies a number of audio and video codecs as well as several intercommunication protocols. H.323 compliant applications [12][18], developed by different vendors and for different platforms, can intercommunicate and exchange audio and video data with each other. In addition, H.323 includes protocols defining the connection of compliant applications to POTS (Plain Old Telephone System) through specialized gateways. Table 1 lists the families of protocols contained in H.323.

Protocol   Description
H.225      Signaling, registration, packetization and synchronization control.
H.245      Control for opening/closing channels, other.
H.261      Video codec for audiovisual services at P x 64 Kbps.
H.263      Video codec for video over POTS.
G.711      Audio codec, 3.1 kHz at 48, 56, and 64 Kbps.
G.722      Audio codec, 7 kHz at 48, 56, and 64 Kbps.
G.728      Audio codec, 3.1 kHz at 16 Kbps.
G.723      Audio codec, 5.3 and 6.3 Kbps.
G.729      Audio codec, 8 Kbps.

Table 1 H.323 families of protocols

2.3 Application sharing

Application sharing is the ability to share a program, running on one of the computers, with the other participants in the conference. It allows participants to view the same data or information, as well as to control the application as if it were running on their own computer.

Typical shared applications in teleconferencing environments include shared whiteboard¹ and collaborative browsing². Application sharing is useful for scenarios such as videoconferences for technical support and tele-learning [9]. For example, participants in a teleconference could visit certain Internet pages using a shared WWW browser while exchanging comments or explanations on the visited pages.
2.4 Text chat

Text chat is an application in which users communicate with each other by writing short lines of text. The text is transmitted either to all the conference participants or to a specific subset. This is particularly useful in teleconferencing applications for configuration purposes, when audio and video are unavailable.

2.5 Caller ID

In teleconferencing environments where the end-user application waits for connections from other users, caller ID is the ability to identify the calling party before answering the call. The user can therefore choose whether or not to answer the call. Some applications allow automated actions that depend on the caller ID.

2.6 Parental controls

Parental controls allow one to restrict certain incoming and outgoing calls, and they prevent children from viewing unsuitable material. Parental controls are usually protected by a password.

2.7 Multicast support

Multicast [10] is an IP protocol specifying a lightweight routing method for delivering time-critical application data (such as audio and video) to a subset of destinations on the network, without congesting the network nodes. It is based on a hierarchical distribution of data packets with no unnecessary replication. To set up a multicast-based system, not only the applications but also the network routers must support multicast. If the routers of a network do not support multicast, a multicast network can be simulated: some computers in the network act as multicast routers, forwarding the IP packets to other stations. The current IP protocol (IPv4) supports multicast as an option which not all routers implement. The next generation IP protocol, IPv6 [11], will support multicast as a basic feature (all IPv6 routers will implement it).

¹ Shared whiteboard is a tool allowing the conference participants to draw on the same area, for demonstrating and developing ideas.
² Collaborative browsing is a tool allowing the conference participants to browse WWW pages concurrently.

2.8 Security

By security we refer to the protection of the data transmitted during a teleconference session from everyone other than the intended receiver. This data could be audio (voice), video, text chat, application sharing or any other data transmitted through the network during the teleconference session. Examples of teleconference applications requiring privacy include medical teleconferences and industrial tele-meetings.

2.9 Payment

In certain scenarios, namely teaching and technical support, the information shared during the session has a value and should only be revealed to those people (customers) who are willing to pay for it. An example is a foreign language course organized by a private educational organization. People who want to follow the course subscribe and pay for it. In a similar teleconference scenario, the system could be designed so that users who connect to view the content can pay for it with an electronic payment method. Payment is tightly bound to security, as the content has to be protected in order to preserve its value. Preserving the value of the content is necessary in a commerce scenario in order to guarantee to the content provider that the service is only accessible to people who pay.

2.10 Logging and off-line information retrieval

In a videoconference session, various events can be logged in order to consult and/or summarize the session. These include the following types of information:

• Control events: the information that shows how the videoconference evolved, such as the time the conference started and ended, when each participant joined and left the conference, and other similar events.
• Organizational data, such as which subjects were discussed in the conference, including which participant spoke on which subject.
• Various data of the conference, such as the complete or partial video data of the conference, the audio of each speaker and the data presented during the session.

By logging such events, a videoconferencing session can either be replayed in its entirety, or specific parts of it can be consulted off-line once the session is over.

2.11 Other features

Other features characterizing each teleconference application are the following:

• Number of teleconference participants. Some applications are limited to one-to-one communication whereas others allow multiple users to take an active part in the teleconference. The number of participants supported by each tool is an important factor for characterization.
• Audio mixing. In environments allowing more than two teleconference participants, situations can occur where more than one participant speaks at the same time. Different tools implement different policies for handling such situations. Some tools implement audio mixing [16] whereas others implement a ‘speak on demand’ mechanism [18]. The policy used to handle the audio of multiple users can be considered a characterization factor.
• Number of simultaneous video streams. For multi-party teleconferences, each tool has a maximum number of simultaneous video streams that it can send and receive. This number can be used to characterize each tool.

3 Teleconferencing environments

In this section we examine existing teleconferencing applications and classify them based on their features and their target domain. Because of the large number of videoconferencing applications currently available, not all of them can be described here. The applications to evaluate were chosen based on their features, their availability and their target domain. At the end of this section the reader may find a comparative table (Table 7) of all the teleconferencing applications/tools tested, based on their features.
3.1 CU-SeeMe and CU-SeeMe VR

CU-SeeMe [12] is an interactive videoconferencing application providing audio and video interconnection of two or more participants of a teleconference. It was originally developed at Cornell University for educational purposes. Currently, an enhanced commercial version is provided by White Pine Software [13].

CU-SeeMe provides two conference modes. The first mode is intended for a one-to-one conference, allowing two parties to communicate directly with each other by exchanging audio and video data over the network. The second mode is intended for a centralized multi-party conference which more than two parties can join. In the second mode, a centralized server (‘reflector’ in the CU-SeeMe terminology) acts as a meeting point for all participants and distributes audio and video between them.

In the one-to-one conference mode, the CU-SeeMe client waits on a known port for a connection from another user. Users and their addresses can be obtained by connecting to and querying a specialized directory server running on a known machine. Once the connection between the two parties is established, each client captures audio and video using the multimedia hardware, compresses it with one of the algorithms shown in Table 2, and sends it to the other party through the network. The inverse operation is performed on the other user’s station, to display the received video in a window on the screen and send the received audio to the speakers or headphones. Figure 1 shows the connections and data flow schema of the one-to-one conference mode.

Figure 1 CU-SeeMe one-to-one mode (diagram: Client 1 and Client 2 each exchange address data with the Directory Server and exchange audio/video data with each other)

In the multi-party conference mode, users connect to a mirror server where the reflector software is running.
Each instance of the reflector program represents an open conference which users can join (connect to) at any time in order to participate. Once connected to the server, each client sends its compressed audio and video data directly to the reflector server. The reflector gathers the video data received from each party, multiplexes it into one stream and broadcasts it to all the participants of the conference. Each participant de-multiplexes the received data and displays the video of each participant in a separate window on the screen. Up to 12 video streams can be viewed simultaneously. For the audio, a ‘first-speak’ policy is implemented, allowing the broadcast of only one audio stream at a time. Figure 2 shows a schema of the CU-SeeMe video multiplexing mechanism.

Figure 2 CU-SeeMe video multiplexing schema (diagram: Client 1, Client 2, ..., Client n send client data to the mirror server and receive multiplexed data from it)

Figure 3 shows the main CU-SeeMe user interface for the multi-party conference mode. On the left-hand side, the participant list of the current conference is displayed along with each participant’s attributes (camera availability, microphone availability, whether he/she is currently looking at your video and whether he/she is listening). On the right-hand side, the videos of some (up to 12) participants are displayed. One or more videos can be moved into separate windows on the screen, where they can be enlarged or reduced. Below the videos pane, the text chat pane is displayed. Users can write short lines of text to all or some of the other participants; the text appears in this pane. On the lower left side of the window, CU-SeeMe displays the user’s speaker and microphone controls as well as indicators of the recording and playback gain.

Figure 3 CU-SeeMe conference (screenshot showing the participants list, the participants’ videos, the text chat pane and the speaker / mic controls)

Cornell University has developed an extension to the CU-SeeMe application called CU-SeeMe VR [14].
CU-SeeMe VR adds spatial information to the CU-SeeMe environment to provide a virtual 3D chat/conference environment. Each conference is considered to take place in a virtual room where participants can move around. To achieve this, VR displays the video of each participant projected on a virtual wall which can move in the 3D space of the room. Spatial information is also used for multiplexing the audio of the participants. For example, when a user ‘moves’ closer to another user, he hears the voice of the other user louder than when he moves away from him.

The commercial version of the CU-SeeMe client is available for Windows® 95 & NT 4.0 and Mac® OS platforms, and the reflector program is available for Windows NT® 4.0 and UNIX platforms. It conforms to the H.323 videoconferencing protocol and can therefore communicate with other conforming applications, including Microsoft® NetMeeting™ and Intel® ProShare®. CU-SeeMe implements whiteboard and text-chat services for multi-user collaboration during conferences, parental controls to restrict certain incoming and outgoing calls, as well as caller ID for screening incoming calls.

Audio Compression   Bandwidth (Kbps)
Intel DVI           32
Delt-Mod            16
DigiTalk            8.5
G.723.1             6.4
G.723.1             5.3
Voxware             2.4

Video Compression   Bandwidth (Kbps)
White Pine M-JPEG
White Pine H.263

Table 2 CU-SeeMe compression algorithms

3.2 Remote Language Teaching (ReLaTe)

Remote Language Teaching (ReLaTe) [15] is an interactive videoconference environment for small multi-party conferences using the Internet multicast technology. It was developed at the Department of Computer Science of University College London as part of the umbrella Esprit project MICE (Multimedia Integrated Conferencing for Europe). ReLaTe comprises two basic sub-components, the Robust-Audio Tool (RAT) [16] and the Video Conferencing Tool (VIC) [17]. Figure 4 shows ReLaTe’s integrated user interface.
Figure 4 ReLaTe integrated user interface

ReLaTe’s multi-party and two-party conference modes are very similar in implementation due to the use of multicast technology. In the multi-party conference, each party initiates the connection to a multicast address. All the data is sent to this address and is distributed to all the parties connected to the same multicast address. The audio, video and other data of each party therefore reaches all the other connected parties. Similarly, for the two-party conference, each party sends its data to a specified address and port (unicast) of the other party’s system.

The two sub-components of ReLaTe, VIC and RAT, work in the following way. VIC captures the video data from each person’s camera equipment, compresses it and sends it to the other party(ies). On the receiver’s side, the data is decompressed and displayed in a window on the screen. RAT captures the audio data from the sender’s microphone, compresses it with one of the algorithms shown in Table 3, and sends it to the other party(ies). On the receiver’s side, the data is decompressed and fed to the audio device.

When the network is characterized by a random loss of packets³, audio quality suffers from the gaps in the audio stream. RAT uses the following piggy-back technique for recovering lost packets: each packet, say Pn, carries the audio data of two consecutive time periods, Tn-1 and Tn. The audio data of Tn is compressed with a greater compression ratio than that of Tn-1; it is therefore smaller and carries sound of inferior quality. The receiver buffers the audio of the two periods in a FIFO queue. If a packet is lost, the period whose higher-quality data it carried can still be played, at lower quality, using the copy of that period carried by the previous packet.

³ The Internet is an example of such a network, with a packet loss varying from 0-5% inside local networks to an average of 20% or more for international connections.
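The piggy-back recovery described above can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions: the packet layout (a dictionary with a "high" and a "low" entry), the function names and the period labels are hypothetical and do not reflect RAT's actual packet format or code.

```python
def build_packets(periods):
    """Build packets so that packet n carries the high-quality audio of
    period n-1 and a low-quality copy of period n (hypothetical layout)."""
    packets = []
    for n in range(len(periods) + 1):
        high = periods[n - 1] if n >= 1 else None
        low = periods[n] if n < len(periods) else None
        packets.append({"high": high, "low": low})
    return packets


def play_out(packets, lost):
    """Return one (audio, quality) pair per period, given the set of lost
    packet indices. quality is 'high', 'low', or None for an audible gap."""
    received = {n: p for n, p in enumerate(packets) if n not in lost}
    result = []
    for k in range(len(packets) - 1):
        if k + 1 in received:
            # Normal case: the high-quality copy of period k arrived.
            result.append((received[k + 1]["high"], "high"))
        elif k in received:
            # Fallback: use the low-quality copy from the previous packet.
            result.append((received[k]["low"], "low"))
        else:
            # Both copies were lost.
            result.append((None, None))
    return result


packets = build_packets(["T0", "T1", "T2", "T3", "T4"])
print(play_out(packets, lost={3}))
```

With packet 3 lost, period T2 is still played, but at low quality; every other period is played at high quality.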
Of course, even with this method, if two consecutive data packets are lost, there is still a gap in the audio stream. A disadvantage of this method is that more audio data is transmitted and therefore more bandwidth is required.

As an alternative to this method, recent versions of RAT provide the option to send interleaved audio. The idea of interleaved audio is based on the following observation: audio intelligibility suffers more when the audio stream is interrupted for one long period than when it is interrupted many times for shorter periods. RAT uses the schema in Figure 5 to multiplex the audio data into the packets. In case of a packet loss, the reconstructed stream has several small gaps instead of one gap as long as the audio carried by the lost packet. Although interleaving does not reduce the amount of loss observed, it does significantly improve the perceived quality of an audio stream. The obvious disadvantage of interleaving is that it increases latency. This limits the use of the technique for interactive applications, although it could be used for non-interactive scenarios. The major advantage of interleaving is that it does not increase the bandwidth requirements of a stream.

Original audio stream:           1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Interleaved stream in packets:   1 5 9 13 | 2 6 10 14 | 3 7 11 15 | 4 8 12 16   (one packet lost)
Reconstructed audio stream:      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16   (the units of the lost packet are missing)

Figure 5 RAT’s interleaved audio schema

If many users speak at the same time, RAT mixes the audio signals into one and feeds the mixed stream to the audio card. The mixing is done on the receiver’s side, and therefore the audio streams of all the concurrent speakers arrive at each conference node. The disadvantage of this method is that the bandwidth requirements for each connection increase linearly with the number of speakers.

Audio Compression   Bandwidth (Kbps)
16 bit linear       128
PCM µ-law           64
DVI                 32
GSM                 13.2
LPC                 5.8

Table 3 ReLaTe’s compression algorithms
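The interleaving schema of Figure 5 can be reproduced with a short Python sketch. It is a minimal illustration, assuming 16 audio units spread over 4 packets as in the figure; the function names are hypothetical and the code is not taken from RAT.

```python
def interleave(units, packet_count):
    """Spread consecutive audio units round-robin over packet_count packets."""
    return [units[i::packet_count] for i in range(packet_count)]


def reconstruct(packets, total_units, lost=()):
    """Rebuild the stream in its original order; units carried by lost
    packets are left as None."""
    stream = [None] * total_units
    packet_count = len(packets)
    for p, packet in enumerate(packets):
        if p in lost:
            continue
        for k, unit in enumerate(packet):
            stream[p + k * packet_count] = unit
    return stream


def longest_gap(stream):
    """Length of the longest run of consecutive missing units."""
    longest = current = 0
    for unit in stream:
        current = current + 1 if unit is None else 0
        longest = max(longest, current)
    return longest


units = list(range(1, 17))
packets = interleave(units, 4)
print(packets[0])  # [1, 5, 9, 13]
print(longest_gap(reconstruct(packets, 16, lost={1})))  # 1
```

Losing one interleaved packet removes four units, but each gap is a single unit long. Without interleaving, losing the packet that carries units 5 to 8 would produce one gap of four consecutive units. The receiver must wait for all four packets before it can play the first four units in order, which is the latency cost mentioned above.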
ReLaTe incorporates the following additional features. It secures the data sent over the network using DES encryption, and it offers a shared whiteboard application integrated in its interface. The audio component, RAT, uses a silence-detection method to reduce the traffic on the network: when the energy of the audio signal is below a certain threshold value, indicating that the participant is not speaking, RAT does not send any audio data.

ReLaTe’s sub-components RAT and VIC exist as individual freeware tools. Their source code is also available, under copyright restrictions, from UCL. ReLaTe and its sub-components are available for Sun UNIX and SGI workstations and are being ported to the Windows 95 and NT 4.0 platforms.

3.3 MS NetMeeting

NetMeeting [18] is a freeware interactive videoconferencing application by Microsoft Corporation that also targets multi-party conferences using audio and video. It is very widely used due to its integration in Microsoft Internet Explorer and its freeware status.

For one-to-one conferences, NetMeeting is very similar in operation to the CU-SeeMe application. It uses the multimedia hardware of each party to capture audio and video data, compresses it and sends it over the network to the other side. NetMeeting users can log on to one of many directory servers owned by Microsoft (‘Internet Locator Servers’ (ILS) in the NetMeeting terminology) to find other connected NetMeeting users (Figure 6).

Figure 6 NetMeeting directory server

Multi-party conferences in NetMeeting differ from those of the CU-SeeMe application. Once a conference is initiated by two users, other users can ask permission to join the meeting. If accepted, they can send audio and video data to the other parties, but each user can listen to and view the video of only one party at a time.
The audio and video data is sent directly to the other party/ies; unlike in the multi-user conference mode of CU-SeeMe, it does not pass through a server.

NetMeeting offers additional features such as application sharing, shared whiteboard, shared clipboard, file sharing capabilities and text-based chat. NetMeeting runs only on the Microsoft Windows platforms, but offers interoperability with other H.323 compatible products.

3.4 Internet Phone

Internet Phone [19] from VocalTec [20] is a commercial, interactive videoconferencing software package providing audio and video interconnection of two or more parties. Figure 7 illustrates the user interface of Internet Phone. As in NetMeeting, directory servers owned by VocalTec and specific to Internet Phone allow users to browse existing meetings and join one of them if they have permission. Internet Phone has two operation modes: one-to-one and multi-party.

Figure 7 Internet Phone and its statistics dialog

In the one-to-one mode, the client waits for a connection from another party. Once the connection is established, it behaves similarly to the NetMeeting client, showing the video of the other party in a window on the screen and playing the received audio through the speakers.

In the multi-party mode, the parties join a specific conference at a specified server. Users can only speak to the other conference participants; they cannot use video. Internet Phone uses a simple token mechanism for users to speak: only one user can speak at a time, by pressing a button on the interface. The button is available only when no other person is speaking. When a person is speaking, the voice packets are compressed and sent to the server, which broadcasts them to all the other clients.

Internet Phone offers the possibility to call and speak to users on the regular phone network. Internet telephone service providers act as gateways from the Internet to the telephone network.
When a user dials a regular phone number, a connection to such a gateway is established. The gateway calls the specified number and passes the sound to and from the telephone network. Internet telephone is a commercial service, and users who want to use it therefore have to pay for it.

Internet Phone can work with full or half duplex audio hardware and supports caller ID. It can be configured to comply with the H.323 protocol and can therefore communicate with other H.323 compliant applications. It offers a shared whiteboard application, text chat and voice mail capabilities.

3.5 Speak Freely

SpeakFreely [21] is an interactive conferencing tool limited to the exchange of audio data (voice) between users. It is widely used due to its availability on various platforms, namely Windows 95, NT, and UNIX for SUN and Silicon Graphics computers.

In its basic mode of operation, one user initiates a network connection to a known port on the other party’s computer. Other parties ready to communicate can be found through one of the directory servers. Once the connection is established, SpeakFreely captures audio from the computer’s audio card, compresses it and sends it over the network to the other party. On the remote side, SpeakFreely decompresses the received data and plays it through the sound card. Both half and full duplex transmission are supported, depending on the audio hardware and drivers.

SpeakFreely supports several audio compression algorithms that can be manually selected depending on the available bandwidth. Table 4 shows the compression algorithms used by SpeakFreely along with their bandwidth requirements.

Audio Compression   Bandwidth (Kbps)
Simple              40
ADPCM               40
Simple + ADPCM      20
GSM                 16.5
Simple + GSM        8.25
LPC                 6.5
LPC-10              3.46

Table 4 Speak Freely compression algorithms

SpeakFreely implements multicast, allowing multi-party discussion groups on networks supporting it.
For networks not supporting multicast, audio packets are sent to each user individually, to allow the transmission of an audio feed to multiple hosts.

A feature distinguishing SpeakFreely from other applications is the option to encrypt the audio data before sending it to the other side. This allows secure/private conversations in two-party or multi-party conferences. In its secure mode of operation, SpeakFreely encrypts the compressed data in a way that only the other party can decrypt (e.g. using the PGP encryption algorithm [22]). This prevents others from intercepting the network traffic and listening to the transmitted audio data. The reverse operation is performed on the other side in order to listen to the data: the received stream is decrypted and then decompressed. Table 5 shows the encryption algorithms used by SpeakFreely for secure communications.

Encryption algorithm
PGP - Pretty Good Privacy
IDEA - International Data Encryption Algorithm
DES - Data Encryption Standard

Table 5 SpeakFreely encryption algorithms

3.6 Real Video and Real Audio

Real Video is a commercial package by Real Networks [23] targeting the broadcast of audio and video over the Internet and corporate Intranets. Real Audio is a similar application, but limited to the distribution of audio only. Since Real Audio is a subset of Real Video, we restrict our presentation to Real Video.

Real Video is intended to provide streaming video and/or audio for both on-demand content and real-time live broadcasts. It is a one-way tool, i.e. multimedia data is broadcast in one direction only, from the content provider to the clients (receivers). Due to its intended use, i.e. streaming audio/video and one-way tele-presentations, Real Video inserts a significant delay (~8 sec.) to compensate for data losses during the transmission. This delay makes Real Video inappropriate for real-time interactive conferences.
However, it can be used for real-time non-interactive teleconferences where the delay does not constitute a drawback. An example is a large conference over the Internet where participants can only listen to or view the talks. Introducing an eight-second delay in such a scenario does not affect the service.

Real Video consists of three main sub-components. The RealPlayer (or RealPlayer Plus) is the end-user application, used as a client for viewing streaming video and audio. The other two components, the Real Video Encoder and the Real Video Server, are used on the provider side to produce and distribute Real Video content respectively.

Real Video serves two scenarios. The first is streaming prerecorded audio and video data. The second is broadcasting real-time live events.

In the first scenario, the provider prepares a Real Video media file. This file can be created either by using the Real Video Encoder to capture and store the media file, or by converting a prerecorded video file (QuickTime, AVI and MPEG) to the Real Video format. Files in Real Video format can be streamed to the clients from the Real Video Server. The provider creates WWW pages containing hyperlinks to the required media file⁴. When a user clicks on such a hyperlink, the MIME protocol is used to start the client application: Real Player or Real Player Plus. Real Player connects to the machine running the Real Video Server and requests the specified file. The Real Video Server streams the media file to the Real Player, which plays it back on the client’s computer.

In the second scenario, the Real Video media stream is not prerecorded, but is created in real time on the provider’s side. The Real Video Encoder is used to capture video (and audio) from the provider’s camera and microphone and convert it on the fly to the Real Video media format.
The encoder sends the stream to the Real Video Server⁵, which can serve it to the clients requesting it in the same way it serves a prerecorded media file. All the clients requesting this live media receive the same live stream from the time they connect to the server. Figure 8 shows Real Video’s schema for the live broadcast scenario.

Figure 8 Real Video live broadcast schema (diagram: the Real Video Encoder sends the Real Video stream to the Real Video Server, which sends streaming data to several Real Player clients)

For both the live broadcast and the on-demand service, the provider can serve the same stream in different formats (each format requires a different bandwidth), and users can choose the one best suited to the content and their bandwidth availability. Real Video also supports a bandwidth negotiation mechanism to allow automatic selection of the format to use. The available Real Video media stream formats, as well as their bandwidth requirements, are shown in Table 6.

⁴ More precisely, the media file contains information on the streaming protocol and the IP address of the server.
⁵ Notice that the Real Video Encoder and the Real Video Server do not necessarily run on the same machine.

Audio compression                             Bandwidth (Kbps)
Real Video                                    5 - 80 (*)

Video compression                             Bandwidth (Kbps)
Real Video (standard), Real Video (fractal)   1 - 420 (*)

(*) user specified, depending on the quality

Table 6 Real Video compression algorithms

Real Video also offers a feature called ‘synchronized multimedia’, where the Real Player application is connected to the WWW browser and directs it to display specific URLs while the media stream is being viewed. The provider attaches the URLs to specific intervals of the media stream, and the Real Video Server sends both to the clients. When the Real Player receives a URL, it instructs the connected WWW browser to display the page at that URL.
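The idea behind ‘synchronized multimedia’, URLs attached to intervals of the media stream, can be illustrated with a small Python sketch. This is not Real Video's format or API: the timeline structure, the function name and the example.com URLs are hypothetical, and the intervals are assumed to be sorted and non-overlapping.

```python
from bisect import bisect_right


def url_for_time(timeline, t):
    """Return the URL attached to playback time t (in seconds), or None.

    timeline is a list of (start, end, url) tuples, sorted by start and
    non-overlapping; each interval covers start <= t < end.
    """
    starts = [start for start, _end, _url in timeline]
    i = bisect_right(starts, t) - 1  # last interval starting at or before t
    if i >= 0:
        _start, end, url = timeline[i]
        if t < end:
            return url
    return None


# Hypothetical timeline prepared by the content provider.
timeline = [
    (0, 30, "http://example.com/intro.html"),
    (30, 90, "http://example.com/agenda.html"),
    (120, 180, "http://example.com/results.html"),
]

print(url_for_time(timeline, 45))   # http://example.com/agenda.html
print(url_for_time(timeline, 100))  # None
```

A player built along these lines would call such a lookup as playback advances and ask the browser to load a page whenever the returned URL changes.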
Real Video uses proprietary network protocols to deliver the media stream to the clients. The network technology used is a combination of the HTTP, TCP and UDP protocols as well as multicast technology. It is not compliant with the H.323 protocol. It provides an autoconfiguration tool to choose the best protocol for each user, as well as connection statistics for the UDP mode (Figure 9).

Figure 9 Real Player’s statistics window

The end-user client, RealPlayer, is available for free for Windows® 95 & NT 4.0, Mac® OS and most UNIX platforms. Real Networks provides an enhanced commercial version of the client, RealPlayer Plus, which offers some additional functionality. Namely, it allows a broadcast stream to be stored on the local disk and played offline, it provides better sound quality for 28.8 modems and it has buttons for preset stations. The RealVideo Server and the RealVideo Encoder are available for Windows® 95 & NT 4.0 and UNIX platforms. They exist in several commercial packages depending on the intended use.

3.7 Other videoconferencing tools

WebPhone [24] from Netspeak Corporation [25] is a commercial videoconferencing application similar to Internet Phone and NetMeeting. It is an interactive application compliant with the H.323 protocol, and offers encryption, full duplex audio, caller ID, centralized directory server support, voice mail and text chat.

VDOPhone [26] from VDOnet Corporation [27] is a commercial interactive videoconferencing tool offering H.323 compatibility, text-based chat, parental control options and application sharing compatible with MS NetMeeting.

OnLive! Traveler [28] by OnLive! Technologies is a 3D Virtual World software that provides real-time communication by allowing groups of people to meet in a virtual environment and talk with their own voices through animated sprites (avatars). It uses the VRML 1.0 protocol for displaying the avatars in a virtual 3D space.
It mixes the audio of all users in order to provide real-time group communications.

PowWow [29] is a freeware interactive voice-based communications tool featuring multi-user conferences (up to 8 parties can talk at the same time), as well as text-to-speech synthesizer technology. PowWow users can connect to a central server to join a conference, use a shared whiteboard and use text chat features.

FreeTel [30] is an interactive application for voice-based and text-based communication, available as freeware for non-commercial use.

Net2Phone [31] by IDT Corporation provides both the software and the gateway servers to communicate through the Internet with the conventional telephone network. As with Internet Phone, the user can place any domestic or international call and Net2Phone will forward the voice data to a gateway that will pass the call to the normal telephone network. Net2Phone allows users to fund an account and use that money to pay the gateways for the calls.

Vosaic [32] offers a broadcast teleconference application suite similar to Real Video. The suite comprises three components. The Vosaic Studio is the application used to prepare the media stream, either live or by converting an existing video/audio file to the Vosaic format. The Vosaic MediaServer is the application which allows the transmission of prepared multimedia streams to multiple clients. Finally, the MediaClient is the end-user application which displays the audiovisual streams on the client computers. The MediaClient application exists both as a Web browser plug-in and as a Java applet which can be viewed inside any Web browser without the need for a specific plug-in. The MPEG1, MPEG2 and H.263 video codecs are currently supported for the transmission of video streams, as well as the MPEG audio (32 Kbps), GSM (13 Kbps), half-rate GSM (8.3 Kbps) and half-rate 723 (3.3 Kbps) audio codecs for the transmission of audio streams.

[Table 7 compares CU-SeeMe, ReLaTe, MS NetMeeting, Internet Phone, Speak Freely, Real Video, Web Phone, VDO Phone, Vosaic, Net2Phone, FreeTel, PowWow and OnLive! Traveler on the following features: two way, H.323 compatibility, application sharing, text chat, caller ID, parental controls, multicast, number of users, number of video streams, silence detection, directory server, security and payment.]
Table 7 Comparative table of teleconferencing tools

4 Conclusion

In this paper we have presented a survey of teleconferencing packages/tools which are either freeware or commercial products. We have classified them into two main categories: broadcast teleconference systems and interactive teleconference systems. Broadcast teleconference systems are essentially one-way transmission systems: during the whole teleconference session a user broadcasts data and media streams to the teleconference participants. Interactive teleconference systems are two-way transmission systems: a user may transmit data and media streams to other participants and receive data and media streams from any other participant.

Currently, limited support of multicast technology and lack of network bandwidth, especially for users connected over the Internet, limit both the number of teleconference participants and the types of media streams that can be exchanged, for example animation and video at 25 frames/sec. However, in the near future things will change. The next generation IPv6 protocol fully supports multicast.
The advent of new types of networks such as ATM, ISDN and Gigabit Ethernet will substantially increase bandwidth availability. The new types of networks will also provide new types of services more appropriate for teleconference applications, for example the ability to tune quality of service parameters.

The impact of the above improvements in the teleconference area will be important. Teleconference technology will be used in new application domains requiring extra services in addition to audio/video communication and application sharing. Examples of such services include the ability to securely exchange data, the ability to set access rights depending on the user profile, and methods to incorporate payment services. Furthermore, the number of participants in a teleconferencing session will substantially increase, requiring adaptation of user interfaces and satisfactory solutions for new challenges such as initial set-up procedures.

Finally, current teleconferencing software will have to be integrated in teleconference environments assisting users during the whole lifetime of teleconferences. Namely, assistance is needed before and after the teleconference session per se. Before the teleconference session the teleconference administrator would define the teleconference agenda. Constraints specified in the agenda will be verified during the teleconference session. Whenever a constraint is violated, the teleconference administrator will be prompted to take the appropriate actions. The teleconference session will be stored in a teleconference database. After the teleconference, the teleconference database would allow users to issue queries not only on a single teleconference recording, but also on collections of teleconferences. The selected teleconferences, or parts of them, could then be ‘replayed’, for example to listen to and view audio and video streams, and to view the actions of users on the shared application in use.
References

[1] C. Breiteneder, S. Gibbs and C. Arapis. “TELEPORT-An Augmented Reality Teleconferencing Environment”, 3rd Eurographics Workshop on Virtual Environments, Monte Carlo, Monaco, February 1996.
[2] C. Arapis, D. Konstantas and T. Pilioura. “Design Issues and Alternatives for Setting up Real-time Interactive Telelectures”, Proceedings of the 1998 ACM Symposium on Applied Computing (SAC 98), Atlanta, Georgia, February-March 1998, pp. 104-111.
[3] D.J. Gemmell and C.G. Bell. “Noncollaborative Telepresentations Come of Age”, Communications of the ACM, Vol. 40, No. 4, April 1997, pp. 79-89.
[4] D. Konstantas, Y. Orlarey, S. Gibbs and O. Carbonel. “Distributed Musical Rehearsal”, Proceedings of the International Computer Music Conference 97, Thessaloniki, Greece, September 1997.
[5] Stroud’s reviews of communications clients, http://cws.internet.com/32phone-reviews.html
[6] North Carolina State University’s Desktop Videoconferencing Product Survey, http://www3.ncsu.edu/dox/video/products.html
[7] D.J. Gemmell and C.G. Bell. “Noncollaborative Telepresentations Come of Age”, Communications of the ACM, Vol. 40, No. 4, April 1997, pp. 79-89.
[8] H.323 ITU-T Recommendation H.323 - Packet-based multimedia communications systems, http://www.itu.int/itudoc/itu-t/approved/h.html
[9] C. Arapis, D. Konstantas and T. Pilioura. “Design Issues and Alternatives for Setting up Real-time Interactive Telelectures”, Proceedings of the 1998 ACM Symposium on Applied Computing (SAC 98), Atlanta, Georgia, February-March 1998, pp. 104-111.
[10] Mbone (IP Multicast) information, http://www.mbone.com/
[11] IPv6: Next generation IP protocol, The Internet Engineering Task Force, RFC 1883.
[12] Dorcey, T. CU-SeeMe Desktop VideoConferencing Software, in Connexions 9, 3 (March 1995)
[13] White Pine Software, http://www.wpine.com
[14] CU-SeeMe VR application. Immersive Desktop Teleconferencing, in the Proceedings of the 4th ACM International Multimedia Conference (Nov. 1996)
[15] Remote Language Teaching (ReLaTe), http://www.ex.ac.uk/pallas/relate/
[16] Robust-Audio Tool (RAT), http://www-mice.cs.ucl.ac.uk/mice/rat/
[17] VIdeo Conferencing tool (VIC), http://www-nrg.ee.lbl.gov/vic/
[18] MS NetMeeting application, http://www.microsoft.com/netmeeting/
[19] Internet Phone v.5.0, http://www.vocaltec.com/products/iphone5/index.html
[20] VocalTec Communications, http://www.vocaltec.com
[21] SpeakFreely, http://www.fourmilab.ch/speakfree/windows/
[22] PGP: Pretty Good Privacy by Simson Garfinkel, O’Reilly & Associates, 1994
[23] Real Networks, http://www.realnetworks.com
[24] WebPhone, http://connect.netspeak.com/product/webphone/index.html
[25] NetSpeak Corporation, http://connect.netspeak.com/
[26] VDOPhone, http://www.vdo.net/vdostore/vdophone.html
[27] VDOnet Corporation, http://www.vdo.net/corporate/
[28] OnLive! Traveler, http://www.onlive.com/prod/trav/about.html
[29] PowWow application, http://www.powwow.com/
[30] FreeTel, http://www.freetel.com/
[31] Net2Phone, http://www.net2phone.com/
[32] Vosaic streaming multimedia products, http://www.vosaic.com/products/index.html