SanKu JO <email@example.com>
Lawrence Berkeley National Laboratory
Pierce E. CANTRELL <firstname.lastname@example.org>
Texas A&M University
Multicasting is a bandwidth-efficient approach to transmitting real-time audio and video to multipoint receivers over the current Internet. The heterogeneous nature of the Internet, however, can cause a conflict between receivers in multiparty videoconferencing. In a real-time videoconferencing session with many participants, some receivers could suffer congestion when their link bandwidths are very limited or congested with heavy traffic. Because of different link capacities, a single multicast stream cannot always provide good service to all receivers. If the source transmits high-quality video data, receivers with low bandwidth will suffer from high packet loss, which causes a degradation in quality. If the source adapts the video quality to support receivers with low bandwidth, the participants with high bandwidth links will complain about the lower quality.
Layered multicast has been proposed as an effective solution to cope with the heterogeneity of network bandwidth for multiparty videoconferencing over the Internet. In layered multicast, a source transmits layered video streams through multiple channels. By adjusting the subscription level of layers, each receiver can dynamically adapt to its own capacity. In layered coding, a video source is divided into several sub-streams where a lower layer must be received before a higher layer can be decoded. Adding incremental layers in order increases the quality of the representation. In order to utilize a higher layer, all of the lower layers must be received.
Although significant research on layered multicast has been conducted, problems in scalability and implementation remain. Therefore, we are proposing a dynamic quality-of-service (QoS) control scheme based on a new architectural feature to support layered multicast videoconferencing over the current best-effort and heterogeneous Internet. The receiver adds and drops layers to reduce the packet-loss rate by reacting quickly to network congestion. Some other important issues, including information sharing and bandwidth fairness, are also considered in designing the architecture and QoS control scheme. The new approach is implemented on a videoconferencing tool, CafeMocha, which has been developed in the Texas A&M University Multimedia and Networking Laboratory. We will show the soundness of the proposed concepts through the experimental results.
An Internet videoconferencing source generates bursty and high bandwidth data that can cause congestion, particularly for receivers with low bandwidth. In delivering a multimedia stream from one source to many receivers, as in the broadcast of a lecture, or from many sources to many receivers, as in a videoconference, the traditional mechanism of sending a separate stream to each destination, so-called "unicasting," is obviously inefficient in its use of bandwidth. In contrast, "multicasting" is a bandwidth-efficient approach for transmitting real-time audio and video to multipoint receivers over the current Internet. By managing network bandwidth efficiently, multicasting  allows a single data stream to be directed and distributed to multiple receivers. Therefore, the number of videoconference receivers can be increased without increasing the bandwidth.
However, the heterogeneous nature of the Internet [2,3] makes videoconferencing a challenging research area. In a real-time videoconferencing session with many participants, some receivers can suffer congestion when a link or links on the path from the source to the receiver have limited bandwidth or become congested with other traffic. Because of the difference in available or connection bandwidth as shown in figure 1, a single multicast stream cannot always provide good service to all receivers. If the source transmits high-quality video data, receivers (R3 or R4) with low bandwidth will suffer from high packet loss, which causes a degradation in quality. If the source adapts the video quality to support receivers with low bandwidth, the participants (R1 or R2) with high bandwidth links will complain about the lower quality. Furthermore, network bandwidth varies dynamically based on network congestion. Besides link bandwidth heterogeneity, the receivers' terminal performance and interest in media type or video quality may be different.
Figure 1: Bandwidth Heterogeneity of Receivers
Several approaches have been suggested to cope with the heterogeneity of network bandwidth for multiparty videoconferencing over the Internet. One approach is to use a transcoding gateway [4,5]. Alternatively, simulcast has been suggested, in which a source sends multiple copies of the same signal at different rates, each on a separate multicast channel, to several groups of heterogeneous receivers. A more efficient approach, layered multicast [2,5,7,8,9,10,11], has also been proposed. Layered multicast combines layered video compression with Internet protocol (IP) multicast group transmission [1,12,13].
Shacham  proposed heterogeneous multicast combining layered video compression with a layered transmission system for heterogeneous multipoint communication. Deering  first suggested a practical scheme in which a source transmits multiple layer streams through multiple IP multicast groups. Each layer can be mapped onto a separate IP multicast group. Receivers can adjust their IP multicast group membership to adapt to network heterogeneity and network bandwidth variations. Turletti and Bolot  introduced the concept that layered multicast could be implemented through layered coding and IP multicast group membership control.
In layered multicast, a source simultaneously transmits several hierarchically layered streams, each on a separate multicast group. In this layered encoding scheme, the source signal is divided into a number of hierarchical layers, each of which represents an increasing level of quality. The base layer describes a basic quality level. Adding incremental layers in order increases the quality of the representation. Because of this hierarchical dependence, a lower layer is more important than a higher one. Each receiver can tune its own received quality by adjusting the number of layers it receives; it does so by sending Internet Group Management Protocol (IGMP) messages to join or drop a layer.
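As a minimal sketch of how a receiver might join or drop a layer, the following Python fragment maps each layer to a multicast group and changes group membership through the standard socket options; the operating system then emits the corresponding IGMP messages. The group addresses and the helper names (`LAYER_GROUPS`, `make_mreq`, `join_layer`, `drop_layer`) are illustrative assumptions, not part of any tool described here.

```python
import socket

# Hypothetical mapping of layer index -> IP multicast group (illustrative addresses).
LAYER_GROUPS = ["224.2.0.1", "224.2.0.2", "224.2.0.3"]


def make_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq structure: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_layer(sock: socket.socket, layer: int) -> None:
    """Join the layer's multicast group; the host stack sends the IGMP membership report."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    make_mreq(LAYER_GROUPS[layer]))


def drop_layer(sock: socket.socket, layer: int) -> None:
    """Leave the layer's multicast group; the host stack signals the leave via IGMP."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    make_mreq(LAYER_GROUPS[layer]))
```

A receiver would call `join_layer` on a bound UDP socket for each layer up to its subscription level, always starting from the base layer because higher layers cannot be decoded without the lower ones.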
On the basis of the general concept of layered multicast, many authors [2,5,7,9,10,11] have addressed issues in realizing effective Internet videoconferencing tools. For example, McCanne and colleagues [5,8,10] propose the receiver-driven layered multicast (RLM) protocol, which describes the first practical adaptation algorithm by a receiver. Wu and colleagues  propose ThinStreams, where a "thick," high-bandwidth video signal is split into several layers, each with an identical bandwidth, in order to reduce packet loss from joining a layer with too high a bandwidth. Li and colleagues  propose the layered video multicast with retransmission (LVMR) method, whereby the smart reliable multicast transport protocol (SRMTP) is used at the transport layer to retransmit lost data of MPEG-2 video stream. They also suggest a hierarchical architecture that locates a subnet agent (SA) in each subnet and an intermediate agent (IA) in every domain within a multicast group.
However, some problems exist with RLM, ThinStreams, and LVMR. The layer control algorithm of RLM is very susceptible to the high burstiness of video, resulting in significant packet loss. RLM's distributed information-sharing functions also reveal a scalability problem. The test packets for congestion detection in ThinStreams can lead to significant packet noise for all participants and a high source load in an IP multicast session with many participants. LVMR is based on layered MPEG-2 built on the three frame types: intraframe (I), predictive (P), and bidirectionally predictive (B). The accumulated bandwidth of the three layers ranges from 148 Kbps to 1.5 Mbps, which is not practical in the current Internet, where one multicast backbone (Mbone) video session is assumed to require less than 128 Kbps. Locating SAs and IAs throughout the entire IP multicast tree could result in an overhead similar to that of the tentative-join with failure notification scheme suggested by McCanne.
In this paper, we propose a new dynamic quality-of-service (QoS) scheme based on our layered multicast architecture, where a new network function, the agent, is designed to assist receivers that are members of the network group . The agent is located on the outgoing router of a homogeneous network group (HNG) in order to provide receivers with useful information. The HNG is defined as a set of subnets or networks that share an outgoing router connected to the Internet backbone. These subnets are linked through a homogeneous bandwidth or a bandwidth higher than the outgoing router's bandwidth. A layer control algorithm for adding and dropping layers at the receiver is developed to reduce packet loss by reacting quickly to network congestion.
Although much previous research [7,8,9,11], including some on RLM, ThinStreams, and LVMR, has been conducted for layered multicast videoconferencing, all of these studies have problems in scalability and implementation. Furthermore, both the RLM and ThinStreams protocols are designed and simulated based on a constant bit rate (CBR) video source where each layer has a fixed bandwidth, while many of the current layered video encoding algorithms are designed to be variable bit rate (VBR) (e.g., Taubman and Zakhor's 3-D Subband Coding, PVH Codec, and layered MPEG). Most of the videoconferencing tools for the Internet have also been implemented using VBR encoders.
Because the VBR or CBR nature of the video source is one of the most important factors in designing a layered multicast system, RLM and ThinStreams need to be reevaluated for a layered VBR source. A layered multicast architecture should be designed and developed with consideration for the VBR features from the beginning. Because a VBR source is bursty, layered multicast videoconferencing over the Internet for a VBR source has many research challenges.
We suggest a new layered multicast videoconferencing architecture based on a VBR source. In designing a new architecture, we have several fundamental goals. First, the new architecture should operate well for the layered VBR streams in use on the Internet today. Furthermore, the scalability of the approach should be considered for different network topologies, different numbers of sessions and receivers, and various layered encoders. Another design goal in this research is to avoid significant changes to the current network architecture in order to achieve timely deployment.
Considering the network environment and essential design goals, we propose a new architecture that we refer to as an agent-assisted layered multicast architecture (ALMA). As illustrated in figure 2, ALMA consists of several logical functions through the source, the agent, and the receiver.
Figure 2: System Block Diagram of ALMA
The basic function of the source is to split the video signal into hierarchical layers and transmit them through multiple Real-time Transport Protocol (RTP) channels, where each RTP channel uses its own IP multicast group address. In ALMA, the source also periodically transmits information about the video source through the RTP Control Protocol (RTCP) channel of the core layer, the lowest layer of the session. This information is useful to the agent and receivers for layer and bandwidth control.
When a new receiver joins the session, it subscribes to the core layer with two channels (RTP and RTCP). The receiver will ascertain features of the session based on information transferred through the RTCP channel of the core layer, that is, the lowest layer. It can then adjust parameter values for layer control. Because some information, such as the bit rate of a layer, varies with time, the source dynamically calculates these time-varying quantities and transmits them periodically on the RTCP channel of the core layer. The receiver can conduct adaptive layer control using this session information from the source. In particular, when bandwidth differences between layers are very large, as in MPEG, the bit rate information of each layer will be useful. In ALMA, a receiver can use the bit rate information as well as the bandwidth information from the agent to reduce the number of join experiments.
There are two kinds of information about the source: static and dynamic. The numbers of layers and media are static, whereas the bandwidth information for each layer is changeable in VBR media. Several examples of source information are session type, list of media, features of the media source, maximum numbers of each medium, and rate information for each layer. In ALMA, the source information is sent through the RTCP channel of the core layer.
A source may receive feedback through the RTCP channel from the agent for further control functions. Because the number of agents is limited when compared with the large number of receivers, the source can effectively gather useful information about the receivers from the agents without a feedback implosion problem.
In the ALMA block diagram shown earlier in figure 2, a receiver conducts the dynamic control of the layer subscription level in order to adapt the QoS to the current network capacity. The QoS control function will be discussed in detail in the next section. Receivers in an HNG are supported by the agent for their layer control, information sharing, and bandwidth control.
Based on the bottleneck point of the path between the source and the HNG, the session can be specified as being in one of two modes: ALMA-A or ALMA-B. When the bottleneck point is located anywhere other than the outgoing router of the HNG, the session mode is set to ALMA-A. If the bottleneck point is the outgoing router of the HNG, the session is set to the ALMA-B mode. A receiver behaves differently according to its session mode. For layer control, it checks the network status indicated by the packet-loss rate. When the network is congested, it drops its current highest layer.
When it does not observe any packet loss for a specific period, the receiver tries to add the next layer using an adaptive algorithm. A receiver retains the source information transferred through the RTCP channel of the core layer, and it can use this information to achieve better performance using the QoS control. When the session mode is ALMA-B, this source information can be very useful for successful join experiments. According to the result of the add or drop decision function, a join or leave message is transferred to the router through IGMPv2 . In adding or dropping layers, a receiver cooperates with other receivers in the same session through the HNG. This information sharing and cooperation is conducted through communication with the agent. Detailed explanations about layer control and other issues related to the receiver will be described in the following section.
To support receivers participating in a layered multicast session through an HNG, an agent was designed that could be located on the outgoing router of the HNG. The agent consists of two main functional blocks: the bandwidth manager (BAM) and the session information manager (SIM), as shown in figure 2. The BAM conducts functions related to bandwidth, including bandwidth monitoring, bandwidth reporting, and inter-session bandwidth control. The SIM functions include session information gathering, intra-session coordination, and preliminary session admission control.
When the agent is located in the router, the complexity of the agent should be minimized so as not to degrade the performance of the router during its own basic functions. The BAM can be implemented by using the existing bandwidth monitoring function supported by most routers and routing kernel programs . The BAM continuously collects statistics to characterize the link's multicast data streams. On the basis of both current and past transmission statistics, the agent estimates available bandwidth capacity from a known multicast rate limit or a maximum link capacity. Through communication with session participants, the SIM keeps track of information pertaining to all active sessions passing through the outgoing router in the HNG. The session information is shared by all receivers in the same session.
BAM conducts three functions: bandwidth monitoring, bandwidth reporting, and inter-session bandwidth control. Most multicast routers  calculate the total number of bytes transmitted. We can use this information to find available bandwidth at a specific time and report it to the receivers.
As Woodings  points out, an effective bandwidth estimator must be selected to track subnet multicast traffic. Simply selecting the most recent bandwidth sample is not reliable, since VBR data streams are bursty in nature. For instance, a sudden motion in a frame could result in poor compression and a large amount of data being transmitted for a single frame, even though previous and future frames might compress well. Similarly, a high-motion sequence might have two very similar sequential frames that compress highly despite the overall compression rate being poor. A more advantageous technique is to employ a filter to smooth out the statistics between samplings.
In this equation, ω2 is the current bandwidth estimate, ω1 is the new measurement, and σ is a weighting factor. Decreasing σ weights the previous data more heavily. Regardless of the estimation technique used, the sampling frequency should be chosen carefully. Sampling more often than every 5 seconds typically makes the estimator too sensitive to transmission transients. Sampling less often than every 20 seconds prevents the estimator from adapting to true, long-term changes in network traffic. The values of σ and the sampling frequency are selected experimentally to characterize the link bandwidth capacity. The available bandwidth in the router, R_available^router, can be calculated by (2), in which R_maximum^router indicates the maximum bandwidth of the router.
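Equations (1) and (2) are not reproduced in this text, so the following Python sketch rests on an assumed form that matches the description: an exponentially weighted filter in which σ multiplies the new measurement, and an available bandwidth equal to the router maximum minus the filtered estimate. The class and function names are hypothetical.

```python
class BandwidthEstimator:
    """Exponentially weighted filter over periodic bandwidth samples (assumed form of (1))."""

    def __init__(self, sigma: float, initial_estimate: float = 0.0) -> None:
        if not 0.0 < sigma <= 1.0:
            raise ValueError("sigma must be in (0, 1]")
        self.sigma = sigma                 # weighting factor
        self.estimate = initial_estimate   # current estimate (omega_2 in the text)

    def update(self, measurement: float) -> float:
        """Blend a new measurement (omega_1) into the estimate.

        A smaller sigma gives the previous estimate more weight, as the text states.
        """
        self.estimate = (1.0 - self.sigma) * self.estimate + self.sigma * measurement
        return self.estimate


def available_bandwidth(router_maximum: float, estimate: float) -> float:
    """Assumed form of (2): maximum router bandwidth minus estimated usage, floored at zero."""
    return max(0.0, router_maximum - estimate)
```

For instance, with σ = 0.25, a prior estimate of 100 Kbps, and a new sample of 200 Kbps, the filter moves the estimate only a quarter of the way toward the sample, which damps the single-frame bursts typical of VBR video.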
This information is reported to receivers upon request. The receiver can use this information for two functions: bottleneck probing and layer control. For layer control, this information is useful only for receivers in a session where the bottleneck is the outgoing router in the HNG. The source can also use this information for controlling the distribution scope of each layer in the session.
When the bottleneck point of multiple sessions is the outgoing router, the agent can control the bandwidth of each session for fairness and weighted assignment of bandwidth or session priority. The BAM uses session information collected by the SIM for this bandwidth control function. The receivers in sessions whose bottleneck locations are not the outgoing router of the HNG are not controlled by this function of the agent. The receiver must inform the agent of its mode.
SIM conducts session information gathering and reporting, intra-session coordination, and preliminary session admission control. The agent collects session information from the receivers participating in multicast sessions in the HNG. Every time a receiver joins or leaves a session, it reports its status to the SIM. SIM maintains information for each session as well as global information. The IP multicast group address of every session's core channel is used as a key in the session information records. SIM maintains a structure for each session to keep the following information: the total number of participants, the current session mode, the latest subscription layer level, and the latest join trial time. SIM also keeps other static information, such as the maximum number of layers in the session. From the information for all sessions, SIM can obtain such global information as the total number of active sessions.
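The per-session bookkeeping described above could be organized roughly as follows. This is only a sketch: the field names, the `SessionInformationManager` class, and its methods are assumptions for illustration, with the core channel's multicast group address used as the dictionary key as the text describes.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SessionRecord:
    """Per-session information kept by the SIM (field names are illustrative)."""
    participants: int = 0
    mode: str = "ALMA-A"            # current session mode: "ALMA-A" or "ALMA-B"
    subscription_level: int = 1     # latest subscription layer level
    last_join_trial: float = 0.0    # time of the latest join trial
    max_layers: int = 1             # static: maximum number of layers in the session


class SessionInformationManager:
    def __init__(self) -> None:
        # Keyed by the IP multicast group address of each session's core channel.
        self.sessions: Dict[str, SessionRecord] = {}

    def report_join(self, core_group: str, mode: str, max_layers: int) -> SessionRecord:
        """Record that one more receiver in the HNG joined the session."""
        record = self.sessions.setdefault(
            core_group, SessionRecord(mode=mode, max_layers=max_layers))
        record.participants += 1
        record.mode = mode
        return record

    def report_leave(self, core_group: str) -> Optional[SessionRecord]:
        """Record that a receiver left; forget the session when nobody remains."""
        record = self.sessions.get(core_group)
        if record is None:
            return None
        record.participants -= 1
        if record.participants <= 0:
            del self.sessions[core_group]
        return record

    def active_sessions(self) -> int:
        """Global information derived from the per-session records."""
        return len(self.sessions)
```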
SIM conducts intra-session information sharing or coordination among the receivers in a session based on the gathered session information. The session information maintained by the SIM can also be used by the BAM for bandwidth control among multiple sessions sharing the bottleneck bandwidth of the outgoing router in the HNG.
SIM can also support simple admission control for a new session in the HNG. When a receiver wants to join a new session that is not yet active through the HNG, the SIM can check the total number of sessions and their subscription levels. If there are sessions subscribed to multiple layers, the SIM can accept the new session. When all sessions are subscribed to only the core layer and the available bandwidth is not sufficient, it can recommend that the receiver join the new session later.
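The admission rule can be sketched in a few lines of Python. The text does not define what counts as sufficient bandwidth, so this sketch assumes it means at least the core-layer rate of the new session; the function and parameter names are hypothetical.

```python
from typing import Sequence


def admit_new_session(subscription_levels: Sequence[int],
                      available_bw: float,
                      new_core_rate: float) -> bool:
    """Preliminary admission check for a session not yet active in the HNG.

    subscription_levels: current layer count of each active session (1 = core only).
    available_bw: available bandwidth reported for the outgoing router.
    new_core_rate: assumed bandwidth needed for the new session's core layer.
    """
    # Some session holds more than its core layer, so it can give up a layer
    # to make room for the newcomer.
    if any(level > 1 for level in subscription_levels):
        return True
    # Every session is already at the core layer: admit only if bandwidth suffices.
    return available_bw >= new_core_rate
```

A return value of `False` corresponds to recommending that the receiver try again later rather than to a hard rejection.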
We will describe the dynamic QoS control function achieved by adjusting the subscription level in the receivers. Each receiver can react to network traffic variations and source burstiness. We will also explain the finite state machine for layer control and the learning algorithm, which is designed to reduce failed experiments. Although we investigated several issues involved in the scalability of ALMA , such as intra-session information sharing and inter-session bandwidth control, in this paper we will concentrate on the dynamic QoS control by a receiver.
Dynamic QoS control by rate adaptation through layer control is one of the key functions in layered multicast. In a layered video encoder, even a small amount of packet loss in the base layer results in the inability to use the corresponding data received for higher layers. Therefore, the new layer control scheme we propose aims to reduce this loss while maintaining a high level of bandwidth utilization. Thus, we developed an active layer control function that reduces the loss rate by reacting quickly to congestion and by reducing failed join experiments.
As we explained in the previous section, the agent can be located in an outgoing router covering an HNG. According to the results of the bottleneck probing function, the session is specified as being in one of two modes: ALMA-A or ALMA-B. Regardless of its session mode (i.e., ALMA-A or ALMA-B), a receiver will drop the current highest layer without any time delay if it suffers high packet loss. When a receiver is in the ALMA-A mode, it makes a join experiment through the adaptive algorithm, which is designed to reduce failed experiments. To avoid excessive join experiment failures, an exponentially increasing delay is imposed between failed experiments.
If a receiver is in the ALMA-B mode, a receiver will use both the layer rate information from the source and a bandwidth report from the BAM. Only if the available bandwidth is greater than the bandwidth required to join the higher layer will a receiver perform a join experiment. Using the RTCP channel  of the core layer, the source periodically distributes its stream information, including the estimated bit rate of each layer, the maximum layer number, and the layer assignment of multiple media.
Because the source is VBR, the estimated rate of each layer dynamically changes with time. The periodic bit rate information from the source is very useful to a receiver in its decision to join a higher layer, especially when the bandwidth difference between layers is very large, such as with layered MPEG . Even when the session mode is ALMA-A, the source information could be very useful for efficient layer control. With the source rate information, ALMA can be applied to other layered video encoders with various layer rate distributions. In much of the previous research in this area, layer control is performed without knowledge of the source characteristics [7,11].
The basic dynamic QoS control in ALMA is achieved through a finite state machine. Depending on the receiver mode (ALMA-A or ALMA-B), several state transition conditions are slightly different. After explaining the common part of the state machine, we will describe differences between ALMA-A and ALMA-B.
When we began designing the QoS control function, we had many states with various complicated state transition conditions. We intended the receiver to react in a sophisticated way to a VBR source and to variations in network bandwidth in order to achieve good performance. However, the bursty nature of VBR produced very different results from one experiment to the next, without any consistency. Therefore, we decided to develop a layer control algorithm with consistent performance through a simpler state machine. Figure 3 shows the state transition diagram for dynamic QoS control by a receiver in ALMA-A. It has four states: ADD, DROP, STABLE, and UNSTABLE. A receiver can be in any of these states depending on the state transition conditions, such as the observed packet-loss rate and certain timers.
Figure 3: State Transition Diagram of ALMA-A
Table 1: Packet-Loss Threshold Values and Network State Conditions
|Packet-Loss Threshold Values|
|λ||Packet loss observed by a receiver|
|λ_ic||Any instant loss after adding a layer|
|λ_u||Loss rate for underloaded status|
|λ_c||Loss rate for congested status|
|λ_hc||Loss rate for highly congested status|
|Network Status and Conditions|
|INSTANTLY LOADED||Any packet loss caused by adding a layer|
|UNDERLOADED||λ < λ_u|
|LOADED||λ_u ≤ λ < λ_c|
|CONGESTED||λ_c ≤ λ < λ_hc|
|HIGHLY CONGESTED||λ ≥ λ_hc|
For the most basic state transition conditions, we check the network status, which is indicated dynamically by the loss rate a receiver observes over a specific period, as Buss and colleagues did. Table 1 lists the packet-loss rate thresholds that distinguish five network conditions. The variable λ_ic denotes any packet loss just after a join experiment: even a single packet loss observed by a receiver in the ADD state marks the network status as INSTANTLY LOADED. The threshold λ_u bounds the UNDERLOADED condition, in which the loss rate is acceptable to the receivers. When a receiver suffers a packet-loss rate of at least λ_u but less than λ_c, it interprets the network as LOADED. If the loss rate is at least λ_c but less than λ_hc, the receiver interprets the network as CONGESTED; in this status, video quality is likely to worsen if the receiver remains subscribed at the current level. The HIGHLY CONGESTED status, indicated by λ ≥ λ_hc, is the condition in which a receiver in the STABLE state drops a layer instantly. A receiver in the STABLE state should not react sensitively to very transient changes in the network or in the source bit rate, and this transition condition lets the STABLE state absorb such transient congestion. Transient congestion can be caused by a sharp increase in the bit rate of a VBR source, by the join experiment of other receivers to a problematic layer with a high bit rate, or by an increase in other network traffic.
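The mapping from an observed loss rate to the network statuses of Table 1 can be written directly as a chain of comparisons. The sketch below is illustrative: the function name and the threshold arguments are assumptions, and the threshold values themselves are not given in this text. The INSTANTLY LOADED status is not covered here because it depends on the receiver being in the ADD state rather than on a threshold.

```python
def classify_network(loss_rate: float, lam_u: float, lam_c: float, lam_hc: float) -> str:
    """Map an observed loss rate to a network status using the Table 1 conditions."""
    if not lam_u <= lam_c <= lam_hc:
        raise ValueError("thresholds must satisfy lam_u <= lam_c <= lam_hc")
    if loss_rate < lam_u:
        return "UNDERLOADED"
    if loss_rate < lam_c:
        return "LOADED"
    if loss_rate < lam_hc:
        return "CONGESTED"
    return "HIGHLY CONGESTED"
```

Because each lower bound is inclusive, a loss rate exactly equal to a threshold falls into the more severe of the two neighboring statuses.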
The threshold values in Table 1 were chosen as a result of several experiments. When we chose large loss rate values for these parameters, we could observe high performance in a receiver at the cost of higher packet loss. The values can be adjusted according to the features offered by the layered encoder. For example, if the layered encoder is very loss resilient, the loss threshold acceptable to a receiver can be increased. Because the loss rate, λ, is used by the layer control function to decide the network state, the state transition is decided by the observed loss rate, which is calculated by (3) from the numbers of received and lost packets during ΔT.
Because every packet is sent using RTP with its sequence number, a receiver can count the total numbers of received and lost packets for a specific period. A receiver checks the loss rate once every network state checking period, denoted T_ns. Normally, ΔT and T_ns have the same value. ΔT determines the sensitivity to loss, and T_ns determines how frequently state transitions can occur. When a state transition occurs, both T_ns and the packet information are reset. Whenever a packet is received, information about it is saved in a linked list. Using the sequence number of the packet, we can check for any losses. Furthermore, the loss rate is updated every time a new packet arrives.
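Equation (3) is not reproduced in this text; the sketch below assumes the usual definition, lost packets divided by the sum of received and lost packets within the measurement window. For brevity it keeps sequence numbers in a set rather than the linked list mentioned above, and it ignores 16-bit sequence-number wraparound. The class name `LossMonitor` is hypothetical.

```python
class LossMonitor:
    """Tracks RTP sequence numbers in the current window and derives the loss rate."""

    def __init__(self) -> None:
        self.seen = set()

    def reset(self) -> None:
        """Clear packet information, as done on every state transition."""
        self.seen = set()

    def on_packet(self, seq: int) -> None:
        """Record the sequence number of a received packet (duplicates are ignored)."""
        self.seen.add(seq)

    def loss_rate(self) -> float:
        """Lost / (received + lost), using gaps in the observed sequence range."""
        if not self.seen:
            return 0.0
        expected = max(self.seen) - min(self.seen) + 1   # received + lost
        lost = expected - len(self.seen)
        return lost / expected
```

Losses before the first or after the last packet seen in a window are invisible to this gap-based count, which is one reason a production receiver would carry the last sequence number across windows.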
Because of the hierarchical dependency of a layered encoder in a layered multicast videoconference, reducing the loss rate is more important than obtaining more data at the cost of a larger packet loss. Therefore, one of the main goals of the state machine is to reduce the packet-loss rate by reacting quickly to network congestion.
Receivers are designed so that they can drop the current highest subscribed layer in any state and transition to the DROP state. However, the state transition conditions needed to drop the current highest layer differ slightly depending on the current state. When a receiver attempts to join a layer by changing to the ADD state and observes any packet loss whatsoever, it drops the layer being added because the probability that the loss is caused by the new layer is very high. If the network is LOADED or CONGESTED for time ΔT in either the STABLE or ADD state, the state changes to UNSTABLE. If the network is UNDERLOADED, even after adding a layer, the state changes to STABLE.
When the receiver is in the UNSTABLE state, it will drop the highest layer when it experiences a CONGESTED condition. If the receiver observes that the network is LOADED, it remains in the UNSTABLE state. To delay the join trial for the next layer, the receiver backs off the join timer T_J^(N+1), where N denotes the current subscription level. If the link is UNDERLOADED, the state is changed to STABLE. A receiver in the DROP state will drop the current layer if the link is LOADED or CONGESTED. If the loss rate is less than λ_u, the state is changed to UNSTABLE. Therefore, a receiver needs several T_ns periods to reach the STABLE state after it drops a layer.
The layer control function is also designed to guard against interference from independent receivers in different sessions, so that a receiver in a STABLE state will not react to transient congestion caused by a join experiment conducted by other receivers in a different session. If the network is LOADED or CONGESTED, the state is changed to UNSTABLE.
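The common transitions described in this section can be collected into a single transition function. The sketch below is one reading of the prose, not a transcription of Figure 3: it treats any loss in the ADD state as grounds for dropping the new layer, assumes that a HIGHLY CONGESTED status also triggers a drop in the UNSTABLE and DROP states, and leaves out the mode-dependent STABLE to ADD transition. The function name and return convention are assumptions.

```python
from typing import Tuple


def step(state: str, status: str, any_loss: bool = False) -> Tuple[str, bool]:
    """Return (next_state, drop_layer) for one network state checking period.

    state:    "ADD", "DROP", "STABLE", or "UNSTABLE"
    status:   "UNDERLOADED", "LOADED", "CONGESTED", or "HIGHLY CONGESTED"
    any_loss: True if any packet was lost since the join (used in ADD only)
    """
    if state == "ADD":
        # Any loss right after a join is blamed on the new layer.
        if any_loss:
            return "DROP", True
        return "STABLE", False
    if state == "STABLE":
        # Only severe congestion forces an immediate drop; milder loss is absorbed.
        if status == "HIGHLY CONGESTED":
            return "DROP", True
        if status in ("LOADED", "CONGESTED"):
            return "UNSTABLE", False
        return "STABLE", False
    if state == "UNSTABLE":
        if status in ("CONGESTED", "HIGHLY CONGESTED"):
            return "DROP", True
        if status == "LOADED":
            return "UNSTABLE", False   # the join timer is backed off here
        return "STABLE", False
    if state == "DROP":
        # Keep shedding layers until the loss rate falls below the underloaded threshold.
        if status == "UNDERLOADED":
            return "UNSTABLE", False
        return "DROP", True
    raise ValueError(f"unknown state: {state}")
```

The caller is responsible for acting on `drop_layer`, for never dropping the core layer, and for resetting the loss counters after each transition.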
Because the state transition from STABLE to ADD differs according to whether the receiver is in the ALMA-A or ALMA-B mode, it will be discussed separately. In addition, relaxation and back-off are functions of the learning algorithm. The term relaxation means multiplicatively decreasing the join timer T_J^(i) using a specific relaxation constant. The term back-off means multiplicatively increasing T_J^(i) using a specific back-off constant. Relaxation occurs only in the STABLE state. After the receiver joins the next higher layer, it moves to STABLE if the network is UNDERLOADED for T_ns. If the STABLE state is kept for the next T_ns, relaxation occurs. Back-off takes place during any transition to DROP.
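Relaxation and back-off amount to multiplying a per-layer join timer by a constant. In the sketch below, the constants (0.5 and 2.0) and the clamping bounds are illustrative assumptions, since the text gives no values; the class name `JoinTimer` is also hypothetical.

```python
class JoinTimer:
    """Join timer for one layer, adjusted multiplicatively by the learning algorithm."""

    def __init__(self, initial: float, minimum: float, maximum: float,
                 backoff_factor: float = 2.0, relax_factor: float = 0.5) -> None:
        self.value = initial
        self.minimum = minimum              # assumed lower bound on the timer
        self.maximum = maximum              # assumed upper bound on the timer
        self.backoff_factor = backoff_factor
        self.relax_factor = relax_factor

    def back_off(self) -> float:
        """Multiplicatively increase the timer, e.g. on a transition to DROP."""
        self.value = min(self.maximum, self.value * self.backoff_factor)
        return self.value

    def relax(self) -> float:
        """Multiplicatively decrease the timer after a sustained STABLE period."""
        self.value = max(self.minimum, self.value * self.relax_factor)
        return self.value
```

Repeated failures therefore space join experiments exponentially further apart, while sustained stability gradually shortens the wait again.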
If the session is in the ALMA-A mode and the current state is STABLE, a receiver will perform a join experiment for the next higher layer (N+1) if the join timer for that layer, T_J^(N+1), has expired and if N < N_localmax, where N_localmax is the local maximum layer specified by the user based on the local machine's performance and the user's interest in video quality level or media types. A receiver depends on the learning algorithm to select a time for adding a layer. If the link is UNDERLOADED for time T_ns while in the STABLE state, the join timer for the current layer, T_J^(N), is relaxed.
If the result of the bottleneck probing function indicates that the outgoing router of the HNG is the congestion point of a session, a receiver in the session will operate its state machine in the ALMA-B mode, especially for join trials (as illustrated in Figure 4).
If a receiver in the ALMA-B mode is in the STABLE state, the join timer for the next layer, TJN+1, has expired, and N < Nlocalmax, the receiver will request a bandwidth report from the BAM. If the available bandwidth in the outgoing router meets the join trial condition given below, the receiver will perform a join experiment.
In (4), Brouteravailable indicates the available bandwidth in the outgoing router. BN+1 is the average bit rate of the (N+1)th layer, which is sent from the source through the RTCP channel, and γ is the weighting decision parameter used to adjust the probability of a successful join experiment. If this parameter value is very high, the frequency of join experiments decreases while the probability of success increases. In our research, we set γ to 1. This value may be chosen according to the bit rate and the burstiness of each layer. In deciding whether a receiver should perform a join trial based on the available bandwidth, we may adopt a more sophisticated approach, such as a measurement-based admission control algorithm for predictive service. (This is a future research topic.)
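The join decision in the two modes can be sketched as follows. The bandwidth test is written in the form suggested by the definitions above, namely that the available router bandwidth must exceed the weighting decision parameter (called `gamma` here) times the average rate of layer N+1; the exact form of the comparison, such as strict versus non-strict inequality, is an assumption, and all function and parameter names are hypothetical.

```python
def join_preconditions(is_stable, timer_expired, n, n_local_max):
    """Conditions shared by both modes before a join experiment is considered."""
    return is_stable and timer_expired and n < n_local_max


def alma_b_join_allowed(available_kbps, next_layer_kbps, gamma=1.0):
    """Assumed form of the join trial condition: available > gamma * B_(N+1)."""
    return available_kbps > gamma * next_layer_kbps


def should_join(mode, is_stable, timer_expired, n, n_local_max,
                available_kbps=0.0, next_layer_kbps=0.0, gamma=1.0):
    """ALMA-A joins on the timer alone; ALMA-B also consults the bandwidth report."""
    if not join_preconditions(is_stable, timer_expired, n, n_local_max):
        return False
    if mode == "ALMA-B":
        return alma_b_join_allowed(available_kbps, next_layer_kbps, gamma)
    return True
```

For example, with 50 Kbps available and a 40 Kbps next layer, the trial is allowed for `gamma = 1.0` but suppressed for `gamma = 1.5`, which illustrates how a larger value trades join frequency for a higher chance of success.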
If the link is UNDERLOADED for time Tns in the STABLE state, the join timer for the current layer, TJN, is relaxed as in ALMA-A. If Brouteravailable is not sufficient and the join experiment is suppressed, the same relaxation occurs in the ALMA-B mode.
Figure 4: State Transition Diagram between STABLE and ADD in ALMA-B Mode
All of the functions for ALMA have been implemented in a real network environment. The source and receiver ALMA functions were merged with the CafeMocha tool with six VBR video layers. The agent functions execute in the local router kernel running on a FreeBSD computer. The performance of the proposed functions for layered multicast videoconferencing is evaluated with metrics including the observed loss rate and the subscription level of each receiver in a rate-limited environment. Experiments were conducted using the lab test bed with several topologies. In this paper, we present the experimental results in an ideal environment with a single source and a single receiver for evaluating the dynamic QoS function.
We evaluated the ALMA functions by means of experiments in the test bed using video sources that were obtained directly from a videocassette recorder. In all of the experiments, a sequence from the movie When Harry Met Sally was used. The source output rate can vary slightly between different runs since we are running at a low frame rate of only three frames per second. All sequences are limited to 15 minutes.
Figure 5 illustrates the cumulative bandwidth for the movie When Harry Met Sally. The bit rates for layers 0 and 1 remain constant because the bit rate for layer 0 is limited to 20 Kbps. The cumulative bit rates of all but the two lowest layers vary widely from scene to scene.
Figure 5: Cumulative Bandwidth Rates of the Movie
An ideal layer subscription level can be obtained from the source bit rate for each layer and the bandwidth limit of 150 Kbps in the router, which we set as the maximum bandwidth for all experiments. The ideal subscription level at a specific point is the level whose cumulative bit rate is closest to but less than the router bandwidth limit. For a bandwidth limit of 150 Kbps in the router, the ideal subscription level is naturally very different for different sources. Considering the high variations in bit rate for the different sources, designing a QoS layer control for various video sources is a significant challenge.
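The definition of the ideal subscription level can be illustrated with a short sketch. The function name and the per-layer rates in the usage note are hypothetical; only the 150 Kbps limit and the rule "highest level whose cumulative rate stays below the limit" come from the text. Layers are numbered from 0, and -1 is returned (an assumed convention) when even the base layer does not fit.

```python
def ideal_level(layer_rates_kbps, limit_kbps=150.0):
    """Return the highest layer index whose cumulative rate is below the limit."""
    cumulative = 0.0
    level = -1  # assumed convention: -1 means no layer fits under the limit
    for index, rate in enumerate(layer_rates_kbps):
        cumulative += rate
        if cumulative >= limit_kbps:
            break
        level = index
    return level
```

For instance, with made-up instantaneous layer rates of 20, 20, 40, 60, 80, and 100 Kbps, the cumulative rates are 20, 40, 80, 140, and 220 Kbps for the first five layers, so the ideal level under a 150 Kbps limit is 3. Evaluating this per frame over the traced source rates yields the kind of ideal sample path against which the receiver is compared.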
We tracked a number of program state variables during every test. For each frame at the source, we saved the frame timestamp and the octet count on each encoded layer. With these data and the knowledge of the rate limit at the router, we were able to construct the "ideal" subscription sample path for the receiver. Using that ideal sample path, we then calculated the corresponding bandwidth received by a user under an ideal layer control mechanism. At the receiver for each frame received, we saved the local time, the frame's timestamp, the total number of octets received, the number of packets received, the number of packets lost during the reception of the frame, the received bytes, the lost bytes, and the level of subscription. We traced the same information for each layer. With these data, we were able to construct the receiver's subscription sample path and its received bandwidth. We then evaluated the effectiveness of the layer control algorithm by comparing the results to the "ideal" case. From the traced data, we were able to obtain the values that we needed to calculate the metrics. We denoted the ideal total data by Rsideal, the received total data by Rrsum, and the total data lost by Rrloss.
We used several metrics to evaluate the ALMA functions. In calculating the metrics, we took into account the features of hierarchically encoded data. In the layered streams, a higher layer can be decoded only if the lower layers are received, as explained above. The data received for one frame of the higher layers is useless if the loss rate of the base layer is over a threshold that is a function of the loss resilience of the codec. In some codecs that are sensitive to even a small packet-loss rate, we may not be able to decode all the data of the base layer when a receiver suffers packet loss; furthermore, all the data from the higher layers are wasted in this case. The wasted data rate can be determined based on the degree of hierarchical dependency and the effect of loss rate on the layered encoders.
In the CafeMocha videoconferencing tool, we could observe many "jerky" blocks even with packet-loss rates less than 10 percent. Therefore, we used a total loss rate per frame of 15 percent as the threshold in calculating the wasted data rate. For a frame with a total loss rate less than 15 percent, we did not regard any data of higher layers as being wasted. We can get the effective data rate based on this feature. The "effective" data means the received data without any wasted data. The total effective received data, Reeff, can be calculated by (5), in which Rrnoise is the total data wasted because of the high packet-loss rate.
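The effective-data calculation can be sketched from per-frame trace records. This sketch assumes that the effective data is the total received data minus the wasted data, and that for a frame at or above the 15 percent threshold the wasted portion is everything received above the base layer; the `Frame` record and its field names are hypothetical and are not the trace format of the tool.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    received_bytes: int       # bytes received over all subscribed layers
    lost_bytes: int           # bytes lost over all subscribed layers
    base_received_bytes: int  # part of received_bytes that belongs to layer 0


def effective_data(frames, threshold=0.15):
    """Return (effective_bytes, wasted_bytes) over a list of Frame records."""
    total = sum(f.received_bytes for f in frames)
    wasted = 0
    for f in frames:
        sent = f.received_bytes + f.lost_bytes
        loss_rate = f.lost_bytes / sent if sent else 0.0
        if loss_rate >= threshold:
            # Assumption: only data above the base layer counts as wasted.
            wasted += f.received_bytes - f.base_received_bytes
    return total - wasted, wasted
```

As a usage example, a loss-free frame of 1000 bytes followed by a frame with 800 bytes received and 200 bytes lost (20 percent loss, 200 bytes of it base layer) gives 600 wasted bytes and 1200 effective bytes.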
Using the effective data rate, we can calculate some other metrics. With the assigned bandwidth limit in the router, we were able to calculate the cumulative ideal rates, Rsideal, that a receiver can obtain using the source bit rates.
We also calculate the performance rate with noise, Pn, and the loss rate with noise, Ln, without considering wasted data. We get the noise frame rate, Fn, that represents the percentage of video frames with noise.
The results achieved with the dynamic QoS control of ALMA through the layer control function are presented and analyzed. All of the results shown were gathered by experiments in our test bed using an implementation of the ALMA system in the CafeMocha videoconferencing suite. We performed six experiments using the two modes of ALMA with the same source, a 15-minute sequence taken from the movie When Harry Met Sally. The results of the basic functions of ALMA-A and ALMA-B are examined and compared with those of RLM as implemented by Gholmieh.
When the outgoing router with the agent is not the bottleneck on the path between the source and the receiver, or when the agent is not available, a receiver will run the dynamic QoS control in the ALMA-A mode. Even though the outgoing router connecting the HNG to the Internet backbone is likely to be a bottleneck, the congestion point could be located somewhere else along the path rather than at the router because of a change in Internet traffic somewhere along the path. Therefore, a receiver needs to check the bottleneck point frequently through the bottleneck probing function. If the result of the probing function shows that the congestion point is not the outgoing router of the HNG, a receiver will then switch to the ALMA-A mode.
Figure 6: Results of Dynamic QoS Control in the ALMA-A Mode
Figure 6(a) shows the ideal and observed subscription-level paths. The thin line represents the ideal subscription level, and the thick one indicates the observed subscription levels at the receiver in the ALMA-A mode. When the video source is static with little motion, the data rate is low and the ideal subscription level is high, as shown in the interval between 100 and 200 seconds. When the video scene is highly changeable, the ideal subscription level is low, as shown in the interval between 480 and 600 seconds. Therefore, the ideal subscription level reflects the burstiness of the source under the bandwidth limit of 150 Kbps in the router bottleneck. Figure 6(b) presents the ideal data rate and the observed data rate. The graph for the ideal data rate has been inferred in the same way as the graph for the ideal subscription level. When the bit rate observed by the receiver exceeds the ideal rate, congestion will occur in the router and cause packet loss. We can see this phenomenon in Figure 6(c), which shows the packet losses measured at the receiver. When the observed subscription level is higher than the ideal one, we can see a high data rate in Figure 6(b), causing the high packet losses seen in Figure 6(c).
There are about nine failed join experiments that cause transient congestion in Figure 6(a), which can degrade video quality at these times: 50, 280, 300, 390, 500, 530, 560, 720, and 800 seconds. Repeated join failures can be seen between 480 and 600 seconds. These failed join experiments are inevitable because we do not have any mechanism to determine the bandwidth availability of the link. Only the learning algorithm can work in this situation, as observed between 480 and 600 seconds.
When the source rate sharply increases, as it does between 200 and 240 seconds, a receiver will need to drop several layers consecutively. Because the receiver observes a high packet loss, it will drop a layer instantly and change the state from STABLE to DROP. Even after the first drop has occurred, the receiver will need to drop another layer because packet loss continues in the DROP state. When a receiver drops layers one after another, as in this case, the receiver will experience continuous packet loss because of a combination of the leave latency and Tns. A receiver needs to wait three seconds to prune the dropped layer and an additional Tns to check the network state after the previous drop has completed. It will take at least Tns + LEAVE_EXPIRE_TIME [12,18] for another drop to occur after the first drop is finished. Therefore, the loss observed between 200 and 240 seconds is inevitable, even though the layer control function of ALMA is designed to react quickly to source burstiness and network congestion.
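The timing argument above amounts to simple arithmetic, sketched below. The three-second leave latency comes from the text; the value used for Tns in the usage note is purely hypothetical, and the function name is illustrative.

```python
LEAVE_EXPIRE_TIME_S = 3.0  # leave latency stated in the text


def min_span_between_first_and_last_drop(num_drops, tns_s,
                                         leave_expire_s=LEAVE_EXPIRE_TIME_S):
    """Lower bound on the time between the first and the last of several drops.

    Each further drop can follow the previous one no sooner than
    Tns + LEAVE_EXPIRE_TIME, so k drops span at least (k - 1) such gaps.
    """
    if num_drops < 1:
        raise ValueError("num_drops must be at least 1")
    return (num_drops - 1) * (tns_s + leave_expire_s)
```

For example, if Tns were 2 seconds, shedding three layers in a row would take at least 2 × (2 + 3) = 10 seconds between the first and third drop, during which the receiver continues to see loss.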
When the join experiment fails, a receiver will drop the layer instantly. However, the packet loss will still be observed for the three seconds of LEAVE_EXPIRE_TIME. Therefore, the leave latency in IGMP is a very important factor in deciding the performance of layered multicast. Although some packet loss can be observed around 415 seconds, layer dropping does not happen because the receiver is in the STABLE state and the loss rate is less than λhc. Therefore, we can keep a receiver from reacting too sensitively to very short source burstiness and network congestion. When a receiver repeatedly fails to join a layer, it will multiplicatively increase the join timer for the layer by backing off. This function is visible between 450 and 600 seconds.
ALMA-B represents the case where the outgoing router with the agent is a distinctive bottleneck in the path between the source and the receivers in the HNG. In this case, a receiver can use the bandwidth information available from the BAM for the join experiments. The receiver will decide if it can perform or suppress a join experiment, based on (4).
Figure 7: Results of Dynamic QoS Control in ALMA-B Mode
Figure 7(a) shows the ideal and observed subscription level sample paths, Figure 7(b) shows the ideal and measured bandwidth usage, and Figure 7(c) shows the packet losses measured at the receiver. Compared with the results of experiments using ALMA-A, the number of failed join experiments is reduced from nine to only two at around 300 seconds. Because a receiver will perform a join trial only if the available bandwidth in the router is greater than the bandwidth of the next layer to be added, it can reduce the number of failed join experiments. This effect is very noticeable between 480 and 600 seconds. During this period in ALMA-A, there were three failed join experiments whereas there were no join experiments undertaken in the ALMA-B mode.
The loss rate and the frequency of loss events have been greatly reduced, as can be seen by comparing Figure 7(c) with Figure 6(c). Even though a receiver can get help from the BAM, it cannot perform perfect join experiments because of burstiness in the source and variations in the network bandwidth. The effectiveness of the BAM can be affected by the parameters used for bandwidth measurement, such as the bandwidth measurement time interval in the agent and the decision parameter, γ, in the layer control algorithm.
A performance comparison of the dynamic layer control function of ALMA and RLM was conducted using RLM results from Gholmieh's CafeMocha RLM implementation. Figure 8(a) shows the ideal and observed subscription-level sample paths, Figure 8(b) shows the ideal and measured bandwidth usage, and Figure 8(c) shows the packet losses measured at the receiver.
Figure 8: Results of Dynamic QoS Control of RLM
There were 11 failed join experiments using RLM that caused transient congestion with high packet losses around 10, 50, 210, 300, 320, 350, 510, 520, 560, 650, and 720 seconds. As in the ALMA-A mode, this problem is particularly significant between 490 and 560 seconds in Figure 8(c). Because ALMA-A and RLM perform join trials based on only local data from the learning algorithm, frequent failed experiments are inevitable.
In RLM, the detection timer can cause a delayed drop that will lead to a high packet loss, as shown between 200 and 300 seconds in Figure 8(c). Although this will make a receiver robust to transient congestion, it will also make RLM vulnerable to bursty video sources and large variations in network traffic. The high packet loss around 55 seconds could also be caused by the long detection timer. When a receiver drops a layer using RLM, the packet loss will be much higher than that in ALMA because of this timer, as shown around 410, 490, and 790 seconds in Figure 8(c).
Loss rates for most of the loss events in RLM shown in Figure 8(c) are higher than those of the loss events in ALMA-A shown in Figure 6(c). If we have a loss-resilient layered codec that can recover from 20 percent packet loss, for example, these differences in the magnitude of the loss events result in very different QoS for the two approaches.
Table 2 shows the average performance of the dynamic QoS control functions of ALMA-A, ALMA-B, and RLM for six experimental trials. Using the same source, we tested the layer adaptation functions of RLM and ALMA. When we compared the mean values, the performance rate with noise, Pn, of RLM was 10 percent greater than those of the two modes of ALMA. When we subtracted the wasted data caused by high packet losses, the effective performance rate, Pe, of RLM was slightly higher than those of the two ALMA cases.
Table 2: Results of Six Experiments with RLM, ALMA-A, and ALMA-B
The loss rate with noise, Ln, for RLM is two times greater than those of the two ALMA modes. If many sessions exist through the router in the HNG, this large waste in bandwidth could be very critical. The effective loss rate, Le, of RLM is 10 percent higher than that of the ALMA-A mode and 14 percent higher than that of the ALMA-B mode. The fact that the loss rate of RLM is higher than those of the two ALMA modes means that the video quality at the receiver is much worse in RLM than it is in the two ALMA modes. Although RLM produces a Pe slightly higher than those of ALMA-A and ALMA-B, it is achieved at the cost of a higher packet loss and greater waste.
The high loss rate will also lead to a high wasted-data rate, Wn, because of the hierarchical dependency of the layered streams and the best-effort delivery of the current Internet. The values of the noise frame rate with bad video quality, Fn, indicate that a participant may suffer noise or bad video quality for 113 seconds in RLM, for 65 seconds in ALMA-A, and for 45 seconds in ALMA-B out of a total of 900 seconds in the experiment.
The average maximum congestion duration in RLM is 33.7 seconds, which is caused by the delayed drop. This is much higher than the 15.7 and 17.0 seconds of ALMA-A and ALMA-B. In the worst case, a receiver running RLM will suffer significant packet losses for more than 40 seconds. The cumulative total time involved in congestion and loss events in RLM is much greater than that of the two modes of ALMA: 153 seconds for RLM, 86 seconds for ALMA-A, and 58 seconds for ALMA-B.
We presented the results of several experiments using the dynamic QoS control function of ALMA. We also compared RLM and ALMA. In most experiments, ALMA-B had better results with lower loss rates and a performance comparable to that of RLM. Without any help from the agent, the results for ALMA-A were better than those for RLM.
Although RLM produces a Pe slightly higher than those of ALMA-A and ALMA-B, this result was achieved at the cost of higher packet loss and waste. A receiver suffered significant packet loss when the source stream became very bursty, as shown between 200 and 300 seconds in Figure 8. Without using any function supported by the agent, ALMA-A reduced the loss rate to half that of RLM at the slight cost of a lower performance rate. ALMA-B reduced the loss rate more than ALMA-A did, while simultaneously obtaining higher performance rates. Although the difference in loss rate between ALMA-A and ALMA-B is not significant, the frequent failed join experiments in ALMA-A produced bad-quality video for viewers. Consequently, our dynamic QoS control function of ALMA is very effective with good performance, reducing the loss rate that is critical in layered multicasting.
[1] Stephen E. Deering, "Internet multicast routing: State of the art and open research issues," MICE seminar, The Swedish Institute of Computer Science, Stockholm, Sweden, Oct. 1993.
[2] Nachum Shacham, "Multipoint communication by hierarchically encoded data," in Proc. IEEE INFOCOM '92, May 1992, pp. 2107-2114.
[3] Ingo Busse, Bernd Deffner, and Henning Schulzrinne, "Dynamic QoS control of multimedia applications based on RTP," in Proc. First International Workshop on High Speed Networks and Open Distributed Platforms, St. Petersburg, Russia, June 1995.
[4] Thierry Turletti and Jean-Chrysostome Bolot, "Issues with multicast video distribution in heterogeneous packet networks," in Proc. 6th International Workshop on Packet Video, Portland, OR, Sept. 1994, pp. F3.1-F3.4.
[5] Elan Amir, Steven McCanne, and Hui Zhang, "An application level video gateway," in Proc. ACM Multimedia '95, San Francisco, CA, 1995, pp. 255-265.
[6] Xue Li and Mostafa Ammar, "Bandwidth control for replicated-stream multicast video distribution," in Proc. 5th International Symposium on High Performance Distributed Computing (HPDC-5), Syracuse, NY, Aug. 1996, IEEE.
[7] Steven McCanne, "Scalable compression and transmission of internet multicast video," Ph.D. dissertation, University of California, Berkeley, CA, Dec. 1996, http://HTTP.CS.Berkeley.EDU/ mccanne/csd-96-928.ps.gz.
[8] Elan Amir, Steven McCanne, and Randy Katz, "Receiver-driven bandwidth adaptation for light-weight session," in Proc. ACM Multimedia '97, Seattle, WA, Nov. 1997, http://www.cs.berkeley.edu/ elan/pubs/papers/ scuba-acm-mm97.ps.
[9] Linda Wu, Rosen Sharma, and Brian Smith, "ThinStreams: An architecture for multicasting layered video," in Proc. NOSSDAV'97, St. Louis, MO, May 1997, ACM.
[10] Steve McCanne, Van Jacobson, and Martin Vetterli, "Receiver-driven layered multicast," in Proc. SIGCOMM'96, Stanford, CA, Aug. 1996, ACM, pp. 117-130.
[11] Xue Li, Sanjoy Paul, and Mostafa Ammar, "Layered video multicast with retransmission (LVMR): Evaluation of hierarchical rate control," in Proc. IEEE INFOCOM '98, San Francisco, CA, Mar. 1998, pp. 1062-1072.
[12] William C. Fenner, "Internet group management protocol, version 2," RFC 2236, Inter-Domain Multicast Routing Working Group, Internet Engineering Task Force, Nov. 1997, ftp://ftp.isi.edu/in-notes/rfc2236.txt.
[13] Henning Schulzrinne, Stephen Casner, Ron Frederick, and Van Jacobson, "RTP: A transport protocol for real-time applications," RFC 1889, Audio-Video Transport Working Group, Internet Engineering Task Force, Jan. 1996, ftp://ftp.isi.edu/in-notes/rfc1889.txt.
[14] SanKu Jo and Pierce E. Cantrell, "An agent-assisted layered multicast architecture for videoconferencing," in Proc. Multimedia Technology & Applications Conference, Anaheim, CA, Sept. 1998, IEEE, pp. 242-248.
[15] David Taubman and Avideh Zakhor, "Multi-rate 3-D subband coding of video," IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 572-588, Sept. 1994.
[16] Mark A. Woodings, "A router agent capacity assessment in packet video," MS thesis, Texas A&M University, College Station, TX, Aug. 1997.
[17] Ralph A. Gholmieh, "Multicast multilayer videoconferencing: Enhancement of a multilayer codec and implementation of the receiver driven layered multicast," MS thesis, Texas A&M University, College Station, TX, Dec. 1997.
[18] Mrouted Program, Version 3.8, Nov. 1995, ftp://parcftp.xerox.com/pub/net-research/ipmulti/mrouted.3.8.tar.Z.