Reliable Multicast Considering the Temporal Dependence in Packet LossSuyoung Yoon, Kwangjoon Lee, Eunsook Jin, Jungmin Seo, Jinhan Kim and Sangkyu Choe
Multimedia Technology Research Lab., Korea Telecom, South Korea emails:{syyoon, kjoon, ejin, jmseo, jinhan, skchoe}@kt.co.kr
Abstract
Most of reliable multicast protocols assume that packet losses are temporally independent. They do not consider the possibility of the long burst loss. In this paper, we examine how much the consecutive packet losses affect the performance of reliable multicast protocols that use multiple multicast groups(MMG) and present how to handle it. We explore two mechanisms, one is to detect the temporal dependence of the packet error and the other is to deal with the burst loss. We propose two methods to detect the occurrence of the burst loss, the distributed method and the centralized method. Our reliable multicast protocol also manages to handle the burst by modifying the group management policy. We experimentally show performance gains of our protocol compared to the protocol MMG with respect to receiver processing times in the high packet loss environment.
1. Introduction
Reliable multicast has been an active research area in the last few years. The requirements for reliable multicast communications vary widely, depending on the application and network scenarios. The main challenging issue of the reliable multicast area is the problem of scalability. Since IP multicast applications have large number(hundreds or even thousands) of participants who may be distributed worldwide, traditional error control schemes used primarily in point-to-point applications do not scale to meet the requirements of large-scale multicast. Implosion and exposure are the key issues that are commonly addressed in most scalable reliable multicast protocols. Implosion is a problem that occurs when the loss of a packet triggers simultaneous messages from a large number of receivers. And exposure(or retransmission scoping problem) is a problem that occurs when recovery-related messages reach receivers which have not experienced loss.
Reliable multicast protocols can be categorized into two groups: ARQ based protocols and FEC based protocols. Using ARQ, the sender retransmits lost data upon request from the receiver. Tthe basic principle of FEC based protocols is that the original data is encoded to obtain some parity data, which is sent by the sender along with the original data. Many reliable multicast protocols based on combining FEC and ARQ techniques have also been proposed. Most reliable multicast protocols previously proposed use a single multicast channel(or group). In these protocols all packet transmissions and retransmissions are done over the single multicast channel. Each receiver therefore receives all of the retransmissions of a packet, even after correctly receiving the packet. Recently, some reliable multicast protocols using multiple channels have been studied. The use of multiple multicast channels is expected to reduce the network bandwidth consumption as well as receiver processing cost. This is because retransmissions are only forwarded along the links leading to subscribers of the retransmission channels.
Most of reliable multicast protocols assume that packet losses are temporally independent. They do not consider the possibility of the long loss burst. But recent studies have observed temporal correlation in losses on the Internet. That is, there is a significant amount of consecutive losses at each receiver. Therefore, previously proposed reliable multicast protocols do not work well in this situation. In this paper, we explore two mechanisms for scalable reliable multicast to handle the long loss burst, one detecting the temporal loss burst, other dealing with the burst. During the normal operation, our reliable multicast protocol maintains multiple multicast groups, which is similar to other protocols using multiple groups, but it manages the groups in a different manner.
In order to detect the occurrence of the temporal burst, we propose two methods, the centralized method and the distributed method. In the centralized burst detection method, the sender is responsible for examining the temporal burst and notifying it to receivers in the multicast group. In the distributed method, both the sender and the receivers always keep the status of loss occurrences consistently so that they can detect the burst at the same time. After detecting the temporal burst, the protocol changes its group management method by adjusting the number of multiple channels and the number of packets in a data block according to the level of the burst. Using various simulations, we show the superiority of our reliable multicast protocol over the well-known ARQ based protocol and FEC based protocol in the long loss burst environment as well as in the normal mode.
This paper is organized as follows. Section 2 presents our reliable multicast protocol. Section 3 compares the receiver processing times of previous protocol and our reliable multicast protocol. Section 4 gives related works on reliable multicast protocols. Section 5 presents a summary of the results and our conclusions.
2. Protocol description
In this section, we present in detail our reliable multicast protocol that handles the temporal burst loss.
Recent experiment results[12] show that there is a significant amount of consecutive losses at each receiver during a reliable multicast session. One or more extremely long loss bursts, lasting from a few seconds up to several minutes, occur in almost every trace. Such long loss burst was also reported in [9] for the case of point-to-point connections. Because there can be so many receivers that participate the multicast session, much research efforts that focus on handling packet errors have been performed. That is also true of handling the long temporal burst.
The reliable multicast protocol we will consider is based on the protocols that use multiple multicast groups(MMG)[5]. There are two versions of protocol MMG, P2 and P3. We use the protocol P3 as our base model to detect the burst loss by all receivers. The protocol P3 exhibits the following behavior.:
· the sender sends all original transmission on a multicast group Aorg; · the sender retransmits packet i on multicast channel Ai, where i = 0, 1, 2, ...; · whenever a receiver detects a lost packet(say i), it subscribes to multicast group Ai, waits for a random period of time and then multicasts a NAK on the group Aorg, and starts a timer; · upon receipt of a NAK for a packet that a receiver has not received, but for which it(the receiver) has initiated the random delay prior to sending a NAK, the receiver sets a timer and behaves as of it had sent that NAK; · the expiration of a timer without prior reception of the corresponding packet serves as the detection of a lost NAK or retransmission; · on receiving packet i on Ai, a receiver unsubscribes from Ai.
The use of multiple multicast groups is expected to reduce the network bandwidth consumption as well as the receiver processing cost. This is because receivers do not process the unwanted retransmission packets and retransmissions are only forwarded along the links leading to subscribers of the retransmission groups. The analysis of MMG shows that the protocol outperforms the reliable multicast protocols which use only one multicast group for both original packet transmission and retransmission. They observed that MMG can achieve a throughput within 1% of the maximum achievable throughput using a very small number of multicast groups(usually less than 5).
But, their analysis assumed that packet loss events are temporally independent. That is, MMG does not consider the possibility where many consecutive packet errors can be occurred at a receiver. In this case, there may exist many receivers that cannot join a retransmission group. These receivers have to either wait on a multicast group available or use the existing multicast group. In the first case, user response time would be very slow. In the second case, we do not make use of the advantage of using multiple multicast groups because the receivers that share a multicast group can receive retransmission packets that they do not want to get.
Therefore, we need a mechanism to handle the long burst packet error. Our reliable multicast protocol, MMGB(Multiple Multicast Groups for Burst losses), that is aimed at managing the burst errors is consist of two steps, detecting the temporal burst and adjusting the retransmission group management policy.
2.1 Detecting the Temporal Burst
In this section, we present methods for detecting the temporal burst. The burst detection methods watch the network condition by examining the receiving packets all the time during the multicast session. If the amount of occurred packet errors exceeds the predefined threshold value, it is supposed to be in situation of a burst loss. The burst detection process has the following two requirements.
· Both the sender and all receivers must detect the burst loss at the same time so that they can handle the burst in the same way. · Both the sender and all receiver must also detect the end of the burst at the same time so that they can return to the normal mode.
We propose two methods for detecting the long burst loss, the distributed method and the centralized method. In the distributed method, both the sender and the receivers keep the error status for all packets consistent so that they can detect the burst at the same time. In the centralized burst detection method, the sender is responsible for examining the burst loss and notifying it to members of the multicast group.
2.1.1 Distributed Burst Detection Method
Throughout this paper, we assume that the network hierarchy of the sender and the receivers has only one level. That is, the messages from the sender are delivered to receivers without intermediate nodes like the designated receivers of RMTP[6], and vise versa.
In this method, both the sender and all receivers share the global information about each packet errors. The sender has to keep the status of each receiver about packet errors. But most reliable multicast protocols try to reduce the NAK explosion, where every NAK message sent by each receiver does not arrive at the sender, or even receivers do not send it to the sender(NAK-suppression). Therefore, it is hard for the sender to collect packet status of all receivers. Instead, it can get the global packet status, from which the sender can see whether all receivers received packets without a loss or some receivers detected packet errors. We define the following data structures to detect the temporal burst. These data structures should be maintained in both the sender and all receivers consistently all the time.
· Qi, 0 £ i £ N : 1 if the packet i was received with error, 0 if no error, where N is the total number of original packets to be sent to receivers in the current multicast session. · Wp : the window size where recent Wp packets are examined to detect the temporal burst.
In general, the possibility of packet error is very low in normal multicast session, therefore, the sender has the status of each packet where for only small number of the packets corresponding error bits are set. This results in that we can assume that for the sender, if the number of the most recent Qi elements whose error bit was set exceeds the predefined value, then the sender decides it as the burst loss. For the receivers, the burst detection criterion is the same as that of the sender. But, unlike the sender which is provided with error information about all receivers, receivers need to get the NAK message for all multicast transmission in the session so as to keep the global queue. This requires that we adopt the protocol P3 of MMG in which NAK messages from each receiver which experienced packet error are sent to the members of original multicast transmission group instead of the retransmission group for the packet error in P2 of MMG. This may cause the increase of receiver processing time. But it is expected to be smaller than the processing cost of unwanted retransmission.
2.1.2 Centralized Burst Detection Method
In contrast to the distributed burst detection method, centralized method maintains the global packet status, Qi, only in the sender size. The data structure and its operation method are the same as that of distributed one. The sender detects the burst loss using the data structures, Qi and Wp, in the same manner. After detecting the burst, the sender must notify the burst condition to all receivers that participate the multicast session. And also it sends the message for the end of burst loss if there is no burst loss any more.
In order for the centralized burst method to work properly, the number of message types between the sender and receivers must be added. That would increase the receiver processing cost. Therefore, in the following section, we choose the distributed burst detection as our burst detection method.
2.2
Group Management for Burst Loss
The number of available multicast groups is one of the most important factors for the performance improvement in the reliable multicast using multiple groups. Ideally, if there is no limitation on the number of multicast groups, that is, the number of available multicast groups is ¥ , MMG protocol works well in any situation such as burst loss. However, because the multicast groups is a finite internet resource, a multicast session can use very limited number of multicast groups. MMG suggests that the protocol can achieve a throughput of almost maximum using a very small number of multicast groups. But, as described above, the protocol cannot handle the long burst loss using the fixed number of multicast groups.
After detecting the long burst loss, we need to increase the number of retransmission groups according to the extent of the burst. In our protocol, MMGB, the number of retransmission groups, NG is increased by NG * (NE / Wp), where NG is the number of retransmission group before the burst and NE is the number of packet errors in the most recent Wp packets received by receivers. With this method, the number of retransmission groups can be increased by the original number of retransmission groups, NG, when no packets of most recent Wp packets have been received correctly by receivers.
In the following section, we will show that MMGB outperforms MMG in receiver processing time with several parameters.
3. Performance Evaluation
We now examine the relative performance of protocols MMG and MMGB with respect to the measure of receiver processing time. The effect of burst errors cannot be calculated by the analysis method as in [5]. Therefore we examine the receiver processing cost by simulation. The receiver processing cost for each receiver i, CRi, is obtained by the following expression.
CRi = (Cp + CJ + CL + CNS + CNR + CR + CT) / N
where Cp, CR : time to receive a packet(original packet and retransmission packet respectively), CJ : time to process a join, CL : time to process a leave, CNS : time to send a NAK, CNR : time to receive a NAK, CT : time to process a timeout
To calculate the CRi, we need to know the processing times associated with receiving a data packet, sending/receiving a NAK packet, joining/leaving a multicast group and their timer processing time. We use the experiment results from [5] for the values which are summarized in Table 1.
Table 1. Processing times(in microseconds)
If there are no packet errors, only Cp is added to CRi as the packet processing time for receiver i. If the receiver i detects a packet errors, Cp, CJ, CL, CNS, CNR, CR and CT are added to the CRi where CNS is not added to CRi only if another receiver j which also detected packet error for the packet sent the NAK message(NAK suppression). Those values can be added to CRi several times until the packet will be correctly received by the receiver.
Fig. 1 shows the simulation results of the receiver processing times of MMG and MMGB with packet error rate 0.1 and 0.01 as the number of receivers increases. As expected, we see that MMG incurs higher receiving costs than MMGB. Observe that the difference of the receiver processing costs is getting larger when the packet errors is high(p=0.1) as the number of receiver increases.
Fig. 2 shows the simulation results of the receiver processing times of MMG and MMGB with the number of receivers 500 as the packet error rate increases. We can also see that MMGB outperforms MMG. And, MMGB shows much better performance than MMG as the packet error rate increases.
4. Related Work
One of the most popular existing reliable multicast protocols is scalable reliable multicast(SRM)[2]. SRM is a NAK-based protocol that has been implemented for a shared whiteboard application. It emphasized multicasting both the repair requests and retransmissions. The receiver use two timer, called the repair timer and the request timer to perform loss recovery, to reduce the NAK explosion at the sender and to avoid multiple repair packets for duplicate requests. Therefore efficiency of this protocol depends on the accurate estimate of the timer expiration time which may otherwise lead to repair and request explosion at each member of the group. Log-based reliable multicast, LBRM[3], and reliable multicast transport protocol, RMTP[6], are two hierarchical approaches in which designated receivers at a certain level supply repairs to lower-level designated receivers and determining their processing and storage requirements is still being studied.
Reliable multicast could benefit from the use of forward error correction(FEC)[8]. The basic principle of FEC is that the original data is encoded to obtain some parity data, which is sent by the sender along with the original data. This parity data is used by a receiver to independently recover lost data. However, FEC-based loss recovery will not perform well in the presence of heterogeneous loss. Even if only a few receivers experience very high loss, a large number of FEC packets will be generated at the sender and will be sent everywhere thereby wasting network bandwidth and causing unnecessary packet processing at all the other receivers. Many reliable multicast protocols based on merging FEC and ARQ technique have also been proposed[10][4].
Recently, some reliable multicast protocols using multiple channels have been studied. The use of multiple multicast channels is expected to reduce the network bandwidth consumption as well as receiver processing cost. This is because retransmissions are only forwarded along the links leading to subscribers of the retransmission channels. Cheung et al. [1], McCanne[7] and Vicisano[11] have proposed to use multiple multicast groups for flow and congestion control, but not for error recovery. Kasera et al[5] proposed an approach where a single multicast group is used for the original transmission of packets and retransmission of packets are done separate multicast groups.
5. Conclusions and Future Work
In this paper, we examined how much the consecutive packet losses affect the performance of reliable multicast protocols that use multiple multicast groups and presented how to handle it. We explored two mechanisms, one is to detect the temporal dependence of the packet error and the other is to deal with the burst loss. We discussed two methods to detect the occurrence of the burst loss, the distributed method and the centralized method. Our reliable multicast protocol also manages to handle the burst by modifying the group management policy.
We showed performance gains of our protocol compared to MMG with respect to receiver processing times, as shown in Section 3, especially in the high packet loss environment.
Our protocol here described assumed that the hierarchy of the sender and receivers has only one level. That is, the sender sends messages to receivers directly. But many reliable multicast protocols use intermediate nodes between the sender and receivers which aggregate the messages of receivers. Therefore we need to extend our scheme to handle the multilevel hierarchy. And we are currently considering the possibility of further improving our protocol by enhancing the group management policy after detecting the burst loss.
6. References
[1] S. Y. Cheung, M. H. Ammar, and X. Li, "On the use of destination set grouping to improve fairness in multicast video distribution," Georgia Institute of Technology, Atlanta, GA, Tech. Rep. GIT-CC-95-25, July 1995. [2] S. Floyd, V.Jacobson, S. McCanne, C. Liu, and L. Zhang, "A reliable multicast framework for light-weight sessions and application level framing," IEEE/ACM Trans. Networking, vol. 5, pp. 784-803, Dec. 1997. [3] H.W. Holbrook, S.K. Singhal, and D.R. Cheriton, "Log-based receiver reliable multicast for distributed interacive simulation," in Proc. ACM SIGCOMM, Aug. 1995, pp. 328-341. [4] R. Kermode. "Scoped hybrid automatic repeat request with forward error correction(SHARQFEC). Proc. ACM SIGCOMM '98, Sep. 1998, Vancouver, Canada. [5] Sneha Kuma Kesera, Gisli Hjalmtysson, D. F. Towsley, J.F. Kurose, "Scalable reliable multicast using multiple multicast channels," IEEE/ACM Trans. Networking, vol. 8. pp 294-309, Jun. 2000. [6] J. Lin and S. Paul, "RMTP:A reliable multicast transport protocol," in Proc. IEEE Infocom, 1996, pp. 1414-1424. [7] S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-driven layered multicast," in Proc. ACM SIGCOMM Conf., Aug. 1996, pp. 117-130. [8] J. Nonnenmacher, E. Biersack, and D. Towsley, "Parity-based loss recovery for reliable multicast transmission," IEEE/ACM Trans. Networking, vol. 6, pp 349-361, Aug. 1998. [9] V. Paxson, "End-to-end Routing behavior in the Internet," in Proc. 1996 ACM SIGCOMM Conf. [10] D. Rubenstein, J. Kurose, D. Towley, ""Realtime reliable multicast using proactive forward correction," NOSSDAV '98, Cambridge, UK, Jul. 1998. [11] L. Vicasano, L. Rizzo, and J. Crowcroft, "TCP-like congestion control for layered multicast data transfer,", in Proc. IEEE Infocom, Mar. 1998, pp. 996-1003. [12] Maya Yajnik, Jim Kurose, Don Towley, "Packet loss correlation in the MBONE Multicast Network,", GLOBECOM '96.
|