Anshul KANTAWALA <firstname.lastname@example.org>
Samphel NORDEN <email@example.com>
Ken WONG <firstname.lastname@example.org>
Guru PARULKAR <email@example.com>
Washington University in St. Louis
In this paper, we propose DiSp (Differentiated Services over IP), a new framework for supporting differentiated services over the Internet. DiSp differs from the current Internet Engineering Task Force proposal for DiffServ but retains DiffServ's goal of moving complexity from the internal routers out to the edge routers of DiffServ clouds and Autonomous Systems. DiSp supports three classes of service: real-time, statistical bandwidth, and best-effort. The admission control policy for the real-time and statistical flows allows fixed delay-bound guarantees to be given to quality-of-service applications. We also discuss how our architecture can easily support important applications such as virtual private networks.
The current Internet supports only best-effort service irrespective of the characteristics of the application that uses the service. But applications such as Internet Protocol (IP) telephony, video-on-demand, video-conferencing, and other real-time applications require end-to-end QoS (Quality of Service) support. Furthermore, different applications require different transmission guarantees. For example, video-on-demand applications can tolerate large delays but require bandwidth guarantees. IP telephony, in contrast, is delay intolerant and requires more comprehensive guarantees on both bandwidth and delay. Thus, there is a need to support service discrimination by explicit resource allocation and scheduling in the network. Current research on QoS-based networks has resulted in the development of the Integrated Services (IntServ) architecture, which uses the RSVP signaling protocol to signal per-flow requirements to the network. IntServ is used to quantify these QoS requirements using an admission-control-based approach. However, IntServ suffers from scalability, complexity, and deployment problems.
These deficiencies have led to the beginnings of an alternative QoS delivery model known as Differentiated Services (DiffServ) [1, 2]. To address the scalability issue, DiffServ aggregates flows into service classes rather than maintaining per-flow state. Furthermore, QoS requirements are specified out-of-band, removing the need for a signaling protocol such as RSVP. Packet classification is based on the setting of bits in the ToS byte of the IP header. Flow aggregation in DiffServ has several beneficial consequences. First, DiffServ routers map a large number of flows to a small number of per-hop behaviors. Thus, instead of every router having to manage individual flows, only the edge routers need to be concerned with QoS. Second, aggregation facilitates the construction of "end-to-end" services by linking multiple autonomous domains together using simplified service agreements at the boundaries of the domains.
DiffServ is still in its infancy and has not yet matured into a service framework that can satisfy the diverse application requirements. There are many issues to be resolved: 1) precise service class definitions, 2) admission control policies, 3) strategies for policing and shaping of aggregate flows, and 4) congestion-handling mechanisms. Our architectural framework attempts to tackle some of these issues and supports two fundamental service enhancements: 1) receiver subscriptions, and 2) statistical bandwidth guarantees.
We describe a new architecture called DiSp (Differentiated Services over IP) that builds on the basic DiffServ idea of flow aggregation to provide user-controlled traffic services. DiSp has four key features: 1) It has three service classes: real-time (RT), statistical-bandwidth (SB), and best-effort (BE), with detailed profile specifications. 2) It has mechanisms for policing and shaping aggregate flows. 3) Although real-time flows are treated in an aggregate manner, DiSp provides service guarantees on a per-flow basis (we note that, in keeping with the DiffServ ideals, DiSp does not maintain per-flow state information in ANY router and uses simple priority scheduling mechanisms among the three classes). 4) DiSp uses efficient monitoring mechanisms that can provide accurate feedback for congestion control and overall network management. The use of our proposed signaling protocol facilitates third-party negotiations that are essential for network configuration, management, and provisioning.
Our goal is to define and support a model that allows the seamless integration of our proposed DiffServ architecture with IntServ, since both of these models are complementary. We highlight the effectiveness of our approach by considering a challenging task of resource allocation for real-time applications using Virtual Private Networks (VPN).
The rest of the paper is organized as follows. In Section 2, we present and motivate our proposed approach. Section 3 deals with details of our proposed architecture, followed by the details of the admission control algorithm in Section 4. We discuss our congestion control policies in Section 5 and support for individual real-time multicast flows in Section 6. We then describe issues regarding resource allocation for the statistical bandwidth class and a target application, VPN, in Section 7. Finally, we present related work and conclude the paper in Sections 8 and 9, respectively.
DiSp supports two fundamental service enhancements: 1) receiver subscriptions, and 2) statistical bandwidth guarantees. With receiver subscriptions, each receiver (or set of receivers) negotiates for a fixed delay bound on a flow originating from a remote source. Unlike other DiffServ proposals [1, 4], which are based on source reservations, our model emphasizes support for applications (e.g., Video on Demand (VOD), stock quotes) where receiver reservations are more appropriate. In these real-time applications, the sender should not be forced to reserve and pay for resources when there are no receivers present. Other motivations for adding receiver-based reservation control can be found in [5].
With statistical bandwidth guarantees, each Autonomous System (AS) can negotiate an aggregate bandwidth profile for its high bandwidth flows. One of the hard problems for supporting such a service class is admission control and resource reservation for the aggregate flows without prior knowledge of the routes taken by individual flows within the aggregate. We envisage such a service class to be useful for providing QoS support for VPNs. DiSp supports three service classes:
Flow scheduling is performed in strict priority order: 1) RT (highest), 2) SB, 3) BE. Thus, routers (edge and internal) need only maintain three output queues per link and do not need complex fair queueing algorithms.
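The strict priority discipline can be illustrated with a minimal Python sketch of a single output link holding one FIFO queue per class. The class name `StrictPriorityLink` and the packet labels are hypothetical and serve only as illustration; the fallback of unmarked packets to BE follows the backward-compatibility rule of the architecture.

```python
from collections import deque

# Service classes in strict priority order (highest first), as in the text.
PRIORITY_ORDER = ("RT", "SB", "BE")


class StrictPriorityLink:
    """One output link with a FIFO queue per service class (hypothetical helper)."""

    def __init__(self):
        self.queues = {cls: deque() for cls in PRIORITY_ORDER}

    def enqueue(self, service_class, packet):
        # Unknown or missing markings fall back to best-effort.
        if service_class not in self.queues:
            service_class = "BE"
        self.queues[service_class].append(packet)

    def dequeue(self):
        # Serve the highest-priority non-empty queue; None if the link is idle.
        for cls in PRIORITY_ORDER:
            if self.queues[cls]:
                return cls, self.queues[cls].popleft()
        return None


link = StrictPriorityLink()
link.enqueue("BE", "web-1")
link.enqueue("SB", "vpn-1")
link.enqueue("RT", "voice-1")
link.enqueue(None, "legacy-1")   # unmarked packet, treated as BE

order = []
while (item := link.dequeue()) is not None:
    order.append(item)
print(order)
```

Because the scheduler only scans three queues, its per-packet cost is constant and independent of the number of flows, which is the property the text relies on to avoid fair queueing.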
Our architecture is shown in Figure 1 and consists of:
The connecting Autonomous Systems (ASs) are responsible for marking packets according to their respective classes. If an edge-router encounters an unmarked packet, it is treated as part of a BE flow. Thus, DiSp provides backward compatibility for legacy IP networks. Since DiSp handles policing and shaping of flows in an aggregated manner, we rely on the connecting ASs to provide flow isolation from misbehaving flows within the aggregate. For example, an AS could be viewed as an IntServ cloud providing per-flow QoS internally to all its flows while negotiating aggregate profiles for flows transiting through the DiffServ cloud.
A crucial component of the DiSp architecture is SPiD (pronounced "speed"), the signaling protocol employed by DiSp. The current Internet Engineering Task Force (IETF) proposal for DiffServ does not incorporate a signaling protocol. This decision was based on scalability concerns. RSVP incurs a high overhead because of its two-phase approach, in which per-flow state information is set up in each router when performing resource reservation. There have been proposals for modifying RSVP for use with aggregated flows [7]. However, aggregation introduces a host of issues (e.g., maintaining per-flow guarantees, isolating flows) which would add to the complexity of RSVP. Also, with regard to multicast, RSVP suffers from problems of handling QoS reservations with heterogeneous reservation styles. SPiD is a lightweight, efficient signaling protocol with the following key features:
We envisage the following types of control messages that will be used by SPiD.
Thus, while SPiD is lightweight, it offers an enhanced set of features to support both hard and soft bandwidth guarantees in DiffServ. In addition, it provides support for network management, which is another key component of our DiSp architecture.
SPiD has several control messages that must receive transmission guarantees to prevent performance degradation. DiSp uses a separate minimum spanning tree control network with statically reserved bandwidth to avoid delays.
Each AS can specify a profile for each RT flow and each SB flow aggregate. Note that RT flows are delay sensitive whereas SB flows are bandwidth sensitive. An RT profile specifies a delay bound for a particular flow through three parameters:
An RT profile is specified for each flow and stored in the ingress router of an ISP.
Each AS specifies a single aggregate profile for its SB flow to an ISP. This profile is stored in the ingress router of the ISP receiving the SB flow. An SB profile specifies the minimum bandwidth guarantee for a flow aggregate (not an individual flow) through two parameters:
Each ingress router of an ISP stores profiles for each real-time flow and each high bandwidth flow aggregate for each connected AS. Although policing of real-time flows from a particular AS will be done in an aggregated manner (as described in the Edge Router Internals section), the edge-router has to adjust the policer according to the individual flow's acceptable loss rate and selectively drop packets in times of severe congestion when the guarantees cannot be met.
The NOC maintains a centralized database that is used for admission control. The database maintains information about the reserved bandwidth, delay, and a list of real-time flows with their associated ingress edge-router for each link of each router within the DiSp network. The parameters stored for each link i, with capacity C, and n RT flows include:
Each RT flow association A_i identifies the flow (src_i, dst_i) and the ingress router (router_i) hosting the flow.
Each internal router also stores a running count of the number of active best-effort flows on each link. This information is used by the signaling protocol to provide explicit congestion window-size feedback to the best-effort sources in times of congestion.
The main function of an edge-router is policing and shaping of real-time and high bandwidth flows (Figure 2). For each input link in the edge-router, a modified token bucket scheme is used to police real-time flows. High bandwidth flows are policed using a fixed-size queue and a pacer. Each output link has three output queues, one for each service class, which are served in strict priority order: 1) RT, 2) SB, 3) BE.
r is a set of timers in which the i-th timer expires every T_i seconds, where T_i is d_i^max/(hop count) for flow i. In addition, a queue of size b corrects any jitter that real-time packets may have suffered in the AS network. Packets arriving from real-time flows are queued in FIFO (First In, First Out) order if there is no token to service them instantly. Incoming packets are dropped if the queue is full, thus policing RT flows with respect to their reservations. When a packet is dispatched on link i, a token of size P_i^max, the size of the largest packet in flow i, is removed from the token bucket.
For example, a token of size P_i^max is generated every d_i^max/(hop count) seconds for a real-time flow i. The token supply is reduced by P_i^max every time a packet is sent out. Packets are sent only if the token supply is at least P_i^max. This scheme allows the router to police and pace the real-time flows as an aggregate bundle instead of being forced to use a separate token bucket and queue for each real-time flow, thus reducing the overhead at edge-routers. If all flows conform to their reservations, all RT delay guarantees will be met. The only drawback of this scheme is that it cannot isolate nonconforming RT flows; that isolation is the responsibility of the AS egress routers.
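The modified token bucket can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: time is passed in explicitly rather than read from a clock, the queue size b is counted in packets, the token supply is not capped (the text gives no bucket depth), and the names `AggregateRTPolicer`, `add_flow`, `arrive`, and `advance` are hypothetical.

```python
from collections import deque


class AggregateRTPolicer:
    """One shared token supply and FIFO queue for all RT flows on an input link."""

    def __init__(self, queue_limit):
        self.queue_limit = queue_limit   # b: maximum number of queued packets
        self.queue = deque()             # FIFO of (flow_id, size_bytes)
        self.tokens = 0                  # shared token supply, in bytes
        self.flows = {}                  # flow_id -> [period, max_packet, next_expiry]
        self.dropped = 0

    def add_flow(self, flow_id, delay_bound, hop_count, max_packet, now=0.0):
        period = delay_bound / hop_count          # timer period: delay bound / hop count
        self.flows[flow_id] = [period, max_packet, now + period]

    def advance(self, now):
        # Fire every timer that expired up to `now`; each expiry adds one
        # token the size of that flow's largest packet.
        for state in self.flows.values():
            period, max_packet, next_expiry = state
            while next_expiry <= now:
                self.tokens += max_packet
                next_expiry += period
            state[2] = next_expiry
        return self._drain()

    def arrive(self, flow_id, size):
        if len(self.queue) >= self.queue_limit:
            self.dropped += 1                     # queue full: police by dropping
            return []
        self.queue.append((flow_id, size))
        return self._drain()

    def _drain(self):
        # Send head-of-line packets while the supply covers the flow's
        # largest-packet token; each send removes that token.
        sent = []
        while self.queue:
            flow_id, _ = self.queue[0]
            cost = self.flows[flow_id][1]
            if self.tokens < cost:
                break
            self.tokens -= cost
            sent.append(self.queue.popleft())
        return sent


policer = AggregateRTPolicer(queue_limit=2)
policer.add_flow("voice", delay_bound=1.0, hop_count=4, max_packet=200)
first = policer.arrive("voice", 160)    # no token yet, so the packet waits
second = policer.arrive("voice", 160)   # also waits
third = policer.arrive("voice", 160)    # queue already holds 2 packets: dropped
sent = policer.advance(0.25)            # first timer expiry releases one packet
```

Note that the sketch keeps only one small record per flow for its timer; the queue and the token supply are shared, which is what makes the policing aggregate rather than per-flow.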
The DiSp admission control algorithm ensures that delay and bandwidth guarantees can be met for all accepted connections. The admission control procedure is almost the same for both real-time and statistical bandwidth flows. The difference is that delay and jitter for statistical bandwidth flows are not checked. When a new connection request is made to an edge router, two tests are performed for RT flows. First, DiSp checks for sufficient bandwidth along the route selected for the RT flow. Second, it checks that the sum of the end-to-end delay and jitter bounds exceeds the worst-case delay experienced by the packet at each hop along the route. Once a flow satisfies these two checks, all flow associations < src, dst, ingress-router > on the QoS route are added to the NOC database. The parameter that maintains the overall bandwidth of the service class is updated. Finally, the maximum packet size for the flow is computed and updated if larger.
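The two admission tests and the subsequent database update can be sketched as below. This is a minimal Python illustration under stated assumptions: each link carries a precomputed worst-case per-hop delay (how DiSp derives that figure is not specified here), the delay test compares the flow's delay-plus-jitter budget against the sum of those per-hop values, and the names `LinkState` and `admit` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LinkState:
    capacity: float                 # link capacity C, in bits/s
    worst_case_delay: float         # ASSUMED per-hop worst-case delay, in seconds
    reserved: float = 0.0           # bandwidth already reserved on this link
    max_packet: int = 0             # largest packet size among admitted flows
    associations: list = field(default_factory=list)  # (src, dst, ingress) tuples


def admit(route, src, dst, ingress, bandwidth, max_packet,
          delay_bound=None, jitter_bound=0.0):
    """Admit a flow on `route` (a list of LinkState); delay_bound=None means SB."""
    # Test 1: enough spare bandwidth on every link of the route.
    if any(link.reserved + bandwidth > link.capacity for link in route):
        return False
    # Test 2 (RT only): the delay + jitter budget must cover the per-hop worst cases.
    if delay_bound is not None:
        total_worst_case = sum(link.worst_case_delay for link in route)
        if delay_bound + jitter_bound < total_worst_case:
            return False
    # Both tests passed: record the flow on every link of the route.
    for link in route:
        link.associations.append((src, dst, ingress))
        link.reserved += bandwidth
        link.max_packet = max(link.max_packet, max_packet)
    return True


route = [LinkState(10e6, 0.005), LinkState(10e6, 0.005)]
rt_ok = admit(route, "S1", "D1", "E1", 4e6, 1500, delay_bound=0.02)
rt_too_tight = admit(route, "S2", "D2", "E1", 4e6, 512, delay_bound=0.005)
sb_too_big = admit(route, "S3", "D3", "E2", 7e6, 1500)   # SB: no delay check
sb_ok = admit(route, "S3", "D3", "E2", 6e6, 1500)
```

In this example the second RT request fails only the delay test, and the first SB request fails only the bandwidth test; neither rejected request changes the stored link state.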
Real-time and statistical bandwidth flows will not experience any congestion during normal operation, but may experience some congestion when a link goes down and the flows are re-routed. For this particular scenario, we re-route the real-time and statistical bandwidth flows to the next-hop router using an alternate path. For example, suppose that link i in router Rj, which connects Rj to Rk, goes down. The NOC tries to find an alternate path from Rj to Rk that can accommodate the bandwidth and delay requirements of all flows on link i. If such a route cannot be found, the NOC will signal the originating ASs of the respective flows, indicating a need for renegotiation or readmission of the affected flows.
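One way the NOC's search for an alternate path could look is sketched below in Python. This is only an illustration under simplifying assumptions: it uses a breadth-first search, checks the aggregate bandwidth requirement but not the delay requirement, and takes the spare bandwidth of each directed link as given. The function name `find_alternate_path` and the router names Ra and Rb are hypothetical.

```python
from collections import deque


def find_alternate_path(links, failed, needed_bw):
    """Breadth-first search from failed[0] to failed[1].

    `links` maps a directed link (u, v) to its spare bandwidth; only links
    other than the failed one with at least `needed_bw` spare are usable.
    """
    src, dst = failed
    neighbours = {}
    for (u, v), spare in links.items():
        if (u, v) != failed and spare >= needed_bw:
            neighbours.setdefault(u, []).append(v)

    parent = {src: None}
    frontier = deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbours.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                frontier.append(nxt)
    return None  # no route: the originating ASs would be asked to renegotiate


links = {
    ("Rj", "Rk"): 0,   # the failed link
    ("Rj", "Ra"): 8, ("Ra", "Rk"): 6,
    ("Rj", "Rb"): 3, ("Rb", "Rk"): 9,
}
detour = find_alternate_path(links, ("Rj", "Rk"), needed_bw=5)
no_detour = find_alternate_path(links, ("Rj", "Rk"), needed_bw=7)
```

A `None` result corresponds to the case in which the NOC signals the originating ASs for renegotiation or readmission.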
Best-effort flows can encounter congestion even during normal operation because of the aggressive windowing strategy employed by TCP. DiSp monitors active best-effort flows and provides explicit window-size parameter feedback to flows going through a congested link. Using the Smart Port Card (SPC) with an embedded ATM Port Interconnect Chip (APIC) [8], DiSp can snoop all BE flows on a link at gigabit rates and monitor the currently available bandwidth for BE flows. Each router stores the number of active best-effort flows and the source address of each flow for each link. If the link experiences congestion, the router sends feedback messages to the hosts indicating a smaller TCP congestion window size (currently available bandwidth/number of active flows). We plan to enhance the current TCP protocol so that it can handle such feedback messages and adjust the congestion window size accordingly. This scheme is an enhancement of the ECN (Explicit Congestion Notification) mechanism proposed for TCP [6]. We will also experiment with varying the holding times of the new congestion window size before allowing TCP to resume its normal congestion window control algorithm.
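A possible form of the feedback computation is sketched below in Python. It rests on assumptions that are not part of the original design: the window is derived by splitting the available BE bandwidth equally among the active flows, the resulting rate is converted to bytes using a round-trip-time estimate, and the window never falls below one segment. The function name `feedback_windows` and the parameters `rtt_ms` and `mss` are hypothetical.

```python
def feedback_windows(flow_sources, available_bw_bps, rtt_ms, mss=1460):
    """Return {source address: suggested congestion window in bytes}.

    Each active BE flow gets an equal share of the available bandwidth;
    the share is turned into a window as rate * RTT (an assumption).
    """
    if not flow_sources:
        return {}
    share_bps = available_bw_bps // len(flow_sources)
    # bits/s * ms / (8 bits per byte * 1000 ms per s) = bytes per RTT
    window_bytes = max(mss, share_bps * rtt_ms // 8000)
    return {src: window_bytes for src in flow_sources}


sources = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
windows = feedback_windows(sources, available_bw_bps=8_000_000, rtt_ms=100)
print(windows["10.0.0.1"])   # 25000 bytes for each of the four flows
```

The router only needs the per-link flow count and source addresses it already stores, so the feedback requires no additional per-flow state.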
Since IETF's DiffServ deals with aggregate flows, resource provisioning for individual multicast flows is not supported. Using the receiver-based subscription enhancement, DiSp provides baseline support for multicast RT flows. Consider a multicast group M with source S1 and receivers D1 and D2, as shown in Figure 4. When a new receiver D3 wants to join M, only resources from R4 to D3 are reserved, since a virtual path from S1 to D1 already exists. If the QoS requirements for D3 are different from the other receivers, the flow S1 to D3 will be considered as a new RT flow and go through the admission control and resource reservation process. Non-RT multicast flows will be treated as BE flows. Providing multicast support for aggregated SB flows is an open issue.
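The join rule can be sketched as a small Python function that, given the links already carrying the multicast flow, returns the hops on which new resources must be reserved. The topology below is hypothetical (the routers between S1 and R4 in Figure 4 are not reproduced here), as is the function name `links_to_reserve`.

```python
def links_to_reserve(tree_links, path, same_qos=True):
    """Return the hops of `path` that need a new reservation.

    `tree_links` is the set of directed links already in the multicast tree.
    A receiver with different QoS requirements is treated as a new RT flow,
    so every hop of its path goes through admission control.
    """
    hops = list(zip(path, path[1:]))
    if not same_qos:
        return hops
    return [hop for hop in hops if hop not in tree_links]


# Hypothetical tree for group M: S1 reaches D1 and D2 through R1 and R4.
tree = {("S1", "R1"), ("R1", "R4"), ("R4", "D1"), ("R4", "D2")}
new_path = ["S1", "R1", "R4", "D3"]

same = links_to_reserve(tree, new_path)
different = links_to_reserve(tree, new_path, same_qos=False)
```

With identical requirements only the hop from R4 to D3 is reserved; with different requirements the whole path from S1 is submitted for admission.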
There are two major issues concerning the statistical bandwidth class:
One application that helps point towards a feasible solution to the above issues is providing VPNs for corporations. Since a VPN is a fairly static network connecting remote sites of a company, DiSp can use this information to reserve resources along the paths connecting the sites. Currently, VPNs only provide a secure network, but no QoS guarantees. Using the statistical bandwidth class, DiSp can provide a VPN with some minimum bandwidth guarantee between the remote sites. The issues that need to be addressed for such an application are:
Current research in DiffServ has resulted in a number of IETF drafts that attempt to tackle the issues of defining service classes, per-hop behaviors, integration of DiffServ with IntServ, and so on. In this section we discuss how our work complements and builds on the existing research in this area and highlight some of the differences between our proposed scheme and the traditional view of DiffServ.
IETF DiffServ was essentially developed to prevent any complex signaling and allow out-of-band negotiation. Thus the use of the ToS byte information implicitly decided the kind of service a flow would receive at a router. However, there is a need to use a signaling protocol that can perform the negotiation between the user and ISP, allow the service profiles to be disseminated to the various edge routers for indicating admitted flows, allow users to renegotiate the profile (subscriptions), and perform congestion notification. Our proposed signaling protocol SPiD performs the above functions in an efficient manner. We have also used an admission-policy-based approach that can guarantee the services demanded by the various flows.
DiffServ also does not clearly define the types of service classes that are provided. While some of the proposals discuss Premium and Assured service classes, no characterizing parameters are defined apart from bandwidth. We have proposed the use of three service classes (Real-time, Statistical-Bandwidth, and Best-effort) and defined their QoS parameters to provide greater flexibility to the user in terms of being able to specify delay and delay jitter, in addition to the standard bandwidth parameter. We also propose the use of a separate control network for which a static spanning tree route is maintained. All control messages are thus guaranteed a minimum bandwidth and do not suffer from problems of insufficient bandwidth due to admission of higher class flows like real-time flows. Related to this, we also use QoS routes for choosing the best possible route. By restricting ourselves to a simple three-queue approach, we remove much of the complexity involved in using a compute-intensive Weighted Fair Queueing mechanism at the edge router for scheduling the flows.
Although edge routers in DiSp treat flows from different service classes as aggregates (which is similar to traditional DiffServ), we enforce specification and admission of individual real-time flows rather than flow aggregations. In general, aggregation of diverse real-time flow specifications is meaningless. However, we police and shape real-time flows as an aggregate. Also, DiSp's admission control is explicit whereas IETF DiffServ's is implicit (see Section 3).
Multicast flows are treated by DiSp in the same way as any other flow. Thus, DiSp does not require any separate mechanism to handle multicast flows. A new receiver joining a multicast group is handled in the same way as any other new flow. With heterogeneous multicast traffic, diverse profiles (based on receiver requirements) can be easily supported.
We have proposed a new framework for supporting differentiated services over the Internet. This approach is different from the current IETF proposal for DiffServ while still maintaining the goals of DiffServ. Our architectural framework allows distribution of complexity to the edge routers as well as the AS routers. We have proposed services to support real-time and statistical service classes apart from the usual best-effort class. The admission control policy for the real-time and statistical flows allows hard guarantees to be given to QoS applications.
1. D. Clark and J. Wroclawski. "An approach to service allocation in the Internet," Internet Draft, July 1997.
2. S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss. "An architecture for differentiated services," Internet Draft, August 1998.
3. G. Parulkar, D. Schmidt, E. Kraemer, J. Turner, and A. Kantawala. "An architecture for monitoring, visualization and control of gigabit networks," IEEE Network, 11(5): 34-43, October 1997.
4. K. Nichols, V. Jacobson, and L. Zhang. "A two-bit differentiated services architecture for the Internet," Internet Draft, November 1997.
5. B. Ohlman. "Receiver control in differentiated services," Internet Draft, March 1998.
6. S. Floyd. "TCP and explicit congestion notification," ACM Computer Communication Review, 24(5): 10-23, October 1994.
7. R. Guerin, S. Blake, and S. Herzog. "Aggregating RSVP-based QoS requests," Internet Draft, November 1997.
8. Z. Dittia, G. Parulkar, and J. R. Cox. "The APIC Approach to High Performance Network Interface Design: Protected DMA and Other Techniques," IEEE INFOCOM 97, Kobe, Japan, 1997.
This work was supported in part by NSF grant ANI-9714698 and by Intel.