Internet QoS: Architectures and Mechanisms for Quality of Service
Zheng Wang summarizes his new book
The current Internet has its roots in the ARPANET, an experimental data network funded by the United States Defense Advanced Research Projects Agency (DARPA) in the late 1960s. An important goal was to build a robust network that could survive active military attacks such as bombing. To achieve this, the ARPANET was built on the datagram model, where each individual packet is forwarded independently to its destination. The datagram network has the strength of simplicity and the ability to adapt automatically to changes in network topology.
For many years, the Internet was primarily used by scientists for networking research and for exchanging information with each other. Remote access, file transfer, and e-mail were among the most popular applications, and for these applications the datagram model works well. The World Wide Web, however, has fundamentally changed the Internet. It is now the world's largest public network. New applications, such as video conferencing, Web searching, electronic media, discussion boards, and Internet telephony, are being developed at an unprecedented speed. E-commerce is revolutionizing the way we do business. As we enter the twenty-first century, the Internet is destined to become the ubiquitous global communication infrastructure.
The phenomenal success of the Internet has created new challenges. Many new applications have very different requirements from those that the Internet was originally designed for. One issue is performance assurance. The datagram model, on which the Internet is based, has few resource management capabilities inside the network and therefore cannot provide any resource guarantees to users; you simply get what you get. When you try to reach a Web site or to make an Internet phone call, some parts of the network may be so busy that your packets cannot get through at all. Most real-time applications, such as video conferencing, also require some minimal level of resources to operate effectively. As the Internet becomes indispensable in our life and work, the lack of predictable performance is certainly an issue that needs addressing.
Another issue is service differentiation. Because the Internet treats all packets the same way, it can only offer a single level of service. The applications, however, have diverse requirements. Interactive applications such as Internet telephony are sensitive to latency and packet losses. When the latency or the loss rate exceeds certain levels, these applications become literally unusable. In contrast, a file transfer can tolerate a fair amount of delay and losses without much degradation of perceived performance. Customer requirements also vary depending on what the Internet is used for. For example, organizations that use the Internet for bank transactions or for control of industrial equipment are probably willing to pay more to receive preferential treatment for their traffic. For many service providers, providing multiple levels of services to meet different customer requirements is vital for the success of their business.
The capability to provide resource assurance and service differentiation in a network is often referred to as quality of service (QoS). Resource assurance is critical for many new Internet applications to flourish and prosper. The Internet will become a truly multiservice network only when service differentiation can be supported. Implementing these QoS capabilities in the Internet has been one of the toughest challenges in its evolution, touching on almost all aspects of Internet technologies and requiring changes to the basic architecture of the Internet. For more than a decade the Internet community has made continuous efforts to address the issue and developed a number of new technologies for enhancing the Internet with QoS capabilities.
This book focuses on four technologies that have emerged in the last few years as the core building blocks for enabling QoS in the Internet. The architectures and mechanisms developed in these technologies address two key QoS issues in the Internet: resource allocation and performance optimization. Integrated Services and Differentiated Services are two resource allocation architectures for the Internet. The new service models proposed in them make possible resource assurances and service differentiation for traffic flows and users. Multiprotocol label switching (MPLS) and traffic engineering, on the other hand, give service providers a set of management tools for bandwidth provisioning and performance optimization; without them, it would be difficult to support QoS on a large scale and at reasonable cost.
The four technologies will be discussed in depth in the next four chapters. Before we get down to the details, however, it is useful to look at the big picture. In this first chapter of the book we present a high-level description of the problems in the current Internet, the rationales behind these new technologies, and the approaches used in them to address QoS issues.
1.1 Resource Allocation
Fundamentally, many problems we see in the Internet come down to the issue of resource allocation: packets get dropped or delayed because the resources in the network cannot meet all the traffic demands. A network, in its simplest form, consists of shared resources such as bandwidth and buffers, serving traffic from competing users. A network that supports QoS needs to take an active role in the resource allocation process and decide who should get the resources and how much.
The current Internet does not support any form of active resource allocation. The network treats all individual packets exactly the same way and serves the packets on a first-come, first-served (FCFS) basis. There is no admission control either; users can inject packets into the network as fast as possible.
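The FCFS behavior described above can be illustrated with a small sketch. The class name DropTailQueue, the three-packet buffer, and the packet labels are assumptions made for illustration only; the sketch also assumes a finite buffer that discards arrivals once it is full, which is one common way a router behaves when resources run out.

```python
from collections import deque


class DropTailQueue:
    """A first-come, first-served queue with a finite buffer (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of queued packets
        self.queue = deque()          # packets waiting for the link, oldest first
        self.dropped = []             # packets discarded because the buffer was full

    def enqueue(self, packet):
        # No admission control and no classification: every arrival is treated
        # the same, and it is discarded only if the buffer is already full.
        if len(self.queue) >= self.capacity:
            self.dropped.append(packet)
            return False
        self.queue.append(packet)
        return True

    def dequeue(self):
        # Serve strictly in arrival order.
        return self.queue.popleft() if self.queue else None


link = DropTailQueue(capacity=3)
for packet in ["voice-1", "ftp-1", "ftp-2", "voice-2", "ftp-3"]:
    link.enqueue(packet)

print(list(link.queue))   # packets that fit in the buffer
print(link.dropped)       # late arrivals are lost regardless of application type
```

In this toy run the delay-sensitive packet "voice-2" is lost simply because it arrived after the buffer filled, which is the kind of outcome that motivates active resource allocation.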
The Internet currently relies on TCP in the hosts to detect congestion in the network and reduce the transmission rates accordingly. TCP uses a window-based scheme for congestion control. The window corresponds to the amount of data in transit between the sender and the receiver. If a TCP source detects a lost packet, it slows the transmission rate by halving the window size and then increases it gradually in case more bandwidth is available in the network.
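The halve-then-grow behavior can be sketched in a few lines. This is a deliberately simplified model, not a TCP implementation: the function name aimd_window, the per-round increase of one segment, and the chosen loss rounds are all assumptions, and slow start, timeouts, and fast recovery are left out.

```python
def aimd_window(loss_rounds, rounds, start=1.0, increase=1.0, floor=1.0):
    """Return the congestion window (in segments) after each round trip.

    The window grows by `increase` every round and is halved in any round
    where a loss is detected (additive increase, multiplicative decrease).
    """
    window = start
    history = []
    for rnd in range(rounds):
        if rnd in loss_rounds:
            window = max(floor, window / 2)   # loss detected: halve the window
        else:
            window += increase                # no loss: probe for more bandwidth
        history.append(window)
    return history


trace = aimd_window(loss_rounds={5, 9}, rounds=12)
print(trace)
# [2.0, 3.0, 4.0, 5.0, 6.0, 3.0, 4.0, 5.0, 6.0, 3.0, 4.0, 5.0]
```

The sawtooth in the trace shows why a source that follows this scheme sees a transmission rate that fluctuates widely over time.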
TCP-based resource allocation requires all applications to use the same congestion control scheme. Although such cooperation is achievable within a small group, in a network as large as the Internet, it can be easily abused. For example, some people have tried to gain more than their fair share of the bandwidth by modifying the TCP stack or by opening multiple TCP connections between the sender and receiver. Furthermore, many UDP-based applications do not support TCP-like congestion control, and real-time applications typically cannot cope with large fluctuations in the transmission rate.
The service that the current Internet provides is often referred to as best effort. Best-effort service represents the simplest type of service that a network can offer; it does not provide any form of resource assurance to traffic flows. When a link is congested, packets are simply dropped as the queue overflows. Since the network treats all packets equally, any flow could be hit by the congestion. Although the best-effort service is adequate for some applications that can tolerate large delay variation and packet losses, such as file transfer and e-mail, it clearly does not satisfy the needs of many new applications and their users.
New architectures for resource allocation that support resource assurance and different levels of services are essential for the Internet to evolve into a multiservice network. Over the last decade the Internet community came up with Integrated Services and Differentiated Services, two new architectures for resource allocation in the Internet. The two architectures introduced a number of new concepts and primitives that are important to QoS support in the Internet:
- Frameworks for resource allocation that support resource assurance and service differentiation
- New service models for the Internet in addition to the existing best-effort service
- Language for describing resource assurance and resource requirements
- Mechanisms for enforcing resource allocation
Integrated Services and Differentiated Services represent two different solutions. Integrated Services provide resource assurance through resource reservation for individual application flows, whereas Differentiated Services use a combination of edge policing, provisioning, and traffic prioritization.
1.1.1 Integrated Services
Although the problems with the best-effort model have long been recognized, the real push for enhanced service architectures came in the early 1990s after some large-scale video conferencing experiments over the Internet. Real-time applications such as video conferencing are sensitive to the timeliness of data and so do not work well in the Internet, where the latency is typically unpredictable. The stringent delay and jitter requirements of these applications require a new type of service that can provide some level of resource assurance to the applications.
In the early 1990s the Internet Engineering Task Force (IETF) started the Integrated Services working group to standardize a new resource allocation architecture and new service models. At that time the World Wide Web, as we know it today, did not yet exist, and multimedia conferencing was seen by many people as a potential killer application for the Internet. Thus the requirements of the real-time applications had a major impact on the architecture of Integrated Services.
The Integrated Services architecture is based on per-flow resource reservation. To receive resource assurance, an application must make a reservation before it can transmit traffic onto the network. Resource reservation involves several steps. First, the application must characterize its traffic source and the resource requirements. The network then uses a routing protocol to find a path based on the requested resources. Next, a reservation protocol is used to install the reservation state along that path. At each hop, admission control checks whether sufficient resources are available to accept the new reservation. Once the reservation is established, the application can start to send traffic over the path for which it has exclusive use of the resources. Resource reservation is enforced by packet classification and scheduling mechanisms in the network elements, such as routers.
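The per-hop admission check in the reservation steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name reserve, the node names, and the bandwidth figures are hypothetical, and the all-or-nothing update stands in for the hop-by-hop signaling that a real reservation protocol would perform.

```python
def reserve(path, requested_bw, available_bw):
    """Try to reserve `requested_bw` on every link along `path`.

    path          -- list of node names, e.g. ["A", "B", "C"]
    available_bw  -- dict mapping a link (u, v) to its unreserved bandwidth
    Returns True and updates `available_bw` if every hop admits the request;
    returns False and leaves `available_bw` untouched otherwise.
    """
    links = list(zip(path, path[1:]))

    # Admission control: each hop checks whether it has enough spare capacity.
    for link in links:
        if available_bw[link] < requested_bw:
            return False

    # All hops accepted, so install the reservation state along the path.
    for link in links:
        available_bw[link] -= requested_bw
    return True


available = {("A", "B"): 10, ("B", "C"): 4, ("C", "D"): 10}
first = reserve(["A", "B", "C", "D"], 3, available)    # fits on every link
second = reserve(["A", "B", "C", "D"], 3, available)   # link B-C has only 1 left
print(first, second, available)
```

The second request is rejected by a single bottleneck link even though the other links have spare capacity, which is why the path selection step needs to take the requested resources into account.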
The Integrated Services working group proposed two new service models that a user can select: the guaranteed service and the controlled load service models. The guaranteed service model provides a deterministic worst-case delay bound through strict admission control and fair queuing scheduling. This service was designed for applications that require absolute guarantees on delay. The other service model, the controlled load service, provides a less firm guarantee: a service that is close to a lightly loaded best-effort network. The Resource Reservation Protocol (RSVP) was also standardized for signaling an application's requirements to the network and for setting up resource reservation along the path.
The Integrated Services model was the first attempt to enhance the Internet with QoS capabilities. The research and development efforts provided valuable insights into the complex issues of supporting QoS in the Internet. The resource allocation architecture, new service models, and RSVP were standardized in the late 1990s. But deployment of the Integrated Services architecture in service providers' backbones has been rather slow for a number of reasons. For one, the Integrated Services architecture focused primarily on long-lasting and delay-sensitive applications. The World Wide Web, however, significantly changed the Internet landscape. Web-based applications now dominate the Internet, and much of Web traffic is short-lived transactions. Although per-flow reservation makes sense for long-lasting sessions, such as video conferencing, it is not appropriate for Web traffic. The overheads for setting up a reservation for each session are simply too high.
Concerns also arose about the scalability of the mechanisms for supporting Integrated Services. To support per-flow reservation, each node in a network has to implement per-flow classification and scheduling. These mechanisms may not be able to cope with a very large number of flows at high speeds.
Resource reservation also requires the support of accounting and settlement between different service providers. Since those who request reservation have to pay for the services, any reservations must be authorized, authenticated, and accounted for. Such supporting infrastructures simply do not exist in the Internet. When multiple service providers are involved in a reservation, they have to agree on the charges for carrying traffic from other service providers' customers and settle these charges among them. Most network service providers are currently connected through bilateral peering agreements. To extend these bilateral agreements to an Internet-wide settlement agreement is difficult given the large number of players.
The Integrated Services architecture may become a viable framework for resource allocation in corporate networks. Corporate networks are typically limited in size and operated by a single administrative domain. Therefore many of the scaling and settlement issues we discussed above vanish. Integrated Services can support guaranteed bandwidth for IP telephony and video conferencing over corporate intranets. RSVP can also be used for resource allocation and admission control for traffic going out to wide-area networks.
The ideas, concepts, and mechanisms developed in Integrated Services also found their way into later work on QoS. For example, controlled load service has influenced the development of Differentiated Services, and similar resource reservation capability has been incorporated into MPLS for bandwidth guarantees over traffic trunks in the backbones.
1.1.2 Differentiated Services
The Differentiated Services architecture was developed as an alternative resource allocation scheme for service providers' networks. By mid-1997 service providers felt that Integrated Services were not ready for large-scale deployment, and at the same time the need for an enhanced service model had become more urgent. The Internet community started to look for a simpler and more scalable approach to offer a better-than-best-effort service.
After a great deal of discussion, the IETF formed a new working group to develop a framework and standards for allocating different levels of services in the Internet. The new approach, called Differentiated Services, is significantly different from Integrated Services. Instead of making per-flow reservations, the Differentiated Services architecture uses a combination of edge policing, provisioning, and traffic prioritization to make service differentiation possible.
In the Differentiated Services architecture, users' traffic is divided into a small number of forwarding classes. For each forwarding class, the amount of traffic that users can inject into the network is limited at the edge of the network. By changing the total amount of traffic allowed in the network, service providers can adjust the level of resource provisioning and hence control the degree of resource assurance to the users.
The edge of a Differentiated Services network is responsible for mapping packets to their appropriate forwarding classes. This packet classification is typically done based on the service level agreement (SLA) between the user and its service provider. The nodes at the edge of the network also perform traffic policing to protect the network from misbehaving traffic sources. Nonconforming traffic may be dropped, delayed, or marked with a different forwarding class. The forwarding class is directly encoded into the packet header. After packets are marked with their forwarding classes at the edge of the network, the interior nodes of the network can use this information to differentiate the treatment of the packets. The forwarding classes may indicate drop priority or resource priority. For example, when a link is congested, the network will drop packets with the highest drop priority first.
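Edge policing of this kind is often built on a token bucket, and a minimal sketch of that idea follows. The token bucket itself, the class name TokenBucketMarker, the rate and burst values, and the labels "in-profile" and "out-of-profile" are all assumptions chosen for illustration; an actual deployment would mark the forwarding class in the packet header according to the SLA.

```python
class TokenBucketMarker:
    """Mark packets as conforming or nonconforming to a traffic profile."""

    def __init__(self, rate, burst):
        self.rate = rate        # tokens (bytes) credited per second
        self.burst = burst      # bucket depth in bytes, i.e. the allowed burst
        self.tokens = burst     # the bucket starts full
        self.last = 0.0         # arrival time of the previous packet (seconds)

    def mark(self, arrival_time, size):
        # Credit tokens for the time elapsed since the last packet,
        # never exceeding the bucket depth.
        elapsed = arrival_time - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = arrival_time

        if size <= self.tokens:
            self.tokens -= size
            return "in-profile"       # conforming: keep the contracted class
        return "out-of-profile"       # nonconforming: candidate for remarking or drop


marker = TokenBucketMarker(rate=1000, burst=1500)
arrivals = [(0.0, 1000), (0.1, 1000), (1.0, 1000)]   # (time in s, size in bytes)
marks = [marker.mark(t, size) for t, size in arrivals]
print(marks)
# ['in-profile', 'out-of-profile', 'in-profile']
```

Interior nodes would then only need to read the mark, for example by dropping out-of-profile packets first when a link is congested, rather than keeping any per-flow state.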
Differentiated Services do not require resource reservation setup. The allocation of forwarding classes is typically specified as part of the SLA between the customer and its service provider, and the forwarding classes apply to traffic aggregates rather than to individual flows. These features work well with transaction-oriented Web applications. The Differentiated Services architecture also eliminates many of the scalability concerns with Integrated Services. The functions that interior nodes have to perform to support Differentiated Services are relatively simple. The complex process of classification is needed only at the edge of the network, where traffic rates are typically much lower.
The Differentiated Services approach relies on provisioning to provide resource assurance. The quality of the assurance depends on how provisioning is carried out and how the resources are managed in the network. These issues are explored in the next section, where we discuss performance optimization in the networks. Because of the dynamic nature of traffic flows, precise provisioning is difficult. Thus it generally is more difficult, and certainly more expensive, to provide deterministic guarantees through provisioning than through reservation.
1.2 Performance Optimization
Once the resource allocation architecture and service models are in place, the second issue in resource allocation is performance optimization; that is, how to organize the resources in a network in the most efficient way to maximize the probability of delivering the commitments and minimize the cost of delivering them. The connection between performance optimization and QoS support may seem less direct compared with resource allocation. Performance optimization is, however, an important building block in the deployment of QoS. Implementing QoS goes well beyond just adding mechanisms such as traffic policing, classification, and scheduling; fundamentally, it is about developing new services over the Internet. Service providers must make a good business case so that customers are willing to pay for the new services and the new services will increase the return on their investment in the networks. The cost-effectiveness of the new services made possible by QoS capabilities is a major factor in the rollout of these services.
The Internet's datagram routing was not designed for optimizing the performance of the network. Scalability and maintaining connectivity in the face of failures were the primary design objectives. Routing protocols typically select the shortest path to a destination based on some simple metrics, such as hop count or delay. Such simple approaches are clearly not adequate for supporting resource allocation. For example, to make a reservation, we need to find a path with certain requested resources, such as bandwidth, but IP routing does not have the necessary information to make such decisions. Simply using the shortest-path algorithm for selecting paths is likely to cause a high rejection rate and poor utilization. Shortest-path routing does not always use the diverse connections available in the network. In fact, traffic is often unevenly distributed across the network, which can create congestion hot spots at some points while some other parts of the network may be very lightly loaded.
Performance optimization requires additional capabilities in IP routing and performance management tools. To manage the performance of a network, it is necessary to have explicit control over the paths that traffic flows traverse so that traffic flows can be arranged to maximize resource commitments and utilization of the network. MPLS has a mechanism called explicit routing that is ideal for this purpose. MPLS uses the label-switching approach to set up virtual circuits in IP-based networks. These virtual circuits can follow destination-based IP routing, but the explicit routing mechanism in MPLS also allows us to specify hop by hop the entire path of these virtual circuits. This provides a way to override the destination-based routing and set up traffic trunks based on traffic-engineering objectives.
The process of optimizing the performance of networks through efficient provisioning and better control of network flows is often referred to as traffic engineering. Traffic engineering uses advanced route selection algorithms to provision traffic trunks inside backbones and arrange traffic flows in a way that maximizes the overall efficiency of the network. The common approach is to calculate traffic trunks based on flow distribution and then set up the traffic trunks as explicit routes with the MPLS protocol. The combination of MPLS and traffic engineering provides IP-based networks with a set of advanced tools for service providers to manage the performance of their networks and provide more services at less cost.
1.2.1 Multiprotocol Label Switching
MPLS was originally seen as an alternative approach for supporting IP over ATM. Although several approaches for running IP over ATM were standardized, most of the techniques are complex and have scaling problems. The need for more seamless IP/ATM integration led to the development of MPLS in 1997. The MPLS approach allows IP routing protocols to take direct control over ATM switches, and thus the IP control plane can be tightly integrated with the rest of the IP network.
The technique that MPLS uses is known as label switching. A short, fixed-length label is encoded into the packet header and used for packet forwarding. When a label switch router (LSR) receives a labeled packet, it uses the incoming label in the packet header to find the next hop and the corresponding outgoing label. With label switching, the path that a packet traverses, called the label switched path (LSP), has to be set up before it can be used for label switching.
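The lookup performed at each LSR can be sketched as a simple table indexed by the incoming label. The router names, label values, and the convention of using None to signal that the label is removed at the last hop are assumptions for illustration; the tables are taken to have been installed when the LSP was set up.

```python
# Per-LSR label table (hypothetical contents):
# incoming label -> (next hop, outgoing label)
LABEL_TABLES = {
    "LSR1": {17: ("LSR2", 42)},
    "LSR2": {42: ("LSR3", 8)},
    "LSR3": {8: ("egress", None)},   # None: no outgoing label after this hop
}


def forward(start, label):
    """Follow a label switched path and return the (node, label, next hop) steps."""
    node, hops = start, []
    while label is not None:
        # One exact-match lookup on the incoming label yields both the
        # next hop and the label to write into the packet header.
        next_hop, out_label = LABEL_TABLES[node][label]
        hops.append((node, label, next_hop))
        node, label = next_hop, out_label
    return hops


print(forward("LSR1", 17))
# [('LSR1', 17, 'LSR2'), ('LSR2', 42, 'LSR3'), ('LSR3', 8, 'egress')]
```

Note that the label changes at every hop and has only local meaning, while the destination address of the packet is never consulted along the path.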
In addition to improving IP/ATM integration, MPLS may also be used to simplify packet forwarding. Label lookup is much easier compared with prefix lookup in IP forwarding. With MPLS, packet forwarding can be done independent of the network protocols, and so forwarding paradigms beyond the current destination-based one can be easily supported. However, the driving force behind the wide deployment of MPLS has been the need for traffic engineering in Internet backbones. The explicit route mechanism in MPLS provides a critical capability that is currently lacking in the IP-based networks. MPLS also incorporates concepts and features from both Integrated Services and Differentiated Services. For example, MPLS allows bandwidth reservation to be specified over an LSP, and packets can be marked to indicate their loss priority. All these features make MPLS an ideal mechanism for implementing traffic-engineering capabilities in the Internet.
The purpose of MPLS is not to replace IP routing but rather to enhance the services provided in IP-based networks by offering scope for traffic engineering, guaranteed QoS, and virtual private networks (VPNs). MPLS works alongside the existing routing technologies and provides IP networks with a mechanism for explicit control over routing paths. MPLS allows two fundamentally different data-networking approaches, datagram and virtual circuit, to be combined in IP-based networks. The datagram approach, on which the Internet is based, forwards packets hop by hop based on their destination addresses. The virtual circuit approach, used in ATM and frame relay, requires connections to be set up. With MPLS, the two approaches can be tightly integrated to offer the best combination of scalability and manageability.
MPLS control protocols are based on IP addressing and transport and therefore can be more easily integrated with other IP control protocols. This creates a unified IP-based architecture in which MPLS is used in the core for traffic engineering and IP routing for scalable domain routing. Several recent proposals have even considered extending the MPLS protocols to optical transport networks. MPLS may well become the standard signaling protocol for the Internet.
1.2.2 Traffic Engineering
The basic problem addressed in traffic engineering is as follows: Given a network and traffic demands, how can traffic flows in the network be organized so that an optimization objective is achieved? The objective may be to maximize the utilization of resources in the network or to minimize congestion in the network. Typically the optimal operating point is reached when traffic is evenly distributed across the network. With balanced traffic distribution, both queuing delay and loss rates are at their lowest points.
Obviously these objectives cannot be achieved through destination-based IP routing; there simply is not sufficient information available in IP routing to make such optimization possible. In traffic engineering, advanced route selection techniques, often referred to as constraint-based routing in order to distinguish them from destination routing, are used to calculate traffic trunks based on the optimization objectives. To perform such optimization, the traffic-engineering system often needs networkwide information on topology and traffic demands. Thus traffic engineering is typically confined to a single administrative domain.
The routes produced by constraint-based routing are most likely different from those in destination-based IP routing. For this reason these constraint-based routes cannot be implemented by destination-based forwarding. In the past, many service providers used ATM in the backbones to support constraint-based routing. ATM virtual circuits can be set up to match the traffic patterns; the IP-based network is then overlaid on top of these virtual circuits. MPLS offers a better alternative since it offers similar functions yet can be tightly integrated with IP-based networks.
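One simple form of constraint-based routing is to discard the links that cannot satisfy a bandwidth requirement and then run an ordinary shortest-path computation on what remains. The sketch below takes that approach as an assumption; the function name, the four-node topology, and the cost and bandwidth figures are hypothetical, and real traffic-engineering systems may optimize over many demands at once rather than one path at a time.

```python
import heapq


def constrained_shortest_path(links, src, dst, min_bw):
    """Return (cost, path) for the cheapest path whose links all offer `min_bw`.

    links -- dict mapping a directed link (u, v) to (cost, available_bw)
    Returns None if no feasible path exists.
    """
    # Constraint step: keep only links with enough available bandwidth.
    graph = {}
    for (u, v), (cost, bw) in links.items():
        if bw >= min_bw:
            graph.setdefault(u, []).append((v, cost))

    # Route selection step: Dijkstra's algorithm on the pruned topology.
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (dist + cost, nxt, path + [nxt]))
    return None


links = {
    ("A", "B"): (1, 2), ("B", "D"): (1, 2),     # short path, little bandwidth
    ("A", "C"): (2, 10), ("C", "D"): (2, 10),   # longer path, more bandwidth
}
print(constrained_shortest_path(links, "A", "D", min_bw=1))   # (2, ['A', 'B', 'D'])
print(constrained_shortest_path(links, "A", "D", min_bw=5))   # (4, ['A', 'C', 'D'])
print(constrained_shortest_path(links, "A", "D", min_bw=20))  # None
```

The second call returns a route that destination-based shortest-path forwarding would never choose, which is why such routes need an explicit-routing mechanism to be installed.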
The existing Internet backbones have used the so-called overlay model for traffic engineering. With the overlay model, service providers build a virtual network comprising a full mesh of logical connections between all edge nodes. Using the traffic demands between the edge nodes as input, constraint-based routing selects a set of routes for the logical connections to maximize the overall resource utilization in the network. Once the routes are computed, MPLS can be used to set up the logical connections as LSPs exactly as calculated by constraint-based routing.
The downside of the overlay model is that it may not be able to scale to large networks with a substantial number of edge nodes. To set up a full-mesh logical network with N edge nodes, each edge node has to connect to the other (N - 1) edge nodes, resulting in N × (N - 1) logical connections. This can add significant messaging overheads in a large network. Another problem is that the full-mesh logical topology increases the number of peers (neighbors that routers talk to) that a routing protocol has to handle; most current implementations of routing protocols cannot support a very large number of peers. In addition to the increased peering requirements, the logical topology also increases the processing load on routers during link failures. Because multiple logical connections go over the same physical link, the failure of a single physical link can cause the breakdown of multiple logical links from the perspective of IP routing.
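To get a feel for how quickly the full mesh grows, the count of N × (N - 1) logical connections can be tabulated for a few network sizes. The function name and the sample sizes are chosen only for illustration.

```python
def full_mesh_connections(n_edge_nodes):
    """Number of logical connections when every edge node connects to all others."""
    return n_edge_nodes * (n_edge_nodes - 1)


for n in (10, 50, 200):
    print(n, full_mesh_connections(n))
# 10 90
# 50 2450
# 200 39800
```

Growing the number of edge nodes by a factor of 20 multiplies the number of logical connections by more than 400, so the overhead rises roughly with the square of the network size.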
Traffic engineering without full-mesh overlaying is still a challenge. One heuristic approach that some service providers have used is to adjust traffic distribution by changing the link weights in IP routing protocols. For example, when one link is congested, the link weight can be increased in order to move traffic away from this link. Theoretically one can achieve the same traffic distribution as in the overlay model by manipulating the link weights in the Open Shortest Path First (OSPF) routing protocol. This approach has the advantage that it can be readily implemented in existing networks without major changes to the network architecture.
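The link-weight heuristic can be illustrated with a small shortest-path computation. The topology, the weight values, and the function name shortest_path are assumptions for illustration; the sketch models only the path selection that follows from the weights, not the OSPF protocol itself.

```python
import heapq


def shortest_path(weights, src, dst):
    """Return (cost, path) of the least-weight path; links are bidirectional."""
    graph = {}
    for (u, v), w in weights.items():
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))

    heap = [(0, src, [src])]
    visited = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, w in graph[node]:
            if nxt not in visited:
                heapq.heappush(heap, (dist + w, nxt, path + [nxt]))
    return None


weights = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 1, ("C", "D"): 2}
before = shortest_path(weights, "A", "D")

weights[("B", "D")] = 5          # link B-D is congested: raise its weight
after = shortest_path(weights, "A", "D")

print(before)   # (2, ['A', 'B', 'D'])
print(after)    # (3, ['A', 'C', 'D'])
```

Raising a single weight moves the traffic between A and D onto the other path, but it also affects every other source and destination pair whose shortest path used that link, which is part of what makes weight tuning a heuristic.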
1.3 Summary
The need for QoS capabilities in the Internet stems from the fact that best-effort service and datagram routing do not meet the needs of many new applications, which require some degree of resource assurance in order to operate effectively. Diverse customer requirements also create a need for service providers to offer different levels of services in the Internet.
The Internet community has developed a number of new technologies to address these issues. Integrated Services and Differentiated Services provide new architectures for resource allocation in the Internet. Integrated Services use reservation to provide guaranteed resources for individual flows. The Differentiated Services architecture takes a different approach. It combines edge policing, provisioning, and traffic prioritization to provide different levels of services to customers.
MPLS and traffic engineering address the issues of bandwidth provisioning and performance optimization in Internet backbones. The explicit route mechanism in MPLS adds an important capability to the IP-based network. Combined with constraint-based routing, MPLS can help network providers make the best use of available resources and reduce costs.
Further Reading
The following Web site has a collection of articles related to the early history of the Internet: www.bell-labs.com/user/zhwang/index.html.
The basic principles of datagram networks and a detailed design were first described by Paul Baran in his 1964 RAND report "On Distributed Communications." Although the report was discovered after the ARPANET had already started, the current Internet is remarkably close to what Paul Baran originally had in mind. This 12-volume historical report is now available on-line at www.rand.org/publications/RM/baran.list.html. For a general introduction to data networking and the Internet, we recommend the following:
Peterson, L., and B. Davie. Computer Networks: A Systems Approach. San Francisco: Morgan Kaufmann, 1999.