[INET'99]

Simulation Study on the DS Forwarding Architectures

Mika LOUKOLA <mika.loukola@hut.fi>
Jorma SKYTTA <jorma.skytta@hut.fi>
Helsinki University of Technology
Finland

Abstract

This paper presents performance simulations of Internet protocol (IP) traffic marked with differentiated services (DS) codepoints on the forwarding path of a DS-capable router. The DS Internet drafts present several possible per-hop behaviors (PHBs) and example mechanisms for implementing them. These include one queue for each traffic class, with higher-priority classes having strict priority over the lower-priority queues. Another possibility is to implement a PHB group with one queue per two PHBs; in this case active queue management is needed, and random early detection with in and out (RIO) is a good example. The extreme case would be having only one queue for the whole PHB group; this requires weighted random early detection (WRED) queue management with different threshold values for each PHB. The simulation results reveal the characteristics of each approach and guide vendors in selecting the right mechanism for each PHB.

Introduction

Traditionally, the physical interface of the Internet user has been the only factor determining the quality of service (QoS) experienced by the customer. New applications have emerged that demand much improved service quality. At the same time a new customer group has come into existence: corporate customers who do not wish to build their own corporate networks but want to use the Internet as one. The differentiated services (DS) architecture offers service providers ways to offer each customer a differentiated QoS regardless of the physical access interface. The Differentiated Services Working Group (DSWG) defines the framework architecture for differentiated services. The DSWG defines the marking of Internet protocol (IP) packets or, more precisely, the allocation of DS codepoints (DSCPs). Those codepoints are used to indicate how the packet is treated in the core network. The DSWG defines only the behaviors associated with the respective DSCPs, not the mechanisms that implement them; it is up to the vendor how to implement each per-hop behavior (PHB). This paper presents simulation results for different implementation mechanisms and comparisons between them. This information is vital when selecting the implementation mechanism for each PHB.

The differentiated services architecture

The DSWG has redefined the TOS field of IPv4 [9] and the 8-bit class field of IPv6 [10,11], renaming both as the DS field. The DSWG allocates DSCPs from the DS field space to be used for different PHBs. In general, the DSWG should be specifying behaviors, syntax, and semantics, not mechanisms and policies (see Fig. 1).


Fig. 1. The Role of DSWG

The DSCP address space has been divided into standard action, local use, and experimental use pools. The standard action pool is designated for common Internet usage. Several Internet drafts recommend the use of these DSCPs; the recommendations include the Default PHB [3], the Class Selector Compliant (CSC) PHB Group [3], the EF PHB [7], the AF PHB Group [4], and the Dynamic RT/NRT PHB Group [5]. Table 1 summarizes the status of the recommended codepoints. The notation is the same as that used in the respective drafts.

Table 1. Recommended DS codepoints
DSCP Value PHB
000000 (P1,00) Default PHB, Class Selector - DP0
000010 (P1,01) DRT_11
000100 (P1,02) DRT_12
000110 (P1,03) DRT_13
001000 (P1,04) Class Selector - DP1
001010 (P1,05) DRT_14
001100 (P1,06) DRT_15
001110 (P1,07) DRT_16
010000 (P1,08) Class Selector - DP2, AF11
010010 (P1,09) AF12
010100 (P1,10) AF13
010110 (P1,11) DRT_21
011000 (P1,12) Class Selector - DP3, AF21
011010 (P1,13) AF22
011100 (P1,14) AF23
011110 (P1,15) DRT_22
100000 (P1,16) Class Selector - DP4, AF31
100010 (P1,17) AF32
100100 (P1,18) AF33
100110 (P1,19) DRT_23
101000 (P1,20) Class Selector - DP5, AF41
101010 (P1,21) AF42
101100 (P1,22) AF43
101110 (P1,23) EF
110000 (P1,24) Class Selector - DP6
110010 (P1,25) DRT_24
110100 (P1,26) DRT_25
110110 (P1,27) DRT_26
111000 (P1,28) Class Selector - DP7
111010 (P1,29) no recommendation
111100 (P1,30) no recommendation
111110 (P1,31) no recommendation

Bringing QoS to the user

A common mistake is to think that the DS framework attempts to achieve only soft service guarantees over the Internet. This idea frequently leads to the incorrect conclusion that DS can support only weak guarantees about service levels, as well as to the assumption that the delivered services are always implicitly lossy because of mandatory packet drops. The DS architecture includes additional concepts -- admission control and traffic shaping -- clearly described in the drafts.

Admission controls enable the construction of backbone transport classes that are not oversubscribed and therefore can meet much stricter service-level agreements (SLAs). With the correct provisioning these SLAs could possibly approach the class of service normally associated with Frame Relay CIR (committed information rate) or ATM (asynchronous transfer mode) CBR (constant bit rate) circuits. Traffic shaping permits the rate-based guiding of synchronous traffic flows, including TCP aggregates, without the need for routine packet drops. Instead, the scheduling algorithm regulates class throughput.

As these additional capabilities become more widely understood and deployed, DS will be less frequently referred to as a simple-priority packet-dropping scheme capable of providing few if any hard-service-level guarantees.

The simulated forwarding architectures

The simulated architectures support a PHB Group containing eight priority or drop precedence (DP) levels indicated by eight separate DSCPs. DSCP values for each PHB are as follows: DP0=000xxx, DP1=001xxx, DP2=010xxx, DP3=011xxx, DP4=100xxx, DP5=101xxx, DP6=110xxx, and DP7=111xxx. A lower drop precedence level (DP0) indicates a higher relative order. The key idea in the DS traffic differentiation is to guarantee that the packets marked with a DP level indicating a lower relative order do not have higher probability of timely forwarding than do packets marked with a DP indicating a higher relative order. [3]
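Because the DP level is carried in the three most significant bits of the DSCP, extracting it is a single shift. A minimal sketch (the function name dp_level is ours, not from the drafts):

```python
def dp_level(dscp: int) -> int:
    """Return the drop precedence level (0-7) of a 6-bit DSCP.

    The three most significant bits select the level, so e.g.
    101110 (EF) falls into the DP5 group (101xxx).
    """
    return (dscp >> 3) & 0b111
```

For example, dp_level(0b101110) yields 5, placing the EF codepoint in the DP5 group of the simulated PHB group.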


Fig. 2. Forwarding Architecture Model 1

Model 1 illustrated in Figure 2 shows the basic configuration of having separate queues for each PHB.

Model 2 (see Fig. 3) separates the packets sharing the same output port into four queues. Each queue is shared by two PHBs. In this case RED with in and out (RIO) is selected as the active queue management algorithm. The different threshold values for each PHB sharing the same queue are shown in Fig. 4.


Fig. 3. Forwarding Architecture Model 2


Fig. 4. Discard Probabilities of the Two DPs Sharing the Same Queue in Model 2

Model 3 (Fig. 5) is very similar to Model 1 (Fig. 2), but it uses a round-robin type of scheduling algorithm. In this way the packets marked with DPs indicating a lower relative order do not starve because a specific amount of the total bandwidth is allocated for each queue.


Fig. 5. Forwarding Architecture Model 3

Figure 6 illustrates Model 4. In this case there is only one queue for all eight PHBs. Weighted random early detection (WRED) is the selected active queue management algorithm. The threshold values and discard probabilities of each PHB are illustrated in Figure 7.


Fig. 6. Forwarding Architecture Model 4


Fig. 7. Discard Probabilities of the Eight DPs Sharing the Same Queue in Model 4
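The WRED decision of Model 4 can be sketched as follows. The per-DP (min_th, max_th, max_p) parameters below are hypothetical placeholders chosen only to show the shape of the algorithm; the thresholds actually simulated are those of Figure 7.

```python
import random

# Hypothetical per-DP WRED parameters: (min threshold, max threshold,
# maximum drop probability). Higher DP values (lower relative order)
# get lower thresholds and a steeper drop curve, in the spirit of Fig. 7.
WRED_PARAMS = {dp: (16 - 2 * dp, 20 - dp, 0.1 * (dp + 1)) for dp in range(8)}

def wred_drop(avg_queue_len: float, dp: int, rng=random.random) -> bool:
    """Decide whether to drop an arriving packet of the given DP level."""
    min_th, max_th, max_p = WRED_PARAMS[dp]
    if avg_queue_len < min_th:
        return False            # queue short enough: always admit
    if avg_queue_len >= max_th:
        return True             # past the max threshold: always drop
    # Linear ramp between the thresholds, as in classic RED
    p = max_p * (avg_queue_len - min_th) / (max_th - min_th)
    return rng() < p
```

With parameters graded this way, a packet of a higher DP level starts being discarded at a shorter average queue, which reproduces the behavior reported for Model 4 below.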

In Model 5 (Fig. 8) there are four PHBs in each queue. The discard probabilities of the PHBs sharing the same queue are illustrated in Figure 9.


Fig. 8. Forwarding Architecture Model 5


Fig. 9. Discard Probabilities of the Four DPs Sharing the Same Queue in Model 5

Simulation results

While classifiers police the input traffic, an output port may still be overloaded from time to time. These simulations illustrate a 12.5 percent overload on an output port and reveal how the proposed forwarding architectures differ from each other. In all the simulated architectures, the input traffic consists of packets marked with eight different DSCP values. The DSCP value of each packet is drawn at random, so all DSCP values are equally probable. The simulation period is one second. A new packet is generated every 142 µs with a packet size of 400 bytes (=22.5 Mbps). The destination IP address causes each packet to be forwarded to the monitored output port. Packets are forwarded every 160 µs (=20.0 Mbps), which results in the monitored output port being overloaded by 12.5 percent. Each queue has a length of 20 packets.
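The load figures quoted above follow directly from the packet size and the two timer values:

```python
# Offered vs. serviced load in the simulations: 400-byte packets,
# one arrival every 142 us, one departure every 160 us.
PACKET_BITS = 400 * 8                          # 3200 bits per packet

offered_rate = PACKET_BITS / 142e-6            # about 22.5 Mbit/s
service_rate = PACKET_BITS / 160e-6            # exactly 20.0 Mbit/s
overload = offered_rate / service_rate - 1.0   # about 12.5 percent
```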

When a round-robin scheduler is used, the queues are served in the sequence stated below:

    /* Weighted service order: queue i is served
       (number_of_queues - i) times per round. */
    for (j=number_of_queues;j>0;j--) {
        for (i=0;i<j;i++) {
            serve_queue(i);
        }
    }

When the round-robin or strict priority scheduler finds the queue to be served currently empty, it moves on to the next queue and serves that one instead. This way, as long as any packets are present in any of the output queues, one packet is forwarded every 160 µs. In each simulated architecture, an identical number of packets is thus forwarded to the monitored output port.
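The nested loop above can be expanded into one round's service order; a short sketch (the function name is ours) shows the bandwidth share each queue receives:

```python
from collections import Counter

def service_sequence(number_of_queues: int = 8) -> list:
    """Expand the nested round-robin loop into one round's service order.

    Queue i appears (number_of_queues - i) times per round, so the
    queues with a higher relative order receive a larger bandwidth
    share while every queue is still guaranteed one visit per round.
    """
    order = []
    for j in range(number_of_queues, 0, -1):
        order.extend(range(j))
    return order

# Per-round shares: queue 0 is visited 8 times in a 36-slot round,
# queue 7 only once.
shares = Counter(service_sequence())
```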

Model 1

As Figure 10 shows, the packets marked with lower DP values stay in the node for a much shorter time. The strict priority scheduling almost starves the queue containing packets marked with DP7. Table 2 shows what percentage of the total packets marked with a certain DP value have been forwarded. The rest have been discarded by the queue management algorithm or are left in the queues after the simulation has ended.


Fig. 10. Scaled Cumulative Transfer Delay for Each Drop Precedence Level in Model 1.

Table 2. Forwarding Percentages, Model 1
DP Level Percentage of Packets Forwarded Through the Node
DP0 100.00
DP1 100.00
DP2 100.00
DP3 100.00
DP4 100.00
DP5 99.89
DP6 100.00
DP7 10.32
TOTAL 88.75

Model 2

When two DP levels are inserted into the same queue, the RIO queue management algorithm makes the effective queue length different for each DP level sharing the queue. As Figure 4 shows, the average queue length experienced by packets marked with the higher DP value is only 50 percent of the total length, while packets marked with the lower DP level experience a queue length of 75 percent of the actual queue size. Thus, the packets marked with higher DP values are forwarded faster than the packets having a higher relative order (see Fig. 11). At first this might sound like an attack on the DS framework idea, in which packets marked with DP values indicating a higher relative order should have a higher probability of timely forwarding. Table 3, however, reveals the whole picture: a lower percentage of the packets marked with higher DP values get forwarded. Solutions of this kind would serve well real-time applications that require low delay and a relatively low bit rate. If real-time packets were inserted into a queue with a high occupancy level, they would arrive too late to be useful at the final destination.
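A simplified, drop-tail-style reading of this behavior can be sketched as follows. The real RIO algorithm drops probabilistically on an average queue length; the 50 and 75 percent limits here are the effective-queue-length figures quoted above, and the function name is ours.

```python
QUEUE_LEN = 20   # queue length used in the simulations (packets)

def rio_admit(occupancy: int, high_dp: bool) -> bool:
    """Admit or refuse an arriving packet in a Model 2 queue.

    The higher DP value (lower relative order) is refused once the
    queue is half full; the lower DP value is refused at three
    quarters. Hence higher-DP packets see a shorter queue (and a
    shorter delay) but are dropped more often.
    """
    limit = 0.50 * QUEUE_LEN if high_dp else 0.75 * QUEUE_LEN
    return occupancy < limit
```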


Fig. 11. Scaled Cumulative Transfer Delay for Each Drop Precedence Level in Model 2

Table 3. Forwarding Percentages, Model 2
DP Level Percentage of Packets Forwarded Through the Node
DP0 100.00
DP1 100.00
DP2 100.00
DP3 100.00
DP4 99.89
DP5 99.77
DP6 76.42
DP7 34.46
TOTAL 88.75

Model 3

The average delay values seen in Figure 12 are similar to those in Model 1. However, in this case the round-robin scheduler guarantees bandwidth for the higher DP queues and thus spreads the discard of packets among the higher DP queues (see Table 4).


Fig. 12. Scaled Cumulative Transfer Delay for Each Drop Precedence Level in Model 3

Table 4. Forwarding Percentages, Model 3
DP Level Percentage of Packets Forwarded Through the Node
DP0 100.00
DP1 100.00
DP2 100.00
DP3 99.66
DP4 99.89
DP5 99.66
DP6 74.20
DP7 36.84
TOTAL 88.75

Model 4

In this case only one queue is used for all DP levels. The same effect that was seen in Model 2 repeats itself. The effective queue length is smaller among higher DP values (see Fig. 13). Table 5 shows that the higher DP values are discarded more often.


Fig. 13. Scaled Cumulative Transfer Delay for Each Drop Precedence Level in Model 4

Table 5. Forwarding Percentages, Model 4
DP Level Percentage of Packets Forwarded Through the Node
DP0 100.00
DP1 100.00
DP2 100.00
DP3 100.00
DP4 99.43
DP5 97.73
DP6 70.66
DP7 42.22
TOTAL 88.75

Model 5

This architecture results in two groups of delays as seen in Figure 14. The effective queue length is as stated in Figure 9.


Fig. 14. Scaled Cumulative Transfer Delay for Each Drop Precedence Level in Model 5

None of the packets inserted into the queue with the higher relative order has been discarded (see Table 6).

Table 6. Forwarding Percentages, Model 5
DP Level Percentage of Packets Forwarded Through the Node
DP0 100.00
DP1 100.00
DP2 100.00
DP3 100.00
DP4 99.43
DP5 99.55
DP6 69.95
DP7 41.20
TOTAL 88.75

Comparison between the simulated architectures

The simulation results show very clearly how the DS router architectures treat each of the eight simultaneous DSCP packet streams. When selecting the proper implementation mechanisms, one should carefully look at the packet loss characteristics of each approach. These models have been previously studied in Loukola 1998 [12].

Random early detection (RED) is not by itself the answer we need for DS, assuming DS means more than just a set of TCP applications. If DS means putting nontraditional traffic on the Internet -- that is, traffic that is not just point-to-point data -- then implementing RED alone is not the answer. Along with RED one must add caveats and new constraints.

RED might do wonders to decrease congestion at routers, but it will only work for new nondata users if it leaves a lot of free headroom. For example, if a network carries only data (applications well suited to congestion avoidance), it can be used up to, say, 90 percent of capacity. But if a network must carry a heavy dose of voice and video, RED must be adjusted to keep the data traffic at, say, 40 percent of the maximum capacity in order to give the nonthrottling traffic a chance at decent service. The right amount obviously depends on the relative amount of unicast data versus other traffic.

So along with RED, one must start thinking of how much headroom must be set aside for the best-effort and better-than-best-effort part of DS. Video and voice, for example, won't use TCP, but they also cannot accept a huge amount of packet loss.

Table 7 shows an example selection of implementation mechanisms for EF, AF, Class Selector Compliant, and DRT PHB Groups.

Table 7. Example Selection of Mechanisms
PHB Mechanism
EF Model 1
AF Models 3, 5 (one queue for each class, WRED, round-robin)
CSC Model 3
DRT Model 5 (separate queues for NRT and RT classes)

When the router may not re-order packets of a flow belonging to a particular PHB Group (DRT and AF classes), then all the packets of each flow must be inserted into a single first in, first out (FIFO) queue. Class Selector Compliant does not have this restriction and can thus utilize more queues.

Conclusions

Consider WRED as an admission mechanism into a single memory block from which a scheduler extracts packets according to precedence. If the higher-weighted traffic is dropped with a much lower probability, then that traffic will tend to starve out the traffic marked with lower relative order even when both sets of traffic are properly congestion-avoiding.

The only really obvious fix is to apply strict admission control to the higher-precedence traffic, so that it never uses more than a particular percentage of the bottleneck bandwidth. This can be done with a round-robin scheduler.

On the other hand, WRED control laws that start discarding packets only at high queue utilization can leave the lower relative order traffic starved behind a persistently long queue of packets marked with a higher relative order.

RED applied with different control laws on individual queues that are serviced according to precedence seems to trade one problem (removing all buffer space from the lower relative order traffic) for another (transmitting little lower relative order traffic because the average queue length at the higher relative order is long). It is hard to imagine any application that really benefits from a long average queue length at a bottleneck. Therefore, the control law should be set to take source dynamics into account and explicitly keep the average queue length short by dropping more heavily when there is more bottleneck contention.

Vendors should really make it possible to populate an array of drop probabilities, indexed by the current average queue length, with an arbitrary control law. Some "considered useful and popular" control laws (a parabola, the original 1993 RED law, or traditional FIFO tail drop) might be generated through a shorthand in the configuration, as in Cisco 7200 series RSP processors [13].
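Such a configurable control law amounts to filling an array of drop probabilities indexed by the average queue length. A minimal sketch with two example shorthand generators (the function names are ours):

```python
QUEUE_LEN = 20   # queue length used in the simulations (packets)

def parabolic_law(max_p: float = 1.0) -> list:
    """Parabola-shaped drop curve: gentle while the queue is short,
    increasingly aggressive as it fills."""
    return [max_p * (q / QUEUE_LEN) ** 2 for q in range(QUEUE_LEN + 1)]

def tail_drop_law() -> list:
    """Traditional FIFO tail drop: never drop until the queue is full."""
    return [0.0] * QUEUE_LEN + [1.0]

# Index the chosen law with the current average queue length.
drop_prob = parabolic_law()
```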

References

  1. Blake S., et al., "An Architecture for Differentiated Services," Internet Draft <draft-ietf-diffserv-arch-02.txt>, Torrent Networking Technologies, October 1998
  2. Nichols K., "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers," Internet Draft <draft-ietf-diffserv-header-04.txt>, Cisco Systems Inc., October 1998
  3. Nichols K., et al., "Definition of the Differentiated Services Field (DS Byte) in the IPv4 and IPv6 Headers," Internet Draft <draft-ietf-diffserv-header-04.txt>, Bay Networks, October 1998
  4. Heinänen J., et al., "Assured Forwarding PHB Group," Internet Draft <draft-ietf-diffserv-af-03.txt>, Telia Finland, November 1998
  5. Loukola M.V., et al., "Dynamic RT/NRT PHB Group," Internet Draft <draft-loukola-dynamic-00.txt>, Helsinki University of Technology, November 1998
  6. Nichols K., et al., "A 2-Bit Differentiated Services Architecture for the Internet," Internet Draft <draft-nichols-diff-svc-arch-00.txt>, Bay Networks Inc., November 1997
  7. Jacobson V., et al., "An Expedited Forwarding PHB," Internet Draft <draft-ietf-diffserv-phb-ef-01.txt>, LBNL, November 1998
  8. Kilkki K., "Simple Integrated Media Access," Internet Draft <draft-kalevi-simple-media-access-01.txt>, Nokia Research Center, June 1997
  9. Postel J., et al., "Internet Protocol," RFC 791, Sun Microsystems, October 1981
  10. Deering S., Hinden R., "Internet Protocol, Version 6, Specification," RFC 1883, Xerox PARC, Ipsilon Networks Inc., December 1995
  11. Deering S., Hinden R., "Internet Protocol, Version 6, Specification," Internet Draft <draft-ietf-ipngwg-ipv6-spec-v2-01.txt>, November 1997
  12. Loukola M.V., "Differentiated Services Schemes and Application Feedback," in Proc. 1st IEEE International Conference on Networking the World (CNIW'98), 9-12 December, 1998, Ahmedabad, India, pp. 63-71
  13. Cisco Systems Inc., "Introduction: Quality of Service Overview," http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/12cgcr/qos_c/qcintro.htm, URL valid: 18 January 1999
