Michael H. Behringer <Michael.Behringer@dante.org.uk>
Dante
United Kingdom
Keywords: TF-TEN, JAMES, TEN-34, ATM WAN experiments, performance, SVC, signaling, ATM-ARP, NHRP, addressing, network management, CDVT, VBR, RSVP, ATM security.
Up to the beginning of 1997, Internet backbones in Europe consisted mostly of 2 Mbit/s line technology. In 1996 the national research networks (NRN) in Europe with DANTE as project manager started a project--similar to the Internet II efforts in the United States--to develop a pan-European backbone with higher speeds and more advanced technology: TEN-34 (Trans-European Network Interconnect at 34 Mbit/s). As there was desperate demand for more Internet bandwidth in the short term, the first phase of the TEN-34 project (3) was limited to providing a standard high-speed Internet service for the academic community. This TEN-34 network is now operational (4). The need for more advanced applications was, however, recognized, so the second goal of the project was to investigate new technologies such as bandwidth on demand, to support new applications, and to make them available on the TEN-34 network.
This paper describes the experiments carried out in the framework of the TEN-34 project to examine new technology trends, mostly based on ATM, for their suitability for the provision of backbone Internet services. A task force, consisting of experts of many European countries, was established to carry out these experiments. This Task Force TEN (Trans-European Networking) (1) is organized under the framework of TERENA (5).
The experiments described here cover a wide range of aspects of ATM, from network management to signaling, as well as performance-related issues and in-depth examinations of ATM technology-related issues such as cell delay variation. Unfortunately, the results are far from encouraging. In several areas of research we came to the conclusion that too many pieces of the jigsaw are still missing. In particular, the use of SVCs does not yet seem to be feasible over a WAN.
But why did we discover so many problems with ATM in the wide area, when there seem to be none in the local area? After all, SVCs are already being used successfully in LANs. The difference is that in the LAN, bandwidth is close to "unlimited." As soon as there are bandwidth restrictions, however, as is the case on WANs, it becomes far more difficult to accommodate many different traffic flows. Software and hardware have not yet adopted the mechanisms needed to cope with low bandwidth. Clearly ABR was designed to solve some of these issues, and implementations of device drivers and switch software will also improve. So our conclusion is not that the more advanced ATM features will not work, but that far more time is needed to deliver on the promises that were made for ATM.
The "Internet technology" is developing further as well, and it is not apparent yet whether either side can solve all the problems. Thus we are also looking into RSVP as an alternative model for reserving bandwidth. The state of NHRP is also under investigation. In these times of semi-religious arguments about ATM versus IP it is important to stress that we are not biased toward either technology. We have real demands, and the best technology will be deployed.
Quite a few ATM buzzwords, such as ABR and PNNI, do not appear in this paper. This is not because we did not consider them, but because implementations were either not available or we did not have time to experiment with them. The second phase of the TF-TEN tests, which started in May 1997, will cover these areas and continue some of the tests described here.
In the following sections the results of our experiments are outlined. Due to space limitations it was not possible to provide extensive explanations of the results. In general, experiments in which we discovered problems are covered in greater length. For more detailed explanations please see our Web site (1) or get in touch with the experiment leaders directly (see Acknowledgements). This list contains the experiments carried out so far:
The experiments carried out were TCP and UDP throughput measurements between one or more hosts. The workstations had to be patched to allow bigger window sizes for TCP. Netperf was used in single user mode to do the tests. The result of both tests was that a utilization of the VPs of almost the full theoretical maximum could be reached. We did not discover any unexpected interference with the ATM layer, and the results were as expected.
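The need for larger TCP windows and the notion of a "theoretical maximum" can be illustrated with a little arithmetic. The following is a minimal sketch, not the procedure used in the tests: it assumes IP over AAL5 with LLC/SNAP encapsulation (8 bytes of header plus the 8-byte AAL5 trailer), and the function names, round-trip time, and packet sizes are hypothetical values chosen for illustration.

```python
import math

ATM_CELL_BYTES = 53      # full cell size on the wire
ATM_PAYLOAD_BYTES = 48   # payload bytes carried per cell
AAL5_TRAILER_BYTES = 8   # AAL5 CPCS trailer
LLC_SNAP_BYTES = 8       # LLC/SNAP header (assumed encapsulation)


def required_window_bytes(bandwidth_bps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to keep the path full."""
    return math.ceil(bandwidth_bps * rtt_ms / 8000)


def cells_per_packet(ip_packet_bytes):
    """Cells needed for one IP packet; the last cell is padded to 48 bytes."""
    pdu_bytes = ip_packet_bytes + LLC_SNAP_BYTES + AAL5_TRAILER_BYTES
    return math.ceil(pdu_bytes / ATM_PAYLOAD_BYTES)


def max_ip_throughput_bps(cell_rate_bps, ip_packet_bytes):
    """Upper bound on IP-level throughput for a given raw cell rate."""
    wire_bytes = cells_per_packet(ip_packet_bytes) * ATM_CELL_BYTES
    return cell_rate_bps * ip_packet_bytes / wire_bytes


if __name__ == "__main__":
    # Hypothetical 60 ms round-trip time on a 34 Mbit/s path.
    print(required_window_bytes(34_000_000, 60))
    # Share of the raw cell rate left for 1500-byte IP packets.
    print(max_ip_throughput_bps(34_000_000, 1500) / 34_000_000)
```

Under these assumptions a 34 Mbit/s path with a 60 ms round-trip time needs about 255,000 bytes in flight, which is more than the 65,535 bytes an unscaled TCP window can advertise. A 1500-byte packet occupies 32 cells, so at most about 88% of the raw cell rate is available at the IP level.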
The tests showed that switching over a policed network is highly unstable and hardly usable for an operational service. There are a number of problems that have to be resolved before switching can be used in its intended way, between end applications:
These issues must be resolved, and all the equipment between end applications needs to support all those features to make SVCs work in the way that was intended. Availability of these features in fully supported software releases of workstations, routers, and switches is not yet in sight.
Although this full solution might still be some years away, one could envision the use of SVCs in a more constrained way to bypass some of the problems. For example, it would be possible to leave the "accounting" of available bandwidth to the human user. Prototype TCP implementations and prototype applications are available. But such a solution would, first of all, not meet the stability requirements of an operational service. And even then, there are still a number of problems in today's systems:
One way to avoid some of these problems is to do traffic shaping on a switch before the public network, which is policing the traffic. In this case it is the switch that would throw away the cells and the flow control is left to TCP. This solution works well. In our tests we noticed almost no cell losses, as there was sufficient buffering on the switches. Up to 10 concurrent TCP connections were tested over CBR VCs with 2 Mbit/s, and we could observe a fair sharing of bandwidth, as can be expected from using TCP. This solution, however, works only for TCP, and one UDP connection can break the whole system. We did not look into packet discard mechanisms here.
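The interaction between shaping and policing can be sketched with the generic cell rate algorithm (GCRA) in its virtual-scheduling form. This is an illustrative model only, not the configuration of the switches or of the public network used in the tests. The function names are hypothetical, the shaper is assumed to have an unlimited buffer, and the numbers are assumptions: a 2 Mbit/s contract gives one cell every 212 µs, the tolerance is set to 300 µs, and the burst is ten cells spaced 3 µs apart.

```python
def gcra_conforming(arrival_times, increment, limit):
    """Police a cell stream with GCRA(increment, limit), virtual scheduling.

    Returns one boolean per cell: True if the cell conforms to the contract.
    """
    tat = None  # theoretical arrival time of the next cell
    verdicts = []
    for t in arrival_times:
        if tat is None:
            tat = t  # the first cell defines the starting point
        if t < tat - limit:
            # Cell arrived too early: non-conforming, state is not updated.
            verdicts.append(False)
        else:
            verdicts.append(True)
            tat = max(t, tat) + increment
    return verdicts


def shape(arrival_times, increment):
    """Delay cells so that departures are at least `increment` apart.

    Assumes an unlimited buffer; a real switch would drop cells on overflow.
    """
    departures = []
    next_free = None
    for t in arrival_times:
        depart = t if next_free is None else max(t, next_free)
        departures.append(depart)
        next_free = depart + increment
    return departures


if __name__ == "__main__":
    burst_us = [3 * i for i in range(10)]  # hypothetical back-to-back burst
    print(gcra_conforming(burst_us, 212, 300))
    print(gcra_conforming(shape(burst_us, 212), 212, 300))
```

With these numbers the unshaped burst loses eight of its ten cells at the policer, while the shaped stream passes completely. Any loss then happens in the shaper's buffer rather than in the public network, which is what leaves flow control to TCP.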
We also investigated the set-up and tear-down times for international SVCs by sending a number of pings, in which the first one is longer than the subsequent ones, due to the set-up of the SVC (if the calls succeed). The distribution of set-up times was inconsistent and did not meet our expectations. More research on this is currently being undertaken.
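One simple way to turn such a ping series into a set-up time estimate is to subtract the typical round-trip time of the later pings from that of the first one. The sketch below assumes this approach; the function name and sample values are hypothetical and are not measurements from the experiment.

```python
import statistics


def estimate_setup_time_ms(rtts_ms):
    """Estimate SVC set-up time from a ping series.

    The first round-trip time includes the call set-up; the median of the
    remaining ones is taken as the steady-state round-trip time.
    """
    if len(rtts_ms) < 2:
        raise ValueError("need the first ping plus at least one later ping")
    baseline = statistics.median(rtts_ms[1:])
    return rtts_ms[0] - baseline


if __name__ == "__main__":
    sample = [412.0, 61.0, 60.0, 62.0, 60.0]  # hypothetical values in ms
    print(estimate_setup_time_ms(sample))
```

The median is used rather than the mean so that a single delayed ping later in the series does not distort the baseline.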
In summary, we concluded that SVCs in the way they were intended to be used, between end applications, will still not be feasible for the next two to three years. Even on a very limited scale, where the number of SVCs on a given VP never exceeds one and all the parameters are hard coded, the current implementations of drivers, applications, and router and switch software seem to be far too unstable to provide any operational service on top.
The set-up in use was one ATM ARP server in Austria, which was used by the other participants in the SVC experiment to resolve IP addresses to the NSAP addresses we used. As expected, we did not discover any major problems, and the ARP server was used in a normal way to support the other experiments where needed.
More experiments with several ARP servers and logical IP subnetworks will be carried out to test the scalability. However, no major problems are expected.
The experiment we carried out first involved two hosts in Norway and one in Spain. We used Cisco routers and the NHRP implementation of IOS version 11.0. This set-up worked as expected, and there were no major problems, apart from the usual SVC problems (see section 2.2), and even those were partly tackled through the shaping done on the router. We experienced some routing instability problems on our router with cache invalidations, probably due to outdated software levels.
We are currently setting up a test environment with more than one IP hop, to test the scalability, and more of the parameters such as the number of packets received before the shortcut is activated. For a big IP network, the question remains whether this solution will scale with the standard CBR or VBR services: Due to the large number of potential connections over the network and the fact that bandwidth is limited, this would lead to either very small VCs or very few of them. Before services such as ABR are available, the n-to-n problem might still prevent NHRP from being deployed widely in a backbone environment.
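The n-to-n problem can be made concrete with a small calculation. The sketch below assumes a full mesh of shortcut VCs between backbone routers and an equal static split of each access link among the VCs that terminate on it. The function names and the router counts are hypothetical.

```python
def full_mesh_vcs(n_routers):
    """Number of bidirectional VCs needed to connect every pair of routers."""
    return n_routers * (n_routers - 1) // 2


def per_vc_share_bps(link_bps, n_routers):
    """Equal static share of one access link among VCs to all other routers."""
    return link_bps / (n_routers - 1)


if __name__ == "__main__":
    for n in (5, 18, 50):
        share_mbit = per_vc_share_bps(34_000_000, n) / 1e6
        print(n, full_mesh_vcs(n), round(share_mbit, 2))
```

With 18 routers on 34 Mbit/s access links, each statically allocated VC would get only 2 Mbit/s, no more than the links the backbone was meant to replace.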
A survey among the national research networks (NRNs) in Europe showed that most NRNs are using or are planning to use DCC format NSAP addressing for their ATM networks (7, 8, 9, 10). The public ATM network operators have not come to an agreement on which addressing scheme to use commonly, and it does not look likely that there will be an agreement. This means that an address translation function will be needed to make ATM addressing work globally. However, this requirement has not yet been acknowledged by the ATM operators, as they are still fighting for one common addressing scheme.
Supported address translation mechanisms between E.164 and NSAP addressing were not available at the time of writing this paper; only beta versions were becoming available. Therefore, no practical experience in address translation could be gained.
The traditional network management protocol used in the Internet is SNMP, which we concentrated on in the first phase of the tests. The switches in use were polled from one central NM platform in Belgium both through the ATM connections (in band) and through the production Internet (out of band). We used standard MIBs where available, and also manufacturer-specific MIBs. In none of the cases did we experience any problems. The results of the polling were made available on a Web page so that all project participants could check the status of the network.
In the next phase of the project we envision trialing different NM protocols such as the X-user interfaces. The JAMES network plans to support this feature for its users. At the time of writing this paper, no experience on this protocol could be reported.
To evaluate the cell delay variation of a single stream of data over an ATM VC, we established an international VC with several measurement points on the path, where the delay variation could be measured with the help of ATM analyzers. The longest path in test involved five such checkpoints. Over this path we sent an (almost) 0 CDVT cell stream and measured the arrival times for each cell on each checkpoint. The distribution of cell interarrival times was then analyzed.
The results were far from what could be expected from a CBR service. While at the beginning of the VC the cell interarrival time varied only by 3 to 10 µs, the variation increased with each switch that was passed, and on the receiving side interarrival time variations of up to 130 µs could be observed. At the time of writing this paper no explanation for this behaviour could be found. Given that the service used was a CBR service, this finding is a cause for concern.
This figure shows the interarrival times measured at the last checkpoint in the path, where a significant variance can be seen. When comparing the measurements on different checkpoints, one can see that the further away from the source, the wider the spread of the interarrival times, and the longer the average interarrival time.
These results suggest that in large ATM networks, reshaping might become necessary after a number of switches, as each switch adds to the variation in interarrival times. We have some experience in using VPs with CDVT of around 300 µs. This still seems to be a reasonable value for current ATM networks; however, it might need to be reviewed with significantly growing ATM networks.
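The analysis of cell arrival times at a checkpoint can be sketched as follows. This is only an illustration of the kind of computation involved, not the software of the ATM analyzers used in the tests. The function names are hypothetical, the timestamps are invented, and the peak-to-peak spread of interarrival times is just one possible measure of the variation.

```python
def interarrival_times(timestamps_us):
    """Differences between consecutive cell arrival times, in microseconds."""
    return [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]


def interarrival_spread_us(timestamps_us):
    """Peak-to-peak variation of the interarrival times at one checkpoint."""
    gaps = interarrival_times(timestamps_us)
    return max(gaps) - min(gaps)


def mean_interarrival_us(timestamps_us):
    """Average interarrival time at one checkpoint."""
    gaps = interarrival_times(timestamps_us)
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    # Invented arrival times for the same five cells at two checkpoints.
    near_source = [0, 212, 424, 636, 848]
    far_end = [0, 250, 430, 700, 850]
    for name, stamps in (("near source", near_source), ("far end", far_end)):
        print(name, interarrival_spread_us(stamps), mean_interarrival_us(stamps))
```

Comparing the spread from checkpoint to checkpoint shows where along the path the variation is introduced.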
The main question of the experiment is whether there is any benefit at all for IP traffic on a VBR service. Experiments on VBR services were carried out in the Netherlands (11) and in Switzerland, in both cases locally. The main results from the Dutch tests were:
For more information on these results see (11). A confirmation of these results over an international VP was being carried out at the time of writing this paper. The results shown here demonstrate that VBR services can be used, but they do not offer any advantages over CBR services for IP traffic, except that on the ATM network itself it is cheaper to provide VBR services. Configurations should always be with PCR=SCR, and the BT should be as big as possible.
In both cases the ATM VCs were established as expected and the performance of the set-up was stable. These tests are currently being re-evaluated in the wide area. The major problem is that the software used for the tests is not supported, which currently makes the introduction of RSVP into the production network impossible.
RSVP multicast has not been considered in this phase of the project.
So far the work carried out is only theoretical. A threat analysis and more related information can be found on the TF-TEN home page (1). Practical experiments are expected to start in May 1997, and results will be available on the above-mentioned site.
All the experiments listed here are part of the first phase of the TEN-34 testing program, which was not finished at the time of writing this paper. More work is clearly needed to fully understand the capabilities of ATM networks and of comparable IP services. In some of the areas described above, new questions arose during the tests.
There are also a number of technologies that were not examined in phase one. Phase two of the project, starting in May 1997, will also investigate other technologies, such as ATM routing and new traffic classes such as ABR. The focus of the tests carried out here is to make experimental services available on the production TEN-34 network. Although the more interesting features of ATM seem to be not in a state yet for operational service, we will keep on following the developments in ATM- and IP-related activities. The latest information on our experiments can always be found on the TF-TEN home page (1).
Experiment | Leader | E-mail |
---|---|---|
TCP/UDP performance | Mauro Campanella | CAMPANELLA@mi.infn.it |
 | Tiziana Ferrari | Tiziana.Ferrari@cnaf.infn.it |
SVC testing | Christoph Graf | Christoph.Graf@dante.org.uk |
ARP testing | Ramin Najmabadi Kia | Ramin.Najmabadi@helios.iihe.ac.be |
 | Simon Leinen | simon@switch.ch |
NHRP testing | Olav Kvittem | Olav.Kvittem@uninett.no |
Addressing issues | Kevin Meynell | K.Meynell@terena.nl |
Network management | Zlatica Cekro | cekro@helios.iihe.ac.be |
CDV Tests | Victor Reijs | Victor.Reijs@surfnet.nl |
IP over VBR | Olivier Martin | omartin@dxcoms.cern.ch |
RSVP | Olav Kvittem | Olav.Kvittem@uninett.no |
ATM security | Paulo Neves | pneves@rccn.net |
Michael H. Behringer
DANTE
Francis House, 112 Hills Road
Cambridge CB2 1PQ, United Kingdom
Phone: +44.1223.302992, Fax: +44.1223.303005
Michael.Behringer@dante.org.uk