A System for Flexible Network Performance Measurement

Andrew K. ADAMS <akadams@psc.edu>
Matthew MATHIS <mathis@psc.edu>
Pittsburgh Supercomputing Center
USA

Abstract

National Internet Measurement Infrastructure (NIMI) is a software system for building network measurement infrastructures. A NIMI infrastructure consists of a set of dedicated measurement servers (termed NIMI probes) running on a number of hosts in a network, and measurement configuration and control software, which runs on separate hosts. A key NIMI design goal is scalability to potentially thousands of NIMI probes within a single infrastructure; as the number of probes increases, the number of available measurable paths increases via the N-squared effect, potentially allowing for a global view of the network.

A fundamental aspect of the NIMI architecture is that each NIMI probe reports to a configuration point of contact (CPOC) designated by the owner of the probe system. There is no requirement that different probes report to the same CPOC, and, indeed, there generally will be one CPOC per administrative domain participating in the infrastructure. But the NIMI architecture also allows for easy delegation of part of a probe's measurement services, offering, when necessary, tight control over exactly what services are delegated.

The architecture was designed with security as a central concern: All access is via public key credentials. Each NIMI probe is configured by its CPOC (or a delegate of the CPOC) to allow particular sets of operations to different credentials. The owner of the probe can thus determine who has what type of access by controlling to whom particular credentials are given.

The sole function of a NIMI probe is to queue requests for measurement at some point in the future, execute the measurement when its scheduled time arrives, store the results for retrieval by remote measurement clients, and delete the results when told to do so. An important point for gaining measurement flexibility is that NIMI does not presume a particular set of measurement tools. Instead, the NIMI probes have the notion of a measurement "module," which can reflect a number of different measurement tools. Currently, these measurements include traceroute, TReno, mtrace, and zing (a generalized "ping" measurement), but it is simple to include other active measurement tools on selected probes.

In addition to giving an overview of the architecture, we will discuss experiences with running NIMI to conduct a number of Internet measurement studies.

1. Introduction

Interest in and development of network performance measurement has increased over the past few years, driven by efforts such as the Internet Engineering Task Force's (IETF's) IP Performance Metrics (IPPM) working group, which has worked to define quantitative network performance metrics [Pa96a], and by projects such as Surveyor [A97], which monitors one-way path characteristics of a finite section of the Internet. However, a tool that both empowers the end user to initiate widespread, yet controlled, measurements and has the ability to evolve and refine those measurements is not yet available.

National Internet Measurement Infrastructure (NIMI) is a software system for building network measurement infrastructures. A NIMI infrastructure consists of a set of dedicated measurement servers (termed NIMI probes) running on a number of hosts in a network, and measurement configuration and control software, which runs on separate hosts. A key NIMI design goal is scalability to potentially thousands of NIMI probes within a single infrastructure; as the number of probes increases, the number of available measurable paths increases via the N-squared effect, potentially allowing for a global view of the network.

Programmability was provided for in the design of NIMI: a measurement (which can consist of any number of individual measurements) is injected into the infrastructure from a "Measurement Client (MC)," which can run on a desktop machine. As results are collected, the measurement can be altered via the MC to focus on a different metric or to expand or refine the portion of the Internet being measured.

A fundamental aspect of the NIMI architecture is that each NIMI probe reports to a configuration point of contact (CPOC) designated by the owner of the probe system. There is no requirement that different probes report to the same CPOC, and, indeed, there will generally be one CPOC per administrative domain participating in the infrastructure. Thus, administrative scaling problems remain confined within each domain. But the NIMI architecture also allows for easy delegation of part of a probe's measurement services, offering, when necessary, tight control over exactly what services are delegated.

The NIMI infrastructure was designed from day one to be secure. Authentication and encryption of all communication between NIMI components are handled via public key credentials. Furthermore, each NIMI probe is configured by its CPOC (or a delegate of the CPOC) to authorize particular sets of operations per credential. This gives the owner of the NIMI probe complete control over which actions the holder of a given credential may request.

To keep pace with the growing suite of performance measurement tools, NIMI was specifically designed to have no built-in knowledge of particular measurement tools. Measurement tools are packaged into "modules," each of which can contain several distinct measurement tools, and these modules drop into NIMI. This allows for greater flexibility in performing measurements. Hence, NIMI is not a measurement tool, but a command and control system for managing measurement tools.

Although we will describe the architecture in detail, this paper's main focus is to discuss our experiences in attempting to deliver a system of this magnitude and in conducting a number of Internet measurement studies using NIMI.

2. Architecture

The NIMI architecture is patterned after Paxson's Network Probe Daemon (NPD), which was used to perform the first large-scale measurements of end-to-end Internet behavior [Pa97, Pa96b]. The NPD experiment included 37 participating sites and measured more than 1,000 distinct Internet paths. The probe daemons listened for measurement requests using a custom NPD protocol. The requests would first be authenticated, an important step to prevent misuse of the NPD, using a challenge/response based on a shared private key and an MD5 cryptographic checksum. Once authenticated, the request would be served and a reply returned.
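
To make the flavor of this exchange concrete, the following sketch (in Python, purely for illustration) shows a challenge/response of this general style; the shared key value and nonce handling are our own assumptions and do not reproduce the actual NPD protocol.

    # Rough sketch of an NPD-style challenge/response; the shared key value
    # and nonce handling here are illustrative assumptions, not the actual
    # NPD protocol.
    import hashlib
    import os

    SHARED_KEY = b"example-shared-secret"   # hypothetical shared private key

    def make_challenge():
        """Server side: issue a random nonce to the requester."""
        return os.urandom(16)

    def make_response(challenge, key=SHARED_KEY):
        """Client side: prove knowledge of the key without transmitting it."""
        return hashlib.md5(key + challenge).hexdigest()

    def verify(challenge, response, key=SHARED_KEY):
        """Server side: recompute the MD5 checksum and compare."""
        return response == hashlib.md5(key + challenge).hexdigest()

    nonce = make_challenge()
    assert verify(nonce, make_response(nonce))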

Using the experience gained from the NPD project, and with the added knowledge that NIMI must scale if it is to approach a global view of the network, NIMI was architected to be lightweight. This makes NIMI more portable, allowing it to run on the wide range of operating systems that compose the Internet. Likewise, NIMI must be flexible, because it must be able to incorporate new performance measurement tools as they become available. Moreover, NIMI must be fully programmable, both in adapting to the needs of the initiator of a measurement and in controlling access -- NIMI must be able to support a diverse set of policies. And finally, NIMI must be secure, to address concerns about gathering sensitive data.

The first two requirements were met by segmenting NIMI into a suite of tools and components with distinct functionality. The most numerous component of NIMI is the NIMI probe. Its sole job is to process and queue measurement requests as they arrive, execute the requests at the appropriate time, bundle and ship the results to a Data Analysis Client (DAC), and delete the results when instructed to. The NIMI probe is itself subdivided into two distinct daemons, the nimid and the scheduled. The nimid is responsible for authentication, encryption, and authorization of requests, and for shipping off results. The scheduled simply queues, executes, and bundles (for shipping) measurement requests.
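
The division of labor can be pictured with a simplified sketch of the scheduled's request queue; the request fields and data structures shown are illustrative assumptions rather than the actual NIMI implementation.

    # Simplified sketch of the scheduled's request queue; the request fields
    # (tool, args) and the use of a heap are illustrative assumptions.
    import heapq
    import subprocess
    import time

    class Scheduled:
        def __init__(self):
            self.queue = []        # heap of (start_time, tool, args)
            self.dock = []         # results awaiting shipment by the nimid

        def add_request(self, start_time, tool, args):
            heapq.heappush(self.queue, (start_time, tool, list(args)))

        def run_pending(self):
            """Execute every queued request whose scheduled time has arrived."""
            now = time.time()
            while self.queue and self.queue[0][0] <= now:
                _, tool, args = heapq.heappop(self.queue)
                # The real scheduled exec()s the wrapped tool; here we just
                # capture its output and hold it for shipping.
                proc = subprocess.run([tool] + args, capture_output=True)
                self.dock.append((tool, proc.stdout))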

The next major component is the CPOC. The CPOC acts as the administrative point of contact for a set of NIMI probes within the CPOC's zone of control (administrative domain). The CPOC's job is simply to provide the initial policies for each unique NIMI probe and, over time, to provide updates to these policies. The CPOC will also act as a public key dispenser and, eventually, as a software repository for NIMI components and NIMI measurement modules.

The MC injects a measurement request into the NIMI infrastructure. The MC assembles information for one or more measurements from the user and produces a request that is sent to the NIMI probe(s) that the user requested to perform the measurement(s). The MC is the only NIMI component that the end-user actually operates.

The final component, the DAC, acts as a repository and post-processor for the data returned by the NIMI probe(s) upon completion of a measurement. The DAC can be run from the MC to collect immediate results, or as a daemon to collect ongoing measurements. The following figure depicts the flow of messages between NIMI components:

2.1 Measurement modules

As mentioned earlier, flexibility was achieved by treating the tools used to perform the measurements as stand-alone third-party software modules that "plug in" to NIMI. Hence, NIMI has no knowledge of the measurement tools -- when the NIMI probe receives a measurement request, it simply exec()'s the measurement tool named in the request.

In an effort to standardize results and "un-taint" requests (or commands) exec()'d by the NIMI probe, every measurement tool, when added, is "wrapped" with a script. The wrapper has three benefits. First, it bundles the results generated by the measurement tool into a standardized form for easier processing by the DAC. Second, it allows a common application programming interface (API) to be used between the scheduler within the NIMI probe and the measurement tools (e.g., the destination host is always specified with the -d operand). And third, as we will discuss more thoroughly later, the wrapper does not need to be setuid root, which allows for more flexibility, since a system administrator with superuser privileges is not required to change a measurement tool's wrapper script.
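
As a concrete illustration, a hypothetical wrapper for traceroute might look like the following sketch; the -d convention comes from the common API described above, while the JSON bundling format and the metadata fields are our own assumptions.

    #!/usr/bin/env python
    # Hypothetical wrapper for traceroute, for illustration only.  The
    # result-bundle layout is an assumption; the actual NIMI wrappers may
    # differ.
    import argparse
    import json
    import subprocess
    import sys
    import time

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("-d", dest="dest", required=True,
                            help="destination host, common across wrappers")
        args, extra = parser.parse_known_args()

        start = time.time()
        proc = subprocess.run(["traceroute", args.dest] + extra,
                              capture_output=True, text=True)
        # Bundle the raw output with metadata into a standardized record
        # that the DAC can post-process uniformly across tools.
        record = {
            "tool": "traceroute",
            "dest": args.dest,
            "start": start,
            "status": proc.returncode,
            "output": proc.stdout,
        }
        json.dump(record, sys.stdout)

    if __name__ == "__main__":
        main()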

The following measurement modules are currently supported: traceroute [Ja89], TReno [MM96], mtrace, and zing (a generalized "ping" measurement).

Adding a new measurement tool as a module simply requires generating a wrapper for the tool and propagating both the tool and wrapper to all NIMI probes. Currently this is done via SSH. However, eventually the goal would be to place the measurement module on the CPOC for automatic delivery to the NIMI probes.

2.2 Policy control

NIMI is capable of supporting diverse policies through the use of an Access Control List (ACL) that houses a unique set of rules for a given NIMI component. Each NIMI component has its own ACL, which is a table whose columns represent "requests" or "tasks" and whose rows list "credentials" (a list of all supported requests can be found in section 2.3). The intersection of a request and a credential is a Boolean value or, eventually, a script that can be applied to the arguments of a request to generate a Boolean value. A NIMI probe receives its initial ACL table on startup from its CPOC. The CPOC, however, can delegate some ACL management to other NIMI components. Hence, it is conceivable that an MC could be allowed to insert an ACL entry for another MC into the nimid's ACL table. Eventually, the notions of inheritance and "sand-boxing" will be added, to allow for ease and security of delegation.
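
Conceptually, an ACL lookup amounts to indexing a table by (credential, request). The sketch below illustrates this; the credential names, the ACL_ADD request, and the script-valued cell are hypothetical, the last standing in for the script-based authorization described above.

    # Minimal sketch of an ACL lookup, for illustration only.  Credential
    # names, the ACL_ADD request, and the callable cell are assumptions.
    ACL = {
        # (credential, request)            allowed?
        ("mc.psc.edu",   "TEST_ADD"):      True,
        ("mc.psc.edu",   "ACL_ADD"):       False,
        ("cpoc.psc.edu", "ACL_ADD"):       True,
        # A script-valued cell: allow TEST_ADD only toward *.psc.edu targets.
        ("mc.lbl.gov",   "TEST_ADD"):
            lambda args: args.get("target", "").endswith(".psc.edu"),
    }

    def authorized(credential, request, args=None):
        entry = ACL.get((credential, request), False)
        if callable(entry):
            return bool(entry(args or {}))
        return bool(entry)

    print(authorized("mc.psc.edu", "TEST_ADD"))                               # True
    print(authorized("mc.lbl.gov", "TEST_ADD", {"target": "nimi1.psc.edu"}))  # True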

The figure below shows a sample ACL table, depicting the following scenario:

2.3 Local management and control

Besides the policy control described in the previous section, NIMI was designed with fail-safe management built into the file hierarchy for the local system administrator. First, the ACL table for a NIMI component exists as a flat text file within the "acl/" subdirectory. If the system administrator of a local nimid chooses to remove a particular policy (ACL entry), he or she can simply delete the corresponding row from the text file. Furthermore, upon receipt of a measurement request, the scheduled stores the request in a text file located in the "pending/" subdirectory. When the time to exec() the measurement arrives, the request is moved to the "active/" subdirectory. Upon completion of the measurement, the request and results (as a tar file) are moved to the "dock/" subdirectory to await shipping by the nimid. The nimid, upon delivering the tar file to the appropriate DAC, moves the request and tar file to the "completed/" directory. Hence, the status of any measurement can be determined immediately by NIMI or by a human system administrator simply by examining the file system.
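
Because a request's files move through these subdirectories in a fixed order, determining a measurement's status reduces to locating its files; the sketch below illustrates the idea, with an assumed handle-to-filename mapping.

    # Determine a measurement's status by locating its request file in the
    # probe's spool directories.  The handle-to-filename mapping and the
    # probe's root directory are assumptions for illustration.
    import os

    SPOOL_DIRS = ["pending", "active", "dock", "completed"]

    def measurement_status(nimi_root, handle):
        for state in SPOOL_DIRS:
            if os.path.exists(os.path.join(nimi_root, state, handle)):
                return state
        return "unknown"

    # e.g. measurement_status("/usr/local/nimi", "Handle-id") -> "dock"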

2.4 Communication and security

Since the NIMI components rely heavily on communication, a messaging protocol was developed that is capable of handling the simple, yet evolving, dialect between components. NIMI messages are encrypted via RSA private/public key pairs and passed between NIMI components via TCP/IP. A message consists of a header, which lists the information necessary to decrypt the message body (assuming you have the appropriate key), and an encrypted message body. The message body, in its unencrypted form, consists of one or more message blocks. Each block, or "block type," maps to a "task" or "request" that a NIMI component can react to, as well as any data necessary to complete the request. A sample TEST_ADD block looks like this:

TEST_ADD Handle-id treno-exec nimi1.psc.edu 200001050000 dac.psc.edu "-q nimi.washington.edu"

Depending on the context, "block type" and "request" are used interchangeably. When a message is received, the NIMI component decrypts and authenticates the message and strips off the first block from the message body. It then uses the block type of that block (along with any data within the block) to perform an ACL table lookup that authorizes the request (block type). If the authorization succeeds, the block is passed to a function for processing.
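
The sketch below illustrates this dispatch loop for the TEST_ADD example above; the ACL stub, the handler registry, and the interpretation of the fields following the handle are assumptions for illustration.

    # Sketch of block dispatch within a NIMI component; the handler registry,
    # ACL stub, and field meanings beyond the TEST_ADD example above are
    # illustrative assumptions.
    import shlex

    def acl_allows(credential, block_type):
        return True        # stand-in for the ACL table lookup of section 2.2

    def handle_test_add(fields):
        # Assumed field order: handle, tool, probe, start time, DAC, tool args.
        handle, tool, probe, when, dac = fields[:5]
        print("queue", tool, "on", probe, "at", when, "-> results to", dac)

    HANDLERS = {"TEST_ADD": handle_test_add}

    def dispatch(credential, body):
        """Process each block of a decrypted, authenticated message body."""
        for line in body.strip().splitlines():
            block_type, *fields = shlex.split(line)
            handler = HANDLERS.get(block_type)
            if handler and acl_allows(credential, block_type):
                handler(fields)

    dispatch("mc.psc.edu",
             'TEST_ADD Handle-id treno-exec nimi1.psc.edu 200001050000 '
             'dac.psc.edu "-q nimi.washington.edu"')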

Currently, the following block types are supported by NIMI:

The following are in the development pipeline:

3. Deployment and experiences

The first version of NIMI was deployed on four PCs: one at LBNL, one at FNAL, and two at the PSC (located at the commodity and vBNS Internet feeds). We have since expanded the measurement base by deploying NIMI, in a controlled fashion, on about 35 hosts, including

The actual installation of NIMI probes on the sites was tractable and easily controlled by following these seven steps:

  1. Ftp NIMI source tar file and NIMI measurement tools tar file.
  2. Untar the source files; build and install.
  3. Untar the binary NIMI measurement tools tar file.
  4. Generate a private and public key pair with the included tool (a sketch of this step follows the list).
  5. Edit the nimid and scheduled configuration files to specify the host name and key.
  6. Inform the CPOC administrator of the new NIMI's existence.
  7. Start NIMI.
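
Conceptually, step 4 amounts to generating an RSA key pair and writing it to disk. The sketch below (using the third-party "cryptography" package) illustrates what the bundled key tool does; the file names and key size are assumptions, and the actual tool may differ.

    # Illustrative stand-in for NIMI's key-generation step (step 4), using
    # the third-party "cryptography" package.  File names and key size are
    # assumptions; the actual bundled tool may differ.
    from cryptography.hazmat.primitives import serialization
    from cryptography.hazmat.primitives.asymmetric import rsa

    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

    with open("nimi_private.pem", "wb") as f:
        f.write(key.private_bytes(
            encoding=serialization.Encoding.PEM,
            format=serialization.PrivateFormat.PKCS8,
            encryption_algorithm=serialization.NoEncryption()))

    with open("nimi_public.pem", "wb") as f:
        f.write(key.public_key().public_bytes(
            encoding=serialization.Encoding.PEM,
            format=serialization.PublicFormat.SubjectPublicKeyInfo))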

As one can imagine, NIMI installs easily, with very little room for error. However, when dealing with software that is extremely flexible, problems surface as scaling occurs. The following is a synopsis of the problems encountered during the initial scaling phase of NIMI.

First, recalling that NIMI was engineered to be ubiquitous, it was designed to run as any user (not requiring superuser privileges) in order to encourage the widest possible participation. However, since most useful network measurement tools require the use of raw sockets, some of the tools themselves need to be installed setuid root. This presented unforeseen problems: If a tool must be setuid root, then the system administrator of that platform must change the tool's permissions before it is usable via NIMI. This resulted in NIMI running on several hosts but unable to perform any measurement requests until the system administrators of those hosts adjusted the appropriate measurement tools.
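
A probe could at least detect this condition before accepting requests. The sketch below reports which tool binaries lack the setuid-root permission they need; the paths and tool names are illustrative assumptions.

    # Minimal sketch: report which measurement tool binaries are missing the
    # setuid-root permission they need for raw sockets.  Paths and tool
    # names are illustrative assumptions.
    import os
    import stat

    TOOLS = ["/usr/local/nimi/bin/traceroute", "/usr/local/nimi/bin/zing"]

    for path in TOOLS:
        try:
            st = os.stat(path)
        except FileNotFoundError:
            print(path, "not installed")
            continue
        if st.st_uid == 0 and st.st_mode & stat.S_ISUID:
            print(path, "is setuid root")
        else:
            print(path, "needs chown root + chmod u+s before NIMI can use it")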

Likewise, if a new version of a tool that must be setuid root (e.g., traceroute) were propagated out to 1,000 sites via NIMI, this could require 1,000 different system administrators to change the permissions on the tool before NIMI could use it. There are a couple of methods to approach this problem (e.g., NIMI could be chroot()'d into its own private directory); however, this may make installation a little more complex. Another possibility would be to give the group to which the NIMI user belongs write access to the Berkeley Packet Filter and stipulate that any and all tools NIMI uses must go through the Berkeley Packet Filter if they require raw sockets. Again, this would impose a slight increase in the amount of work at initial installation and could possibly limit the tools allowed (e.g., an HTTP server).

It should be noted that the problems encountered as NIMI grew in popularity were side effects of the desire to maximize flexibility in supporting a wide range of testing tools. Clearly, though, this issue needs to be dealt with soon.

The next problem that hindered the expansion of NIMI was briefly touched on in the previous examples, and relates to the "need" for a system administrator to install NIMI. Since many of the tools require the Berkeley Packet Filter, the operating system itself must be configured with the Berkeley Packet Filter installed. Or, in a more general sense, the operating system must be able to support whatever measurement tool is required for the measurement request. Unfortunately, in some cases this required a kernel rebuild.

4. Results

Although NIMI is currently in ALPHA development, it is being used by two different projects:

Moreover, NIMI is being used to monitor the PSC commodity and vBNS networks.

The following two graphs are samples taken from the data retrieved from the Web100 project. The first shows the performance of a stock ftp client from the PSC commodity link out to Washington University over a 10 Mb/s Ethernet-limited path. The red dots are transfers of 1 MB files; the green dots are transfers of 10 MB files:

In keeping with the theme of the project, another box performed the same transfers but was connected to the vBNS via a 100 Mb/s Ethernet card -- exactly the first step most people would take to improve the connection:

It is interesting to note that the performance did not increase at all, which is what the project hoped to show. At this time, troubleshooting this scenario was not within the project scope. However, when the project is ready to diagnose the problem, NIMI could then be configured to measure different segments of the path (provided that a NIMI probe is located somewhere in between), or to retrieve different metrics by altering the measurement to use another tool.

5. Future plans

A few major improvements would be extremely beneficial to NIMI. The first and foremost is a public key server to distribute the public keys to any component that requests a key. This could be done with the architecture that is currently implemented, simply by generating a KEY_XFER block that the CPOC could process. However, this would require that all components register their public keys with this "well-known" CPOC. At this time, the correct course of action to pursue is not clear.

The second improvement would be resource checking and locking. One can envision multiple measurements coming in from different MCs that request identical start times. To handle this correctly, the NIMI probe must either (1) be certain that the measurement can run multiple invocations at once without the fear of bias or (2) inform one (or both) MC(s) that the requested resource is already allocated.
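
Option (2) could be implemented with a simple per-tool reservation table; the sketch below illustrates the idea, with assumed request fields and an assumed fixed slot length.

    # Sketch of a per-tool reservation check implementing option (2): reject
    # a request whose (tool, start_time) slot is already taken.  The request
    # fields and the fixed slot length are assumptions.
    SLOT_SECONDS = 60          # assumed exclusive run length per measurement
    reservations = {}          # (tool, slot) -> handle of the holder

    def try_reserve(tool, start_time, handle):
        slot = (tool, start_time // SLOT_SECONDS)
        if slot in reservations:
            return False, reservations[slot]   # tell the MC who holds the slot
        reservations[slot] = handle
        return True, handle

    ok1, holder1 = try_reserve("treno-exec", 947030400, "Handle-1")
    ok2, holder2 = try_reserve("treno-exec", 947030400, "Handle-2")
    print(ok1, ok2, holder2)   # True False Handle-1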

And finally, the ACL table improvements (listed in the Policy control subsection) -- inheritance, sand-boxing, and resource policy checking via scripts -- must be implemented to allow for the broadest range of policy control.

6. Acknowledgments

This work has been supported through funding by the Defense Advanced Research Projects Agency award #AOG205 and by the National Science Foundation under Cooperative Agreement No. ANI-9720674.

7. References

[A97] G. Almes, "IP Performance Metrics: Metrics, Tools, and Infrastructure," http://io.advanced.org/surveyor/, 1997.

[Ca99] R. Caceres, "Multicast-based Inference of Network-internal Characteristics," http://www.research.att.com/~duffield/minc/, 1999.

[Ja89] V. Jacobson, traceroute, ftp://ftp.ee.lbl.gov/traceroute.tar.Z, 1989.

[MM96] M. Mathis and J. Mahdavi, "Diagnosing Internet Congestion with a Transport Layer Performance Tool," Proc. INET '96, Montreal, June 1996.

[Pa96a] V. Paxson, "Towards a Framework for Defining Internet Performance Metrics," Proc. INET '96, Montreal, 1996.

[Pa96b] V. Paxson, "End-to-End Routing Behavior in the Internet," Proc. SIGCOMM '96, Stanford, August 1996.

[Pa97] V. Paxson, "End-to-End Internet Packet Dynamics," Proc. SIGCOMM '97, Cannes, France, September 1997.