Mapping Where the Data Flows
By Martin Dodge
When I type the Internet Society's Web address into my browser, the HTML and graphics are seamlessly downloaded and the page is displayed within a couple of seconds. But how does this data actually get from the ISOC server to my PC in a basement office in central London? How does it flow through the Internet to reach me? You can answer this with traceroute, a useful tool that allows you to lift the lid on the Internet and get a packet's-eye view of the network. If you're like me, intrigued as to how the Internet works beyond your browser and the telephone jack in the wall, then traceroute can be a fun tool to use to explore and map the Internet.1
A traceroute utility maps the path that data packets take between two points on the Internet, showing all of the intermediate nodes traversed, along with an indication of the speed of travel. Traceroute was invented in 1988 by Van Jacobson at Lawrence Berkeley National Laboratory in the U.S. Today a traceroute utility often comes as part of the operating system. Windows, for example, has a small utility called tracert, which is used by typing, at the MS-DOS prompt, tracert followed by an Internet host name, e.g., tracert www.yahoo.com.
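For readers who would rather script the trace than type it, here is a minimal Python sketch. It assumes the tool is called tracert on Windows and traceroute on other systems and that it is installed and on the search path; the function names are my own and purely illustrative.

```python
import platform
import subprocess


def traceroute_command(host, system=None):
    """Return the argument list for the platform's traceroute utility."""
    system = system or platform.system()
    # Assumption: Windows ships the tool as "tracert"; other systems use "traceroute".
    tool = "tracert" if system == "Windows" else "traceroute"
    return [tool, host]


def run_traceroute(host, timeout=120):
    """Run the trace and return its raw text output (the tool must be installed)."""
    result = subprocess.run(
        traceroute_command(host), capture_output=True, text=True, timeout=timeout
    )
    return result.stdout
```

Calling print(run_traceroute("www.isoc.org")) should then show the same kind of listing you would get at the command prompt.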
To illustrate how traceroute can map the Internet, I used it to analyze the path from my PC in London to www.isoc.org. The program works away for a few seconds as it dynamically explores the pathway the data takes, moving one node at a time. The end result is the following rather cryptic-looking output:
This might look meaningless to those unfamiliar with traceroute,
but it is in fact a kind of one-dimensional map of how the data
flows, with each node listed on a separate line. The map gives
valuable information on the real-time routing of data packets
through the Internet between London and Reston, Virginia, in the
U.S., the apparent location of the ISOC Web server. Each row of
the chart contains a node number followed by three time measurements
in milliseconds, such as 20ms 30ms <10ms. These are three separate
measurements of the time it took the data packet to travel from
the origin computer to that particular node and back. This is
called the round-trip time (RTT) and gives an indication of the
speed of each link. Finally, each row identifies a node as a domain
name and numeric IP address.
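The row layout just described (node number, three round-trip times, then a name and address) is regular enough to pull apart in code. The sketch below is only illustrative: it assumes a tracert-style row with the IP address in square brackets, and the sample host name and address are made up for the example, not taken from the real trace.

```python
import re

# One row: hop number, three timings (or "*" for a lost probe), then the node.
ROW = re.compile(
    r"^\s*(?P<hop>\d+)\s+"
    r"(?P<times>(?:(?:<?\d+\s*ms|\*)\s+){3})"
    r"(?P<rest>.*)$"
)


def parse_row(line):
    """Split one traceroute row into hop number, RTTs, name and IP address."""
    match = ROW.match(line)
    if match is None:
        return None  # header lines and blank lines are not rows
    rtts = []
    for token in re.findall(r"<?\d+\s*ms|\*", match.group("times")):
        # "*" means no reply; "<10 ms" is recorded as its upper bound, 10.
        rtts.append(None if token == "*" else int(re.search(r"\d+", token).group()))
    rest = match.group("rest").strip()
    named = re.fullmatch(r"(?P<name>\S+)\s+\[(?P<ip>[\d.]+)\]", rest)
    if named:
        name, ip = named.group("name"), named.group("ip")
    else:
        # Nodes without a domain name are listed by numeric address only.
        name, ip = None, (rest if re.fullmatch(r"[\d.]+", rest) else None)
    return {"hop": int(match.group("hop")), "rtts_ms": rtts, "name": name, "ip": ip}


# Hypothetical row, for illustration only.
sample = "  1    20 ms    30 ms   <10 ms  router.example.net [192.0.2.1]"
row = parse_row(sample)
```

Other traceroute versions lay out their rows differently, so the pattern would need adjusting for them.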
The traceroute output is comparable to a railway timetable that charts (1) how trains travel through the network, (2) which stations they visit, and (3) when they arrive. The traceroute timetable above shows that data travelling from London to ISOC had to pass through 16 intermediate stations, or network routers, to reach the end of the line. To decode the traceroute chart further, read it line by line to see each section of the route through the network.
In our trace we begin at node one, which is the departmental router for my office, known by the somewhat cryptic domain name cisco-2.bart.ucl.ac.uk. The data packets move rapidly onward to the next node, which has only a numeric address and is likely to be an anonymous but important link somewhere in a server room in the university. At node number three the data leaves my university's internal network and joins the London metropolitan-area network (http://www.lonman.net.uk/Images/map99.gif), which provides a fast backbone for universities and colleges in London. From there it moves on to the JANET backbone, the U.K.'s academic and research network (http://www.ja.net/), at a gateway machine called south-east-gw.ja.net. Node five is the gateway router to the transatlantic link for the JANET network, which connects to ny-pop.janet, its point of presence (POP) in New York. At this point the data packets have crossed the Atlantic, and consequently you can see a marked jump in the round-trip travel times, from under 10 milliseconds to 70 milliseconds, caused largely by the 3,000-mile distance between London and New York. The traffic then flows into the Teleglobe backbone network (http://www.teleglobe.com/) in New York at nodes seven and eight before passing to AlterNet (part of UUNET's backbone empire, http://www.alter.net/lang.en/network/), also in New York.
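A rough calculation shows why distance accounts for so much of that jump. The sketch below assumes that signals in optical fiber travel at about two-thirds of the speed of light in a vacuum, roughly 200,000 km per second, and that the cable follows a straight 3,000-mile path. Both are simplifications of mine, so treat the result as a lower bound rather than a measurement.

```python
KM_PER_MILE = 1.609344
# Assumption: light in fiber covers roughly 200,000 km per second.
FIBER_SPEED_KM_PER_S = 200_000


def min_rtt_ms(distance_miles):
    """Smallest possible round-trip time, in milliseconds, over the given distance."""
    one_way_seconds = distance_miles * KM_PER_MILE / FIBER_SPEED_KM_PER_S
    return 2 * one_way_seconds * 1000


atlantic = round(min_rtt_ms(3000))  # 48
```

On these assumptions about 48 of the 70 milliseconds are spent simply crossing the ocean and coming back; the remainder would be queuing and processing in the routers along the way.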
Notice the strange, long domain names of these routers at the core of the Internet. These names often hint at the city where the node is located. They could contain the full name of the city, such as newyork, or just an abbreviation such as nyc, which can require some educated guesswork to decode. Fortunately for traceroute explorers, many of the large backbone operators use similar naming conventions for their network infrastructures.
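This kind of guesswork can be partly automated. The following sketch looks for city hints inside a router name; the table holds only the codes discussed in this article, and the host names in the usage note are hypothetical.

```python
import re

# City hints mentioned in this article; a real table would be far longer.
CITY_HINTS = {
    "newyork": "New York",
    "nyc": "New York",
    "sfo": "San Francisco",
    "scl": "Santa Clara",
    "tco": "Tysons Corner",
}


def guess_city(hostname):
    """Return a likely city for a router name, or None if no hint is found."""
    # Split on anything that is not a letter, so "nyc2" and "sfo-a" still match.
    for token in re.split(r"[^a-z]+", hostname.lower()):
        if token in CITY_HINTS:
            return CITY_HINTS[token]
    return None
```

For example, guess_city("core1.nyc2.example.net") returns "New York", while a name with no recognizable code returns None. A short code can also turn up by coincidence, so the answer is a hint rather than proof.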
From AlterNet's network in New York, packets flow through two more nodes before dropping down to Washington, D.C., at node 12. Then it is on to two nodes in TCO, according to their domain names, which most likely stands for Tysons Corner, Virginia. We are nearing our target. At node 15 we have reached ISOC's network, and node 16 is the end of the line: the Internet Society's Web server. The ISOC home page took 16 hops, across four different networks, to get to my browser. That is quite a feat of routing and cooperation, but all in a day's work for the Internet. This happens, unseen, for the millions of Web surfers who need not worry about where the data flows.
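How does traceroute find each of those hops? Standard implementations send probe packets with a deliberately small time-to-live (TTL) value: every router that forwards a probe lowers the TTL by one, and the router at which it reaches zero reports back. Raising the TTL one step at a time therefore reveals the path one node at a time. The sketch below only simulates that idea on a made-up list of node names; it sends no packets and is not how any particular utility is written.

```python
def send_probe(path, ttl):
    """Return the index of the node that answers a probe sent with this TTL."""
    for index in range(len(path)):
        ttl -= 1  # each node the probe reaches uses up one unit of TTL
        if ttl == 0 or index == len(path) - 1:
            # The probe expired here, or it has reached the target itself.
            return index
    return None  # only happens for an empty path


def trace(path, max_hops=30):
    """Discover a path hop by hop by sending probes with TTL 1, 2, 3, ..."""
    hops = []
    if not path:
        return hops
    for ttl in range(1, max_hops + 1):
        index = send_probe(path, ttl)
        hops.append((ttl, path[index]))
        if index == len(path) - 1:
            break  # the target answered, so the trace is complete
    return hops


# Hypothetical four-node path, for illustration only.
example_path = ["dept-router", "campus-gateway", "backbone", "target"]
example_hops = trace(example_path)
```

Here example_hops lists the four nodes in order, each paired with the TTL that uncovered it.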
Triangulating the Internet with Web-Based Traceroutes
Conventional traceroute utilities are limited in one important respect: the origin point of the exploration is fixed to the location of the PC running the trace. To overcome this limitation, you can use Web-based traceroutes that allow you to run a trace from different starting points. These allow you to explore the Internet's topology from multiple locations, a kind of virtual triangulation. I ran a traceroute from Canberra, Australia, to ISOC by using a Web-based traceroute publicly provided by Telstra, a major Australian telecom company and Internet backbone operator: http://www.telstra.com.au/. The output trace timetable is as follows:
Note that the output from this version of traceroute is in a slightly
different format from the previous version.
The traceroute utility is installed on a Telstra server that is quite likely located in Canberra, given its domain name, Canberra.telstra.net, so this is where our data packets begin their journey to ISOC. The next two nodes in the trace are also within Canberra, according to their domain names. At node four the data moves a couple of hundred miles from Canberra to Sydney.
The big trans-Pacific hop occurs at node six. There is certainly a marked jump in the RTT at this point in our journey, caused by the 7,500-mile distance across the Pacific Ocean. There is no domain name for node six to give us a clue to its location, but it is likely to be in California.
At node eight, the data joins the AboveNet network: http://www.above.net/network/network.html. The SFO domain name means it is probably in San Francisco. After another node in SFO, the data moves to a node in SCL, which is probably Santa Clara, California. The next major element in the journey is the hop across the continental U.S. from California to the Washington, D.C., region, at node 11. This long distance is matched by another significant increase in the RTT. Nodes 11 to 13 have no domains to tell us which networks they are on, but at node 14 the data has arrived at the ISOC network and Web server.
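Long-haul links such as these can also be spotted mechanically, by looking for the largest rise in RTT between one node and the next. The sketch below is a simple illustration; the timings in the example are invented, not the figures from the Telstra trace.

```python
def largest_jump(rtts_ms):
    """Return (node_number, increase_ms) for the biggest rise in RTT.

    rtts_ms holds one round-trip time per node, starting with node one.
    """
    best_node, best_increase = None, 0
    for i in range(1, len(rtts_ms)):
        increase = rtts_ms[i] - rtts_ms[i - 1]
        if increase > best_increase:
            best_node, best_increase = i + 1, increase  # list index i is node i + 1
    return best_node, best_increase


# Hypothetical timings in milliseconds, for illustration only.
example_rtts = [1, 2, 5, 6, 8, 160, 162, 165]
jump = largest_jump(example_rtts)  # (6, 152)
```

A single measurement can be misleading, since a busy router may answer slowly, so it is worth comparing all three timings on each row before drawing conclusions.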
To run a trace from Canberra yourself, go to http://www.telstra.net/cgi-bin/trace. There are several hundred freely available Web traceroute servers in many different countries and cities across the world. Thomas Kernen maintains a good list of them at http://www.traceroute.org/.
Add a Bit of Geography
An obvious refinement of the basic traceroute is to show the route the data takes visually, on a map. This is known as a geographical traceroute. A number of these applications (see sidebox) attempt, with varying degrees of success, to map the physical location of Internet nodes traversed in a trace. I used two of the best, NeoTrace and VisualRoute, to run the first trace example from London to ISOC to see how they performed when actually mapping the route. Both of them are easy-to-use, affordable utilities.
NeoTrace is a geographical traceroute utility developed by NeoWorx. It provides four different views of the trace in tabbed panels: a geographic map, a nodal graph layout, a conventional listing, and a line graph of RTT performance. A trial of NeoTrace for Windows 95/98/NT can be downloaded from http://www.neotrace.com for free; the full program costs US$ 29.95. The figure shows the map view result of our trace; it has done a good job, successfully locating and mapping 8 of 16 nodes, including the target in Reston.
The second geographic traceroute application is Datametrics Systems Corporation's VisualRoute. The application interface includes both a zoomable world map and a detailed listing of the trace. A trial version of VisualRoute can be downloaded for free from http://www.visualroute.com, and the full program costs US$ 29.95; it is available for Windows 95/98/NT, Linux and Solaris. VisualRoute also provides a Web server version, with examples currently located in England; the Netherlands; Canada; Fremont, California; and Fairfax, Virginia (http://www.visualroute.com/server.html). In our trace to ISOC, VisualRoute found and plotted the location of six nodes with confidence and made educated guesses at five others, including the target. The results are shown in the figure below.
Although geographic traceroutes are useful, their mapping capabilities
currently are limited by the serious difficulty in mapping an
Internet node to an actual latitude and longitude. There is no
automatic way to match up these virtual and real-world addresses.
This is true even in the U.S., where one can at least match backbone
nodes to particular cities somewhat reliably. Consequently, geographic
traceroutes use a variety of heuristics to try to resolve a network
node to a geographic location with various levels of accuracy.2 This is a difficult problem to crack, and traceroutes largely
depend on looking up Internet addresses in static databases of
latitude and longitude. These databases, however, are only partial
and cannot keep pace with the Internet's constant growth and change.
They seem especially weak outside the U.S.
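To make the database approach concrete, here is a minimal sketch of such a lookup. The table is entirely hypothetical: the address blocks are ranges reserved for documentation, and the places and approximate coordinates are attached to them only for the sake of the example.

```python
import ipaddress

# Hypothetical database: address block -> (place, latitude, longitude).
LOCATIONS = {
    ipaddress.ip_network("192.0.2.0/24"): ("London", 51.51, -0.13),
    ipaddress.ip_network("198.51.100.0/24"): ("New York", 40.71, -74.01),
    ipaddress.ip_network("203.0.113.0/24"): ("Reston, Virginia", 38.96, -77.36),
}


def locate(address):
    """Return (place, latitude, longitude) for an IP address, or None if unknown."""
    ip = ipaddress.ip_address(address)
    matches = [network for network in LOCATIONS if ip in network]
    if not matches:
        return None  # the database has no entry covering this address
    # If several blocks match, prefer the most specific one.
    best = max(matches, key=lambda network: network.prefixlen)
    return LOCATIONS[best]
```

The weakness is plain to see: any address outside the table comes back as None, and an entry that is out of date puts the node in the wrong place.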
NeoTrace and VisualRoute offer a partial solution, allowing you to add your own nodes and locations to their databases, along with more detailed maps. A more effective solution is to add geographic location information to domain names, as set out in a DNS-LOC proposal that can be dynamically queried, but this has not been implemented widely.3 An alternative might be NetGeo, developed by the Cooperative Association for Internet Data Analysis (CAIDA), a service for mapping network entities, such as IP addresses, domain names, and ASes, to geographical locations.4
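As an illustration of the DNS-LOC idea, the sketch below converts the latitude and longitude part of a LOC record into decimal degrees. It assumes the text form given in RFC 1876 (degrees, optional minutes and seconds, then a hemisphere letter, first for latitude and then for longitude) and ignores the altitude and precision fields that follow. The sample record is made up.

```python
def parse_loc(text):
    """Return (latitude, longitude) in decimal degrees from a LOC record's text."""
    tokens = text.split()
    position = 0
    coordinates = []
    for hemispheres in (("N", "S"), ("E", "W")):
        parts = []
        # Collect degrees, then optional minutes and seconds, up to the letter.
        while position < len(tokens) and tokens[position].upper() not in hemispheres:
            parts.append(float(tokens[position]))
            position += 1
        if position >= len(tokens) or not 1 <= len(parts) <= 3:
            raise ValueError("not a valid LOC record: " + text)
        hemisphere = tokens[position].upper()
        position += 1
        parts += [0.0] * (3 - len(parts))  # missing minutes or seconds count as zero
        value = parts[0] + parts[1] / 60 + parts[2] / 3600
        if hemisphere in ("S", "W"):
            value = -value  # south and west are negative
        coordinates.append(value)
    return tuple(coordinates)


# Hypothetical record, for illustration only.
latitude, longitude = parse_loc("51 30 0.000 N 0 7 30.000 W 10m")
```

This sample works out to a latitude of 51.5 and a longitude of -0.125, a point in central London.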
Why is traceroute useful? First, traceroute is an important Internet debugging tool for those involved in keeping networks running. It can help identify routing problems quickly and simply. It can also be useful in tracking down the source of spam e-mail,5 as well as in trying to find a Web site's true location before giving it one's credit card details. Many Web sites using country-level domains are not actually hosted in the nation they indicate.
On another level, traceroute can help satisfy those who are curious to know how their computers connect to the Internet and how they can access information from all around the world as if it were just next door. Traceroute reveals the hidden complexity of data's path to a given destination, sometimes across 10 or 20 nodes or more, perhaps owned and operated by competing companies. Looking at what is happening in real time on the Internet always gives me a small sense of wonder that this system works so well, enabling tens of millions of people to communicate daily. So if you have a little time, go and trace the route to your favorite Web site and uncover the hidden complexity of the Internet that lies beyond the browser window. You may be surprised by where the data flows.
1 The following articles provide good background information on using traceroute to explore the Internet:
J. Rickard, "Mapping the Internet with Traceroute," Boardwatch magazine, December 1996.
J. Carl, "Nailing Down Your Backbone: The Imprecise Art of Tracerouting," Boardwatch ISP Directory, Summer 1999.
S. Dumett, "Tracing Your Route Through The Net," PreText magazine, March 1998.
Unix Man page for Traceroute. http://www.zytek.com/traceroute.man.html.
2 For a nice summary of various means of determining an Internet host's geographic location, see "Finding a Host's Geographical Location," by Uri Raz. http://www.private.org.il/IP2geo.html.
3 "Geo-enabling the Domain Name System." http://www.ckdhr.com/dns-loc/.
4 "NetGeo: the Internet Geographic Database." CAIDA. http://www.caida.org/Tools/NetGeo/.
5 B. Mattocks, "Reading Email Headers." http://www.blighty.com/spam/bill.html.