Mapping Where the Data Flows
By Martin Dodge
When I type the Internet Society's Web address into my browser, the HTML and graphics are seamlessly downloaded and the page is displayed within a couple of seconds. But how does this data actually get from the ISOC server to my PC in a basement office in central London? How does it flow through the Internet to reach me? You can answer this with traceroute, a useful tool that allows you to lift the lid on the Internet and get a packet's-eye view of the network. If you're like me, intrigued as to how the Internet works beyond your browser and the telephone jack in the wall, then traceroute can be a fun tool to use to explore and map the Internet.[1]
A traceroute utility maps the path that data packets take between two points on the Internet, showing all of the intermediate nodes traversed, along with an indication of the speed of travel. Traceroute was invented in 1988 by Van Jacobson at Lawrence Berkeley National Laboratory in the U.S. Today a traceroute utility often comes as part of the operating system. Windows, for example, has a small utility called tracert, which is run by typing tracert followed by a host name at the MS-DOS prompt, for example tracert www.yahoo.com.
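For readers who would rather script the command than type it, here is a minimal Python sketch. It assumes the utility is installed as tracert on Windows and as traceroute on other systems; the helper names build_trace_command and run_trace are illustrative and not part of any library.

```python
import platform
import subprocess


def build_trace_command(host, system=None):
    """Return the argument list for the platform's traceroute utility."""
    system = system or platform.system()
    # Windows ships the tool as "tracert"; most Unix-like systems call it "traceroute".
    program = "tracert" if system == "Windows" else "traceroute"
    return [program, host]


def run_trace(host):
    """Run the trace and return its raw text output (this can take a while)."""
    result = subprocess.run(
        build_trace_command(host), capture_output=True, text=True
    )
    return result.stdout
```

Calling run_trace("www.isoc.org") should return the same text the utility prints at the prompt, provided the program is present on the machine.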
To illustrate how traceroute can map the Internet, I used it to analyze the path from my PC in London to www.isoc.org. The program works away for a few seconds as it dynamically explores the pathway the data takes, moving one node at a time. The end result is the following rather cryptic-looking output:
This might look meaningless to those unfamiliar with traceroute, but it is in fact a kind of one-dimensional map of how the data flows, with each node listed on a separate line. The map gives valuable information on the real-time routing of data packets through the Internet between London and Reston, Virginia, in the U.S., the apparent location of the ISOC Web server. Each row of the chart contains a node number followed by three time measurements in milliseconds, such as 20ms 30ms <10ms. These are three separate measurements of the time it took the data packet to travel from the origin computer to that particular node and back. This is called the round-trip time (RTT) and gives an indication of the speed of each link. Finally, each row identifies a node as a domain name and numeric IP address.
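A row of this kind can be split into its parts with a short script. The following is a minimal Python sketch, assuming a tracert-style layout in which the IP address appears in square brackets after the domain name. The function name parse_row and the sample addresses (taken from the 192.0.2.0/24 range reserved for documentation) are illustrative and do not come from the actual trace.

```python
import re

# One row: hop number, three RTT fields (or "*" for a timeout), then the node.
ROW = re.compile(
    r"^\s*(?P<hop>\d+)\s+"
    r"(?P<rtts>(?:(?:<?\d+\s*ms|\*)\s+){3})"
    r"(?P<name>\S+)"
    r"(?:\s+\[(?P<ip>[\d.]+)\])?\s*$"
)


def parse_row(line):
    """Return a dict describing one hop, or None if the line is not a hop row."""
    match = ROW.match(line)
    if match is None:
        return None
    rtts = []
    for token in re.findall(r"<?\d+\s*ms|\*", match.group("rtts")):
        if token == "*":
            rtts.append(None)  # no reply for this probe
        else:
            # "<10 ms" is recorded as 10, i.e. an upper bound.
            rtts.append(int(re.search(r"\d+", token).group()))
    name, ip = match.group("name"), match.group("ip")
    if ip is None:
        # The node has only a numeric address and no domain name.
        name, ip = None, name
    return {"hop": int(match.group("hop")), "rtt_ms": rtts, "name": name, "ip": ip}


# Hypothetical sample row; the IP address is a documentation placeholder.
sample = "  1    20 ms    30 ms   <10 ms  cisco-2.bart.ucl.ac.uk [192.0.2.1]"
print(parse_row(sample))
```

Header lines and blank lines do not start with a hop number, so parse_row returns None for them and they can be skipped when looping over the full output.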
In our trace we begin at node one, which is the departmental router for my office, known by the somewhat cryptic domain name cisco-2.bart.ucl.ac.uk. The data packets move rapidly onward to the next node, which has only a numeric address and is likely to be an anonymous but important link somewhere in a server room in the university. At node number three the data leaves my university's internal network and joins the London metropolitan-area network (http://www.lonman.net.uk/Images/map99.gif), which provides a fast backbone for universities and colleges in London. From there it moves on to the JANET backbone, the U.K.'s academic and research network (http://www.ja.net/), at a gateway machine called south-east-gw.ja.net. Node five is the gateway router to the transatlantic link for the JANET network, which connects to ny-pop.janet, its point of presence (POP) in New York. At this point the data packets have crossed the Atlantic, and consequently you can see a marked jump in the round-trip travel times, from under 10 milliseconds to 70 milliseconds, caused largely by the 3,000-mile distance between London and New York. The traffic then flows into the Teleglobe backbone network (http://www.teleglobe.com/) in New York at nodes seven and eight before passing to AlterNet, part of UUNET's backbone empire (http://www.alter.net/lang.en/network/), also in New York.
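A rough calculation suggests why distance alone can account for most of the transatlantic jump in round-trip time. The Python sketch below assumes that signals travel through optical fibre at about 200,000 kilometres per second (roughly two-thirds of the speed of light in a vacuum) and that the cable follows a straight 3,000-mile path; both are simplifying assumptions, and the function name min_round_trip_ms is illustrative.

```python
MILES_TO_KM = 1.609344
# Assumption: light in optical fibre covers roughly 200,000 km per second.
SPEED_IN_FIBRE_KM_PER_S = 200_000


def min_round_trip_ms(distance_miles):
    """Lower bound on round-trip time, in milliseconds, from propagation alone."""
    distance_km = distance_miles * MILES_TO_KM
    one_way_seconds = distance_km / SPEED_IN_FIBRE_KM_PER_S
    return 2 * one_way_seconds * 1000


print(round(min_round_trip_ms(3000)))  # prints 48
```

Under these assumptions about 48 of the 70 milliseconds are spent simply covering the distance; the remainder would come from the longer real cable route and from delays in the routers along the way.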
Notice the strange, long domain names of these routers at the core of the Internet. These names often hint at the city where the node is located. They could contain the full name of the city, such as newyork, or just an abbreviation, such as nyc, which can require some educated guesswork to decode. Fortunately for traceroute explorers, many of the large backbone operators use similar naming conventions for their network infrastructures.
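This kind of educated guesswork can be partly automated. The following Python sketch splits a router name into its parts and checks each against a small table of city hints. The table CITY_HINTS and the function guess_city are hypothetical, the hints are only the ones mentioned in this article, and a real decoder would need a far larger table and would still make mistakes.

```python
import re

# Hypothetical hint table built from the abbreviations discussed in the article.
CITY_HINTS = {
    "newyork": "New York",
    "nyc": "New York",
    "ny": "New York",
    "tco": "Tysons Corner",
}


def guess_city(hostname):
    """Return a guessed city for a router name, or None if no hint matches."""
    for token in re.split(r"[.\-]", hostname.lower()):
        # Drop trailing digits so that a label such as "nyc2" matches "nyc".
        key = token.rstrip("0123456789")
        if key in CITY_HINTS:
            return CITY_HINTS[key]
    return None


print(guess_city("ny-pop.janet"))          # prints New York
print(guess_city("south-east-gw.ja.net"))  # prints None
```

Returning None rather than a best guess keeps the uncertainty visible: a name with no recognizable hint is reported as unknown instead of being assigned to the wrong city.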
From AlterNet's network in New York, packets flow through two more nodes before dropping down to Washington, D.C., at node 12. Then it is on to two nodes in TCO, according to their domain names, which most likely stands for Tysons Corner, Virginia. We are nearing our target. At node 15 we have reached ISOC's network, and node 16 is the end of the line: the Internet Society's Web server. The ISOC home page took 16 hops, across four different networks, to get to my browser. That is quite a feat of routing and cooperation, but all in a day's work for the Internet. This happens, unseen, for the millions of Web surfers who need not worry about where the data flows.
Triangulating the Internet with Web-Based Traceroutes
Conventional traceroute utilities are limited in one important respect: the origin point of the exploration is fixed to the location of the PC running the trace. To overcome this limitation, you can use Web-based traceroutes that allow you to run a trace from different starting points. These allow you to explore the Internet's topology from multiple locations, a kind of virtual triangulation. I ran a traceroute from Canberra, Australia, to ISOC by using a Web-based traceroute publicly provided by Telstra, a major Australian telecom company and Internet backbone operator (http://www.telstra.com.au/). The output trace timetable is as follows:
Note that the output from this version of traceroute is in a slightly
different format from the previous version.
Although geographic traceroutes are useful, their mapping capabilities currently are limited by the serious difficulty in mapping an Internet node to an actual latitude and longitude. There is no automatic way to match up these virtual and real-world addresses. This is true even in the U.S., where one can at least match backbone nodes to particular cities somewhat reliably. Consequently, geographic traceroutes use a variety of heuristics to try to resolve a network node to a geographic location with various levels of accuracy.[2] This is a difficult problem to crack, and traceroutes largely depend on looking up Internet addresses in static databases of latitude and longitude. These databases, however, are only partial and cannot keep pace with the Internet's constant growth and change. They seem especially weak outside the U.S.
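The static-database approach, and its main weakness, can be illustrated with a minimal Python sketch. The table GEO_TABLE and the function locate are hypothetical: the address ranges are ones reserved for documentation, the coordinates are approximate city centres, and no real geographic traceroute is claimed to work exactly this way.

```python
import ipaddress

# Hypothetical static table mapping address ranges to (city, latitude, longitude).
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), ("London", 51.51, -0.13)),
    (ipaddress.ip_network("198.51.100.0/24"), ("New York", 40.71, -74.01)),
]


def locate(ip_text):
    """Return (city, latitude, longitude) for an address, or None if unlisted."""
    address = ipaddress.ip_address(ip_text)
    for network, place in GEO_TABLE:
        if address in network:
            return place
    return None  # the table is only partial, so many nodes stay unmapped


print(locate("192.0.2.17"))   # prints ('London', 51.51, -0.13)
print(locate("203.0.113.5"))  # prints None
```

Any node whose address range is missing from the table, or has changed hands since the table was compiled, cannot be placed on the map, which is why such databases struggle to keep up with a growing network.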
About the Author: Martin Dodge