Hiroyuki Inoue <firstname.lastname@example.org>
Nara Institute of Science and Technology
Kanchana Kanchanasut <email@example.com>
Asian Institute of Technology
Suguru Yamaguchi <firstname.lastname@example.org>
Nara Institute of Science and Technology
The AI3 is a satellite network covering several countries with Japan as a hub. In this project, we have adopted a caching system with prefetching. On top of these two caching and prefetching engines, we have added an agent to plan prefetching strategies based on the traffic and user behavior. The agent can from time to time force the AI3 caching system to redistribute documents to be cached within the network and to selectively prefetch documents. Several factors--such as access patterns, bandwidth consumption, and hit rates--can trigger the agent's actions to rearrange the caches, thus making the cache system adaptive. The agent's actions include cleaning up the cache by removing the least recently used pages, forcing page migrations from the hub cache to the rim caches, and adjusting the prefetching intervals. In this paper, we introduce the design and implementation of our adaptive WWW cache mechanism.
With the growing popularity of the World Wide Web, mechanisms such as mirroring and caching have been proposed to rescue the Internet by reducing the page waiting time and WWW traffic. Mirroring involves cooperation between the page owners and the mirror sites and thus requires prior arrangements. Caching can be more generally applied where a cache server can be set up to provide closer-to-home services for users who wish to reduce the page access time by selecting the cache servers as their proxy server. The first cache server introduced was the CERN HTTPD server , where recently accessed pages are stored for subsequent use. Several cache servers can be organized to forward requests from one cache to another hierarchically using Harvest . International trials on cooperative caching, such as NLANR  and the TERENA's CHOICE  project, as well as hierarchical caches, are going on. Several issues around WWW cache systems have been pointed out: low hit rate of the cache, consumption of bandwidth between cache systems, and consistency management of the cache.
To improve the performance of the cache, several proposals to prefetch documents have been made [8, 9, 10, 11]. It has been found that although prefetching improves the hit rate and the retrieval time, the bandwidth consumption increases dramatically.
The AI3 network , which is the Internet backbone of the Asian research community, has its major gateway connected to the global Internet with T3 fat pipe in Japan and T1 links to four countries (Cambodia, Indonesia, Hong Kong, and Thailand). In this network, we are going to implement an adaptive WWW cache mechanism. The goals of this trial are to reduce WWW traffic in this backbone and to maintain high hit rate of the cache for WWW users in the AI3 networks. Our cache mechanism has three components: a cache manager, a page prefetching engine, and a traffic controller called "agent."
For several technical developments and experiments in the AI3, we are constructing and operating the AI3 testbed. The installation of this network was begun in 1996. Figure 1 shows the current configuration and status of the AI3 testbed. The testbed uses Ku-band satellite communication channels as its datalink technology. The ground station--its main hub--is located at NAIST in Japan, and several 1.5 Mbps (megabits per second) links go to Indonesia, Hong Kong, Thailand, and the other partners' countries and regions. The link is operated on a 24-hour, 7-day basis. The network protocol employed here is IPv4, but IPv6 and other new features such as RSVP and MBone are going to be employed in this testbed.
Figure 1. The AI3 Network
The WWW cache server is widely used on the Internet to reduce network load . When a user requests a URL (uniform resource locator), the WWW client does not send the request to the original WWW server that contains the requested data but to the cache server as the proxy server. If the object specified by the URL is in the cache server, it is sent to the client without access to the original WWW server. The cache server is normally located at the local site of clients or at the network that connected the clients to a high-speed line (at least 10 Mbps). If the users' requests demonstrate a high degree of cohesiveness, both the traffic and the latency of page retrieval for WWW users are expected to be improved.
For further improvement of WWW latency, there is a prefetching technique that looks ahead to linked objects such as links to another page, inline images, sounds, videos, and so on in a WWW page. The prefetching server also works as a cache server. When the prefetching server accepts a request from a client, it retrieves the requested object and parses it, and also retrieves the linked objects and preserves them in the cache. This is very effective in improving the latency for the users and the hit rate of the cache, because almost all users tend to subsequently access objects that are linked to the page currently being displayed. However, prefetching increases network traffic. The prefetching server Wcol at NAIST has increased the hit rate of the cache by 20%, but increases WWW traffic by 200%.
Because of the nature of satellite transmission, the AI3 network link becomes a bottleneck when clients in several countries with AI3 access WWW servers in other countries. For networks with slow links, a proposal has been made to use a smart cache , which will improve the latency of low-bandwidth links by using file prefetching with previous access patterns, deferred access, automatic updating at night, and file compression. The smart cache consists of two cache servers at both ends of the slow link working in cooperation with each other.
The AI3 satellite link is not quite low bandwidth (1.5 Mbps), but has a large RTT (round trip time) of about 500 milliseconds. The time necessary to retrieve an object via HTTP is proportional to the RTT, and at least two RTTs [13 ]are required. To alleviate this problem, HTTP 1.1  includes persistent connections and pipelining, which W3C evaluated . They reduce use of system resources (e.g., total usage of packets and sockets), but increase total time for page retrieval. In the AI3 network, prefetching at the rim side is about 1 second faster than at the hub. However, prefetching increases traffic congestion on the satellite link.
The AI3 network consists of the Internet itself, the hub networks connected to the Internet by the T3 line, satellite links between AI3 hub and rim networks, and WWW servers and clients, as shown in Figure 2. We call the cache servers of the hub and rim networks the "hub cache" and "rim cache," respectively. We mainly consider WWW access from clients on the rim side to servers on the hub side. Each cache server has huge disks to cache WWW data and access logging. The satellite gateway observes the traffic that it forwards, as well as its contents.
Figure 2: System Model
When a client on the rim side requests a WWW object, it sends the request to the local rim cache. If the requested object is not found in the rim cache, it is forwarded to the hub cache. If the requested object is not found there, the hub cache fetches the object from its source. Then, hub and rim caches analyze the access pattern to predict and optimize future requests and prefetch the object if necessary.
Here the hub and rim functions are clearly separate. The hub prefetches as much as it can using the existing access pattern and does not have to worry about bandwidth or RTT. The rim tries to economize prefetching. It does its own prefetching for commonly used pages at regular intervals and can ask the hub to prefetch troublesome or bottlenecked objects.
The hub cache does prefetching based on access pattern analysis, as explained below. The rim cache concentrates on minimizing traffic between the rim and the hub and hence can work independently. The AI3 adaptive cache system has the following features: bandwidth control, access pattern analysis for prefetching, and prefetching while considering RTT.
By observing the traffic and delay on the satellite link, the cache servers determine whether to prefetch the WWW object. If the satellite link looks congested, the rim will stop prefetching and allow the hub to continue.
Because all accesses from the rim concentrate at the hub, the hub knows the information of the request and its contents. The hub cache analyzes them and determines which pages are popular, and prefetches to minimize the latency for the clients. It builds the access pattern statistics that represent the relationship with WWW objects and the frequency of reference. If a client requests one of the objects, the cache determines the priority of related objects and prefetches the objects with the highest priorities.
Blind prefetching of WWW objects by the rim cache could result in huge traffic across the satellite link. Therefore we must consider the RTTs of the satellite link and the link to the WWW server. The rim cache keeps track of the RTT and time for retrieval of each document. Subsequent requests will automatically force the cache to fetch the object. If an object is popular enough, it is put on a list and is prefetched at regular intervals. When there is a scheduled prefetching request for the object, the cache will look up the information about it. If the time for retrieval is much greater than the RTT, then there is little difference for the user whether it is prefetched or not. In this case, the cache will not prefetch but will leave the object to be fetched by the next user.
A rim cache serves users within a rim network and is one level down the hierarchy from the hub cache. Physically all rim caches are directly connected through satellite communication. The main role of the rim cache is to reduce traffic on the satellite link by acting as a proxy server for all clients within the rim network.
The rim cache is a secondary level cache; hence a request from a client within the rim network is first sent to the rim cache. If the rim cache cannot find the object it will forward the request to the hub cache. If this request is also a miss, the hub cache will obtain the requested object from its source and keep a copy while forwarding the request to the rim cache. At the same time the hub cache would prefetch the neighboring object of the requested page. Only the requested object gets passed down to the rim cache; the rim cache fetches any neighboring objects from the hub cache, if required.
Reducing the traffic on the satellite link means that we have to increase the cache hit rate at the rim's end. An agent is used to control the rim cache and prefetch popular pages to be cached. This prefetching is done at regular intervals during off-peak hours.
The agent keeps the following information on each page or URL it encounters:
The agent selects a number of pages whose frequencies of access are higher than some preset value, or popularity threshold, for regular prefetching. For each page, the prefetching interval should ensure that the page gets prefetched after an update. The higher the frequency of updates, the more frequently the page needs to be prefetched.
The distance metric and the size of the file help the agent decide whether to prefetch a page. For example, if the page is large and time-consuming to retrieve, as reflected by the distance metric, it may not be worthwhile to prefetch because the prefetched page may not be used at all. In such a case, it is better to wait until someone requests that particular page and cache.
Apart from doing routine prefetching, the agent can be triggered by traffic congestion. The agent can delay its scheduled regular prefetching and analyze bandwidth consumption. If the traffic is caused by regular prefetching, the agent can review the prefetching threshold or adjust the popularity level to remove pages from the prefetching list. If the traffic is caused by the miss rate from the rim cache, the agent can increase the number of pages to be prefetched in order to improve the rim cache hit rate.
However, if the congestion is caused by the cache itself, the agent can decide to clean the cache by removing underutilized pages. To determine if a page should be removed from the cache, the agent can examine its frequency of access over a period of time. The regular prefetching schedule can be organized such that large pages or pages with high distance metric be prefetched during off-peak periods.
The rim cache learns the clients' access pattern and analyzes it for prefetching to minimize latency. Objects are classified into structured objects (e.g., HTML  FTP directories, including in-line objects) and unstructured objects (e.g., images, sound, text files). Most structured objects are Web pages. The algorithm of access pattern analysis and prefetching is as follows:
Figure 3: Access Pattern Analysis
To avoid the congestion of satellite links by prefetching, the rim and hub cache servers decide whether to prefetch by considering the object's past RTT and time for retrieval. The algorithm of prefetching while considering RTT is as follows:
We have described a caching mechanism adopted by the AI3 project that is currently under implementation. A satellite link between Thailand and Japan is planned to be available in February 1997. We expect to run experiments in the very near future from which performance data will be gathered and analyzed.
The authors would like to thank JSAT for supporting our operation of the satellite channels.
Hiroyuki Inoue is a Ph.D. candidate at the Graduate School of Information Science, Nara Institute of Science and Technology, and also works for Sumitomo Electric Industries, Ltd. His research interests include load balancing of the distributed system and network security for the Internet.
Kanchana Kanchanasut is an associate professor in Computer Science, Asian Institute of Technology, Thailand. She received her Ph.D. from the University of Melbourne in 1991. Her recent research interests include computer networking and mobile computing.
Suguru Yamaguchi has been an associate professor in the Graduate School of Information Science, Nara Institute of Science and Technology since 1993. He received his Ph.D. in engineering from Osaka University in 1991. He has been also a member of the WIDE Project since its creation in 1987, where he has been conducting research on network security systems for wide area distributed computing environments. His research interests include technologies for information sharing, multimedia communication, network security, and network management for the Internet.