INET Conferences




Other Conferences

[INET'98] [ Up ][Prev][Next]

A Top 10 Approach for Prefetching the Web

Evangelos P. MARKATOS <>
Catherine E. CHRONAKI
Foundation for Research and Technology - Hellas (FORTH)


In the World Wide Web, bottlenecks close to popular servers are very common. These bottlenecks can be attributed to the servers' lack of computing power and the network traffic induced by the increased number of access requests. One way to eliminate these bottlenecks is through the use of caching. However, several recent studies suggest that the maximum hit rate achievable by any caching algorithm is just 40% to 50%. Prefetching techniques may be employed to further increase the cache hit rate by anticipating and prefetching future client requests.

This paper proposes a top 10 approach to prefetching, which combines the servers' active knowledge of their most popular documents (their top 10) with client access profiles. According to these profiles, clients request and servers forward to them, regularly, their most popular documents. The scalability of the approach lies in the fact that a Web server's clients may be proxy servers, which in turn forward their top 10 to their frequent clients which may be proxies as well, resulting in a dynamic hierarchical scheme, responsive to users' access patterns as they evolve over time. We use trace-driven simulation based on access logs from various servers to evaluate top 10 prefetching. Performance results suggest that the proposed policy can anticipate more than 40% of a client's requests while increasing network traffic by no more than 10% in most cases.



The World Wide Web traffic continues to increase at exponential rates [9]. One way to reduce Web traffic and speed up Web accesses is through the use of caching. Caching documents close to clients that need them reduces the number of server requests and the traffic associated with them. Unfortunately, recent results suggest that the maximum cache hit rate that can be achieved by any caching algorithm is usually no more than 40% to 50% -- that is, regardless of the size of the cache and the caching scheme in use, one out of two documents cannot be found in the cache [1]. The reason is simple: Most people browse and explore the Web, trying to find new information. Thus, caching old documents is of limited use in an environment where users want to explore new information.

One way to further increase the caching hit ratio is to anticipate future document requests and preload or prefetch these documents in a local cache. When the client requests these documents, the documents will be available in the local cache, and it won't be necessary to fetch them from the remote Web server. Thus, successful prefetching reduces the Web latency observed by clients and lowers both server and network load. In addition, offline prefetching activated after-hours when there is plenty of bandwidth at low rates may reduce overall cost and improve performance. At the same time, prefetching facilitates offline browsing, since new documents that the user will most probably be interested in are automatically prefetched.

Unfortunately, it is difficult, if not impossible, to guess future user needs, since no program can predict the future. In this paper we propose top 10, an approach to prefetching that successfully addresses the above concern through the use of server knowledge to locate the most popular documents; the use of proxies to aggregate requests to a server and amortize the costs of prefetching over a large number of clients; and adaptation to each client's evolving access patterns and needs.

Server knowledge

In the Web, it is well known that "popular documents are very popular" [9]. Thus, the largest percentage of Web requests to a server is for a small set of documents (files). We call this set of documents "top 10" -- the most popular documents of a server. Only documents that are members of the top 10 are considered for prefetching by the clients. The actual number of documents in the top 10 (which is not always 10) is fine-tuned based on client profiles which reflect the volume of requests initiated in the recent past by the client and the amount of disk space allocated for prefetched documents.

This idea of prefetching only popular items is not new: it has been followed for several years now by music stores. A significant percentage of a music store's stock contains the most popular recordings of each week. Both music store owners (proxies) and music store customers (users) make their purchases based on the week's top 10. The actual purchases themselves determine next week's top 10, which will determine future purchases, and so on. The whole process of creating a top 10 chart, distributing it to popular music magazines and radio (or TV) channels, and making purchases based on the top 10 of the current week, is being successfully used by millions of people each week all over the world.


The average client makes a small number of requests to each server. Arlitt and Williamson [2] report that at least 30% of a server's clients make only one request and never request anything from the same server again. If a client is going to request only one document from a server, prefetching makes no sense. If, however, clients go through a proxy, prefetching documents at the proxy may improve performance significantly, since documents prefetched on behalf of one client may also be used by other clients as well.


The volume of requests generated by various Web clients differs. Some of them (esp. proxies) generate large numbers of requests, while others request only a few documents. Top 10 prefetching has been designed to adapt to these cases and allow frequent users to prefetch lots of documents, while occasional users are allowed to prefetch few documents, if any at all. Prefetching is controlled by the access profile of the client. This profile contains the number of documents the client has requested from each server in the recent past in order to determine how many documents should be prefetched from each server in the future. This way, if the access patterns of the client change, top 10 prefetching will follow the trend and start prefetching from the currently popular servers.

In the rest of the paper we will describe top 10 prefetching in more detail and present performance results based on trace-driven simulation.

Top 10 prefetching

Figure 1. Top 10 prefetching operates in a client-proxy-server framework.

The top 10 approach to prefetching is based on the cooperation of clients and servers to make successful prefetch operations. The server side is responsible for periodically calculating a list with its most popular documents (the top 10) and serving it to its clients. Actually, quite a few servers today calculate their ten most popular documents, among other statistics, regularly (e.g., see Calculating beyond the ten most popular documents is an obvious extension to the existing functionality.

To make sure that documents are prefetched only to clients that can potentially use them, top 10 does not treat all clients equally. Time is divided in intervals and prefetching from any server is activated only after the client has made a sufficient number of requests to that server (>ACCESS_THRESHOLD). Thus, no documents are prefetched to occasional clients, while frequent clients are identified and considered for prefetching.

Some clients, i.e., proxies, make many more requests than others, and a correctly designed algorithm should prefetch different numbers of documents to different clients. For example, it makes no sense to prefetch the 500 most popular documents to a client that made 10 requests during the previous time interval. Taking the recent past as an indication of the near future, that client will make around 10 requests in the next time interval and only a small percentage of the prefetched documents will be used. On the other hand, a client that made 20,000 requests during the previous interval will benefit from prefetching the 500 most popular documents, and even more. Top 10 prefetching adjusts the amount of prefetching to various clients based on the amount of requests made in the recent past. Along those lines, a client may not prefetch more than the number of documents it accessed during the previous time interval.

Finally, to make sure that top 10 policy can be more or less aggressive when needed, the TOP-10 parameter defines the maximum number of documents that can be prefetched during any time interval from any server. Thus, at any point, a client cannot prefetch more than the TOP-10 documents even if it accessed lots of documents during the previous interval. By choosing a large value for TOP-10, prefetching can be very aggressive. On the other hand, small values of TOP-10 limit the extent of prefetching.

In summary, the top 10 approach to prefetching has two safeguards against letting prefetching get out of control: (i) ACCESS_THRESHOLD which identifies occasional clients and does not prefetch documents to them; and (ii) TOP-10 which can practically deny prefetching even to very frequent clients. We believe these safeguards are enough to control the extent of prefetching.

Client-proxy-server prefetching framework

We envision the operation of top 10 prefetching in a client-proxy-server framework (see figure 1). Prefetching occurs both at the client and the proxy level. User-level clients prefetch from first-level proxies to cater to the needs of particular users. First- and second-level proxies play both the client and the server role. First-level proxies are clients to second-level proxies and prefetch and cache documents for user-level clients (i.e., browsers). Second-level proxies are clients to various popular servers from which they prefetch and cache documents to be served to their own clients.

We picture first-level proxies at the department level of companies or institutions and second-level proxies at the level of organizations or universities. Even though this framework implies a structure, this structure is dynamic and may support dynamic proxy configuration schemes. In any case, top 10 prefetching may be transparent to the user and cooperate with the caching mechanisms of the browser or the proxy.

Figure 2. Top 10 prefetching depends on the cooperation of the various HTTP servers and a client-side prefetching agent

The implementation of top 10 prefetching is based on the cooperation of server and client-side entities (see figure 2). On the server-side, the top 10 daemon processes the access logs of the server and compiles the top 10, the list of the most popular documents on that server. Then, it updates a Web page presenting this information, and the top 10 is served as yet another document by the http server. The frequency of evaluating the top 10 depends on how frequently the content on the server changes.

On the client side, the prefetching agent logs all http requests of the client and adapts its prefetching activity based on them. The prefetching agent cooperates with a proxy that filters all http requests initiated by the client. If an http request can be served from the local cache of prefetched documents, the proxy serves the document from the cache. Otherwise, it forwards the request to the Web server or the next level proxy. Daily or weekly, depending on the configuration, the prefetching agent goes through the client access logs which contain all http requests made by the client and creates the prefetching profile of the client, that is, the list of servers from which prefetching should be activated. The number of documents requested from any of those servers during the previous time interval exceeds the ACCESS_THRESHOLD. Finally, based on the prefetching profile of the client, the prefetching agent requests the most popular documents from the servers which have been activated for prefetching. The number of documents prefetched from each server is equal to the number of requests to that server during the last time interval or the TOP_10, whichever is less.

Although the details of prefetching top 10 documents can be fine-tuned to suit each client, the underlying principle of prefetching only popular documents is powerful enough to lead in successful prefetching. An advanced prefetching agent may request and take into account additional parameters like document size, percentile document popularity, and client resources to make a more informed decision on what should be prefetched.


Experimental environment

Server traces

To evaluate the performance benefits of prefetching, we use trace-driven simulation. We have gathered server traces from several Web servers from a variety of environments that include universities, research institutions, and Internet providers both from Europe and the United States. All traces total more than four million requests. Specifically, the traces are from

  • Parallab ( A Supercomputing Center associated with the University of Bergen, Norway. These traces represent heavy traffic, especially from all over Norway.
  • FORTH: Traces from our organization.
  • Rochester ( Traces from the Computer Science Department of the University of Rochester, New York, USA.
  • NASA: Traces from the Web server at NASA's Kennedy Space Center.
  • FORTHnet: Traces from the Web server of an Internet provider, that originated from our organization.

We believe that it is very important to use traces from a variety of sources, since different server traces display access patterns from different client bases.

The characteristics of the traces from our servers are summarized in the following table:

Name Duration Total Requests
FORTH 4 months 328,070
Rochester 1.5 months 499,073
Parallab 1 month 843,442
NASA 1 month 1,697,736
FORTHnet 2 months 1,333,413

We preprocessed the original traces and removed requests to "cgi" scripts. We have also removed requests from local (within the same domain) clients, since local users tend to reload their pages frequently (e.g., while changing them), thereby creating an artificial popularity for some pages.

Performance metrics

The performance metrics we use in our experimental evaluation are the hit ratio and traffic increase. The hit ratio is the ratio of the requests that are serviced from prefetched documents to the total number of requests. It represents the "guessing ability" of our algorithm. The higher this ratio is, the lower the client latency and the server load.

The traffic increase is the increase in traffic due to unsuccessfully prefetched documents. Since no program can predict the future, some of the prefetched documents will not be actually requested, and thus they should not have been prefetched in the first place.

The design of a prefetching policy is the art of balancing the conflicting factors of hit ratio and traffic increase while trying to guess future requests. At one extreme, if an algorithm never prefetches any documents, it will not suffer any traffic increase, but it will also have zero hit ratio. At the other extreme, a very aggressive prefetching algorithm may prefetch the entire Web (assuming enough resources), but this would saturate the network with prefetched documents that will never be requested.

In all our experiments we assume that a prefetched document stays in the cache for the entire duration of a time interval, so that a small cache will not distort our results. This assumption is not unrealistic, however, since the cost of magnetic disks is low, usually lower than network bandwidth cost.

The benefits of prefetching

Figure 3. Successfully prefetched documents as a function of the size of the TOP-10.

Figure 4: Traffic increase as a function of the size of the TOP-10.

In this first set of experiments we investigate the costs and benefits of our top 10 prefetching approach. Figures 3 and 4 plot the hit ratio and the traffic increase as a function of the TOP-10: the maximum number of documents that any client, no matter what its access history is, can prefetch within a time interval. The time intervals in these experiments are chosen to be 50,000 client accesses long. You may observe that for all servers, as the size of the TOP-10 increases, the hit ratio increases as well, which is as expected, since the more documents a client is allowed to prefetch, the better its hit ratio will be.

FORTHnet has the best hit ratio of all servers. Therefore, prefetching from FORTHnet results in high performance. To understand this performance advantage we need to grasp the dimensions that influence prefetching in general and the hit ratio in particular. The hit ratio is high when (i) lots of prefetching is being done and (ii) this prefetching is successful. Top 10 prefetches documents to repeated clients which are those that visit a server during successive time intervals. Additionally, top 10 prefetches large volumes of documents to heavy clients which are clients that access lots of documents during a time interval. Thus, the more repeated and heavy clients a Web server has, the higher the hit ratio is going to be.

It turns out that FORTHnet has the largest percentage of repeated clients (23.5%) as figure 5 suggests. Effectively, one out of four FORTHnet clients visit for at least two successive time intervals. Since FORTHnet has more repeated clients than any other server, it has the potential for prefetching to more clients. Moreover, FORTHnet (almost) has the most heavy clients as well, as figure 6 suggests. Actually, the ten best FORTHnet clients amount for 12% of FORTHnet's requests, the largest percentage in any of the servers we studied. As FORTHnet has both heavy and repeated clients, its clients benefit from top 10 prefetching.


ServerRepeated Clients

Figure 5. Percentage of repeated clients of each server. Observe that 23.5% of FORTHnet's clients (vs. 5.4% of FORTH's) are repeated, i.e., they visited during two successive time intervals.


Figure 6. Cumulative percentage of requests as a function of the number of clients that make these requests. Figure 7. Cumulative percentage of requests as a function of the documents requested. Top 10 documents are very popular on each server, but the degree of their popularity depends on the server.

Going back to the hit ratio in figure 3, we see that the performance of the NASA server and the FORTH server follows that of FORTHnet. This is as expected, since NASA has lots of repeated clients, but few heavy clients, and FORTH has lots of heavy clients, but few repeated clients. Finally, Rochester and Parallab follow with lower hit rates, since neither of them has particularly large numbers of repeated or heavy clients. It is interesting to note, however, that although Parallab has more heavy clients than Rochester, and a comparable number of repeated clients to Rochester, Rochester's hit ratio is better. This can be explained by looking at the documents each server serves to its clients. Figure 7 shows the cumulative percentage of requests for a server's documents. We see that Rochester has significantly more popular documents than Parallab does. For example, the ten most popular of Rochester's documents account for 30% of Rochester's total requests, while the ten most popular of Parallab's documents account for only 10% of Parallab's requests. Thus, prefetching the ten most popular of Rochester's documents is going to result in higher hit ratio than prefetching the ten most popular of Parallab's documents.

From the above discussion it is clear that the performance of prefetching depends on several factors. The most important ones seem to be the client base of a server and the popularity of the documents a server provides. Frequent clients that access lots of documents form a very good basis for successful prefetching.

Although prefetching reduces the number of requests made to a Web server, it may also increase traffic, since the prefetched documents may not be needed by the client that prefetched them. Figure 4 plots the traffic increase as a function of the TOP-10 for all servers simulated. We see that the traffic increase is small for almost all servers for low (< 500) value of TOP-10. For example, prefetching up to 500 documents results in less than 12% traffic increase for any server. Actually, the traffic increase for Parallab is only 5%.

The effect of proxies

In Figure 3 we notice that, with the exception of FORTHnet, the hit ratio of top 10 prefetching is between 3% and 12%, which is rather low. Although it could be increased by making a more aggressive prefetching (e.g., by increasing ACCESS_THRESHOLD), aggressiveness will significantly increase the traffic. Recall that the essence of prefetching is in keeping a good balance between high hit ratio and low network traffic increase. Thus, we should find other ways to improve the performance of prefetching.

One way to improve the performance of prefetching is through the use of proxies. Proxies are being extensively used for caching and firewall purposes by intervening all requests from a domain [15]. We advocate that prefetching can benefit from the use of proxies. In the current traces, several clients even from the same domain make distinct requests to a specific server. Thus, each server ends up with lots of clients, few of which qualify for prefetching. If, however, all these clients access the server through a proxy, the proxy would aggregate all the client's requests and qualify for prefetching as a repeated and heavy client. Thus, the proxy would prefetch documents that could be used to reduce the latency of any of its clients and thus improve performance. For example, if a document is prefetched on behalf of one client in a proxy's cache, the document may be served locally to all other clients that use the same proxy. Thus a client will be able to make use of a document that was prefetched on behalf of another client.


Figure 8. Successfully prefetched documents as a function of the size of the TOP-10. Figure 9. Traffic increase as a function of the size of the TOP-10.

To show the benefits of proxying in prefetching, we use trace-driven simulation and introduce artificial proxies that gather client requests and distribute the benefits of prefetching onto a larger number of clients. Artificial proxies are generated by grouping clients into larger groups and considering the whole group as a proxy for all the group's clients. Although several grouping algorithms can be designed, we use a straightforward one, which is very close to the proxying schemes used in practice. The grouping algorithm is as follows:

A request coming from a client will be considered as coming from a proxy that has the same name as the client, with the first part of the client's name stripped off.

For example, all requests coming from clients and are grouped into requests coming from proxy That is, all requests that originate from any computer of the computer science department of the University of Rochester appear as coming from a single computer from that department, which is what most reasonable proxying schemes do.

Figure 8 plots the hit ratio (for proxies) due to prefetching as a function of the TOP-10. We see that the hit ratio of prefetching using proxy servers has doubled or even tripled compared to Figure 3. For example, the hit ratio of prefetched documents from FORTHnet is close to 45%, for Parallab, 18%, and for the other servers, in-between. Fortunately, this increase in hit rate comes at almost no increase in network traffic, as Figure 9 suggests. For low (< 500) TOP-10 values, the traffic increase is less than 20%, and sometimes there is even a traffic decrease! The reason for the observed traffic decrease is simple: A prefetched document that will be used by two clients of the same proxy results in traffic decrease, since the document is fetched into the proxy only once, thus saving the second request that the second client would make if there were no proxy.

We should note, however, that for aggressive prefetching, the network traffic increases as high as 60%, which starts to get significant. Fortunately, when TOP-10 is less than 500, the hit ratio is almost the same with the cases for higher values of TOP-10, and the traffic increase is down to at most 20%, which seems reasonable. Interestingly enough, this observation holds for Figs. 3 and 4 where no proxies are used: increasing TOP-10 more than 500 does not noticeably increase hit rate.

Second-level proxies



Figure 10. Successfully prefetched documents as a function of the size of the TOP-10.

Figure 11. Traffic increase as a function of the size of the TOP-10.

To carry the idea of proxying one step further, in this section we study two-level proxies. First-level proxies aggregate requests from user-level clients, while second-level proxies aggregate requests of first-level proxies. The algorithm we use to group the first-level proxies into second-level proxies is as previous:

Second level proxies are found by stripping off the first two parts of a client's name.

For example, all requests coming from clients and will appear as coming from second-level proxy Effectively, first-level proxies aggregate requests from within a department, while second-level proxies aggregate requests from within an institution. In the specific example of the University of Rochester, our method assigns a first-level proxy at each department and a second-level proxy for the whole university.

Figure 10 plots the hit ratio (for second-level proxies) as a function of the TOP-10. We see an even higher performance improvement compared to Figure 8. FORTHnet reaches a hit ratio of more than 60%, while even the server with the worst performance (Parallab) reaches a hit ratio close to 45%. Even better, this performance improvement is usually accompanied by a noticeable traffic decrease, as Figure 10 suggests, since a document prefetched on behalf of one client will be used by lots of other clients as well. Although for very aggressive prefetching, traffic increase may go as high as 60%, prefetching up to 500 documents results in good performance and unnoticeable traffic increase.

Frequency of top 10 release

In this section, we investigate how often a new top 10 should be released. The top 10 music charts have been released every week for several decades now, without any significant problems. It would be interesting to see if the same one-week time interval should apply to the calculation of the top 10 of each Web server as well.

Intuitively we believe that the time interval for the top 10 calculation should neither be too large nor too small. For example, if a new top 10 is released every several months, then it may be out of date, and all clients that prefetch it may not use it. If, on the other hand, a new top 10 is released every few minutes, then it will probably not be credible and would imply a significant overhead for clients that would prefetch probably the same top 10 every few minutes.


Figure 12. Hit ratio as a function of time. Figure 13. Traffic increase.

To find what values of the time interval would be appropriate, we conducted a trace-driven simulation, where we varied the time interval from half a week up to a month and plotted the performance results in figures 12 and 13 (TOP-10 is fixed at 500, with the exception of NASA, where it is fixed at 100). We see that for most servers, hit ratio improves slowly with the time interval and then declines. * Interestingly enough, we see that for all servers the best interval seems to be between one and two weeks. Thus, every one to two weeks, a new top 10 should be released.

Figure 13 shows the traffic increase as a function of the time interval. We see that traffic increases slowly with the time interval, and sometimes it fluctuates around a value. For Parallab and Rochester, traffic increases sharply after two and three weeks, respectively. The reason is that after that time interval, lots of clients start to qualify for prefetching, and lots of prefetching operations start to clients that do not really need the prefetched data. Fortunately, for time intervals less than or equal to 2 weeks, all servers' traffic increase is lower than 20%. In summary, the best balance between hit ratio and traffic increase seems to be achieved when a new top 10 is released every couple of weeks.

Previous work

The area of Web prefetching is rather new and is currently being actively explored. Padmanabhan [13] suggested a server-initiated prefetching algorithm. He observed that Web document requests have interdependencies; that is, if a document tex2html_wrap_inline956 is requested from a Web server by some client, then probably document tex2html_wrap_inline958 will also be requested within a small time interval tex2html_wrap_inline960 by the same client. Each Web server keeps this dependency information in the form of a graph. The nodes of the graph are documents; an edge between documents tex2html_wrap_inline956 and tex2html_wrap_inline958 represents how probable is the access of document tex2html_wrap_inline958 after document tex2html_wrap_inline956 has been accessed. When a client requests document tex2html_wrap_inline956 , the server, along with document tex2html_wrap_inline956 , sends all the documents tex2html_wrap_inline958 that are likely to be accessed next. Alternatively, the server along with document tex2html_wrap_inline956 sends only the names of the documents tex2html_wrap_inline958 that are likely to be accessed, leaving the initiative for prefetching to the client. Padmanabhan validated his approach using trace-driven simulations that gave very encouraging results: e.g., a 36% reduction in network latency can be achieved at the cost of a 40% increase in network traffic.

Bestavros has also proposed a similar approach for server-initiated prefetching [4, 3]. For all pairs of documents tex2html_wrap_inline956 and tex2html_wrap_inline958 , Bestavros calculates the probability p[i,j] with which document tex2html_wrap_inline958 will be requested within time interval tex2html_wrap_inline960 after document tex2html_wrap_inline956 is requested. According to those probabilities, a server can advise clients on which documents to prefetch. Bestavros conducted trace-driven simulations which provided very encouraging results. For example, his approach using only 10% extra bandwidth can result in 23% reduction in document miss rate.

Contrary to the previously proposed sophisticated prefetching algorithms, our top 10 approach uses a simple and easy-to-calculate metric. Most Web servers routinely calculate their most popular documents, among other statistics. Thus, no extra work is needed on behalf of the server in order to calculate which documents should be prefetched. Interestingly enough, top 10 achieves similar (and sometimes better) results than other prefetching heuristics. For example, top 10 has been shown to achieve close to a 60% hit rate, at only a 10% traffic increase (see figure 10), because TOP-10 capitalizes on two Web invariants:

  • Popular documents are very popular; and
  • The use of proxies increases the hit ratio and reduces network traffic.

Gwertzman [10, 9] proposed a geographical push-cashing approach to reduce a Web server's load: when the load of a Web server exceeds some limit, the server replicates (pushes) the most popular of its documents to other cooperating servers that have reduced load, so that clients will be able to make future requests for these documents from the other servers. Push-cashing replicates only popular documents (much like top 10) in some server, while top 10 replicates them in some proxy close to the client. In push-cashing, the client still needs to request the replicated documents from some (usually nonlocal) server, while in top 10 the clients request the documents from a local proxy. To make matters worse, in push-cashing clients need to know which server to ask the documents from; that may involve a request to the original server, which adds even more to the client's latency. Thus, although push-cashing may off-load a busy server by replicating its most popular documents, it still requires the client to make one or more requests to nonlocal servers in order to find the replicated data. On the contrary, top 10 replicates popular documents only to local proxies, so that clients always make local accesses to replicated data.

Wachsberg et al. [17] propose the use of prefetching as a way to improve performance of Web browsing over low-bandwidth links. They propose a client-based approach whereby each proxy will keep a list of documents needed by its clients, and it will decide which of them to prefetch. However, they have reported no performance results yet.

The benefits of prefetching are going to be investigated within the LowLat [16] project. In the LowLat approach, a preloader will be responsible for communicating with the Web servers and prefetching documents from them. The prefetching process will take into account several factors including bandwidth, cache load and space, and server load.

Recently, there has been considerable work on Web caching, that is, caching of popular documents close to clients [1, 5, 7, 6, 8, 11, 12, 14]. All this work aims at reducing both network traffic and server load by keeping data close to clients that re-use them. Most Web servers and proxies today support some form of caching. Our work complements the research in caching, since all benefits of prefetching are in addition to those of caching. While caching attempts to provide fast access to a document the second time it is accessed, prefetching provides fast access to a document the first time it is accessed. Moreover, the already existing infrastructure for caching (proxies, etc.) can also be exploited for prefetching.

In summary, we believe that our top 10 approach is an easy way to calculate an algorithm that achieves effective prefetching of documents in the Web.


In this paper we present a systematic approach towards the reduction of the Web latency experienced by Web clients, by prefetching documents before they are actually requested by the users.

Prefetching has not been employed in the Web so far, mainly for several reasons: (i) a prefetching robot can easily get out of control and start prefetching everything that is out there; (ii) prefetching may be ineffective, since nobody knows what a client will want to access; (iii) proxies may delay clients; and finally, (iv) prefetching over high-speed interconnection networks may result in minor performance improvements.

We believe our prefetching approach addresses all previous concerns about prefetching for the following reasons:

  • The top 10 approach to prefetching uses several well-defined thresholds to make sure that only a small number of useful documents are prefetched. The prefetched documents do not increase the total traffic by more than 10-20%. Our prefetching approach sometimes even reduces the total network traffic by aggregating several clients' requests. Thus, it cannot get out of control and lead to traffic chaos.
  • We prefetch only the proven popular documents that most clients have accessed and that future clients will probably want to access. Thus, the risk of bringing useless documents is minimized. Essentially, by prefetching only popular documents, the risk of guessing the future is significantly reduced.
  • Although sometimes a user "feels" that it is faster to retrieve a document directly from a server instead of going through a proxy, this is because most of the other users think that they should go through a proxy and avoid putting unnecessary load on a server. If everybody starts accessing the servers without intervening proxies, most servers will collapse, and most interconnection lines close to the servers will saturate. The only way to avoid the collapse is to use proxies somewhere in the traffic route from a client to a server. One could argue that aggregating several clients through a proxy will slow down all clients and eventually saturate the proxy. Although this may be true to some extent, it cannot easily happen. Proxies are usually powerful computers capable of handling hundred of requests per second. If each interactive user that browses the Web makes one request every few seconds, then the proxy will be able to handle up to a few thousand users before being unable to cope with more requests. Realistically, most departments do not have that many users to generate the load needed to saturate a proxy.
  • Although prefetching over a high-speed interconnection may result in minor performance improvements, certainly it does not hurt the clients, and it may benefit the servers by taking some of the load off the server and putting it on the proxies.

In addition to addressing previous concerns, prefetching in general, and top 10 in particular, has several advantages:

  • Prefetching can be used during off-peak periods to download useful documents at low transfer costs. By transferring documents during low-rate periods, prefetching pays for itself and may even result in profit.
  • Prefetching reduces client latency by spreading the load from busy servers to idle clients/proxies. Thus, it improves user turnaround time and user productivity.
  • Prefetching can also be used to organize up-to-date digital repositories of related documents. For example, a prefetch agent may download all documents related to a specific research topic, update them regularly, and make them available to local users.

Although several people have concerns about prefetching, we believe that the top 10 approach has been specifically designed to address these concerns and bring out the benefits of prefetching. Top 10 takes the risks out of prefetching by doing it in a controlled way that results in significant performance improvements with only a minor traffic increase.


In this paper we present a top 10 approach for prefetching World Wide Web documents. Top 10 prefetches only the most popular documents (that is where the name comes from) and only to clients that will be able to use them. We use trace-driven simulations of server traces to evaluate the costs and benefits of our approach. Considering our experimental observations, we conclude the following:

  • The top 10 approach to prefetching may result in significant performance improvements. Our experimental results suggest that top 10 manages to prefetch (up to) 60% of future requests, with little (less than 20%) corresponding increase in traffic (see figure 10).
  • Top 10 has been shown to be robust over a wide variety of parameters and server loads. We have used traces from five different servers both in Europe and the United States. Top 10 always managed to result in good performance for all traces without significant traffic increase.
  • Prefetching the most popular documents is a simple but effective prefetch heuristic. It requires very little effort to be computed on the server side, while it provides performance comparable to (if not better than) previously proposed sophisticated prefetching heuristics.


This work was supported in part by project PENED 2041 2270/1-2-95. We deeply appreciate this financial support. We also thank the University of Rochester, the University of Bergen, ICS-FORTH, NASA, and the University of Crete for providing us with traces of their Web servers.


M. Abrams, C.R. Standridge, G. Abdulla, S. Williams, and E.A. Fox. Caching Proxies: Limitations and Potentials. In Proceedings of the Fourth International WWW Conference, 1995.
M.F. Arlitt and C.L. Williamson. Web Server Workload Characterization: The search for Invariants. In Proc. of the 1996 ACM SIGMETRICS Conference, May 1996.
Azer Bestavros. Speculative Data Dissemination and Service to Reduce Server Load, Network Traffic and Service Time for Distributed Information Systems. In Proceedings of ICDE'96: The 1996 International Conference on Data Engineering, March 1996.
Azer Bestavros. Using speculation to reduce server load and service time on the WWW. In proceedings of CIKM'95: The Fourth ACM International Conference on Information and Knowledge Management, November 1995.
Azer Bestavros. Demand-based document dissemination to reduce traffic and balance load in distributed information systems. In Proceedings of the 1995 Seventh IEEE Symposium on Parallel and Distributed Processing, October 1995.
J.-C. Bolot and P. Hoschka. Performance Engineering of the World-Wide Web. In Proceedings of the Fifth International WWW Conference, 1996. Paris, France.
Anawat Chankhunthod, Peter B. Danzig, Chuck Neerdaels, Michael F. Schwartz, and Kurt J. Worrell. A Hierarchical Internet Object Cache. Technical Report 95-611, Computer Science Department, University of Southern California, Los Angeles, California, March 1995.
S. Glassman. A Caching Relay for the World Wide Web. In Proceedings of the First International WWW Conference, 1994.
J. Gwertzman. Autonomous Replication in Wide-Area Networks. Technical Report 17-95, Harvard University, 1995.
J. Gwertzman and M. Seltzer. The Case for Geographical Pushcaching. In Proceedings of the 1995 Workshop on Hot Operating Systems, 1995.
Radhika Malpani, Jacob Lorch, and David Berge. Making World Wide Web Caching Servers cooperate. In Proceedings of the Fourth International WWW Conference, 1995.
E.P. Markatos. Main Memory Caching of Web Documents. Computer Networks and ISDN Systems, 28(7-11):893-906, 1996.
V.N. Padmanabhan. Improving Word Wide Web Latency. Technical Report 95-875, University of California at Berkeley/CSD, 1995.
J.E. Pitkow and M. Recker. A Simple, Yet Robust Caching Algorithm Based on Dynamic Access Patterns. In Proceedings of the Second International WWW Conference, 1994.
Neil G. Smith. The U.K. National Web Cache: A state of the Art Report. Web Journal, 1(3), Summer 1996.
Joe Touch. About the LowLat Project, 1996.
Stuart Wachsberg, Thomas Kunz, and Johnny Wong. Fast World-Wide Web Browsing Over Low-Bandwidth Links, 1996.
The interested reader will notice that the NASA and Rochester curves stop in two and three weeks respectively. The reason is that in order to conduct a meaningful prefetching experiment, we need at least two time intervals: one interval to find the top 10, and a second interval to give the top 10 to clients and measure their hit ratio. Since we had only one month's traces for NASA, the highest time interval we could simulate was 2 weeks.

* The author is also with the University of Crete.

[INET'98] [ Up ][Prev][Next]