Experimental Investigation of Hierarchical Internet Object Cache Characteristics
B. KOLICS <firstname.lastname@example.org>
Object caching is a widely used technique in the current Internet environment. It enables one to save network bandwidth, which is often a scarce resource, and to reduce users' perceived latency of downloading a network object. Many software products have been developed to enable Internet Object caching: CERN httpd, Harvest, Squid, Netscape Proxy Server, Microsoft Catapult, etc. Although one can configure any of the existing products easily, the configuration settings may not reflect the given state of network, operating system, and other cache influencing parameters -- i.e., these settings represent a static rather than a dynamic caching model. Markatos has already shown in the case of main memory caching of Internet objects that a dynamic or adaptive approach outperforms other, nondynamic models.
Our final goal is to build up a simplified discrete dynamic model of a specific Internet object caching system. The state variables in this model reflect the current state of the caching system using a reduced set of cache-related variables and parameters. To be able to specify state variables and to eliminate irrelevant details from our model, we need to measure the static and dynamic characteristics of different object caches.
The discrete dynamic model of the distributed and hierarchically organized cache system is developed for dynamic simulation, diagnostic, and control purposes. The dynamics are described by a discrete event dynamic system model, more specifically by a stochastic colored Petri-net model. The system model is composed of different hierarchical levels reflecting the hierarchical nature of the cache system itself and the hierarchical nature of the caching procedures therein.
The dynamical behavior of the hierarchical object caches has been investigated by specially designed measurement series, which are the most common in the R&D environment. The investigation took place at four different institutions in Hungary: at the top-level cache of the academic network, at a research institute, and at two universities.
The following cache variables have been investigated by the measurements:
By using these different data sets we will be able to specify
Object caching is a widely used technique in the current Internet environment. It enables one to save network bandwidth that is often a scarce resource and to reduce users' perceived latency of downloading a network object. Many software products have been developed to enable Internet Object caching (CERN httpd, Harvest, Squid, Netscape Proxy Server, Microsoft Catapult, etc.). Although one can configure any of the existing products easily, the configuration settings may not reflect the given state of network, operating system, and other cache-influencing parameters; in other words, these settings represent a static rather than a dynamic caching model. Markatos has already shown  in the case of main memory caching of Internet objects that a dynamic or adaptive approach outperforms other, nondynamic models.
The idea is that by using the tools and techniques that are common in process control and computer science theory, one could investigate Internet Object caching and fine-tune the relevant cache parameters to optimize the performance of the cache system. The first step is to build up a simplified discrete dynamic model of a specific Internet object caching system for control purposes. The next step is to design an adaptive way of tuning relevant cache parameters.
The dynamic model of the caching system is developed in the usual way. Starting from the system description and from the modeling goal, the dynamic variables are identified and then an abstract model is built.
A simplified dynamic model of a leaf cache in a cache hierarchy is developed. The cache server has a parent on a higher level. It is assumed that after a given time-out, the cache is able to download the requested object itself from the original source.
The cache accepts requests from the clients and delivers the requested objects to them either from its cached objects or from the hierarchy or from the original source.
The aim of our modeling is to find ways to tune the most influencing cache tuning parameters in a model-based and adaptive way. Therefore the model includes the following:
Because of the above modeling goal, a request level mechanistic dynamic model is needed which is able to describe the nonlinear dynamic behavior of the cache over a wide operation region.
The performance of a proxy cache is characterized by the hit-rate and by the response time of the individual queries. Moreover, the quality of service based on cached WWW-objects is also important where the ratio of stale (i.e., not up-to-date responses) is of importance. The above variables can be regarded as measurable output variables of the proxy cache as a dynamic system and will naturally be the subject of optimization or control.
The major dynamic variables influencing the behavior of a proxy cache are that of the characteristics of the requests. The number of requests per hour is a time-varying and random variable which can be regarded as a disturbance to the dynamic system. Its average value characterizes the load of the cache. However, the quality of the requests (the "difficulty" of the requests) also plays a role but it is rather difficult to associate a measurable variable (or variables) to it.
The set of state variables of the proxy cache system consists of the variables which characterize the actual status of the cache comprising its "past" into their value. The actual status of the resources of the proxy cache computer relevant to caching plays a role here, such as the size and the content of the cache memory and disk space, the content and organization of the object disk directories, etc. The state variables in this model reflect the current state of the caching system using a reduced set of cache-related variables and parameters.
Finally, the potential input variables, i.e., the measurable variables which can be directly influenced, are surely among the 100 tuning knobs available for tuning proxy caches. From this set the most influencing and important variables are the available memory size, disk size, the processor capacity, the sizes of the object disk directories, file handlers, and the disk speed.
The discrete dynamic model of a leaf cache server is developed for dynamic simulation, diagnostic, and control purposes. The dynamics are described by a discrete event dynamic system model, more specifically by a stochastic timed colored Petri-net model (CPN model) . A Petri net is a kind of finite automata designed to model finite sequences. Colors of the tokens are used to distinguish various types of elements (in our case, requests) to be processed by the system. The timed and stochastic nature of the CPN is used to describe the time-dependent and random nature of the subprocesses or steps involved.
In order to obtain a simple and feasible model of the caching system, the following assumptions have been made:
The structure of a CPN is usually shown graphically as a signed directed bipartite graph with scripts associated with the places, arcs, and transitions. Places are denoted by circles and transitions as rectangles on CPN figures. The operation of a CPN is visualized by the movement of the tokens with various and varying colors on the places of this graph. In addition to this, a CPN contains inscriptions on its places, transitions, and arcs describing the following:
The super-page (top-level page) of the hierarchical CPN model of the investigated cache system is shown in Fig. 1 as an example of a CPN structure.
The most important dynamic variables described in section Dynamic (time-varying) variables are represented in the CPN model above in the following way:
The cache load is modeled through the time-dependent set of the request tokens on the place Arrival.
Only the response time as the main characteristic variable is taken into account (assuming that the hit-rate is above the acceptable 25 percent value). The response time as a distribution is computed from the service-time color item of the arrived request tokens on the place Exit.
The state variables of the cache system are represented by the set of the cached request tokens on the places Object list, Memory object, and Disk object.
The input variables are mainly present in various constraints, i.e., upper limit on available resources in the CPN model, such as
In addition to the dynamic variables, the CPN model contains parameters that characterize the modeled cache system. The model parameters are mostly response times of various kinds associated with the relevant transitions in the CPN and can also be measured. These parameters possess a random and slowly time-varying nature, i.e., they can be characterized by slowly time-varying distributions. The following table summarizes the model parameters and their locations in the CPN model.
Note that the time trans-resp is only a very rough approximation of the substeps when the transmission from the hierarchy takes place. It also depends heavily on the number and size of the requests in a given time interval. For our simplified model it is assumed that the dependence of trans-resp on the size and number of requests is negligible; in other words, an average size and an average network bandwidth are assumed.
To be able to eliminate irrelevant details from our model and to verify it against real behavior, we need to measure the static and dynamic characteristics of the modeled object cache. The dynamic variables (input, state, output, and disturbance variables) of our CPN model have been measured using special tools.
In addition, the model parameters above have also been determined from measurements either directly or indirectly. To estimate the missing model parameters it is assumed that under low cache load the additivity of the service time of the individual requests holds; in other words, the total service time (being the response time) is the sum of the time of the individual service steps and it does not depend on the presence of the other requests being serviced at the same time.
It is important to note that this assumption is surely not valid under high load when all or at least some of the cache system resources are overloaded.
The dynamic behavior of the object cache has been investigated by specially designed measurement series, which are the most common in the research and development environment. The investigation took place as part of an exhaustive measurement series performed at 4 different institutions in Hungary: at the top-level cache of the academic network, at a research institute, and at two universities.
The caching software used at the modeled cache was Squid-1.1.17.  This software generates two log files which are useful for analyzing the cache activity. We applied a patch made available by Alex Rousskov  which enhances the log file records by some time profiling data. It must be stressed that these measured data contain per request based measurements since the profiling data are measured by the caching software for every request.
The following fields associated with various activities in per request based log files have been used:
Accepting request from the client (Client In):
Sending data to the client (Client Out):
Sending request to a primary or proxy server (Server Out):
Receiving response from a server (Server In):
Swapping data into the memory buffer (Disk In):
Swapping data out from memory buffer to disk (Disk Out):
The patch also appends the first_delay/total_duration entries to SWAPIN and SWAPOUT records (the time needed to swap an object from/to memory to/from disk) which are also used.
The response time of a query is computed from the above measured data using the formula below:
Response-time = Client Out (start + total duration) - Client In.start
From the list of measured variables above it is seen that the model parameters (see Table 1) can be directly computed from enhanced log files except for trans-resp. The distribution of the model parameters can be determined from the log files collecting queries of the same service type: for example, cached objects stored in memory. Fig. 2 and Fig. 3 show the empirically determined distribution function of disk-io and send-client, assuming low load of the cache.
If the individual response and transmission times as random variables can be regarded as independent, then the mean value and variance of the response time as a random variable can be easily computed from the mean values and variances of the times belonging to the processing steps. As it was said before, this independence assumption is usually valid under low load. It is important to note that the distribution of the random variables seems to be logarithmic-normal (see Fig. 2 and Fig. 3).
Once the empirical distribution of the response time from the times of the individual steps was estimated, it was compared to its empirically determined distribution function for memory cached objects, and a good correspondence was observed.
Finally, the directly unmeasured trans-resp parameter can be estimated from the empirically determined mean value and variance of the response time and that of send-client under the same assumptions as above using the statistics generated from the data of noncached requests using the formula
trans-resp = response time - send-client
resulting in the following statistics:
The tool we used for the simulation is Design/CPN 3.0.4 available at . The simulator has a graphical editor part, which enables users to create colored Petri nets and simulate them. It is capable of handling hierarchical, stochastic, and timed nets as well. The specification of the CPN variables, guards, and arc expressions can be done by using the CPN/ML language. The timed CPN model of the investigated Web caching system is shown on Fig. 1, which is printed from the graphical editor part of Design/CPN 3.0.4. The specification expressions are shown as well. Inscriptions on places indicate the initial marking, i.e., the set of tokens with which one starts the simulation. Having specified an initial marking, the simulator searches for enabled transitions (i.e., for transitions with their preconditions fulfilled) and fires them (i.e., moves tokens from places representing the preconditions to the places of post-conditions) in each time step. The simulation stops if there is no more transition to fire, i.e., if all request tokens have been processed.
Simulation input. A simulation run is performed by specifying the initial marking which represents the following cache variables:
Simulation output. As a result of the completed simulation the following two items are generated:
The initial marking, i.e., the simulation input of a simulation run, has been computed from the measured and evaluated per request based log files in different cases, i.e., for different cache load and time interval. The model output is generated therefrom by performing a simulation run and collecting the simulation output. The statistics computed from the service-time color element of the request tokens on the place Exit are regarded as an empirical distribution of the response time measured for the same load and time interval which is the measured output.
For model validation purposes, the request tokens appearing on the place Arrival have been generated in random way as follows. The ratio of the cached and noncached requests has been chosen according to the 30% hit-rate of the cache.
Both the model output and the measured output are random variables corresponding to the response time; therefore, they can be compared using standard methods in mathematical statistics.
Good correspondence has been found between the measured and simulated distribution of the response time.
It is expected that the Internet will contain several thousands of cooperating cache systems. Modeling and process control might help to make it happen.
For this purpose, a novel timed hierarchical stochastic colored Petri net model of Internet object caching has been developed. Simulation tools have been used to verify the model against real measurements. We used a couple of assumptions in order to scale. A good correspondence has been found between the model output and measured data for low cache load.
One task for the future is to refine the model and extend it with new levels of abstractions (with modeling parts of the cache systems). Another task is to extend the model for cache hierarchies.
The model is designed to be used for adaptive tuning of the most influencing cache configuration parameters in a model-based way. Extensive simulation studies using the developed model will be used to create a condensed description of the multivariate dependence function of the response time on the cache tuning parameters.
 Evangelos P. Markatos: Main memory caching of Web documents. In Proceedings of the Fifth International World-Wide Web Conference, Paris, France, May 1996.
 B. Kolics: Toward Intelligent Internet Object Caching (Diploma thesis). University of Veszprém, 1997.
 Ho, Y.C., Cassandras, C.: A New Approach to the Analysis of Discrete Event Dynamic Systems. Automatica, 19, 149-167 (1983)
 Kurt Jensen: Coloured Petri Nets, Basic Concepts, Analysis Methods and Practical Use, Vol. 1-3, Springer Verlag, 1992-1997
 Design/CPN Online home page http://www.daimi.aau.dk/designCPN/
 Squid Internet Object Cache on page http://squid.nlanr.net/
 Squid Profiling Statistics on page http://www.cs.ndsu.nodak.edu/~rousskov/research/cache/squid/profiling/