[INET'99] [ Up ][Prev][Next]

Traffic Modeling of Online Multimedia Education

Bill LAVERY <wjl@swin.edu.au>
Tony CRICENTI <tcricenti@swin.edu.au>
Swinburne University of Technology
Australia

Abstract

Broadband Internet access will rapidly proliferate beyond the education and business sectors to become available in homes and workplaces. This will open up new opportunities and new challenges for tertiary education to go online -- that is, for educational services to be delivered to students at home and in their workplace, overcoming the time and travel constraints of conventional place-based face-to-face educational methods. Thus, Internet-delivered online multimedia education (OLME) is likely to develop into a major mechanism for the provision of off-campus education, and if so, it will represent a significant component of Internet traffic.

In this OLME, a student uses a personal computer to access (over the broadband Internet) an educational server computer, the latter typically at a university or other educational service provider. The student's PC (the client) downloads software from the university computer (the server), and then executes that software on the client. The student learns by working through the material presented and interacting with the executing software, which typically includes information presentation, simulations, tutorials, and formative tests. In addition to this learning process, there will generally be an assessment process, also conducted online.

At first sight, it might appear that this type of education is not very different from distance education, and indeed online education is sometimes perceived as being essentially just distance education delivered electronically rather than physically. The authors believe that this latter view is fundamentally incorrect. We believe that a more appropriate view of tomorrow's OLME can be found in today's high-quality educational CD-ROMs [e.g., Ryan 97], for the latter will set market expectations [Sykes & Sewell 96]. Thus good-quality OLME of the foreseeable future will be characterized as follows.

· Presentation of the material to be mastered will be engaging, using multimedia technologies; for example, students will rarely be presented with substantial blocks of text to be read on the screen.
· There will be high levels of interaction, in which the learner learns by "hands-on" experimentation, typically adjusting the parameters of simulations to complete some specified tasks, observing the consequences of actions and choices, and thereby developing working understandings of the underlying principles.
· The presentation of the simulations will be very realistic, approaching that of virtual reality, resulting in a high degree of engagement of the learner with the lesson.
· There will be ample problem solving, exercises, and tests to promote concept clarification and deep learning.
· Personalized and student-centered lessons will guide the student through lessons in the most appropriate manner applicable to the individual student.
· Internet-conveyed student-to-student and student-to-teacher communication will be inherent and commonplace, and provide the group cohesion required to inhibit the debilitating isolation so common in today's distance education.
· Students will function with a globally connected paradigm, routinely and self-directedly accessing resources and support from the global Internet and its community.

In this paper we consider the characterization of the traffic emanating from OLME servers when serving a number of OLME clients over the broadband Internet. We consider three elements:

  1. We propose a model for the OLME study process, which leads to theoretical models of the traffic emanating from an OLME server to service the study process of individual OLME clients; by combining multiple clients we derive a server source traffic model for serving multiple clients. The resultant model is a very bursty traffic generator, for which simulations confirm very effective statistical multiplexing.
  2. We describe measurements taken of server traffic to individual clients when using a number of commercially available OLME packages, and use the traffic measurements to validate and refine the theoretical model of item 1.
  3. We describe traffic measurements for an OLME server while conducting online evaluation for many students simultaneously, using a proprietary online assessment package. We anticipate that the user data transfer process of such online assessment will be substantially different from the data transfer process when learning new material.

We then comment on the significance of these traffic models for the performance of the OLME application, considering architecture, traffic mix, and quality-of-service capabilities envisaged for the broadband Internet.

1. Introduction

It is apparent that broadband Internet access will rapidly proliferate beyond the education and business sectors to become available in homes and workplaces. This will open up new opportunities and new challenges for tertiary education to go online, that is, for educational services to be delivered to students at home and in their workplace, overcoming the time and travel constraints of conventional place-based face-to-face educational methods. Thus, Internet-delivered online multimedia education (OLME) is likely to develop into a major mechanism for the provision of off-campus education, and if so, it could represent a significant component of Internet traffic. In this OLME, a student uses a personal computer to access an educational server computer, the latter typically at a university or other educational service provider, as in figure 1.

Figure 1. OLME Architecture

The student's personal computer (PC; the client) downloads software from the university computer (the server), and then executes that software on the client. The student learns by working through the material presented and interacting with the executing software, which typically includes information presentation, simulations, tutorials, formative tests, and an assessment process, the latter sometimes also conducted online.

At first sight, it might appear that this type of education is not very different from distance education, and indeed online education is sometimes perceived as being essentially just distance education delivered electronically rather than physically. The authors are of the opinion that this latter view is fundamentally incorrect. We believe that a more appropriate view of tomorrow's OLME can be found in today's high-quality educational CD-ROMs [e.g., Ryan 97], for the latter will set the market expectations [Sykes & Sewell 96]. Thus good-quality OLME of the foreseeable future will be characterized as follows:

In this paper we consider the characterization of the Internet traffic emanating from OLME servers and OLME clients. We consider two levels of this characterization:

We then consider the significance of these traffic models for the performance of the OLME application, considering network architecture, traffic mix, and quality-of-service capabilities envisaged for the broadband Internet [Lavery & Cricenti 98].

2. Online multimedia education study process model

First consider the student's traffic demands. Typically the student (client) dials the server, logs in, and then starts a module from some courseware. In the instructional process, the student may be presented with OLME, and "doing a subject" consists of working through a number of modules, each of which consists of a number of interactive screens and/or simulations. Auxiliary communication, such as with tutors and administrators, is not considered, for it is assumed to constitute a negligible and much less demanding traffic component when compared with the traffic demands for downloading the lessons.

From a data traffic demand point of view, there will be identifiable stages:

Figure 2 models the data traffic demands involved in such Internet-delivered OLME.


Figure 2. Online Multimedia Education: Transmissions from Servers.

The model consists of three levels of timing diagram, involving "Download Data Bursts," "Lessons," and "Study Sittings," as defined below.

Download Data Bursts (DDBs): These are short bursts of data sent to the client, corresponding to one (usually) interactive screen, with which the student then interacts and learns.

The typical figures are first estimates based on our experience observing the behavior of users of CD-based interactive education [Ryan 97] and should be treated with caution; in practice, these figures will vary with the data size of the DDBs and the transmission rates. Note that the time between bursts of data is very long compared with the duration of a burst, and thus such traffic demands would be described as very bursty. The DDBs are assumed to be conveyed by some best-effort protocol, as this application is not highly sensitive to delay jitter.

Lessons: These are periods over which the student learns a topic by interacting with a series of interactive screens. Each screen is downloaded when the student requires it, in the sequence specified by the student and/or the supervisory software. Thus multiple DDBs constitute one lesson.

Study Sittings: These are the periods that a client spends connected to a server.
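
The three levels of the model can be pictured as a small nested data structure. The following Python sketch is illustrative only: the class and field names are hypothetical, and nesting Lessons inside a Study Sitting is an assumption drawn from the timing-diagram description rather than a definition given in the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DownloadDataBurst:
    """One DDB: the data sent to the client for one interactive screen."""
    start_s: float       # time at which the burst starts (seconds)
    duration_s: float    # how long the burst lasts (seconds)
    content_bytes: int   # bytes carried by the burst


@dataclass
class Lesson:
    """A series of DDBs through which the student learns one topic."""
    bursts: List[DownloadDataBurst] = field(default_factory=list)

    def total_bytes(self) -> int:
        return sum(b.content_bytes for b in self.bursts)

    def burst_intervals(self) -> List[float]:
        """Quiet time between the end of one DDB and the start of the next."""
        ordered = sorted(self.bursts, key=lambda b: b.start_s)
        return [nxt.start_s - (cur.start_s + cur.duration_s)
                for cur, nxt in zip(ordered, ordered[1:])]


@dataclass
class StudySitting:
    """The period that a client spends connected to a server (assumed to hold lessons)."""
    lessons: List[Lesson] = field(default_factory=list)

    def total_bytes(self) -> int:
        return sum(lesson.total_bytes() for lesson in self.lessons)
```

A lesson built from a few placeholder bursts then yields its total content and the list of quiet intervals between bursts, which are the quantities that make the traffic bursty.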

3. OLME IP traffic characterization

Figure 2 illustrates how we model the student's use of the OLME courseware; that is, the figure models student demand behavior. We now seek a more detailed understanding and characterization of the IP traffic generated by such users accessing such OLME courseware. Related work has been done for "multiplayer network games" traffic [Bangun & Beadle 97; Borella]; the latter study presented a source model for the traffic using a discrepancy test (λ² measure). Characterizing the traffic is important when analyzing the performance of OLME over the Internet; the theoretical models and methodologies used are summarized in [Michiel & Laevens 97].

Surveying currently available Internet sites that contain OLME material, we can identify the following broad classes of OLME material; the classification is based on distinguishably different traffic requirements:

For each of these OLME traffic classes, within the period of each DDB, the IP traffic will be a two-way exchange of IP packets, that is, data-bearing packets and TCP (transmission control protocol) acknowledgment-bearing packets. Thus, as in figure 3, the DDBs (in the data stream from server to client) will contain multiple IP packets. At the same time, acknowledgment packets will flow back to the server.

Figure 3. Typical Traffic Characteristics (timing diagram labels: Download Data Burst; Download Data Burst Interval)

To characterize the DDBs, the following parameters are used:

To estimate these parameters, a client and server were set up on a segment of an Ethernet-based local area network (LAN), and a third machine running Linux was used to collect statistics, as in figure 4.

Figure 4. Experimental Setup

OLME HTML pages were placed on the server PC and accessed and downloaded by a World Wide Web (WWW) browser on the client PC; this is typical of WWW-based OLME carried over TCP/IP on the Internet. On the Linux PC, TCPDUMP [JLM] was used to capture the packets flowing between client and server. Fixed file sizes ranging from 10 kB to 1 MB were downloaded from the server by the client at regular intervals; the file sizes were chosen to be representative of (some of) the different classes of OLME material described above. From the collected trace, the DDBs were identified by looking for large gaps in the packet flow, as shown in the trace of figure 5.

Figure 5. Typical TCPDUMP Trace
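
Identifying DDBs by looking for large gaps in the packet flow can be sketched as below. This is a minimal illustration, not the procedure actually used for the measurements: the function names, the (timestamp, length) packet representation, and the gap threshold `gap_s` are all assumptions, and the paper does not state what gap size was treated as "large".

```python
from typing import List, Tuple

Packet = Tuple[float, int]  # (timestamp in seconds, packet length in bytes)


def split_into_bursts(packets: List[Packet], gap_s: float) -> List[List[Packet]]:
    """Group packets into bursts; a gap longer than gap_s starts a new burst."""
    bursts: List[List[Packet]] = []
    for pkt in sorted(packets):  # order by timestamp
        if bursts and pkt[0] - bursts[-1][-1][0] <= gap_s:
            bursts[-1].append(pkt)   # close enough to the previous packet
        else:
            bursts.append([pkt])     # large gap (or first packet): new burst
    return bursts


def summarize(burst: List[Packet]) -> Tuple[float, int]:
    """Return (burst duration in seconds, burst content in bytes)."""
    duration = burst[-1][0] - burst[0][0]
    content = sum(length for _, length in burst)
    return duration, content
```

Applying `summarize` to each burst gives the Burst Duration and Burst Content values of the kind tabulated in this section; the chosen threshold must be shorter than the shortest interval between downloads but longer than any pause inside a transfer.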

Figure 6 shows a typical plot of the content of DDBs flowing in the network, resulting from a client request. The bytes represent the total number of bytes transferred during a DDB, including IP and TCP headers, for traffic emanating from both client and server.

Figure 6. DDB Content (Bytes)

We see two distinct levels of content in the data bursts in the session; the small-sized bursts are caused by the navigation (e.g., switching back to an index page), while the larger-sized bursts are caused by the file transfers (e.g., a complex page). The interesting aspect of these plots is that the Client Request Burst Contents are significant, approximately 10 kB when downloading a 100 kB file from the server. This large size is caused by the action of TCP in sending acknowledgment packets from the client back to the server, as the client receives incoming packets from the server. Note that the Client Request Burst Duration must also depend on the size of the file being downloaded from the server.

The results for three different sizes of files downloaded from the server are summarized in the table below:

File Size  Burst            Mean Burst    Std (s)  Std/Mean  Mean Burst   Std (B)  Std/Mean
(B)                         Duration (s)                     Content (B)

10k        client request   0.388         0.078    0.201     3612         0        0
           server download  0.22          0.024    0.109     24961        0        0
100k       client request   1.16          0.05     0.043     10352        1.90     1.84E-4
           server download  1.02          0.03     0.029     128570       1.48     1.15E-5
1M         client request   5.27          0.12     0.022     77756        809      1.04E-2
           server download  5.11          0.12     0.023     1157056      809      6.99E-4

Note that the Server Download Burst Content is greater than the download file size; this is attributed to the TCP/IP overhead. As expected, the Burst Duration depends on the size of the file transferred, and as noted above, Client Request Burst Content and Duration also depend on the file size. Interestingly, the mean Client Request Burst Duration is greater than the mean Server Download Burst Duration, even though the mean Client Request Burst Content is much smaller than the mean Server Download Burst Content. This is because the client must wait for the last packet from the server to arrive successfully before sending the last acknowledgment packet. In addition TCP uses delayed acknowledgments, that is, it waits for a period of time (in our case, approximately 150 ms) hoping to be able to piggyback the acknowledgment onto some data [Stevens 94]. Thus at best the Client Request Burst Duration will be the "ack. delay time" longer than the Server Download Burst Duration.
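
The "ack. delay time" argument can be checked against the tabulated mean durations. The short Python sketch below simply subtracts the mean Server Download Burst Duration from the mean Client Request Burst Duration for each file size; the variable names are hypothetical and the figures are copied from the table above.

```python
# Mean burst durations in seconds, copied from the table above.
durations_s = {
    "10k": {"client": 0.388, "server": 0.22},
    "100k": {"client": 1.16, "server": 1.02},
    "1M": {"client": 5.27, "server": 5.11},
}

# Extra time by which the client burst outlasts the server burst, in milliseconds.
extra_ms = {
    size: round((d["client"] - d["server"]) * 1000)
    for size, d in durations_s.items()
}

for size, ms in extra_ms.items():
    print(f"{size}: client burst outlasts server burst by about {ms} ms")
```

The differences come out at 168 ms, 140 ms, and 160 ms, all close to the delayed-acknowledgment period of approximately 150 ms and largely independent of file size, which is what the explanation above would predict.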

Estimating the data throughput rate can be misleading for both client and server, since, within a DDB, the packets can arrive irregularly, with quiet periods between arrivals of some packets (e.g., while waiting for an acknowledgment packet). Defining the mean Burst Throughput as the mean burst bytes divided by the mean Burst Duration, we get:
 

File Size (B)  Burst            Mean Burst Throughput (kB/s; 1 kB = 1024 B)

10k            client request   9.09
               server download  110.80
100k           client request   8.71
               server download  123.09
1M             client request   14.41
               server download  221.12
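
The definition of mean Burst Throughput can be applied directly to the figures in the first table. The Python sketch below does so (the function and variable names are hypothetical). Recomputing the quotients shows that the tabulated throughputs are reproduced when the result is expressed in units of 1024 bytes per second, not in kilobits per second; the equivalent rate in kilobits per second is returned alongside for comparison.

```python
# (mean burst content in bytes, mean burst duration in seconds), from the first table.
bursts = {
    ("10k", "client request"): (3612, 0.388),
    ("10k", "server download"): (24961, 0.22),
    ("100k", "client request"): (10352, 1.16),
    ("100k", "server download"): (128570, 1.02),
    ("1M", "client request"): (77756, 5.27),
    ("1M", "server download"): (1157056, 5.11),
}


def mean_burst_throughput(content_bytes: float, duration_s: float) -> dict:
    """Mean burst bytes divided by mean burst duration, in two units."""
    bytes_per_s = content_bytes / duration_s
    return {
        "kB/s (1024 B)": bytes_per_s / 1024,
        "kbit/s": bytes_per_s * 8 / 1000,
    }


for (size, kind), (content, duration) in bursts.items():
    t = mean_burst_throughput(content, duration)
    print(f"{size:>4} {kind:<16} {t['kB/s (1024 B)']:8.2f} kB/s  {t['kbit/s']:8.1f} kbit/s")
```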

The greater efficiency for the larger download file transfers is attributed to the higher mean packet length achieved for larger files, as illustrated in the histogram in figure 7. Note the very bimodal nature of the histogram.


Figure 7. Distribution of Packet Lengths

In a second experiment an actual online lesson was performed and statistics were collected. The OLME material was made up of mainly HTML text screens, where the user accessed the pages in a sequential fashion. In this case the client, located in Melbourne, Australia, accessed the OLME material via the WWW from two servers in the United States (Utah and New York). The graphs in figure 8 summarize the distributions obtained for a lesson of 23 minutes. It must be stressed that these are preliminary indicative results and lack statistical validity; we are collecting further data.

The DDB duration histograms show that most DDB duration times are less than 10 seconds, which correlates well with the fact that most DDBs contain less than 50 kB. We also find that there are some long burst durations (10 s) and correspondingly large burst content (72 kB), typically caused by pages that contain image files. Both the Burst Content and Burst Duration distributions are skewed (possibly negative exponential), so the mean in this case will be "pulled" higher by the extreme values in the distribution. In this case the median may be a better measure of the central tendency. The median Server Burst Duration is 1.01 s (mean 1.97 s) and the median Server Burst Content is 1313 B (mean 8089 B), which results in a median DDB throughput data rate of 10400 bps. The Burst Interval histograms show that most intervals are less than 30 s; however, some intervals are long, typically caused by pages that contain large amounts of text and the reader taking considerable time to read through the material before requesting the next page.
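
The effect of the skew on the throughput estimate can be illustrated with the figures quoted above. This Python sketch (hypothetical function name) forms the throughput from the median values, as done in the text, and from the mean values for comparison; note that a ratio of medians is only an indicative figure and is not the same as the median of per-burst throughputs.

```python
def throughput_bps(content_bytes: float, duration_s: float) -> float:
    """Burst throughput in bits per second."""
    return content_bytes * 8 / duration_s


# Server burst figures quoted in the text: content in bytes, duration in seconds.
median_bps = throughput_bps(1313, 1.01)
mean_bps = throughput_bps(8089, 1.97)

print(f"from medians: {median_bps:.0f} bps")
print(f"from means:   {mean_bps:.0f} bps")
```

The estimate based on the means is roughly three times the estimate based on the medians, which shows how strongly the few large bursts pull the mean upward.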

We note that there is a direct relationship between the Server Burst Duration and the Client Burst Duration, as we found from the first experiment. The relationship between Burst Content and Burst Duration is more complicated, in that the Burst Duration is a function of the Burst Content, the throughput data rate, and the delays introduced in the Internet. More data are being collected in order to better quantify these relationships.



Figure 8. Histograms for OLME Traffic.

4. Transmission and quality considerations

The user-perceived quality of the OLME is predominantly related to latency, that is, the time from when the user requests a page to when that page appears on the screen. This latency is limited by the quality of the broadband access to the home, which in turn is limited by transmission technology and/or cost considerations.

As the Internet progressively becomes a high-capacity, high-speed service, it is the Client links of figure 1 which could be the throughput rate limiting bottleneck, having a dominant impact on latency. These Client links may be wireless, such as local multipoint distribution service (LMDS) [ACA 99], or they may be one of the digital subscriber loop (DSL) technologies [Sykes & Sewell 96]. These technologies provide an IP-based downstream capacity of fixed upper bandwidth, able to be shared between a number of subscribers and used for various multimedia services such as OLME, video on demand (VoD), and telephony. Inevitably, the service provider will not install a dedicated channel capacity for each user, but will rely on the bursty nature of the various traffic demands to maximize the number of customers serviced, exploiting statistical multiplexing. In practice, at any download instant, an individual may be experiencing maximum bandwidth (say 6 Mbps) when OLME traffic demands are low, but may intermittently drop back (to say 3, 2, 1, or 0.5 Mbps) when many OLME users are active.
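
The sharing behavior described above can be sketched with a toy model. Everything in the following Python fragment is an assumption made for illustration: users are modeled as independent on/off sources that are active with probability `p_active` in each time slot, the shared downstream capacity is 6 Mbps, and capacity is split equally among active users. None of these choices comes from the measurements in this paper.

```python
import random
from typing import List


def aggregate_demand(n_users: int, p_active: float, peak_mbps: float,
                     n_slots: int, seed: int = 0) -> List[float]:
    """Per-slot total demand (Mbps) of n_users independent on/off sources."""
    rng = random.Random(seed)
    return [
        peak_mbps * sum(rng.random() < p_active for _ in range(n_users))
        for _ in range(n_slots)
    ]


def share_per_active_user(capacity_mbps: float, n_active: int,
                          peak_mbps: float) -> float:
    """Rate each active user gets when capacity is split equally, capped at peak."""
    if n_active == 0:
        return peak_mbps
    return min(peak_mbps, capacity_mbps / n_active)


# With 6 Mbps shared equally, 1, 2, 3, 6, and 12 simultaneously active users
# receive 6, 3, 2, 1, and 0.5 Mbps each.
rates = [share_per_active_user(6.0, n, 6.0) for n in (1, 2, 3, 6, 12)]
```

Under these assumptions, the aggregate demand of many bursty users stays well below the sum of their peak rates in almost every slot, which is the effect that statistical multiplexing exploits.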

In addition the different multimedia services will be competing for bits. Most significant may be VoD, which will be conveyed over the same links in the form of IP packets with a quality-of-service constraint that delivery be not subject to excessive jitter. Thus at the instants when the instantaneous traffic demands on the distribution point are high, VoD traffic may have priority over OLME traffic, effectively introducing significant delay and delay variation into the delivery of the OLME DDBs.

Similar capacity-sharing and bit-competition considerations apply to alternative residential broadband access technologies. Hybrid fiber coax systems will have qualitatively similar limitations, as users and services compete for bits on the shared cable tree to the homes. Even with fiber to the building and DSL copper pair systems, there will still be shared capacity limits, typically occurring within exchanges, for equipment sharing is a key method by which network providers minimize cost. In addition, the customers will be offered a range of delivery quality packages based on the degree of shared resources; the more highly shared resources will be cheaper, and will be subject to greater delay variability. The consequences of these issues are qualitatively not unique to Internet-delivered OLME, for they are (painfully) evident in browsing today's WWW; long delays and delay variability impact on the user's satisfaction with the service.

The uplink traffic and the asymmetric nature of the broadband local access technologies may also be a limiting factor. Section 3 suggests that the uplink IP traffic, emanating from the client to server, may not be insignificant, and is typically of the order of 10 percent of the downlink traffic because of the acknowledgment and other packets associated with the downlink transfers. For this situation, those local access technologies that have different capacities on the uplink and downlink need to ensure adequate uplink bandwidth. Thus, for example, if LMDS or digital TV broadband IP technologies rely on the copper phone lines for the uplink, then as the downlink bandwidth is increased, the uplink bandwidth will also need to be increased by a proportionate amount. It should also be noted that, as well as being a significant amount of traffic, the uplink packet flow within a DDB is roughly regular, an effect that decreases the effectiveness of the statistical multiplexing of many simultaneous users.
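
The "order of 10 percent" figure can be related to the Section 3 measurements with a short calculation. In the Python sketch below, the burst contents are copied from the Section 3 table; the helper `required_uplink_mbps` and the 6 Mbps downlink used in the example are hypothetical and serve only to illustrate the proportionality argument.

```python
# (mean client request burst content, mean server download burst content), in bytes.
burst_content = {
    "10k": (3612, 24961),
    "100k": (10352, 128570),
    "1M": (77756, 1157056),
}

# Uplink bytes as a percentage of downlink bytes for each file size.
uplink_percent = {
    size: round(100 * client / server, 1)
    for size, (client, server) in burst_content.items()
}


def required_uplink_mbps(downlink_mbps: float, uplink_fraction: float) -> float:
    """Uplink rate needed to sustain a given downlink rate at a fixed byte ratio."""
    return downlink_mbps * uplink_fraction


for size, pct in uplink_percent.items():
    print(f"{size}: uplink is {pct}% of downlink")
print(f"6 Mbps downlink at 10%: {required_uplink_mbps(6.0, 0.10):.1f} Mbps uplink")
```

The measured ratios range from about 7 percent for the largest file to about 14 percent for the smallest, so a fixed 10 percent allowance is a rough planning figure rather than a bound.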

The specific traffic class of the IP-delivered OLME material also influences the efficiency of the statistical multiplexing, and thereby influences the number of subscribers that can be connected to a shared capacity local access system. For example, the delivery of streamed audio or video will involve more regular IP packets in the downstream traffic, and, assuming the use of the user datagram protocol rather than TCP, will require less upstream traffic.

5. Conclusions

We have provided models of the user traffic demands, and of the resulting IP traffic, for a specific Internet application (OLME). For this application, the IP traffic measurements presented suggest that the resultant IP traffic is indeed bursty, and that the traffic models will need to be long tailed and traffic class specific.

We have considered the implications on delivery latency of various local access shared-capacity technologies. The uplink traffic has been shown to be significant, and for asymmetric local access technologies, we have shown that this can be the link that limits performance.

Further OLME IP traffic characteristics are being recorded to derive comprehensive OLME IP traffic models for individual users and aggregated traffic.

6. References

[ACA 99] ACA Discussion Paper: Further Allocation Of Radiofrequency Spectrum Above 20 GHz Suitable For Broadband Wireless Services, http://203.37.2.230/28_31GHz/Futuredisc.htm

[Bangun & Beadle 97] Bangun R.A., Beadle H.W.P., "Traffic on a Client-Server Based Architecture for Multi-User Network Game Applications." Proc. ICT97 vol. 1, 2-5 April 1997, pp. 93-98.

[Borella] Borella M.S., "Source Models of Network Game Traffic."

[JLM] Jacobson V., Leres C., and McCanne S., "tcpdump," ftp://ftp.ee.lbl.gov/tcpdump.tar.Z

[Lavery & Cricenti 98] Lavery W. and Cricenti A., "Performance Limits of the Internet and Shared Local Access Technologies for On-line Education & Training," Globecom 98, Sydney, 7-11 November 1998.

[Michiel & Laevens 97] Michiel H., Laevens K., "Teletraffic Engineering in a Broadband Era," Proc. of the IEEE, vol. 85, no. 12, December 1997.

[Ryan 97] Ryan C., "Exploring Perception: an Interactive CD," Brookes Cole, Q2, 1997.

[Sykes & Sewell 96] Sykes P., Sewell R., "Telstra's Interactive Broadband Services Program." Proc. 7th International Network Planning Symposium, 24-29 November 1996, pp. 723-728.

[Stevens 94] Stevens W.R., "TCP/IP Illustrated," vol. 1. Addison Wesley, 1994, pp. 265-267.
