Extending the World Wide Web for Multicasting an HTML Document

Myung-Ki Shin <mkshin@pec.etri.re.kr>
Electronics and Telecommunications Research Institute
Korea

Abstract

The World Wide Web (WWW) provides a simple and effective means for users to share information over the Internet. However, because access to information on the Web is asynchronous and the Web is based on a strict client-server model, it does not support real-time information sharing. One method of real-time information sharing is Hypertext Markup Language (HTML) document multicasting, by which widely distributed participants can join or leave a host group and send HTML documents to it.

This paper describes augmenting the WWW to support HTML document multicasting for real-time collaboration. This is accomplished by extending the traditional WWW architecture, including its protocols, naming scheme, and document format, and merging the multicasting capability into the current WWW framework. Application scenarios are presented for integrating the WWW into MBone conferencing. To implement this multicast-enhanced scheme seamlessly as a new media type "HTML" on the MBone, we propose adapting the HotJava protocol handler and content handler. A multicast-enhanced Web browser built on this architecture can traverse and share a set of WWW documents via the MBone together with MBone audio/video tools such as LBL's VAT and VIC.

Keywords: World Wide Web (WWW), Hypertext Markup Language (HTML), Multicast Backbone (MBone), multicasting, uniform resource locator (URL), Java.

Introduction

The Internet is the world's largest computer network. It is an international collection of smaller networks, computers, and the people who use them. The World Wide Web (WWW) provides a simple and effective means for users to share information over the Internet. However, because access to information on the Web is asynchronous and the Web is based on a strict client-server model, it does not support real-time information sharing. This limitation stems from the traditional WWW data model, which relies on hypertext links and index searches performed asynchronously, and from its architecture, which is composed of the Hypertext Transfer Protocol (HTTP), the Hypertext Markup Language (HTML), and the uniform resource locator (URL)[1].

We describe augmenting the WWW to support HTML document multicasting for real-time collaboration. This is accomplished by extending the traditional data model and architecture and merging this multicasting capability into the current WWW framework. Specifically, we present the application scenarios for the integration of the WWW into Multicast Backbone (MBone) conferencing.

Related work

Several attempts have been made to use the WWW for synchronous collaboration over the Internet. An earlier approach was to extend the WWW for collaboration, as in distributed authoring, talking, annotation, and conferencing systems; a more recent approach is to build remote presentation applications using the WWW and the MBone. To implement these capabilities, WWW client APIs such as the NCSA Mosaic Common Client Interface (CCI), the Netscape Client API, and the Java API have been used.

WWW synchronous collaboration

On the WWW, synchronous means that clients can get immediate feedback. The starting point of this approach is that WWW clients cannot use servers to communicate automatically with other clients. Many research groups, such as W3C and EIT, have studied extending the WWW for synchronous collaboration, including audio/video integration[2]. Projects such as Crystal Web[3], CoReview[4], Yarn[5], and COMET+Mosaic[6] provide distributed synchronous environments for WWW participants. However, these systems depend on the X Mosaic browser and, except for Crystal Web, cannot support audio/video conferencing tools; nor do they support a multicasting environment for multi-user interaction.

Remote presentation using WWW and MBone

A remote presentation is a presentation in which some or all listeners are geographically separated but connected through a network. It usually includes audio, video, and some kind of presentation functionality, and it can easily be built on the MBone using the WWW.

The MBone is layered on top of portions of the physical Internet to support Internet Protocol (IP) multicast. It is a critical technology for making multiple-person audio and video conferencing over the Internet, and the sharing of any digital information, cheap and convenient[7], while the WWW is a convenient user interface for information presentation.

Four earlier applications exist for distributing HTML documents over the MBone. In 1995, Vinay Kumar presented his work on Shared Mosaic[8], a multicast extension to the X Mosaic browser of the National Center for Supercomputing Applications (NCSA). Two or more Shared Mosaic clients running on multicast-capable machines can share a set of URLs (not HTML documents). Ed Burns's Webcast[9] enables a group of Mosaic browsers to traverse and share a set of HTML documents via the MBone. It uses the Mosaic CCI as the browser interface and the Reliable Multicast Protocol (RMP) for real-time transmission, and it can send both URLs and HTML documents. mMosaic[10], proposed by Gilles Dauphin, is another tool for sharing HTML documents over the MBone. It is also an extended version of X Mosaic and provides a plug-in for UCL's SDR. All of these browser environments depend on X Mosaic. The mWeb[11] application from Luleå University's CDT is written in Java and distributes HTML pages using the Scalable Reliable File Distribution Protocol (SRFDP) and the Scalable Reliable Real-time Transport Protocol (SRRTP). These new protocols and the mWeb application not only bring the WWW and the MBone closer together, shortening the distance to a true real-time WWW, but also provide a platform-independent environment. However, none of these applications defines an extended architecture or application scenarios for integrating the WWW into MBone conferencing.

Extended architecture

This paper describes augmenting the WWW to support HTML document multicasting for real-time collaboration. This is accomplished by extending the architecture and merging the multicasting capability into the current WWW framework. The extended WWW architecture for HTML document multicast must provide a protocol, a naming scheme, and a document format for this capability.

Figure 1 illustrates how the protocols related to real-time conferencing, including HTML document multicasting, fit together to integrate the WWW into MBone conferencing. We define a new interactive multicast medium, "Shared HTML." HotJava, layered over it, stands for a conventional Java browser and provides sending and receiving of standard HTML documents as well as browsing. The MBone session directory SDR can support the new Shared HTML medium through plug-ins; the SDR can launch VIC, VAT, and HotJava together. In this paper, we do not describe synchronization techniques between HTML documents and real-time data.


Figure 1. Multimedia Conferencing Stack Including HTML Document Multicasting

HTTP+Multicast

HTTP relies on reliable TCP transmission, which does not suit shared HTML documents, since a TCP connection links exactly two endpoints and cannot address a host group. To solve this problem, earlier work[12] has proposed a new real-time HTTP that maps HTTP onto UDP. This approach can make browser and server implementations more complex, because the WWW browser then needs two different protocol processing routines, HTTP over TCP and HTTP over UDP. A new reliable multicast protocol may also be infeasible in the global Internet. The IETF has decided that standardization of reliable multicast protocols for the Internet is premature, because such protocols could pose a particular threat to the operation of the global Internet, and it has assigned the problem to a research group in the Internet Research Task Force (IRTF)[13]. We therefore propose HTTP+Multicast as the HTML multicasting protocol, as shown in figure 2. Here "+" denotes not addition but sequential flow. Figure 2 (ii) shows the HTTP+Multicast flow, composed of HTTP request, HTTP response, multicast send, and multicast join messages, while the traditional HTTP communication flow is composed of just a request and a response, as in figure 2 (i).

This extended protocol architecture is simple. In an environment with many participants, a sender fetches the HTML document from an HTTP server and then sends it to the multicast group. Sharing the document therefore requires only one access to the HTTP server; receivers that have joined the group see the document as well. In addition, RTP could be integrated into this protocol if an RTP payload format were defined for shared HTML documents.


Figure 2. A Flow of HTTP+Multicast
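
The sender's part of the flow in figure 2 (ii) can be sketched in Java as follows. This is a minimal sketch under stated assumptions rather than the implementation described in this paper: the class and method names are hypothetical, only standard java.net classes are used, and the document is assumed to fit in a single UDP datagram.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.net.URI;

/** Hypothetical sender: HTTP request and response, then one multicast send. */
final class HttpPlusMulticastSender {

    /** HTTP request and response: fetch the document from the HTTP server once. */
    static byte[] fetch(String httpUrl) throws IOException {
        try (InputStream in = URI.create(httpUrl).toURL().openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            for (int n = in.read(buffer); n != -1; n = in.read(buffer)) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Wraps the document in a datagram addressed to the host group. */
    static DatagramPacket toPacket(byte[] document, String group, int port)
            throws IOException {
        InetAddress address = InetAddress.getByName(group);
        if (!address.isMulticastAddress()) {
            throw new IllegalArgumentException(group + " is not a multicast address");
        }
        // Assumption: the whole document fits in one datagram.
        return new DatagramPacket(document, document.length, address, port);
    }

    /** Multicast send: deliver the document to the group with the given ttl. */
    static void send(byte[] document, String group, int port, int ttl)
            throws IOException {
        try (MulticastSocket socket = new MulticastSocket()) {
            socket.setTimeToLive(ttl);
            socket.send(toPacket(document, group, port));
        }
    }

    public static void main(String[] args) throws IOException {
        // Address, port, and ttl are example values, not fixed by the protocol.
        byte[] document = fetch("http://pec.etri.re.kr/index.html");
        send(document, "224.4.12.12", 4444, 127);
    }
}
```

Keeping the fetch and the send as separate methods mirrors the sequential meaning of "+": the HTTP exchange completes before anything is sent to the group.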

A MIME encoding for aggregate HTML documents sending

The need to send aggregate HTML documents, which include images and various objects such as applets and plug-in data, has recently become obvious. An aggregate HTML document is a Multipurpose Internet Mail Extensions (MIME)-encoded message that contains a root document as well as other data required to render that document (in-line images, Java applets, etc.). Aggregate documents can also include additional elements that are linked to the first object. This paper proposes sending such documents in MIME messages, using an encapsulation such as MHTML[14].

If a message contains one or more MIME body parts containing links, and also contains separate body parts holding the data to which these links refer, then the whole set of body parts (referring and referred-to) should be sent within a "multipart/related" body part. A body part, such as a text/HTML body part, may contain hyperlinks to objects that are included as other body parts in the same message and within the same multipart/related content. Often such linked objects are meant to be displayed in-line to the reader of the main document, for example images referenced with the IMG tag. The same mechanism is adaptable to new tags proposed in the ongoing development of HTML, such as those for applets and plug-in objects (EMBED). Our extended architecture supports receipt of multipart/related messages with links between body parts using both the Content-Location and the Content-ID methods as defined in [14].

An example with relative URLs to an embedded GIF image is as follows:

An HTML document stored at http://pec.etri.re.kr


<HTML>
<HEAD></HEAD>
<BODY>
<H1> Example1 </H1>
An Example of a MIME Encoding of Aggregate HTML Documents.<P>
<IMG SRC=/image/icon.gif>
</BODY>
</HTML>

is encapsulated into the MIME message for sending:

Mime-Version: 1.0
Content-Base: http://pec.etri.re.kr
Content-Type: Multipart/related; boundary="boundary-example-1";
    type=Text/HTML

--boundary-example-1
Content-Type: Text/HTML; charset=US-ASCII

<HTML>
<HEAD></HEAD>
<BODY>
<H1> Example1 </H1>
An Example of a MIME Encoding of Aggregate HTML Documents.<P>
<IMG SRC=/image/icon.gif>
</BODY>
</HTML>

--boundary-example-1
Content-Location: "/image/icon.gif"
Content-Type: IMAGE/GIF
Content-Transfer-Encoding: BASE64

R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
etc...

--boundary-example-1--
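
A message of this shape could be assembled as in the following Java sketch. It is an illustration only: the class name, method name, and parameters are hypothetical, and it handles just one root HTML part and one image part located by Content-Location, as in the example above.

```java
import java.util.Base64;

/** Hypothetical builder for a two-part multipart/related message. */
final class AggregateHtmlMessage {

    private static final String CRLF = "\r\n";

    static String build(String contentBase, String html,
                        String imageLocation, byte[] imageBytes, String boundary) {
        // The MIME encoder breaks its output into lines of at most 76 characters.
        String encodedImage = Base64.getMimeEncoder().encodeToString(imageBytes);

        StringBuilder message = new StringBuilder();
        // Message headers: the root part is identified by type=Text/HTML.
        message.append("Mime-Version: 1.0").append(CRLF)
               .append("Content-Base: ").append(contentBase).append(CRLF)
               .append("Content-Type: Multipart/related; boundary=\"")
               .append(boundary).append("\";").append(CRLF)
               .append("    type=Text/HTML").append(CRLF)
               .append(CRLF);
        // Referring body part: the HTML document itself.
        message.append("--").append(boundary).append(CRLF)
               .append("Content-Type: Text/HTML; charset=US-ASCII").append(CRLF)
               .append(CRLF)
               .append(html).append(CRLF);
        // Referred-to body part: the in-line image, matched by Content-Location.
        message.append("--").append(boundary).append(CRLF)
               .append("Content-Location: \"").append(imageLocation).append("\"").append(CRLF)
               .append("Content-Type: IMAGE/GIF").append(CRLF)
               .append("Content-Transfer-Encoding: BASE64").append(CRLF)
               .append(CRLF)
               .append(encodedImage).append(CRLF);
        // Closing delimiter ends the multipart/related content.
        message.append("--").append(boundary).append("--").append(CRLF);
        return message.toString();
    }
}
```

The caller is responsible for choosing a boundary string that does not occur in any body part.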

URLs for multicast addressing

The current URL specification in RFC 1630 defines schemes including HTTP, FTP, Gopher, Mailto, News, and Telnet. This standard needs to be extended for multicast addressing, and there are several ways to do so. M. Handley[15] suggested that defining a URL syntax for real-time media is a bad idea, because session information such as the multicast address, UDP port, and media format is typically conveyed by SDR, which advertises sessions using SAP and SDP. His argument holds for audio/video streams, but not for shared HTML documents, because shared HTML is used within the WWW browser itself.

We define a new URL scheme for multicast addressing. This multicast-enhanced URL, named "multicast," can specify sending and receiving an HTML document. The format of the URL is as follows:


multicast://multicast_address:port:ttl:host/file_path/file_name



multicast://multicast_address:port

The first URL sends the HTML document at host/file_path/file_name to multicast_address and port with the given ttl (time-to-live), and the second URL joins the group at multicast_address and port. The scheme thus exposes multicasting functionality, such as joining or leaving a host group or sending an HTML document with a ttl value. For example, to send the HTML document obtained from "http://pec.etri.re.kr/index.html" to the multicast address 224.4.12.12 on port 4444 with a ttl of 127, type the following URL in a browser:


multicast://224.4.12.12:4444:127/pec.etri.re.kr/index.html

To join this group, use the following URL:

multicast://224.4.12.12:4444
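
A browser has to split such a URL into its fields before it can act on it. The Java sketch below shows one way to do so. The class and its fields are hypothetical, and the parser follows the layout of the worked example, in which a slash separates the ttl from the host; it also assumes that the document is fetched over HTTP.

```java
/** Hypothetical parsed form of a "multicast" URL. */
final class MulticastUrl {
    final String address;   // multicast group address
    final int port;         // UDP port
    final int ttl;          // time-to-live, or -1 for a join URL
    final String httpUrl;   // document to fetch and send, or null for a join URL

    private MulticastUrl(String address, int port, int ttl, String httpUrl) {
        this.address = address;
        this.port = port;
        this.ttl = ttl;
        this.httpUrl = httpUrl;
    }

    /** True for a send URL, false for a join URL. */
    boolean isSend() {
        return httpUrl != null;
    }

    static MulticastUrl parse(String url) {
        String prefix = "multicast://";
        if (!url.startsWith(prefix)) {
            throw new IllegalArgumentException("not a multicast URL: " + url);
        }
        String rest = url.substring(prefix.length());
        int slash = rest.indexOf('/');
        String session = slash < 0 ? rest : rest.substring(0, slash);
        String[] fields = session.split(":");

        if (slash < 0 && fields.length == 2) {
            // Join form: multicast://address:port
            return new MulticastUrl(fields[0], Integer.parseInt(fields[1]), -1, null);
        }
        if (slash >= 0 && fields.length == 3) {
            // Send form: multicast://address:port:ttl/host/file_path/file_name
            String document = "http://" + rest.substring(slash + 1);
            return new MulticastUrl(fields[0], Integer.parseInt(fields[1]),
                    Integer.parseInt(fields[2]), document);
        }
        throw new IllegalArgumentException("malformed multicast URL: " + url);
    }
}
```

For instance, MulticastUrl.parse("multicast://224.4.12.12:4444") yields a join request for port 4444, for which isSend() returns false.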

Hyperlink for multicast in HTML

An HTML browser allows the user to navigate hyperlinks. A hyperlink anchor for HTML document multicasting must carry a URL for multicast addressing, including the multicast protocol identifier. The HTML format of the hyperlink for a multicast send is as follows:


<A HREF = multicast://multicast_address:port:ttl:host/file_path/file_name> Description </A>

An example is


<A HREF = multicast://224.4.12.12:4444:127/pec.etri.re.kr/index.html> Example2 </A>

Application scenarios for integration of WWW into MBone conferencing

Application scenarios are needed to integrate the WWW into MBone conferencing. We define such scenarios for the extended WWW described above: a new media type, HTML, is defined and transmitted together with audio/video streams over the MBone. As software, we assume LBL's VAT and VIC for audio/video and Sun Microsystems' HotJava for HTML documents. The MBone serves as the multicasting network. The scenarios are as follows:


Figure 3. Application Scenarios for Integration of WWW into MBone Conferencing

We first discuss the steps involved in creating a session and sending an HTML document. The sender side of figure 3 depicts this scenario.

  1. The session is announced with HTML as a new media type.
    SDR can be configured for new tools such as HotJava by means of plug-in modules that describe the media, protocols, and formats supported by a tool. SDPv2[16] can be used to describe an MBone session that includes the new media type "HTML", and SDR conveys this description using SAP. SDR session management is loosely controlled, so problems relating to synchronization may arise.
  2. SDR launches HotJava for HTML documents as well as VAT and VIC.
    The sender selects the "Start All" button in SDR, which then launches VAT, VIC, and HotJava according to each media type. Audio/video transmission is the same as in conventional MBone scenarios.
  3. HotJava requests HTML documents.
    The sender specifies a multicast-enhanced URL in HotJava for sending HTML documents, and HotJava then requests the documents from the WWW server.
  4. The WWW server responds.
    As in traditional HTTP communication, the WWW server responds to the HTTP request. Multicasting an HTML document requires only one access to the document server.
  5. The HTML document is sent.
    HotJava's enhanced multicast function sends the HTML document to the specified host group. Steps 3, 4, and 5 are performed in sequence when a multicast URL is specified.

We next discuss the steps involved in participating in a session and receiving an HTML document. The receiver side of figure 3 depicts this scenario.

  1. Session information is obtained using SDR.
    A receiver obtains session information such as the address, port, media type, format, and tool using SDR.
  2. SDR launches HotJava for HTML documents as well as VAT and VIC.
    The receiver selects the "Start All" button in SDR, which then launches VAT, VIC, and HotJava according to each media type. The audio/video receiving process is the same as in conventional MBone scenarios.
  3. Receivers receive an HTML document as a result of joining the multicast group.
    Receivers join the host group using the HotJava multicast function to receive HTML documents in real time.
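
The last receiver step, joining the host group and waiting for a document, can be sketched in Java as follows. The class and method names are hypothetical, the sketch uses the standard java.net.MulticastSocket class rather than HotJava's own multicast function, and it assumes that one datagram carries one complete US-ASCII document.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;

/** Hypothetical receiver: join the host group and wait for one document. */
final class SharedHtmlReceiver {

    /** Extracts the document text from the bytes actually received. */
    static String decode(DatagramPacket packet) {
        return new String(packet.getData(), packet.getOffset(), packet.getLength(),
                StandardCharsets.US_ASCII);
    }

    /** Joins the group, blocks until one datagram arrives, then leaves. */
    static String receiveOne(String group, int port) throws IOException {
        InetSocketAddress groupAddress =
                new InetSocketAddress(InetAddress.getByName(group), port);
        try (MulticastSocket socket = new MulticastSocket(port)) {
            // A null interface defers the choice of network interface to the system.
            socket.joinGroup(groupAddress, null);
            try {
                byte[] buffer = new byte[65507]; // largest UDP payload over IPv4
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);
                return decode(packet);
            } finally {
                socket.leaveGroup(groupAddress, null);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Example session values; in the scenario they come from SDR.
        System.out.println(receiveOne("224.4.12.12", 4444));
    }
}
```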

Benefits of using HotJava environment

To implement this multicast-enhanced architecture seamlessly in the Web browser, we adapt the HotJava environment. HotJava is a Web browser written entirely in Java. It is built on the HotJava tool kit, which provides a secure, platform-independent, scalable, and customizable base for building pure Java Web-aware applications[17]. The browser's capability can be extended dynamically, without increasing its base memory footprint, by installing new content and protocol handlers for use with new media types or protocols. This makes the HotJava browser an ideal, scalable solution for the new media type "HTML" on the MBone.

One of the most powerful features of HotJava is its extensible support for protocols. We define a new protocol handler, "multicast," and HotJava can then add support for this handler on the fly, without being recompiled and without requiring users of the handler to install it. This approach gives users the benefits of a platform-independent client environment and of protocol extensibility. Similarly, a content handler is a Java program that HotJava loads when it needs to interpret a particular MIME type/subtype combination, so the extended HTML documents defined in this paper can be interpreted easily. Figure 4 shows how HotJava supports the new extended HTML data types and the multicast protocol.


Figure 4. Multicast Protocol Handler and Extended HTML Content Handler
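
The idea of a protocol handler can be illustrated with the standard java.net.URLStreamHandler class. The sketch below is not the HotJava handler described in this paper and does not show how HotJava locates and installs handlers: the class name is hypothetical, the connection performs no network activity, and the reported content type is an assumption.

```java
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

/** Hypothetical handler that lets java.net.URL accept the "multicast" scheme. */
final class MulticastProtocolHandler extends URLStreamHandler {

    @Override
    protected void parseURL(URL url, String spec, int start, int limit) {
        // The default parser expects host:port and rejects address:port:ttl,
        // so the text after "multicast:" is stored unparsed as the path.
        setURL(url, url.getProtocol(), "", -1, null, null,
                spec.substring(start, limit), null, null);
    }

    @Override
    protected URLConnection openConnection(URL url) {
        return new URLConnection(url) {
            @Override
            public void connect() {
                // A real handler would fetch and send, or join the group, here.
                connected = true;
            }

            @Override
            public String getContentType() {
                // Assumption: documents arrive as MIME aggregate messages, so a
                // content handler for this type would be selected.
                return "multipart/related";
            }
        };
    }
}
```

A URL for the new scheme can then be created by passing the handler explicitly, for example new URL(null, "multicast://224.4.12.12:4444", new MulticastProtocolHandler()).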

Current status and future plans

Using HotJava prebeta1, we have begun implementing the multicast protocol handler and the extended HTML content handler. Currently, the multicast protocol handler is being implemented with the JDK (Java Development Kit) 1.1 beta. The JDK 1.1 network package (java.net) provides the MulticastSocket class, which was previously provided in the sun.net package. Sun Microsystems has announced that a future version of HotJava will be able to download content and protocol handlers automatically, but this is not implemented in the current prebeta1 release. For this release, the new protocol and content handlers must be installed locally.

In addition, we plan to specify a payload format for encoding the HTML data type within RTP. This payload format can be used for unreliable multicasting of HTML documents. As with other RTP applications, receiver feedback and group membership information are provided via RTCP[18].

Figure 5 shows the layout of the RTP payload format for MIME-encoded HTML documents. RTP does not guarantee reliable, ordered delivery, so a packet might be lost in the network. To achieve best-effort recovery from packet loss, the decoder needs assistance in order to proceed with decoding the other packets that are received. It is therefore desirable to be able to process each packet independently of the others. Each RTP packet starts with a fixed RTP header. The marker bit, payload type, and timestamp fields of the fixed header are used for HTML documents. The MIME-encoded HTML document is carried as the payload of each RTP packet.


Figure 5. Layout of the RTP Payload Format for HTML Documents
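
The packet layout can be sketched in Java as a fixed 12-byte RTP header, as defined in [18], followed by the payload. The class and method names are hypothetical. Because no payload type has been defined for HTML, the value 96 from the dynamic range is used here as a placeholder, and the meaning of the marker bit is left to the caller.

```java
import java.nio.ByteBuffer;

/** Hypothetical packer for an RTP packet carrying MIME-encoded HTML. */
final class HtmlRtpPacket {

    static final int HEADER_LENGTH = 12;
    // Assumption: a payload type from the dynamic range (96 to 127).
    static final int HTML_PAYLOAD_TYPE = 96;

    static byte[] pack(boolean marker, int sequenceNumber, long timestamp,
                       long ssrc, byte[] payload) {
        // ByteBuffer writes multi-byte values in network (big-endian) order.
        ByteBuffer packet = ByteBuffer.allocate(HEADER_LENGTH + payload.length);
        packet.put((byte) 0x80);                      // version 2, no padding, no extension, no CSRC
        packet.put((byte) ((marker ? 0x80 : 0x00) | HTML_PAYLOAD_TYPE));
        packet.putShort((short) sequenceNumber);      // 16-bit sequence number
        packet.putInt((int) timestamp);               // 32-bit timestamp
        packet.putInt((int) ssrc);                    // 32-bit synchronization source
        packet.put(payload);                          // MIME-encoded HTML document
        return packet.array();
    }
}
```

Since each packet carries the complete fixed header, a receiver can identify and order a packet without reference to any other packet.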

Conclusions

Our work extends the WWW for multicasting a set of HTML documents. This is accomplished by extending the architecture and merging the multicasting capability into the current WWW framework. The extended WWW architecture for HTML document multicast must provide an enhanced WWW framework, comprising protocols, a naming scheme, and a document format for this capability. We therefore propose HTTP+Multicast, URLs for multicast addressing, a hyperlink for multicast in HTML, a MIME encoding for sending aggregate HTML documents, and application scenarios for integrating the WWW into MBone conferencing. In addition, RTP could be integrated into the HTTP+Multicast protocol if an RTP payload type were defined for live HTML documents. SDR session management is loosely controlled, and problems relating to synchronization may arise. In this paper, we did not describe new protocols or synchronization techniques between HTML documents and real-time data. To implement this multicast-enhanced architecture seamlessly in the Web browser, we adapted the HotJava protocol and content handlers. This makes our approach an ideal, scalable solution for the new media type "HTML" on the MBone.

References

  1. Tim Berners-Lee, Robert Cailliau, Jean-François Groff, Bernd Pollermann, World-Wide Web: The Information Universe, 1992.
  2. World Wide Web Consortium, Collaboration, Knowledge Representation and Automatability, http://www.w3.org/pub/WWW/Collaboration/, 1996.
  3. Ralph Peters, Christian Neuss, Crystal Web: A Distributed Authoring Environment for the World Wide Web, Computer Networks and ISDN Systems 27, 1995.
  4. K. J. Maly, H. Abdel-Wahab, R. Mukkamala, A. Gupta, A. Prabhu, Mosaic + XTV = CoReview, Computer Networks and ISDN Systems 27, 1995.
  5. Tak K. Woo, Michael J Rees, A Synchronous Collaboration Tool for World-Wide Web, http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/CSCW/rees/SynColTol.html, 1994.
  6. Thane J. Frivold, Ruth E. Lang, Martin W. Fong, Extending WWW for Synchronous Collaboration, http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/CSCW/frivold/frivold.html, 1994.
  7. Kevin Savetz, Neil Randall, Yves Lepage, MBONE: Multicasting Tomorrow's Internet, http://www.northcoast.com/savetz/mbone/toc.html, 1996.
  8. EIT, Shared NCSA Mosaic, http://www.eit.com/software/share_mosaic/, 1995.
  9. NCSA, Collaborative document sharing via the MBONE, http://www.ncsa.uiuc.edu/SDG/Software/XMosaic/CCI/webcast.html, 1995.
  10. ENST, mMosaic, http://sig.enst.fr/~dauphin/mMosaic/index.html, 1996.
  11. Peter Parnes, Mattias Mattsson, Kåre Synnes, Dick Schefström, mWeb: a framework for distributed presentations using the WWW and the MBone, RTMW (Real Time Multimedia and the Web) '96 Workshop, http://www.w3.org/pub/WWW/AudioVideo/RTMW96.html, 1996.
  12. Philipp Hoschka, Towards a Real-Time Multimedia Web, http://www.inria.fr/rodeo/personnel/hoschka/bof/bof-talk.ps, WWW 4th Conference BOF, 1996.
  13. Allison Mankin, Allyn Romanow, IETF Criteria for Evaluating Reliable Multicast Transport and Application Protocols, Internet Draft, ftp://ietf.org/internet-drafts/draft-mankin-reliable-multicast-00.txt, 1996.
  14. Jacob Palme, Alexander Hopmann, MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML), ftp://ietf.org/internet-drafts/draft-ietf-mhtml-spec-05.txt, Internet Draft, 1996.
  15. Mark Handley, Applying Real-Time Multimedia Conferencing Techniques to the Web, RTMW (Real Time Multimedia and the Web) '96 Workshop, http://www.w3.org/pub/WWW/AudioVideo/RTMW96.html, 1996.
  16. M. Handley, Session Description Protocol, Internet Draft, ftp://ftp.isi.edu/confctrl/docs/, 1996.
  17. Sun Microsystems, HotJava Browser(tm) 1.0 Draft, http://java.sun.com/HotJava/, 1996.
  18. H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson, RTP: A Transport Protocol for Real-Time Applications, http://ds.internic.net/rfc/rfc1889.txt, 1996.