The KAME project implemented and deployed IPv6 multicasting at an early stage of IPv6 deployment. Differences between IPv4 and IPv6 multicasting required several original approaches in the implementation, including the handling of multicast interfaces, the use of scoped addresses in Protocol Independent Multicast (PIM), and the provision of a management tool. This paper describes the implementation and our experience deploying it in a large IPv6 backbone in Japan.
IPv6 is now in a deployment phase. Several pieces of network equipment that support IPv6 have been shipped, and some network providers have started IPv6 commercial services. However, more implementations and experiments are necessary in some areas. One such ongoing field is multicasting for IPv6.
Multicasting itself is one of the key technologies in the next generation of the Internet. In IPv4, however, multicasting was introduced as an extension of the basic specification; hence, IPv4 nodes do not necessarily support multicasting. On the other hand, specifications of IPv6 require that all IPv6 nodes support multicasting. It is therefore necessary to experiment with multicasting and implement it at an early stage of the deployment of IPv6.
The KAME project [KAME], which is a joint effort in Japan to develop a reference implementation of IPv6, has been implementing and deploying IPv6 multicasting. The project was started by the WIDE (Widely Integrated Distributed Environment) project and designed to run for two years beginning in April 1998. The KAME project targeted BSD variants as base operating systems and distributed its products as free software. The products have been widely used in IPv6 backbones all over the world and will be officially incorporated into the BSD variants in the near future.
Although the basic notion of multicasting is common to IPv4 and IPv6, several new characteristics are introduced in IPv6 multicasting based on the lessons of IPv4 multicasting. For example, IPv6 explicitly limits the scope of a multicast address by using a fixed address field, whereas the scope was specified using the TTL (Time to Live) field of a multicast packet in IPv4. It is not straightforward to port an implementation of IPv4 multicasting to IPv6 due to such differences; thus the KAME project introduced some original approaches in the implementation.
Implementation is inseparable from actual operation. The KAME project has deployed its implementation of IPv6 multicasting with practical applications in the WIDE 6bone [WIDE6BONE], which is one of the largest IPv6 backbones in the world, and has fed the operational experience back into further implementation work.
This paper describes the implementation and deployment experiences of IPv6 multicasting by the KAME project. The rest of this paper is organized as follows: Section 2 explains the implementation by the KAME project. In Section 3, we show our deployment experiences of the implementation in a live environment. We describe future plans in Section 4, and we conclude in Section 5.
This section describes KAME's implementation of IPv6 multicasting. Since the basic mechanisms of IPv6 multicasting do not differ from those of IPv4, IPv6 multicasting can basically be implemented the same way. However, IPv6-specific characteristics present several difficulties; thus we introduced original approaches in the implementation in order to resolve these.
IPv6 multicasting is categorized into two parts, as is IPv4 multicasting: the host-router part and the router-router part. The only protocol for the former is called Multicast Listener Discovery (MLD) [MLD]. Protocol Independent Multicast (PIM) [PIM] is currently available for the latter. KAME's implementation supports MLD and PIM.
KAME's implementation can act both as a router and as a host. Figure 1 shows the entire architecture of the implementation.
The host part of MLD is implemented in the kernel, and user applications for hosts use the basic socket API (Application Programming Interface) [BASICAPI] for IPv6 multicasting.
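As a sketch of how a host application uses this API, the following Python fragment builds the struct ipv6_mreq option value passed with the IPV6_JOIN_GROUP socket option; note that the structure carries an interface index, not an interface address. The group address and interface name used here are arbitrary examples, not values from our deployment.

```python
import socket
import struct

def make_ipv6_mreq(group, ifindex):
    """Build the struct ipv6_mreq option value for IPV6_JOIN_GROUP:
    a 16-byte multicast address followed by an interface index
    (an index, not an interface address as in IPv4 multicasting)."""
    return socket.inet_pton(socket.AF_INET6, group) + struct.pack("@I", ifindex)

# A host application would join a group roughly as follows
# (interface name "ne0" is only an example):
#   sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
#   sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_JOIN_GROUP,
#                   make_ipv6_mreq("ff1e::1234", socket.if_nametoindex("ne0")))
mreq = make_ipv6_mreq("ff1e::1234", 1)
```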
Multicast packet forwarding, which is a basic function of the router part, is implemented in the kernel. In addition, we have two network daemons implemented in the user space to handle PIM for IPv6; one is called pim6dd, for PIM dense mode, the other is called pim6sd, for PIM sparse mode. The daemons have code fragments that handle the router part of MLD. Pim6sd also uses a special API to forward an encapsulated multicast packet in a PIM Register message.
Figure 1: The architecture of KAME's implementation of IPv6 multicasting
Traditional implementations of IPv4 multicasting use unicast addresses to identify a network interface. For example, BSD variants have a structure that is used when adding or deleting an interface for IPv4 multicast routing. The structure contains a member that specifies an IPv4 address serving as an interface identifier. All IPv4 multicast routing daemons use this structure.
However, such an approach is not suitable for IPv6 for the following reasons. First, an IPv6-capable node may assign multiple addresses on a single interface, which tends to cause a configuration mismatch. Also, a link-local address is not necessarily unique within a node; consequently, it may not identify a single interface. A user must specify the interface index as well as the address in such a case. Since the specified index itself should identify a single interface, the address is actually redundant.
As a result of the above consideration, KAME's implementation uses an interface name or an index to identify a specific interface both in the kernel and in applications. All APIs and configuration files used by applications also follow this rule.
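For instance, the basic socket API provides direct mappings between interface names and indices, so an application never needs an address to designate an interface. A minimal Python illustration:

```python
import socket

# The API maps interface names to indices and back, so an application
# can identify an interface without referring to any of its addresses.
# Enumerate this node's interfaces and check the round trip both ways.
for ifindex, ifname in socket.if_nameindex():
    assert socket.if_nametoindex(ifname) == ifindex
    assert socket.if_indextoname(ifindex) == ifname
```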
Another remarkable difference between KAME and traditional implementations of IPv4 multicasting is the lack of multicast tunnels. Multicast tunnels were introduced to deploy multicasting, even with routers that were not multicast-capable. In spite of difficulties involved in handling tunnels, such as topology complexity and encapsulation overheads, they were necessary because multicasting was an extension in IPv4 and there was no guarantee that every router supported multicasting.
In IPv6, however, all nodes are required to support multicasting; hence, all routers should be multicast-capable, which means that we do not have to use multicast tunnels to deploy IPv6 multicasting. KAME's implementation thus does not support multicast tunnels, and the kernel and applications are kept simple.
KAME's implementation supports PIM for IPv6 as a multicast routing protocol. Since PIM is designed to support multiple protocols including IPv6, an existing implementation of PIM for IPv4 can easily be made to accommodate IPv6. However, such accommodation involves consideration of characteristics specific to IPv6. This subsection describes these characteristics and our approaches to them.
PIM uses a separate routing mechanism, which runs independently of PIM itself, in order to determine network topology and to detect an upstream router for Reverse Path Forwarding (RPF). One typical case is simply to consult the unicast routing table used to forward unicast packets. In this scenario, the address of a gateway for a destination must be one of the PIM neighbors on the link that attaches to the outgoing interface for the destination. This constraint is not strict in IPv4, because an IPv4 node usually assigns only one address on each interface.
On the other hand, IPv6 allows a node to assign multiple addresses on each interface, which may break the constraint. As an example, consider an IPv6 router that assigns a global address as well as a link-local address on an interface. The specification of PIM for IPv6 [PIM6] requires use of a link-local address as the source address of PIM Hello messages and uses the address as an identifier of the router. But the link-local address is not necessarily used to specify the router as a gateway in unicast routing. For example, a multiprotocol extension for BGP [BGP4+] to handle IPv6 might use an IPv6 global address as a gateway. Also, a network administrator of a router might install a static global address as the gateway to a destination. In such cases, a PIM router is not correctly recognized and RPF does not work.
Our current approach to this problem is an operational one; we restrict our PIM domain to an IGP domain. Since all IGPs currently defined for IPv6 [RIPNG, OSPF6] use link-local addresses as the next hop for forwarding packets, they are much safer than BGP. Also, when we have to configure a static route in our PIM domain, we always use a link-local address as the gateway of the route. Finally, we do not assign multiple link-local addresses on each interface of a PIM router in order to avoid possible address mismatches between PIM and unicast routing.
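The static-route restriction can be expressed as a simple check: the configured gateway must be a link-local address, so that it matches a PIM neighbor identified by its link-local Hello source address. The helper below is illustrative, not part of our implementation:

```python
import ipaddress

def acceptable_static_gateway(gateway):
    """Operational rule sketch: inside our PIM domain, a statically
    configured gateway must be a link-local IPv6 address so that the
    gateway can be matched against a PIM neighbor, which is identified
    by its link-local address.  The function name is hypothetical."""
    return ipaddress.IPv6Address(gateway).is_link_local

assert acceptable_static_gateway("fe80::1")          # link-local: allowed
assert not acceptable_static_gateway("2001:db8::1")  # global: rejected
```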
Another possible solution is to introduce a mechanism to resolve link-local addresses corresponding to given global addresses. We do not adopt this kind of approach, because there is currently no standard protocol to accomplish this. However, an ICMPv6 Node Information Query [NODEINFO] with the Node Addresses type would be a feasible candidate for this.
The IPv6 addressing architecture [ADDRARCH] has a notion of scope for unicast and multicast addresses. A four-bit field is reserved for each multicast address to define scopes. Routers do not forward a multicasted packet with a specific scope outside of a domain in which the multicast address is valid. In this paper, we use the term (scope) zone to refer to the domain of a specific scope.
Three types of scopes are defined for unicast addresses: link-local, site-local, and global. Routers do not forward any packets with a specific scope of source or destination address to other zones of the scope.
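Since the multicast scope is carried in the low four bits of the address's second byte, extracting it is straightforward; a small Python sketch:

```python
import ipaddress

def multicast_scope(addr):
    """Extract the 4-bit scope field of an IPv6 multicast address:
    the low nibble of the second byte (e.g. 2 = link-local,
    5 = site-local, 14 = global)."""
    packed = ipaddress.IPv6Address(addr).packed
    return packed[1] & 0x0F

assert multicast_scope("ff02::1") == 2     # link-local scope
assert multicast_scope("ff05::2") == 5     # site-local scope
assert multicast_scope("ff0e::101") == 14  # global scope
```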
IPv6 differs from IPv4 in that it is not necessary, due to the scope field, to limit the scope of a multicast address using the packets' hop limit field. However, the scopes introduce another kind of difficulty into PIM.
When the first hop router for a multicast source receives a multicast packet, it encapsulates the packet into a PIM Register message and forwards the encapsulated packet by unicast to the Rendezvous Point (RP) for the multicast address. Although the Register message might break the boundary of the scope zone to which the original multicast packet belongs, intermediate routers between the first hop router and the RP cannot detect the fact that it has done so (Figure 2).
Figure 2: A global Register message may break the zone boundary of an encapsulated packet.
The PIM bootstrap mechanism also raises scope issues. A router that acts as an RP periodically sends Candidate-RP-Advertisement messages to the Bootstrap Router (BSR) in its PIM domain. Each message contains a unicast address of the RP associated with a list of multicast addresses which that particular RP will handle. If the unicast address is a scoped one, the Candidate-RP-Advertisement message may break a boundary of the scope zone. Even if the unicast address is global, a multicast address of the list may break a boundary of the scope zone.
We propose to introduce several restrictions on PIM routers to deal with these problems. For simplicity, we assume a direct correspondence between unicast and multicast scopes; that is, link-local, site-local, and global unicast addresses correspond to multicast addresses of link-local, site-local, and global scope, respectively.
These relationships are not officially documented and are still under discussion in the Internet Engineering Task Force (IETF) IPNG working group. However, the consensus at a working group meeting in Tokyo in 1999 was to establish the same scope definitions in unicast and multicast. Our assumption is based on this consensus.
In order to prevent a Register message from breaking a zone boundary of a scope, we propose the following rule:
The source and destination addresses of an encapsulated multicast packet in a PIM Register message must not have a smaller scope than the respective scopes of the source and destination addresses of the Register message.
For example, when a first hop router forwards a multicast packet that has a site-local scope, the router must choose an address with a site-local or a smaller scope as the source address of the Register message. This constraint also means that every RP should be configured with an address whose scope is smaller than or equal to the scopes of the multicast addresses that the RP handles.
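Under the assumed unicast/multicast scope correspondence, the rule can be sketched as a Python check. The function names and the numeric scope values assigned to unicast addresses (2 = link-local, 5 = site-local, 14 = global) are our assumptions, not standard definitions:

```python
import ipaddress

def unicast_scope(addr):
    """Assumed numeric scope of a unicast address, following the
    unicast/multicast scope correspondence discussed in the text."""
    a = ipaddress.IPv6Address(addr)
    if a.is_link_local:
        return 2
    if a.is_site_local:
        return 5
    return 14

def multicast_scope(addr):
    # The scope field is the low nibble of the address's second byte.
    return ipaddress.IPv6Address(addr).packed[1] & 0x0F

def register_ok(reg_src, reg_dst, inner_src, inner_group):
    """Proposed rule: the encapsulated packet's source and destination
    addresses must not have a smaller scope than the respective source
    and destination addresses of the Register message."""
    return (unicast_scope(inner_src) >= unicast_scope(reg_src) and
            multicast_scope(inner_group) >= unicast_scope(reg_dst))
```

For instance, encapsulating a site-local multicast packet in a Register message exchanged between global addresses violates the rule, while a global multicast packet does not.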
We also propose several rules for the PIM bootstrap mechanism to deal with the scope issues. The following three rules are for PIM bootstrap messages:
When a router forwards a bootstrap message, it should check the scope of the BSR address and should not forward the message on a link that breaks the zone boundary of the scope.
Also, a multicast address contained in the message should be removed when the message is forwarded on an interface that belongs to a different scope zone from the zone of an incoming interface.
Finally, if an RP address for a multicast address contained in the message has a larger scope than the scope of the multicast address, the RP should be ignored and should not be contained in a forwarded message.
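The third rule amounts to filtering the advertised RP set of a bootstrap message. A hedged sketch, using an assumed numeric scope mapping for unicast addresses (2 = link-local, 5 = site-local, 14 = global) that is our assumption rather than a standard table:

```python
import ipaddress

def unicast_scope(addr):
    """Assumed numeric scope of a unicast RP address."""
    a = ipaddress.IPv6Address(addr)
    if a.is_link_local:
        return 2
    if a.is_site_local:
        return 5
    return 14

def multicast_scope(addr):
    # The scope field is the low nibble of the address's second byte.
    return ipaddress.IPv6Address(addr).packed[1] & 0x0F

def filter_bootstrap_rps(entries):
    """Apply the third bootstrap rule: drop any (group, RP) pair whose
    RP address has a larger scope than the group address itself."""
    return [(group, rp) for group, rp in entries
            if unicast_scope(rp) <= multicast_scope(group)]
```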
The next two rules are for PIM Candidate-RP-Advertisement messages:
The unicast RP address stored in a Candidate-RP-Advertisement message should not have a smaller scope than the respective scopes of the source and destination addresses of the message.
Also, no multicast address associated with the RP should have a smaller scope than the scope of the RP address.
If every router in a PIM domain follows the above rules, a Register message never breaks the zone boundary of the encapsulated packet. Also, the bootstrap mechanism does not advertise an RP that may break a zone boundary when encapsulating a multicast packet in a Register message.
Our current implementation does not yet fully support the above rules; it only refuses to forward multicast packets with link-local scoped addresses and prevents link-local RP addresses from being used. We therefore take an operational approach to this issue as well; that is, we simply restrict our PIM domain to a single site and do not use organization-local scope multicast addresses.
These operational restrictions are of course too strict for practical usage. In the future, we will introduce the rules stated in this section into our implementation and loosen the operational constraints.
According to the PIM specification [PIM], the checksum for a PIM message is calculated through the message data only, that is, without a pseudo-IP header. This is not a problem for IPv4, because IPv4 has an IP layer checksum. Even though the IPv4 header of a PIM message is corrupted, the corruption is detected in the IP layer of the receiving node and the message is discarded.
In contrast, IPv6 does not have an IP layer checksum. Since PIM depends on some fields of the IP header of a PIM message, corruption of such fields might have a serious effect on protocol execution of PIM. For example, if the source address field of a Hello message is corrupted, the receiving node will register an invalid PIM neighbor.
Therefore, our implementation calculates the checksum with a pseudo-IPv6 header [IPV6SPEC]. However, this will affect interoperability among implementations. Hence, it should not be implementation dependent but rather officially standardized as a part of the specification. We are discussing this issue in the IETF PIM working group in parallel with developing our implementation.
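The computation can be sketched as follows. The pseudo-IPv6 header consists of the source and destination addresses, the upper-layer packet length, and the next-header value (103, the PIM protocol number); the function names are illustrative:

```python
import socket
import struct

def internet_checksum(data):
    """Standard one's-complement Internet checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def pim6_checksum(src, dst, pim_message):
    """Checksum a PIM message together with a pseudo-IPv6 header
    (source, destination, upper-layer length, next header), as our
    implementation does; plain PIM for IPv4 covers the message only."""
    pseudo = (socket.inet_pton(socket.AF_INET6, src) +
              socket.inet_pton(socket.AF_INET6, dst) +
              struct.pack("!I3xB", len(pim_message), 103))  # 103 = PIM
    return internet_checksum(pseudo + pim_message)
```

A receiver recomputes the checksum over the pseudo-header and the message, including the transmitted checksum field, and accepts the message only if the result is zero.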
Network management tools are essential for practical usage, and several such tools exist for IPv4 multicasting [MCASTDEBUG]. However, no such tools have yet been proposed for IPv6 multicasting.
KAME's implementation experimentally supports a multicast version of traceroute, called mtrace [MTRACE], for IPv6. We call the implementation mtrace6 to distinguish it from the original mtrace.
As with the original mtrace, an mtrace6 query is forwarded along the reverse path for a specified multicast address and source, collecting information about the path, and is finally returned to the query sender as a response message. We can use mtrace6 to grasp the routing topology, to detect a routing loop in RPF, or to identify points of packet loss on a reverse path. The tool is especially useful when the routing protocol used for RPF differs from that used for unicast forwarding.
Although the implementation of mtrace6 is based on that of the original mtrace, it differs from the original on several points.
First, query and response messages for mtrace6 are implemented as ICMPv6 messages, whereas messages for the original mtrace are parts of IGMP messages. This is because there is no IGMP for IPv6 and because MLD messages are also contained in ICMPv6.
Second, mtrace6 does not use IPv6 addresses to designate incoming and outgoing interfaces, because an IPv6 address, especially a link-local one, is not necessarily unique even in a single node. Instead, interface indices are used; they are also used in the kernel and in other applications.
All mtrace6 messages have a common header, which is shown in Figure 3. The format is not different from that of mtrace query messages, except in the structure of the addresses.
Figure 3: The Packet format for the common mtrace6 header
We show the packet format of mtrace6 response data in Figure 4. Each intermediate router of a trace path appends response data to the forwarded trace packet.
As was remarked earlier, we do not use IPv6 addresses to specify incoming and outgoing interfaces. Instead, we use interface indices, which are represented in the figure as the inifid and outifid fields, respectively. Since interface indices are node-local information, we need a global identifier to specify the router. The localaddr used for this purpose should usually be a global IPv6 address. The remoteaddr field is the address of the upstream router, which, in most cases, is a link-local unicast address for the queried source and destination addresses. Although a link-local address itself does not have enough information to identify a node, we can detect the upstream router with the assistance of the incoming interface (the inifid field) and the current router address (the localaddr field).
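A simplified sketch of how these response-data fields might be laid out is shown below. Only the four fields named in the text are included; the field widths are hypothetical, and the real format carries additional fields (such as packet counts), so this is an illustration rather than the actual mtrace6 wire format:

```python
import socket
import struct

def pack_response_data(inifid, outifid, localaddr, remoteaddr):
    """Illustrative packing of one mtrace6 response-data block:
    two 16-bit interface indices (inifid, outifid), the router's own
    global address (localaddr), and the upstream router's address
    (remoteaddr, usually link-local).  Field widths are assumptions."""
    return (struct.pack("!HH", inifid, outifid) +
            socket.inet_pton(socket.AF_INET6, localaddr) +
            socket.inet_pton(socket.AF_INET6, remoteaddr))

# Example values only: incoming interface 7, outgoing interface 0.
block = pack_response_data(7, 0, "2001:db8::4", "fe80::3")
```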
Figure 4: Mtrace6 response data format
We conclude this section with execution examples of mtrace6. Figure 5 shows the network configuration for the examples. We have six multicast routers, a sending host (sender), and a receiving host (receiver). We use Protocol Independent Multicast -- Sparse Mode (PIM-SM) as the multicast routing protocol in the network, and we have a single RP, rp.
Each line in the figure indicates a link that connects two routers, or a router and a host. A router's interface is identified by a numeric index, each of which is attached to a line in the figure. The blue lines in the figure show the shared tree from the RP to the receiver, and the red lines depict the source path tree from the receiver to the sender.
Figure 5: The network configuration for mtrace6 examples
Figure 6 is an output of mtrace6 tracing the reverse path from the receiver to the RP. Each line of the output specifies information of an intermediate router on the path. The first column shows the number of hops from the receiver to the router. The second column shows the router's name, followed by forwarding information (in parentheses). In the parentheses, the upstream router of the path is shown, and the incoming and outgoing interfaces are indicated by an arrow ("->"). The third and fourth columns show the routing protocol and the execution status for the query, respectively.
The first line, for instance, indicates that the queried router, router4, is zero hops from the receiver, that the upstream router's address is fe80::3, that the incoming and outgoing interface indices are 7 and 0, that the routing protocol is PIM, and that the query was processed without error (NOERR).
 0 router4(fe80::3/7->0) PIM NOERR
-1 router3(fe80::2/24->21) PIM NOERR
-2 router2(fe80::a/8->11) PIM NOERR
-3 rp(::/9->6) PIM RP
Figure 6: An output of mtrace6 tracing the path to RP
Figure 7 shows an output of mtrace6 tracing the source path tree from the receiver to the sender. The main differences between this and the previous output are in the upstream router address and the incoming interface at the third line, which reflect a difference between the shared tree and the source path tree.
 0 router4(fe80::3/7->0) PIM NOERR
-1 router3(fe80::2/24->21) PIM NOERR
-2 router2(fe80::1/5->11) PIM NOERR
-3 router1(::/0->17) PIM NOERR
Figure 7: An output of mtrace6 tracing the source path
KAME's implementation of IPv6 multicasting has actually been used in the WIDE 6bone. Figure 8 illustrates the network topology of the WIDE 6bone effective in December 1999.
Figure 8: The network topology of the WIDE 6bone in December 1999
Since the WIDE 6bone is a large network whose link bandwidths vary, we use PIM-SM as the multicast routing protocol in order to avoid the unwanted flooding of PIM-DM. We set up two routers as RPs and configure them according to the geographical distribution of senders and receivers.
One remarkable characteristic of our deployment is that we use multicasting not merely for experiments but as practical network infrastructure. Multicast applications used in the WIDE 6bone include multicast chatting, feeding audio and video streams, and remote lectures using multicasted Digital Video (DV) streams. In November 1999, we held a research meeting of the WIDE project by distributing DV streams to 10 satellite sites via multicast. Although our backbone consisted of high-bandwidth links from 45 Mbps to 155 Mbps, the distribution could not have been realized without multicasting, because the DV streams needed over 40 Mbps at peak rate.
We have also conducted experiments on applying IP security (IPsec) with multicasting. Some of the streams we feed are encrypted at the sender using IPsec and decrypted at each authenticated receiver. One of the difficult aspects of applying IPsec to multicasting is the distribution of secret keys. Though we currently configure the keys by hand, in future implementations we will need to introduce a mechanism to distribute the keys automatically.
We have already implemented the basic specifications of IPv6 multicasting, and the implementation works in practical applications. Some parts of the implementation, however, are based on our original interpretations or extensions of officially documented specifications. One such interpretation is the way in which the checksum for a PIM message is calculated, as noted in Section 2.2. Notably, mtrace6, described in Section 2.3, contains nontrivial extensions of the original mtrace.
For interoperability with various vendors' implementations, it is necessary to clarify or to standardize such extensions. We will discuss this issue in standardization bodies, including the IETF, bringing our implementation experience to bear on the discussion.
We will also try to enhance our implementation so that it will be more stable, more scalable, and more efficient.
As we showed in Section 2.2, some constraints will be necessary in order to deal with IPv6 scoped addresses in PIM, and the constraints will be included in our future implementation. We will also need to implement more management tools for IPv6 multicasting in order to make our network more stable and to diagnose trouble.
We currently deploy IPv6 multicasting in a single autonomous system. However, when we attempt to deploy it more widely, we will have to support mechanisms designed to overcome various scalability issues; MBGP [BGP4+] and the Multicast Source Discovery Protocol (MSDP) [MSDP] for IPv6 are candidates for such mechanisms.
Bulk streams like DV data have a serious impact on forwarding routers. We are planning to introduce label switching technology for such bulk streams, with the aim of minimizing forwarding overhead.
The KAME project, a joint effort in Japan to develop a reference implementation of IPv6, implemented and deployed IPv6 multicasting at an early stage of IPv6 deployment.
We implemented the basic mechanisms of IPv6 multicast routing as well as dense and sparse modes of PIM for IPv6 as multicast routing protocols. Although our implementation was based on existing implementations of IPv4 multicasting, there were several differences for IPv6. We always used interface indices, rather than the interfaces' addresses, to identify interfaces for multicasting. We also eliminated multicast tunnels, because multicasting is not an extension in IPv6.
Some enhancements to PIM were required for use with IPv6. We introduced an operational method by which to correctly detect an upstream PIM router. Several additional rules were necessary to deal with IPv6 scoped addresses. We calculated the checksum for a PIM message with a pseudo-IPv6 header, whereas the checksum is calculated without a pseudo-header in IPv4. We also experimentally implemented a multicast version of traceroute for IPv6 as a network management tool.
We actually used our implementation in the WIDE 6bone with practical multimedia applications, including a distributed meeting that was held by using multicasted DV streams.
We are planning to enhance our implementation to attain more stable operation in the future. The enhancements will be achieved in both implementation and standardization.
The author is indebted to core members of the KAME project, especially to Kazu Yamamoto for his detailed comments on this paper. Thanks also go to the users of the KAME network software, including members of the WIDE IPv6 working group, for their invaluable comments and contributions.