An ADSL/ATM Based Multimedia-On-Demand Trial:
Architecture, Application, and Impacts of Large Multicast Traffic
Chain-Chin Yen <email@example.com>
Chin-Chou Chen <firstname.lastname@example.org>
Tsu-I Hsu <email@example.com>
John, C. C. Shueh <firstname.lastname@example.org>
Chunghwa Telecom Labs.
Taiwan, R. O. C.
This paper presents the first Multimedia-On-Demand (MOD) trial built on an
ADSL/ATM based network in Taiwan. The trial addresses a number of integrated
services (non real time and real time), which are expected to bring a
diversity of services and applications to today's most widespread medium,
the twisted-pair local loop. The trial seeks a way for the ATM switch to
support large scale, bursty, real time point-to-multipoint traffic. It
emphasizes the practical problems and factors in deploying these new
services, and may therefore serve as a reference for anyone who wants to
deploy similar services on an ADSL/ATM based network. A shared memory scheme
with a two stage queuing approach for handling multicast traffic is proposed
in the system design of the ATM switch used. Moreover, with a hybrid buffer
and a traffic adaptive mechanism, the bursty traffic can be distributed
quite evenly. However, similar trial methodologies all share a practical
scalability issue: the number of served users cannot be expanded effectively
because of hardware complexity limitations. In this paper, we present a
number of approaches to this problem and share practical trial experience
that we expect to be useful to anyone engaged in a similar project or
planning one in the near future.
With the advent of the WWW and the Internet, multimedia applications have
become more and more widespread and diversified. The demand for high speed
(high bandwidth) transmission media and access methods that can carry not
only bursty data but also real time video services has become more and more
urgent. Over recent decades, cable TV carriers have laid coaxial cable
widely across many communities and cities. In addition, with the development
of the cable modem, they also offer a new kind of Internet access for people
at home. Besides the inherent constraints of a shared medium, however, cable
is at present deployed almost only in densely populated communities because
of construction cost concerns. In contrast, telephone lines used to be
thought of as adequate only for voice transmission, but advances in
technology have expanded their capacity to much higher transmission rates,
so they are no longer limited to voice service. Asymmetric Digital Subscriber Line
(ADSL) is the leading technology enabling ordinary twisted pair equipped
with ADSL modems to transmit movies, television, dense graphics, and very
high speed data. More than 800 million such lines exist around the world
today; new cabling, whether fiber alone or combined with coax, will take
decades to replace them all. With ADSL, telephone companies can connect
almost every home and business to exciting new interactive broadband services
now. ADSL will play a crucial role over the next twenty years as telephone
companies enter new markets connecting subscribers to the Internet and
delivering information in video and multimedia formats. New broadband cabling
will take decades to reach all prospective subscribers but success of these
new services is dependent upon reaching as many subscribers as possible
during the early years. Bringing the Internet, corporate LANs and movies
into homes and small businesses soon, ADSL will make these markets viable
for telephone companies and application suppliers alike.
Video on demand (VOD) is no longer just a figment of your imagination.
It is real and it will be available to you soon using ADSL technology.
ADSL technology turns your ordinary telephone line into a high bandwidth
communications channel. The ADSL technology is applied to the copper wire
that connects your business or residence to your local telephone service
provider. Up to 8Mbps of video data can be sent to your location over ADSL
equipment. This can be accomplished simultaneously with independent telephone
traffic over the same copper loop. The ADSL equipment will be provided
in the form of a modem or as an expansion card for the customer's PC.
It could also be provided as part of a television set top box or a network
computer. VOD can occur in either or both the PC and TV environment. By
attaching the ADSL equipment to an Asynchronous Transfer Mode (ATM) backbone
network, you will be able to access video content from any server in the
world that is also attached to the network. The Internet core network is
now being converted to an ATM switch fabric. An ATM fabric is ideal for
mixed voice, video and data traffic. Telephone companies have conducted numerous
ADSL trials globally. These trials have proven that ADSL works well in
video on demand applications. With high speed ADSL service it will be possible
to view both stored and streaming video. VOD will tap into the power of
In Taiwan, an ADSL trial was started by Chunghwa Telecom Co. (CHT) in 1996
and has been in commercial operation since 1998. It was set up only for
Internet access in university academic and dormitory areas. At the same
time, a multimedia (video and Internet access) on demand trial was also set
up by CHT. It used ADSL with a simple circuit switch system to provide 28
users with broadcast TV programs, Karaoke on demand and pay per view (near
video on demand) services. This trial proved that ADSL is feasible for
serving these multimedia applications. Following this, the phase II trial
began in July 1998; its purpose is to serve a much larger number of users
and then move toward commercial operation. As discussed in the previous
paragraph, ATM, with its high speed and efficient switching techniques, is
regarded as a promising technology: it was developed as part of the B-ISDN
effort, and it has recently also been widely used to assist layer 3
switching (e.g. IP switching, MPOA, MPLS) for Internet access and multimedia
applications. These mechanisms are applied in the MOD trial addressed in
this paper.
This paper begins by introducing the trial plan, the system architecture
and the multicast design methodology. In this trial, real time applications
such as multicast services (e.g. video conference or video on demand) and
broadcast services (e.g. live video) may generate a large amount of
simultaneous traffic, which poses a big challenge for the design of the
switching kernel. The large number of point-to-multipoint data flows may
produce significantly bursty traffic, driven by customers' unpredictable
usage habits or by breaking news events, so an ATM switch with an ordinary
multicast design is not suitable for the MOD application. A shared memory
scheme with a two stage queuing approach for handling multicast traffic is
proposed in the system design of the ATM switch used. Moreover, with a
hybrid buffer and a traffic adaptive mechanism, the bursty traffic can be
distributed quite evenly. A series of practical trial results shows that
this design effectively reduces burst loss under a rush of traffic. In this
paper, we present a number of approaches to this problem and share practical
trial experience that we expect to be useful to anyone engaged in a similar
project or planning one in the near future.
ADSL Based MOD System Architecture
The system architecture of the ADSL-based MOD system is shown in Fig. 1;
it can serve the 400 users joining the first phase trial. The services
provided by this trial system are Internet service, Karaoke on demand,
broadcast TV programs and pay per view (near VOD). In this phase, a 5 Gbps
Virtual Path ATM Switch, BEX-VPX (abbreviated as VPX), is used as the core
switch. The VPX is a 32 x 32 (OC-3c per port) ATM switch which has been
deployed in the NII as the backbone network in Taiwan, R. O. C. A modified
multicast function is employed in the VPX to meet the point-to-multipoint
requirement of this trial. The ATM Multiplexer (AMX) in Fig. 1 provides the
STM-1 (STS-3c) to 10BASE-T termination between the VPX and the ADSL
equipment, the ATU-C. An AMX has 16 10BASE-T ports, each of which can
connect to an ATU-C. An ATU-C can support 4 POTS lines of users or service
providers. Thus, a VPX OC-3c port with an AMX can support 64 POTS lines of
users or service providers. To meet the requirement of the trial in this
phase, 7 AMXs are used to serve at most 448 users, whereas 6 AMXs are set up
to connect at most 384 lines of video or data service servers. The Common
Control of the VPX (VPX-CC) handles PVC connection setup for the VPX and the
AMX. The messages required to set up PVC connections are received from the
MOD Network Controller, which performs service and network management for
the trial system. All communication among the VPX, the Video/Data Servers
and the MOD Network Controller goes through a private Ethernet LAN.
Moreover, the VPX also has a trunking port which is reserved for system
capacity expansion or for connection to other networks.
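The line counts quoted above follow from simple multiplication. The short
Python sketch below only restates that arithmetic; the constant and function
names are illustrative and do not come from the trial system.

```python
# Port-capacity arithmetic from the architecture description.
# Names are illustrative; the figures are the ones quoted in the text.
PORTS_PER_AMX = 16     # 10BASE-T ports per AMX, one ATU-C on each
LINES_PER_ATU_C = 4    # POTS lines served by one ATU-C


def lines_per_amx():
    """POTS lines reachable through one VPX OC-3c port with one AMX."""
    return PORTS_PER_AMX * LINES_PER_ATU_C


def lines_for(amx_count):
    """POTS lines reachable through a given number of AMXs."""
    return amx_count * lines_per_amx()


# 64 lines per AMX, 448 user lines on 7 AMXs, 384 server lines on 6 AMXs
print(lines_per_amx(), lines_for(7), lines_for(6))  # prints: 64 448 384
```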
Introduction of BEX-VPX for ADSL Based MOD System
Fig. 2 shows the VPX hardware architecture for the ADSL based MOD system.
The VPX is a 32 x 32 ATM switch which comprises 8 peripheral modules (BPM).
Each BPM contains 4 OC-3c lines (on cards named STM-1 Line Interface Card,
SLIC), each of which can be configured as a UNI or an NNI. Each BPM has a
duplex processor (named Broadband Line Processor, BLP) as its module
controller. The Common Controller (CC), implemented on an HP workstation, is
the central controller of the VPX and is connected by a multimode OC-3c
fiber to one of the SLICs. Further details can be found in the related
papers on the BEX-VPX.
The MCM module is developed for multicast purposes. In the MCM, the SLIC of
the BPM is replaced by the MCC (Multi-Cast Card), which performs the
multicast cell copy and header translation functions. The module processor,
MCP, is the same as the BLP in hardware design, but its software has been
modified for the multicast application. The MCC's function blocks are shown
in Fig. 3. All multicast cells are routed by the BSM to the MCC. The
incoming cell stream from the BSM (in the 64-byte internal ATM cell format)
is buffered in the OLC (Output Link Controller) chip and sent cell by cell
to the MIC (Multicast Input Controller, an FPGA-based chip). The content of
the Multicast Table in this version is replaced by a 1-byte "copy number"
and a 4-byte (32-bit) "copy channel indication". Thus, each incoming 64-byte
cell is translated by the MIC into a 5+53 byte cell and queued in the
Pre-FIFO. Whenever the next stage FIFO, the Buf-FIFO, is empty, the MOC
(Multicast Output Controller, an FPGA-based chip) fetches the 5-byte added
header of the next cell stored in the Pre-FIFO and stores the remaining
53-byte ATM cell into the Buf-FIFO. The cell's data is sent to the HTC and
re-enters the Buf-FIFO "copy number" times, i.e. "copy number" copies of the
cell are made. Each copied cell gets its own 5-bit ID, taken from the "copy
channel indication". Each ID is combined with the 8-bit VPI (fetched by the
HTC from the copied cell) to form the index into the HTT. The copied cell is
then given the new translated header from the HTT, which is written by the
MCP at PVC setup time. The 64-byte translated cell is then sent to the ILC
(Input Link Controller). Finally, the copied cells re-enter the BSM and are
routed to their final destinations.
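The copy and header translation path described above can be illustrated with
a small software model. This is only a sketch under stated assumptions, not
the FPGA logic: it assumes that bit i of the 32-bit "copy channel
indication" selects the 5-bit ID i, and that the HTT index places the ID in
the upper 5 bits and the VPI in the lower 8 bits. The text does not specify
either layout, and all function names are hypothetical.

```python
def copy_ids(copy_channel_indication):
    """Return the 5-bit IDs (0..31) selected by the 32-bit indication.

    Assumption: bit i set means "make a copy with ID i".
    """
    return [i for i in range(32) if (copy_channel_indication >> i) & 1]


def htt_index(copy_id, vpi):
    """Combine a 5-bit ID and an 8-bit VPI into a 13-bit HTT index.

    Assumption: ID in the upper bits, VPI in the lower 8 bits.
    """
    if not (0 <= copy_id < 32 and 0 <= vpi < 256):
        raise ValueError("copy_id must fit in 5 bits and vpi in 8 bits")
    return (copy_id << 8) | vpi


def multicast(vpi, payload, copy_number, indication, htt):
    """Copy one cell 'copy_number' times and translate each header.

    'htt' models the Header Translation Table as a dict that maps an
    index to the new header written at PVC setup time.
    """
    ids = copy_ids(indication)
    if len(ids) != copy_number:
        raise ValueError("copy number does not match the indication")
    return [(htt[htt_index(cid, vpi)], payload) for cid in ids]
```

For example, with an indication of 0b100001 and copy number 2, the model
produces two copies whose headers come from the HTT entries for IDs 0 and 5.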
The multicast capability of the MCC (MultiCast Card) is 32 copies per
program source. According to the system requirement, at most 400 users may
watch the same program source at the same time. Obviously, a single MCC
cannot do this multicast job, so we design a two stage algorithm to solve
the problem.
We split the MCC boards into two groups. One group is called the first
stage boards and the other is called the second stage boards. There is only
a single board in the first stage group, which takes the input of the
program sources. This board duplicates the input data stream and forwards it
to each board in the second stage group. This two stage design extends the
multicast ability to support more than 32 users.
How many users can be supported by this two stage design? We analyze the
design as follows to understand the capacity and throughput of the system.
The analysis also helps us propose an algorithm for building channels
between users and program sources.
To make the analysis clear, we define the following terms:
Channel: the path from a program source to the user's side.
Link: a path that goes into or out of an MCC.
We also make the following assumptions for our system:
1. The multicast function is implemented by a two stage copy procedure.
2. Four VPI values are reserved: 0 (MUX), 253 (hardware loopback), 254
(on-board loopback) and 255 (reserved for future use). That means at most
252 VPI values are available on each port.
3. (*) There is only one data stream for one program source on one board.
This assumption is used to prevent traffic congestion. We mark this item
with an asterisk because we will modify this assumption to meet our system
requirement.
From the description of the two stage design and the restrictions imposed
by these assumptions, we can list the channel capacity of the system.
Suppose we have M MCC boards and N users; then the following constraints
limit the capacity of the system:
1. The max. number of outgoing/incoming links on each MCC is approximately
155/1.5 ~= 100.
2. The max. number of channels in the system is (M-1)*32*252, by the
restriction on available VPI values on each port. If we also consider the
max. number of outgoing/incoming links on each port, the upper bound
becomes (M-1)*32*100.
3. Max. number of one program source that can be multicast will not exceed
4. The min. value of the max. program source number is 252 (in the worst
case).
5. The max. value of the max. program source number will not exceed 126*M.
The limit becomes 50*M if we consider the limit on incoming/outgoing links.
6. Each channel goes through an MCC twice, which means one channel uses two
outgoing and two incoming links in the worst case. For N users, at most N*2
outgoing/incoming links are allocated. This value must not exceed what M
MCCs can provide, that is, 2*N <= 100*M.
The limitation of our system depends on two parts: the multicast ability
and the bandwidth on each port. The multicast ability requires M >= N/32+1.
The bandwidth limitation is M >= N/50. If N is 200, then M must be no less
than 8.
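The two bounds on M stated above (M >= N/32+1 from the multicast ability and
2*N <= 100*M from the link budget) can be checked numerically. The sketch
below assumes that M is rounded up to a whole number of boards; the function
and constant names are illustrative.

```python
import math

COPIES_PER_MCC = 32   # multicast fan-out of one MCC per program source
LINKS_PER_MCC = 100   # about 155/1.5 links per MCC, as estimated in the text


def min_boards(n_users):
    """Smallest whole number of MCC boards satisfying both bounds."""
    # Multicast ability: M >= N/32 + 1 (one extra board is the first stage).
    by_multicast = math.ceil(n_users / COPIES_PER_MCC + 1)
    # Link budget: 2*N <= 100*M, i.e. M >= N/50.
    by_bandwidth = math.ceil(2 * n_users / LINKS_PER_MCC)
    return max(by_multicast, by_bandwidth)


print(min_boards(200))  # prints: 8
print(min_boards(400))  # prints: 14
```

In both cases the multicast ability, not the link budget, is the binding
constraint.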
The discussion above shows that to support 200 users at the same time, we
must have at least 8 MCCs in our system. The system will run more smoothly
if we add more MCCs, because we can distribute the traffic over more boards
to prevent congestion. The spare boards also allow us to provide fault
tolerance in the multicast system.
Before describing the multicast algorithm, we should note that only 252
program sources are permitted in the worst case. If the number of users is
less than 252, this constraint can be ignored. If the number of users
exceeds 252, we should make sure that the worst case does not arise in our
algorithm.
Building Channel (When a User Requests a Program Source)
1. Is the user the first one to request this program source? If yes, go to
step 2; otherwise go to step 4.
2. Find the MCC with the smallest incoming link count as our first stage
board. Find the MCC, other than the previous one, with the smallest incoming
link count as our second stage board.
3. On the first stage board, find the least used FIFO and use it for the
copy. Find an unused VPI in the VPI pool of the second stage board and build
the connection. On the second stage board, also find the least used FIFO to
use and link it to the user. Report success and end the algorithm.
4. Suppose the program source uses B1 as its first stage board. Find the
MCC that acts as a second stage board of this program source and has the
smallest copy count. Denote this board as B2. B1 and B2 must be distinct.
5. If the copy count of this program source on B2 is less than 32, add a
branch on this board. This is similar to step 3.
6. If the copy count of this program source is equal to 32, the program
source cannot be copied any more on this board. Since this board has the
smallest copy count, we know that no second stage board can copy any more.
Find an MCC which does not contain the data stream of the program source and
use it as a second stage board. The rest of the algorithm is similar to step
3.
As we can see, this algorithm is locally optimized to meet our requirement.
It tries to spread the traffic load over the MCC boards and uses the least
used FIFO queue, which helps prevent traffic congestion. On each state
transition, the next state is the best of all possible following states. If
users' requests are uniformly distributed, this algorithm is also globally
optimal. Designing an algorithm that is globally optimal for every
distribution of users' requests is beyond the scope of this system
specification.
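The channel-building steps above can be sketched in Python as follows. This
is a simplified model, not the controller's code: it tracks only link and
copy counts, omits FIFO and VPI selection, and in step 6 picks the unused
board with the smallest incoming link count, which is an assumption the text
does not state. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field

MAX_COPIES = 32  # copies of one program source per MCC


@dataclass(eq=False)          # compare boards by identity
class Board:
    name: str
    incoming_links: int = 0
    copies: dict = field(default_factory=dict)  # source id -> copy count


@dataclass
class Source:
    first_stage: Board
    second_stage: list = field(default_factory=list)


def _attach(src, board, source_id):
    """Start feeding 'board' from the first stage and add one leaf."""
    board.incoming_links += 1
    board.copies[source_id] = 1
    src.second_stage.append(board)
    return board


def build_channel(boards, sources, source_id):
    """Return the second stage board serving the request, or None."""
    src = sources.get(source_id)
    if src is None:                                    # step 1 -> steps 2, 3
        b1 = min(boards, key=lambda b: b.incoming_links)
        b2 = min((b for b in boards if b is not b1),
                 key=lambda b: b.incoming_links)
        b1.incoming_links += 1                         # source enters B1
        src = sources[source_id] = Source(b1)
        return _attach(src, b2, source_id)
    # Step 4: second stage board of this source with the smallest copy count.
    b2 = min(src.second_stage, key=lambda b: b.copies[source_id])
    if b2.copies[source_id] < MAX_COPIES:              # step 5
        b2.copies[source_id] += 1
        return b2
    # Step 6: every second stage board is full; use a board without the stream.
    free = [b for b in boards
            if b is not src.first_stage and b not in src.second_stage]
    if not free:
        return None
    return _attach(src, min(free, key=lambda b: b.incoming_links), source_id)
```

With M boards, this model serves (M-1)*32 requests for a single program
source before it returns None, which matches the multicast ability bound
used in the capacity analysis.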
According to the system requirement, we have to support 400 users at the
same time. Applying this customer count to our system restriction formulas,
we find that 14 MCC boards are necessary. Here a problem arises: the switch
does not have enough ports for so many MCC boards at the same time, so how
can we support 400 users simultaneously with 12 or fewer MCC boards?
To solve the problem, we modify the asterisked assumption and our algorithm
to extend the multicast ability of the system. Modifying the asterisked
assumption may cause cell loss if the system is heavily loaded. We handle
this with a modified algorithm that behaves just like the previous one, with
locally optimized performance, when the system is not heavily loaded, and
can still support many more customers when the system is heavily loaded.
We modify the algorithm by adding the following step to achieve our goal.
7. If we fail to find an MCC that does not contain the data stream of the
program source in step 6, all boards are in use. We select the MCC with the
smallest input count of the program source and use this board as the second
stage board. When we select the FIFO, instead of selecting the least used
one, we select the FIFO that has the largest distance to the FIFOs already
used for this program source.
In the new algorithm, the maximal number of copies into which one channel
can be multicast is (M-1)*32*32/(M-1) = 1024 (independent of M). Another
constraint is M >= N/50. For N=400, M=8 is sufficient. We can see that if
M=8, as long as fewer than 225 users watch the same channel, the modified
algorithm acts like the previous one. When we increase M, these two
algorithms will become all
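The FIFO choice in the added step can be sketched as below. The text does
not define "distance" or state how many FIFOs a board has, so this sketch
assumes FIFOs are numbered 0..n-1, that distance means the absolute
difference of indices, and that the goal is to maximize the distance to the
nearest FIFO already used by the program source. The function name and the
n_fifos parameter are hypothetical.

```python
def farthest_fifo(n_fifos, used):
    """Pick the FIFO whose nearest already-used FIFO is farthest away.

    'used' lists the FIFO indices this program source already occupies on
    the board. Ties are resolved in favor of the lowest index.
    """
    if not used:
        return 0
    return max(range(n_fifos),
               key=lambda f: min(abs(f - u) for u in used))
```

For instance, with 8 FIFOs and FIFO 0 already in use, the sketch selects
FIFO 7; with FIFOs 0 and 7 in use, it selects FIFO 3.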
Before ending this system analysis, we should make an important note about
building the system. The multicast design of this system behaves as a
queuing model and is therefore subject to the same constraints: the cell
loss rate increases exponentially as the system approaches full load. We
should install more MCC boards than strictly needed, to keep the system away
from full load and to support much better fault tolerance.
Software Architecture and Functionality of Common Controller
Fig. 4 shows the software architecture of the common controller. The major
process is mTRAN, which handles resource management, channel allocation and
deletion, user management and fault management of the MCM. The functionality
of these processes is explained in this section.
Software Functionality of BPM Peripheral Module
The BPM software runs on duplex peripheral processors with the RTK real
time kernel. Its functions consist of I/O drivers, communication control,
call processing, and management functions. The I/O drivers include the ATM
driver, the IMC driver and the RS-232 driver: the ATM driver handles
receiving/transmitting ATM cells from/to the ATM cell stream of the
User-to-Network Interface (UNI), the IMC driver handles the IMC cells
instead, and the RS-232 driver is used for terminal communication.
Communication control consists of Inter Module Communication (IMC) and AAL5;
these two functions together carry out the communication between the BPM and
the CC. Call processing and management functions are the two major parts of
the software system. Call processing consists of resource management and
Permanent Virtual Connection (PVC) management, which establish/release PVCs
and allocate/deallocate VPIs. Maintenance consists of Initialization (Init),
Audit, Fault Management (FM), Performance Management (PM), Data Loader (DL),
and Traffic management
To support the multicast on-demand service, some functions of the BPM need
to be changed. They are:
Call Processing: In addition to PVC, a new kind of connection called MPVC
(Multicast PVC) must be supported. An MPVC is a point-to-multipoint
connection connecting a program source (either a broadcast program or a
multicast program), which is treated as the root party, to a multicast
channel in the MCM, which then copies cells to several leaf parties (i.e.
users). All MPVCs are unidirectional: user traffic flows from the root to
the leaf parties only, and there is no traffic in the reverse direction,
including OAM cells. Unlike the original PVC, the establishing/releasing
procedures for an MPVC do not require communication between BPMs or between
a BPM and the MCM. Upon receiving MPVC management messages from the CC, the
CP task only needs to allocate/deallocate a VPI or update the corresponding
entry in the HTT accordingly.
Audit: For simplicity, the run-time data consistency check skips all MPVC
data, i.e. there is no data checking between BPM and MCM.
FM: In case of a physical link failure, the FM shall avoid generating OAM
cells on any MPVC connection.
Software Architecture and Functionality of MCM Peripheral Module
The software architecture of MCM peripheral module is shown in Fig.
ATMTX - This function sends signal messages in ATM format to other modules,
such as the Common Controller Module and the Broadband Process Module.
ATMRX - This function receives signal messages in ATM format from
AAL5 - This function implements the AAL5 protocol.
IMC (Inter Module Communication) - Communication between the Common
Controller and the peripheral processors is routed through the switch. The
messages are segmented and packed into the AAL5 SAR format and multiplexed
with the internal ATM stream. At the destination side, the IMC cells are
extracted from the cell stream, reassembled, and then sent to the processor.
MCH (MultiCast Handler) - This function accepts commands from the Common
Controller module and controls the MCM card. Its actions include setting the
Multicast Table and the Header Translation Table.
INIT (Initialization) - This function initializes the running environment
of this module. Initialization ensures that hardware status and software
data are consistent after the bootstrap loader/loader has loaded the
software into the system. Its actions include initializing hardware,
communicating with the Common Controller Module and creating other tasks.
FM (Fault Management) - This function verifies that the hardware works
normally. It is normally triggered by the Common Controller and can be
executed on-line or off-line.
DL (Data Loader) - The data loader downloads data from the database of the
Common Controller to the peripheral processor while the peripheral processor
is in the cold start or warm start state. If there is some data
inconsistency, the system will use DL to update the peripheral
Audit - Audit detects, confines and recovers data errors and lost resources
before system performance is adversely affected. Audit also verifies the
working state of a running process. Once an error is found, the appropriate
recovery procedure is invoked. Moreover, Audit causes a system switch over
if recovery fails.
Duplex - Since the peripheral processors have 1 + 1 redundancy, the
responsibility of Duplex Control is to detect hardware or software failures
as soon as possible, switch the duties of the failed component to the
redundant component, and recover processing conditions such as global data,
tables, etc. Of the two peripheral processors, one is active and the other
is standby. There is a Watchdog on each peripheral processor. The Watchdog
sends a query to the other peripheral processor via a communication channel;
if it receives an acknowledgement in time, it knows that the other
peripheral processor is still alive. Otherwise, the recovery process takes
place. Likewise, it receives queries from the other peripheral processor and
must send an acknowledgement back to show that it is still alive. The switch
over of a peripheral processor is triggered by the following conditions:
No response to a query
A trigger from Audit for an unrecoverable global data error
A trigger from the Common Controller
MMI (Maintenance Monitor Interface) - This function lets the maintainer
monitor the state of the MCM module. It provides a command line interface
for maintaining the MCM module.
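The switch over criteria listed under Duplex above can be modeled with a
small sketch. It is not the RTK implementation: the timeout value, the class
and the function names are hypothetical, and the clock is injectable only so
that the behavior can be exercised without waiting.

```python
import time


class Watchdog:
    """Tracks whether the peer processor acknowledged a query in time."""

    def __init__(self, timeout_s, now=time.monotonic):
        self.timeout_s = timeout_s   # hypothetical acknowledgement deadline
        self.now = now
        self.last_ack = now()

    def on_ack(self):
        """Record an acknowledgement received from the peer."""
        self.last_ack = self.now()

    def peer_alive(self):
        return self.now() - self.last_ack <= self.timeout_s


def should_switch_over(watchdog, audit_unrecoverable, cc_trigger):
    """True if any of the three listed conditions holds."""
    return (not watchdog.peer_alive()) or audit_unrecoverable or cc_trigger
```

A missed acknowledgement, an unrecoverable Audit error, or an explicit
Common Controller trigger each cause the function to return True on its own.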
Trial Results and Discussions
In the first stage of our MOD trial, we had 28 users, 240 near VOD program
channels, 32 Karaoke channels and 30 live TV channels. At this stage, our
ATM switch with its multicasting facility could meet the usage requirements
of the 28 users. But when we tested the multicasting capability of our ATM
switch, we found that if the number of users connected to one switch port
increased to 45, the ATM switch lost data cells when all 45 users chose the
same program channel. This result would restrict the user expansion of our
MOD system, so we made some modifications:
1. We changed the design of our multicasting card.
2. We increased the number of FIFOs in our ATM switch.
3. We modified our multicasting algorithm to group every 64 users onto one
ATM switch port and to restrict the choosing policy of every program channel
Then we found that our system could finally support 400 users, 240 near VOD
program channels, 32 Karaoke channels and 30 live TV channels, which meets
our trial scale, with a loss rate of 10e-12.
Although our MOD system meets our trial scale, the expansion of ADSL users
and program channels is still an open question. There are some topics we can
discuss:
1. The ATM switch must provide multicasting capability that copes with the
variable characteristics of MOD traffic.
2. Do we need to provide live TV service? Maybe we can focus on Internet
service, multimedia on demand and other value added services; then we can
reduce the multicasting impact of these services.
3. Another xDSL technology such as VDSL, working with an ATM core switch,
may be another candidate for providing live TV service.
The ADSL-based MOD system is a new service trial initiated by the Northern
Taiwan Business Group of Chunghwa Telecom Co., Ltd. At most 400 users are
joining the trial. The services provided by this system are Internet
service, Karaoke on demand, broadcast TV programs and near VOD (pay per
view). In order to gain technical experience, we use the BEX-VPX, an ATM
switch developed by Chunghwa Telecom Labs, as the core switch of our MOD
system. We are now planning a larger scale MOD system. We still want to use
the ADSL-based architecture with an ATM switch, since we think it is an
optimal solution for providing multimedia on demand service.
C. C. Yen, T. I. Hsu, P. T. Tseng, C. C. Chen, Y. W. Chen, and L. S. Liang,
"The Deployment of BEX-VPX Broadband Network and its Network Management in
Taiwan," in Proceedings of IEEE ICCS/ISPACS, pp. 18.6.1-18.6.5, Nov. 1996.