Design and Implementation of an Agent System for Application Service Management
Yutaka IZUMI <firstname.lastname@example.org>
Yuji OIE <email@example.com>
SNMP-based network management systems are widely used; however, managed application servers are often passive in the sense that they report their status only in response to requests from a network management station. Such passive systems cannot quickly detect and fix some troubles in network services, and SNMP-based systems cannot easily be attached to some managed nodes, such as application servers. These systems therefore seriously limit flexibility in managing application servers. For this reason, we propose an agent system for application service management called the SMS (Service Management System). The SMS is designed as a daemon that wraps the managed application in order to provide functions such as monitoring application servers and controlling access to them, in place of the operators who run management systems at a network management station. This series of functions is called "control management" in this paper. Control management allows the SMS to detect and fix troubles according to an action scenario and to be attached easily to application servers. The action scenario of the SMS is given as a script file written in the SMAP (Service Management Agent Programming) language. The SMS also provides an SNMP interface for ordinary SNMP-based network management systems. In this paper, the design and implementation of the SMS are presented. In particular, we apply the SMS to firewall and World Wide Web services and evaluate its effectiveness.
SNMP (Simple Network Management Protocol)-based network management systems are widely used. In these systems, managed nodes, including hardware and software, are often passive in the sense that they report their status only in response to requests from the network management station (NMS). This style of management is referred to as a "polling-oriented" framework.
The polling-oriented framework seriously limits flexibility in managing application servers, particularly in detecting and fixing troubles. Because of polling delay, the network managers, that is, the operators who run the management system at the NMS, are often late in detecting troubles in managed application servers (e.g., firewall, WWW), which sometimes results in severe damage to the system. Furthermore, SNMP-based network management systems allow managed data to be exchanged between managed nodes and network managers in a straightforward manner. Polling-oriented systems also have difficulty adding managed nodes and changing the management configuration for application management, especially for application servers provided as binary packages.
Therefore, it is essential to develop network management systems that resolve troubles in service management quickly, even in networks consisting of a large number of application servers.
We propose an agent system for service management, called the Service Management System (SMS), which allows flexible management of network services and applications based on another network management framework, called "trap-oriented".
Our system is flexible in that network managers can easily attach an SMS Agent to application servers and define how the SMS Agents should recognize troubles in managed application servers, so that the application servers avoid damage to their services caused by those troubles.
The SMS controls application servers as managed nodes in place of network managers. The process of controlling a managed node using the SMS is called control management in this paper.
Control management includes two functions: monitoring the managed nodes to detect troubles, and fixing troubles in the managed nodes quickly.
The SMS was built to provide those functions. An SMS consists of several agents called SMS Agents, each of which has an action scenario for controlling the managed node in the network service. The action scenario is written by the network manager and then loaded into the SMS Agent incrementally. Therefore, the SMS Agent does not need to be restarted even if the network manager changes the contents of the action scenario. The action scenario is given as a script file written in SMAP (Service Management Agent Programming language). We implemented SMAP as an event-driven, interpreted language because such a language is well suited to describing reactions to problems.
The SMS Agent has been designed as a daemon which wraps the managed application server in order to provide two additional functions: access control and security control. The SMS also provides an SNMP interface for ordinary SNMP-based network management systems.
In our research, we also implemented two private MIBs (Management Information Base) -- firewall MIB and WWW MIB -- for evaluation of the SMS.
The firewall MIB, which is based on MIB-II and the RMON MIB defined by the IETF, includes additional variables associated with some firewall applications.
WWW MIB is newly defined here and is different from others proposed by IETF in that our WWW MIB is described to monitor and control the accesses to the WWW server, not to manage the WWW server itself.
In this paper, we present the design and implementation of the proposed SMS and evaluate how effective it is in controlling firewalls and the WWW. We have employed the SMS to handle security attacks on a firewall proxy server and service attacks on a WWW server. Through our experiments, we confirmed that the SMS detects these attacks quickly and changes the configuration of access control immediately. Furthermore, we discuss functions that should be added to the SMS in order to manage the system in a cooperative and distributed way.
The SNMP-based network management system contains four components. The network management model and framework, which contain those components, are described in Fig. 1.
The major management actions defined in the SNMP management framework are explained below.
The above actions are performed by "polling," which is NMS-based in that the NMS reads and writes the values of managed objects. In contrast, the "trap" operation, which is agent-based, lets a managed node report an extraordinary event to the NMS.
While the trap operation is available in the network management framework using SNMP, operations based on polling are the most common. Therefore, current network management systems operate in a "polling-oriented" way.
SNMP-based network management systems have some severe problems in the network management.
SNMP-based network management systems are effective for managed nodes that are hardware, such as routers and hubs. However, they are not effective in managing application servers (e.g., WWW, firewall) because only a few kinds of MIBs relate to application management, and application servers are often provided as binary packages.
Usually, application services are not managed using SNMP, so each of them is managed individually in its own way. This creates a further problem: the overhead of performing service management separately on each application server.
Because trap operations are limited, the SNMP Agent cannot send traps for extraordinary events that are specific to a managed node. Furthermore, it is difficult to add new trap operations to the SNMP Agent in SNMP-based network management systems.
In the "polling-oriented" framework, the NMS cannot quickly detect a trouble occurring in the managed node. If a trouble occurs that is not covered by an SNMP trap, the managers at the NMS must find it in the huge logs left by the applications. Therefore, the managers will very likely be late in detecting the trouble in the managed application server, which sometimes results in serious damage to the system.
A polling of shorter interval decreases the delay, whereas it in turn increases the amount of traffic on the network. This makes it difficult to find a suitable interval on each managed network.
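The tradeoff can be made concrete with a small calculation. The following is a minimal sketch, assuming a fixed polling period, a trouble that occurs at a uniformly random time within a period, and one request/response exchange of fixed size per managed node per poll. The function names, node count, and byte count are illustrative assumptions, not values measured in this paper.

```python
def mean_detection_delay(interval_s: float) -> float:
    """Expected wait until the next poll when a trouble occurs at a
    uniformly random time within one polling period (assumption)."""
    return interval_s / 2.0


def polling_traffic_bps(nodes: int, bytes_per_poll: int, interval_s: float) -> float:
    """Average polling traffic in bits per second over all managed nodes."""
    return nodes * bytes_per_poll * 8 / interval_s


if __name__ == "__main__":
    # Illustrative values only: 100 nodes, 200 bytes exchanged per poll.
    for interval in (300.0, 60.0, 10.0):
        delay = mean_detection_delay(interval)
        load = polling_traffic_bps(nodes=100, bytes_per_poll=200, interval_s=interval)
        print(f"interval={interval:6.0f} s  mean delay={delay:6.1f} s  traffic={load:9.1f} bit/s")
```

Under these assumptions, halving the interval halves the mean detection delay but doubles the polling traffic, which is why no single interval suits every managed network.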
SNMP has no capability to make the SNMP Agent carry out an operation.
To change the configuration of a managed node, all that SNMP can do is change the value of the associated object. In this regard, SNMP is less flexible than RPC (Remote Procedure Call).
We propose an agent system -- the Service Management System (SMS) -- for service management as a solution to the above problems. The system allows flexible network management based on the "trap-oriented" framework for network services and applications. The agent used in the SMS is called the "SMS Agent," which can be attached to an application server easily. Since the SMS employs the SNMP architecture as its means of communication with the NMS, it works well with existing network services and network management systems.
The SMS acts in place of a network manager, and its control management consists of several functions: monitoring and controlling managed nodes in the network services, detecting and fixing troubles in managed nodes, and scheduling managed nodes. The SMS, as a daemon process, wraps the managed nodes, which are mainly application servers. Therefore, the SMS can be attached to application servers easily, without reconfiguring or recompiling them.
We show the SMS solutions to the control management of application servers, firewall, and WWW.
SMS supports several functions to solve the problems of the application service management.
The SMS consists of several agents called SMS Agents, each of which has an action scenario for controlling managed nodes in the network service. The SMS Agent detects illegal access by monitoring accesses to the application server and then executes the reactions defined in the action scenario against that access. The control management process semi-automates a series of processes, including detecting troubles and fixing them, using the SMS Agent. "Semi-automates" here means that these functions are useful against troubles whose cause, detection method, and fix are already known. The manager writes the conditions for recognizing a trouble and the corresponding reactions in an action scenario, which reduces the manager's overhead from routine management processes. When an unknown trouble occurs, the manager adds a new description of the trouble to the action scenario, and the SMS Agent loads the updated action scenario incrementally. Therefore, it is not necessary to restart the SMS Agent even if the manager updates an action scenario.
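The idea of loading rules incrementally into a running agent can be illustrated with a short sketch. It is written in Python rather than SMAP, whose syntax is not reproduced here, and the names `ScenarioTable`, `load`, and `dispatch` are hypothetical; the sketch only assumes that a rule pairs a recognition condition with a reaction.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Condition = Callable[[Event], bool]
Reaction = Callable[[Event], None]


class ScenarioTable:
    """Rules of an action scenario, replaceable while the agent keeps running."""

    def __init__(self) -> None:
        self._rules: Dict[str, Tuple[Condition, Reaction]] = {}

    def load(self, name: str, condition: Condition, reaction: Reaction) -> None:
        """Add a new rule, or replace the rule that has the same name."""
        self._rules[name] = (condition, reaction)

    def dispatch(self, event: Event) -> List[str]:
        """Run the reaction of every matching rule and return the rule names."""
        fired = []
        for name, (condition, reaction) in list(self._rules.items()):
            if condition(event):
                reaction(event)
                fired.append(name)
        return fired


if __name__ == "__main__":
    denied: List[str] = []
    table = ScenarioTable()
    event = {"address": "192.0.2.7", "login": "root"}
    print(table.dispatch(event))  # no rule is loaded yet
    # The manager adds a rule for a newly understood trouble; no restart needed.
    table.load("deny-root",
               lambda e: e["login"] == "root",
               lambda e: denied.append(e["address"]))
    print(table.dispatch(event), denied)
```

Because the rule table is an ordinary in-memory mapping, a rule added by the manager takes effect on the next event without interrupting the agent process.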
The access to the application servers is limited by the access control list including IP addresses which are not permitted to access them.
When a trouble is recognized by a rule of the action scenario, the SMS Agent automatically changes the configuration of the access control.
When the SMS Agent detects an illegal access from a client, the SMS acquires detailed information on the client by the use of commands such as ping and traceroute.
The SMS prohibits the illegal client from accessing the servers by adding its IP address to the access control based on the acquired information.
Furthermore, the schedule, which is described as a service available time in the action scenario, can also change the configuration of the access control automatically.
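The combination of a deny list and a service schedule can be sketched as follows. This is a minimal illustration, assuming a single daily service window that does not cross midnight; the class name `AccessControl` and its methods are hypothetical and do not reflect the SMS implementation.

```python
from datetime import time
from typing import Set


class AccessControl:
    """Deny list plus a daily service window (assumed not to cross midnight)."""

    def __init__(self, open_from: time, open_until: time) -> None:
        self.denied: Set[str] = set()
        self.open_from = open_from
        self.open_until = open_until

    def deny(self, address: str) -> None:
        """Reaction to an illegal access: refuse this client from now on."""
        self.denied.add(address)

    def permits(self, address: str, now: time) -> bool:
        """Allow access only during service hours and only for non-denied clients."""
        in_service = self.open_from <= now < self.open_until
        return in_service and address not in self.denied


if __name__ == "__main__":
    acl = AccessControl(open_from=time(9, 0), open_until=time(18, 0))
    acl.deny("192.0.2.10")
    for address, now in (("192.0.2.10", time(10, 0)),
                         ("192.0.2.11", time(10, 0)),
                         ("192.0.2.11", time(20, 0))):
        print(address, now, "through" if acl.permits(address, now) else "refuse")
```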
The SMS relays packets between the client and the application server, so the client establishes a connection with the application server by sending its packets to the SMS. The SMS thereby conceals the port numbers and addresses of the application servers through packet forwarding. Furthermore, while the server is under maintenance, the SMS answers each client request with a maintenance notice on behalf of the server, as a proxy would. The announcement message is selected according to the type of user request, such as telnet or HTML.
The design of SMS is based on the "trap-oriented" framework. Fig. 2 shows the design of SMS and its framework. The SMS wraps the managed node, thereby allowing it to be attached to the managed node easily.
When a trouble occurs in the managed node, the SMS detects and fixes the trouble and sends a trap to the NMS using SNMP. Our implementation runs on BSD/OS 3.0, which is based on 4.4BSD-Lite2.
The SMS consists of two parts: the SMS Wrapper and the SMS Agent.
The relationship of those parts is described in Fig. 3.
The SMS Wrapper wraps the application server and relays data packets between client and server. On receiving each access from a client, the SMS Wrapper passes the client information to the SMS Agent, which analyzes it to recognize security attacks or service attacks.
The SMS Agent then executes the script that describes the conditions for detecting a trouble and the reactions to it. The script is written in an action scenario. When the SMS Agent finds an illegal access, it automatically orders the SMS Wrapper to change the configuration of the access control.
In this paper, we describe SMS Wrapper adapted to firewalls and WWW. The configuration of the SMS Wrapper is described in Fig. 4.
The SMS Wrapper decides whether to pass ("through") or reject ("refuse") each access. The client information in each request, including the IP address and command, is always checked. Each response from the server is also checked if necessary. The SMS Wrapper has an interface to the SMS Agent for receiving the ACL (Access Control List) and the schedule information. The ACL contains the IP addresses of clients whose access will be refused. When the SMS Wrapper receives a request from a client, the "track" function, described in Fig. 4, extracts the client information from the request data. The client information is passed to the SMS Agent for statistical analysis and is also checked against the ACL in the "judge" function. The request is then either passed ("through") or rejected ("refuse").
If the SMS Agent changes the ACL, it immediately sends the difference between the old and new ACLs to the SMS Wrapper.
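The "track" and "judge" steps and the ACL difference update can be sketched as below. Only the names `track` and `judge` and the results "through" and "refuse" come from the text. The request format (first token taken as the command), the `ClientInfo` type, and the helpers `acl_diff` and `apply_diff` are assumptions made for illustration.

```python
from typing import NamedTuple, Set, Tuple


class ClientInfo(NamedTuple):
    address: str
    command: str


def track(peer_address: str, request_line: str) -> ClientInfo:
    """Extract client information from a request.
    Assumption: the first whitespace-separated token is the command."""
    parts = request_line.split()
    command = parts[0] if parts else ""
    return ClientInfo(peer_address, command)


def judge(info: ClientInfo, acl: Set[str]) -> str:
    """Check the client against the ACL of refused addresses."""
    return "refuse" if info.address in acl else "through"


def acl_diff(old: Set[str], new: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Agent side: compute (added, removed) so only the difference is sent."""
    return new - old, old - new


def apply_diff(acl: Set[str], added: Set[str], removed: Set[str]) -> None:
    """Wrapper side: update the local ACL in place from a received difference."""
    acl.update(added)
    acl.difference_update(removed)


if __name__ == "__main__":
    wrapper_acl = {"192.0.2.7"}
    info = track("192.0.2.7", "GET /index.html")
    print(info, judge(info, wrapper_acl))
    added, removed = acl_diff(wrapper_acl, {"198.51.100.9"})
    apply_diff(wrapper_acl, added, removed)
    print(judge(info, wrapper_acl), sorted(wrapper_acl))
```

Sending only the added and removed addresses keeps each update small, which matters when the ACL changes in the middle of an attack.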
The SMS Agent receives the client information from the SMS Wrapper. The information will be put into the database for statistical analysis for the security check.
The illegal access, which causes a kind of trouble, such as Denial of Service Attack, would be recognized by statistical analysis of the client accesses. The configuration of SMS Agent is described in Fig. 5.
The SMS Agent runs several processes because it functions as a multiprotocol server for TCP and UDP. TCP is used for communication between the SMS Agent and the SMS Wrapper, and UDP is used for the interface with the NMS using SNMP. The processes are shown below.
The action scenario is written in SMAP (Service Management Agent Programming language); we developed SMAP as an event-driven, interpreted language for control management. A script written in the action scenario is called a SMAP script. The SMAP script supports some definitions, functions, and grammars.
We developed private MIBs to verify the effectiveness of the SMS. The MIBs are the Firewall-MIB and the WWW-MIB, one for each kind of application server. Their structures are described in Fig. 8 and Fig. 9.
The Firewall-MIB has been designed and implemented to control the firewall applications in the firewall gateway and the router. The Firewall-MIB includes MIB-II, part of the RMON-MIB, and specific managed objects related to some firewall applications.
Without the SMS, each firewall application is managed individually by the managers. The Firewall-MIB provides a single interface for managing several firewall applications. Each firewall application is therefore defined as a managed object under "apptype". For example, the telnet proxy server of the TIS Firewall Kit is described by the managed object "tiskit" in the Firewall-MIB. The "tiskit" subtree includes managed objects for log-in name, port number, client address, and access permission.
Therefore, the manager can generalize the control for several firewall applications using SMS.
The WWW-MIB is designed and implemented to control accesses to the WWW server. It differs from the one proposed in the IETF in that our WWW-MIB does not control the WWW server itself but controls accesses to it, because it is difficult to adapt a MIB to the WWW server itself, especially a server provided as a binary package.
The WWW-MIB has an item called "statistics" for the statistical information of accesses to the WWW server.
The purpose of "statistics" is to detect illegal access and to protect against troubles such as a Denial of Service Attack. A large number of intentional, rapid accesses in a short time constitutes a Denial of Service Attack, and such an attack can be recognized by statistical analysis of client accesses. Therefore, "statsTable" in "statistics" keeps each client's information for a short time.
We have evaluated SMS. In the evaluation, Apache 1.2 as the WWW server and TIS Firewall Kit 1.3 as the firewall application were used and adapted to the above private MIBs.
The network configuration for the evaluation is described in Fig. 10.
The illegal access using prohibited log-in name like "root" was detected and refused, and then the SMS Agent informed the manager of the client information including the client's address and the routing information to the client.
The conditions to detect the illegal access and the reactions against the trouble are described as the SMAP script in the action scenario. The SMAP script is described in Fig. 11.
"TISfwkit" is defined as a managed object and "AccessPermit/AccessDeny" are defined as specific actions in "handle" for controlling the accesses.
The manager receives an alarm message from the SMS Agent when an illegal access is detected. The message is described in Fig. 12. The alarm can be delivered either in a window or by e-mail.
Thus the conditions and reactions of the SMS can be programmed and customized easily. This script-based programmable environment is important for quick resolution of troubles in firewall service management.
If a client acting as a cracker continues to access a home page on the WWW server rapidly and intentionally, the overhead of the WWW server process increases, and providing the service becomes more difficult. This method is called a "Denial of Service Attack." It has been difficult to detect a Denial of Service Attack and to react to it in real time, for example by refusing access from the specific client.
The SMS detects the above situation using short-term statistical information. The SMS keeps the client information described in Fig. 13.
"statusSourceAddress" and "statusRequestPath" hold the client address and the path of the home page accessed on the WWW server. When the SMS Wrapper receives a request for the same page from the same client, a specific integer is added to that client's "statusTimeToLive".
A repeated access by the same client is identified by matching both "statusSourceAddress" and "statusRequestPath". When a different access is received, the SMS Agent decreases the "statusTimeToLive" of the other clients' entries.
Furthermore, the manager defines a threshold for the value of "statusTimeToLive" in an action scenario. When the "statusTimeToLive" of an access exceeds the threshold, the SMS Agent recognizes it as an illegal access and adds the client's IP address to the ACL. The part of the SMAP script for recognition and reaction is also described in Fig. 13.
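The counting scheme described above can be sketched in a few lines. This is not the SMAP script of Fig. 13. The increment, decay, and threshold values are illustrative assumptions, as is the choice to drop an entry once its counter reaches zero, and the class name `StatsTable` is hypothetical.

```python
from typing import Dict, Set, Tuple


class StatsTable:
    """Short-term access statistics keyed by (source address, request path)."""

    def __init__(self, increment: int = 5, decay: int = 1, threshold: int = 20) -> None:
        self.increment = increment    # added on each repeated access (assumed value)
        self.decay = decay            # subtracted from other entries (assumed value)
        self.threshold = threshold    # defined by the manager in the action scenario
        self.ttl: Dict[Tuple[str, str], int] = {}
        self.acl: Set[str] = set()

    def record(self, address: str, path: str) -> bool:
        """Record one access; return True if it is recognized as illegal."""
        key = (address, path)
        # Every other entry decays, so sporadic clients fade out of the table.
        for other in list(self.ttl):
            if other != key:
                self.ttl[other] -= self.decay
                if self.ttl[other] <= 0:
                    del self.ttl[other]
        self.ttl[key] = self.ttl.get(key, 0) + self.increment
        if self.ttl[key] > self.threshold:
            self.acl.add(address)     # reaction: refuse this client from now on
            return True
        return False


if __name__ == "__main__":
    table = StatsTable()
    for _ in range(5):
        flagged = table.record("192.0.2.66", "/index.html")
    print(flagged, sorted(table.acl))
    print(table.record("192.0.2.20", "/about.html"), table.ttl)
```

With these assumed values, a client that requests the same page five times in a row crosses the threshold, while clients whose requests are interleaved with others see their counters decay before they accumulate.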
The load average, which indicates the overhead of the WWW server process, is graphed in Fig. 14. In this evaluation, 30 client processes were running at the same time, and one of them accessed the server rapidly and intentionally, which created overhead in the WWW server.
As a result, the overhead of the WWW server process decreased.
The SMS allows the configuration of control management to be programmed and customized in real time, which gives the SMS flexibility in network management. Therefore, the application service can immediately avoid serious damage caused by troubles and attacks.
Furthermore, we address the functions of SMS, which should be added further in order to manage the system in the future network management.
The SMS is designed for LAN-scale network management; it is not intended for large-scale network management involving several NMSs and WAN management. Powerful authentication systems and flexible scoping systems are important in large-scale management, since the various management information should be scoped for each manager group or each manager's management level.
At present, however, the information is visible to all managers who know the SNMP community. According to the discussion in the SNMP Working Group of the IETF, a powerful authentication mechanism will be adopted in SNMPv3 as the User-based Security Model. The SMS should be provided with an authentication mechanism and a further scoping function so that distributed SMSs and NMSs can cooperate.
The effectiveness of SMAP scripts in detecting and fixing troubles depends on the experience of the manager. The SMS should therefore offer useful scripts, redefined as an SMS library, and provide a mechanism for handling and loading scripts over the network.
SNMPv3, which is being developed in the SNMP working group of the IETF, approaches distributed management through message processing and dispatching.
SNMPv3 could be integrated with the SMS because the language processing is independent of the SNMP processing. However, script handling and scoping of the management information are not yet provided completely.
Therefore, the SMS should also provide control management in the network management framework using SNMPv3 and should be adapted to distributed management, which would further increase its flexibility.
We have proposed flexible service management using the SMS. It brings a trap-oriented network management framework that provides flexible control in service management. The SMS detects and fixes troubles; an action scenario specifies how the troubles are detected and fixed. The action scenario includes the SMAP scripts that control the managed node. Current network management systems require much time to fix troubles and also have difficulty controlling several application servers. We have shown how these problems of service management can be solved within the SMS management framework. To verify the effectiveness of the SMS, we handled a security attack on a firewall proxy server and a service attack on a WWW server. In our experiments, the SMS detected these attacks quickly and changed the configuration of access control in response to them immediately. Adaptation to large-scale network management is left for future study.
The authors would like to thank the members of information network laboratory, Nara Institute of Science and Technology, Nara, Japan. The authors also thank Prof. Fujiichi Yoshimoto (Wakayama University) and Toru Murase, Hiroyuki Inoue (Sumitomo Electric Industry) for their active discussion and valuable suggestions.