Web site monitoring and management perspectives:
A readiness-evaluation methodology



Joseph Gulla: gulla@us.ibm.com
IBM Global Services Web Hosting Delivery
Research Triangle Park, North Carolina, USA

John Hankins: hankinjo@us.ibm.com
IBM Global Services Web Offering Development
Schaumburg, Illinois, USA


Abstract
In this paper, we define a framework that can be used to evaluate the quality and completeness of the monitoring and management of a web site. This approach is scalable and extensible as it can be used for a single web site, a collection of sites, or an entire hosting center. This framework can also address the support of the launch of a new site, an upgrade to an existing site, or a one-time quality and completeness assessment. The approach is supported by a methodology that is based on a series of "perspectives" which incorporate a comprehensive view of tools, processes, organizational structure, and staff skills. The perspectives discussed include system, support, and end-user. The system perspective has as its focus the monitoring of essential infrastructure, application, and business-system components. The support perspective focuses on team processes and related tools. Key processes include change, problem, performance, and security. The end-user perspective is centered on measurement and improvement of the end-user experience. The framework, and its supporting perspectives, provides the opportunity to take a comprehensive view of the management of a site. In support of the framework, we have developed a systematic methodology that uses a series of data tables to drive and support the analysis. These tables are used to clearly identify and document the monitoring and management components, processes, and tools that are the focus of the activity.


Contents

1 Introduction

2 Method

3 System perspective
3.1 Creating the infrastructure table
3.2 Creating the application table
3.3 Creating the business table
4 Creating the monitoring evaluation table

5 Support perspective
5.1 Change management
5.2 Problem management
5.3 Performance management
5.4 Security management
6 End user perspective

7 Review report or presentation to complete the review

8 Summary

About the authors

Footnotes

References



1 Introduction
In many enterprises, web sites have become a mission-critical component of the organization as more and more businesses have come to rely on the Web for commerce as well as internal and external communication. In 1997, the global impact of the Internet trade in services accounted for well over $40 billion of U.S. exports (A Framework, 1997). The U.S. Commerce Department estimates that by 2003 business-to-consumer e-commerce will likely be in the range of $75 to $144 billion, and business-to-business e-commerce could reach between $634 billion and $3.9 trillion (Leadership for the new millennium, 2001). Because of this, many IT organizations are putting new focus on establishing and maintaining 7x24 site availability along with high-quality performance and response time. The consequences of Web site failures can be costly, as E*TRADE has experienced. From February 3, 1999 through March 3, 1999, E*TRADE experienced four outages of at least five hours. The direct cost of these failures is not known, but the company's stock price declined twenty-two percent on February 5 -- just two days after the initial failure (Frick, 2000). Web management tools are plentiful, but alone they are not enough to do the job.

Many organizations use multiple, ad hoc methods, tools, and services in the quest for five-nines reliability and sub-3-second response time. This collection of resources makes determining the effectiveness of Web site management tools a challenge for many IT organizations. This is true whether you use a framework-based management solution or a collection of point products. Many organizations spend a great deal of time and money evaluating the capabilities of a given tool but do very little to understand how the full range of tools will perform in the context of the organization's staff, skills, and processes. The main challenge of web site monitoring and management is to be able to detect (and correct) a variety of problems quickly. Often these problems go undetected because the monitoring and management implementation is ineffective and does not fully leverage the capabilities of the toolset. Many tools are powerful and specific to the problems that occur with these sites yet fail to perform at the expected level. Tivoli has dozens of products that plug into its framework that can be used to manage sites and their application components (Tivoli product index, 2000). BMC Patrol has a wide variety of Knowledge Modules (Patrol, 2000) and detailed white papers to guide the technologist. Event management is a good example of this phenomenon (Event management and notification, 2000). Considering the complexity of this implementation challenge, what analysis and planning is required to implement effective Web site-specific monitoring and management? How do you go about anticipating problems before they happen, and when they happen, how do you correct them in a timely manner? This paper provides a framework for the analysis and planning that is necessary to identify and evaluate the important monitoring and system management components for a Web site. This approach can be used by the enterprise to plan monitoring and management if the site is being hosted in-house or to evaluate the quality of monitoring and management if it is being outsourced to a web hosting organization.

Our approach is based on perspectives that incorporate tools, processes, organizational structure, and staff skills. The notion of perspectives is used by Strum & Bumpus (1999) to give a simple name to what others called disciplines, domains, functions, processes, and services. In our use, a perspective is a point of view or focus. Our perspectives are the system, support, and end-user perspectives. The system perspective has a focus on hardware and software grouped by infrastructure, application, and business components. The support perspective is centered on four processes -- change, problem, performance, and security -- with a concentration on team, process, and tools. The end-user perspective is focused on measurement and improvement of availability, performance, consistency, and reliability. The approach is implemented using a three-part method.



2 Method
We have developed a systematic methodology supported by a series of data tables. This approach is used by IT consultants delivering services to customers like those supporting Information Technology Infrastructure Library (ITIL) Services (Introduction to ITIL, 2001) and Information Systems Management Architecture processes (Harikian, Blust, Campbell, Cooke, Foley, Gulla, Gayo, Howlette, Mosher, & O'Mara, 1996). These tables clearly identify and document the monitoring and management components, processes, and tools that are the focus of the framework. The method consists of three steps: preplanning, analysis, and review. The focus of the preplanning step is to gather materials and to perform early planning activities. The materials collected include web site design diagrams and configurations. Planning activities include listing site components and examining items used in production like restart scripts, monitors, and other tools. Preliminary system, support, and end-user perspective tables are built during this step.

An important activity of the analysis step is the completion of the system, support, and end-user perspective tables. These completed tables are used as the basis for a presentation or report. With partially completed tables in hand, a meeting should be held with the owners/developers of the web site to review what has been found. This meeting offers the opportunity to validate the requirements that have been discovered, as well as to gather any additional information that will have an impact on the assessment. During the review step, a presentation or report that contains findings and recommendations is presented. Final versions of the system, support, and end-user perspective tables are also provided and explained. The report is a major tool that should be used to drive the activities to put the systems management solution in place to support the production web site. An appendix to the presentation or report contains a high-level implementation plan, which should be used as a guide to implement the action items from the study. The system, support, and end-user perspectives are now discussed in detail.



3 System perspective
The system perspective has three areas of focus -- infrastructure, applications, and business functions. Each area is different; taken as a whole, they cover the system aspects of the Web site. The infrastructure focus concentrates on the operating system, server and network hardware, and other devices such as firewalls. The practical aspects of infrastructure support have been discussed in detail elsewhere; a good example is Welter's work (1999). The application focus places specific attention on the database, middleware, and the application itself. The business-function focus is on the comprehensive management of a collection of applications. Here, comprehensive means business views, monitors, and command and control.



3.1 Creating the infrastructure table
For the site's infrastructure focus, create a table with as many specific components as you determine to be key to the health of the infrastructure. A good starting point should include the operating system, server hardware, network hardware, and other devices like firewalls and load-balancing servers. Derive the list of infrastructure components from the documentation for the web site. For each specific component, identify a set of detailed components. For the operating system, this should include detailed components like CPU utilization, file systems, paging space, memory utilization, etc. These detailed components will become the focus of the monitors that will be used to ensure the availability of the infrastructure. Table 1 contains examples of infrastructure specific components and component details (see footnote 1).
Table 1.
Infrastructure identification table

| Specific Component | Component Details |
|---|---|
| Operating System | CPU utilization |
| | File systems |
| | Paging space |
| | Memory utilization |
| | OS processes daemon (service) monitoring (virus alert, log service, etc.) |
| | Interface status |
| | Network utilization |
| | Packet loss |
| | External access |
| | Network collisions |
| | Network processes daemon (service) monitoring |
| Server Hardware | RAID array disk failure |
| | CPU failure |
| | Disk drive failure |
| Network Hardware | Switch status |
| | Router status |
| Other Devices | Load balancing device status |
| | Firewall status |
| | Caching server status |
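
To illustrate how a component detail from Table 1 might become a monitor, the following is a minimal sketch in Python rather than a description of any product named in this paper. The threshold values, the `Threshold` class, and the function names are assumptions made for the example; suitable limits depend on the site being evaluated.

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Warning and critical limits for one component detail, in percent."""
    warning: float
    critical: float


# Hypothetical limits chosen for illustration only.
THRESHOLDS = {
    "CPU utilization": Threshold(warning=80.0, critical=95.0),
    "File systems": Threshold(warning=85.0, critical=95.0),
    "Paging space": Threshold(warning=50.0, critical=80.0),
    "Memory utilization": Threshold(warning=85.0, critical=95.0),
}


def classify(detail: str, percent_used: float) -> str:
    """Map a measured percentage onto ok / warning / critical."""
    limits = THRESHOLDS[detail]
    if percent_used >= limits.critical:
        return "critical"
    if percent_used >= limits.warning:
        return "warning"
    return "ok"


def file_system_percent_used(path: str = "/") -> float:
    """Measure how full the file system holding `path` is."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    pct = file_system_percent_used("/")
    print(f"File systems: {pct:.1f}% used -> {classify('File systems', pct)}")
```

Keeping the limits in one table-like structure mirrors the identification table itself: each component detail has exactly one entry, so a missing monitor shows up as a missing key.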




3.2 Creating the application table
For the Web site's application focus, the concentration is on the database, middleware, and the application itself. Table 2 contains examples of application specific components and component details.
Table 2.
Application identification table

| Specific Component | Component Details |
|---|---|
| Application | Application processes daemon (service) monitoring |
| Database | Database processes daemon (service) monitoring |
| | Communication support monitoring |
| | Backup success monitoring |
| Middleware | Middleware processes daemon (service) monitoring |
| | Queue monitoring |
| | Channel monitoring |




3.3 Creating the business table
For the business aspect of the system perspective, the focus is on relating the applications as business systems. We first observed this idea in Tivoli's Global Enterprise Manager product (Tivoli global enterprise manager instrumentation guide, 1998). However, our approach in this paper is product independent. To relate applications as business systems, components are grouped and taken as a whole. Views are used to visually manage the business systems. Business system monitoring is more inclusive than regular monitoring. For example, a business system could have a monitor called "all business system interfaces." This monitor could check the end-points of all application interfaces, check the availability of queues between applications, and run a test transaction -- all part of the same monitor and all related to one business system. Business system command and control likewise has functions that are more inclusive than a regular command. For example, a command could have a function to shut down and restart selected or all daemons (services) of an application for all servers in the business system. Table 3 contains examples of business specific components and component details.
Table 3.
Business identification table

| Specific Component | Component Details |
|---|---|
| Business system view(s) | A view or views that contain related applications and components |
| Business system monitor(s) | Checks all application interfaces, checks availability of queues between applications, and runs a test transaction (health check) |
| Business system command(s) | Shutdown all (or selected) daemons (services) of a business system |
| | Startup all (or selected) daemons (services) of a business system |
| | Restart all (or selected) daemons (services) of a business system |
| | Display all (or selected) events (traps) of a business system |
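
The business system monitor described above groups several checks under one name. A minimal, product-independent sketch of that idea in Python follows. The three check functions are placeholders with hard-coded results, and the names `run_business_system_monitor` and `ORDER_ENTRY_CHECKS` are hypothetical; a real monitor would probe interface end-points, queues, and a test transaction.

```python
from typing import Callable, Dict

Check = Callable[[], bool]


def run_business_system_monitor(checks: Dict[str, Check]) -> Dict[str, bool]:
    """Run every named check; a check that raises is recorded as failed."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return results


def business_system_healthy(results: Dict[str, bool]) -> bool:
    """The business system is healthy only if every component check passed."""
    return all(results.values())


# Placeholder checks with fixed outcomes, for illustration only.
def interfaces_reachable() -> bool:
    return True


def queues_available() -> bool:
    return True


def test_transaction_succeeds() -> bool:
    return False


ORDER_ENTRY_CHECKS = {
    "application interfaces": interfaces_reachable,
    "queues between applications": queues_available,
    "test transaction": test_transaction_succeeds,
}

if __name__ == "__main__":
    outcome = run_business_system_monitor(ORDER_ENTRY_CHECKS)
    print(outcome, "healthy" if business_system_healthy(outcome) else "degraded")
```

Reporting the individual results alongside the overall verdict lets an operator see which part of the business system failed without running each check by hand.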

After the infrastructure, application, and business components are identified and documented, the important data should be transferred to the monitoring evaluation table.




4 Creating the monitoring evaluation table
The monitoring evaluation table serves as a tool to identify monitors to be used (or developed) to address the management needs of the web site. The table also contains information about scripts, commands, and views that are needed for the management of the Web site. Table 4 is an example of a completed monitoring evaluation table.
Table 4.
Monitoring evaluation table

| Specific component | Monitored today? | Current tool (or proposed) | Evaluation |
|---|---|---|---|
| CPU utilization | Yes | Tivoli DM (footnote 2) | Current monitor is effective |
| File systems | Yes | Tivoli DM | Current monitor is effective |
| Paging space | Yes | Tivoli DM | Current monitor is effective |
| Memory utilization | No | (Tivoli DM) | Proposed monitor will be effective |
| OS processes daemon (services) | No | (Tivoli DM) | Proposed monitor will be effective |
| Interface status | No | (NetView) (footnote 2) | Proposed tool will be effective |
| Network utilization | No | (NetView) | Proposed tool will be effective |
| Packet loss | Yes | NetView | Current tool is effective |
| External access | Yes | ESM (footnote 2) | Current tool is effective |
| Network collisions | No | (Trend) (footnote 2) | Proposed tool will be effective |
| Network processes daemon (service) monitoring | No | (Tivoli DM) | Proposed monitor will be effective |
| RAID array disk failures | No | (Researching) | Site has had problems that have gone undetected |
| CPU failure | No | (Researching) | Site has had problems that have gone undetected |
| Disk drive failure | No | (Researching) | Site has had problems that have gone undetected |
| Switch status | Yes | NetView | Current tool is effective |
| Router status | Yes | NetView | Current tool is effective |
| Load balancing device status | No | (Tivoli DM) | Proposed monitor will be effective |
| Firewall status | No | (Custom script) | Proposed script will be effective |
| Caching server status | No | (Tivoli DM) | Proposed monitor will be effective |
| Application processes daemon monitoring | No | (Tivoli DM) | Proposed monitor will be effective |
| Database processes daemon (service) monitoring | No | (Tivoli Oracle Module) | Proposed monitor will be effective |
| Communication support monitoring | No | (Tivoli Oracle Monitor) | Proposed monitor will be effective |
| Backup success monitoring | Yes | Tivoli DM | At times, unsuccessful backups have gone undetected |
| Middleware processes daemon (service) monitoring | No | (Tivoli MQ Module) | Proposed monitor will be effective |
| Queue monitoring | No | (Tivoli MQ Module) | Proposed monitor will be effective |
| Channel monitoring | No | (Tivoli MQ Module) | Proposed monitor will be effective |
| Business views | Yes | NetView | Current views will need to be enhanced |
| Business monitors | No | (Custom monitors using Tivoli DM) | Proposed monitors will be effective |
| Business commands | No | (Custom scripts) | Proposed commands will be effective |
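
Once the monitoring evaluation table exists, simple summaries can be drawn from it. The sketch below, in Python, holds a handful of rows taken from Table 4 as records and reports the share of components monitored today and the proposed work grouped by tool. The record layout and function names are assumptions for illustration; they are not part of the methodology itself.

```python
from collections import Counter
from typing import List, NamedTuple


class MonitorRow(NamedTuple):
    component: str
    monitored_today: bool
    tool: str          # current tool, or the proposed one if not monitored
    evaluation: str


# A small subset of Table 4, copied as data.
ROWS = [
    MonitorRow("CPU utilization", True, "Tivoli DM",
               "Current monitor is effective"),
    MonitorRow("Memory utilization", False, "Tivoli DM",
               "Proposed monitor will be effective"),
    MonitorRow("RAID array disk failures", False, "Researching",
               "Site has had problems that have gone undetected"),
    MonitorRow("Backup success monitoring", True, "Tivoli DM",
               "At times, unsuccessful backups have gone undetected"),
    MonitorRow("Business views", True, "NetView",
               "Current views will need to be enhanced"),
]


def coverage(rows: List[MonitorRow]) -> float:
    """Fraction of listed components that are monitored today."""
    return sum(row.monitored_today for row in rows) / len(rows)


def unmonitored(rows: List[MonitorRow]) -> List[str]:
    """Components with no monitor in place yet."""
    return [row.component for row in rows if not row.monitored_today]


def proposed_work_by_tool(rows: List[MonitorRow]) -> Counter:
    """Count unmonitored components per proposed tool."""
    return Counter(row.tool for row in rows if not row.monitored_today)


if __name__ == "__main__":
    print(f"Monitored today: {coverage(ROWS):.0%}")
    print("Gaps:", unmonitored(ROWS))
    print("Work by tool:", dict(proposed_work_by_tool(ROWS)))
```

Note that "monitored today" does not imply "effective": the backup row is monitored yet has let failures through, so the evaluation column still has to be read by a person.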




5 Support perspective
The key components of the support perspective are team, process, and tools. These components are explored in the context of four disciplines -- change, problem, performance, and security. Change, problem, performance, and security management are widely practiced disciplines in the industry. They are basic to IBM systems management (Mangold & Brandner, 1993) and many other organizations focused on the management of systems like ITIL (Introduction to ITIL, 2001). The scope of the support perspective is both broad and narrow. The broad scope has to do with the readiness of change, problem, performance, and security teams to handle the Web site's needs. The narrow scope is how monitoring, command, and control interact with the specific functional perspectives. For example, during a change window, how is monitoring handled to avoid a flood of false alerts?
5.1 Change management
Change management is a process whose goal is to provide defect-free implementation of changes to the system environment. This process includes planning and documentation of the change, real-time management of the change, verification of completion or, in the case of failure, verification of restoration back to the original state, and follow-up analysis and reporting. Changes, including both installations and modifications, should be accomplished in a logical and orderly fashion that achieves the expected result without undesired service disruption. From a system-monitoring viewpoint, we see three important issues:
  1. How to shut down the monitoring system during a change activity in order to avoid a flood of false positive alerts

  2. How to reactivate the monitoring system following a change activity to verify that all systems and services are functioning normally

  3. How to convert problem management activities into change activities
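
The first two issues can be made concrete with a small sketch. The Python below assumes a hypothetical `ChangeWindow` record listing the hosts under change; alerts for those hosts are suppressed while the window is open, and the same list is reused afterwards to decide which hosts to re-verify. The host names and dates are invented for the example, and the third issue (turning problem records into change records) is not covered here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ChangeWindow:
    """A scheduled change: start (inclusive), end (exclusive), affected hosts."""
    start: datetime
    end: datetime
    hosts: FrozenSet[str]

    def covers(self, host: str, when: datetime) -> bool:
        return host in self.hosts and self.start <= when < self.end


def should_alert(host: str, when: datetime,
                 windows: Iterable[ChangeWindow]) -> bool:
    """Suppress alerts only for hosts inside an open change window."""
    return not any(window.covers(host, when) for window in windows)


def hosts_to_verify(window: ChangeWindow, now: datetime) -> List[str]:
    """After the window closes, list the hosts whose monitors need a re-check."""
    return sorted(window.hosts) if now >= window.end else []


# Hypothetical window: two web servers, 02:00 to 04:00.
WINDOW = ChangeWindow(
    start=datetime(2001, 6, 3, 2, 0),
    end=datetime(2001, 6, 3, 4, 0),
    hosts=frozenset({"web01", "web02"}),
)

if __name__ == "__main__":
    during = datetime(2001, 6, 3, 3, 0)
    print(should_alert("web01", during, [WINDOW]))   # host under change
    print(should_alert("db01", during, [WINDOW]))    # unaffected host
    print(hosts_to_verify(WINDOW, WINDOW.end))
```

Scoping suppression to the affected hosts, rather than shutting down all monitoring, keeps unrelated failures visible during the change.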
From an overall management viewpoint, the change evaluation table has a series of questions that are designed to discover how prepared the change management team, tools, and processes are to handle the Web site being evaluated. Table 5 is an example of a completed change evaluation table.
Table 5.
Change evaluation table

| Specific component | Component details | Evaluation |
|---|---|---|
| Team | Who are the team members? | Change team is in place with three members. |
| | What skills, experience, and training does the team possess? | Change team is experienced and well trained in the change-management discipline. |
| | What coverage is provided by the team? Is this adequate? | Team works normal business hours. Recent problems during after-hours change windows may require change-management team coverage to expedite decision-making. |
| | Are there any change measurements that can be used to evaluate the effectiveness of the team? If so, what data? | Few measurements exist at this time. Change success rate cannot be determined. Both a method and a tool are needed to measure change success rate. |
| | What are the strengths and weaknesses of the team? | Team understands its well-defined process. Team does not have organizational support. Significant changes that should be done in the weekly window are done without change team awareness. |
| | What other teams are key to the success of this team? Are there any issues? | Problem team is related to change team as some problems result in changes being scheduled in change windows. There is no tool linkage between change and problem. Two different software tools are used and they do not work together. |
| | Based on the above evaluation, what are the primary issues? What is the action plan to resolve? | Recommendations: change team support for change window; measurements; more changes need to be planned and scheduled; interface is needed between problem and change. |
| Tools | What is the primary tool for tracking and managing change activities? | Information management tool with change-management panels. |
| | What are the strengths and weaknesses of this tool? | Well known and understood legacy tool; access is only through mainframe (TSO). |
| | What other tools are used to manage change activities? | Some databases are used to store detailed plans and other documents. |
| | Evaluate and develop action plan to address deficiencies. | Recommendations: Web access to change records; use Web to integrate change records and supporting plans. |
| Process | Is there a document that defines the organization's change management policies and procedures? | Yes, well defined. |
| | Is this consistent with actual practice? If not, where are the gaps? | No, significant changes appear to take place outside change window. |
| | Evaluate and develop action plan to address deficiencies. | Recommendation: look for a root cause of unmanaged change. |
| Overall | What reports are available to review change activity performance? Are they adequate? | Few reports. |
| | What is the leading cause of failed changes? What is being done to address this? | Not tracked. |
| | What is an acceptable level of failed changes? Is this measure being met? | No plan in place. |
| | What is the level of customer satisfaction with change management? If customer satisfaction is low, what are the reasons why? | Unknown. |
| | Evaluate and develop action plan to address deficiencies. | Recommendations: initiate change management reporting; document acceptable level of failed changes; initiate measurement of customer satisfaction with change management. |




5.2 Problem management
Problem management is the successful awareness of and response to all monitoring tool alerts and other manually reported or detected problems, and the resolution of any events, conditions, failures, etc. indicated by this information. The entire set of activities is focused on ensuring that the site is available and functioning in the manner in which it was designed. The essential issues include:
  1. Configuring automated alert tools with the appropriate level of sensitivity. A tool that queries too often can impact the functioning of the site; one that queries too infrequently lets problems remain undetected for an unacceptable length of time. Tools that are too sensitive can also generate too many alerts or tickets, creating a flood of false positives that obscures the actual functional state of the system. Tools that are too insensitive miss problems and have little value. Getting the balance correct can be a challenge.

  2. Automating responses. The 7x24 nature of e-business web sites and the number of systems used in the typical web site mean that some degree of automation is required to scale the problem management system and control costs. Automation can ensure a rapid response to simple problems regardless of when they occur.

  3. Achieving timely resolution of problems. This is possible, but it requires the right people, tools, and processes.
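
One common way to trade sensitivity against false positives, offered here only as an illustrative sketch and not as the behavior of any tool named in this paper, is to raise an alert after several consecutive failed polls instead of after the first one. The `DebouncedAlert` class and its default of three failures are assumptions for the example.

```python
class DebouncedAlert:
    """Raise one alert after N consecutive failed polls; reset on success."""

    def __init__(self, failures_to_alert: int = 3):
        if failures_to_alert < 1:
            raise ValueError("failures_to_alert must be at least 1")
        self.failures_to_alert = failures_to_alert
        self._streak = 0
        self._alerting = False

    def observe(self, check_passed: bool) -> bool:
        """Record one poll; return True only on the poll that raises a new alert."""
        if check_passed:
            self._streak = 0
            self._alerting = False
            return False
        self._streak += 1
        if self._streak >= self.failures_to_alert and not self._alerting:
            self._alerting = True
            return True
        return False


def worst_case_detection_seconds(poll_interval_seconds: float,
                                 failures_to_alert: int) -> float:
    """Upper bound on the time from the start of an outage to the alert."""
    return poll_interval_seconds * failures_to_alert


if __name__ == "__main__":
    alert = DebouncedAlert(failures_to_alert=3)
    polls = [False, True, False, False, False, False, True, False]
    print([alert.observe(p) for p in polls])
    print(worst_case_detection_seconds(60, 3), "seconds")
```

The single isolated failure at the start of the sample never pages anyone, while the sustained outage does so exactly once. The cost is detection delay: with a 60-second poll and three required failures, an outage can go unreported for up to about three minutes.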
From an overall management viewpoint, the problem evaluation table has a series of questions that are designed to discover how prepared the problem management team, tools, and processes are to handle the Web site being evaluated. Table 6 is an example of a completed problem evaluation table.
Table 6.
Problem evaluation table

| Specific component | Component details | Evaluation |
|---|---|---|
| Team | Who are the team members? | There is no problem team as such -- problems are handled by helpdesk personnel. Tough problems are sent over to the administrators of the Web site. |
| | What skills, experience, and training does the team possess? | Problem handlers have basic skills in problem determination; administrators know the application but could benefit from documented problem-determination procedures. |
| | What coverage is provided by the team? Is this adequate? | Problem handling is done 24 x 7; administrators are available by pager after normal business hours. |
| | Are there any problem measurements that can be used to evaluate the effectiveness of the team? If so, what data? | Administrators analyze problem records monthly. Root cause analysis is done on all serious problems. |
| | What are the strengths and weaknesses of the team? | There is no problem team, but the relationship between problem catchers and administrators works well. |
| | What other teams are key to the success of this team? Are there any issues? | Change team should be linked with problem solvers -- some problems result in changes (both scheduled and non-scheduled). |
| | Based on the above evaluation, what are the primary issues? What is the action plan to resolve? | Recommendations: document and refine problem determination procedure; open communication channels between problem solving group and change team to increase the number of changes that go through a controlled process. |
| Tools | What is the primary tool for tracking and managing problem activities? | Information management tool with problem-management panels. |
| | What are the strengths and weaknesses of this tool? | Well known and understood legacy tool; access is only through mainframe (TSO). |
| | What other tools are used to manage problem activities? | Some databases are used to store root-cause analysis and other documents. |
| | Evaluate and develop action plan to address deficiencies. | Recommendations: Web access to problem records; use Web to integrate problem records and related documents like root-cause analysis. |
| Process | Is there a document that defines the organization's problem management policies and procedures? | Yes, but high-level. |
| | Is this consistent with actual practice? If not, where are the gaps? | Not enough detail to really determine. |
| | Evaluate and develop action plan to address deficiencies. | Recommendation (linked to an earlier recommendation): document and refine problem determination procedure. |
| Overall | What reports are available to review problem activity performance? Are they adequate? | Problem reports are detailed. |
| | What is the leading cause of problems? What is being done to address this? | Application instability; application is being migrated to a more robust software implementation. |
| | What is an acceptable level of problems? Is this measure being met? | Team is looking for 99.5% availability of application. |
| | What is the level of customer satisfaction with problem management? If customer satisfaction is low, what are the reasons why? | Not measured but not believed to be a major problem. |
| | Evaluate and develop action plan to address deficiencies. | Recommendations: document acceptable level of problems; initiate measurement of customer satisfaction with problem management. |
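
An availability target such as the 99.5% in Table 6 is easier to act on when converted into a downtime budget. The short Python sketch below does that arithmetic; the 30-day and 365-day reporting periods are assumptions chosen for the example, and the function names are hypothetical.

```python
def allowed_downtime_minutes(availability_percent: float,
                             period_days: int = 30) -> float:
    """Downtime budget, in minutes, implied by an availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


def measured_availability_percent(outage_minutes: float,
                                  period_days: int = 30) -> float:
    """Availability actually achieved, given total outage time in the period."""
    total_minutes = period_days * 24 * 60
    return 100 * (total_minutes - outage_minutes) / total_minutes


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.5, period_days=30)
    print(f"99.5% over 30 days allows {budget:.0f} minutes of downtime")
    yearly = allowed_downtime_minutes(99.999, period_days=365)
    print(f"99.999% over 365 days allows {yearly:.2f} minutes of downtime")
```

Under these assumptions, 99.5% over 30 days leaves 216 minutes (3.6 hours) of downtime, whereas five-nines availability over a year leaves a little over five minutes. Documenting the target in this form supports the recommendation to define an acceptable level of problems.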




5.3 Performance management
Performance management is focused on the measurement and reporting of the use of system resources by the application and its users. Performance management can be used to report problems in real time but is generally used to determine performance trends and to plan for necessary resource upgrades or modifications. From an overall management viewpoint, the performance evaluation table has a series of questions that are designed to discover how prepared the performance management team, tools, and processes are to handle the Web site being evaluated. Table 7 is an example of a completed performance evaluation table.
Table 7.
Performance evaluation table

| Specific component | Component details | Evaluation |
|---|---|---|
| Team | Who are the team members? | Performance team is in place with four members. |
| | What skills, experience, and training does the team possess? | Performance team is experienced and well trained with performance tools. |
| | What coverage is provided by the team? Is this adequate? | Team works normal business hours. Most performance management is done after the fact -- most tools are not real-time. |
| | Are there any performance measurements that can be used to evaluate the effectiveness of the team? If so, what data? | Few measurements exist at this time. Team consults with Web administrators and shares performance information. |
| | What are the strengths and weaknesses of the team? | Team understands tools and does a good job in its consulting role. Team is not equipped to deal with "emergency" performance problems. |
| | What other teams are key to the success of this team? Are there any issues? | Performance team works with Web administrators and teams working problems. |
| | Based on the above evaluation, what are the primary issues? What is the action plan to resolve? | Recommendation: performance team needs a methodology to work emergency performance problems. |
| Tools | What is the primary tool for investigating performance problems? | Team uses utilities that are part of the OS and records statistics to log files. |
| | What are the strengths and weaknesses of this tool? | Tools are well known and easy to use and interpret. Tools require systems-administration-level skills. |
| | What other tools are used to manage performance? | Just scripts and basic reporting tools like SAS. |
| | What specific performance metrics are collected? | CPU utilization, page space, memory, and disk utilization. |
| | What is the timeframe for the collection of performance data? | Data is stored for the past 6 months. |
| | Evaluate and develop action plan to address deficiencies. | Recommendation: real-time tool is needed to support emergency performance problems. |
| Process | Is there a document that defines the organization's performance management policies and procedures? | No, performance management is not a core discipline. |
| | Is this consistent with actual practice? If not, where are the gaps? | Team just provides consulting-level assistance. |
| | Evaluate and develop action plan to address deficiencies. | Recommendation: it is unclear whether the performance-management focus needs to be more formal, as performance of the site is handled carefully by the administration and performance community. |
| Overall | What reports are available to review site performance? Are they adequate? | No real reports, just ad hoc reporting. |
| | What is the leading cause of performance problems? What is being done to address this? | Not tracked. |
| | What is an acceptable level of Web site performance? Is this measure being met? | No plan in place. |
| | What is the level of customer satisfaction with the performance of the site? If customer satisfaction is low, what are the reasons why? | Unknown. Current measurements are manual and not shared with the site owners and administrators. |
| | Evaluate and develop action plan to address deficiencies. | Recommendations: initiate performance reporting; document acceptable level of performance; initiate measurement of customer satisfaction with site performance. |
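
Because performance data is used mainly for trends and capacity planning, even six months of stored averages can support a rough forecast. The Python sketch below fits a straight line by ordinary least squares to monthly utilization figures and estimates how many months remain before a limit is reached. The sample values and the 80% limit are invented for illustration, and a linear trend is itself an assumption that may not hold for a given site.

```python
from typing import List, Optional, Tuple


def linear_trend(values: List[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for values sampled at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two samples are needed to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def months_until_limit(values: List[float], limit: float) -> Optional[float]:
    """Months after the last sample until the fitted line reaches `limit`.

    Returns None when the trend is flat or falling, and 0.0 when the
    fitted value for the last sample is already at or above the limit.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    current = intercept + slope * (len(values) - 1)
    if current >= limit:
        return 0.0
    return (limit - current) / slope


if __name__ == "__main__":
    # Hypothetical monthly average CPU utilization (%), oldest first.
    cpu_by_month = [50.0, 52.0, 54.0, 56.0, 58.0, 60.0]
    print(linear_trend(cpu_by_month))
    print(months_until_limit(cpu_by_month, limit=80.0))
```

With the sample data, utilization grows by two points per month, so the 80% limit would be reached about ten months after the last sample. A forecast like this is only as good as the retained history, which is one reason to document how long performance data is kept.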




5.4 Security management
Security management has the goal of maintaining the integrity of controls regarding who has access to what areas of the system and what is viewable or changeable. Security management takes a variety of forms including perimeter security (firewalls and site hardening), authentication/authorization (passwords and associated permissions), intrusion detection, and policy oversight. From an overall management viewpoint, the security management evaluation table has a series of questions that are designed to discover how prepared the security management team, tools, and processes are to handle the Web site being evaluated. Table 8 is an example of a completed security evaluation table.
Table 8.
Security evaluation table

Specific component Component details Evaluation
Team Who are the team members? Security team is in place with two members

What skills, experience, and training does the team possess? Security team is experienced and well trained in security products

What coverage is provided by the team? Is this adequate? Team works normal business hours. Team members are frequently paged for security alerts and emergency changes.

Are there any security measurements that can be used to evaluate the effectiveness of the team? If so, what data? Few measurements exist at this time. Security information is kept confidential and incidents are not disclosed to the public or customers.

What are the strengths and weaknesses of the team? Team understands products. Team is not always aware of demands placed upon them by new technology choices.

What other teams are key to the success of this team? Are there any issues? Team works well with other teams as needed.

Based on the above evaluation, what are the primary issues? What is the action plan to resolve? Action items include --
  • Recommendation: Consider a change in hours worked by team. Two 12-hour shifts might better align with Web site demands.

  • Recommendation: Improve communication so security team gets advance warning on security needs of new sites
Tools What is the primary tool for tracking and managing security activities? Electronic mail to security team leader

What are the strengths and weaknesses of this tool? At times, due to high volumes, some requests are missed

What other tools are used to manage security activities? Team uses web tools to get latest security updates and to perform searches

Evaluate and develop action plan to address deficiencies. Action items include --
  • Recommendation: Consider using a workflow tool instead of electronic mail so work requests are less likely to be overlooked
Process Is there a document that defines the organization's security management policies and procedures? Yes, well defined

Is this consistent with actual practice? If not, where are the gaps? Yes, security team handles all security-related activities for the site

Evaluate and develop action plan to address deficiencies. Action items include --
  • No process recommendations
Overall What reports are available to review security activity performance? Are they adequate? Some reports are available but are kept in confidence

What is the leading cause of security breaches? What is being done to address this? Some hacking has been targeted at the site but it has no serious exposures at this time

What is an acceptable level of security breaches? Is this measure being met? No exposures are tolerated as the site contains financial data. It is believed that no financial loss has happened with the site due to hacking.

What is the level of customer satisfaction with security management? If customer satisfaction is low what are the reasons why? The team works with auditors to supply the required logs and reports. Customer satisfaction is high.

Evaluate and develop action plan to address deficiencies. Action items include --
  • No overall recommendations




6 End user perspective
The focus of the end user perspective is on measurement, evaluation, and improvement. From an overall management viewpoint, the end user evaluation table has a series of questions that are designed to discover how well availability, performance, consistency, and reliability are being addressed. This perspective is focused on the measurement and improvement in the quality of the end user experience. Table 9 is an example of a completed end user evaluation table.
Table 9.
End user evaluation table

Specific component | Measurement | Evaluation | Improvement | Tool
Availability | Is the site reachable? | Based on problem records, it is available 95% of the time | Could benefit from an approach that proactively tests the site | Silk scripts created for deployment stress tests (see footnote 2)
Availability | Are the site's functions operational? | Too rich to test all application functions | Use a sampling approach to proactively test site functionality | Silk scripts created for deployment stress tests are usually a good sample
Performance | What is the overall response time of the site? | No measurement in place | At minimum, should have a performance dataset | Keynote Perspective could be used to establish basic performance measures (see footnote 2)
Performance | Do any functions exceed a maximum time to complete? | No measurement in place; service level objective handles availability, not performance | Should establish performance SLA | Keynote Perspective could be used to measure performance
Consistency | Are the content and values that are returned by the site consistent from moment to moment and consistent with the site design and configuration? | No measurement in place | Create Silk script that tests key consistency measures | Use Silk scripts that interact with the application exercising key consistency measures
Reliability | Is the content returned correctly? Are all links functional? Are all values returned correctly? | No measurement in place | Create Silk script that tests key reliability measures | Use Silk scripts that interact with the application exercising key reliability measures
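
The availability and performance rows of Table 9 can be made concrete with a small calculation. The following is a minimal sketch, not part of the methodology itself. It assumes that outage durations are taken from problem records and that response-time samples come from whatever probing tool is in use. The function names, the sample data, and the 8-second limit are hypothetical and chosen only for illustration.

```python
from statistics import mean


def availability_percent(outage_minutes, period_minutes):
    """Percentage of the period during which the site was reachable.

    outage_minutes: outage durations taken from problem records (assumed input).
    period_minutes: length of the reporting period.
    """
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    # Downtime can never exceed the length of the period itself.
    downtime = min(sum(outage_minutes), period_minutes)
    return 100.0 * (period_minutes - downtime) / period_minutes


def functions_exceeding(samples, max_seconds):
    """Return the site functions whose mean response time exceeds max_seconds.

    samples: mapping of function name -> list of response times in seconds.
    """
    return sorted(
        name for name, times in samples.items()
        if times and mean(times) > max_seconds
    )


# Hypothetical data for a 30-day reporting period.
period = 30 * 24 * 60                    # minutes in the period
outages = [600, 960, 600]                # three outages from problem records
samples = {
    "home page": [1.2, 1.4, 1.1],
    "account lookup": [7.5, 9.0, 10.5],
    "funds transfer": [3.0, 2.5, 3.5],
}

print(f"Availability: {availability_percent(outages, period):.1f}%")
print("Over limit:", functions_exceeding(samples, max_seconds=8.0))
```

With these sample values the outages total 2,160 minutes out of 43,200, which corresponds to the 95% availability figure used in the example table. A calculation of this kind is only as good as the problem records behind it, which is why the table recommends proactive testing as an improvement.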




7 Review report or presentation to complete the review
When the analysis is complete, a report or presentation should be created to share the findings and recommendations. We have found that some teams respond to a detailed written report whereas others require a presentation. The report or presentation should contain the following parts:
  1. Executive summary

  2. Detailed discussion and findings including a discussion of exposures and opportunities

  3. Recommendations
The report should also include two appendixes. Appendix A should contain the completed tables; this detail supports the findings and recommendations in the body of the report or presentation. Appendix B should contain a high-level implementation plan, which provides the link to the next phase of the project.
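
One way to keep the recommendations in the report body consistent with the completed tables in Appendix A is to derive the recommendations list directly from those tables. The sketch below is only an illustration under assumed names: the `Row` structure, the `collect_recommendations` helper, and the sample rows (adapted from the security evaluation table) are hypothetical and are not prescribed by the methodology.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    """One row of a completed evaluation table (hypothetical structure)."""
    component: str            # e.g. "Team", "Tools", "Process", "Overall"
    question: str
    evaluation: str
    recommendations: list = field(default_factory=list)


def collect_recommendations(tables):
    """Gather recommendations from all tables, grouped by table name.

    tables: mapping of table name -> list of Row.
    Tables with no recommendations are omitted from the result.
    """
    grouped = {}
    for name, rows in tables.items():
        recs = [(row.component, rec) for row in rows for rec in row.recommendations]
        if recs:
            grouped[name] = recs
    return grouped


# Sample rows adapted from the security evaluation table.
tables = {
    "Security evaluation": [
        Row("Team", "What coverage is provided by the team?",
            "Team works normal business hours.",
            ["Consider a change in hours worked by team"]),
        Row("Tools", "What is the primary tool for tracking security activities?",
            "Electronic mail to security team leader",
            ["Consider using a workflow tool instead of electronic mail"]),
        Row("Process", "Is there a documented security policy?",
            "Yes, well defined"),
    ],
}

for table, recs in collect_recommendations(tables).items():
    print(table)
    for component, rec in recs:
        print(f"  [{component}] {rec}")
```

Keeping the tables as the single source for the recommendations means that a change to a table row is reflected in the report without separate editing.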




8 Summary
This paper described a framework and supporting method for evaluating the management tools and methods of a new or existing web site. Three perspectives form the basis of the framework -- system, support, and end-user. The method can be used for proactive planning or in response to problem-solving activities. It involves three steps -- preplanning, analysis, and review. A number of tables support the analysis, and the findings and recommendations are presented in a report or presentation. Appendix B of the report contains a high-level implementation plan.

About the authors
Joseph Gulla is a Senior Consulting Information Technology Manager for IBM Global Services. Mr. Gulla works with customers of the IBM Web Hosting facility in Research Triangle Park, NC. Mr. Gulla has written a number of IBM internal publications and co-authored the IBM Redbook Distributed systems management design guidelines: The smart way to design (1996). Mr. Gulla has spoken at a number of technology forums including Planet Tivoli, ESM Sharenet, IBM Security Seminars, and the NorthEast Information Systems User Group where he served on the Advisory Board. He has held a number of technical and management positions over the last 22 years.

John Hankins is a Senior Architecture and Delivery Specialist on the IBM Global Services Web Offerings team. He has worked in the field of Internet services development and management for the past 10 years. The past 5 years have focused on the specific area of web hosting services. Mr. Hankins has presented at numerous conferences on Internet services and has served as a co-principal investigator on three National Science Foundation grants. He has held management positions in a variety of technology organizations over the past 15 years.

Footnotes
1. About the example tables: These tables do not contain information about a specific IBM web-hosting customer. The tables are also not representative of IBM's change, problem, performance, and security management policies. The data in the tables has been made up simply to illustrate the main points of the paper and the power of the methodology.

2. Copyright information: Tivoli and NetView are trademarks of Tivoli Systems, Inc; ESM is an abbreviation for Enterprise Security Manager which is a trademark of AXENT Technologies, Inc; Trend is a trademark of Desktalk, Inc; Silk is a trademark of Trinagy, Inc; Keynote Perspective is a product of Keynote Systems; All other products and company names are either trademarks or registered trademarks of their respective companies.





References

Berthold, M. & Brandner, R. (1993). Systems and network management in distributed environments. Research Triangle Park: International Business Machines.

Event management and notification - White paper. (2000). http://www.bmc.com/rs-bin/RightSite/getco. Accessed August 15, 2000.

Frick, V. (2000). Transforming the enterprise to embrace e-business. Presentation delivered on March 7, 2000 at IBM Managers meeting in RTP, NC.

Harikian, V., Blust, B., Campbell, M., Cooke, S., Foley, R., Gulla, J., Gayo, F., Howlette, M., Mosher, L., & O'Mara, M. (1996). Distributed systems management design guidelines: The smart way to design. Research Triangle Park: International Business Machines.

Introduction to ITIL Books. (2001). http://www.pinkelephant.com. Accessed April 11, 2001.

Leadership for the new millennium, delivering on digital progress and prosperity. The third annual report of the Electronic Commerce Working Group. (2001). http://www.ecommerce.gov/. Accessed April 11, 2001.

Patrol 2000 by BMC software. (2000). http://www.bmc.com/patrol. Accessed July 26, 2000.

Sturm, R. & Bumpus, W. (1999). Foundations of application management. New York: John Wiley & Sons.

Tivoli global enterprise manager instrumentation guide. (1998). Raleigh, NC: Tivoli Systems.

Tivoli product index. (2000). http://www.tivoli.com/products/index/. Accessed August 31, 2000.

U.S. White House. (1997). A framework for global electronic commerce. http://www.ecommerce.gov/framework.htm. Updated July 1, 1997. Accessed May 22, 2000.

Welter, P. (1999). Web server monitoring white paper. http://www.summitonline.com/apps-databases/papers/fhesh-man.html. Accessed February 9, 1999. Author's email: pete@freshtech.com.