NYU Home: Combining Internet Tools into Personal Digital Agents
David ACKERMAN <firstname.lastname@example.org>
The Internet allows for a custom-made world that teaches us to reinvent bureaucracies, even those of higher education which are rich in conservative tradition. Too often, information services are focused on the needs of the bureaucracy and ignore personal utility. This paper will describe a new application at New York University centered on the individual.
NYU Home creates personal digital agents that individual students and faculty can adapt to their particular needs. These digital agents search, organize, retrieve, and send information that is custom-designed to fit each student and faculty member.
NYU Home revolutionizes the way members of the NYU community get their information, making it one of the most useful and practical campus communication protocols. NYU Home's primary strength is its utility to the individual. Empowering individuals not only facilitates achieving the mission of an organization, but also allows the organization an opportunity to profile its customer and staffing needs and respond to a constantly changing marketplace.
NYU Home draws on the utility of basic Internet tools (authentication, e-mail, e-mail lists, Web and discussion groups). In this paper we describe combining and integrating them in creative ways, building a modular, scalable system: an SQL database containing user attributes; a kerberos and SSL-enabled Web server for encrypted authentication; e-mail notifications for those selecting the option; Web submission and presentation of both public and personal information; and access to e-mail lists and discussion groups selected by the system or individual. The architecture and process, while particular to a New York University project in progress, contain a number of examples of general interest, where the combining of Internet tools creates synergies resulting in powerful information services.
Agents and agency have been the object of study for centuries. They were first considered in the philosophy of action and ethics. In this century, with the rise of psychology as a discipline, human agency has been studied extensively. In the 1970s and 1980s, artificial intelligence research turned toward narrower studies of specific techniques, one of which we will deal with, called "personal digital agents" or "software personal assistants." One of the early founders of the idea of software personal assistants, Nicholas Negroponte, envisioned the use of what he called "digital butlers." They would do the work of various individuals performing everyday tasks by acting, for example, as a phone receptionist, a secretary, and a money manager.
The term "agent" came into popular use during the 1980s and 1990s in the database, operating system, and networking communities. An agent could perform a database transaction, fork a process, or contact a router. Agents -- proxies used to poll the state of an underlying entity -- were simply an interface to a system. 
According to IBM's "Intelligent Agent White Paper," intelligent agents are viewed as "software entities that carry out some set of operations on behalf of a user or another program with some degree of independence or autonomy, and in so doing, employ some knowledge or representation of the user's goals or desires." They are "technology that is making computer systems easier to use by allowing people to delegate work back to the computer. They help do things like find and filter information, customize views of information, and automate work."
As the number of technology research fields has proliferated, so has the number of agent-related terms. A confusing array of phrases has come into use: "intelligent agent," "personal assistant," "software agent," "interface agent," and the like. Definitions and terms have been modified to suit the distinct features of individuals' and organizations' agent models.
We take the same liberty with "personal digital agent." The key concept is the utility to the individual. NYU Home (or Home, for short) utilizes agents that personalize information retrieval and filtering based on individual profiles and desires. Pattie Maes, a professor at the MIT Media Lab, also stresses the importance of software agents providing active, personalized assistance to those using computer or information services.
In fact, most information services provided by organizations focus on the organization, not the individual. Most external Web sites of organizations focus on marketing or public relations. They are centered on the organization, not the person accessing the site.
Internal systems of the organization also fail to properly cater to the individual. For example, most organizations have some type of human resource database. When these databases are opened up to the employees in the organization, the only functions enabled are typically those that lessen the administrative burden of answering phones, filling out paperwork to change information, and other similar activities. Although this provides some utility to the employee, designing the system around the database rather than the individual creates a system that lacks substantial potential value to both the organization and the staff.
Even businesses providing services such as securities trading, where customers are given access to their accounts, frequently fail to focus on the individual. Each person is given the same screens, options, and defaults as every other customer, even after using the system for months. Why doesn't the interface adapt to the person using it, responding to preferences and habits?
The Internet allows for a custom-made world that teaches us to reinvent bureaucracies, even those of higher education which are rich in conservative tradition. NYU Home creates personal digital agents that students and faculty can adapt to their particular needs. These digital agents search, organize, retrieve, and send information that is custom-designed to fit each student and faculty member.
When one discusses the architecture of a computer system, one typically starts with the computers, the operating system, or the software. At the heart of NYU Home is the notion of individual -- each person is unique and is going to use NYU Home with different interests and different objectives in mind. When considering those accessing the system, we needed to consider that faculty and students at a large university use computers in their offices, classrooms, and dorm rooms and at public kiosks, computer labs, and at home. It was desirable to create a way for them to visit Home without noticing a difference based on their location.
NYU Home is organized around groups and categories of information. Several categories fall into logical groups. An example of a category is the global category, focusing on information about international programs, resources, and activities. Global groups include African Studies (Black culture, literature, history, and politics) and Economics and Business (international trade policies, economic integration, aid policies, international marketing, and finance). Other categories include functional departments, various activities, and the like.
Each group has an owner or set of owners, as does each category. Group owners can create and delete categories within their group as well as submit, delete, and approve items of information (for example, events) to any of those categories. Similarly, category owners can administer items of information within categories they own. They choose the categories that information belongs in and decide how to distribute the information. Examples of the means of distributing the information are pushing it as e-mail into a list, putting it into a Web page, or submitting it to the appropriate calendar(s). Groups and categories can have mailing lists, discussion groups, chat areas, and calendars. Individuals can customize Home to access the items of interest for which they are authorized, and they can select various methods of receiving information -- for example, on the Web, in daily e-mail, or in a weekly e-mail summary.
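The ownership rules above can be sketched as a small data model. This is a minimal illustration in Python, not the NYU Home implementation (which this paper describes as Perl scripts over an SQL database). The class names, the `can_administer` helper, and the sample NetIDs and names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    owners: set = field(default_factory=set)   # NetIDs of category owners
    items: list = field(default_factory=list)  # approved items, e.g. events


@dataclass
class Group:
    name: str
    owners: set = field(default_factory=set)        # NetIDs of group owners
    categories: dict = field(default_factory=dict)  # category name -> Category

    def create_category(self, netid, name):
        # Only group owners may create categories within their group.
        if netid not in self.owners:
            raise PermissionError(f"{netid} does not own group {self.name}")
        self.categories[name] = Category(name)
        return self.categories[name]

    def can_administer(self, netid, category_name):
        # Group owners administer every category in the group;
        # category owners administer only the categories they own.
        category = self.categories[category_name]
        return netid in self.owners or netid in category.owners

    def submit_item(self, netid, category_name, item):
        if not self.can_administer(netid, category_name):
            raise PermissionError(f"{netid} may not post to {category_name}")
        self.categories[category_name].items.append(item)


# Hypothetical sample data.
group = Group("Sample Group", owners={"abc123"})
group.create_category("abc123", "Sample Category")
group.categories["Sample Category"].owners.add("xyz789")
group.submit_item("xyz789", "Sample Category", "Guest lecture")
```

Keeping the permission test in one place (`can_administer`) means that distribution steps such as pushing an item to a list or a calendar can all rely on the same check.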
The overall architecture is designed to be completely modular. Our philosophy is to build using pieces that we can "snap in" and "snap out" when it suits our purposes. In fact, we have changed operating systems, database software, Web servers, discussion group software, mailing list software, and chat servers -- all without difficulty. This makes adding new (or eliminating) services quite simple. By integrating the basic Internet tools in creative ways, synergies result in powerful information services.
All members of the NYU Community are given a "NetID," printed on their NYU ID card, upon arriving at the University. Using the NetID, one can activate an Internet account and set a password (we actually call them "passphrases" and enforce certain attributes on them to encourage more secure passwords). These information flows are encrypted using SSL.
The password is stored in the central authentication service (actually a group of servers) -- a kerberos server. Kerberos was selected because it is scalable and robust. It is available free of charge from MIT.
NYU operates a very large modem pool, and it was there the kerberos system was first inaugurated. Next, the kerberos authentication was added to several of the shell account machines. Then we began to use it on the Web servers for services requiring authentication.
Although a full discussion is beyond the scope of this paper, having a single sign-on throughout an organization's computing facilities is a great service to individuals. While there are many legacy systems at the University that do not presently use the central authentication service, we believe the convenience to the individual is so great that over time, most technology requiring authentication will eventually choose to use this method.
If a person who is entering for the first time has entered a valid NetID and password but has not established a profile, Home returns the profile wizard. Even without using the wizard, a person is given a default page based on what the system already knows about him or her. For example, NYU Home knows the students' current classes. The discussion group, list, and chat area for each class are automatically linked in. A personal calendar is seeded with his or her schedule of classes, exams, deadlines, and more. Faculty members by default have access to departmental collaborative work and research areas, Web areas for the classes they are teaching, administrative tools for discussion groups, e-mail lists, chat areas, class e-mail lists, and the like.
If the person has established a profile, Home will return the customized page based on the profile and then check to see if the person is authorized to use any special services of the system. For example, if the person has permission to submit events, he or she will also be given those options on his or her customized page. The NetID and password follow the person throughout his or her session to maintain security. Furthermore, referrers are checked to make sure the person is not manipulating hidden form fields when submitting information. We are quite concerned with system security.
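The page-selection and referrer checks described above might look roughly like the following. This is a sketch in Python under stated assumptions, not the production code: the host name `home.example.edu`, the function names, and the shape of the `profiles` and `permissions` mappings are hypothetical, and authentication is assumed to have already succeeded.

```python
from urllib.parse import urlsplit

TRUSTED_HOST = "home.example.edu"  # hypothetical host name


def choose_page(netid, profiles, permissions):
    """Return (page, extra_options) for an already-authenticated NetID."""
    if netid not in profiles:
        # First visit: no profile yet, so offer the profile wizard.
        return "profile_wizard", []
    # Profile exists: show the customized page plus any special services
    # (for example, event submission) the person is authorized to use.
    extras = sorted(permissions.get(netid, ()))
    return "custom_page", extras


def referrer_is_trusted(referer_header):
    """Accept a form submission only if it came from a page on our own host."""
    if not referer_header:
        return False
    return urlsplit(referer_header).hostname == TRUSTED_HOST
```

A Referer header can be forged by a determined client, so a check like this complements server-side validation of every submitted field rather than replacing it.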
At NYU, we have experience with many of the most popular Web servers. Our first central Web server, which began operation in 1994, was from NCSA. Recently, we have been quite pleased with the speed, features, and flexibility of the free Apache Web server. When we adopted kerberos as our central authentication function, we added a kerberos module for our Apache server.
When we wanted to support SSL, we added Stronghold servers to our site. Outside the USA, one can use the free Apache SSL server instead of Stronghold. The kerberos module we were using for Apache fit right into Stronghold, as the architecture of the two servers is virtually identical except for the SSL functionality in Stronghold.
Wherever possible, we prefer to use Perl  because it is
Our basic performance criteria are adopted from Jakob Nielsen's research on Web site speeds. Our goal is that each Web page will be in use by the person requesting it within 10 seconds of the request over a 28.8 kbps modem. We have found that the more serious performance bottlenecks in this type of application occur in the database query turnaround times rather than in the choice of CGI or of programming language.
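A back-of-the-envelope check shows what the 10-second goal implies for page size. The sketch below is illustrative arithmetic in Python, not a measurement. It treats the modem as delivering 28,800 bits per second and ignores compression, protocol overhead, and latency; the function name and the `overhead_seconds` parameter are hypothetical.

```python
def page_budget_bytes(link_bps, seconds, overhead_seconds=0.0):
    """Upper bound on the bytes that fit in the time budget.

    `overhead_seconds` sets aside time for work that is not transfer,
    such as a database query (an assumption of this sketch).
    """
    usable_seconds = max(seconds - overhead_seconds, 0.0)
    bits = link_bps * usable_seconds
    return int(bits // 8)


full_budget = page_budget_bytes(28_800, 10)
budget_after_query = page_budget_bytes(28_800, 10, overhead_seconds=2)
```

Under these assumptions the whole page, images included, has to fit in roughly 36,000 bytes, and every second spent waiting on a database query removes about 3,600 bytes from that budget.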
The primary use of our in-house programming is to integrate various existing software packages to suit our purposes. The packages are written in various languages and we are normally not concerned with the language of the packages.
We also prefer to keep our code portable. It is advantageous to be able to change from one hardware vendor to another. In fact, we started development using Digital UNIX, and we have been able to switch among Digital UNIX, Solaris, Linux, and others with virtually no effort. Perl facilitates this flexibility.
The actual database management package could be any of a number of popular SQL implementations. This is because we use the Perl DBI (Database Independent Interface). Using this system, one can write programs using one database and then switch the underlying database package by installing a new "driver." Our pilot was done with PostgreSQL, which is a highly robust free SQL relational database package. We are changing to Oracle as we move into the next phase of the project. However, one can use many other packages, a number of which are free. Here is a list of database packages currently known to work with DBI:
Perl scripts integrate the various software facilities. Once a person is authenticated, his or her page is displayed. This initial page is not generated on the fly from the database every time an individual logs in; rather, it is generated only when there have been changes. A CGI program that allows the editing of an individual's pages can be invoked to write changes and generate a new page.
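The idea of regenerating a page only when something has changed can be illustrated as follows. This is a minimal sketch in Python rather than the Perl used in the project; the file layout (`<NetID>.html` in one directory), the page template, and the function names are hypothetical.

```python
import tempfile
from html import escape
from pathlib import Path


def render_page(netid, choices):
    # Stand-in for the real page template; escapes user-supplied text.
    items = "\n".join(f"<li>{escape(choice)}</li>" for choice in choices)
    return (
        f"<html><body><h1>Home for {escape(netid)}</h1>\n"
        f"<ul>\n{items}\n</ul></body></html>\n"
    )


def regenerate_if_changed(page_dir, netid, choices):
    """Write the static page only if its content changed; return True if written."""
    path = Path(page_dir) / f"{netid}.html"
    page = render_page(netid, choices).encode("utf-8")
    if path.exists() and path.read_bytes() == page:
        return False  # nothing changed, keep the existing static file
    path.write_bytes(page)
    return True


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    results = [
        regenerate_if_changed(demo_dir, "abc123", ["Weather"]),
        regenerate_if_changed(demo_dir, "abc123", ["Weather"]),
        regenerate_if_changed(demo_dir, "abc123", ["Weather", "Sports"]),
    ]
```

In the demonstration, the first and third calls write the file and the second leaves it untouched, so a login between edits is served from the static file without touching the database.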
The database is central to the Home software. We use modified entity-relationship diagrams, some of which are presented below, to describe the tables in the system and how the tables are related to one another. The field lists for some representative database tables are in boxes. The Key fields are in italics. The Table Names are bold. Where a field refers to another table, the reference is listed after the field name in the form of table.field. In the sections below, we describe some of the tables of interest and we include only a portion of the data elements.
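To make the table.field convention and the driver-independent access concrete, here is a rough analogue in Python. It uses Python's DB-API with the standard-library `sqlite3` driver in place of Perl DBI, PostgreSQL, or Oracle, so it illustrates the idea rather than the project's code. The table names, column names, and sample rows are hypothetical and are not taken from the NYU Home schema.

```python
import sqlite3


def connect():
    # The one driver-specific call. Moving to another database would mean
    # changing this function and re-checking placeholder style and SQL dialect.
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    return conn


SCHEMA = """
CREATE TABLE people (
    netid TEXT PRIMARY KEY,
    name  TEXT NOT NULL
);
CREATE TABLE categories (
    category TEXT PRIMARY KEY,
    owner    TEXT NOT NULL REFERENCES people(netid)  -- i.e. people.netid
);
"""


def categories_owned_by(conn, netid):
    # Parameterized query: the value is passed separately from the SQL text.
    cur = conn.cursor()
    cur.execute(
        "SELECT category FROM categories WHERE owner = ? ORDER BY category",
        (netid,),
    )
    return [row[0] for row in cur.fetchall()]


conn = connect()
conn.executescript(SCHEMA)
conn.execute("INSERT INTO people VALUES (?, ?)", ("abc123", "Sample Person"))
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [("Sample Category B", "abc123"), ("Sample Category A", "abc123")],
)
owned = categories_owned_by(conn, "abc123")
```

Only `connect()` knows which database is in use; the query function depends on the generic connection and cursor interface, which is the property that makes switching database packages a matter of changing the driver.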
Mailing lists are used not only for discussions, but also for announcements. Lists can exist for a variety of purposes, including use by classes for academic discussions, by administrative units for coordinating their business, and by groups of individuals interested in various categories of information -- a subset of the same categories around which Home is structured.
One can subscribe to a list by simply checking a box in the profile wizard. After subscribing to a list that is one of the Home categories, one can expect to receive, in addition to the list traffic, targeted announcements and information -- items pushed into the list by category and group owners.
There are behind-the-scenes programs running for list creation, administration, and the like. Some of the tables that we use in managing lists follow.
Events are stored in the events table. There is an association between lists and events, such that upcoming events are announced to particular mailing lists. In this way, individuals get information through e-mail only about events of interest. Each day, a program mails information about the rich, diverse group of events around the University.
The programs that interact with the events tables include tools to add, delete, or edit event records, tools to manage access to the tables, and cron software to automatically e-mail notices.
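The association between events and lists can be sketched as a function that groups upcoming events by the mailing list that should announce them. This is an illustrative Python sketch, not the cron software itself: the record fields, the list name, and the sample events are hypothetical, and actually sending the mail is left out.

```python
from collections import defaultdict
from datetime import date


def build_digests(events, list_for_category, today):
    """Return {mailing list name: digest text} for events on or after `today`."""
    digests = defaultdict(list)
    for event in events:
        if event["date"] < today:
            continue  # skip events that have already happened
        mailing_list = list_for_category.get(event["category"])
        if mailing_list is not None:
            digests[mailing_list].append(event)
    return {
        name: "\n".join(
            f"{e['date'].isoformat()}  {e['title']}"
            for e in sorted(items, key=lambda e: (e["date"], e["title"]))
        )
        for name, items in digests.items()
    }


# Hypothetical sample data.
events = [
    {"title": "Guest lecture", "category": "Sample Category", "date": date(1998, 4, 2)},
    {"title": "Past concert", "category": "Sample Category", "date": date(1998, 3, 1)},
    {"title": "Unlisted fair", "category": "Other", "date": date(1998, 4, 3)},
]
list_for_category = {"Sample Category": "sample-category-l"}
digests = build_digests(events, list_for_category, date(1998, 4, 1))
```

Because each digest is keyed by list, a person receives e-mail only about events in the categories whose lists he or she has joined.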
Information about people is uploaded from other University systems on a daily basis. Much of this information is confidential to some degree; thus, we pay significant attention to system security.
Our classes section allows each student to access pertinent information and resources about his or her classes and to participate in the online class discussions. It allows the faculty to better communicate with the class. Each class is given a collaborative work area -- the tools of which continue to evolve -- in support of the educational experience.
The class facilities let the system know which classes there are, who is in them, and who instructs them. Each class can have a selection of network support facilities such as a mailing list, discussion group, chat, and Web area. The programs that interact with class tables import the enrollment information and allow the instructor(s) to edit class information, such as the URL.
Some specialized information comes in the form of feeds. For example, news wires, sports scores, and weather services are fed into an HTML page or an applet. Information is kept in the database that allows the placing of this onto a person's page as requested.
Each person's profile is kept in an HTML page that is created both from information in the database and selections made by the person. The page is regenerated at least each night and in response to update requests when new choices are made in the profile wizard. In addition, the location of items on the page can be intelligently updated based on the results of log scans.
Through the profile wizard, after a person is authenticated via kerberos, his or her NetID becomes available to the page update programs. This allows the system to find and edit the person's profile, filling in the information variables with data that has been requested or changed. A browser can then be set to point to one's NYU Home page as a startup page.
There are a number of possible alternative page-loading mechanisms, including SSI, PHP, and various proprietary Web scripting mechanisms. Other than the content that is "real-time" (see below), we have chosen to keep the pages as static files in order to maximize the system performance and minimize the number of database accesses.
Some of the content for Home, such as weather and sports, is "real-time." These feeds can be obtained in a server-push multi-part display that comes in a frame. For example, the weather forecast is presented as a graphic which is automatically drawn by system software based on data in the weather feed. It is possible to handle other data in a similar way, and it works with most graphical browsers. Server-side include techniques are also used so that text-only browsers can handle the information.
Our current list server software is ListProc from CREN. However, our tests have shown that LISTSERV or the free package Majordomo is also fine. It doesn't take a powerful computer to run this service. We found that Majordomo on a 200 MHz Pentium running Linux and sendmail can deal with at least 160,000 messages per day. LISTSERV software running on the NetSpace.org Linux machine delivers 200,000 messages per day. Our ListProc implementation is currently running on a DEC Alpha and uses a PMDF mail server.
We offer discussion groups using Lotus Notes Domino. However, free packages such as WWWboard by Matt Wright will also work just fine. We also offer local newsgroups upon request, but that facility is not as easy to use as the Web-based boards. Discussion groups are linked in automatically, based on information Home knows (e.g., a student is enrolled in a class, or a faculty member is part of a department), or by choice from the profile wizard.
We also are planning to offer each class a chat service. The package Every Chat is a free Perl-based local chat room. IRC chat server software is also free of charge, and we have tested the free WWW-IRC Gateway. This allows a Web browser to be used in place of an IRC client. Chat groups are linked to Home in the same way as discussion groups.
The first version of any software project must, by necessity, be limited in the features offered. Subsequent versions allow the developers to add requested and useful functionality and indeed, we have a number of options we want to add. In addition, we intend to utilize additional technical methods. For example, XML seems to hold substantial promise for our work.
There are several simple functions we will be adding. A "bookmark" function will allow a person to include his or her favorite links directly on his or her page. An integrated e-mail function will display e-mail directly from Home. In addition, a personal research agent is being designed which allows a person to have multiple sources searched for his or her declared interests.
One piece of future work involves adapting the interface for each individual by analyzing logs. A simple addition is to rearrange choices on the page based on how often each one is used. A more complicated analysis would look for recurring sequences of actions. For example, if someone frequently uses Home in the same way each time, say first by checking for e-mail, then going to a discussion group, then exiting, we would want to offer that usage path by default.
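Both kinds of log analysis can be sketched in a few lines. This is a hypothetical Python illustration of the planned feature, not existing NYU Home code; the log format (pairs of session identifier and action name) and the function names are assumptions.

```python
from collections import Counter


def order_by_usage(choices, log):
    """Most-used choices first; ties and unused choices keep their original order."""
    counts = Counter(action for _, action in log)
    return sorted(choices, key=lambda c: -counts[c])  # sorted() is stable


def most_common_path(log):
    """Return (path, count) for the most frequent per-session action sequence."""
    sessions = {}
    for session_id, action in log:
        sessions.setdefault(session_id, []).append(action)
    paths = Counter(tuple(actions) for actions in sessions.values())
    return paths.most_common(1)[0] if paths else None


# Hypothetical log: (session id, action), in time order.
log = [
    (1, "email"), (1, "discussion"),
    (2, "email"), (2, "discussion"),
    (3, "calendar"),
]
choices = ["calendar", "discussion", "email", "weather"]
reordered = order_by_usage(choices, log)
favorite_path = most_common_path(log)
```

Using a stable sort keeps the page layout predictable: items move only when their usage counts actually differ.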
Another project is to analyze various information that is accumulating as the natural result of use of the system. For example, how many employees are fans of the New York Knicks (a local basketball team)? If this number is large, perhaps a University event could include discounted tickets to a Knicks game. What are the interests of the current students, and how can they be used to make the educational experience more satisfying for them? How can these profiles be used as a marketing tool to attract prospective students?
Each of the Internet tools we are using is useful by itself. It is the combination of these tools that creates powerful information services. It is possible to use entirely free software to do this. From the operating system to the database to the Internet software, some of the best options are the free ones.
The services should be built centered on the individual. Organizations that fail to employ systems centered on the individual are missing two types of opportunities. First, they fail to provide the best possible services to customers, members, employees, etc. In the case of the university mission, this means supporting the teaching, learning, research, and operation of the institution.
Second, as the personal digital agents are used, a rich environment of information becomes available for the agent creators. The preferences and habits of the individual are of value not only for feedback into the system in order to improve it, but also for marketing the University to potential students and faculty and for a great variety of other possibilities. Creating an information service centered on the individual allows an organization to profile its customer and staffing needs and respond to a constantly changing marketplace.
Special thanks to Kristina Abeson, Alison Kraskey, and Grace Moon for their valuable assistance.