[INET'98] [ Up ][Prev][Next]

The FARNET States Inventory Project: Using Distributed Maintenance to Create and Maintain a Comprehensive Public Database of Subnational Information Infrastructure Planning and Development

Casey LIDE <casey@farnet.org>
FARNET
USA

Abstract

Much of the early discussion surrounding the creation of a US National Information Infrastructure (NII) occurred at a national level, with a focus on national issues and national participants. By the mid-1990s, however, it became increasingly clear that much of the work in building and regulating the NII would occur at a local and state level, with local and state actors. No forum existed to facilitate the efficient exchange of information among these subnational builders of the NII, and state policy makers had no reasonable opportunity to learn from and cooperate with their counterparts in other states.

Begun in 1995, the FARNET States Inventory Project is a three-year National Science Foundation (NSF)-funded initiative to provide a comprehensive, publicly accessible online resource for tracking information infrastructure development and strategies in each of the fifty states. With over 120 categories for each state, the scope of the Project (located at http://www.states.org) presented several opportunities for innovation, the most important of which was the implementation of a novel methodology for creating and maintaining a cost-effective, Web-oriented database: "distributed maintenance." A primary value of the Project is its attempt to test that concept; in that sense the Project is an experiment in information science.

With the "distributed maintenance" methodology, the States Inventory Project is an attempt to create a very valuable resource as well as a database architecture with great potential for cross-application (the most obvious being its usefulness for other nations developing information infrastructures).

This paper is a "lessons learned" report from the States Inventory Project Team. It has three areas of focus:

  1. The viability of the original hypothesis: a large-scale, comprehensive-yet-detailed, publicly accessible database can be created and maintained in a decentralized, "distributed" manner, requiring very little central administrative cost relative to the benefits of the resource. As an experiment for this hypothesis, the States Inventory Project hopes to prove it correct.
  2. Potential cross-applications of the distributed maintenance database architecture, including i) its use on a subnational level for other nations with developing infrastructures; ii) its potential multi-dimensionality (can states do the same thing on a county-by-county level?); and iii) substantive areas other than information infrastructure where a similar distributed maintenance methodology may be suitable.
  3. Notable obstacles and key features of the initiative which would be relevant to similar attempts in the future, such as the public relations campaign required to solicit a high degree of distributed participation; the balance in Project central administration between laissez-faire and quality control; the development of a long-term, collaborative "tele-project," where the primary means of communication is e-mail; technical innovations developed for the distributed maintenance database architecture; the "matrix" presentation of database content (facilitating comparative analysis); the development of a custom search engine; and other such topics.

It should be noted that the form and content of this paper depend to a large degree on events occurring within four to six months from the date of this writing. An accurate evaluation of the potential for success of the Project cannot be made until after the end of 1997, and accordingly the emphasis given to each of the three major subjects above will likely be in flux up to the point of final writing.

I. Introduction

The FARNET/NSF States Inventory Project is based on three premises:

  1. The development and implementation of a National Information Infrastructure (NII) can be accomplished more efficiently when accompanied by a mechanism that enables a higher degree of sharing, comparison, and exchange of ideas and experience. The builders of information infrastructure must be able to learn from each other, collaborate with each other, and act in concert when appropriate.
  2. The primary builders of information infrastructure act and are regulated at a sub-national level (e.g., states, provinces, territories, cities). A mechanism for tracking the use and development of an NII must therefore take a sub-national approach.
  3. Information technology provides the tools for the efficient creation of a mechanism for tracking the development of an NII -- providing an opportunity for the efficient and productive exchange of ideas and experience -- at a sub-national level.

The States Inventory Project (located at http://www.states.org) fosters the development of the NII by providing a single, standardized clearinghouse for tracking state information infrastructure planning and activities. Through a publicly accessible, Web-oriented database, the States Inventory Project provides an opportunity for state policymakers, academics, and information technology professionals to (1) efficiently conduct comparative analyses across all of the states in any of the 100+ discrete categories in the Project database; (2) efficiently organize their own state's activity, and retrieve all of it at one time for a helpful "snapshot" of state progress and initiatives at any given time; and (3) control their portion of the database (their state) locally, through the States Inventory Project administrative interface (enabling ongoing "distributed maintenance" of the database by a large number of persons across the country).

This paper will first discuss FARNET's motivation to launch the States Inventory Project in 1995. It will move on to set forth exactly what the Project has accomplished, and how it has met the original needs for which it was created, through the implementation of a matrix design for the Web-oriented database. The "distributed maintenance" theory behind the Project will then be discussed in some detail (including lessons learned and a somewhat preliminary evaluation of the Project's viability). The paper will conclude with some of the broader lessons learned and suggestions for how certain aspects of the Project might be adopted in other contexts.

II. Background

In the early 1990s, much of the debate concerning the development of information infrastructure in the United States occurred at a national level, with national actors and a national focus. By 1995, it became clear that while the national focus was certainly appropriate in many instances, the actual building of the National Information Infrastructure (NII) would occur largely at a state or local level (and, of course, the users of the NII are primarily state and local actors). Not only would the states themselves play a large role in promoting connectivity through major state-funded networking efforts, but in many instances state regulatory bodies would also be the primary regulatory force acting upon private-sector commercial networking entities.

In spite of this realization, it is no exaggeration to say that the states were proceeding to build their portions of the NII largely in a vacuum. Continuing with the usual, pre-Information Age model of insularity, the states had little opportunity or incentive to learn from the activities of other states, which were grappling with many of the same issues created by the explosion of information technology and networking.

The primary obstacle to learning from other states was the lack of a clearinghouse -- or even a starting place -- for finding state-level information. The cost to any single state of a comparative analysis of other states' information infrastructure environment for a given topic (distance learning, for example) would simply outweigh the benefits (which in any event would likely be restricted to the narrow purpose for which the study was undertaken, if the analysis had to be done from scratch each time). With the growth of the World Wide Web, this problem is mitigated only partially; while the information is increasingly available online, it tends to be presented in dissimilar formats, using inconsistent terminology and very different organization schemes (something of which the Project Team is well aware). This lack of standardization creates difficulty in both locating and assimilating the data. Clearly, the existence of the Web alone does not make comparative analysis economically feasible.

FARNET approached the National Science Foundation at the end of 1995 with a proposal for the creation of a "meta-resource" that might solve some of these problems, while simultaneously providing a mechanism for national and international actors to more easily follow the development of the NII in the United States. To help execute the Project, FARNET engaged two other organizations, which together with FARNET compose the States Inventory Project Team. ECLIPS (Electronic Commerce Law and Information Policy Strategies), a policy organization housed at the Ohio Supercomputer Center, was responsible for the substantive research duties of the Project; Keith Harmon directed its participation as Research Director for the Project. The Arizona State University (ASU) Web Development Team, specifically Izydor Gryko and Rob Kubasko, was responsible for the database architecture and graphical user interface.

While a number of persons have played a role in the development of this Project on behalf of FARNET, ECLIPS, and ASU, Keith, Izzy, Rob, and the author make up the core group and have worked closely together (mostly via e-mail) to bring the Project to fruition.

III. The States Inventory Project

At the time of this writing, the States Inventory Project has roughly one year of development left. Even so, it is possible for us to draw a number of fairly solid conclusions about what has been -- and what will be -- accomplished by the effort.

A. The "matrix" concept

From the start, the States Inventory Project Team has sought to design the resource in a manner that maximizes the value of the data contained within it. An underlying theme of the Project -- and a major value of the effort -- has been the determination of how best to provide a means for the efficient extraction of knowledge from the mass of disparate data available on the Web. The National Science Foundation in 1997 embarked on a large-scale program it called "Knowledge and Distributed Intelligence" (KDI), the goal of which is to promote the development of new information technologies and applications that facilitate the discovery of knowledge from distributed datasets such as the Web. The States Inventory Project is, essentially, an early experiment in KDI.

As mentioned above, the Project has relied on a "matrix" model for its Web-oriented database, which adds another dimension to the knowledge that may be produced from a single set of data. Although the concept is simple, the Project Team is aware of no other database design like it on the World Wide Web. The matrix design is easily conceptualized (although it is not presented this way on the site): At the top of the grid are the states, which govern the corresponding columns below. Along the side are the 100+ Project categories, organized into a hierarchy of 12 top-level categories (discussed further below). Each state has the same categories, and each category has a spot in each of the states. This allows data retrieval:

  1. By state. A user may select a state, which then presents him or her with the categories. The user may then select which category he or she would like to peruse within that state. In terms of the matrix, this would be the selection of a vertical column.
  2. By category. A user may select any of the 100+ categories, retrieving information in that category across all of the states (retrieving, for example, 50+ examples of distance learning funding strategies). In the matrix, this would be equivalent to selecting a horizontal row.

It is worth noting that much of the World Wide Web (and certainly the many collections-of-links) could be characterized as providing either vertical or horizontal accessibility, under the matrix concept. Vertically accessible databases would contain information encompassing several categories, but within one jurisdiction. State governments' presence on the World Wide Web is one general example. (As noted earlier, this approach is generally effective for monitoring activities in that jurisdiction, but totally inefficient for any sort of comparative analysis.) Horizontally accessible databases, on the other hand, would include information from several jurisdictions but would cover only one discrete category or topic (such as "telemedicine projects in the United States"). This approach is perfect for comparative analysis; it standardizes the information (by categorizing it as "telemedicine projects") and puts it all in one place.

The States Inventory tracks a number of categories, across a number of jurisdictions; it enables access both horizontally and vertically. Accordingly, it can be used (1) as an excellent tool for the monitoring of information infrastructure activities within a single state/jurisdiction and (2) as a tool for efficient comparative analysis across all of the states in any of the 100+ categories.
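
The two access paths can be illustrated with a minimal Python sketch. It is not the Project's actual database schema, which the paper does not describe; the class name InventoryMatrix, its methods, and the sample headlines are all hypothetical. Each cell is addressed by a (state, category) pair, so selecting a column or a row is simply a filter on one half of the key.

```python
from collections import defaultdict


class InventoryMatrix:
    """Toy model of the state-by-category grid (hypothetical, for illustration)."""

    def __init__(self):
        # Each cell is addressed by a (state, category) pair and holds a list of entries.
        self._cells = defaultdict(list)

    def add(self, state, category, headline):
        self._cells[(state, category)].append(headline)

    def by_state(self, state):
        """Vertical access: every populated category for one state (a column)."""
        return {cat: items for (st, cat), items in self._cells.items() if st == state}

    def by_category(self, category):
        """Horizontal access: one category across all states (a row)."""
        return {st: items for (st, cat), items in self._cells.items() if cat == category}


# Sample data is invented purely to exercise the two retrieval paths.
matrix = InventoryMatrix()
matrix.add("Ohio", "Distance Learning Projects", "Hypothetical Ohio distance learning pilot")
matrix.add("Ohio", "Telemedicine Projects", "Hypothetical Ohio telemedicine network")
matrix.add("Arizona", "Distance Learning Projects", "Hypothetical Arizona distance learning grant")

print(sorted(matrix.by_state("Ohio")))                        # categories populated for Ohio
print(sorted(matrix.by_category("Distance Learning Projects")))  # states with entries in that row
```

Because both views read from the same cells, an entry added once is available for single-state monitoring and for cross-state comparison alike.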

B. Content

The Project tracks information infrastructure activities and strategic planning at state-level granularity. The database contains over 100 categories, organized under the following top-level categories (a complete listing of Project categories is available in the Appendix):

  • Communications-Related Demographics
  • Education
  • Government's Advanced Telecommunications Usage
  • Laws and Regulations Online
  • Local Competition/Deregulation
  • Local Level/Community Projects
  • Online Delivery of Services
  • State Regulation of Telecommunications/Internet
  • State-Wide Communications Infrastructure
  • Strategic Planning
  • Telecommuting/Telework
  • Universal Service/Universal Access
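
As a rough illustration of how such a hierarchy might be flattened into the rows of the matrix, the Python sketch below stores a small excerpt of the Appendix as nested dictionaries and lists every category that has no subcategories. The nested-dictionary representation, the function name leaf_paths, and the assumption that only the lowest-level categories hold entries are ours; the paper does not say how the hierarchy is stored.

```python
# Excerpt of the category hierarchy, copied from the Appendix. The nested-dict
# representation is an assumption made for illustration only.
CATEGORIES = {
    "Telecommuting/Telework": {
        "State Employee Telecommuting Programs/Reports": {},
        "Comprehensive Telecommuting Programs/Reports": {},
    },
    "Local Competition/Deregulation": {
        "Acts and Regulations": {},
        "List of Local Exchange Carriers in the State": {
            "List of Incumbent Local Exchange Carriers (ILECs)": {},
            "List of Competitive Local Exchange Carriers (CLECs)": {},
        },
    },
}


def leaf_paths(tree, prefix=()):
    """Yield the full path to every category that has no subcategories."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_paths(children, path)
        else:
            yield path


# Each leaf path would correspond to one row of the matrix under this assumption.
rows = [" > ".join(path) for path in leaf_paths(CATEGORIES)]
for row in rows:
    print(row)
```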

An entry in the database consists of (1) a hypertext "headline" linked to a document on the World Wide Web, (2) a short abstract describing that document, including keywords associated with it (enabling preliminary review of the contents, and providing grist for our search engine), (3) a "last modified" date stamp (ensuring timeliness of content -- the contributor will automatically receive an e-mail when the entry reaches a certain age), and (4) the name of the contributor of the entry, linked to an on-site directory of registered contributors. A key feature of the States Inventory Project is the fact that all of this may be entered through an intuitive interface -- no knowledge of HTML coding is necessary -- by any person who registers as a contributor. This supports the Project's role as an experiment in the theory of "distributed maintenance."
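
The four parts of an entry, and the age check behind the reminder e-mail, might be modeled as in the following Python sketch. The field names, the 180-day threshold, and the sample entries are assumptions for illustration; the paper says only that a reminder is sent when an entry reaches "a certain age."

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Assumed threshold: the paper specifies only "a certain age".
MAX_AGE = timedelta(days=180)


@dataclass
class Entry:
    headline: str        # (1) hypertext headline
    url: str             # (1) document the headline links to
    abstract: str        # (2) short description of the document
    contributor: str     # (4) name shown with the entry
    last_modified: date  # (3) "last modified" date stamp
    keywords: list = field(default_factory=list)  # (2) terms for the search engine

    def is_stale(self, today):
        """True when the entry is older than the assumed reminder threshold."""
        return today - self.last_modified > MAX_AGE


def contributors_to_remind(entries, today):
    """Names of contributors who have at least one stale entry."""
    return sorted({e.contributor for e in entries if e.is_stale(today)})


# Invented sample entries; the URLs and names are placeholders.
entries = [
    Entry("Hypothetical state telecom plan", "http://example.org/plan",
          "Placeholder abstract.", "A. Contributor", date(1997, 6, 1), ["planning"]),
    Entry("Hypothetical K-12 network report", "http://example.org/k12",
          "Placeholder abstract.", "B. Contributor", date(1998, 1, 15), ["K-12"]),
]

print(contributors_to_remind(entries, today=date(1998, 3, 1)))  # ['A. Contributor']
```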

C. "Distributed maintenance"

"Distributed maintenance" is a novel method for creating and maintaining a Web-based database. From the beginning of this Project, the Project Team realized that an attempt to create a comprehensive database of this sort would require a substantial effort, and that it would be very difficult to create and maintain with a centralized approach using centralized Project labor. At the outset, the Project Team hypothesized that the active participation of a substantial number of contributors could enable the creation and maintenance of a large, yet detailed, database with a minimum of centralized labor. This Project is an experiment in that concept, and in that sense is an experiment in information science.

After creating a Web presence and database structure in the matrix format explained above, the Project Team designed an administrative interface for use (potentially) by a large number of contributors across the country. Through this forms-based interface an entry can be made directly to the database, with very little or no moderation by the Project Team. No knowledge of HTML is required (although it can be used for links and formatting within the entry itself).

This open methodology for adding to the database posed a number of issues relating to security and database integrity. We introduced several mechanisms designed to mitigate potential problems:

  1. Each contributor must register with the Project, selecting the state he or she would like to administer and providing (at a minimum) a name, e-mail address, professional affiliation, a Project username, and Project password.
  2. Submission of the registration form does not provide access to the administrative interface; it is merely a communication to the Project Team. The Project Team actually enters the contributor's information into the database, giving us an opportunity to "screen" potential contributors. Once a contributor has been entered into the database, he or she receives confirmation from the Project Team that his or her username and password are enabled.
  3. To reach the administrative portion of the site, the contributor must submit the proper username and password.
  4. When the contributor submits a proper username and password, he or she is taken only to the administrative interface for the state he or she selected during the registration process.
  5. Unless the contributor obtains prior approval from the Project Team, he or she is limited to the administration of a single state.
  6. The name of the contributor is included at the bottom of each entry he or she submits, which provides some measure of accountability. The name links to the "contributor directory," a page listing the names and e-mail addresses of all registered contributors.

While not fail-safe, these mechanisms certainly have been successful thus far in maintaining the integrity of the database and the quality of its content.
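
The registration and login safeguards listed above can be sketched in Python as follows. This is a hypothetical model, not the Project's implementation: the class ContributorRegistry, its method names, and the sample account are invented, and the salted password hashing reflects present-day practice rather than anything the paper describes.

```python
import hashlib
import hmac
import os


def _hash_password(password, salt):
    # Salted hashing is a present-day precaution, not something the paper describes.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class ContributorRegistry:
    """Hypothetical model of the registration and login safeguards."""

    def __init__(self):
        self._pending = {}   # submitted forms awaiting review by the Project Team
        self._approved = {}  # accounts the Project Team has enabled

    def register(self, username, password, name, email, affiliation, state):
        """Mechanisms 1-2: submitting the form grants no access by itself."""
        salt = os.urandom(16)
        self._pending[username] = {
            "name": name,
            "email": email,
            "affiliation": affiliation,
            "state": state,  # mechanism 5: a single state per contributor
            "salt": salt,
            "password_hash": _hash_password(password, salt),
        }

    def approve(self, username):
        """Mechanism 2: the Project Team screens, then enables the account."""
        self._approved[username] = self._pending.pop(username)

    def login(self, username, password):
        """Mechanisms 3-4: return the contributor's state, or None if rejected."""
        account = self._approved.get(username)
        if account is None:
            return None
        candidate = _hash_password(password, account["salt"])
        if not hmac.compare_digest(candidate, account["password_hash"]):
            return None
        return account["state"]

    def can_edit(self, username, password, state):
        allowed = self.login(username, password)
        return allowed is not None and allowed == state


registry = ContributorRegistry()
registry.register("jdoe", "s3cret", "J. Doe", "jdoe@example.org", "Example Agency", "Ohio")
print(registry.can_edit("jdoe", "s3cret", "Ohio"))     # False: not yet approved
registry.approve("jdoe")
print(registry.can_edit("jdoe", "s3cret", "Ohio"))     # True
print(registry.can_edit("jdoe", "s3cret", "Arizona"))  # False: wrong state
print(registry.can_edit("jdoe", "wrong", "Ohio"))      # False: wrong password
```

Keeping pending and approved accounts in separate stores mirrors the screening step: a registration cannot become a login until a member of the Project Team moves it across.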

Distributed maintenance, to be successful, requires the active participation of a substantial number of persons. The Project Team spent considerable time devising strategies and amassing contact information for the best potential contributors around the country. Over the course of the Project, we compiled a database of about 500 prospective contributors, selected from relevant state agencies, academics, other information technology professionals, and association members. In October 1997, we sent invitations (via e-mail) to approximately 100 carefully selected persons describing the Project, asking them to participate, and inviting them to help beta test our new contributor interface. After October, we periodically sent out similar mailings, and as of this writing we have completed the "affirmative contact" portion of our effort to solicit participation.

The response has been positive. We took special care to avoid any implication that we were "spamming," and received very few negative responses (mostly from persons who expressed concerns about the amount of time they had available to spend on the Project). Of the original 100, approximately 25 registered as contributors. Since October, our list of registered contributors has grown to over 75.

While we have had a good response in terms of registered contributors, the amount of content actually provided by these persons has been less than expected. The majority have not added any information at all. Some have added only one or two items. A few have done very well, providing a substantial amount of content for their state over the course of a few months. We were somewhat surprised by the disparity between the fairly large number of registered contributors and the relatively low degree of active participation (to this point).

While several factors may contribute to the current lack of active participation, we believe two are dominant, based on communications with registered contributors. Fundamentally, there is a lack of incentive for contributors to spend any amount of time volunteering to provide content, despite the fact that it takes no more than five minutes to write and submit any one entry. In addition to the lack of an obvious incentive, contributors simply may not have understood what sort of content the Project hoped to include. The database was essentially an empty, untested architecture for a resource; it was not a useful resource yet.

In an attempt to counter both problems, the Project Team began a concentrated, centralized effort at the beginning of 1998 to provide "starter" content for the Project database. To a degree, this is an effort to test the hypothesis that the Project will be useful. We believe that the inclusion of more information gradually will prove its utility, allowing it to draw visitors based on usefulness rather than just curiosity. We expect at some point that the content will reach a critical mass of high utility, high traffic, and substantial word-of-mouth. Once that is realized, we believe that potential contributors will recognize the value in providing data for the resource, that they will then play a more active role in providing content and maintaining the database, and that the Project will, to some degree, "snowball" into a self-perpetuating, self-maintaining resource. If that occurs, then the distributed maintenance theory will have been proven to be a viable and very powerful tool for creating and maintaining a large and detailed database in a cost-effective manner.

IV. Larger lessons, other applications

A number of lessons have been learned during the course of the Project, some of which appear to be quite fundamental and should prove to be very valuable as similar knowledge extraction (or KDI) efforts are undertaken in the future. The experience has led us to a few broad conclusions. First, the database architecture and basic category structure of the Project could very easily be adapted for use in other jurisdictions (particularly for nations that have not yet developed their NII to the degree that the United States has). Second, the matrix format is powerful. Using a database architecture very similar to the Project's, effective meta-resources could be created in any number of other contexts (whether or not the developers choose to rely on distributed maintenance).

A. Adaptation of the project for other nations developing an NII

In a sense, this Project was started too late. The United States had a head start in developing information infrastructure before this Project was launched, and we have been playing catch-up. For other nations that are just beginning (or have yet to begin) to develop an NII, the Project could prove even more valuable.

Again, a key component of the Project is that it tracks at a "sub-national" level. The matrix design and category structure could be easily adapted to track information infrastructure in any jurisdiction that can be subdivided. Instead of states, the top of the matrix could list countries in sub-Saharan Africa, prefectures within Japan, or even cities within a European nation. Developing nations, perhaps more than any others, can capitalize on the efforts expended for design and implementation of this Project in the U.S.

It is also worth noting that distributed maintenance may be more effective in other environments. For the Project in the U.S., we approached persons with whom we had no prior contact and simply relied on volunteer time to help fill out an untested resource. The Project's application in other environments (1) may have a more available and willing core of distributed participation, perhaps not all volunteer; (2) will have the U.S. example to point to; and (3) will have the discrete lessons learned in the U.S. effort available (such as the time investment required).

B. Application of the architecture to other substantive areas

The design of the Project lends itself to application in other substantive areas where a similar meta-resource might be appropriate. Again, the top of the matrix could include any number of jurisdictions. Along the side can be any categories, not just those relating to information infrastructure. Possible applications could include:

  • laws, by topic
  • marketplace categorizations for electronic commerce
  • employment listings
  • government information, by type of service (utilities, health, benefits, etc.)
  • medical information, by discipline and geography

While the format lends itself to a centralized effort at locating and cataloging information, distributed maintenance conceivably could be used in other substantive areas as well (perhaps even more effectively than in the States Inventory Project).

C. The future of the project

The Project Team envisions the further refinement of the Project, and has taken steps to gradually implement such technologies as Web robots, artificial intelligence, translators, and dedicated search engines in furtherance of the Project goal (comprehensiveness, with efficient and effective retrieval). We also envision the Project Team playing a role in the adaptation of the Project to other jurisdictions and to other substantive areas, as the need arises.

Appendix: The States Inventory Project categories

Communications-related demographics

  • Population
  • Percentage of Population Rural
  • K-12 and Internet Statistics (Number, Proportion, Use)
  • Other Communications-Related Demographic Information

Education

  • K-12
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Special Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Higher Education
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Special Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Distance Learning
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Distance Learning Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Research and Education Networks
    • Network Home Pages
    • Special Projects and Test Beds
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues

Government's advanced telecommunications usage

  • Chief Information Officers' Home Page
    • Cabinet Level
    • Other
  • Home Pages of Related Agencies, Task Forces, and Commissions
  • Special Projects
  • Studies, Plans, and Reports
  • Year 2000 Problem Information
  • Funding Issues
  • Other Issues

Laws and regulations online

  • Bill Tracking and State Legislation Home Pages
  • State Code Online
  • State Constitution Online
  • State Regulations Online
  • State Courts and Case Law Online

Local competition/deregulation

  • Acts and Regulations
  • List of Local Exchange Carriers in the State
    • List of Incumbent Local Exchange Carriers (ILECs)
    • List of Competitive Local Exchange Carriers (CLECs)
  • Special Projects
  • Studies, Plans, and Reports
  • Other Issues

Local-level/community projects

  • Local Level/Municipal/Community Activities
  • Local Rural Access/Universal Service Initiatives
  • Studies, Plans, and Reports
  • Funding Issues
  • Other Issues

Online delivery of services

  • Telemedicine
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Telemedicine Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Telejustice
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Telejustice Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Libraries
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Library IT Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Kiosks
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Kiosk Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Benefits Distribution
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Benefits Distribution Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Emergency Services
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Emergency Services Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues
  • Other Services
    • Home Pages of Related Agencies, Task Forces, and Commissions
    • Test Beds and Special Projects
    • Studies, Plans, and Reports
    • Funding Issues
    • Other Issues

State regulation of telecommunications/Internet

  • Laws Enabling State Regulation of Telecommunications/Internet
  • Telecommunications/Internet Acts and Regulations
  • Home Pages of Public Utility Commissions, Related Agencies, and Task Forces
  • Public Utility Commission Rule Making Online
    • Orders
    • Dockets
    • Publications and Reports
  • Reform of Public Utility Commissions
  • Special Projects
  • Studies, Plans, and Reports
  • Other Issues

Statewide communications infrastructure

  • Network Architectures and Connectivity Maps
  • Commercial and Non-Profit Providers in the State
    • Wireline
      • Local Exchange Carriers (LECs)
      • Inter-Exchange Carriers (IXCs)
      • Cable Company Delivery of Telephony
      • Other Wire Providers
    • Wireless
      • Cellular
      • Radio
      • Satellite/Microwave
      • Other Wireless Providers
    • Interconnection Agreements
    • Electric Utility Delivery of Telephony
    • Internet Service Providers (ISPs)
      • Public
      • Private
      • Other ISPs
    • Other Providers
  • Test Beds and Special Projects
  • Studies, Plans, and Reports
  • Funding Issues

Strategic planning

  • Strategic Planning Documents
    • Comprehensive Telecommunications/Internet Plans
    • Universal Service Plans
    • Local Competition/Deregulation Plans
    • Public Utility Commission Reform Plans
    • Government Plans for Advanced Telecommunications
    • Education Plans
    • Online Delivery-of-Services Plans
    • Funding Plans
    • Telecommunications Infrastructure Development Plans
    • Other Plans
  • Implementation of Strategic Plans
  • Assessment of the Planning and Implementation Process

Telecommuting/telework

  • State Employee Telecommuting Programs/Reports
  • Comprehensive Telecommuting Programs/Reports

Universal service/universal access

  • Home Pages of Related Agencies, Task Forces, and Commissions
  • Funding and Cross-Subsidization Information
  • State Definition of Basic Telephone Service
  • High Cost and Rural Access Activities, Programs, and Task Forces
  • Special Projects
  • Studies, Plans, and Reports
  • Other Issues
