Lessons Learned from the Early Adoption of URNs in an Intranet Environment

Pedro CUENCA <pcuenca@ieee.org>
Vicente SOSA <vsosa@anaya.es>
Julián ROMERO <jromero@anaya.es>
Israel HERNANZ <ihernaz@anaya.es>
Grupo Anaya, SA
Spain

Abstract

This paper presents our experiences in the development and deployment of a standards-based uniform resource name (URN) resolver service. Although there is not yet an approved standard mechanism that client applications can use to resolve URNs, we have nevertheless been able to implement and deploy a resolver that has been tested inside the controlled context of an intranet environment. This approach has enabled our company to enjoy the anticipated benefits of a URN infrastructure without having to wait until the standardization process freezes and commercial solutions are made available. Based on this experience, this paper argues that the potential for URNs is tremendous, even if their use must presently be confined to the limits of corporate intranets.

The extensive use of URNs to represent complex document trees and Web-based applications has allowed us to identify a number of features and services that we feel are missing in the current set of proposed URN standards. Therefore, another important contribution of the paper is the discussion of some of these requirements, such as the need for a standard set of administrative services or the necessity of introducing levels of indirection in the definition of URNs. We hope that standardization bodies and interested parties can benefit from the insights gained by our experience and that some of these ideas can be taken into account in future initiatives.

State of the art
The implementation of a corporate URN resolver
Resolver extensions: When standards are just not enough
Successful deployment in an intranet environment
- Use environments
- Deployment methodology and preliminary results
Conclusions and next steps
References

State of the art

Identifiers and addressing in the Web

From the very first stages of Web development, work on mechanisms to refer to Web resources has recognized the importance of a flexible, general, and uniform addressing model [1,2,5]. Request for comments (RFC) 1630 [1], for example, dated June 1994, proposes "a unifying syntax for the expression of names and addresses of objects" and introduces the notion of universal (now "uniform") resource identifiers or URIs as the basic addressing abstraction. Because the use of several types of URIs was anticipated, RFC 1630 tries to define a common syntax for all of them, and goes as far as to distinguish both URLs (uniform resource locators) and URNs (uniform resource names) as honorable members of the URI family.

Recent revisions of that early work [2] and of the design principles on which the Web model rests continue to concede the same importance to addressing. As a matter of fact, the recognition that uniform resource characteristics (URCs) are also another type of URI [3] has helped raise the consideration of identifiers to one of the two fundamental abstractions of the so-called Web model [4]. The other basic abstraction, the resource, is itself usually defined in terms of identifiers: A resource is basically anything a Web identifier can refer to, a definition that imposes no further restrictions on any aspect of resources.

Nowadays, URIs are considered to comprise three different types of identifiers: URLs, URNs, and URCs:

URLs, or uniform resource locators, are identifiers that specify the physical location where a resource is available; in other words, they contain explicit information about how the resource can be accessed. This information includes the communications protocol, the name of the machine, and the name of the file where the resource is stored in a digital format.
URNs, uniform resource names, are symbolic names that are assigned to resources by authoritative organizations. A URN is not related to the physical location(s) where the resource is stored, and the name must not include any type of information about the resource itself. These requirements are a consequence of the intention to use URNs as long-lasting references to resources. URNs, therefore, are expected to convey an idea of persistence, in the sense that a URN, once assigned, continues to be a valid reference to the resource, even if it is moved to another location or the responsibility of its maintenance changes to another organization. The urn: URI scheme [2] has been reserved [1] to accommodate technological infrastructure support for URNs once they are standardized.
URCs, uniform resource characteristics, are sets of attributes (possibly in the form of key-value pairs) that state facts about the resource they refer to. The consideration of URCs as one of the basic URI families recognizes the importance that metadata (i.e., data about data) will have in determining which resources are most appropriate to present to a particular user in a certain context at a given time. If, for example, the resource pointed to by a URN is available at several URLs, URCs could be employed by the user agent to automatically select the closest one. Or they can provide information about the suitability of a resource for a child, just the way PICS ratings work. Efforts on metadata representation, like the RDF (resource description framework) standard [20], will provide the building blocks for the widespread adoption of URCs.

Even though work on URN standardization and the support of the urn: scheme reportedly started by 1993 [1], URLs are still the only addressing mechanism available in the Web. But referencing a resource by the physical location where it happens to be stored poses several questions about robustness and efficiency. Of course, the immediate mapping of files stored in servers to URLs was central in the rapid diffusion of the Web, as it eased the way for the hassle-free incorporation of lots of services and pieces of information. But the need for a symbolic, persistent, higher-level addressing mechanism is still a key feature that has to be realized for the Web to become a reliable medium.

General advantages of and considerations about URNs

The availability of a persistent naming scheme is one of the most important reasons that justify effort spent on URNs. Before proceeding any further, it is worth noting that the requirements for persistence cannot be guaranteed by technology alone, but by a social commitment of the organizations. Some organizations can, indeed, guarantee the persistence of certain URLs they maintain, such as the address of their own home pages. Even if that's the case, it is also important to realize that technological support for location- and even organization-independent names is a convenient goal to pursue, for it can pave the way for the blossoming of identifiers that show a greater longevity than that of machine names and addresses.

Another important issue that must be taken into account goes beyond the technological provision of persistence to its perception by end-users. The explicit support of the urn: scheme will undoubtedly help users determine that those identifiers have received some type of endorsement about persistence and reliability.

From a more practical point of view, the indirect mapping that translates names to locations to resources brings about a number of valuable consequences that will improve the reliability and robustness of the Web. Because a single URN can map to multiple URLs, the decision as to which URL to use can be deferred until access to the resource is required. This way, if one of the URLs is not available at a certain time, another one can be chosen with no user intervention, which makes it straightforward to achieve high-availability systems. The implementation of a URN resolver can also perform a round-robin algorithm for the selection of equivalent URLs, which translates into a simple and scalable solution to load balancing.
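The selection strategy described above can be sketched as follows. This is only an illustration under stated assumptions, not the code of any particular resolver: the function `makeRoundRobinSelector`, the `isAvailable` callback, and the example host names are all hypothetical.

```javascript
// Hypothetical sketch: rotate over the equivalent URLs of one URN and
// skip any location that the caller reports as unavailable.
function makeRoundRobinSelector(urls) {
  let next = 0; // index of the URL to try first on the next call
  return function select(isAvailable) {
    for (let i = 0; i < urls.length; i++) {
      const candidate = urls[(next + i) % urls.length];
      if (isAvailable(candidate)) {
        // Start from the following URL next time to spread the load.
        next = (next + i + 1) % urls.length;
        return candidate;
      }
    }
    return null; // no location is currently reachable
  };
}

// Example mapping with invented host names.
const select = makeRoundRobinSelector([
  'http://www1.example.org/doc',
  'http://www2.example.org/doc',
]);
const allUp = () => true;
const firstChoice = select(allUp);   // first location
const secondChoice = select(allUp);  // second location
// Simulate an outage of www1: the selector falls back to www2.
const fallback = select((url) => url.indexOf('www1') === -1);
```

Because the choice is made at resolution time, neither the rotation nor the fallback is visible to the client, which only ever refers to the URN.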

Because of their commitment to persistence and lower volatility, URNs are better suited than URLs to become integrated with URCs and metadata systems. The I2C (URI to URC) service that some URN resolvers are expected to provide [10] will allow clients to retrieve information about a resource before it is accessed. This situation contrasts with the current use of URLs as the sole identifying mechanism, which requires the physical retrieval of the resource in order to find out such simple data as encoding format, length, or content. From a Web computing perspective, URLs provide no guarantee of service, as the only way to know whether a resource is available is to try to fetch it.

URN standardization

Work on URN standardization is mostly being carried out by the IETF (Internet Engineering Task Force) [6]. The main goal of the URN working group is the definition of a framework for the assignment and resolution of URNs.

The need for URNs and even the basic functional requirements they must meet were identified very early in the process of Web standardization [1,7]. Therefore, a general agreement has been achieved on basic technical areas such as URN syntax [9] or the set of resolution services that can be performed on URNs [12]. Work on resolution has advanced following the lines of using DNS for resolver discovery [8,10] and HTTP (hypertext transfer protocol) as a communications protocol that is expected to be supported by most resolvers [11]. Resolver discovery using DNS has been recently submitted to the Standards Track.

Although the technical guidelines have been set up in a reasonable way, the assignment procedures are still under discussion. Documents that specify URN namespace definition mechanisms or assignment procedures for the resolution of URIs using DNS are in draft status and evolving rapidly.

URN implementations

As pointed out in the preceding paragraphs, the basic technical requirements for URN resolution have already been agreed upon to a sufficient level of detail [9,11,12]. The use of DNS for resolver discovery, however, is still preliminary, and at this writing (February 1999) there is not yet a general-purpose standalone resolver implementation. The URN working group does provide a set of scripts that illustrate the implementation of resolution services, but they are an ad-hoc solution tailored to the needs of an experimental namespace for IETF documents [13].

A number of proposals try to reuse the existing infrastructure to provide URLs with a commitment to persistence. WIRE (W3 identifier resolution extensions) [15] and PURL (persistent uniform resource locator) [16] use the standard HTTP redirection mechanism to introduce levels of indirection in the URLs. The Handle system [17,18] employs a proprietary hdl: URI scheme that must be resolved by a browser plug-in or a special HTTP proxy server. Although WIRE is still an experimental specification, both PURL and the Handle system are supported by their respective organizations and can be used to create URLs that map to other URLs. The idea of persistence stems from the fact that there is an institutional commitment to maintain the newly created URLs, although the mapping is allowed to change. As valuable as these services are, they are nonetheless limited in scope. They are successful in isolating users from changes in the names of the machines where their documents are stored, but they don't support the urn: scheme and don't try to provide resolution services other than a simple indirection.

The implementation of a corporate URN resolver

Requirements

The rationale that led us to build a corporate URN resolver lies in the nature of our organization. Grupo Anaya is a publishing company that produces both printed media and interactive online applications, with a special focus on educational products. Text, illustrations, photographs, maps, and all sorts of material make up a huge database of content that must be revised very frequently to be adapted to the requirements of education regulations and to the specific features of the products that are designed. These resources, however, are not maintained in a central repository database; instead, they are stored, in an unstructured format, in small departmental servers controlled by the different editorial groups in the company. Rather than trying to enforce a single solution for the organization and storage of data, our company is always looking at new ways to improve the traditional workgroup approach of editors to their work. From this point of view, it was recognized that the Web model is the only existing infrastructure that, despite some inconveniences, is working now as a bona-fide extensible, heterogeneous, distributed repository. The current efforts on addressing and on the integration of metadata rest on ideas that could also be applied to our heterogeneous working groups. In this context, the use of URNs was seen as a first step that would allow our company to achieve globally unique identifiers for all of the resources we maintain. Later on, a metadata model would provide the means to classify, relate, and browse through resources, thus improving reuse and helping build a consistent corpus of material.

The use of URNs was also important from the point of view of interactive online applications. All too often, we had encountered the problem of designing and maintaining complex Web-based applications that integrate data from different sources, to the point that even the most insignificant typographical change had to undergo a complete software engineering life-cycle to make its way to the user interface pages. URNs would help minimize the impact of technology by making applications independent of the physical locations where the diverse data and services reside. By using URNs to hide protocols and addresses, the implementation of Web-based applications could be more easily broken up into small independent pieces. Every one of these pieces is assigned a public URN identifier that allows access from other components in the system. This gave rise to the design and implementation of a Web-based computing architecture that has improved the efficiency and reliability of our applications, although its description is beyond the scope of this paper.

To sum up, these general requirements called for a name resolver that had to be:

Easy to access. Web browsers were planned to be the user interface tools that editors would use to access the resource repository, so the resolver had to be reachable from that environment. Access from the distributed components that make up our Web-based applications was also required.
Extensible, so that it is easy to experiment with and new functionality can be added.
Based on URN standards, to guarantee interoperability of our applications with other resolvers whenever they are made available.
As efficient as possible.

Some implementation details

To be able to achieve the high levels of interoperability, extensibility, and flexibility required, the implementation is based on Internet standards and protocols. This way, HTTP is used as the communication protocol, which makes it easy for many types of clients to access the resolution services. The resolver itself is implemented as a Java [22] servlet [23], and thus can run on most Web servers and computing architectures.
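As an illustration of how a client might address such a resolver over HTTP, the sketch below builds a request using the "/uri-res/<service>?<uri>" path convention of RFC 2169 [11]. That our resolver exposes exactly this path is an assumption made for the example, and the host name `resolver.example.org` and the helper `buildRequestPath` are hypothetical.

```javascript
// Build the request path for a resolution service, following the
// "/uri-res/<service>?<uri>" convention of RFC 2169.
function buildRequestPath(service, urn) {
  return '/uri-res/' + service + '?' + urn;
}

// The host name below is a placeholder, not a real resolver address.
const resolverHost = 'resolver.example.org';
const path = buildRequestPath('I2L', 'urn:anaya:/apps/search');
const requestLine = 'GET ' + path + ' HTTP/1.0';
const requestUrl = 'http://' + resolverHost + path;
```

Since the request is an ordinary HTTP GET, any client able to fetch a URL can call the resolver without additional libraries.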

The URN to URL mapping is stored in a lightweight directory access protocol (LDAP) [24] repository, which provides for maximum scalability and excellent performance, especially for read access. If LDAP is not available, the file system can also be used as a repository. And because the resolver sports a modular architecture, other repositories can easily be added as well.

The output data can be presented in several common formats suitable for Internet use. The resolver supports the text/uri-list media type [12], but can also provide data in HTML (hypertext markup language), XML (extensible markup language)[19], and even in JavaScript [24,25]. Other formats can be added with ease.

The resolver is not restricted to a single namespace; on the contrary, it can be used to resolve several namespaces, provided that certain conditions are met.

The namespace

The syntax of the URN namespace used in our intranet conforms to the guidelines given in RFC 2141 [9], with the additional constraint that the namespace-specific string (NSS) must begin with a slash. To date, anaya: has been used as the namespace identifier, but it will have to be changed to x-anaya to conform to the latest recommendations on experimental namespaces [14].

Query strings

Because some of our URNs must be used to identify application code, there must be a way to encode arguments so that they can be adequately received and processed by the software components. The generic URI syntax [2] specifies that the question mark character can be used to signal the beginning of a scheme-specific query string. The URN syntax recommendation [9], however, states that the question mark is a reserved character, because discussion is needed before query strings can be standardized for URNs. To work around this situation, question marks and ampersand signs are allowed in the anaya: namespace, but they must appear in their escaped form (%3F and %26, respectively).

Apart from these technical details, the most important point is that query strings are recognized in the anaya: namespace. The name resolver must be able to identify the delimiters and look for the mapping of the base string (i.e., without the query string). Any arguments that appear in the original query are appended to the URL(s) on the fly. If, for example, a mapping exists from urn:anaya:/apps/search to http://www3.anaya.es/search, an I2L operation [12] on the string urn:anaya:/apps/search%3Fkey=value will return the following URL: http://www3.anaya.es/search?key=value.

Alternatives were contemplated before support for query strings was implemented in the name resolver. The most immediate solution is to have clients implement the corresponding logic, and use query strings only when working with URLs. Although this was cumbersome, the most important reason not to follow that approach was that URN aliases are supported by the resolver, which allows for the definition of very general services that can be referenced more conveniently. Consider, for example, the following URN definitions:

    urn:anaya:/apps/search - http://www3.anaya.es/search
    urn:anaya:/apps/searchByAuthor - urn:anaya:/apps/search%3Fitem=author
    urn:anaya:/WorksOfCervantes - urn:anaya:/apps/searchByAuthor%3Fauthor=cervantes

When a client requests urn:anaya:/WorksOfCervantes, it resolves to the following URL:

    http://www3.anaya.es/search?item=author&author=cervantes

However, there's no need for the user to know that urn:anaya:/WorksOfCervantes is related in any way to urn:anaya:/apps/search; in fact, the mapping of the former URN can be changed at any time in a completely transparent way. But for this feature to be possible, query strings must be supported by the resolver.
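The interplay of aliases and query strings in this example can be sketched as below. This is a minimal illustration rather than the resolver's actual code: the function names, the use of a `Map` as the repository, and the `MAX_DEPTH` guard against alias loops are assumptions, and only the upper-case escapes %3F and %26 are recognized.

```javascript
const MAX_DEPTH = 16; // arbitrary guard against alias loops (assumption)

// Append an escaped query string to a URL, using "?" or "&" as needed.
function appendQuery(url, query) {
  if (query === '') return url;
  const plain = query.split('%26').join('&');
  return url + (url.indexOf('?') === -1 ? '?' : '&') + plain;
}

// Resolve a URN to a URL, following aliases and carrying query strings.
function resolve(urn, table, depth = 0) {
  if (depth > MAX_DEPTH) throw new Error('alias chain too long: ' + urn);
  const pos = urn.indexOf('%3F');
  const base = pos === -1 ? urn : urn.slice(0, pos);
  const query = pos === -1 ? '' : urn.slice(pos + 3);
  const target = table.get(base); // look up the base string only
  if (target === undefined) return null;
  const resolved = target.startsWith('urn:')
    ? resolve(target, table, depth + 1) // the mapping is itself an alias
    : target;
  return resolved === null ? null : appendQuery(resolved, query);
}

// The three definitions given in the text.
const table = new Map([
  ['urn:anaya:/apps/search', 'http://www3.anaya.es/search'],
  ['urn:anaya:/apps/searchByAuthor', 'urn:anaya:/apps/search%3Fitem=author'],
  ['urn:anaya:/WorksOfCervantes',
    'urn:anaya:/apps/searchByAuthor%3Fauthor=cervantes'],
]);
const worksUrl = resolve('urn:anaya:/WorksOfCervantes', table);
const simpleUrl = resolve('urn:anaya:/apps/search%3Fkey=value', table);
```

Arguments are appended while the recursion unwinds, so those attached closest to the final URL come first in the resulting query string.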

About the legibility of URNs

Although the examples in this paper show legible URNs that closely resemble the structure of hierarchical URLs, caution must be observed when those names are used. As a general rule, encoding any type of user-related information inside the URN string is discouraged because it limits its longevity. There are situations, however, when a legible encoding is indeed helpful.

Our approach is to use machine-generated names for those applications where user interface tools are available. That is the case for resources created by our editorial teams. On the other hand, we currently use human-readable URNs for application components, simply because lots of URN references appear inside the code and it would be otherwise impossible to follow. As the engineering team is much smaller than the editorial one, it is easier to identify general components that can be assigned legible names with a reasonable expectation for durability.

Standard resolution services

The most frequently used resolution services specified in RFC 2483 [12] have been implemented:

I2L: URI to URL
I2Ls: URI to URLs
I2R: URI to Resource
I2C: URI to URC. Please refer to explanation below.

At this moment, unimplemented resolution services are:

I2Rs: URI to Resources
I2N: URI to URN
I2Ns: URI to URNs
I=I: Is URI equal to URI

The I2C service

Currently, the I2C service is used only for a very specific purpose: the retrieval of dependencies among software components.

As pointed out in the requirements section, URNs are to serve as identifiers of software components and code modules. One of the features supported by our Web-based computing architecture is the possibility of dynamically loading and executing a certain piece of code, which is identified by a URN. This is a concept similar to that of Java classloaders [22] but without language constraints: Any dynamically bound language can be used, currently including Java and JavaScript, although other scripting languages can be incorporated in the future. URCs are used to describe the code dependencies that must be taken into account: Typically, the execution of a module will require other modules to be present in the run-time environment.

When a client requests the loading of a resource, its MIME (multipurpose Internet mail extension) type is checked to find out if it is code to be dynamically executed. If so, the I2C service is invoked, and that operation returns a list of dependencies that must be loaded before execution of the module. Although quantitative analyses have not been performed, this is an innovative use of URNs and URCs to solve some of the problems of distributed computing.
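The loading sequence just described might look like the following sketch. The `resolver` and `runtime` objects are hypothetical stand-ins for the real services, the MIME types listed are assumed values, and the dependency list of the example module is abridged.

```javascript
// MIME types treated as dynamically executable code (assumed values).
const CODE_TYPES = new Set(['application/x-javascript', 'application/java']);

// Load a resource; if it is code, load its dependencies first.
function loadResource(urn, resolver, runtime) {
  if (CODE_TYPES.has(resolver.mimeTypeOf(urn))) {
    // Assumption: I2C lists every required module, deepest dependency first.
    for (const dep of resolver.i2c(urn)) {
      if (!runtime.loaded.has(dep)) runtime.load(dep);
    }
  }
  runtime.load(urn);
}

// Stub resolver and run-time environment used only for this illustration.
const stubResolver = {
  mimeTypeOf: () => 'application/x-javascript',
  i2c: (urn) =>
    urn === 'urn:anaya:/js/collection/Collection'
      ? ['urn:anaya:/js/support/support', 'urn:anaya:/js/support/RangeMap']
      : [],
};
const runtime = {
  loaded: new Set(),
  order: [], // records the order in which modules were loaded
  load(urn) {
    this.loaded.add(urn);
    this.order.push(urn);
  },
};

loadResource('urn:anaya:/js/collection/Collection', stubResolver, runtime);
```

After the call, `runtime.order` holds the two support modules followed by the requested module, so every dependency is present before the module runs.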

URCs are intended to be used for the description of resources from several points of view. This requires the specification of a suitable metadata model and architecture, which is currently being worked out.

Output formats

RFC 2169 [11] provides some basic ideas about the formatting of resolver data when HTTP is the communications protocol in use. Thus, the new media type text/uri-list is given as an example of URI codification. Likewise, an HTML layout for representing URIs is also suggested. These formats, however, are very simple and, at the same time, difficult to parse automatically because of the lack of standard tools.
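For reference, a client-side reader for text/uri-list could be as small as the sketch below, which relies on the two conventions of that media type: one URI per line, and comment lines starting with "#". The function name `parseUriList` and the sample response are hypothetical.

```javascript
// Parse a text/uri-list body: one URI per line, "#" lines are comments.
function parseUriList(body) {
  return body
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '' && line[0] !== '#');
}

// Invented sample response for an I2L request.
const sample = '# urn:anaya:/apps/search\r\nhttp://www3.anaya.es/search\r\n';
const uris = parseUriList(sample);
```

A flat list like this carries the URIs themselves but gives no place to attach further structure, such as the distinction between locations and dependencies.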

The URN resolver we implemented follows those guidelines, but introduces two other codifications: XML and JavaScript.

XML

XML was chosen because of its extensibility and the wealth of tools that can recognize and parse this format. Extensibility will presumably become more and more important as the standardization on metadata and URCs progresses. To date, though, the coding is completely straightforward. This is an example of how the results of an I2C operation are returned in XML*:

    <I2C URN="urn:anaya:/js/collection/Collection">
        <LOCATION>
            <URI>http://anduin.anaya.es/machina/js/collection/Collection.js</URI>
        </LOCATION>
        <NEEDS>
            <URI>urn:anaya:/js/support/support</URI>
            <URI>urn:anaya:/js/support/RangeMap</URI>
            <URI>urn:anaya:/js/collection/CollectionInterface</URI>
            <URI>urn:anaya:/js/collection/CollectionDelegate</URI>
            <URI>urn:anaya:/js/collection/RemoteCollectionDelegate</URI>
        </NEEDS>
    </I2C>

* Our implementation also returns the locations when the I2C operation is performed. The rationale behind that is that locations are indeed characteristics of the resource.

JavaScript

JavaScript [24,25] was introduced as one of the formatting options because of the facility with which this format could be understood by our distributed computing architecture, where most of the glue code [26] is written in this language. To recover the information, the string is simply evaluated, thus retaining the same flexibility that is possible with XML but without even having to use a separate parser. The following lines show the same operation demonstrated earlier, but formatted in JavaScript:

    var theJSObject = [];
    theJSObject.location = [];
    theJSObject.location['uri'] = [];
    theJSObject.location['uri'].push('http://anduin.anaya.es/machina/js/collection/Collection.js');
    theJSObject;
    theJSObject.needs = [];
    theJSObject.needs['uri'] = [];
    theJSObject.needs['uri'].push('urn:anaya:/js/collection/RemoteCollectionDelegate');
    theJSObject.needs['uri'].push('urn:anaya:/js/collection/CollectionDelegate');
    theJSObject.needs['uri'].push('urn:anaya:/js/collection/CollectionInterface');
    theJSObject.needs['uri'].push('urn:anaya:/js/support/RangeMap');
    theJSObject.needs['uri'].push('urn:anaya:/js/support/support');
    theJSObject.needs['uri'].reverse();
    theJSObject;
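The way a client might recover the data from such a response is sketched below. The response text is an abridged version of the example above, held in a string as it would arrive from the resolver; the variable names are hypothetical. Evaluating received text is reasonable only because the resolver is a trusted intranet service.

```javascript
// Abridged resolver response, as received over HTTP.
const response = [
  'var theJSObject = [];',
  'theJSObject.needs = [];',
  "theJSObject.needs['uri'] = [];",
  "theJSObject.needs['uri'].push('urn:anaya:/js/support/RangeMap');",
  "theJSObject.needs['uri'].push('urn:anaya:/js/support/support');",
  "theJSObject.needs['uri'].reverse();",
  'theJSObject;',
].join('\n');

// eval returns the value of the last expression statement, which is
// why the response ends with a bare "theJSObject;". The indirect form
// (0, eval) evaluates the text in the global scope.
const result = (0, eval)(response);
const needed = result.needs['uri'];
```

Here `needed` is an ordinary array of URN strings that the calling code can iterate over directly.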

Resolver extensions: When standards are just not enough

As noted earlier, the implementation of a corporate resolver that had to fulfill the aforementioned requirements helped us identify a number of areas of ambiguity in the current set of proposed standards. These limitations had to be overcome by the introduction of several proprietary extensions. Although the driving force was the need to solve the practical problems encountered, the generality and extensibility of the solutions adopted were very much considered. Discussion is encouraged on these extensions so that their relevance can be assessed objectively.

Output formats

The support of XML and JavaScript as encoding formats for the results returned by the resolver has already been discussed.

Although it is true that RFC 2169 [11] leaves the door open to any format clients might deem useful, it is important nonetheless to reach an agreement on at least a basic set of endorsed formats; otherwise, interoperability cannot be guaranteed. JavaScript can safely be considered a niche format, but at the same time we think that XML is an extremely convenient option for encoding responses.

Administrative services

Most clients will need access only to the basic resolution services. However, those organizations that are responsible for the maintenance of namespaces will also need to perform administrative operations, like advanced searches or the creation of URNs. For name resolvers to be truly interoperable, these operations must also be identified, especially when taking into account that URNs can be adopted internally by organizations for their own benefit. The standardization of administrative services will foster the design of off-the-shelf resolvers that can be easily deployed in a variety of environments.

Our name resolver implements the following administrative services:

NEWURN. Used to create a new URN. The name to assign is optional: If absent, it will be created automatically. The newly created identifier is returned.
ADDLOC. Adds a new location to a URN mapping.
DELLOC. Removes a location from a URN mapping.
SETLOC. Sets a new mapping at once.
SEARCHNAMES. Regexp-based search on the URNs.
SEARCHLOCATIONS. Regexp-based search on the locations.

In all cases, results can be formatted in XML and JavaScript. The exact specification of these services is beyond the scope of this document. The submission to the URN working group of a proposal that covers these services is currently under consideration.
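To make the semantics of these operations concrete, the sketch below models them on an in-memory table. It is not the specification of the services: the class `NameTable`, the format of automatically generated names, and the decision that SEARCHLOCATIONS returns the matching URNs are all assumptions.

```javascript
// In-memory stand-in for the URN repository (illustration only).
class NameTable {
  constructor() {
    this.map = new Map(); // URN -> array of locations
    this.counter = 0;     // used for automatically generated names
  }
  // NEWURN: the name is optional; the generated name format is assumed.
  newUrn(name) {
    const urn = name || 'urn:anaya:/auto/' + (++this.counter);
    if (this.map.has(urn)) throw new Error('URN already exists: ' + urn);
    this.map.set(urn, []);
    return urn;
  }
  locationsOf(urn) {
    if (!this.map.has(urn)) throw new Error('Unknown URN: ' + urn);
    return this.map.get(urn);
  }
  // ADDLOC
  addLoc(urn, loc) { this.locationsOf(urn).push(loc); }
  // DELLOC
  delLoc(urn, loc) {
    this.map.set(urn, this.locationsOf(urn).filter((l) => l !== loc));
  }
  // SETLOC: replace the whole mapping at once.
  setLoc(urn, locs) {
    this.locationsOf(urn); // rejects unknown URNs
    this.map.set(urn, locs.slice());
  }
  // SEARCHNAMES
  searchNames(pattern) {
    const re = new RegExp(pattern);
    return [...this.map.keys()].filter((urn) => re.test(urn));
  }
  // SEARCHLOCATIONS: returning the matching URNs is an assumption.
  searchLocations(pattern) {
    const re = new RegExp(pattern);
    return [...this.map.keys()].filter((urn) =>
      this.map.get(urn).some((loc) => re.test(loc)));
  }
}

const table = new NameTable();
const generated = table.newUrn(); // name created automatically
table.newUrn('urn:anaya:/apps/search');
table.addLoc('urn:anaya:/apps/search', 'http://www3.anaya.es/search');
table.setLoc(generated, ['http://media.anaya.es/images/img1.jpg']);
```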

Query strings

Discussion on whether and how query strings will be supported in URNs is still necessary [12]. The approach taken by our implementation, described earlier, is similar to the treatment of query strings for URLs. From a conceptual point of view, the use of query strings defines a family of parametric URNs that share the same prefix, which doesn't prevent every "instantiation" from being considered a distinct identifier.

URN aliases

URN aliases are nothing more than additional levels of indirection in the name to location mapping. They represent an improvement in flexibility over the general advantages of using names instead of locations. As demonstrated in a previous example, support for aliases and query strings can simplify the access to complex resources without losing expressive power.

URN concatenation

URN concatenation is a proprietary technique similar to URN aliasing that can greatly simplify some administrative tasks. Consider, for example, a situation where URNs for several thousand image files must be created. All of the files are initially located at the same server, and it is anticipated that they will be replicated at a new location some time in the future. URN concatenation makes it possible to specify the location of each URN as the concatenation of a base URN plus a suffix string. For example:

    urn:anaya:/images/base  - http://media.anaya.es/images/
    urn:anaya:/images/00001 - concat:urn:anaya:/images/base+img1.jpg
    urn:anaya:/images/00002 - concat:urn:anaya:/images/base+img2.png
    ...

This way, when a new server is added that contains a copy of all the images, the only mapping to update is that of the base URN, that is, urn:anaya:/images/base.

This technique is extremely useful to move entire trees of documents to new locations. That is often the case for Web-based applications whenever the underlying hardware or software needs to be updated.

Note that this feature requires the name resolver to interpret an internal concat: pseudo-scheme.

Unlike complex rule-based resolution, URN concatenation is very easy to implement and simplifies several common administrative tasks. Every URN, moreover, continues to be individually addressable.
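A possible interpretation of the concat: pseudo-scheme is sketched below. It assumes that the last "+" in the mapping separates the base URN from the suffix, and that a base URN may itself map to several locations; the function name and the mirror host `mirror.example.org` are hypothetical.

```javascript
const PREFIX = 'concat:';

// Return the final URLs of a URN, expanding concat: mappings.
function resolveLocations(urn, table) {
  const urls = [];
  for (const loc of table.get(urn) || []) {
    if (!loc.startsWith(PREFIX)) {
      urls.push(loc); // plain location, used as-is
      continue;
    }
    const spec = loc.slice(PREFIX.length);
    const plus = spec.lastIndexOf('+'); // assumed separator position
    if (plus === -1) throw new Error('Malformed concat mapping: ' + loc);
    const suffix = spec.slice(plus + 1);
    // Every location of the base URN yields one location of this URN.
    for (const baseUrl of resolveLocations(spec.slice(0, plus), table)) {
      urls.push(baseUrl + suffix);
    }
  }
  return urls;
}

const table = new Map([
  ['urn:anaya:/images/base', ['http://media.anaya.es/images/']],
  ['urn:anaya:/images/00001', ['concat:urn:anaya:/images/base+img1.jpg']],
  ['urn:anaya:/images/00002', ['concat:urn:anaya:/images/base+img2.png']],
]);
const before = resolveLocations('urn:anaya:/images/00001', table);
// Adding a mirror only requires touching the base URN.
table.get('urn:anaya:/images/base').push('http://mirror.example.org/images/');
const after = resolveLocations('urn:anaya:/images/00002', table);
```

After the mirror is added, every image URN resolves to two URLs although none of the individual image mappings has been modified.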

Specialized "restricted" resolution services

For most clients, the standard resolution services (I2L, I2Ls) will be enough, as they provide the final URLs where resources can be effectively found. But the introduction of several levels of indirection calls for the specification of a new set of resolution services that can provide information about the existence of aliases or the use of concatenation.

To satisfy this requirement, our resolver implements a number of additional resolution services that simply return the exact mapping defined for URNs, instead of following aliases, performing concatenation, or looking at dependencies. These services are I2LR, I2LsR, and I2CR. They can be considered "restricted" versions of their counterparts because they perform resolution once and return the results as-is.

I2CR is worth looking at. As mentioned, I2C returns a list of dependencies of the resource. If A depends on B, which in turn depends on C, an I2C operation on A will return C and B. I2CR, on the other hand, will simply return B.
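The difference between the two services can be sketched as follows, with the dependency table held in a `Map` for illustration. The function names mirror the service names, but the code and the ordering of the result (deepest dependency first) are assumptions based on the example in the text.

```javascript
// I2CR: return only the dependencies declared directly for the URN.
function i2cr(urn, deps) {
  return (deps.get(urn) || []).slice();
}

// I2C: follow dependencies transitively, deepest dependency first.
function i2c(urn, deps, seen = new Set([urn]), out = []) {
  for (const dep of i2cr(urn, deps)) {
    if (seen.has(dep)) continue; // already listed, or a cycle
    seen.add(dep);
    i2c(dep, deps, seen, out);   // dependencies of the dependency
    out.push(dep);
  }
  return out;
}

// A depends on B, which depends on C.
const deps = new Map([
  ['A', ['B']],
  ['B', ['C']],
  ['C', []],
]);
const restricted = i2cr('A', deps);
const full = i2c('A', deps);
```

With this table, `restricted` contains only B, while `full` contains C followed by B.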

Once again, it must be emphasized that this family of services will have to be standardized if URN aliases or any other type of indirection level is supported.

Successful deployment in an intranet environment

Use environments

Once the basic resolution system was available, specific mechanisms had to be devised to guarantee access to the resolution services from all the user communities of interest.

Access from Web browsers

By simply setting up an off-the-shelf proxy server (Netscape Proxy Server 3.5), Web browsers can be configured to access URNs without the need to employ special plug-ins or any other type of client software. When the user requests a urn: address, the browser simply redirects the request to the proxy server. The proxy, in turn, performs an I2R operation on the URN and sends the resulting resource to the client. With this simple approach, editors and other end-users are able to browse through the URN space. Some of the benefits of URN-enabled browsing will be outlined in the section on the preliminary results.

Access from applications

For Web-based applications to use URNs, client-side application programming interfaces (APIs) to the resolver have been developed. As previously indicated, these services are part of a distributed computing framework where software components and network services are identified by globally unique URNs. The resolver interface API has been implemented as a lightweight software layer with bindings to the Java and JavaScript languages. This covers all of the required functionality, to the point that the complete API can be used even from scripts embedded inside HTML pages that run in a browser environment.

Access from the outside

Because online services and applications are built on the underlying URN infrastructure, a method of access from the outside was necessary for all public services. This is achieved by a server-side filtering layer that parses HTML documents (either static or application-generated) and replaces all URN occurrences with their equivalent URLs. Thus, external clients can safely view images and follow hyperlinks. The filter is disabled for requests originating inside the intranet. Once support for URNs becomes widespread on the Internet, this filtering layer can simply be removed.
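A minimal sketch of such a filter is shown below. It is not the production filter: it assumes that URNs appear only in double-quoted href and src attributes, and the mapping table, the example URNs and URLs, and the function name externalize are invented for illustration.

```python
import re

# Hypothetical URN-to-URL table standing in for an I2L lookup on the resolver.
URN_TO_URL = {
    "urn:x-example:shop:home": "http://www.example.com/shop/index.html",
    "urn:x-example:shop:logo": "http://www.example.com/img/logo.gif",
}

# Matches href="urn:..." and src="urn:..." (double-quoted values only).
URN_ATTRIBUTE = re.compile(r'((?:href|src)\s*=\s*")(urn:[^"]+)(")', re.IGNORECASE)


def externalize(html, inside_intranet=False):
    """Rewrite URN references to URLs for clients that cannot resolve URNs."""
    if inside_intranet:
        return html  # intranet clients resolve URNs themselves

    def swap(match):
        url = URN_TO_URL.get(match.group(2))
        if url is None:
            return match.group(0)  # leave unknown URNs untouched
        return match.group(1) + url + match.group(3)

    return URN_ATTRIBUTE.sub(swap, html)


page = '<a href="urn:x-example:shop:home"><img src="urn:x-example:shop:logo"></a>'
public_page = externalize(page)
```

A regular expression is enough for a sketch; a filter that must handle arbitrary markup would more likely work on a parsed representation of the document.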

Deployment methodology and preliminary results

The introduction of URNs to different internal workgroups follows a phased approach that is still under way.

After beta testing by a small group of individuals, the complete software engineering department, plus several other technically oriented people, were informed about the benefits of URNs and the availability of a distributed architecture specially suited to Web development. The concept of indirection and its benefits were grasped almost immediately. After a few weeks of testing and validation, the new model was selected for the pilot development of an electronic shop by a team of six people: one manager, three programmers, and two designers. The same people had previously participated in a similar effort with state-of-the-art technology but no support for URNs. Unfortunately, no quantitative data were gathered at that point, because the purpose was simply to solve problems, identify new features, and get the team used to the model. Qualitative impressions, though, were extremely satisfactory.

One of the most important advantages that can be directly attributed to the use of URNs is that the team was self-organizing and parallelism was greatly improved. After an initial agreement on the set of URNs to use, all of them were created and implemented using simple static dummies and mockups. From that point on, programmers could work on specific data access features while designers worked on page layout. Once a particular feature was finished and tested, the programmer simply replaced the old placeholder mapping, pointing the URN to the new live implementation. From that moment on, the rest of the team members observed that real data were now available, but the way to access the resource was exactly the same as before.

This contrasted with the earlier scenario, in which all details had to be working before results could be seen. With the new model, therefore, designers and management felt more involved in the project and programmers were more focused. In the team's own qualitative assessment, productivity and employee satisfaction increased.
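The placeholder-to-live switch described above amounts to a single change in the resolver's mapping table. The sketch below illustrates the idea under stated assumptions: the MappingTable class, its method names, and the URN and URLs are hypothetical and do not reproduce the resolver's administrative interface.

```python
class MappingTable:
    """Toy stand-in for the resolver's URN-to-URL mappings."""

    def __init__(self):
        self._locations = {}

    def register(self, urn, url):
        """Create the mapping for a URN, or replace the existing one."""
        self._locations[urn] = url

    def i2l(self, urn):
        """Return the URL currently registered for the URN."""
        return self._locations[urn]


table = MappingTable()
CATALOG = "urn:x-example:shop:catalog"

# Phase 1: the agreed URN points at a static mockup, so designers can start work.
table.register(CATALOG, "http://dev.example.com/mockups/catalog.html")
during_design = table.i2l(CATALOG)

# Phase 2: the programmer repoints the same URN at the finished implementation.
table.register(CATALOG, "http://shop.example.com/servlet/catalog")
after_release = table.i2l(CATALOG)
```

Pages that reference the catalog URN need no edits between the two phases; only the value returned by the lookup changes.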

Shortly after that, work began on the development of a set of end-user tools intended for editorial teams. Once the first prototypes were ready, they were made available to a small group of expert knowledge engineers. The engineers are now providing feedback on the features that they think will be useful to support the building of an extensible metadata architecture, to be used in the future by regular editors. This group has also received URNs with enthusiasm, but it has become apparent that a graphical tool is needed to let users browse through catalogs of resources and create new ones.

At the same time, some of the corporate proxy servers were configured to resolve URNs, and some of the intranet documentation was moved to a URN scheme. Users can browse through pages exactly as before. Because the original urn: address is retained, pages can be bookmarked and used as references. The difference, however, is that the operations department can now move resources whenever it needs to. Before the use of URNs, this was impossible without formally announcing the new locations and warning users of the possibility of encountering invalid references. Reliability and efficiency have also improved because most pages are replicated at several locations.

Conclusions and next steps

As indicated, the benefits promised by URN technology were observed as soon as resolution services were made available. Therefore, the effort of writing a URN resolver has more than paid off, even if its use must be confined to controlled corporate contexts. This fact indicates that interoperability is a very important goal to achieve, because general-purpose name resolvers are useful pieces of software that could easily be adopted by many organizations. To encourage discussion on specific resolution features, our company will continue its commitment to the public announcement of results, and is considering the licensing of the resolver code to the Open Software movement.

URNs can be integrated with metadata systems to leverage their capabilities and empower users with the ability to create, modify, classify, and establish relationships using the same tools. Grupo Anaya has started development of a metadata framework for publishers, code-named DAWN, whose design goals will also be published [27].

References

[1] Berners-Lee, T., "Universal Resource Identifiers in WWW," RFC 1630, June 1994.
[2] Berners-Lee, T., Fielding, R., Masinter, L., "Uniform Resource Identifiers (URI): Generic Syntax," RFC 2396, August 1998.
[3] Connolly, D. (ed.), "Web Naming and Addressing Overview," http://www.w3.org/Addressing/Addressing.html
[4] Berners-Lee, T., "World Wide Web Design Issues," http://www.w3.org/DesignIssues/
[5] Connolly, D. (ed.), "Key Specifications of the World Wide Web," World Wide Web Journal, Volume I, Issue 2, Spring 1996.
[6] Uniform Resource Names (urn) Charter, http://www.ietf.org/html.charters/urn-charter.html
[7] Sollins, K., Masinter, L., "Functional Requirements for Uniform Resource Names," RFC 1737, December 1994.
[8] Sollins, K., "Architectural Principles of Uniform Resource Name Resolution," RFC 2276, January 1998.
[9] Moats, R., "URN Syntax," RFC 2141, May 1997.
[10] Daniel, R., Mealling, M., "Resolution of Uniform Resource Identifiers using the Domain Name System," RFC 2168, June 1997.
[11] Daniel, R., "A Trivial Convention for using HTTP in URN Resolution," RFC 2169, June 1997.
[12] Mealling, M., Daniel, R., "URI Resolution Services Necessary for URN Resolution," RFC 2483, January 1999.
[13] URN Working Group, "A URN Namespace for IETF documents," Work in Progress.
[14] URN Working Group, "URN Namespace Definition Mechanisms," Work in Progress.
[15] Girod, L., Chen, B., Frystyk, H., Mallery, J., "WIRE - W3 Identifier Resolution Extensions," Work in Progress.
[16] Online Computer Library Center, "Persistent URLs," http://purl.oclc.org/
[17] Corporation for National Research Initiatives, "The Handle System," http://www.handle.net/
[18] Corporation for National Research Initiatives, "Handle Resolution Protocol Specification," http://www.handle.net/client_spec.html
[19] Bray, T., Paoli, J., Sperberg-McQueen, C.M. (eds.), "Extensible Markup Language (XML) 1.0," The World Wide Web Consortium, February 1998, http://www.w3.org/TR/REC-xml
[20] Lassila, O., Swick, R. (eds.), "Resource Description Framework (RDF) Model and Syntax Specification," January 1999, http://www.w3.org/TR/PR-rdf-syntax
[21] Wahl, M., Howes, T., Kille, S., "Lightweight Directory Access Protocol (v3)," RFC 2251, December 1997.
[22] Gosling, J., Joy, W., Steele, G., "The Java Language Specification," Addison-Wesley, 1996.
[23] Sun Microsystems, Inc., "The Java Servlet API," http://java.sun.com/marketing/collateral/servlets.html
[24] Netscape Communications Corp., "JavaScript documentation," http://developer.netscape.com/docs/manuals/javascript.html
[25] "ECMAScript language specification," ISO/IEC 16262:1998.
[26] Ousterhout, J., "Scripting: Higher-Level Programming for the 21st Century," IEEE Computer, March 1998.
[27] López, S., "Metadata deployment in a publishing environment," to appear in XML Europe '99.
