roytennant.com :: Digital Libraries Columns

 

Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.

The Expanding World of OAI


2/15/2004

I recently called standards the "engine of interoperability" (LJ 12/03, p. 33), but I could have also called them the "building blocks of innovation." Standards, when done well, provide a foundation on which innovations can be built. Within the digital library community this is most evident in the development of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).

OAI-PMH was created in 2001 as a means to distribute metadata for digital objects, mostly papers in e-print repositories. Since then, it has not only been implemented widely by such repositories but has also been used in unforeseen ways.

Beyond just techies

One of the first OAI services of broad utility to a nontechnical audience was OAIster at the University of Michigan. Aiming to be the "academic Google," OAIster harvests metadata records from academic-based repositories worldwide and indexes them in a central database. Users can now search more than 2.3 million records for digital objects (papers, images, books, audio files, etc.) from over 240 institutions. This service alone demonstrates the power of a simple standard to enable new and useful services.
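To make the harvest-and-index idea concrete, here is a minimal sketch in Python. It is not OAIster's actual code. It assumes a repository that answers the standard ListRecords verb with unqualified Dublin Core (oai_dc) records, and it uses a caller-supplied fetch function (a hypothetical stand-in for whatever HTTP client you prefer) so that the logic can be tried without a network connection. The function names harvest and build_index are invented for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def harvest(base_url, fetch, metadata_prefix="oai_dc"):
    """Yield (identifier, title) pairs from one repository.

    `fetch` is an assumed callable that takes a URL and returns the
    response body as XML text; it is not part of any library.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for record in root.iter(OAI_NS + "record"):
            identifier = record.findtext(OAI_NS + "header/" + OAI_NS + "identifier")
            title = record.findtext(".//" + DC_NS + "title")
            yield identifier, title
        # A non-empty resumptionToken means the list continues on another page.
        token = root.findtext(".//" + OAI_NS + "resumptionToken")
        if not token:
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token}


def build_index(repositories, fetch):
    """Build a toy central index: title word -> set of record identifiers."""
    index = {}
    for base_url in repositories:
        for identifier, title in harvest(base_url, fetch):
            for word in (title or "").lower().split():
                index.setdefault(word, set()).add(identifier)
    return index
```

In real use, fetch might wrap urllib.request.urlopen, and the dictionary would be replaced by a proper search engine. The point is only that a central service needs nothing from each repository beyond its OAI-PMH base URL.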

Another service useful for exploring OAI-compliant repositories, without employing cumbersome technical tools and software, is the OAI Viewer service from OCLC. This web site allows users to browse through the sets of metadata records offered by any repository, all the way down to an individual record. It handles all of the technical requirements behind the scenes, such as sending off a properly formatted OAI-PMH request and parsing the results into a relatively easy-to-understand format.
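For readers curious about what "behind the scenes" involves, the following Python sketch shows the general shape of the work: forming a request URL for a given OAI-PMH verb, then reducing the XML response to something readable. It is an illustration under stated assumptions, not OCLC's implementation. The verbs (ListSets, GetRecord) and XML namespaces come from the OAI-PMH specification, while the function names and the example.org address are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def request_url(base_url, verb, **arguments):
    """Form an OAI-PMH request: the base URL plus a verb and its arguments."""
    return base_url + "?" + urlencode({"verb": verb, **arguments})


def parse_sets(xml_text):
    """Turn a ListSets response into a list of (setSpec, setName) pairs."""
    root = ET.fromstring(xml_text)
    return [
        (s.findtext(OAI_NS + "setSpec"), s.findtext(OAI_NS + "setName"))
        for s in root.iter(OAI_NS + "set")
    ]


def parse_record(xml_text):
    """Turn a GetRecord response into {Dublin Core field: [values]}."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root.iter():
        # Keep only Dublin Core elements that actually carry text.
        if element.tag.startswith(DC_NS) and element.text:
            name = element.tag[len(DC_NS):]
            fields.setdefault(name, []).append(element.text.strip())
    return fields


# Example: the URL a viewer might request for a single record
# (base URL and identifier are made up for illustration).
url = request_url(
    "http://example.org/oai",
    "GetRecord",
    identifier="oai:example.org:1",
    metadataPrefix="oai_dc",
)
```

A viewer would call request_url with ListSets to let the user pick a set, then with GetRecord to drill down to a single record, passing each response through the matching parse function before displaying it.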

More importantly, the service is based on the concept of Extensible Repository Resource Locators (ERRoLs) for OAI Identifiers. ERRoLs are URLs that can link to any metadata record or web resource related to supported OAI repositories. This means that by going through the OCLC OAI resolver service at errol.oclc.org, any OAI-PMH capability (e.g., retrieve record) can be made actionable by a simple URL. OCLC is even offering an RSS feed for metadata from any registered OAI-compliant repository.

These services demonstrate the power of standards. Simply by being compliant with OAI-PMH, a repository anywhere in the world can have its records discoverable from a centralized search service (via OAIster) and can offer an RSS feed (via the OCLC ERRoLs service) that the repository doesn't need to create or maintain.

Making innovation possible

One of the best examples of the kind of innovation that can be fostered by a pivotal standard is explained in the article "Using the OAI-PMH…Differently," which appeared in D-Lib Magazine. It outlines a number of uses for the protocol, all of which have little or nothing to do with its original purpose. The article describes three unusual OAI-PMH "repositories": a thesaurus, usage logs, and an OpenURL registry. A sign that a protocol is suitably versatile is that it can be easily used in situations unimagined when it was developed.

But digital library developers have not stopped here. Although the protocol was specifically designed for harvesting (collecting records into a central location for indexing and searching), others are in the midst of extending it as a means of distributed searching.

The Distributed Digital Library of Mathematical Monographs project among the University of Michigan (M), Cornell University (C), and University of Göttingen (G), Germany, has specified some extensions to the Dienst protocol (a precursor of OAI) that support searching the collections at those three institutions simultaneously. As project documentation says, "Working from the roots of the DIENST protocol developed at Cornell and the then-emergent OAI protocols, the project team focused on creating a new protocol—dubbed CGM—that was consistent with OAI, borrowed from DIENST, and added mechanisms for full text searching." All three universities are revising existing software infrastructures to accommodate this new protocol.

Another project (see Automated Subject Indexing) uses OAI as the basis for automated subject indexing of a test set of journal articles from D-Lib Magazine.

It is noteworthy that so many interesting uses have sprung up around a protocol that is still quite new. These developments indicate that the protocol is operable in a wide variety of situations, with the potential to be a building block in applications and services that we have yet to imagine.

LINK LIST

Automated Subject Indexing
dlib.org/dlib/december03/mongin/12mongin.html

Distributed Digital Library of Mathematical Monographs
www.library.cornell.edu/mathbooks

ERRoLs
www.oclc.org/research/projects/oairesolver

OAIster
www.oaister.org

OCLC's OAI Viewer
errol.oclc.org

Using the OAI-PMH…Differently
www.dlib.org/dlib/july03/young/07young.html