roytennant.com :: Digital Libraries Columns

 

Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.

Different Paths to Interoperability


02/15/2001

   In a previous column, I discussed the importance of interoperability
   among digital library projects ('Interoperability: The Holy Grail,'
   LJ 7/98). Users should be able to discover through one search what
   digital objects are freely available from a variety of collections,
   rather than having to search each collection individually.

   In a more recent column, I highlighted a project that is achieving
   interoperability among preprint (or, as they are now commonly called,
   e-print) servers ('Open Archives: A Key Convergence,' LJ 2/15/00). For
   digitized library materials, there are at least two good
   examples of projects that are achieving the same goal through similar
   but intriguingly different means.

   The LC model
   For three years (1996-99) the Library of Congress (LC) and Ameritech
   teamed up to offer digitization grants (of up to $75,000 each) to
   libraries in the United States. LC required successful grantees to
   provide suitable access aids for the items digitized with award money.
   These access aids could be in one or more formats: 1) U.S. MARC
   records, 2) Dublin Core records (following LC guidelines for usage), 3)
   structured headers (encoded in Text Encoding Initiative format) for
   searchable text reproductions, and/or 4) Encoded Archival Description
   finding aids.

   Awardees were required to supply LC with the records for the items
   digitized. These records were added to the LC American Memory
   collection, thereby providing one place to search the digital
   collections of LC as well as those of all libraries receiving
   LC/Ameritech awards. The digitized items themselves remain at the
   individual institutions, as do copies of the item records.

   This highly centralized model for creating a union catalog was possible
   because LC and Ameritech controlled the funding and could establish
   record creation guidelines before digitization occurred, thereby
   ensuring a high level of interoperability among records from
   different institutions.

   This model also requires a high level of commitment and precoordination
   among participating institutions and a willingness from all
   participants to follow set guidelines. These collections have been
   incorporated so seamlessly into the existing American Memory
   collections that users can easily be unaware that they are searching
   non-LC collections.

   Taking this work a step further, LC is developing a 'core set of
   metadata elements to be used in the development, testing, and
   implementation of multiple repositories.' This work should be
   particularly helpful for digital library projects that are looking to
   contribute records to a union catalog -- either now or in the future.

   The Picture Australia model
   In contrast to the LC model, the Picture Australia project came about
   after a good deal of library content -- nearly 500,000 items -- had
   been digitized and cataloged. Picture Australia aims to bring together
   access to digitized images relating to Australia from several
   institutions (currently seven, including libraries, the National
   Archives, and the Australian War Memorial). The particular challenges
   dictated a more flexible solution than that chosen by LC.

   Since records had already been created for digitized materials, Picture
   Australia needed a method to collect the records, massage them into a
   common record format, index them, and make them available for web
   searching. Rather than requiring participating institutions to ship
   data periodically to a central location (the National Library of
   Australia serves as the lead institution), project developers decided
   to collect the records monthly by using a software spider. This allows
   institutions simply to put their records in a specific location on
   their servers, to be collected automatically.
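
   The collection step might be sketched roughly as follows. This is
   only an illustration, not the Picture Australia spider itself: the
   institution names, the example.org URLs, and the harvest and
   fetch_url functions are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple
from urllib.request import urlopen

# Hypothetical record locations; the real paths are not given in the column.
RECORD_LOCATIONS = {
    "library-a": "http://example.org/library-a/records.xml",
    "archive-b": "http://example.org/archive-b/records.xml",
}


def fetch_url(url: str) -> bytes:
    """Download one record file over HTTP."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def harvest(
    locations: Dict[str, str],
    fetch: Callable[[str], bytes] = fetch_url,
) -> Tuple[Dict[str, bytes], List[str]]:
    """Collect each institution's records, noting failures instead of stopping."""
    collected: Dict[str, bytes] = {}
    failed: List[str] = []
    for institution, url in locations.items():
        try:
            collected[institution] = fetch(url)
        except OSError:  # urllib's URLError is a subclass of OSError
            failed.append(institution)
    return collected, failed
```

   Run monthly (from cron, say), a loop like this asks nothing of a
   contributor beyond keeping a file at an agreed address. One
   unreachable server does not hold up the rest of the harvest.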

   The collected records must then be translated into a common record
   format (fields are based on the Dublin Core and the storage format is
   XML) and indexed (using Blue Angel's Metastar Enterprise). Most of
   the issues remaining for Picture Australia relate to this translation
   of heterogeneous metadata into a common set of elements.
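
   To make the translation step concrete, here is a minimal sketch of
   mapping differently named local fields onto Dublin Core elements in
   XML. The field maps, institution names, and the to_dublin_core
   function are invented for illustration. The namespace is the
   standard one for the Dublin Core element set, but the column does
   not say what Picture Australia's actual XML looks like.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical per-institution maps: local field name -> Dublin Core element.
FIELD_MAPS = {
    "library-a": {"title": "title", "photographer": "creator", "subject_heading": "subject"},
    "museum-b": {"caption": "title", "maker": "creator", "keywords": "subject"},
}


def to_dublin_core(institution: str, record: dict) -> ET.Element:
    """Translate one local record into a simple Dublin Core XML element."""
    field_map = FIELD_MAPS[institution]
    dc_record = ET.Element("record")
    for local_field, value in record.items():
        dc_name = field_map.get(local_field)
        if dc_name is None or not value:
            continue  # unmapped or empty fields are dropped in this sketch
        element = ET.SubElement(dc_record, f"{{{DC_NS}}}{dc_name}")
        element.text = value
    return dc_record


if __name__ == "__main__":
    ET.register_namespace("dc", DC_NS)
    sample = {"caption": "On a horse", "maker": "Unknown", "accession": "42"}
    print(ET.tostring(to_dublin_core("museum-b", sample), encoding="unicode"))
```

   Dropping unmapped fields keeps the sketch short, but it also shows
   where information can be lost in translation.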

   One problem is the loss of context. As Debbie Campbell, the Picture
   Australia project manager, puts it, 'A collection of images may have a
   collective title such as "Images of Paul Revere." But the image
   title may be reduced to "On a horse." So the loss of context
   becomes a discovery issue.'
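
   One conceivable remedy is to carry the collection title along with
   the item title at translation time. The column does not say whether
   Picture Australia does this. The contextual_title function below is
   a hypothetical sketch of the idea.

```python
def contextual_title(item_title: str, collection_title: str = "") -> str:
    """Prefix an item title with its collection title when that adds context."""
    item_title = item_title.strip()
    collection_title = collection_title.strip()
    if not collection_title:
        return item_title
    if collection_title.lower() in item_title.lower():
        return item_title  # the context is already present in the item title
    return f"{collection_title}: {item_title}"
```

   Given Campbell's example, contextual_title("On a horse", "Images of
   Paul Revere") returns "Images of Paul Revere: On a horse", which
   gives a searcher something to find.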

   Mounting challenges

   There is also the problem of differing subject vocabularies,
   particularly between libraries and museums. The use of geographic names
   without qualification (such as the name of the state in which a place
   is found) can also be problematic for those not familiar with
   Australian geography.

   The cataloging problems can go deeper, depending on how the
   participating institutions have cataloged their materials. A key issue
   is granularity. Whereas one institution may keep track of first and
   last names, for example, another may not. Differing formats can be
   another issue. One library may keep track of dates as MM/DD/YY, while
   another spells out the month and year. Such differences must be
   reconciled when translating contributed records into a common format.
   For examples of these record variations, see the Picture Australia
   Metadata Guidelines.
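
   The date problem, at least, lends itself to a small illustration.
   The sketch below tries a short list of candidate formats and emits
   an ISO 8601 style date at whatever precision the source supports.
   The list of formats and the normalize_date function are assumptions
   for the example, not Picture Australia's rules. Note that Python
   reads two-digit years 00-68 as 2000-2068, which is a poor guess for
   historical photographs, so a real translator would need a rule for
   choosing the century.

```python
from datetime import datetime

# Candidate input formats, tried in order, with the precision each can express.
DATE_FORMATS = [
    ("%m/%d/%y", "day"),    # e.g. 02/15/01
    ("%B %Y", "month"),     # e.g. February 1901 (English month names)
    ("%Y", "year"),         # e.g. 1901
]


def normalize_date(raw: str) -> str:
    """Return YYYY, YYYY-MM, or YYYY-MM-DD, or '' if no format matches."""
    for fmt, precision in DATE_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # try the next candidate format
        if precision == "day":
            return parsed.strftime("%Y-%m-%d")
        if precision == "month":
            return parsed.strftime("%Y-%m")
        return parsed.strftime("%Y")
    return ""
```

   Returning an empty string for unrecognized values (such as "circa
   1900") leaves those records for a person to review instead of
   guessing.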

   Despite these challenges, Picture Australia is clearly successful in
   its effort to bring together access to a wide range of pictorial
   material in one, easy-to-use location. This success rests on several
   factors. According to project manager Campbell, one factor was a
   forgiving timeframe. Although each project task was estimated and
   delivered according to a schedule, there was no overall deadline for
   release. This allowed some flexibility in reacting to unforeseen
   problems.

   Another factor was the low threshold for participation. Institutions
   contributing records were required to do very little to make their
   records available to Picture Australia. 'Picture Australia is quickly
   able to repurpose the investment already made in digitization and
   description,' Campbell said.

   The Picture Australia model has another advantage. It has its own brand
   identity, independent of any single institution. This encourages
   contributors to participate more equally than is possible when
   assigning records to a single institution, as with the LC model.

   Pick a model, any model
   Union catalogs are a good thing. They make accessible from one location
   what was formerly only accessible by visiting multiple locations and
   often by learning different search interfaces. Our users need more
   union catalogs.

   There is no 'best' model. You use what is appropriate. If you are
   beginning a project that provides you with the opportunity to lay out
   guidelines ahead of time, by all means do so -- it will save time and
   trouble later. But many great chances for creating union catalogs will
   come after records have been created. The best thing about Picture
   Australia is that the project has proved not only that union catalogs
   can be created after the fact but also that they can be done well.

   LINK LIST

   American Memory
      http://memory.loc.gov
   Blue Angel Technologies
      http://www.blueangeltech.com
   Dublin Core
      http://purl.org/dc
   LC/Ameritech Collections Online
      http://memory.loc.gov/ammem/award/online.html
   LC Core Metadata Elements
      http://lcweb.loc.gov/standards/metadata.html
   Picture Australia
      http://www.pictureaustralia.org
   Picture Australia Metadata Guidelines
      http://www.pictureaustralia.org/metadata.html