roytennant.com :: Digital Libraries Columns

 

Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.

The Year of the Open


09/15/2007

   Two events this year are ushering in a new era of openness--both in the
   source code and the file formats of commercial software. Adobe and
   Microsoft have announced technologies that are open and transparent
   (see "IDPF Hosts Digital Book 2007," LJ 6/1/07, p. 27ff.). It is
   hard to estimate the full impact of these developments, since much of
   what they'll enable is yet to be seen. Still, they represent enormous
   potential for anyone interested in libraries, information technology,
   and coming digital services.

   PDF evolves

   Many people may not be aware that PDF (portable document format), the
   Adobe Acrobat file format, has long been an openly published
   specification, or that the full version of Adobe Acrobat could save an
   XML (extensible markup language) version of a PDF. This trend toward
   openness is only intensifying as Adobe works with the Open Publication
   Structure (OPS) specification managed by the International Digital
   Publishing Forum (IDPF).

   The beta release of the Adobe Integrated Runtime (AIR)(TM) environment
   "allows developers to use HTML/CSS, Ajax, Adobe Flash, and Adobe
   Flex(TM) to extend rich Internet applications (RIAs) to the desktop,"
   according to Adobe. Users can dynamically reflow the text and
   repaginate as the font size changes, which provides a much richer and
   more natural screen reading experience than is achievable with a
   standard Adobe Acrobat PDF file.

   Adobe Digital Editions, which uses the OPS file format, is an AIR
   application. If you go to the Adobe Digital Editions site and download
   an ebook, you'll see the potential of this publishing platform (see
   "Digital Books Redux" in the link list).

   The extensible office

   An even larger development is the news that Microsoft is introducing a
   completely new (and open) file format with Office 2007. When you save a
   document in Word, PowerPoint, or Excel, the file will have the
   character "x" added to the typical filename extension, so ".doc" will
   be ".docx" in Word 2007. This signifies that the document is in XML,
   specifically Office Open XML (often shortened to Open XML), a growing
   standard.

   But there's more. If you add ".zip" to the end of the filename, turning
   "my.docx" into "my.docx.zip," and then unzip it (by double-clicking on
   it), the "file" becomes a directory that reveals a package consisting
   of the document itself in XML as well as potentially a number of other
   components--for example, higher-resolution versions of the images in
   the document and the metadata describing it.
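
   The same inspection can be done in a few lines of code, without
   renaming the file. The following is a minimal sketch in Python, using
   only the standard zipfile module; the function names
   list_package_parts and read_part are hypothetical, chosen for this
   illustration.

```python
import zipfile


def list_package_parts(source):
    """Return the sorted part names inside an Office 2007 package.

    `source` may be a filesystem path or a seekable binary file object.
    No ".zip" rename is needed here; that step only helps desktop tools
    recognize the file as an archive.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())


def read_part(source, part_name):
    """Return the raw bytes of one named part, such as the main XML file."""
    with zipfile.ZipFile(source) as package:
        return package.read(part_name)
```

   Calling list_package_parts("my.docx") on a file saved by Word 2007
   would typically return names such as "[Content_Types].xml" and
   "word/document.xml", though the exact contents depend on the document.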

   The true beauty of this design, however, is that it is extensible.
   Anyone can add components to this package. I could create a Dublin Core
   record describing my document, put it in the package, and zip it back
   up. When I give this document to someone else, it will have my
   contribution as well as the original files.
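
   As a rough illustration of that idea, the sketch below appends a small
   Dublin Core record to an existing package using Python's standard
   library. The part name "metadata/dublin_core.xml", the wrapper element
   "metadata," and the function name add_dublin_core are all assumptions
   made for this example, not anything defined by Microsoft. The sketch
   also does not register the new part in the package's own content-type
   or relationship files, so whether a given version of Office keeps the
   part when it re-saves the document is left untested here.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def add_dublin_core(package, title, creator,
                    part_name="metadata/dublin_core.xml"):
    """Append a minimal Dublin Core record to an existing package.

    `package` is a path or a seekable binary file object. `part_name`
    is a hypothetical location chosen for this sketch.
    """
    # Build <metadata><dc:title/><dc:creator/></metadata>.
    root = ET.Element("metadata")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = title
    ET.SubElement(root, f"{{{DC_NS}}}creator").text = creator
    record = ET.tostring(root, encoding="utf-8")

    # Mode "a" adds a new member and leaves the existing parts untouched.
    with zipfile.ZipFile(package, mode="a",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        if part_name in zf.namelist():
            raise ValueError(f"{part_name} is already in the package")
        zf.writestr(part_name, record)
```

   Opening the archive in append mode is the design choice that matters:
   the original files are never rewritten, so the person who receives the
   document gets them exactly as they were, plus the added record.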

   Implications for libraries

   Documents will increasingly be open to other applications to
   manipulate, index, and transform. Librarians (and others) will find it
   much easier to capture files in their native format and do interesting
   things with them, such as indexing them for access and transforming
   them into canonical, standard formats for preservation, such as TEI
   (Text Encoding Initiative).
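
   To make the indexing case concrete, here is a minimal Python sketch
   that pulls the plain text of each paragraph out of a Word 2007 file.
   It assumes the main document part is stored at "word/document.xml,"
   which is the usual location but is strictly something a careful
   program would look up in the package's relationship files. The
   function name extract_paragraphs is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the main document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_paragraphs(source, part_name="word/document.xml"):
    """Return the text of each non-empty paragraph in a .docx package.

    `source` is a path or a seekable binary file object. `part_name`
    is an assumed default; see the note in the text above.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read(part_name))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join the w:t pieces.
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        if text:
            paragraphs.append(text)
    return paragraphs
```

   The resulting list of strings could be handed to whatever indexing
   tool a library already uses, or serve as the starting point for a
   transformation into another format.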

   Also, the open "package" format of Microsoft files offers interesting
   opportunities for libraries to create metadata packages that can be
   inserted into the original document's ZIP package and transported
   transparently as one file. Only those who need the library metadata
   have to look for it.

   With open software and file formats, the opportunities to enrich,
   expand, and embellish are unlimited. From such fertile fields
   innovation can flower. If we need a single word to describe 2007, I
   nominate open.
     __________________________________________________________________

   LINK LIST
   Adobe AIR labs.adobe.com/technologies/air
   Adobe Digital Editions www.adobe.com/products/digitaleditions
   Adobe Flex labs.adobe.com/technologies/flex
   Adobe Mars Project labs.adobe.com/technologies/mars
   Digital Books Redux
   libraryjournal.com/blog/1090000309/post/1840011784.html
   Microsoft Open XML Format
   msdn2.microsoft.com/en-us/library/aa338205.aspx
   Office Open XML Formats
   www.ecma-international.org/memento/TC45.htm
   Open Publication Structure (OPS) www.idpf.org/2007/ops