roytennant.com :: Digital Libraries Columns

 

Library Journal "Digital Libraries" Columns 1997-2007, Roy Tennant

Please note: these columns are an archive of my Library Journal column from 1997-2007. They have not been altered in content, so please keep in mind some of this content will be out of date.

A Database for Every Need


12/01/1999

   Many digital library projects require database software -- whether for
   a catalog of holdings, subject pathfinders, or other kinds of
   structured information. Database solutions are actually fairly generic;
   the same software can support a large variety of uses. Only one of the
   solutions noted below (SiteSearch) is tailored for library uses, and
   even it can be used for other types of data.
   Libraries are increasingly finding database software to be an essential
   infrastructure component, whether they use it to serve their library
   catalog or to build web pages on the fly.

   So, it's likely that your library -- no matter what size or type --
   could make effective use of database software. The good news is that
   there are many solutions from which to choose. The bad news is that
   there are many solutions from which to choose.

   Depending on the computer you have to run the software, the size of the
   database, the level of usage you expect, and your budget, one of the
   following options may be right for you. For more detailed information
   about options and choices, see Eric Morgan's paper "DBMs and Web
   Delivery."

   Workstation solutions

   Workstation solutions are database products that run on a desktop
   machine -- typically an MS Windows NT or Macintosh computer. The
   advantages include relative ease in creating and maintaining the
   database, low cost, a large installed user base, and good self-support
   options (e.g., many books that explain how to use the most popular
   systems). The drawbacks can include infrequent data backups, "downtime"
   (time when the database is unavailable), and a lack of scalability (the
   capacity to grow larger or support more users).

   In general, these solutions are best for small, simple databases for
   which you do not expect many simultaneous users.

   The premier workstation solution for computers running MS Windows NT
   is, not surprisingly, Microsoft Access. Access has been around for
   years, and it has many users whom you can consult for ad hoc
   assistance.

   For Macintosh users, FileMaker Pro clearly leads in terms of numbers of
   installations and integration with Macintosh web servers. FileMaker
   is now also available for MS Windows, so Windows users should
   consider it as well.

   Server solutions

   If a workstation solution does not offer enough power or reliability,
   you may need to step up to a larger, faster computer that is maintained
   by trained staff as a network server. Using a server, you gain a faster
   CPU (processor), more RAM (volatile memory), larger hard-disk capacity
   (persistent storage), and round-the-clock support. This allows you to
   serve more simultaneous users quickly.

   Server solutions vary widely in complexity, including very simple
   solutions (like Sprite), those with mid-range complexity (like MySQL),
   and industrial-strength applications that I discuss below as enterprise
   solutions.

   Sprite is a free Perl module, which means it can be installed and used
   on any computer that runs the Perl programming language (usually, but
   not exclusively, Unix computers). This solution is so simple it isn't
   even a database -- it stores data as a flat file. However, to exploit
   this solution fully you must be at least passingly familiar with Perl.
   For those who can write simple Perl scripts, this is a fine solution
   for small databases of a few thousand records that do not require
   frequent updates.
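
   As a rough illustration of the flat-file approach (written in Python
   rather than Perl, and not using Sprite's own interface), the sketch
   below keeps records in a delimited text file and answers a query by
   scanning every line. The field names, the sample records, and the
   select() helper are hypothetical.

```python
import csv
import io

# Hypothetical flat file: a header line, then one record per line.
FLAT_FILE = """id,title,subject
1,Guide to Census Data,Government
2,Finding Patents,Engineering
3,Local History Sources,History
"""


def select(flat_text, field, value):
    """Return every record whose `field` equals `value` (a full scan)."""
    reader = csv.DictReader(io.StringIO(flat_text))
    return [row for row in reader if row[field] == value]


matches = select(FLAT_FILE, "subject", "History")
print(matches[0]["title"])
```

   Because every query reads the whole file, this style suits the small,
   seldom-updated collections described above.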

   MySQL is compliant with the Structured Query Language (SQL) syntax and
   thus can be queried with the same syntax used to query large commercial
   databases like Oracle and Sybase (see below). However, unlike those
   solutions, MySQL is free. It nonetheless can handle thousands of
   records with aplomb and with decent hardware support can easily handle
   many simultaneous users. This is the industrial solution for those who
   don't have industrial dollars.
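
   To give a sense of what SQL looks like, here is a minimal sketch. It
   uses Python's built-in sqlite3 module as a stand-in so that it runs
   without a database server; it does not connect to MySQL. The table
   name, column names, and sample rows are assumptions made for
   illustration, and drivers for other databases may use a different
   placeholder style than the "?" shown here.

```python
import sqlite3

# An in-memory SQLite database stands in for a real database server.
conn = sqlite3.connect(":memory:")

# Hypothetical table of subject pathfinders.
conn.execute(
    "CREATE TABLE pathfinders (id INTEGER PRIMARY KEY, title TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO pathfinders (title, subject) VALUES (?, ?)",
    [
        ("Guide to Census Data", "Government"),
        ("Finding Patents", "Engineering"),
        ("Local History Sources", "History"),
    ],
)

# A basic SELECT with a WHERE clause; the value is passed as a parameter.
rows = conn.execute(
    "SELECT title FROM pathfinders WHERE subject = ? ORDER BY title",
    ("History",),
).fetchall()
print(rows)
```

   The SELECT statement itself is plain SQL, which is the portable part;
   only the connection code would change for a different database.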

   These kinds of solutions tend to be best when you have more time on
   your hands than money. Since they're free, you don't need much cash,
   but you will need time to program web front-ends to them so users can
   easily interact with these database "shells."

   Enterprise solutions

   If your needs surpass the solutions listed above, or if you wish to
   create a robust and scalable infrastructure, consider an enterprise
   solution. These do not come cheaply -- in terms of either money or
   staff (time and level of expertise). They are big and complex
   solutions for big and complex databases.

   However, depending on the hardware you have, they may run fine on the
   same machine that runs what I call server solutions. The difference is
   not in the hardware or the operating system environment, but in the
   complexity, robustness, and commercial nature (and therefore the
   availability of support) of the enterprise solutions.

   An enterprise database created specifically with library needs in mind
   is SiteSearch from OCLC. SiteSearch is both MARC and Z39.50 compliant,
   but it also can manage many other kinds of data.

   General-purpose enterprise database solutions like Informix, Oracle,
   and Sybase are primarily aimed at the business market; they may work
   for libraries, though they're less consistent in supporting such
   library standards as MARC and Z39.50. On the plus side, they
   (particularly Oracle and Sybase) are widely implemented in the
   commercial sector, which means experienced people and good support
   books should be easy to find.

   Making your choice

   To decide on the appropriate database software for a specific need, you
   must carefully consider a number of variables: how readily you can
   build a database that is usable and effective for your users; how
   easy or difficult it is to set up and maintain; the
   hardware that you have available or can purchase to run it; the number
   of simultaneous users you expect to serve; the amount and quality of
   technical support and service upon which you can rely (either in-house
   or commercial); and the overall cost (including any server purchase or
   upgrade, technical support, the cost of the software, etc.).

   Although generalizations do not always apply, small libraries of any
   type will usually do just fine with a workstation solution like MS
   Access, while larger libraries may require the power and scalability of
   server and enterprise solutions.

   When it comes time to decide, you can take some solace in the fact
   that nearly any solution will allow you to export your data should you
   make the wrong decision or should the decision variables cited above
   change.
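
   One common form of export is a delimited text file that another
   system can import. The sketch below is an assumption-laden example,
   not any product's export feature: it uses Python's sqlite3 and csv
   modules, and the holdings table and the export_table_csv() helper
   are hypothetical.

```python
import csv
import io
import sqlite3


def export_table_csv(conn, table):
    """Dump every row of `table` to CSV text, with a header row."""
    # The table name is interpolated directly, so it must come from a
    # trusted source (it does in this sketch).
    cursor = conn.execute(f"SELECT * FROM {table}")
    header = [col[0] for col in cursor.description]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cursor)
    return out.getvalue()


# Hypothetical one-row table to export.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE holdings (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO holdings (title) VALUES ('Library Journal')")
print(export_table_csv(conn, "holdings"))
```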

   LINK LIST

   "DBMs and Web Delivery"
      http://www.lib.ncsu.edu/staff/morgan/dbms-and-web-delivery/
   FileMaker Pro
      http://www.filemaker.com/
   Informix
      http://www.informix.com/
   Microsoft Access
      http://www.microsoft.com/office/access/
   MySQL
      http://www.mysql.org/
   Oracle
      http://www.oracle.com/
   SiteSearch
      http://purl.oclc.org/SiteSearch/
   Sprite
      http://www.perl.com/CPAN-local/modules/by-module/Sprite/
   Sybase
      http://www.sybase.com/