Volume 12, Number 1, February 2005
Technology Electronic Reviews (TER) is a publication of the Library and Information Technology Association.
Technology Electronic Reviews (ISSN: 1533-9165) is a periodical copyright © 2005 by the American Library Association. Documents in this issue, subject to copyright by the American Library Association or by the authors of the documents, may be reproduced for noncommercial, educational, or scientific purposes granted by Sections 107 and 108 of the Copyright Revision Act of 1976, provided that the copyright statement and source for that material are clearly acknowledged and that the material is reproduced without alteration. None of these documents may be reproduced or adapted for commercial distribution without the prior written permission of the designated copyright holder for the specific documents.
- EDITORIAL: The Origins of TER: Ten Years After. By Thomas C. Wilson.
- REVIEW OF: Phil Bradley. (2004). Advanced Internet Searcher's Handbook. Portland, OR: Neal-Schuman. (ISBN: 1856045234). By Rob Withers.
- REVIEW OF: Rickford Grant. (2004). Linux for Non-Geeks: a Hands-on, Project-based, Take-it-slow Guidebook. San Francisco, CA: No Starch Press. (ISBN: 1593270348). By Wilfred Drew.
- REVIEW OF: Peter Griffiths. (2004). Managing Your Internet and Intranet Services: The Information Professional’s Guide to Strategy, Second Edition. London: Facet Publishing. (ISBN: 1856043401). By Michelle Mach.
- REVIEW OF: Alan Schwartz. (2004). SpamAssassin. Sebastopol, CA: O'Reilly. (ISBN: 0596007078). By Ray Olszewski.
by Thomas C. Wilson
TER began its published life on June 2, 1994. Much has occurred in the intervening time, both within the publication and within the broader world of information technology in libraries. It seems fitting to celebrate ten years of Technology Electronic Reviews (formerly known as Telecommunications Electronic Reviews) and to reflect on what has changed.
Development and Growth
Back in the day, those heady days of the early 1990s, a group of LITA members, primarily from the Telecommunications Interest Group and Microcomputer Users Interest Group, began exploring the idea of creating a new type of online-only publication, one that would be free of many of the constraints on printed publications, one that would fill a niche not being served by other publications, and one that would expose readers in the library community to excellent technical materials produced outside of the typical library press (http://www.lita.org/ala/lita/litapublications/ter/terv1n1.htm#whither). After months of discussion and several proposals, Telecommunications Electronic Reviews (TER) was born. And the rest, "they" say, is history!
By the appearance of the first issue, other online publications had been in existence, some for quite a while. Public-Access Computer Systems Review, for example, was first published in 1990, and Issues in Science & Technology Librarianship from the Association of College and Research Libraries Science and Technology Section came into being in December 1991. D-Lib and Ariadne followed in 1995 and 1996 respectively, as did a host of others in a variety of disciplines when online publishing became "hot" and relatively more straightforward.
So what was different about TER? True to the original thought, TER began, and has continued, as an online-only publication containing a variety of resource reviews (mostly books) and occasional commentary. The irregular schedule of publishing supported the flexibility necessary for a new development, and the online format permitted the content to be of a length appropriate to the reviewed work and the reviewer's level of analysis. TER also included materials not normally reviewed in the library technology press, e.g., books from info tech, education, and standards publishers.
Reflective of its time, TER was first distributed via e-mail (LITA-L) and hosted on the University of Houston Libraries gopher and then the ALA gopher. The table of contents has been distributed to a variety of e-lists since its inception. As the world-wide web became more commonly used, TER was quickly modified to an HTML format which was first hosted at the University of Washington. In 1997 it was moved to the www.lita.org site first hosted by WLN and then migrated to ALA. Author and title indexes and searching capability were added in 1998, as the growth of material available in TER demanded. To a large degree, TER was a product of its time from the perspective of its content as well as its delivery.
As libraries grew in understanding and use of the web, TER led the way in providing useful analysis of resources about the technical infrastructure of the Internet, the tools for creating a presence on the web, and materials for teaching the use of this world-wide treasure. It also made these resources accessible to the library community, weaving together highly technical topics as well as policy and management issues. With this growing scope in coverage, the TER Editorial Board changed the title in 2001 to Technology Electronic Reviews.
Life beyond TER
In the meantime, much has happened in the worlds of technology and libraries. We've all experienced a telecommunications revolution: ever expanding broadband, massively reduced baseline costs, more competition, a glut of dark fiber, several convergences of voice and data services, less competition (who would have thought we'd see the day when a "Baby Bell" would buy out "Ma Bell"!), and the raising of the proverbial bar on what constitutes infrastructure. To paraphrase a popular movie line, just try getting through the day without infrastructure! Much of what used to be required knowledge to play on the Internet is now hidden and taken for granted. That is not to say, however, that the same issues do not exist.
We've all also been a part of the commodification of once specialized products and services. Home networking, for example, in the early 1990s was limited to those who understood wired Ethernet, Token Ring, or Arcnet technologies and could afford to deploy them, unless, of course, one wished to consider serial device networking and null modem cables. Consumer electronics has exploded well beyond TVs, stereos, and gaming consoles. Relatively low cost options for communications, personal information management, home management, entertainment, and common chore automata have successfully pushed increasingly sophisticated technology into the collective human psyche. One need look no further than the recent Staples commercial featuring Alice Cooper to begin to grasp the cross-cultural and cross-generational appeal of and assumptions about technology.
During the past ten years the library community has begun to chart some progress in getting the powers that be to understand, "It's the content, Stupid." Now, however, the focus must also include what one can do with the content: personal and collaborative workspaces, device independence, interactivity, reusability, and continued availability. What began as the primordial web looks that way from the vantage point of today. Imagine what we might be saying in another ten years.
Measuring the Success
For any publication, one of the ways to gauge its success is to determine if there is a readership and how consistent it is. The available access statistics for TER indicate that the number of retrievals for the publication has grown substantially from the beginning. The access has remained relatively stable in recent years.
TER Access Statistics (not counting LITA-L)
Year    Retrievals
1996     7,143
1997    11,871
1998    16,045
1999    23,738
2000    53,014
2001    43,281
2002    50,627
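For readers who want to check the growth described above, the following is a minimal sketch that works through the arithmetic on the figures in the table. The variable and function names are illustrative only, and the year-over-year breakdown is an added convenience rather than part of the original statistics.

```python
# Retrieval counts copied from the TER access statistics table (LITA-L excluded).
retrievals = {
    1996: 7143, 1997: 11871, 1998: 16045, 1999: 23738,
    2000: 53014, 2001: 43281, 2002: 50627,
}

def yearly_change(counts):
    """Return {year: percent change from the previous year} (hypothetical helper)."""
    years = sorted(counts)
    return {
        year: 100.0 * (counts[year] - counts[prev]) / counts[prev]
        for prev, year in zip(years, years[1:])
    }

changes = yearly_change(retrievals)
# Ratio of the last reported year to the first reported year.
overall = retrievals[2002] / retrievals[1996]

for year, pct in changes.items():
    print(f"{year}: {pct:+.1f}%")
print(f"1996 to 2002: {overall:.1f}x")
```

On these figures, retrievals in 2002 were roughly seven times those of 1996, with the largest single-year jump in 2000 and a dip in 2001.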
Another measure of its success may come from the number and types of references from other publications that have been created over the years.
Selective Listing/Indexing of TER
- AcqWeb's Directory of Book Reviews on the Web
- ARL Directory of Electronic Journals and Newsletters
- BUBL LINK / 5:15 Catalogue of Internet Resources
- China Tech
- Colorado Alliance of Research Libraries, Electronic Journal Access
- CompInfo - The Computer Information Center
- CTI Engineering, Engineering Internet Resources
- DutchESS (Dutch Electronic Subject Service)
- Edinburgh Engineering Virtual Library (EEVL)
- Ex Libris SFX knowledgebase
- Gill's Favourite Electronic Journals
- Index Morganagus
- The Information ~Wave~ A Collection of Current Awareness Links
- Internet Public Library
- Librarians' Index to the Internet
- National University of Singapore Subject Guides for Web Resources
- Revista Ciencias de la Información
- Riviste elettroniche delle discipline bibliotecarie & documentali
- Simmons College, Library and Information Science Electronic Journals
- Thomas Parry Library - Electronic Journals in Librarianship
- University of Limerick, IE WEB Electronic and Computer Engineering
- University of Michigan, Telecom Information Resources on the Internet
- University of New Brunswick, Saint John Ward Chipman Library, Electronic Commerce and EDI
- University of Pennsylvania Library, Electronic Journals
Yet another model is the Crawford Test in reference to e-journals: "[e-journals with] a minimum of six years [in existence], can be considered lasting titles." That measure would suggest that Technology Electronic Reviews is well on its way to becoming a staple of e-publications.
It has been an honor to be a part of this publication and the developments it chronicles. My heartfelt thanks go to the original editorial board who birthed the publication: Charles Blair, Marshall Breeding, Peter Burslem, Thomas Dowling, Pat Ensor, Elizabeth Lane Lawley, Nancy Nuckles (now Colyar), and Kate Wakefield. I also wish to thank all those who have served on the Editorial Board over the years. Vision, leadership, and tenacity have been offered by the editors of TER thus far: Thomas C. Wilson, 1994-2000; Adriene Lim, 2000-2003; and Sharon Rankin, 2003-.
 The Big Chill.
 Statistics were not tracked in the first two years. With changes in the ALA website, no more recent statistics are available.
 Crawford, W. (2002). Free electronic refereed journals: getting past the arc of enthusiasm. Learned Publishing, 15(2), 117-123.
Thomas C. Wilson is founding editor-in-chief of TER and the Director for Information Technology at the University of Maryland Libraries.
Copyright © 2005 by Thomas C. Wilson. This document may be reproduced in whole or in part for noncommercial, educational, or scientific purposes, provided that the preceding copyright statement and source are clearly acknowledged. All other rights are reserved. For permission to reproduce or adapt this document or any part of it for commercial distribution, address requests to the author at TWilson@umd.edu
by Rob Withers
The Advanced Internet Searcher's Handbook is directed at a broad audience ranging from novices to experts. Despite the title, the Handbook's stated goal is to enable this audience to "more effectively understand how search engines and related software and utilities work," so that their search techniques improve. The author intends for the Handbook to be usable both by those who read it from cover to cover and by those who simply consult chapters of interest to them.
The author has a strong interest in this topic, as his earlier output shows: it includes two earlier editions of the Handbook and two editions of Internet Power Searching: The Advanced Manual (New York: Neal-Schuman, 1999 and 2002). Other books by this author include the Business and Economy Internet Resource Handbook (London: Library Association, 2000) and Getting and Staying Noticed on the Web (London: Facet, 2002).
Writing books about a rapidly changing topic such as Internet searching, even if they are frequently revised, is a challenging task. Aspiring to make the book relevant to the needs and interests of readers with varying degrees of expertise and experience adds to the challenge. Although not without some areas for improvement, the Handbook rises to these challenges and presents a thorough overview of its topic in a readable style.
The Handbook is divided into three sections: "Mining the Internet for Information" covers such topics as search engines, search directories, multi- or meta-search engines, site specific search engines, the "hidden" web, tools for searching multimedia, and tools for finding people. "Becoming an Expert Searcher" includes additional resources including weblogs, virtual libraries, newsgroup and mailing list archives, and what the Handbook refers to as "other" available database resources. "The Future" deals with strategies and tools for improving searching efficiency.
The rationale for distributing chapters under these three rubrics isn't always particularly clear. Most notably, the decision to include some chapters in "Mining the Internet for Information" as opposed to "Becoming an Expert Searcher" seems arbitrary. Although chapters about widely-used tools such as search engines, search directories, and resource-specific engines clearly belong in an introductory segment, more obscure topics such as the invisible web could arguably go in either the first or second segment. Since the Handbook's strength is in its readable chapters, perhaps the first two rubrics could have been combined.
The Handbook's analysis of resources is frequently thorough and insightful. Often, the Handbook compares search results from two comparable resources. In describing the resources, the Handbook calls attention to unique features and makes explicit comparisons between similar resources in order to elucidate situations in which each particular resource is likely to shine. Moreover, the Handbook doesn't uncritically regurgitate claims made by search tools; in one instance, the Handbook notes that one service "claims to be the largest image search on the web, a claim that Google wishes to challenge. However, since [the other service] doesn't say how many images it has indexed, it's rather hard to comment!" This type of insightfulness and analysis distinguishes the Handbook from many guides which appear to be little more than cut-and-paste compilations from various services.
Because its author is British, the Handbook adopts an international focus, with examples of resources for locating some types of information that are specific both to the United States and to the United Kingdom. While some readers may not have any application for resources targeted at another country, the inclusion of resources from outside the United States is a helpful reminder of the international nature of information found on the Internet. Only in a few isolated instances (e.g., a discussion of adoption registries) does the book omit a United States equivalent to a British website.
The Handbook is currently in its third edition. The author's nearly continuous work at revising the book is often helpful, as the Handbook is sometimes able to identify changes made within the year or two since the previous printing and alert the reader to new features or new services. At times, however, it also retains information from previous editions (e.g., discussions of Veronica and Archie) that might be better suited for a book about the history of the Internet rather than a guide to advanced Internet searching.
Each chapter ends with a bibliography of websites, although regrettably, there is no corresponding website containing a set of links which can be updated to incorporate new services, remove defunct ones, and reflect changes to the scope or URL of these services. Entries in these bibliographies pose several challenges to the reader, as well. Unfortunately, the title of the website does not appear with the URL, which hinders attempts to quickly identify resources. While annotations about each URL might not present new information to people who have read the chapter, they might be useful to those who are skimming the bibliography. Also, the form of the URLs given is inconsistent: some include the 'http://' prefix, but others do not. While the content of these bibliographies is useful, the presentation could be more readily usable than in its current form.
Appendices include a list of source code needed to create a website that links to an extensive collection of search tools and a list of top level domains for countries.
The Handbook provides an overview of a wide array of sources, but the quantity of resources does not detract from the detail it provides about many of them. Its readable style, insertion of the author's viewpoint and occasional sense of humor, and its use of "Hints & Tips" and "Did You Know" boxes, help the book to appeal to relative novices, while the breadth and depth of coverage will appeal to those with more experience. Beginners may dip into a chapter or two, while more advanced users will benefit from a more extensive reading.
Rob Withers is the Assistant to the Dean & University Librarian at Miami University in Oxford, Ohio.
Copyright © 2005 by Rob Withers. This document may be reproduced in whole or in part for noncommercial, educational, or scientific purposes, provided that the preceding copyright statement and source are clearly acknowledged. All other rights are reserved. For permission to reproduce or adapt this document or any part of it for commercial distribution, address requests to the author at email@example.com
REVIEW OF: Rickford Grant. (2004). Linux for Non-Geeks: a Hands-on, Project-based, Take-it-slow Guidebook. San Francisco, CA: No Starch Press.
by Wilfred E. Drew
Many of us, including those we serve, want to switch to something not owned or produced by Microsoft. Linux is one of the most viable alternatives to the Windows operating system. However, it can be very scary for those without experience in installing operating systems to take the plunge. Grant's book is the manual on how to do that for "non-geeks." The author characterizes "non-geeks" as those that want to use Linux in the same way as they use Mac or Windows, users who are more skilled using computers but are new to Linux, and those like myself that need a little push and encouragement to use Linux. This guidebook comes with two CD-ROMs containing Fedora Core Linux. The reader may want to install the latest Fedora Core version after finishing this book.
Linux for Non-Geeks uses a project-based approach starting in Chapter 1, "Becoming a Penguinista". This chapter begins by giving the future Linux user a general background on what Linux is, why the penguin is its mascot, and why someone might want to use Linux. The author then goes into detail about Fedora Core, hardware compatibility, and hardware requirements. All of this is offered up in easy-to-understand, everyday English.
The second project and Chapter 2, "Making Commitments", explains in very detailed steps how to install Linux on a computer in various ways. This includes dual-boot installation for those that want both the original operating system and Linux on their machine or a complete replacement of the old operating system with the Fedora Core flavor of Linux.
Perhaps the most useful parts of this book are Chapters 3 through 7. Chapter 3, "A New Place to Call Home", explains how to use the GNOME desktop. GNOME provides a graphic interface to Linux. It is very similar in appearance and functionality to the Mac or Windows desktop. This chapter includes a very useful and fun project on customizing the GNOME panel and another on doing screenshots. "More Than Webbed Feet", Chapter 4, is all about getting connected to the World Wide Web and the Internet. It explains how to connect via broadband, wireless, or dial-up. It also explains how to use and set up e-mail, web browsing using Mozilla, instant messaging using Gaim (which can connect you to AOL Instant Messenger users), and several other interesting Internet applications. Chapter 5, "Dressing up the Bird", is all about customizing the desktop to make it your own. This includes everything from changing icons for folders to installing themes for Mozilla. The next chapter, "Gutenbird", involves installing printers and managing print queues. The only part missing from this chapter is how to install networked printers such as those on a local area network. However, it is not difficult to figure out how to do so with a little poking around. Chapter 7, "Putting your Data on Ice", explains in great detail, with easy-to-understand instructions, how to format floppies, use data CD-ROMs, play music CD-ROMs, and burn CD-ROMs of all types using CD-ROM burning software.
After using Linux and GNOME, the user soon finds it necessary to update or install software. Chapters 8 through 11 provide the necessary background to do that. Chapter 8, "RPM isn't a 1980s Atlanta-based Band", explains how to use RPM (Red Hat Package Manager) and RPM packages to install and update software on your Linux machine. It also tells you how to roll back or uninstall packages. "Simple Kitten Ways", Chapter 9, explains how to use Terminal to provide a command line environment. It covers commands needed to help you manage and change settings in programs as well as performing other functions in text mode. Chapter 10, "Yes, Yet another Way", covers using APT (Advanced Package Tool) and Synaptic to install software. Chapter 11, "Dining on Tarballs", covers compiling and installing programs from source files.
Linux is capable of handling other data storage devices besides floppies and CD-ROMs. Chapter 12, "Data on Ice Revisited", looks at installing and using USB (universal serial bus) storage devices and also examines how to mount and unmount Windows partitions. This is useful for people running dual-boot systems. Chapter 13, "Tux Rocks", will be useful to those of you that make and play MP3 files. This chapter goes into great detail on how to do that in Linux, including installing RealOne Player, which is used in place of Windows Media Player and Real Player. Linux comes with many graphics capabilities. "Brush-Wielding Penguins", Chapter 14, tells the reader how to download and install free programs for expanding Linux's abilities in manipulating and creating graphics files.
Linux is also capable of doing the same type of functions as the Microsoft Office Suite. Chapter 15, "Penguins Back at Work", discusses OpenOffice and its applications. OpenOffice is an open source office suite including Writer (a full-featured word processor), Calc (a spreadsheet), Impress (presentation software), Draw (vector drawing), and Math (a mathematical formula editor). It is installed as part of the Fedora Core version of Linux.
The next four chapters will be of interest to more advanced computer users. Chapter 16, "Font Feathered Frenzy", examines how to manage and install TrueType fonts as well as how to create fonts. This chapter also looks into how to use Windows fonts in Linux. Chapter 17, "Tux Speaks Your Language", gives instructions on using and changing language support in Linux, including installing a multilingual dictionary, StarDict. Chapter 18, "Tux Untethered", will be of special interest to those installing Fedora Core on a laptop. This chapter focuses in great detail on wireless connectivity. It tells the reader how to set it up and how to troubleshoot any problems.
After completing the various projects in this book, it is time to think about "Leaving the Nest". Chapter 19 tells the user how to proceed into more advanced areas such as changing system settings, alternative desktops besides GNOME, keeping the system up-to-date, programming in Linux, and how to run Windows programs under Linux.
Every user needs to know how to handle operating system problems. The last chapter in the book, "What to do if Tux Starts Acting Up", does this for the new Linux user. It reads much like an FAQ (frequently asked questions) file. It lists typical problems and questions with easy-to-understand answers. To assist the new Linux user, the guide also includes two appendices and an excellent index. Appendix A lists specifications for various projects in the book and Appendix B offers lists of resources for more help including web pages and mailing lists.
This book is highly recommended for all users, from beginner to expert, looking for a way to learn more about Linux using a project-based approach.
Wilfred (Bill) Drew is Systems and Reference Librarian, State University of New York, College of Agriculture and Technology.
Copyright © 2005 by Bill Drew. This document may be reproduced in whole or in part for noncommercial, educational, or scientific purposes, provided that the preceding copyright statement and source are clearly acknowledged. All other rights are reserved. For permission to reproduce or adapt this document or any part of it for commercial distribution, address requests to the author at firstname.lastname@example.org
REVIEW OF: Peter Griffiths. (2004). Managing Your Internet and Intranet Services: The Information Professional’s Guide to Strategy, Second Edition. London: Facet Publishing.
by Michelle Mach
Managing Your Internet & Intranet Services provides a strong introduction to the major issues of library website management. The book is organized into thirteen chapters, plus an introduction.
(A detailed table of contents is available on the publisher’s website at:
Eight chapters concentrate on initial site creation and ongoing maintenance with topics ranging from choosing an ISP (Internet service provider) and domain name to linkchecking an existing site. Two chapters cover website staffing, while the topics "Your Intranet" and "The Internet Revolution" each merit their own chapters. Each chapter, and even most sub-sections, could easily justify a lengthy journal article or an entire book of its own. For example, the two topics of usability testing and database-driven websites each merit only a single paragraph in the book. Readers who want more technical or detailed information will need to seek other sources. Fortunately, the last chapter includes recommended resources for further study and a basic glossary. Scattered throughout the book are a handful of screen shots illustrating various design or content principles. The book also includes an index.
While this is a printed book, it is written in a web-friendly style that can be easily and quickly scanned. Each chapter begins with a bulleted list of contents and ends with a brief summary and references. The book relies heavily on bulleted or numbered lists to convey information. Bold or italicized subheadings appear on nearly every page, making it easy to find and read only the essential sections. The writing itself is extremely non-technical, practical and occasionally humorous. For example, in describing the move from the single webmaster management style, Griffiths writes, "The days of the all-singing, all dancing webmaster are largely gone from all but the smallest sites. . ." (p. 65).
The broad content, compact size and easy-to-read style make this an ideal book for the busy library administrator who needs to manage web personnel or oversee web development. Managers, especially those in special libraries, will appreciate the sections on outsourcing, staffing, budgeting, policies, and the politics of selling the library’s involvement in a website to non-library management. New LIS (Library and Information Studies) graduates hoping for web-related jobs might read this book to better understand the breadth of issues involved in creating and maintaining a website. Freelancers who need to build a site completely from scratch will especially appreciate the early portion of the book that assumes that no prior website exists. Given the novice audience for the book, the experienced webmaster is unlikely to uncover much new information. However, he or she may find it reassuring to learn that many of the challenges they face every day at work are not unique to their particular library, but experienced by almost all webmasters.
A major strength of this book is the honest description of the many personnel and political issues related to website management. Author Peter Griffiths is the Assistant Director, Communication Directorate at the UK Government Home Office, where he is Head of Profession for librarians and information scientists. While his job title suggests an administrative position, his perceptive descriptions suggest substantial hands-on, in-the-trenches website work. Griffiths discusses many issues that are not always mentioned in the technical literature: how to select and keep technical staff, deal with that well-meaning but totally inappropriate website suggestion, motivate lazy web content providers, and recognize the website problems caused by disgruntled, alienated or non-cooperative employees (or ex-employees).
One minor disappointment in this book is the comparatively brief discussion of intranets. Although the book's title Managing Your Internet and Intranet Services implies that both topics would receive comparable treatment, intranets only merit a single chapter. Given that intranets have some unique features and have not been discussed as extensively in the literature, some additional coverage of the topic seems necessary. It is also worth noting that while much of the book’s content is international, this book was published in the United Kingdom. Mentions of pounds sterling, the London Underground and other British items might be mildly disconcerting to an American reader. At the same time, the section on publishing non-English websites is probably more thorough than in a comparable American text.
Since the online world changes so quickly, it is worth briefly comparing the two editions of this book. An initial glance indicates that the content of this second edition, published in 2004, is surprisingly similar to the first edition, published in 2000. Of course, the resource lists have been updated in this new edition to include articles, books and websites published in 2001 through to 2003. References to specific web authoring or maintenance software packages have also been updated. Some new topics in the second edition include information architecture, content management software, weblogs, extranets, and reputation management. However, the tables of contents for the two books are nearly identical, as is the relative length of each book. Minor wording changes aside, much of the content appears the same. A comparison of three lists found in both editions (how to annoy people with your website, the top ten features of library websites, and the golden rules of website content) shows only one item in one list has changed from the first edition. Similarly, a comparison of the two glossaries found only a handful of new terms added (blog, CSS, database-driven sites, java, XML) and another handful (mostly general terms like plug-ins, style sheets, webmaster) discarded. Given the book’s broad treatment of the Internet, it is not too surprising that its content has enjoyed such staying power.
Two minor changes found in the new edition reflect society’s changing perceptions regarding the Internet. First, the chapter "The webmaster" in the first edition became "The webmaster and the web team" in the second, recognizing that websites are no longer a one-person development. Second, the question "Why have a website?" appears as a subheading in both editions. Initially, inclusion of this question in the second edition seems surprising. Does any library -- particularly one attached to a business -- truly believe that a website is unnecessary? (Whether or not they can afford one is an entirely different question.) The different answers to this question show how much our view of the Internet has changed. In the first edition, websites are described as a new and exciting opportunity, one that will reach an international audience, bring in money and reduce communication costs. In the second, Griffiths admits that it will take some money to draw in this extra business and that the main problem Internet users face today is evaluating information, rather than finding it. Rather than viewing a website as something unique that sets your organization apart from others, it now "has become a part of daily business across the world and the question is no longer whether you will be part of it or not, but whether you will be an effective part of it, and whether your site will stand out from the welter of others." (p. 27)
Managing Your Internet and Intranet Services is a highly readable, solid overview of website management. Given its general coverage of website issues, it is likely to remain relevant longer than more technical guides. In fact, some readers of the first edition may decide that it isn’t necessary to update their copies yet. In any case, readers with hands-on, web-related positions will need to supplement this text with more technical resources. In particular, librarians may want to seek out library-specific resources, as coverage of library-related issues like e-journals, library catalogs, databases, and in-house digitization is minimal or nonexistent. Griffiths himself acknowledges the need to go beyond his book to keep current. As he writes, "Do the best job you can on your own behalf and keep on top of the wave. It hasn’t stopped being exciting yet." (p. xi)
Michelle Mach is Digital Projects Librarian at Colorado State University.
Copyright © 2005 by Michelle Mach. This document may be reproduced in whole or in part for noncommercial, educational, or scientific purposes, provided that the preceding copyright statement and source are clearly acknowledged. All other rights are reserved. For permission to reproduce or adapt this document or any part of it for commercial distribution, address requests to the author at email@example.com
by Ray Olszewski
If you have an e-mail address, you know about the flood of unsolicited e-mail commonly called spam. If you have a well-known e-mail address, one that shows up on a lot of mailing lists and websites, or one easily guessed, then you know a lot about spam. My public e-mail address, for example, is "well-known" by both measures, and I would estimate that between 95 and 98 percent of all e-mail I receive is spam. It has become a problem of such magnitude that it calls into question the long-term viability of e-mail in its present form.
The problem will be fixed ... eventually. But an international, weakly regulated entity like the Internet is slow to respond institutionally to these sorts of problems. In the meantime, what's a suffering user or sys admin (system administrator) to do?
There are actually many ways that individual users and sys admins can at least make spam manageable. The mail-filtering program SpamAssassin is one of the more popular tools for doing so. It is a program that runs on a Unix- or Linux-based SMTP (Simple Mail Transfer Protocol, the main service that delivers e-mail over the Internet) server. SpamAssassin runs each message it processes through a screen of filters that calculate the probability that it is an unwanted e-mail message, then sorts mail into "good" and "bad" piles based on the score and a cutoff point.
Alan Schwartz's book SpamAssassin is a detailed guide to installing, maintaining, and using the program SpamAssassin.
About half of the book is oriented to familiarizing the reader with how SpamAssassin itself works. The program uses a mix of approaches, but the key one is a technique called "Bayesian filtering," which calculates conditional probabilities of a particular message being spam, based on whether it matches various filtering criteria. The probabilities are site-specific, learned by examining a collection of messages that the site has received and that the recipient, or the sysadmin, has categorized as spam or "ham" (a jargon term for the non-spam messages a user wants to see). Schwartz does an excellent job of describing the practical, workaday details of Bayesian filtering and of how to run SpamAssassin on a collection of categorized messages to develop starting probabilities.
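To make the idea of learning site-specific probabilities from categorized mail more concrete, here is a minimal naive Bayes sketch in Python. It is an illustration only, not SpamAssassin's actual algorithm or code: the function names (tokenize, train, spam_probability), the whitespace tokenizer, and the add-one smoothing are all assumptions made for this example.

```python
import math
from collections import Counter

def tokenize(text):
    """Split a message into a set of lowercase word tokens (a deliberately crude tokenizer)."""
    return set(text.lower().split())

def train(messages):
    """Count, per class, how many training messages contain each token.

    messages: iterable of (text, is_spam) pairs categorized by a person.
    Returns (counts, totals), both keyed by True (spam) and False (ham).
    """
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        totals[is_spam] += 1
        counts[is_spam].update(tokenize(text))
    return counts, totals

def spam_probability(text, counts, totals):
    """Estimate P(spam | tokens), treating tokens as independent (the 'naive' assumption)."""
    n_spam, n_ham = totals[True], totals[False]
    # Start from the prior odds of spam versus ham, with add-one smoothing.
    log_odds = math.log((n_spam + 1) / (n_ham + 1))
    for tok in tokenize(text):
        # Smoothed fraction of spam and ham messages that contain this token.
        p_tok_spam = (counts[True][tok] + 1) / (n_spam + 2)
        p_tok_ham = (counts[False][tok] + 1) / (n_ham + 2)
        log_odds += math.log(p_tok_spam / p_tok_ham)
    # Convert log-odds to a probability without overflowing for extreme values.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)
```

Trained on even a handful of hand-sorted messages, a filter of this kind scores new mail according to the words that site's own spam and ham tend to contain, which is why the learned probabilities differ from one site to another.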
Much of this part of the book covers in detail the specific tests that the program SpamAssassin applies to each message. They include tests based on the headers and message content themselves; tests based on Internet resources like "black hole" lists of spam sources; and locally-configurable lists of e-mail addresses always to be accepted ("white lists") or always to be rejected ("black lists").
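The combination of tests, lists, and a cutoff described above can be sketched roughly as follows. This is a hypothetical Python illustration, not SpamAssassin's rule set or configuration syntax: the rule names, the scores, the message dictionary layout, and the choice to let the white list and black list short-circuit the other tests are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Rule:
    name: str                                # label for the test (hypothetical)
    score: float                             # points added when the test matches
    matches: Callable[[Dict[str, str]], bool]

# Hypothetical header and content tests; names and scores are illustrative only.
RULES = [
    Rule("SUBJECT_ALL_CAPS", 1.5, lambda m: m.get("subject", "").isupper()),
    Rule("BODY_FREE_MONEY", 2.5, lambda m: "free money" in m.get("body", "").lower()),
    Rule("MISSING_DATE", 1.0, lambda m: not m.get("date")),
]

def classify(msg, whitelist=frozenset(), blacklist=frozenset(),
             listed_relays=frozenset(), cutoff=5.0):
    """Return (label, score) for a message represented as a dict of strings.

    In this sketch, white-listed senders are always accepted and
    black-listed senders always rejected, before any scoring happens.
    """
    sender = msg.get("from", "").lower()
    if sender in whitelist:
        return "ham", 0.0
    if sender in blacklist:
        return "spam", cutoff
    # Add up the scores of every header or content test that matches.
    score = sum(rule.score for rule in RULES if rule.matches(msg))
    # Stand-in for a network "black hole" list lookup: a local set of relay addresses.
    if msg.get("relay") in listed_relays:
        score += 3.0
    return ("spam" if score >= cutoff else "ham"), score
```

Raising or lowering the cutoff trades missed spam against legitimate mail wrongly rejected, which is the kind of tuning decision a site administrator has to make.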
This early part of the book begins with instructions for how to obtain and install the SpamAssassin program. Written in the programming language Perl, it is available from the CPAN archive of Perl applications. Schwartz does a nice job of covering the requirements for running the program and how to install it from this "upstream" source. Although he notes the alternative of getting SpamAssassin as part of a Linux distribution like Debian or Gentoo, his coverage of this option is skimpy -- unfortunately so, because it would be the preferred approach by any experienced Linux sysadmin, since it assures version compatibility with the shared libraries and other resources on the server.
The latter half of the book provides detailed instructions for configuring and using SpamAssassin with the four most common free-software SMTP servers -- sendmail, postfix, qmail, and exim. I use exim here on my own LAN (local area network), so I focused my review on that chapter, and I found the instructions clear, detailed, and accurate. A more superficial reading of the other chapters indicated the same degree of clarity and detail. There is also a briefer discussion of using the SpamAssassin program separately from the site's SMTP server, such as implementing it as a POP (Post Office Protocol) proxy server.
In addition to the book itself being both well written and thorough in its treatment, SpamAssassin (the book) benefits from SpamAssassin (the program) being a popular, well-maintained piece of free software. Licensed under the GNU General Public License (GPL), the software is both free in cost and freely modifiable by the programming community. Together, these characteristics encourage wide adoption and rapid adaptation to new spam strategies as they arise.
SpamAssassin (the program) is probably overkill for small sites with only a few users. In my home office, I still use simpler filtering strategies that work acceptably well because there are only three users of the domain for e-mail. Larger sites, including libraries and schools, need a more heavyweight solution, one that supports the needs of e-mail users who are not themselves techies. Anyone looking for a way to limit the disruption of spam would do well to consider SpamAssassin (the program) for that purpose, and for those folks, SpamAssassin (the book) is a first-rate guide to making the program work effectively. Beyond that, anyone just looking for a deeper understanding of the various strategies mail recipients can use to fight spam would benefit from reading this book.
Ray Olszewski is a Senior Software Engineer at Protogene Laboratories, where he focuses on embedded systems development. He has also worked as a freelance computer programmer and statistician. His work includes development of custom Web-based software to support on-line research.
Copyright © 2005 by Ray Olszewski. This document may be reproduced in whole or in part for noncommercial, educational, or scientific purposes, provided that the preceding copyright statement and source are clearly acknowledged. All other rights are reserved. For permission to reproduce or adapt this document or any part of it for commercial distribution, address requests to the author at firstname.lastname@example.org
The TER Editor is Sharon Rankin, McGill University (email@example.com). Editorial Board Members are: Linda Robinson Barr, Texas Lutheran University (firstname.lastname@example.org); Paul J. Bracke, Arizona Health Sciences Library (email@example.com); Kathlene Hanson, California State University, Monterey Bay (firstname.lastname@example.org); Adriene Lim, Wayne State University (email@example.com); Tierney Morse McGill, Colorado State University (firstname.lastname@example.org); Florence Tang, Mercer University, Atlanta (email@example.com); Stacey Voeller, Minnesota State University (firstname.lastname@example.org); Laura Wrubel, University of Maryland (email@example.com); and Michael Yukin, University of Nevada, Las Vegas (firstname.lastname@example.org).