Meta, Federated, Distributed: Search Solutions
By Walt Crawford
American Libraries Columnist
Senior analyst, Research Libraries Group
Column for August 2004
Call it federated searching, if you will—or distributed searching, or cross-database searching, or metasearch, or even search portals. I suspect there are those who can make sensible distinctions between these terms, people who can explain why Searchlight at the California Digital Library and WebFeat offer federated searching, while Vivisimo and iBoogie are examples of metasearch, and Godot (at several Canadian libraries) does distributed searching. In practice, I believe most distinctions have been lost in discussions and papers about the various functions that carry these names. I’ll call all the one-stop search shops metasearch interfaces for now.
There’s nothing new about cross-database search and retrieval: Dialog was doing it long before the Web. In some cases, metasearch is just a fancy word for cross-database search and retrieval; in others, I think it’s something different.
Choose a function
Web metasearch seems straightforward. A web metasearch engine takes your search, sends it to several web search engines (MSN, Yahoo, AllTheWeb), and brings back a relatively small number of results from each search engine. The metasearch engine removes duplicates and shows a combined set of results, which sometimes include the rank from each search engine or topical clusters. You may be able to select the web search engines to be included and have some control over how many records are retrieved from each engine. Every web search engine I’ve tried sorts the records by relevance. Every returned record has a URL. Those two factors make it easy for the metasearch engine to eliminate duplicates and provide a combined relevance-ranked result.
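The merge step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any named metasearch engine works: the engine names, the `normalize` helper, and the scoring rule (each engine contributes 1/rank for a URL) are all assumptions chosen to keep the example short.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url):
    """Crude URL key (an assumption): lowercase host, drop scheme,
    fragment, and trailing slash so near-identical URLs compare equal."""
    parts = urlsplit(url)
    key = parts.netloc.lower() + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(results_by_engine):
    """Combine per-engine ranked URL lists into one deduplicated ranking.

    results_by_engine maps a (hypothetical) engine name to its URLs,
    best first. Returns a list of (url, {engine: rank}) pairs.
    """
    scores = defaultdict(float)
    ranks = defaultdict(dict)
    first_seen = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            key = normalize(url)
            if engine in ranks[key]:
                continue  # engine listed the page twice; keep its best rank
            first_seen.setdefault(key, url)
            scores[key] += 1.0 / rank  # assumed scoring: reciprocal rank
            ranks[key][engine] = rank
    # Highest combined score first; ties broken by URL key for stability
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return [(first_seen[k], ranks[k]) for k in ordered]


# Made-up result lists standing in for two search engines
sample = {
    "engine_a": ["http://example.org/a", "http://example.org/b"],
    "engine_b": ["http://EXAMPLE.org/b/", "http://example.org/c"],
}
for url, per_engine in merge_results(sample):
    print(url, per_engine)
```

Because every record carries a URL and arrives in relevance order, the URL serves as the duplicate key and the rank is the only signal needed for a combined ordering. Keeping the per-engine ranks alongside each URL is what lets a display show "the rank from each search engine."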
Research library searches are inherently more difficult. The metasearch interface can ask the user to select groups of available databases first, or it can suggest appropriate groups based on a search. It can then present a combined set of results, or it can list the result size for each database that produced results, with links to retrieve those records. The interface might also encourage the user to switch to a native interface for preferred databases.
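The second display option, listing result sizes per database rather than merging records, might look something like the following sketch. The database names and the stand-in search functions are hypothetical; a real interface would call each provider's own search service.

```python
def summarize_hits(query, databases):
    """Return (database name, hit count) pairs for databases with results.

    `databases` maps a name to a search function that takes a query
    string and returns a list of records. Databases with no hits are
    left out, matching the "produced results" wording above.
    """
    summary = []
    for name, search in databases.items():
        records = search(query)
        if records:
            summary.append((name, len(records)))
    return summary


# Hypothetical stand-ins for three very different sources
databases = {
    "catalog": lambda q: ["record 1", "record 2"],
    "article index": lambda q: [],
    "full-text aggregation": lambda q: ["record 1"],
}
print(summarize_hits("metasearch", databases))
```

This approach sidesteps the merge problem entirely: nothing needs to be deduplicated or ranked across sources, so it still works when the underlying databases cannot sort by relevance.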
The biggest advantage of metasearch is that users enter a search once, usually in a Google-like single box, and don’t have to puzzle over which of several hundred databases will suit their needs. It may offer other advantages, but that’s the primary benefit.
Setting aside differences in response time and database size (that 10-ton elephant), there are also problems of wildly disparate database types: full-text aggregations with minimal metadata, catalogs and pure indexes lacking full text, databases that aren’t textual in nature, and web pages that aren’t databases. The most common search-and-retrieve protocol (Z39.50) is rare outside the library field and implemented differently by different database providers. Many library-related databases don’t do relevance ranking or any complex sorting, making combined displays more difficult.
Some people view metasearch as a magic bullet, the answer to all searching needs for libraries. Some librarians view metasearch as a waste of time and money. I believe they’re both wrong. The truth lives somewhere in between.
Well-implemented metasearch has real advantages for many libraries, for many users, for many situations. Individual databases and online catalogs with their native search interfaces also have substantial advantages over metasearch for many libraries, for many users, for many situations.
Many solutions
One solution is rarely right for all circumstances; one user interface or one search technique is unlikely to suit all circumstances. “The user” is even less meaningful as a term than “the library”: Faculty and postdoctoral students aren’t the same as freshmen, and students taking honors courses in their majors have different needs than students rushing out a survey course paper.
I won’t recommend a metasearch solution and don’t believe one solution will suit all libraries. I do recommend paying attention to the thoughtful work being done on the hard problem of making metasearch a worthwhile complement to database searching.
The tricks are to determine which solutions make most sense for which cases, come up with good ways to guide users to the solutions that make most sense for them at the time (and make it easy to move among solutions), and make metasearch work as well as it possibly can. Those aren’t trivial tricks. They are worthwhile goals.