Microsoft's Live Search Books

By Tom Peters | After playing around for an hour or so with the recently released public beta version of Microsoft's Live Search Books (LSB), I have to admit—against some vague sense that my better judgment is failing me—that I like it.

Sure, others have reported that LSB does not work well—or at all—when using browser software other than Internet Explorer, but if you stick to the straight-and-narrow Microsoft path, the service works and shows potential.

[Image: Spontaneous combustion of Krook in Bleak House]

On December 6th, when the beta version was released to the public, I conducted a couple of sample searches on "phrenology" and "spontaneous combustion," two of my favorite hot topics from the 19th century. Spontaneous combustion sometimes is qualified as spontaneous human combustion, to differentiate the phenomenon, one imagines, from the spontaneous combustion of grain dust in an elevator, or burning bushes, or some random rodent explosion.

The phrenology search returned an impressive 518 books, with Sylvester Graham's Lectures on Chastity showing some promise. The preface to that fine work was written by James Coates, Ph.D., who is described as a "medical magnetist."

[Image: A phrenological examination]

The search for books about spontaneous combustion returned 660 items, with Eliphalet Nott's 1857 Lectures on Temperance floating (or blasting) near the top of the heap. Evidently, the spontaneous combustion of habitual drunkards was a particularly vexing spectacle.

The relevance algorithm used by LSB seems to work in two stages. When a set of books is returned as the result of a search, the most relevant half dozen titles are displayed. Of course, the methods of determining relevance are shrouded in mystery. When I couldn't find a way to advance to the next set of six titles, I scrolled down to the bottom of the page, only to notice that the number of returned results kept increasing as I scrolled downward. Then, once you select a title, you get a batch of relevancy-ranked snippets. If you choose to go to one of the pages containing the returned snippets, the search term(s) you used are highlighted in the text.

Speaking of highlighting, several of the old books I examined contained lots of underlining, highlighting, and marginalia. Initially, this put me off, reminding me that these scanned books are not pristine, but taken right off the shelves of research libraries, with bar codes, property stamps, and doodling there for the entire world to see now. Then I became fascinated by the marginalia. Although much of it is banal, this huge mass of petrified marginalia, now scattered by the digital winds to the four corners of the globe, could be a boon for the study of marginalia.

Although my overall initial impression was positive, some aspects of this service confused or rankled me. The name of the service, for example, is confusing. The phrase "Live Search Books" is neither natural nor mellifluous. Who were the marketing geniuses who came up with that name? Why qualify a search as being "live"? What are the alternatives? A dead search? A batch search that runs at night?

And why can't Microsoft list the total number of scanned books in the collection? All the company can say in the Q&A is that there are "tens of thousands" of English-language books in the collection. There seems to be a trait of the commercial mind that abhors the simple statement of facts, as if stating them plainly were, by some general rule learned in school, bad for business.

The full text of every book retrieved is available for viewing online and downloading. When I downloaded and saved a book, the only file format option was PDF. I could not figure out how to read these PDF e-books provided by Microsoft in Microsoft Reader, but they displayed just fine in Adobe Reader.

According to an article that appeared last week on CNN online, within six months Microsoft plans to integrate Live Search Books' results into results from other content categories, such as Web pages. This strategy is similar to what Google seems to be doing with SearchMash. During this beta phase, only books no longer protected by U.S. copyright laws will be available for search and retrieval. The books have been scanned from the print collections of the University of California, the University of Toronto, and the British Library. According to the CNN article, additional scanned books from the New York Public Library, Cornell University, and the American Museum of Veterinary Medicine will be added soon. Once that last collection comes online, you can bet I'll be searching for "spontaneous rodent combustion."