
Book Search as a product


A few days ago, I was having dinner with a publishing friend in San Francisco, and we started talking about Google Book Search.  We were both tired, but enjoying our conversation, and my friend made an observation that struck me, and with which I agreed.

Neither of us uses Google Book Search (GBS).

That's interesting, because both of us read quite a bit.  Neither of us, notably, is actively engaged in academic research, which has a distinctive pattern of search and discovery.  However, we both read a lot of serious non-fiction, and we do not find books, or investigate books, through GBS.  Generally, when I am looking for something to read, I am abstract- (i.e., book reviews) and metadata- (i.e., references, citations) bound.  I do not search for books to read, or purchase, by doing open-ended full text searches, even when I have specific topics in mind.

I think full text search is a wonderful thing, and the world is poorer without it.  My experience, and that of my friend, suggests that it is not that relevant for purchasing.  Rather, the essential characteristics that Amazon and Powell's offer (as does GBS) are sufficient: cited reviews by trusted third parties, and by readers; a browse-inside type of functionality to determine whether the author can write with clarity and engagement; and straight-up search on basic metadata such as author and title.

As I mentioned earlier, I have no desire to draw too many conclusions from this.  For research, navigating through full text search will become a normative expectation.  However, even there, the nature of the "book" as an object frustrates our naïve, idealistic visions of books talking to each other.  A moment's thought draws into critical light the relative worth of deep content interlinking.  

Books - even non-fiction - intrinsically present poorly structured information.  They are about as far as one can get from an RDF triple: meaning must be interpreted and mined.  As a searcher looking for information on a topic, full text search against books generally serves me poorly; it may be a very useful adjunct for deep inquiry, but for drive-by or casual queries, it is a frustration.
 
For example, if I were interested in a general topic such as Ronald Cartland's role in the anti-appeasement discussions in the House of Commons in 1939, I would benefit from finding books that discuss this.  However, if I were particularly interested in the date of Ronald Cartland's entrance into the House, or summary information on his relationship with the famous romance writer Barbara Cartland, then a search against the main Google index would be far superior.

There are a couple of forks from this.  The first is that the amount of effort to intertwine book content with other data, such as temporal, geographic, or biographical thesauri, might have relatively little real-world usage (oh, to see the logs from GBS ...).  Aside from the Curious and a few deep content divers, how useful is mixed-content integration for discovery against books?  Is integration perhaps more useful against other forms of content, such as the inherently more structured and reference-like material found in open web pages?  

Second, what does useful cross-text navigation actually look like?  Going back to my previous example on appeasement, if I were interested in the outcome of the House debate on Germany's invasion of Norway and Great Britain's failure to respond aggressively, how would that actually appear to me, as I navigated my way through a GBS product?  What, in other words, becomes an anchor for hyperlink expansion, and what are its targets?  If I assume that "appeasement" was a topic that I wanted to navigate, what UI would I expect?

What is difficult here is intentionality.  It is extraordinarily difficult to determine what a user's intentions are as they navigate and browse through a sea of text.  It is relatively easy to give them intellectual "snack food" - places cited in this book; a timeline; historical figures.  Those might drive clicks, and ultimately ad sales, but they might not actually help the user in their quest.  

We are dumb animals after all, most of the time.  We click on bright shiny objects, and are easily distracted.  Designing a product to best meet the diversity of users' intentions is very different from designing a product to maximize revenue.

There's one final question for which I have no answer.  Presumably, as Google in fact previously announced but has not advanced, it would be possible to commercialize GBS as a bookstore, with pay-per-view (PPV) or print-on-demand (POD) services; it might also serve as a source for licensed content and services.  Either or both might, from Google's perspective, be a desirable offering.

But what is the relative value compared to integration with the main search experience?  

This is not a trivial question.  Google has to make very hard choices about what to maintain as a search silo, and what to integrate into the main index, at what level, and through what metrics.  Is the user better served by collapsing book content into the main index?  What are the consequences for GBS functionality?  Recall that index-level integration is the choice that Microsoft made with Live Books when they vacated the book digitization market.

Google is under no compulsion to continue GBS as an independent entry point in the absence of market incentives that justify the myriad expenses associated with its development.  In the absence of sufficient directly driven revenue, GBS might actually dilute advertising-based income from the main index.  Because book "data" is so poorly structured, and browsing behavior does not always translate into tidy crystallizations of intent, advertising CPMs against GBS might be lower than when books are combined with the main index.  (I'm not saying they are, just that certain types of user behavior might not generate valuable advertising targeting profiles.)

I'm left with a lot of questions and no answers, but I think we need to consider carefully, as publishers, libraries, and search firms, what we are trying to achieve, and importantly, for whom.

Jun 20, 2008 | Categories: MassBooks, DigLibs, Publishing, Search | pbrantley

3 comments

Comment from: Edward Vielmetti [Visitor] Email · http://vielmetti.typepad.com
Google Book Search is best for pre-1922 materials; your example from the 1930s hits the GBS dead zone, where nothing is available full-text or even limited text, and the snippets are just frustrating. It doesn't get better until you reach popular works in print.

For pre-1922, GBS gives you some sense for what is available, but even with that it's nowhere near complete. Finding something there for me generally leads to a next step of figuring out which library has it, so that I can go to their real collection.

This is the best account I know of how GBS changes scholarship:

http://landscape.blogspot.com/2007/03/how-google-books-is-changing-academic.html

If your uses don't match that, your experience will be different!
06/20/08 @ 11:52
Comment from: Eoin Purcell [Visitor] Email · http://eoinpurcell.wordpress.com
Interesting thoughts Peter.

I agree with you about not using GBS as a search and find tool. In fact, I'm far more likely to use Amazon, LibraryThing or The Book Depository.

I've downloaded the text of a few books I have wanted to read, but mostly the text versions aren't great and you are better off with Gutenberg if you can find them there.

I'm intrigued by your thoughts on the interface that best suits the search.

Now to get thinking about what you have written!
Eoin

06/20/08 @ 15:14
Comment from: Adam Hodgkin [Visitor] Email
In the end Google Book Search may work best as index and as a search tool because it enables us to obtain access to many different reading and writing styles, and to search books in a variety of digital manifestations.

More on why we should not be expecting Google Book Search to be an ideal reading system.
http://exacteditions.blogspot.com/2008/06/is-google-good-for-writers.html
06/22/08 @ 21:19

This is the personal blog of Peter Brantley, and the opinions expressed here are his own and are not reflective of any of his employers in the continuum of history, or the University of California, which provides support for this blog.
