Searching for flamenco



A couple of weeks ago, my wife demonstrated to me the speed with which application fabric is spun from the fibers of new services. She is an avid amateur student of flamenco, a dance form usually associated with Spain, but actually possessing deep and extensive Romani (Gypsy) roots. Flamenco has many traditional expressions, and my wife often searches for recordings online, as well as texts describing its history and evolution.

Recently she was searching for an obscure out-of-print title. After about 20 minutes of keyed-up silence across the dining table, where we keep our laptops perched like terminals into the nether world, my wife leapt up and rushed to our printer to pick up a printout.

- “Success?” I inquired.
- “Yes,” she said. “I can't find it locally, but both Harvard and Yale have copies, and so does UC Riverside.”
- “Did you look for it in Google Books? -- they might have a copy from one of those libraries.”
- “Of course,” she replied, weakly tolerating my assumption that she wouldn't know how to use online tools, “that's how I found it, through the links with WorldCat Local. I've requested the UC Riverside copy via ILL from the Berkeley Public Library.”

The story is a useful vehicle to ruminate about the potential consequences of innovative commercial services such as Google Book Search, whose power quietly and compellingly transforms both our enjoyment of reading, and the more rarified art of research, at both professional and amateur levels.

The most central aspect is a mature insight -- search is everything. For libraries, the path to utility in a networked age is to make their content available for searching. Books, pretty pictures, video recordings of Mr. Smith and Dr. Who -- all of these must be enhanced to facilitate harvesting by search firms. Libraries should forthrightly engage in an unhesitant embrace of search engine optimization.

Google Books is cool, wonderful, and awesome. I am jealous of those working on the product, repetitively transforming it into an ever more powerful tool, whose charms seduce our attention. Google's ability to aggregate across data pools, reinforcing the collective intelligence of content, is a critical facility that to date is comparatively absent in alternative search engines. Microsoft has produced some incredibly compelling technologies, but it is not likely to successfully compete on mainstream search. Yahoo may well be out of the picture; Ask, with a very small sliver of the search market, is a wonderful technology demonstration.

My issue with Google has never been that they are Google, or that they act on their own behalf. More power to them; I think they follow their interests very well. It is rather that selfishness should be a more intrinsic aspect to libraries, on behalf of the greater community; there are real and profound consequences for its relative absence.

University libraries optimize for their institutions more than the community of libraries. Given the current political and organizational constraints of their positions, this might not be unexpected, but I believe it has unfortunate outcomes. Participating in Google Book Search is a complex cost/benefit calculation, and I can understand the decision of any library, considered on its own, to represent its own academic corporation foremost in these calculations. But if our hands speak often through the signatures on the contracts, which increasingly abound, where is our body? Who and what are we, if we cloak our decisions with the mantle of a presumed but never explicated common agenda?

Recently, Dr. James Cummings of the Oxford Text Archive at the University of Oxford summed up some of the quandaries of Google's development of compelling, yet commercial, services in the academic sphere. Writing in a moderated forum, the Humanist Discussion List, Dr. Cummings discussed the release of Google's Palimpsest, a hosting service for scientific datasets with rich visualization capabilities.

While making more data available online, especially to the information-poor, can only be seen as a good thing, announcements like this always leave me with a slightly uncertain sense of unease. ...

A place where one can dump, and then successively add to and annotate, research data certainly has its place and I will watch this development with interest. But this, and many other developments always seem to fall far short of the co-operative virtual research development platform that I've always wanted. What I want is a sourceforge-for-humanists, where we can undertake truly collaborative research but with the benefits of software development tools (such as a free subversion repository, web presence, bug tracking, forums, etc.), but alongside of that all the tools one might wish for a variety of humanities endeavours. (src: "21.494 Google hosting research data")


Libraries have helped to make Google Books happen, and that is undoubtedly, without any reservation or qualm on my part, something to take a measure of pride in. Where I am unsettled is in the matter of benefits to libraries, scholars, and the public as captured both in the intrinsic terms of the existing contracts (which offer a startling heterogeneity of terms and conditions), and more urgently, in a possible settlement to the lawsuits entangling Google with the AAP and the Authors Guild.

I wonder, in an analogue to Dr. Cummings's richer insights, where is the comprehensive open library that best meets our as yet uncertain needs for the generations that shall inherit the work of today? Where shall be our sourceforge for readers?

I highly suspect it will not be made in the cubicles of Google's lawyers.

In a potential settlement, for example, millions more books will be viewable than ever before -- generally those that are out of print, including many orphans, and potentially public domain works. That is terrific news. Most likely, access will be offered through a license of unknown cost metrics. Certainly, that is potentially acceptable -- yet under what compulsions and restrictions? Are those terms the best that could be achieved for not just the narrow interests of elite institutions, but the whole of those interests they claim to represent? Believe it they may, those whose hands may be bound to sign, but please respect the rest of us, our concerns and doubts.

My shimenawa blog offers one simple concern: what happens to the proceeds derived from the online use of books that are orphans, particularly those that are subsequently found to be public domain? Where does the revenue from those works go? Do libraries, which are contributing the books in the first place, get any revenue from them? Can they claim or direct any revenue on behalf of the commonweal, e.g., to hold in special funds to facilitate information literacy?

A few of my colleagues suggest that Google could offer (or impose) terms with each publisher individually, rather than with publishers as a collective. I disagree. The motivations for a collective agreement that binds publishers and Google are more likely to be decisive. Publishers recognize that a terminus to the lawsuits portends a re-writing of crucial aspects of the book business for a wide swath of currently unavailable content. Seared by their experience with Amazon, which has dictated onerous, sometimes loathed arrangements, publishers will fear not being able to obtain commensurate dealings in a decisive moment of change. Deeply competitive, but with a well-honed sense of their collective interests and a clubbiness born of an inbred labor market, publishers will, I imagine, pursue no path other than seeking the shelter of common terms.

If there is a settlement, the overriding loss is the enervation of potential competition. For we would have a settlement, almost certainly based on a collective (I suppose, a convened class action) judgment that would need to be granted by the court, as opposed to a potentially replicable set of proposals with individual publishers. It would be a special and unique arrangement that benefits Google not only by eliminating significant potential and unknown liability, but also by permitting it remuneration on access to books provided by private and public libraries for Google's scanning. It also privileges publishers, as noted over a year ago by Larry Lessig, in Jeffrey Toobin's New Yorker article, Google's Moon Shot: "The publishers will get more than the law entitles them to, because Google needs to get this case behind it. And the settlement will create a huge barrier for any new entrants in this field."

In a generic sense, whatever marketplace good derives from the concept of "competition" -- the compulsion to produce new features, to race into underserved markets wielding the lethal sword of innovation -- will be sorely diminished. In a settlement, only Google amongst all market actors would compile a significant portion of the public domain, combined with the out-of-print, as well as in-print, texts. The benefits of such a significantly concentrated ownership position multiply with the accumulation of content, generating uniquely enhanced advertising revenue. A fiscally-calculated expansion across Google's growing set of content-reliant product offerings (Books, Maps, Earth, Scholar, Gmail, Orkut, Custom Search, Knol, Palimpsest, News, and of course, the open web) proffers significant ramifications for the viability of heterogeneity in the advertising marketplace, and is deeply consequential, socially. Arguably such scale is necessary to render benefits; but it cries for our consideration of the environment in which we partake those benefits.

Considered solely through the mechanism of the Google Books product, securing access to the largest volume of digital books forces a pivot around access to the published literature that will be determining for a very large percentage of potential consumers. In Google's Moon Shot, Tim Wu, a well-known law professor at Columbia University, says, "[M]aybe Google won’t get book search right. But if they settle the case with the publishers and create huge barriers to newcomers in the market there won’t be any competition. That’s the greatest danger here."

Good for GOOG. Good for many users. As good as we deserve? Maybe not good enough, I would argue. Potential gains and losses should not be casually dismissed with accolades to Google for giving us something far better than what we have, when we do not question what we might have been able to achieve, or what might still be within our grasp.

Even within Google, we must be uncertain of shifting priorities and claims. With a settlement in hand, how primal is continued development of GBS as a goal? How rich the set of features, and what will be the priorities that shape their unveiling? Google is not a university; what it is now, and what will motivate its excitement in the future, will almost certainly take a very different focus from what is visible on nearer horizons.

There are also subtle, and difficult to measure, costs from a settlement. An agreement between Google, the AAP, and the Authors Guild removes from the U.S. court system an important piece of litigation that would stand to clarify some very important claims relating to the use of in-copyright material. Is it acceptable to scan, index, and provide searching across digital files that represent published book content? Is it fair use to show snippets to users, contextualizing their search results? The lawsuits, for good or ill, could potentially light a path through these difficult riddles, evaluating the different privileges of intellectual property holders and search firms. These are critical issues of our day.

Their answering might be severely delayed.

In the dim of these uncertainties, Google's own gains from a settlement are an ephemeral compromise, one that -- importantly -- might yet be challenged again, with a new and different suite of actors. Court settlements do not rewrite copyright, unlike legislation. In its turn, a rewriting of copyright could eliminate the bases for a settlement that was crafted, and deemed acceptable, under a very different prior copyright regime.

Some of our colleagues argue that a settlement, in this regard, is fine and good. Almost nihilistically, they argue that it is a distraction from our real aims: for libraries, universities, and not-for-profits to reclaim space in the future determination of the privileges granted by copyright. Perhaps. However, this diverse suite of actors has rarely demonstrated much effectiveness in organizing concerted efforts to establish fundamental new legislation without potent commercial allies to initiate the engagements. On balance it is difficult to determine whether one is better off with an active court case where we would be effectively allied with Google in many critical aspects, or a long slog of intellectual and political work where allies have fallen to the side, content with the arrangements they have privately negotiated.

Let us thus take pause in our proceedings. Let us hold hearings amongst all of our bodies -- search firms, publishers, libraries, and the public interest, considering the range of possible business models and their impacts, across which we might yet enter into whole new ages of intellectual discovery and imagination. A diversity of interests, after all, must argue most vehemently for a competition of offerings.

To help us build the finest sourceforge for readers.

Feb 23, 2008 | Categories: MassBooks, DLF, DigLibs | pbrantley

This is the personal blog of Peter Brantley, and the opinions expressed here are his own and are not reflective of any of his employers in the continuum of history, or the University of California, which provides support for this blog.
