Abstracting the ILS


At the last DLF Forum in Pasadena, there was great interest in developing a lightweight API or computational framework that would permit a discovery layer to be abstracted away from the ILS software that libraries typically use, which bundles discovery (via the "OPAC") with cataloguing, acquisitions, circulation, and other functions. 
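To make the idea a little more concrete, here is a minimal sketch in Python of what "abstracting the discovery layer" could mean in code. Every name in it (`ILSAdapter`, `Record`, `InMemoryILS`, `discover`) is hypothetical and invented for illustration; it is not an interface proposed by the task force or offered by any ILS vendor. The only point is that discovery code talks to a small contract, and each ILS supplies its own implementation behind it.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    """A minimal bibliographic record (hypothetical shape)."""
    record_id: str
    title: str
    available: bool


class ILSAdapter(ABC):
    """Hypothetical contract that a discovery layer would code against."""

    @abstractmethod
    def search(self, query: str) -> List[Record]:
        """Return records whose metadata matches the query."""

    @abstractmethod
    def availability(self, record_id: str) -> bool:
        """Report whether the item can currently be borrowed."""


class InMemoryILS(ILSAdapter):
    """Stand-in for a real ILS back end, used here only for illustration."""

    def __init__(self, records: List[Record]) -> None:
        self._records: Dict[str, Record] = {r.record_id: r for r in records}

    def search(self, query: str) -> List[Record]:
        needle = query.lower()
        return [r for r in self._records.values() if needle in r.title.lower()]

    def availability(self, record_id: str) -> bool:
        return self._records[record_id].available


def discover(ils: ILSAdapter, query: str) -> List[str]:
    """Discovery-layer code: depends on the adapter, not on any one ILS."""
    return [r.title for r in ils.search(query) if ils.availability(r.record_id)]
```

Under these assumptions, swapping one ILS for another means writing a new adapter class; the `discover` function, standing in for the discovery interface, would not need to change.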

About a month ago, I sent out a call for nominations, and with this post, I can announce the membership of the committee, after my reprise of the call, below.

I want to thank everybody who has been involved so far, including those who nominated others, those who were nominated, and of course everyone who has advised, commented, and contributed.  Rest assured there will be plenty of opportunities for additional contribution; we hope the deliberation will be as transparent as possible.  

 -------------------- 

Text of original announcement

The ILS Discovery Interface Task Force:  

* John Mark Ockerbloom, Penn (chair)

* Terry Reese, Oregon State
* Patricia Martin, UCOP
* Emily Lynema, NCSU
* Todd Grappone, USC
* Dave Kennedy, UMD
* David Bucknum, LoC
* Dianne McCutcheon, NLM

 

June 26, 2007  | Categories: DLF

A Glimpse of Neon


In the short span of time that I have been with DLF, I've spent a certain amount of time thinking about the future of libraries; the direction that digital library programs should take; and the kind of organization that DLF should become.

Part of my thinking is motivated by a programmatic concern: where should DLF put its energies? Right now, DLF has a single dominant area of activity, aside from the small standards initiatives and explorations we are funding, called Aquifer. Aquifer is an ambitious Mellon-funded 2-3 year project that hopes to establish a multi-library pool of information assets and metadata that can be utilized both by institutions for collection building, and directly by users. Its lessons and direct benefits will have a long-term impact on the accessibility of the riches held digitally by libraries.

It has also informed how best to develop organizational models of how libraries should cooperate and develop initiatives. Aquifer has demonstrated that - although the benefits can be great - getting libraries to work together is really quite hard. Libraries are institutionally-bound, and already face huge hurdles in confronting, accepting, and responding to the changes in their landscape. They inevitably find it difficult to repeatedly release sufficient resources to engage in inter-peer commitments, particularly when they suggest or mandate the establishment of independent organizational frameworks.

Perhaps the longer term solution to organizational engagement lies in a different type of collaboration.

Recently, through a personal/professional connection at UC Berkeley, I was able to make an entrée to a fascinating new cyberinfrastructure project called Neon. As Neon's website states,

The National Ecological Observatory Network (NEON) is a continental-scale research platform for discovering and understanding the impacts of climate change, land-use change, and invasive species on ecology. NEON will gather long-term data on ecological responses of the biosphere to changes in land use and climate, and on feedbacks with the geosphere, hydrosphere, and atmosphere. NEON is a national observatory, not a collection of regional observatories. It will consist of distributed sensor networks and experiments, linked by advanced cyberinfrastructure to record and archive ecological data for at least 30 years. Using standardized protocols and an open data policy, NEON will gather essential data for developing the scientific understanding and theory required to manage the nation’s ecological challenges.

I think this is very interesting, and I've been having some (very) early conversations with the CEO of the project, discussing some of the data organizing, accessing, and control challenges, and the ways in which DLF member institutions might possibly help. Strategically, this takes us far beyond the concerns we so often find ourselves enmeshed in today, so thoroughly infiltrated by Google and other commercial entrants in the information discovery space. Organizationally, this brings us new models for how we might see ourselves, and serve our institutions.

This might be one future for libraries. Libraries cannot transform themselves structurally through magic. But through an open and deep application of their immense expertise, libraries can beneficially impact significant and important projects, and through the act of changing others, they will change themselves.

The future for libraries is not merely in their working together, but also, perhaps primarily, in working with others. The future resides in the creation of inter-institutional partnerships that marry libraries' wisdom in the organization, presentation, and accessibility of information to projects seeking to generate, deliver, and manipulate data in the service of science, learning, and education. Through these hybrid partnerships, libraries will enter a new territory of ideas, enriching their own experience while they bring insight to the work of others. Working beyond ourselves, libraries will chart a future into new lands.

 

June 25, 2007  | Categories: DLF

Another country heard from


Well, more of me, anyway.

Just to let folks know that short blog posts, as opposed to more in-depth essays, are now likely to be placed at the O'Reilly blog, where I have been blessed with posting privileges.

How cool is that? w00t! :)

pb

 

June 22, 2007  | Categories: DLF

F8'ing Search


Like an entire seeming cohort of lemmings, I've become a Facebook convert. All us late-bloomin' lemmings got excited in part because Facebook very recently opened up its platform through APIs so that it is quite straightforward to write an application which can work within the F8 context. This means that I can track my dopplr friends within Facebook, for example, or keep track of what music they like. It's pretty awesome; it's pretty easy; it builds a very sticky platform.

Tonight over an exquisite dinner with a couple of friends in publishing, interspersed through our musings and speculations on content integration, discoverability, and site functionality, I began to wonder: could we make search work like a platform, like Facebook? If, in other words, we take it as a pragmatic, realistic assertion that Google Scholar, or Google generally, will not expand and liberate its APIs as much as external application service creators would like, making all of the Google apps fully and wholly mashable, then would Google apps be willing to turn themselves into platforms?

Imagine if Google Scholar took the Facebook approach: then maybe I could tell Scholar that I am an Elsevier Scirus search user, or a Scopus licensee, or a devout user of a future iteration of Zotero, and I want to integrate that functionality into Scholar. I could do research within Scholar, or Daddy Google for that matter, and then pipe the results into an applet of my choice for further analysis. Or alternatively, take some preliminary investigations from a specialized high end database aggregator and pipe them into Scholar for broader query. This functions to some extent like some of the "Web 3.0" search enrichment services that Michael Jensen has so beautifully captured in his article, The New Metrics of Scholarly Authority. There are many other kinds of applications that one could deduce for this metaphor of search.
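As a thought experiment only, the "search as platform" idea can be sketched in a few lines of Python. Nothing here reflects an actual Google, Facebook, Scopus, or Zotero API; `SearchPlatform`, `install`, and the two sample apps are hypothetical names made up for illustration. The sketch assumes a host search service that lets each user install third-party apps, and then pipes that user's results through them in order.

```python
from typing import Callable, Dict, List

# A hypothetical "search app": takes a list of result titles, returns a new list.
SearchApp = Callable[[List[str]], List[str]]


class SearchPlatform:
    """Hypothetical host search service that lets each user plug in apps."""

    def __init__(self, index: List[str]) -> None:
        self._index = index
        self._apps: Dict[str, List[SearchApp]] = {}

    def install(self, user: str, app: SearchApp) -> None:
        """The user, not the platform, chooses which apps to add."""
        self._apps.setdefault(user, []).append(app)

    def search(self, user: str, query: str) -> List[str]:
        """Run the host's own search, then pipe results through the user's apps."""
        needle = query.lower()
        results = [doc for doc in self._index if needle in doc.lower()]
        for app in self._apps.get(user, []):
            results = app(results)
        return results


def sort_by_title(results: List[str]) -> List[str]:
    """Sample third-party app: reorder results alphabetically."""
    return sorted(results)


def top_two(results: List[str]) -> List[str]:
    """Sample third-party app: keep only the first two results."""
    return results[:2]
```

A user who installs `sort_by_title` and then `top_two` sees a different result list from a user with no apps installed, while the host's own search stays untouched. That is the bargain described in this post: the platform supplies the base service, and each user decides how much integration to take on.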

As on Facebook, for example, within Google Book Search I could subscribe to my friends within my own academic niche, and maybe see what they have been browsing, if they give me permission. Or maybe HarperCollins has a widget that would permit me to embed page previews within the GBS experience with knobs and levers that I particularly value.

The virtue of the Facebook compromise is that as an individual user, I select my pain quotient, my desserts, my own integration travails. If something doesn't work, I know it is not the fault of Facebook; if something works beautifully, Facebook gets part of the credit for helping make it happen. That's a pretty attractive bargain.

It makes Google (or Microsoft Live!) more sticky, and enriches search for users. In this type of integration, Facebook has shown an interesting path forward.

 

June 18, 2007  | Categories: MassBooks

The Bookstores


What used to be the primary Barnes & Noble in downtown Berkeley.

 

Shuttered Berkeley B&N bookstore

 

June 15, 2007  | Categories: BookRights

Monetizing libraries


"Proverbs for Paranoids, 1: You may never get to touch the Master, but you can tickle his creatures."

- Thomas Pynchon, Gravity’s Rainbow, 1973


A colleague in Europe recently forwarded to me the Google agreement with the CIC libraries. Even though I had been told this new agreement had some very different language from that in prior contracts, it was still eye-opening reading.

Simply put, the CIC libraries are contributing in-copyright material to Google for scanning, but for the first time (known to me), they will not get a copy back. (Michigan and Wisconsin have previously negotiated arrangements with Google, and the terms of the CIC agreement hold only to the extent that it permits the two universities to participate in a CIC-wide project; the contract terms of Michigan and Wisconsin are not superseded.)

Let's go to the contract. Clause 4.7(b) states:

Escrow Deposit. As Google "successfully processes" the works contained in the Selected Content, Google will place the University Digital Copy of such Selected Content in escrow on a secure server maintained by Google at Google’s cost and expense.


(The University Digital Copy being a copy of the material digitized by Google.)

Clause 4.9(a) then stipulates that "Works in dispute" may be withheld from Escrow deposit, and 4.9(b) permits a CIC institution to assert that a work thought by Google to not be public domain, is actually in such a condition, and as long as the CIC University partner indemnifies Google and provides assurances that any claims can be addressed, the institution can receive a copy of it. (This is a high bar, so don't anyone wait in line for it to be invoked unless there is fair certainty; that said, I know the University of Michigan has encountered situations [podcast] where they feel Google treads extremely conservatively, and where a reading of the law is clear enough to grant the University assurance to act.)

It is in "4.10 University Digital Copy of In-Copyright Works" where the story starts getting interesting:

In General. As noted in Section 4.1 above, Google may, in ways consistent with applicable Copyright Law, select and Digitize In-Copyright Works contained in the Selected Content. Such works will be part of the University Digital Copy and, as such, the Digitized files will be maintained in escrow as set forth in Section 4.7 above and released to the Source CIC University as set forth in Section 4.11. Until such time as these In-Copyright Works are released, Google agrees to provide CIC Universities with searchable access to such In-Copyright Works as described in Section 4.6 above.


(4.6 provides for a hosted solution by Google for the CIC libraries.)

Then come the constraints:

4.11 Release of In-Copyright Works Held in Escrow. Subject to the terms of this Section 4 Google agrees to enable download capability from the escrow to the CIC Administrative Offices or the applicable Source CIC University for one copy of the digital file for any In-Copyright Work(s) held in escrow in the event that any of the following release conditions (each, a "Release Condition") occurs:

(a) the In-Copyright Work becomes in the public domain;

(b) a Party has obtained permission through contractual agreements with copyright holders that includes the right to make a copy of the In-Copyright Work and to provide it to the CIC or Source CIC University;

(c) well established case law exists that In-Copyright Works can be copied and held by the CIC Administrative Offices and/or the Source CIC University without infringing on the rights of a copyright holder;

(d) if at any time Google is in material breach of its obligations under Section 4.3(b) or 4.6(a) and Google does not remedy any such failure within ninety (90) days after its occurrence (or, in the event such failure is caused by technical problems or causes similar to those described in Section 12.5, within such longer period as Google, working diligently, reasonably requires to remedy such problems); or

(e) the CIC Administrative Offices or the Source CIC University and Google agree in writing that the release of a particular In-Copyright Work or Works is legally supported and appropriate under the terms of this Agreement.


In other words - pretty much, unless Google ceases business operations, or there is a legal ruling or agreement with publishers that expressly permits these institutions (excepting Michigan and Wisconsin which have contracts of precedence) to receive digitized copies of In-Copyright material, it will be held in escrow until such time as it becomes public domain.

That could be a long wait.

Why was this done this way? What, after all, are the chances that residents of these U.S. Midwestern States might actually receive a digitized copy of the works that their tax dollars made a contribution towards purchasing? (N.B.: Not all CIC members are public institutions, but the overwhelming majority of them are.)

The answer to this question may well depend on the outcome of the larger contest.

In an article early this year in The New Yorker, "Google's Moon Shot," Jeffrey Toobin discusses possible outcomes of the antagonism this project has generated between Google and publishers.

Paramount among them, in his mind, is a settlement:

Google's endeavor is encountering opposition. A federal court in New York is considering two challenges to the project, one brought by several writers and the Authors Guild, the other by a group of publishers, who are also, curiously, partners in Google Book Search. Both sets of plaintiffs claim that the library component of the project violates copyright law. Like most federal lawsuits, these cases appear likely to be settled before they go to trial, and the terms of any such deal will shape the future of digital books.


Toobin then goes on to delineate what such a deal might actually look like:

The terms of such a deal aren't hard to imagine. The Authors Guild is concerned that pirated copies of the books on Google's site could leak to the public, and so the organization would insist on security measures. [...] As for distribution of the proceeds from the site, Google might agree to share revenue with publishers, in the way that radio stations pay for the music they play; publishers could receive a fee based on a statistical analysis of how often their books are viewed. Google could pay in cash or in kind, with advertising.


That's an eye-opening observation, and from my perspective as the Director of the Digital Library Federation, one that has not received enough speculation.

Obviously, any settlement would not cover the In-Copyright materials that are already part of Google's Partners' Program, in which publishers submit their material directly to Google for digitization (if necessary) and inclusion in Google Book Search. And, I think by any likely definition, a settlement would not include public domain material, which is differentially encumbered in these agreements. Therefore, a settlement must quite manifestly concern itself with works that are believed to be In-Copyright but where no publisher has stepped forward with an explicit opt-out, or where there is suspicion that they might be In-Copyright, but it is not known for certain (i.e., they are "orphan" works).

If this is what a settlement might cover, publishers will race to establish their historical rights portfolios with a zeal that will be astounding to watch.

More importantly, as Toobin intimates, remuneration must be involved, and it is at least open to suspicion that libraries will have to license access to the material covered in a settlement. (Perhaps paranoid on my part, but I have been long concerned about the new liabilities associated with moving into a realm of digital monograph licensing). The publishers and authors are not suing over tiddlywinks; there is real money at stake for them, far into the future, and the legal bills alone are already breathtaking. I find it hard to wrap my head around how a voluntary collective licensing arrangement such as the one Toobin describes might work in practice, but if the opportunity arose, I suspect that Google is clever enough to come up with something.

(I should note that a settlement would have several beneficial secondary consequences: someone, for example, will have to create a registry to record the known rights status of orphan works, a goal that everybody who is sentient, reasonably sane, and not drugged out of their minds desires. A derivative concern is whether or not such a registry would be public, or more specifically, what parties would have access, through what means, and under what conditions.)

Let us briefly return to the CIC deal. If a settlement along the lines that Toobin suggests occurs, then it is seriously in question whether the CIC institutions (again, excepting Wisconsin and Michigan, which have prior digitization agreements) would be able to obtain access to their In-Copyright works until such time as they fall into the public domain; I find it hard (not impossible, but hard) to imagine why publishers, as a community, would permit the CIC to obtain such copies; the "library copy" is something that has deeply irritated them since the Google Book Search program started.

I think the CIC agreement is a significant enough departure from the prior public contracts that we must take notice of its suggestions that the relationship between Google and publishers is maturing, and that Google is more cautious of the distribution of In-Copyright material than they ever have been before.

That may well be the harbinger of something broader; something that libraries and the larger public might not find much cause to celebrate. Observing, "[A] settlement that serves the parties' interests does not necessarily benefit the public," Toobin invites Larry Lessig to comment:

"It's clearly in both sides' interest to settle," Lawrence Lessig, a professor at Stanford Law School, said. "Businesses in Internet time can't wait around for years for lawsuits to be resolved. [...] For the publishers, if Google gives them anything at all, it creates a practical precedent, if not a legal precedent, that no one has the right to scan this material without their consent. That's a win for them. The problem is that even though a settlement would be good for Google and good for the publishers, it would be bad for everyone else."


Being neither Google nor a publisher, I'm in the part of the global community encompassed within Lessig's "everyone else."

Lessig continues:

"If Google says to the publishers, 'We'll pay,' that means that everyone else who wants to get into this business will have to say, 'We'll pay,' " Lessig said. "The publishers will get more than the law entitles them to, because Google needs to get this case behind it. And the settlement will create a huge barrier for any new entrants in this field."


A settlement between Google and publishers would create a barrier to entry in part because the current litigation would not be resolved through court decision; any new entrant would be faced with the unresolved legal issues and required to re-enter the settlement process on their own terms. That, beyond the costs of mass digitization itself, is likely to deter almost any other actor in the market.

And that to me is potentially the saddest loss, should such an arrangement come to be realized. Because in real terms, across this vitally important collection of humanity’s literature and thought, of all the ways of thinking about books and working with ideas on the Web, we might be left with only one way.

 

- Peter Brantley, Digital Library Federation, 2007:06:13

June 13, 2007  | Categories: DigLibs, Libraries, BookRights

On scholarly communication and university presses


When I was at the SSP meeting in San Francisco this week, I ran into an old friend who is the director of a noted university press. We talked about several things in the business, but I got around to asking, "What's your rights picture? Do you have a pretty good handle on what you have?" His response was, "Yes, we really do. We have a great picture. From 1998, anyway."

I should note that when I shared this vignette with a friend at a very large trade publisher, the recognition was almost palpable through email, but yet with a fortitude-conveying, "We'll get there," response. (Indeed, various scenarios could drive the recapture of this historical rights data faster than others. But I digress.) I shall note this is not an uncommon scenario, and it leads into a bigger riff.

There's a lot of talk right now within universities about remaking scholarly communication, and with that, remaking the relationship between libraries and university presses (I've contributed to some of this, myself). But what's becoming obvious to me is that there is often a pretty serious disconnect between these two worlds; in fact, a potentially crippling one.

Certainly, if university presses could gain recognition within university administrations that they are handicapped in an open market and need economic subsidization as a core academic enterprise, they would be significantly healthier, and could afford more assertiveness in product development and differentiation. But this is the least of the misunderstandings.

As another press friend said, "Librarians think they can prepare content in xml, push a button, place it on a website, and declare themselves publishers." And while that is a simplification of what libraries are thinking, in that rests so much of our common problem.

Because, first, truly, that can be a lot of what there is to publishing. But it is a very different kind of publishing than publishers have been involved with. Not necessarily better or worse, but very, very different. As we enter a world of mixed and interactive media, push-button publishing may be more like what some publishing will be. Not all, and I think not ever, all. And even in the most panglossian visions of e-publishing, you are still on the hook for the most important things that publishing has always been about: nurturing a work, building community, creating buzz, starting conversations. And the more I think about these central aspects of publishing - the social aspects of publishing - I am not at all convinced that librarians, despite their technical whiz with text/xml, have got even a basic clue about what these things mean. Building great, nifty mixed media products doesn't build viral marketing; doesn't find readers; doesn't build an audience; and doesn't find revenue.

In short, I am coming to the conclusion that librarians are likely to be lousy publishers.

But the most important thing to me about this recent spate of conversations is that libraries so utterly and completely miss what it means to publish what publishers have been publishing for the past several hundred years - longer form articles and manuscripts. The publishing work flow is intense: it requires significant hand- and thought-work. Editors don't sit around at their desks waiting for pretty, tightly-formed, well-argued drafts to come floating by. There is a lot of work in finding, attracting, grooming talent; encouraging the actual writing; producing coherent drafts; editing; presentation; administration; rights; marketing; and distribution. Some of these things are made easier by Web 2.0 and social computing, but in most cases, the workload has only increased, at least in the short term. For example, as a publisher, I have to now build buzz on-line as well as through my sales reps. Not only am I required to send the author traveling around 50 cities and 220 bookstores, but I will likely need to provide him or her a blog as well. Convince them to write more than "Hello, I have a blog," and to engage their fans in MySpace. And a core piece of publishing -- making sure what goes out to the world is literate and persuasive -- is always going to be hard work.

We're just not that smart, humans; it is really difficult to communicate with each other. Not everything is going to be improved by being processed through a collaborative, social mill. The best things are always going to take somebody's care, and love.

What librarians are missing is that all the talk of reinventing scholarly communication does almost nothing to actually help publishers publish. It doesn't alleviate the suffering; it creates an added distraction. It doesn't suddenly reinvent the economics of publishing or equip publishers with brand new staffs perfectly attuned to networked information after years of practice with their Facebook profiles. Libraries are somewhat akin to medieval physicians administering tonics to eliminate fever while instructing their patients to avoid approaching bodies of water at the time of full moons and teaching them how to make curative herbal teas.

I think anyone who has heard me talk, or reads anything I have written, knows how excited I am about some of the possibilities for building new forms of communication - for building new community - that our maturing understanding of network technologies is bringing us. But we have to appreciate the work that we are doing now, as publishers, as a society, and to bring these values together in a way that respects the best of what has been, and what could be. If either of these sets of institutions - libraries and presses - is to participate in a solution, it will require serious, long-term, fundamental re-invention of their essence. There's pain there; it won't be avoided. And we're not there yet.

But we'll get there.

June 8, 2007  | Categories: Libraries, Publishing

SSP talk on inline integration


This afternoon I gave a talk at the Society for Scholarly Publishing conference, "Scholarly Communication 2.0", in San Francisco, on the future of search. Well, theoretically, that was what the talk was supposed to be about.

I had the pleasure of co-presenting with Anurag Acharya from Google (Scholar), and a Microsoft Live program manager, Heather Dystrup-Chiang. (They had much more on-target presentations than the one that I gave).

This talk was a little bit of a condensation and rehash of one that I had given at NLM; it was better for having been given once before, and for my having acquired more information and had time for a little more thinking in the meantime.

The talk is not available in audio or video, so I will note here one thing that I drew out in it. It relates to the active thread in our communities (publishing, library, and so forth) on how content will be mashable, remixable, etc.: that this, while important, should perhaps not be so much a focus as the act of reading itself, and how reading can be transformed into a more iterative, or social, or collaborative fabric. I am not suggesting that is the sole future of reading - much reading will always be solitary - but rather that, given our current set of technologies, we are seeing the enablement of a new form of interaction with material, one which is more social and, in a way, very familiar to any of us who have sat in front of a grandparent or uncle and heard them tell stories.

June 7, 2007  | Categories: Publishing

This is the personal blog of Peter Brantley, and the opinions expressed here are his own and are not reflective of any of his employers in the continuum of history, or the University of California, which provides support for this blog.
