This last trip to New York for the Tools of Change conference, I spoke with a small publisher who had been thinking off and on about the Google Book Search settlement proposal. She made an interesting observation about the strategy for monetization of the content: it was through advertising, individual purchase, and institutional licensing. In other words, it was not solely through advertising, which is how Google obtains the overwhelming majority of its revenue.
That made me pause, because I had never really thought about that option: I had always assumed that digital books would move toward a licensing arrangement, in part because I had presumed that individual publishers would be the ones doing the licensing. (Fail!)
My friend, who is being squeezed betwixt Google and Amazon, was right to observe instead that Google could easily become the dominant distributor for online literature through licensing. This is a novel burden Google is assuming, with which it has no prior experience, and one that inserts itself between rightsholders and consumers (never a pretty place).
Although I meekly admit to not having thought this through deeply, it seems to me that the reliance on licensing (and individual sales) in addition to advertising might mean one of the following things:
1) The rightsholders involved in the crafting of the settlement language in concert with Google mandated a licensing scheme because it was what they were most familiar with, and because it seemed to guarantee them a revenue floor.
Or:
2) Google has analyzed the traffic through its Google Book Search site. That traffic is rather phenomenal, as Dan Clancy of Google revealed at the last DLF Forum, and as Jon Orwant of Google discussed in more detail at ToC. Basically, every book in GBS gets viewed, and viewed a lot. Even so, Google may have determined that GBS traffic does not generate enough advertising revenue via click-throughs to recoup the $200 million plus that Google will expend through the settlement, cover its scanning and ingest costs, and still provide enough income for rightsholders to make it worth anyone’s while (particularly in the near term, when people want to see returns relatively quickly).
And finally, of course, there is:
3) Google makes more money by using as many revenue streams as possible. This is appealing in its simplicity; however, the problem is that the organizational and technical infrastructure to support institutional sales, even through a third party (permitted in the settlement), is a significant new cost for Google. I have heard that even individual sales are difficult to enable for Partner Program (frontlist) books, and have required an ongoing rewrite of the Google Checkout payments system. In other words, I suspect Google would not take up new forms of revenue-generating distribution unless it absolutely had to do so.
I have no idea which of these postulates is correct, if elements of all three are true, or if all are wrong. But take (2) for a second, because it is the most fascinating: it means that contextual advertising against book content is really difficult because the content comes in much bigger chunks, without significant link networks to provide external relevance valuation, and often with notably less internally provided context than web pages. In other words, the diversity of web pages and the heterogeneity of their citation graphs produce more robust evaluations for ranking. It is hard to rank large items that demonstrate significant internal consistency.
In logical succession, this suggests the difficulty of integrating book content into the main index for discovery. This challenge was discussed briefly by Jon Orwant at ToC 2009: the GBS team wants to “earn” inclusion in main index search results by demonstrating relevance to main search engine queries. As he pointed out, some searches merit GBS inclusion relatively easily, such as “Tennyson poetry”; for others, such as “irrigation acequias”, it might be far more difficult for GBS to deliver results scoring high enough to justify inclusion (these are my examples; Jon used others that were better).
One of the concerns that some librarians have had with Google and GBS is the opacity of ranking; there are very pointed critiques on how Books ranking takes place within GBS. When one attempts to essentially combine different algorithmic regimes (e.g., those for web, books, and Google Scholar) to produce integrated results, the number of potentially fatal permutations, and the risk of local (vs global) optimizations must be extremely significant. Google might argue this is why they do not expose their calculations; certainly they must be changing them constantly. But one might argue that this is precisely where Google should expose their work.
It would be interesting to think about a Netflix-like challenge for integrating Google Book Search content with main web results. Netflix did not expose its own algorithms in its solicitation of that work, but in that case the competition clearly expanded our public knowledge of what approaches might work best in the computational challenge of recommending against a diverse but homologous media repository.
The other sign of maturity that I would like to see emerge from Google is a willingness to expose a few knobs and levers for an Advanced Search option in GBS that would allow users control over boosts or weight factor-inclusions when there is a desire to conduct more sophisticated searching against GBS content. Currently, GBS "Advanced Search" only permits users to specify core metadata fields such as author, title, etc.
Both of these would be new signs of openness for GBS, and would be warmly welcomed by many communities.
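To make the "knobs and levers" idea concrete, here is a minimal sketch of what user-controlled boosts could look like. Everything in it is an assumption for illustration: the `Book` record, the `field_score` and `search` functions, and the crude term-overlap scoring are hypothetical and say nothing about how GBS actually ranks results.

```python
from dataclasses import dataclass


@dataclass
class Book:
    # Hypothetical record; real GBS metadata is far richer.
    title: str
    author: str
    full_text: str


def field_score(query_terms, text):
    """Fraction of query terms found in a field (a crude stand-in for a real relevance score)."""
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(1 for term in query_terms if term in words) / len(query_terms)


def score(book, query, boosts):
    """Weighted sum of per-field scores; `boosts` maps field name to a user-chosen weight."""
    terms = query.lower().split()
    return sum(
        weight * field_score(terms, getattr(book, field))
        for field, weight in boosts.items()
    )


def search(books, query, boosts):
    """Return books ordered from highest to lowest boosted score."""
    return sorted(books, key=lambda b: score(b, query, boosts), reverse=True)


books = [
    Book("Poems", "Alfred Tennyson", "the lady of shalott"),
    Book("Victorian Verse", "Various", "tennyson wrote poetry about arthur"),
]

# The same query, ranked under two different user-chosen weightings.
by_author = search(books, "tennyson poetry", {"title": 1, "author": 5, "full_text": 1})
by_text = search(books, "tennyson poetry", {"title": 1, "author": 1, "full_text": 3})
```

With the author field weighted heavily, the Tennyson-authored volume comes first; with full text weighted heavily, the anthology that merely discusses him comes first. The point is only that a handful of exposed weights lets a searcher see, and steer, why one book outranks another.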