The Pragmatic View: Digital Libraries: Future Trend

There is still nothing like a paper book, especially a "hardback" book.

///
http://www.siliconvalley.com/mld/siliconvalley/12806011.htm

Posted on Mon, Oct. 03, 2005 

Google's libraries project facing writers' block

By Mike Langberg
Mercury News

You can't build the biggest electronic reference desk
on planet Earth without suffering a few virtual paper
cuts, as Google is learning with its ambitious program
to digitize the contents of several major university
libraries.

The benefit to society in having a free, fully
searchable collection of millions of books and
journals is huge -- so huge that we should all root
for Google to prevail.

But the Mountain View search giant also needs to work
harder at finding common ground with authors and book
publishers who aren't happy about being Googled. It's
the right thing to do. What's more, it might not be
possible to complete the program with authors and
publishers standing in the way.

The dispute came to a head on Sept. 20 when the
Authors Guild, a group representing writers, filed a
class-action lawsuit to stop or modify the Google
Print Library Project.

Google is already the world's search engine of choice
for finding anything and everything on the Web. But
the company has always aimed higher, declaring:
``Google's mission is to organize the world's
information and make it universally accessible and
useful.''

Much of that information is locked away inside books,
most of which can't be searched electronically.

In October, the company started Google Print
(http://print.google.com) to fill the gap.

The first part of the initiative, the Google Print
Publisher Program, isn't causing a fuss. The Publisher
Program allows anyone who owns the copyright to a book
-- an author or a publisher -- to let Google digitize
the entire text of the book. Publishers decide how
much of the book you'll see when a Google search
results in a hit; it could be just a sentence or two,
or several pages.

Almost all the major New York publishers, as well as
many small presses and independent authors, are
participating in the Publisher Program. The benefit
for them is obvious: Google users will discover
relevant books, and may decide to buy the books to
learn more.

The second part of the program, unveiled in December,
immediately antagonized authors and publishers.

The Library Project is an agreement with the
universities of Stanford, Harvard, Michigan and
Oxford, as well as the New York Public Library, to
scan their entire collections and convert the scanned
pages into searchable electronic text.

Stanford alone has 8.5 million books and journals on
its shelves, explaining why the Library Project will
run for six years or more. Google is already loading
stacks of Stanford books into trucks, taking them to a
scanning facility in Mountain View, returning them,
and picking up the next load.

Some of what's in these libraries isn't protected by
copyright, including government documents and any
books published before 1923. But most of the
collections are under copyright.

Google is relying on the legal doctrine of ``fair
use'' to scan books under copyright. Fair use allows
the public to comment on copyrighted works, as well as
quote from them in a limited way. It's how a reviewer
can cite several sentences or even several paragraphs
from a new book, without permission from the publisher
or author.

For books under copyright, Google says it will display
no more than two or three sentences surrounding the
highlighted search term.

``Fair use is not a one-size-fits-all thing,''
counters Paul Aiken, executive director of the New
York-based Authors Guild.

Google is getting value from the Library Project
because the search pages would ultimately display ads
along the side. Authors and publishers therefore
deserve compensation, Aiken argues.

I disagree, as do many legal experts. Copyright law
supports so-called ``transformative'' efforts, where
someone seeks to profit from fair use. A book review
in this newspaper, after all, contributes to our goal
of attracting readers who also look at our
advertising.

Google talked at length this summer with the
Association of American Publishers, which represents
about 300 book publishers in the United States, and
couldn't come to an agreement. But Google did
unilaterally offer to remove individual books from the
Library Project at the request of copyright holders.

This could give effective veto power to big
publishers. If they ship Google a list of every
copyrighted book in their files, the Library Project
would be forced to leave out hundreds of thousands or
even millions of titles -- enough to keep the project
from becoming a comprehensive tool for searching the
world's most important books.

Andrew Herkovic, a top administrator in Stanford's
library system, warns of an impending ``digital Dark
Ages'' in which information not available
electronically is lost to all but a handful of future
archaeologists willing to unearth dusty,
long-neglected book-shelves.

Talks between Google and the Association of American
Publishers are set to resume this month. Let's hope
both sides find a way for the Library Project to move
forward, unfettered, so Google really can make all the
world's information universally accessible.

Contact Mike Langberg at mike@langberg.com or (408)
920-5084. Past columns may be read at
www.langberg.com.

###
http://www.siliconvalley.com/mld/siliconvalley/12805965.htm

Posted on Mon, Oct. 03, 2005

Consortium to digitize classic books for Web

WITH PERMISSION, THEY'LL BE FREELY AVAILABLE

By Michael Bazeley
Mercury News

A consortium backed by Yahoo has launched an ambitious
effort to digitize classic books and technical papers
and make them freely available on the Web.

One of the Open Content Alliance's first projects will
be to digitize the approximately 18,000-title
collection of classic fiction and non-fiction American
books owned by the University of California, the group
said. That could be completed by the end of next year.

The consortium includes Adobe Systems, Hewlett-Packard
Labs, the National Archives of the U.K., O'Reilly
Media, the Prelinger Archives, the University of
California and the University of Toronto.

The announcement of the consortium comes amid furious
debate about a similar project called Google Library,
in which the Mountain View tech giant is scanning and
digitizing millions of books at select libraries.

Google's effort differs, though, because it intends to
digitize material regardless of its copyright status.
The members of the Open Content Alliance say they will
scan copyrighted material only if they have the
permission of the rights-holders.

Also, while Google allows people to view only excerpts
of copyrighted material, the aim of the alliance is to
offer complete texts for viewing and downloading.

``Our goal is to help with the expansion of human
knowledge,'' said Dave Mandelbrot, Yahoo's vice
president of search content. ``What we'd like to see
in two or three years is a major collaborative effort
where libraries are contributing material and
publishers are providing permission'' to digitize
their content.

Each of the consortium's partners is providing
different areas of expertise. The Internet Archive, a
San Francisco non-profit that collects copies of Web
pages and other material, is helping with the scanning
of materials. Yahoo will index the content, make it
searchable and pay for the scanning, about 10 cents a
page. And Adobe will help convert some of it into its
PDF format so that it can be downloaded from the Web.

The group is launching a Web site at
http://opencontentalliance.org, where people will be able to gain
access to the content. But the digitized materials
also will be available through a special page on
Yahoo's Web site and through the Internet Archive. In
fact, Mandelbrot said, the group's goal is to make the
content available so that any search engine can index
it and make it available.

Burning ambition

Digitizing the world's cultural archives, from
television shows to classic books, has long been a
burning ambition of Internet Archive founder Brewster
Kahle.

In fact, Kahle's Internet Archive has already launched
an effort to digitize books called the Million Book
Project, a collaboration with Indian and Chinese
agencies and Carnegie Mellon University.

``The real crime is that we have all these people
using the Internet for research, but we don't have
some of the best content on it,'' Kahle said.

Amazon.com, the Internet retailer, also operates a
massive book digitizing project. But its goal is to
make the material available to customers to help spur
sales.

``At some point, we want to meet in the middle so end
users win,'' Kahle said. ``So they can have access to
great works either for free or pay.''

In addition to working with libraries to scan older
content whose copyright is expired, the consortium
will collaborate with publishers and authors who want
to make their works available on the Web.

In some instances, the Open Content Alliance will give
copyright-holders the option of releasing their
material under a Creative Commons license, an
alternative licensing scheme that encourages re-use
and distribution of content.

The announcement of the consortium comes just days
after the 8,000-member Authors Guild sued Google in
federal court seeking to stop its Google Library
project. The group claims the project violates
copyright law because authors have not first given
permission for Google to digitize their works.

Not full texts

Google has defended its project by noting that it will
only offer short excerpts -- not full texts -- of
copyrighted material on its Web site. That type of use
falls under ``fair use'' laws, the company contends.
Also, Google is allowing authors to choose not to have
their work digitized. The Electronic Frontier
Foundation and others have defended Google in the
debate.

A less-controversial companion project called Google
Publisher allows publishers to tell Google which books
they want to be in the search engine's index.

Google did not respond to a request for comment about
the Open Content Alliance project.

The Association of American Publishers has been one of
the critics of the Google Library program. Although an
association spokeswoman said it had little information
about the alliance project, she said it sounded
``encouraging.''

``At the very least, they are approaching it from the
standpoint that it's the author who says what can be
done with his work,'' said Judith Platt, the
association's director of communications. ``There are
ways the use of copyrighted material can be approached
without violating the rights of the copyright
holder.''

Contact Michael Bazeley at
mbazeley@mercurynews.com or (408) 920-5642.
///

Monday, October 03, 2005