Google provides partner institutions with digitized copies of works from various partner libraries; these digitized copies are referred to as surrogates. When Google receives a book from a library partner and recognizes that it has already scanned the same book from a different library partner, Google rejects the book. For example, Google may reject a volume from Wisconsin because it previously scanned the same volume from Indiana. Google will then make available to Wisconsin the digitized copy, or surrogate, from Indiana. HathiTrust has not yet begun ingesting surrogates into the repository because of the many complex issues associated with these materials. The situation will become more complicated as we get into settlement works that (a) include illustrative content and (b) have a publisher that is not, or may not be, the rights holder for that illustrative content. In those cases, Google will make an unredacted copy available to the source institution and a copy with images removed available to other institutions. At present (prior to the settlement), Google is making public domain surrogates available to partner libraries.
Paul Soderdahl, Director, Library Information Technology, University of Iowa, and HathiTrust Strategic Advisory Board member, prepared an overview outlining the various issues surrounding surrogates: Surrogate Problem Overview. The overview concisely describes the storage, display, and repository management issues that result from Google-designated duplicates.
Teams from the University of Michigan (UM) and the University of California (UC) have investigated the implications for HathiTrust of receiving surrogates and have prepared two white papers that provide additional information on the issues associated with surrogates.
Google Designated Duplicates: Implications for HathiTrust End User Display, Heather Christenson, California Digital Library, February 11, 2010.
Google surrogates are a complicated matter, and many issues need further investigation, including legal, ownership, storage, branding, display, preservation, and repository management issues. For example, do we store the surrogates in the source institution’s namespace? How do we inform the end user that the original comes from Stanford and is provided as a substitute to California, Wisconsin, and Michigan? What metadata do we need to store and preserve?
The HathiTrust Strategic Advisory Board charges a Google Surrogates Working Group to:
Heather Christenson, California Digital Library
Bernie Hurley (liaison to the SAB), University of California-Berkeley
Jon Rothman, University of Michigan
Paul Soderdahl (liaison to the SAB and consultant to the working group)
Members are charged through the end of June 2011.