Update on December 2014 Activities

January 16, 2015

Top News

Authors Guild Drops Lawsuit against HathiTrust Institutions

In June 2014 the Second Circuit Court of Appeals upheld lower court rulings in favor of HathiTrust in the Authors Guild case. At that time the Second Circuit found that the Authors Guild did not have standing to bring the suit, but remanded to the lower court the question of whether the foreign rights organizations in the suit had standing to challenge HathiTrust’s practice of making replacement copies available for works that were lost or damaged. The remaining plaintiffs have now resolved their dispute with the HathiTrust members named in the case in a court filing, and the case has been dismissed by the court. A full statement from HathiTrust can be found on our website.

Board of Governors News

Executive Committee Membership

At its October 2014 meeting, the Board of Governors approved new officers and members of the HathiTrust Executive Committee, effective January 1, 2015:

  • Chair, Board of Governors: Richard Clement, University of New Mexico
  • Chair-elect/treasurer: Lizabeth (Betsy) Wilson, University of Washington
  • Past Chair: Sarah Michalak, University of North Carolina, Chapel Hill
  • Chair, Program Steering Committee: Bob Wolven, Columbia University
  • Ex-officio: Mike Furlough, Executive Director, HathiTrust

Board Membership

The HathiTrust Board of Governors announces two changes in its membership.

Brenda Johnson left the board in December when she stepped down as Ruth Lilly Dean of Libraries at Indiana University to join the University of Chicago as Library Director and University Librarian on January 1. 

The Board now warmly welcomes Carolyn Walters, who has been appointed by Indiana as its representative on the Board of Governors. Walters was named the Ruth Lilly Dean of University Libraries on January 8, after serving in many roles over a long career at Indiana. Since 2005, she has served as executive associate dean of the IU Libraries. In 2012, Walters also became the founding executive director of the Office of Scholarly Publishing.

Patricia Steele, Dean of the University of Maryland Libraries, has resigned from the Board of Governors. Steele helped to found HathiTrust in 2008 when she led Indiana University Libraries. In 2012 she was elected by the membership to a four-year term on the newly configured Board of Governors, and led the process of drafting the current version of the HathiTrust bylaws. Steele writes that she has “so appreciated the opportunity to serve the HathiTrust since its inception. It has made tremendous strides and that is the result of wonderful leadership and talented volunteers. The experience has been a highlight of my career.” The Board of Governors will appoint a replacement for Steele as specified in the bylaws (Article V, Section 5: Board of Governors, Vacancies).

Approval of HathiTrust 2015 Budget and Fees

HathiTrust members voted to accept the proposed 2015 total budget and fees. Plans will proceed accordingly, and member invoices will be sent in mid-January.

13 Million Volumes

HathiTrust achieved a significant milestone, surpassing 13 million volumes in the digital repository. Nearly 5 million of the volumes are in the public domain and available either in the United States or worldwide.


Locally-digitized content

HathiTrust ingested the first batches of locally-digitized content from Yale University and Emory University, and additional batches of content from the University of Illinois and Texas A&M University. Virginia Tech, Cornell, Tufts University, Washington University, and the University of Missouri are at various stages of preparing to submit locally-digitized materials.

Internet Archive-digitized

HathiTrust ingested new materials from Duke University.

Bibliographic Data Management

The California Digital Library (CDL) loaded 184,466 new or updated bibliographic records into Zephir, including more than 90,000 updated records from Keio University.


Copyright Review

A summary of the determinations from HathiTrust copyright review activities in December is given below. See CRMS-US and CRMS-World for further information.

       December                                          Overall
       Public Domain Determinations  All Determinations  Public Domain Determinations  All Determinations
       40                            68                  167,617                       317,852
       2,642                         4,640               88,759                        168,371
Total  2,682                         4,708               256,376                       486,223

Government Documents Registry

Project staff have identified several stages involved in identifying relationships (including duplication) between government document titles: 1) normalization of information in the bibliographic record, 2) identification of exact and “unclear” matches, and 3) scoring and categorization of “unclear” matches. Staff have created initial methods for stages 1 and 2, and expect to be working on stage 3 throughout the spring. Work in December also focused on analysis of government documents publication information as part of efforts to identify more US government documents in HathiTrust that are not cataloged as such.
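
For readers curious how the three stages above might fit together, here is a minimal sketch in Python. It is an illustration only and not the project's actual method: the function names, the 0.8 threshold, and the use of the standard library's difflib similarity ratio as a score are all assumptions, and the sketch compares titles alone rather than full bibliographic records.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_title(title: str) -> str:
    """Stage 1 (assumed form): lowercase, strip accents and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^\w\s]", " ", unaccented.lower())
    return " ".join(no_punct.split())


def classify_pair(title_a: str, title_b: str, unclear_threshold: float = 0.8):
    """Stages 2 and 3 (assumed form): return a (label, score) pair for two titles.

    Titles that are identical after normalization are "exact". Otherwise a
    similarity score between 0 and 1 is computed; pairs at or above the
    hypothetical threshold are "unclear" and the rest are "distinct".
    """
    a, b = normalize_title(title_a), normalize_title(title_b)
    if a == b:
        return "exact", 1.0
    score = SequenceMatcher(None, a, b).ratio()
    label = "unclear" if score >= unclear_threshold else "distinct"
    return label, score
```

For example, classify_pair("Annual Report, 1950.", "annual report 1950") is labeled "exact", while titles that differ by a single plural would fall into the "unclear" group and carry a score that could be used for further categorization.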

HathiTrust Research Center

HTRC has a new version, 3.0, which is in a beta test phase through Friday, January 30. New features include the HTRC Data Capsule, an improved user experience, and single sign-on (except for the Data Capsule). Access 3.0 at https://htrc2.pti.indiana.edu/ and send your feedback to htrc-tech-help-l@list.indiana.edu.

The HTRC UnCamp 2015 is set for March 30 and 31 in Ann Arbor, Michigan, home of the HathiTrust. The UnCamp is a day and a half of hands-on coding and demonstration as well as informational presentations on use-cases and new developments. It will offer programs for both beginner and advanced users, including boot-camp activities with hands-on sessions using HTRC infrastructure and tools. Registration opens the first week of February, at which time the keynote presenter will be announced. We look forward to seeing you in Ann Arbor!

J. Stephen Downie delivered two talks at the University of Waterloo and the University of Western Ontario, in Ontario, Canada. The talks covered the HTRC’s ongoing work to unlock the HathiTrust collection, and were sponsored by UW’s School of Computer Science.

Researcher Ted Underwood, along with co-authors Hoyt Long and Richard So, published an article in the online magazine Slate detailing their use of the HathiTrust corpus to respond to economist Thomas Piketty’s claims that references to money decreased in 20th century fiction as compared to that of the 19th. The article is available in full on Slate. Underwood also released his own collection of page-level metadata for 854,476 English language volumes from 1700 to 1922. Part of the release is a Python utility that will align his page-level genre predictions with HTRC extracted-feature files, allowing for new research applications. Information on this release can be found at Underwood’s blog.

Development Updates

Development updates and activities by HathiTrust institutions included the following:

Access, Authorization, and Authentication

  • Modified the registration process for staff access to in-copyright materials to make registration smoother for staff involved in copyright review.

Full-text Search

  • Continued efforts to configure and test Solr 4. Staff worked with Solr committers to contribute documentation and example configuration files to the Solr community.
  • Completed initial work to take advantage of planned changes in the indexing of volume publication dates.
  • Continued to await production-quality software fixes from the storage vendor to address performance and stability problems with the high-performance storage. Staff are in regular communication with the vendor.

Storage Replacement Cycle

  • Received equipment for the annual storage purchase at both sites. The equipment was installed at the Michigan site and will be put into service in early January; the equipment will be installed at the Indiana site in late January and put into service shortly thereafter.

HathiTrust on the Road

HathiTrust administrative staff will be attending the following upcoming meetings. Please get in touch if you would like to meet with us there.

  • Mike Furlough, Valerie Glenn: ALA Midwinter 2015, Chicago, IL, January 29-February 2, 2015.

January Forecast

  • Reassess accessibility features of PageTurner with particular attention to supporting new content types.
  • Implement the use of coordinate OCR in the PDF generation process.
  • Conduct experiments to determine if Solr 4 reduces the memory footprint required for indexing. Re-index using Solr 4.

New Growth

As of January 1:

Institution December Overall
Boston College 0 3,263
Columbia University 2 73,395
Cornell University 4,418 510,065
Duke University 405 8,206
Emory University 52 52
Getty Research Institute 716 18,979
Harvard University 10 838,110
Indiana University 167 528,811
Keio University 0 90,094
Knowledge Unlatched 0 28
Library of Congress 0 108,892
McGill University 0 893
New York Public Library 10 294,835
North Carolina State University 0 3,196
Northwestern University 14 56,677
Ohio State University 6,830 61,129
Penn State University 1,139 387,717
Princeton University 1 252,808
Purdue University 0 47,488
Sterling & Francine Clark Art Institute 0 358
Texas A&M University 1,233 2,446
Universidad Complutense 6 117,235
University of Alberta 0 76,106
University of California 9,747 3,612,596
The University of Chicago 10 51,976
University of Connecticut 0 4,637
University of Delaware 10 48
University of Florida 0 9,866
University of Illinois 1,498 318,131
University of Massachusetts, Amherst 486 11,614
University of Michigan 3,871 4,712,752
University of Minnesota 6,110 144,717
University of North Carolina, Chapel Hill 0 17,025
University of Virginia 0 51,207
University of Wisconsin 103 560,775
Utah State 0 117
Yale University 154 23,832
Total 36,992 13,000,076

Public Domain (~37%)

Total* 25,289 4,869,281

* Includes volumes opened through copyright review and rights holder permissions

Summary of Issues Received by User Support

Issue Type December 2014 November 2014
Content 121 129
    109 118
    11 11
Cataloging 115 151
Access and Use 109 120
    43 55
    16 6
    1 1
    Print on Demand 0 0
    Inter-library loan 0 6
    Full-PDF or e-copy requests 14 14
    2 1
    Data Availability and APIs 0 0
    Reuse of content 0 1
Web applications 20 24
    Functionality problems 6 13
    Problems with login specifically 1 0
    General Questions about Login 2 2
    Partners setting up login 0 0
    Usability issues 0 0
    Feature requests 1 1
Partner Ingest 13 23
General 109 92
    8 7
    101 85
Total 487 539

Most Accessed Volumes

The Human Figure, by John H. Vanderpoel.
Ami and Amile: A Medieval Tale of Friendship, translated from the Old French by Samuel N. Rosenberg and Samuel Danon.
Quicksand, by Nella Larsen.
Roster of the Confederate soldiers of Georgia, 1861-1865, v.1.
Roster of the Confederate soldiers of Georgia, 1861-1865, v.2.
Godey's Magazine, v.40-41, 1850.
2835 Mayfair: A Novel, by Frank Richardson.
Pennsylvania German pioneers: A Publication of the Original Lists of Arrivals in the Port of Philadelphia from 1727 to 1808, by Ralph Beaver Strassburger.
A Hand-book of the City of Rock Hill, by William John Cherry.
The Five Laws of Library Science, by S. R. Ranganathan.



Cumulative 12-month availability of repository access*: 99.964% (+0.015%)

No outages were reported in December.


Due to the winter holidays, Zephir suspended bibliographic record loading from noon December 23 until Friday, January 2.

* Repository access refers to page viewing and full-text search functionality, i.e., user-facing applications. It does not refer to preservation or storage infrastructure, which is under continual operation.