
HathiTrust Celebrates 10 Years: 2008-2018

 

2008-2018: The Founding of HathiTrust

On October 13, 2018, HathiTrust celebrates its 10th anniversary. The organization’s pre-history began with Google in the early era of mass digitization and, since its founding in 2008, HathiTrust has grown into a robust and influential organization. What began as a risky venture into digital territory —  especially for academic and research libraries — is now a collaborative model for the future of supporting libraries and advancing scholarship. 

Today, HathiTrust comprises 140+ member institutions, preserves 16+ million items, and continues to advance its founding mission: to contribute to research, scholarship, and the common good by collaboratively collecting, organizing, preserving, communicating, and sharing the record of human knowledge.

The articles and press releases below summarize the early years leading up to the founding and the people and institutions who launched HathiTrust ten years ago. Take a deeper dive and go back to the beginning to walk through each year and milestone since HathiTrust's founding.

In the Beginning

HathiTrust. Major Library Partners Launch HathiTrust Shared Digital Repository: There’s an Elephant in the Library; Organizers Promise It Will Never Forget [Ann Arbor] 13 October 2008. Web.

Markoff, John and Wyatt, Edward. “Google is Adding Major Libraries to its Database.” New York Times [New York] 12 December 2004. Web.

NPR: StoryCorps Interview with University of Michigan President Emerita Mary Sue Coleman and former provost Paul Courant [Ann Arbor] April 2018. Radio. In this episode, U-M President Emerita Mary Sue Coleman and former U-M Provost Paul Courant discuss the role the University of Michigan played in the birth of the Google Books project, and how it led to the creation 

HathiTrust News and Publications: the complete, 10-year archive of newsletters, press releases, and other announcements.

The HathiTrust Collection: 10 Years Later

From 2 million to 16.7 million titles! View this extraordinary snapshot of the HathiTrust digitized collection as it has evolved and grown over 10 years.

 

Official Timeline

January 2008

  • First formal multi-institutional commitments made to building HathiTrust

March 2008

  • First instance of HathiTrust repository infrastructure in place in Ann Arbor, Michigan
  • Storage purchased for second instance of repository in Indianapolis
  • University of Michigan coordinates site visit by a team from DRAMBORA
    • Results of the DRAMBORA review were published as: Seamus Ross, Andrew McHugh, Perla Innocenti, Raivo Ruusalepp: Investigation of the potential application of the DRAMBORA toolkit in the context of digital libraries to support the assessment of the repository aspects of digital libraries, Glasgow: DELOS NoE, August 2008, ISBN: 2-912335-41-8

April 2008

  • Loading and testing of Google-digitized content from the University of Wisconsin begins
  • Preparations begin to establish second instance of repository in Indianapolis

May 2008

  • Testing of Lucene/Solr begins to provide full-text search across the repository
  • PageTurner application released with specialized accessible interface, allowing reading and full-text searching of individual volumes in the repository

June 2008

  • Lucene/Solr installed on development and production servers
  • Collection Builder application released

July 2008

August 2008

  • HathiTrust “about” website is released, including information about HathiTrust compliance with criteria for Trustworthy Digital Repositories (TRAC) and other documentation
  • Benchmarking for full-text search indexing begins

September 2008

  • Plans initiated to enable distributed development of applications and services by partner institutions
    • 3-prong strategy: to enable access to the PageTurner via an API, to create a development ‘sandbox’ for shared development, and to develop a public discovery interface for the repository

October 2008

  • HathiTrust formally launched, including the institutions of the CIC, the University of California system, and the University of Virginia
  • Storage installed at Indiana site and an additional 90 TB of storage is installed at both instances, bringing capacity at each site to 190TB
  • Public beta full-text search application released, allowing full-text search of 500,000 volumes

November 2008

  • Data synchronization between Michigan and Indiana sites is completed and routinized

December 2008

  • Agreement concluded with OCLC to create discovery interface for HathiTrust
  • Indiana site becomes fully operational mirror of storage at Michigan site

January 2009

  • Load testing for full-text search begins

February 2009

  • Work begins on temporary beta catalog interface for HathiTrust

March 2009

  • Redundancy (in Indiana) for Web hosting infrastructure and full-text search indexing is established
  • Sample datasets containing full-text OCR of repository volumes are made available to researchers
  • New storage purchased, bringing total capacity at each site to 320TB

April 2009

  • Temporary beta catalog released
  • Ingest of Google-digitized content from Indiana University and the University of California begins

May 2009

  • HathiTrust Research Center and Collaborative Development Environment working groups launched
    • The groups are charged to develop specifications for a HathiTrust Research Center and establish collaborative development environment for HathiTrust repository, respectively
  • Alpha version of Data API released
  • Michigan ingests legacy digital collections into the repository to pilot non-Google ingest

June 2009

  • California Digital Library begins work on improvements to PageTurner application
  • A record 379,000 volumes are ingested in June

July 2009

  • Working group formed to investigate need for 3rd instance of storage

August 2009

September 2009

  • University of Michigan Press opens access to backfile publications in HathiTrust
  • UM and CDL staff begin collaboration for ingest of Internet Archive-digitized materials
  • Michigan staff contribute common-grams code to Solr code base

October 2009

  • Ingest of content begins from Penn State
  • Ingest of content begins from UC Santa Cruz and UC San Diego
  • A record 553,963 volumes are ingested in October

November 2009

December 2009

  • Columbia University joins HathiTrust
  • Center for Research Libraries begins audit of HathiTrust for compliance with TRAC
  • HathiTrust Bibliographic API released
  • HathiTrust begins work to implement Shibboleth
  • Redundancy of search index established at Indiana site

January 2010

  • Executive Committee approves new pricing model for HathiTrust
  • Storage Working Group submits final report to Executive Committee

February 2010

  • Sample of IA-digitized volumes from UC ingested for testing
  • Ingest of Google-digitized volumes begins from the University of Minnesota
  • Full-text search index exceeds Solr/Lucene's limit of 2.1 billion unique terms

March 2010

  • UM staff receive samples of locally-digitized materials from several CIC institutions (Iowa, Illinois, Northwestern) to begin working on scalable mechanisms and processes for ingesting locally-digitized content
  • OCLC begins loading records for HathiTrust volumes into WorldCat

April 2010

  • Ingest begins of an initial set of nearly 100,000 IA-digitized volumes from the University of California

May 2010

  • New York Public Library joins HathiTrust
  • HathiTrust passes 6 million total volumes and 1 million volumes in the public domain
  • Executive Committee launches Communications Working Group

June 2010

  • HathiTrust enables authentication via Shibboleth
    • In the short run, this allows partners to download full PDFs of all public domain materials in the repository and use the Collections application through a local sign-on. Implementation of Shibboleth paves the way for future partner services, such as expanded access to in-copyright materials.
  • Full-text search index is mirrored at Indiana site

July 2010

August 2010

  • Princeton University Library joins HathiTrust
  • Ingest of Google- and Internet Archive-digitized volumes from Columbia University begins
  • HathiTrust adds 160 new TB of storage bringing total capacity at each site to 475 TB
  • October 31 deadline announced for joining HathiTrust to participate in "constitutional convention" of partners in 2011

September 2010

  • The Triangle Research Libraries Network and Dartmouth College join HathiTrust
  • Ingest of content begins from New York Public Library and the University of Illinois

October 2010

  • HathiTrust announces the 52 partners that will take part in 2011 Constitutional Convention
    • Newly announced partners include:
      • Baylor University
      • Emory University
      • Harvard University Library
      • Johns Hopkins University
      • Library of Congress
      • Massachusetts Institute of Technology
      • New York University
      • Stanford University Library
      • Texas A&M University
      • Universidad Complutense de Madrid
      • University of Maryland
      • University of Pennsylvania
      • University of Pittsburgh
      • University of Utah
      • University of Washington
      • Utah State University
  • Image ingest pilot begins
    • The University of Minnesota, Minnesota Historical Society, and Minnesota Digital Library begin working with staff at Michigan to develop a prototype workflow for depositing images and associated metadata into the HathiTrust system for access, storage, and preservation purposes. Read more about the project.
  • California Digital Library begins work on a new bibliographic data management system for HathiTrust
  • Discovery Interface Working Group charges Full-text Search sub-group
  • Ingest begins of content from Princeton University and the University of Chicago
  • Collaborative Development Environment is released, used actively for development, testing, and release of code for HathiTrust systems

November 2010

  • Ingest from Cornell University begins

December 2010

  • Policy and specifications framework for ingest of locally-digitized materials is finalized
  • HathiTrust begins working with CIC institutions on ingest of locally-digitized content

January 2011

  • OCLC releases WorldCat Local prototype catalog for HathiTrust
  • HathiTrust ingests nearly 60,000 images and associated metadata from the University of Minnesota and partners
  • HathiTrust adds support for rights holders to open access to works with Creative Commons licenses

February 2011

  • HathiTrust makes datasets of public domain materials available on a large scale

March 2011

  • HathiTrust certified by the Center for Research Libraries as a Trustworthy Digital Repository
  • Ingest from the Library of Congress begins
  • HathiTrust signs agreement with ProQuest to make the HathiTrust full-text index available via Serials Solutions' Summon service
  • Executive Committee launches User Support Working Group

April 2011

  • HathiTrust releases new viewing functionality in PageTurner application
  • Ingest from Harvard University begins
  • HathiTrust concludes first storage replacement cycle, replacing storage purchased in 2007
  • Planning begins for the HathiTrust Constitutional Convention

May 2011

  • HathiTrust begins investigation to identify orphan works in HathiTrust
  • Ingest of content from University of Virginia begins

June 2011

  • Boston University and Lafayette College join HathiTrust
  • UM announces plans to provide access to orphan works to partner institutions
  • The HathiTrust Research Center is launched, led by Indiana University and the University of Illinois
  • HathiTrust begins ingest of materials digitized by Yale University Library
  • "Perspectives on HathiTrust" blog is launched, with inaugural post on HathiTrust and Discovery by John Wilkin

July 2011

  • The University of Notre Dame and University of Florida join HathiTrust
  • 3-year review of HathiTrust is posted on the HathiTrust website and distributed to partners
    • The 3-year review was prepared by Ithaka S+R with oversight by the Strategic Advisory Board in advance of the Constitutional Convention to lay the groundwork for discussions about HathiTrust’s future. View the 3-year review and the Constitutional Convention information page.
  • HathiTrust posts the first set of orphan candidate works
  • HathiTrust releases improvements to the Collections application interface and full-text search
    • Improvements to full-text search include the 2 highest priorities from a full-text search features analysis prepared by the Full-text Search Working Group: the incorporation of bibliographic metadata into the full-text index to allow faceting of results by bibliographic data and improved search results ranking.
  • First version of partner print holdings database released
  • The HathiTrust Research Center receives a $600,000 grant from the Sloan Foundation to investigate “non-consumptive” research
    • The term “non-consumptive” was first used in the proposed Google Settlement to refer to computational research performed on in-copyright works in such a way that significant reading or "consumption" of the works does not occur.

August 2011

  • University of Connecticut joins HathiTrust
  • Cornell, Duke, Johns Hopkins, Emory University, and the University of California system announce participation in the Orphan Works Project
    • View information about the proposed terms of access to orphan works. See also the Orphan Works Project page on the University of Michigan Library website. Note: No orphan works are currently available in HathiTrust (as of January 6, 2012).
  • Proposal to establish print monographs archive distributed to partners
  • HathiTrust releases mobile interfaces for catalog and PageTurner applications
  • HathiTrust begins ingest of rare books and incunabula digitized by Universidad Complutense de Madrid
  • HathiTrust begins working with the University of Pittsburgh and University of Utah on ingest of locally-digitized materials
  • HathiTrust begins ingest of Utah State University Press backfile publications, to be made available in HathiTrust on an open access basis
  • HathiTrust begins ingest of Google-digitized volumes from Northwestern University and Purdue University, and Internet Archive-digitized volumes from North Carolina State University
  • HathiTrust concludes agreements with OCLC and EBSCO to make the HathiTrust full-text index available via their discovery services

September 2011

  • The University of Connecticut and University of Missouri join HathiTrust
  • HathiTrust, Google, and Duke University Press sign agreement to open access to DUP backfile volumes in HathiTrust under Creative Commons licenses
  • The Authors Guild and others file a lawsuit against HathiTrust alleging copyright infringement
  • HathiTrust begins working with the University of Florida and the University of North Carolina-Chapel Hill on ingest of locally-digitized materials
  • Partners submit final ballot proposals for the Constitutional Convention. 7 are submitted in all.

October 2011

  • The University of Miami and University of Arizona join HathiTrust
  • The Constitutional Convention takes place; 5 out of 7 ballot initiatives are passed
  • Ingest of Internet Archive-digitized content begins from Duke University and University of North Carolina-Chapel Hill
  • Authors Guild files suit against HathiTrust (University of Michigan, Indiana University, the University of California, the University of Wisconsin, and Cornell University)

November 2011

  • Boston College joins HathiTrust
  • The University of California begins offering reprints of UC-digitized public domain materials via HathiTrust
  • The User Experience Advisory Group releases HathiTrust User Personas

January 2012

  • HathiTrust reaches 10 million volumes

October 2012

  • Authors Guild suit dismissed in favor of HathiTrust in federal district court.  Judge Harold Baer writes in his opinion “I cannot imagine a definition of fair use that would not encompass the transformative uses made by” HathiTrust.

 

February 2014

  • HathiTrust reaches 11 million volumes

     

May 2014

  • Michael Furlough begins at HathiTrust as the Executive Director

June 2014

  • On appeal by the Authors Guild, a federal appeals court largely upholds the district court ruling but remands a portion of the case back to the district court.

October 2014

 

2015

January 2015

March 2015

  • Revisions to Bylaws are approved

November 2015

 

2016

February 2016

  • Copyright Review Management System (CRMS) transitions from grant funding to a HathiTrust project. Copyright review project manager, Kristina Eden, joins team

May 2016

  • New Director of Services and Operations, Sandra McIntyre, joins team

  • New Program Officer for Shared Print Services, Lizanne Payne, joins team

  • Copyright Review Management System wins ALA L. Ray Patterson award

June 2016

  • New Program Officer for Federal Documents Initiative, Heather Christenson, joins team

  • June 29, 2016: HathiTrust announces its collaboration with the National Federation of the Blind https://www.hathitrust.org/hathitrust_NFB_announcement

  • “Finding the Public Domain”, CRMS Toolkit published by Michigan Publishing

September 2016

  • September 15: first “all-sites” meeting. 20+ staff members from the Michigan, HTRC, and Zephir teams meet in Ann Arbor to start discussing how communication and collaboration can be improved

November 2016

  • November 10: Member Meeting at Big10 Center in Chicago



We continue to add to this timeline! A lot has happened in 2017 and 2018, too!