Getting Content Into HathiTrust

Purpose

Members may deposit digitized materials with HathiTrust for long-term preservation and access. These materials are stored in our repository and made available for search, display, and computational research, as well as for other uses permitted by U.S. copyright law. We encourage all members to deposit material.

HathiTrust supports ingest of digitized books and book-like materials.  These include manuscripts, pamphlets, and both bound and unbound serials.  We cannot accept:

  • unbound maps or other large-format items
  • audio
  • video
  • items digitized from microform, film, or fiche
  • born-digital materials, including PDFs and eBooks

Members wishing to deposit materials with HathiTrust may:

  • work with one of our mass-digitization partners (currently Google and the Internet Archive), or
  • work with us directly to establish a deposit workflow for locally digitized materials

Note that HathiTrust does not offer digitization services. We can, however, provide guidelines and other support to members undertaking digitization projects.

Ingest Overview

HathiTrust requires the following artifacts for each deposited item:

  • Content: a zip archive (Submission Information Package, or SIP) containing page scans, OCR text, object metadata, and fixity information for each digitized item
  • Bibliographic metadata: a MARC-compliant record describing the print version of the digitized item
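
As a rough illustration of what a content package holds, the following Python sketch bundles a directory of page files into a zip archive together with an MD5 fixity manifest. The function name build_sip, the manifest name checksum.md5, and the flat directory layout are assumptions made for this example, not our specification; the actual file naming and metadata rules are set out in the Submission Package Requirements.

```python
from __future__ import annotations

import hashlib
import zipfile
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_sip(source_dir: Path, zip_path: Path,
              manifest_name: str = "checksum.md5") -> list[str]:
    """Zip every file in source_dir plus a fixity manifest.

    Hypothetical layout: a flat directory holding page scans, OCR text,
    and object metadata. The manifest name is an assumption.
    Returns the names stored in the archive.
    """
    files = sorted(
        p for p in source_dir.iterdir()
        if p.is_file() and p.name != manifest_name
    )
    # One "digest  filename" line per file, in the style of md5sum output.
    manifest = "".join(f"{md5_of(p)}  {p.name}\n" for p in files)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for p in files:
            archive.write(p, arcname=p.name)
        archive.writestr(manifest_name, manifest)
    return [p.name for p in files] + [manifest_name]
```

Calling build_sip(Path("item_dir"), Path("item.zip")) would write item.zip next to the source directory; writing the archive outside the source directory keeps it from being picked up on a later run.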

There are separate processes for content and bibliographic metadata submission.  For content submission, please email feedback@issues.hathitrust.org for information.  For bibliographic metadata submission, see https://www.hathitrust.org/bib_data_submission.

In addition, we require the following two forms prior to deposit:

  • Digital Asset Submission Inventory (DASI): This form describes the digital objects being submitted for deposit and is used to establish the correct content stream (ingest processing workflow) for each submission. Note that this form requires a signature from a University Librarian or Dean of Libraries. Please contact feedback@issues.hathitrust.org with any questions; we will gladly review drafts of this form before they are signed.
  • Administrative Coversheet: The information in this form is used to establish the correct configuration for bibliographic metadata loading and processing.

Getting Started

It is usually simpler to create content that meets our specifications from the beginning than it is to remediate existing scans.  We are here to help, and will gladly answer questions and review sample files. Contact us at feedback@issues.hathitrust.org to get started.

New Digitization

If your institution is preparing to undertake a new digitization project and wishes to comply with our requirements, please consult the Technical Requirements for Digitized Page Images Submitted to HathiTrust.  These guidelines provide specific technical and quality requirements for creating digital objects, including:

  • image format, resolution, and color
  • image metadata
  • image filenaming

For information on checksums, OCR, object metadata, and the creation of Submission Information Packages (including package filenaming conventions), see our Submission Package Requirements for Digitized Content Submitted to HathiTrust.

Existing Digitized Content

HathiTrust has developed workflows to accommodate the ingest of content from Google and the Internet Archive efficiently and at scale.

HathiTrust also supports ingest of content digitized by institutions either in-house or as part of vended projects. In such cases, institutions must undertake any necessary content or metadata transformations prior to submission.

To determine whether existing content meets HathiTrust requirements for deposit, please refer to our Technical Requirements for Digitized Page Images Submitted to HathiTrust. In addition, the Ingest Tools page provides a set of validation resources, with documentation, which should be used prior to submission.
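
Independently of those validation resources, a depositor can run a basic local fixity check before sending a package. The Python sketch below recomputes MD5 digests for the files in a zip archive and compares them with a manifest stored inside it. The function name verify_package, the manifest name checksum.md5, and its "digest  filename" line format are assumptions for illustration; this is not one of the Ingest Tools and does not replace them.

```python
from __future__ import annotations

import hashlib
import zipfile


def verify_package(zip_path: str,
                   manifest_name: str = "checksum.md5") -> list[str]:
    """Return a list of problems found; an empty list means the check passed.

    Assumes (hypothetically) a manifest of "digest  filename" lines
    stored inside the archive under manifest_name.
    """
    problems: list[str] = []
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
        if manifest_name not in names:
            return [f"missing manifest: {manifest_name}"]
        listed = set()
        text = archive.read(manifest_name).decode("utf-8")
        for line in text.splitlines():
            if not line.strip():
                continue
            parts = line.split(None, 1)
            if len(parts) != 2:
                problems.append(f"malformed manifest line: {line!r}")
                continue
            expected, filename = parts[0].lower(), parts[1].strip()
            listed.add(filename)
            if filename not in names:
                problems.append(f"listed but absent: {filename}")
            elif hashlib.md5(archive.read(filename)).hexdigest() != expected:
                problems.append(f"checksum mismatch: {filename}")
        # Files present in the archive but not covered by the manifest.
        for extra in sorted(names - listed - {manifest_name}):
            problems.append(f"not in manifest: {extra}")
    return problems
```

A check of this kind only confirms that the files were not altered after the manifest was written; it says nothing about image format, resolution, or metadata, which the technical requirements cover.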

Bibliographic Metadata

For our bibliographic metadata specification, see https://www.hathitrust.org/bib_specifications.

How to Proceed

If you have any questions, or are ready to proceed with deposit, please contact feedback@issues.hathitrust.org.

Updated February 20, 2020