Ingest Tools

HathiTrust makes the same long-term preservation commitment for every item in our digital collection. Our technical standards ensure consistency among materials digitized under different conditions and submitted from different sources, and support a common user experience in our online environment. To this end, we have defined baseline requirements for the digital content we accept for deposit in a number of areas, including:

  • Item identifiers (i.e., how each individual submitted item is identified and named)

  • Package layout (file names, directory structure, etc.)

  • Image technical characteristics (file format, resolution, color depth, etc.)

  • Image metadata (scanning time, scanning artist, etc.)

  • Optionally, page number and page tag metadata

We provide two tools to aid depositors in preparing content to these specifications:

  1. HathiTrust Single-Image Validator

This web-based service validates a single uploaded image and provides a report on compliance with HathiTrust image specifications.

  2. HathiTrust Submission Information Package (SIP) Validator

This command-line tool for Windows, macOS, or Linux enables content preparers to validate locally digitized content prior to submission to HathiTrust. The SIP validator provides a report on compliance with HathiTrust package specifications, including OCR text, object metadata, checksums, and package structure. It does not validate image files.
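
To illustrate the kind of checksum check described above, here is a minimal Python sketch that a content preparer might run locally before using the SIP Validator. It is not the validator itself and does not reproduce its logic. The manifest name (checksum.md5), its line format (an MD5 digest followed by a file name, as written by common md5sum tools), and the function names are assumptions made for illustration; consult the HathiTrust package specifications for the actual requirements.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_checksum_problems(package_dir, manifest_name="checksum.md5"):
    """Compare each manifest entry with the file on disk.

    Assumption: the manifest holds one "<md5 digest> <file name>" pair per line.
    Returns a list of human-readable problems; an empty list means every
    listed file is present and matches its recorded digest.
    """
    package = Path(package_dir)
    problems = []
    manifest_text = (package / manifest_name).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with "*"
        target = package / name
        if not target.is_file():
            problems.append(f"{name}: listed in manifest but missing")
        elif md5_of(target) != expected.lower():
            problems.append(f"{name}: checksum mismatch")
    return problems
```

Calling find_checksum_problems("path/to/package") on an unpacked package directory returns a list of problems to resolve before running the SIP Validator. The sketch checks only files named in the manifest; it does not detect files that are present in the package but missing from the manifest.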

Steps to Ingest

Depositors should review the information at Getting Content into HathiTrust for an introduction to our ingest requirements, including how much remediation may be needed prior to submission.

Please contact feedback@issues.hathitrust.org with any questions about using the tools or to submit sample packages or images for review by HathiTrust staff.

Updated February 20, 2020