How to Use HathiTrust Data Resources

The benefits of the HathiTrust collection go far beyond the digitized items in the library. Data from the collection can help you make collection decisions, integrate items locally, or help researchers pursue more robust inquiries.

Notes on Working with HathiTrust Data

While working with HathiTrust bibliographic metadata or digital content, it may be helpful to keep the following in mind.

The HathiTrust collection is not static. Works get added to the collection every day, and sometimes a digital item may be updated with a new version. Bibliographic records can be updated when contributors send us corrections. Copyright and access statuses may change as items undergo copyright review or we receive permissions agreements from copyright holders.

The HathiTrust collection comprises works from more than 60 different libraries located in the United States and around the world. Bibliographic records represent many different cataloging practices and may even be in different languages.

We work closely with our contributing libraries to try to correct errors in bibliographic records and digital content (including poor OCR). Users can notify us about errors using the “feedback” link in the header or footer of most pages. Because the originating library or vendor may need to make the change on their end, it may take a while for corrections to be made in HathiTrust.

This API can provide you with a limited number of brief or full bibliographic records for real-time queries.

Bibliographic API
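As a rough illustration of a real-time lookup against the Bibliographic API, the sketch below builds a request URL for one identifier and pulls volume identifiers out of a JSON response. The base URL, the brief/full path segment, the identifier type names, and the "items"/"htid" response keys are assumptions that are not stated on this page; check them against the API documentation before relying on them. The sample response is hand-written for illustration and is not real API output.

```python
import json
from urllib.parse import quote

# Assumed base URL and path layout for the Bibliographic API (not stated on
# this page); confirm against the current API documentation.
BIB_API_BASE = "https://catalog.hathitrust.org/api/volumes"


def bib_api_url(id_type: str, id_value: str, level: str = "brief") -> str:
    """Build a single-identifier request URL; level is 'brief' or 'full'."""
    if level not in ("brief", "full"):
        raise ValueError("level must be 'brief' or 'full'")
    # Percent-encode the identifier, since some identifiers contain '/' or ':'.
    return f"{BIB_API_BASE}/{level}/{id_type}/{quote(id_value, safe='')}.json"


def item_ids(response_text: str) -> list[str]:
    """Return the volume identifiers listed under the assumed 'items' key."""
    data = json.loads(response_text)
    return [item["htid"] for item in data.get("items", [])]


# Hand-written sample shaped like the assumed response; not real API output.
SAMPLE = '{"records": {}, "items": [{"htid": "example.0001"}, {"htid": "example.0002"}]}'

print(bib_api_url("oclc", "424023"))
print(item_ids(SAMPLE))
```

The example parses a local sample rather than calling the service, so it runs without network access. In practice the URL would be fetched with an HTTP client, and because the API serves a limited number of records, requests should be kept small.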

Use this option to retrieve metadata from full-view items in MARC21 or unqualified Dublin Core formats.

OAI Feed
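Harvesting from an OAI feed generally follows the OAI-PMH protocol: request a page of records with the ListRecords verb, then keep requesting with the returned resumption token until none is left. The sketch below shows that loop's two building blocks. The endpoint URL is a placeholder, and the metadata prefix names ("oai_dc" for unqualified Dublin Core, "marc21" for MARC21) are assumptions; take the real values from the OAI feed documentation. The sample page is hand-written.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Placeholder endpoint (hypothetical); substitute the base URL given in the
# OAI feed documentation.
OAI_BASE_URL = "https://example.org/oai"
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for the first or a follow-up page."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument: it replaces
        # metadataPrefix and any other selection arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{OAI_BASE_URL}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing resumptionToken element means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Hand-written sample page for illustration; not real feed output.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example:1</identifier></header></record>
    <record><header><identifier>oai:example:2</identifier></header></record>
    <resumptionToken>token-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = parse_page(SAMPLE_PAGE)
print(ids, list_records_url(resumption_token=token))
```

A full harvester would fetch each URL, call parse_page on the response, and stop when the token comes back as None.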

Tab-delimited files of metadata describing all works in the HathiTrust collection may be used for collection management, record retrieval, or building links to HathiTrust works.

hathifiles
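To show how these tab-delimited files might be used to build links to HathiTrust works, the sketch below reads rows and produces a link for each volume marked as viewable. The column names and order, the "allow" access value, and the handle-style link pattern are assumptions made for illustration; the authoritative field list is in the hathifiles documentation. The two sample rows are invented.

```python
import csv
import io

# Assumed names for the leading columns (illustrative only); this sketch also
# assumes the file has no header row.
ASSUMED_COLUMNS = ["htid", "access", "rights", "ht_bib_key"]


def iter_volumes(lines):
    """Yield one dict per row, keyed by the assumed leading column names."""
    # QUOTE_NONE treats quote characters in titles and notes as plain data.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # zip stops at the shorter sequence, so extra trailing columns are ignored.
        yield dict(zip(ASSUMED_COLUMNS, row))


def volume_link(htid):
    """Build a handle-style link to a volume (assumed URL pattern)."""
    return f"https://hdl.handle.net/2027/{htid}"


# Invented sample rows standing in for a downloaded file.
SAMPLE = io.StringIO(
    "example.0001\tallow\tpd\t000000001\n"
    "example.0002\tdeny\tic\t000000002\n"
)

links = [volume_link(v["htid"]) for v in iter_volumes(SAMPLE) if v["access"] == "allow"]
print(links)
```

For a real file, pass an open file object (for example one returned by gzip.open in text mode, if the download is compressed) in place of SAMPLE; rows are streamed, so the whole file never has to fit in memory.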

A tab-delimited file containing U.S. copyright renewal registration numbers in connection with a HathiTrust volume identifier is available for download.

Renewal ID File

For in-depth research and exploration of the collection, you can request a custom dataset or use a pre-built dataset from the HathiTrust Research Center.

Research Datasets

HathiTrust Data in Discovery Products

Publicly available HathiTrust data is incorporated into a number of vendor discovery products, including Summon, EBSCO Discovery Services, Ex Libris Primo, OCLC WorldCat Discovery, and Innovative Interfaces Inc. Encore. A number of knowledge bases also include HathiTrust data, including EBSCO’s Knowledge Base and OCLC’s WorldShare Collection Manager. Other vendor products may include HathiTrust data as well.

Quality and accuracy of HathiTrust data in vendor products may vary. We are happy to work with vendors to determine how best to characterize collections or sets and manage data over time. Contact our User Support team.

HathiTrust Metadata Sharing

Under HathiTrust Digital Library’s (HTDL) Metadata Sharing Policy, independent users, member institutions, and other third parties are free to harvest (for example, through our OAI feed or the HathiFiles), modify and/or otherwise make use of any metadata contained in HTDL unless restricted by contractual obligations residing with the parties that have contributed the metadata (“Depositing Institutions”) to HTDL. Furthermore, HTDL provides no warranties on the data made available through any sharing mechanisms. Use of the data is undertaken at the user’s own risk. Any contributions made by HTDL to the metadata in the repository have been placed into the public domain by HTDL via a CC0 Public Domain Dedication.

HathiTrust and OCLC records

OCLC and HathiTrust work together to synchronize WorldCat with the HathiTrust catalog nightly. HathiTrust records are added to WorldCat as e-resource records. The vast majority of records representing the HathiTrust collection are in WorldCat today.

Questions?

Contact our member-led user support team!
