HathiTrust distributes information about items in the repository (and items themselves where possible) through a variety of mechanisms.
Datasets (full-text OCR) of public domain works in HathiTrust can be obtained; see http://www.hathitrust.org/datasets [1] for more information.
Bibliographic API
The Bib API [2] returns bibliographic, copyright, and volume information (including permanent URLs) when queried with a standard identifier (e.g., ISBN, LCCN, or OCLC number). The API has controls to return brief or full bibliographic metadata, and replaces the now-deprecated Rights API [3].
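As a rough illustration of an identifier-based lookup, the sketch below builds a request URL for a single identifier at either the brief or full level of detail. The host `catalog.hathitrust.org`, the `/api/volumes/<brief|full>/<id type>/<id>.json` path layout, and the helper name `bib_api_url` are assumptions that do not appear on this page; verify them against the Bib API [2] documentation before relying on them.

```python
from urllib.parse import quote

# Assumed base URL; confirm against the Bib API documentation.
BIB_API_BASE = "https://catalog.hathitrust.org/api/volumes"

# Identifier types named in the text above, lowercased for use in the path.
ID_TYPES = {"isbn", "lccn", "oclc"}


def bib_api_url(id_type: str, identifier: str, full: bool = False) -> str:
    """Build a (hypothetical) Bib API request URL for one identifier."""
    id_type = id_type.lower()
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type}")
    # "brief" vs. "full" mirrors the two levels of metadata the text describes.
    detail = "full" if full else "brief"
    return f"{BIB_API_BASE}/{detail}/{id_type}/{quote(identifier, safe='')}.json"


if __name__ == "__main__":
    print(bib_api_url("oclc", "424023"))
    print(bib_api_url("isbn", "0030110408", full=True))
```

The function only constructs the URL; fetching and parsing the response is left out because the response format is not described here.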
Data (page images, OCR text, and associated metadata)
HathiTrust has developed a Data API [4] that makes it possible to retrieve page images, OCR text, rights information, and a variety of other data about objects in the repository. A draft specification [4] for the API has been made available for comment from the HathiTrust partners. Please read the most recent Monthly Update [5] for current status information.
The University of Michigan provides an OAI feed of MARC21 and unqualified Dublin Core records for public domain materials in HathiTrust (see http://www.lib.umich.edu/michigan-digitization-project-oai-harvesting [6] for information about the Open Archives Initiative at the University of Michigan and the UM OAI toolkit for harvesting records).
The feed is available as MARC21 [7] and as unqualified Dublin Core [8]. In place of "set=hathitrust" at the end of those URLs, use "set=hathitrust:pdus" to access materials that are public domain in the United States only and "set=hathitrust:pd" to access materials that are in the public domain worldwide.
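The set substitution described above can be sketched as a small URL builder. The base URL, the `ListRecords` verb, the two metadata prefixes, and the three set names are taken from links [7] and [8] and the paragraph above; the function name `oai_list_records_url` is only illustrative.

```python
from urllib.parse import urlencode

# Base URL as given in links [7] and [8].
OAI_BASE = "http://quod.lib.umich.edu/cgi/o/oai/oai"

# Metadata formats and sets mentioned in this document.
PREFIXES = {"marc21", "oai_dc"}
SETS = {"hathitrust", "hathitrust:pdus", "hathitrust:pd"}


def oai_list_records_url(metadata_prefix: str = "marc21",
                         oai_set: str = "hathitrust") -> str:
    """Build a ListRecords URL for the given metadata format and set."""
    if metadata_prefix not in PREFIXES:
        raise ValueError(f"unknown metadataPrefix: {metadata_prefix}")
    if oai_set not in SETS:
        raise ValueError(f"unknown set: {oai_set}")
    params = [
        ("verb", "ListRecords"),
        ("metadataPrefix", metadata_prefix),
        ("set", oai_set),
    ]
    # Keep ":" unescaped so the set name reads as it does in the text.
    return f"{OAI_BASE}?{urlencode(params, safe=':')}"


if __name__ == "__main__":
    print(oai_list_records_url("marc21", "hathitrust:pdus"))
    print(oai_list_records_url("oai_dc", "hathitrust:pd"))
```

A harvester would then request the resulting URL and follow the protocol's paging mechanism; that part is omitted here.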
Metadata identifying the contents of the HathiTrust repository [9] are available for download as tab-delimited files. These files include a small number of bibliographic elements to help an institution decide which records it wants to retrieve. In other words, they are a tool for obtaining records and for adding links to existing records in local systems. Full documentation on these metadata is available under HathiTrust Metadata [10]. Using the metadata described above, an institution may acquire records through one of the following methods:
The Bib API [2] can be used in conjunction with these metadata for purposes such as including records in a local catalog, though it is meant for use on a small scale (see the Bib API [2] page for details). It provides rights and permanent URL information about each volume, in addition to bibliographic information.
OCLC and HathiTrust work together to synchronize WorldCat with the HathiTrust catalog nightly. The vast majority of records representing the HathiTrust collection are in WorldCat today, with links to the HathiTrust content.
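Before using either method, the tab-delimited metadata files can be filtered to the volumes of interest. The following is a minimal sketch under stated assumptions: the column positions (`id_col`, `rights_col`), the rights value `"pd"`, and the sample rows are hypothetical and chosen only for illustration; the actual field order and values are documented under HathiTrust Metadata [10].

```python
import csv
import io
from typing import Iterable, Iterator


def select_ids(lines: Iterable[str], id_col: int, rights_col: int,
               wanted: str) -> Iterator[str]:
    """Yield the identifier of each row whose rights field equals `wanted`.

    Column positions are supplied by the caller because the real layout
    is not described on this page.
    """
    # QUOTE_NONE treats quote characters inside fields as ordinary text.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip rows too short to contain both columns.
        if len(row) <= max(id_col, rights_col):
            continue
        if row[rights_col] == wanted:
            yield row[id_col]


if __name__ == "__main__":
    # Hypothetical sample rows: identifier, rights code, title.
    sample = io.StringIO(
        "vol.001\tpd\tExample title A\n"
        "vol.002\tic\tExample title B\n"
    )
    print(list(select_ids(sample, id_col=0, rights_col=1, wanted="pd")))
```

For a downloaded file, the same function can be passed an open file handle (for example `open(path, encoding="utf-8", newline="")`) in place of the in-memory sample.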
Links:
[1] http://www.hathitrust.org/datasets
[2] http://www.hathitrust.org/bib_api
[3] http://www.hathitrust.org/rights_api
[4] http://www.hathitrust.org/data_api
[5] http://www.hathitrust.org/updates
[6] http://www.lib.umich.edu/michigan-digitization-project-oai-harvesting
[7] http://quod.lib.umich.edu/cgi/o/oai/oai?verb=ListRecords&metadataPrefix=marc21&set=hathitrust
[8] http://quod.lib.umich.edu/cgi/o/oai/oai?verb=ListRecords&metadataPrefix=oai_dc&set=hathitrust
[9] http://www.hathitrust.org/hathifiles
[10] http://www.hathitrust.org/hathitrust_metadata
[11] http://www.oclc.org/connexion/