The "Hathifiles" are tab-delimited text files that describe every item in the HathiTrust collection. They include information derived from the bibliographic record (e.g., title, publisher, language, and commonly used identifiers), rights and access codes, and information about the source of the item.
The “Hathifiles Description” page describes the fields included in the hathifiles and suggests potential use cases.
Files provided below
A monthly file is uploaded on the first of every month with a row for every item that is in the HathiTrust collection at the moment the file is created. The filename begins with “hathi_full_”. These files tend to be large and may be difficult to open with standard spreadsheet software or text editors. You may need to work with the files programmatically (e.g., using Python to extract desired data).
An update file is uploaded every day and contains a row for every item that has changed in the previous 24 hours. The filename begins with “hathi_upd_”. Items are included in the update files if any of the following has occurred: the item was newly deposited into the collection, a new copy of the digital item overrode the previous copy, the rights and access status has changed, or a new bibliographic record was provided by the contributor.
A “header” file is also included below. This file contains one row of labels for the data elements included in the hathifiles. It can be combined with the regular hathifiles for ease of working with the data. This header file is only updated when a new data element is added to the hathifiles.
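As a rough illustration of working with these files programmatically, the sketch below reads the labels from the header file and streams a gzipped hathifile row by row, so the full file never has to fit in memory. It assumes the files are UTF-8 encoded and that fields are split purely on tabs with no quoting; the function names `read_field_names` and `iter_hathifile` are hypothetical helpers, not part of any HathiTrust tooling.

```python
import gzip
from typing import Dict, Iterator, List


def read_field_names(header_path: str) -> List[str]:
    """Return the column labels from the one-row, tab-delimited header file."""
    # Assumption: the header file is plain UTF-8 text.
    with open(header_path, encoding="utf-8") as fh:
        return fh.readline().rstrip("\n").split("\t")


def iter_hathifile(data_path: str, field_names: List[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per row of a gzipped hathifile, reading it as a stream."""
    # Assumption: UTF-8 text, one item per line, tabs as the only delimiter.
    with gzip.open(data_path, mode="rt", encoding="utf-8") as fh:
        for line in fh:
            # Strip only the newline so that empty trailing fields are kept.
            values = line.rstrip("\n").split("\t")
            yield dict(zip(field_names, values))
```

With the files from the table below downloaded locally, a loop such as `for row in iter_hathifile("hathi_full_20220501.txt.gz", read_field_names("hathi_field_list.txt")): ...` would expose each item as a dictionary keyed by the header labels.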
Display name | Created | Size | Modified | MIME type |
---|---|---|---|---|
hathi_field_list.txt | April 1, 2022 | 307 bytes | April 1, 2022 | text/plain |
hathi_file_list.json | May 21, 2022 | 17.75 KB | May 21, 2022 | application/octet-stream |
hathi_full_20220401.txt.gz | May 10, 2022 | 1016.98 MB | May 10, 2022 | application/octet-stream |
hathi_full_20220501.txt.gz | May 10, 2022 | 1.03 GB | May 10, 2022 | application/octet-stream |
hathi_upd_20220312.txt.gz | May 10, 2022 | 1.43 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220313.txt.gz | May 10, 2022 | 1.74 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220314.txt.gz | May 10, 2022 | 1.56 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220315.txt.gz | May 10, 2022 | 2.27 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220316.txt.gz | May 10, 2022 | 1.86 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220317.txt.gz | May 10, 2022 | 2.29 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220318.txt.gz | May 10, 2022 | 2.43 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220319.txt.gz | May 10, 2022 | 2.68 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220320.txt.gz | May 10, 2022 | 2.01 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220321.txt.gz | May 10, 2022 | 2 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220322.txt.gz | May 10, 2022 | 1.77 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220323.txt.gz | May 10, 2022 | 1.97 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220324.txt.gz | May 10, 2022 | 1.49 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220325.txt.gz | May 10, 2022 | 2.33 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220326.txt.gz | May 10, 2022 | 1.87 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220327.txt.gz | May 10, 2022 | 1.67 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220328.txt.gz | May 10, 2022 | 1.55 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220329.txt.gz | May 10, 2022 | 2.08 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220330.txt.gz | May 10, 2022 | 2.77 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220331.txt.gz | May 10, 2022 | 1.37 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220401.txt.gz | May 10, 2022 | 1.57 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220402.txt.gz | May 10, 2022 | 1.31 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220403.txt.gz | May 10, 2022 | 1.98 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220404.txt.gz | May 10, 2022 | 1.41 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220405.txt.gz | May 10, 2022 | 1.18 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220406.txt.gz | May 13, 2022 | 1.89 MB | May 13, 2022 | application/octet-stream |
hathi_upd_20220407.txt.gz | May 10, 2022 | 1.34 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220408.txt.gz | May 10, 2022 | 1.79 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220409.txt.gz | May 10, 2022 | 1.46 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220410.txt.gz | May 10, 2022 | 800.4 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220411.txt.gz | May 10, 2022 | 655 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220412.txt.gz | May 10, 2022 | 788.03 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220413.txt.gz | May 10, 2022 | 1.19 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220414.txt.gz | May 10, 2022 | 1.51 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220415.txt.gz | May 10, 2022 | 1.72 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220416.txt.gz | May 10, 2022 | 2.05 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220417.txt.gz | May 10, 2022 | 2.83 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220418.txt.gz | May 10, 2022 | 2.16 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220419.txt.gz | May 10, 2022 | 931.72 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220420.txt.gz | May 10, 2022 | 997.41 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220421.txt.gz | May 10, 2022 | 2.25 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220422.txt.gz | May 10, 2022 | 2.22 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220423.txt.gz | May 10, 2022 | 1.57 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220424.txt.gz | May 10, 2022 | 2.09 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220425.txt.gz | May 10, 2022 | 3.54 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220426.txt.gz | May 10, 2022 | 2.16 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220427.txt.gz | May 10, 2022 | 1.69 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220428.txt.gz | May 10, 2022 | 1.61 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220429.txt.gz | May 10, 2022 | 485.79 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220430.txt.gz | May 10, 2022 | 2.3 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220501.txt.gz | May 10, 2022 | 2.92 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220502.txt.gz | May 10, 2022 | 3.39 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220503.txt.gz | May 10, 2022 | 2.58 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220504.txt.gz | May 10, 2022 | 2.06 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220505.txt.gz | May 10, 2022 | 116.28 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220506.txt.gz | May 10, 2022 | 152.04 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220507.txt.gz | May 10, 2022 | 482.59 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220508.txt.gz | May 10, 2022 | 1.94 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220509.txt.gz | May 10, 2022 | 2.08 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220510.txt.gz | May 10, 2022 | 2.63 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220511.txt.gz | May 13, 2022 | 2.07 MB | May 11, 2022 | application/octet-stream |
hathi_upd_20220512.txt.gz | May 13, 2022 | 1.6 MB | May 12, 2022 | application/octet-stream |
hathi_upd_20220513.txt.gz | May 13, 2022 | 2.26 MB | May 13, 2022 | application/octet-stream |
hathi_upd_20220514.txt.gz | May 14, 2022 | 934.23 KB | May 14, 2022 | application/octet-stream |
hathi_upd_20220515.txt.gz | May 15, 2022 | 2.6 MB | May 15, 2022 | application/octet-stream |
hathi_upd_20220516.txt.gz | May 16, 2022 | 2.74 MB | May 16, 2022 | application/octet-stream |
hathi_upd_20220517.txt.gz | May 17, 2022 | 1.6 MB | May 17, 2022 | application/octet-stream |
hathi_upd_20220518.txt.gz | May 18, 2022 | 2.71 MB | May 18, 2022 | application/octet-stream |
hathi_upd_20220519.txt.gz | May 19, 2022 | 3.01 MB | May 19, 2022 | application/octet-stream |
hathi_upd_20220520.txt.gz | May 20, 2022 | 2.3 MB | May 20, 2022 | application/octet-stream |
hathi_upd_20220521.txt.gz | May 21, 2022 | 2.84 MB | May 21, 2022 | application/octet-stream |
ucal_barcodes_dollarified_201403.txt.gz | March 13, 2014 | 686.82 KB | March 13, 2014 | application/octet-stream |
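
To bring a local copy up to date from a listing like the one above, one plausible approach is to start from the most recent “hathi_full_” file and then apply the “hathi_upd_” files that follow it. The sketch below only works out which filenames are needed, in order. It assumes the date in each filename is in YYYYMMDD form, as the listing suggests, and it includes the update file dated the same day as the full file as a precaution, since the exact cutoff between the two is not stated here. `parse_name` and `plan_downloads` are hypothetical helper names.

```python
import re
from datetime import date
from typing import List, Optional, Tuple

# Matches names such as hathi_full_20220501.txt.gz or hathi_upd_20220502.txt.gz.
_NAME = re.compile(r"^hathi_(full|upd)_(\d{4})(\d{2})(\d{2})\.txt\.gz$")


def parse_name(filename: str) -> Optional[Tuple[str, date]]:
    """Return (kind, date) for a full or update file, or None for anything else."""
    match = _NAME.match(filename)
    if match is None:
        return None
    kind, year, month, day = match.groups()
    try:
        return kind, date(int(year), int(month), int(day))
    except ValueError:
        # The digits do not form a real calendar date.
        return None


def plan_downloads(filenames: List[str]) -> List[str]:
    """Return the latest full file, then the update files dated on or after it."""
    parsed = [(parse_name(name), name) for name in filenames]
    fulls = sorted((info[1], name) for info, name in parsed if info and info[0] == "full")
    if not fulls:
        return []
    full_date, full_name = fulls[-1]
    # Assumption: same-day updates are kept so that no change is missed.
    updates = sorted(
        (info[1], name)
        for info, name in parsed
        if info and info[0] == "upd" and info[1] >= full_date
    )
    return [full_name] + [name for _, name in updates]
```

Files that do not match either prefix, such as the header file or `hathi_file_list.json`, are ignored by this plan.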