The "Hathifiles" are tab-delimited text files that describe every item in the HathiTrust collection. They include information derived from the bibliographic record (such as title, publisher, language, and commonly used identifiers), rights and access codes, and information about the source of the item.
The “Hathifiles Description” page describes the fields included in the hathifiles and suggests potential use cases.
Files provided below:
A monthly file is uploaded on the first of every month with a row for every item that is in the HathiTrust collection at the moment the file is created. The filename begins with “hathi_full_”. These files are large (about 1 GB compressed in the listing below) and may be difficult to open with standard spreadsheet software or text editors. You may need to work with the files programmatically (e.g., using Python to extract the desired data).
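As a minimal sketch of that programmatic approach, the Python below streams a gzipped, tab-delimited file one row at a time, so the whole file never has to fit in memory. The function names `iter_rows` and `count_matching` are hypothetical, and the UTF-8 encoding and the choice to treat quote characters as ordinary data are assumptions, not documented properties of the files.

```python
import csv
import gzip
from typing import Iterator, List


def iter_rows(path: str) -> Iterator[List[str]]:
    """Yield each row of a gzipped, tab-delimited file as a list of strings."""
    # Text mode decompresses and decodes on the fly (UTF-8 is an assumption).
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        # QUOTE_NONE treats quote characters as plain data, so a stray
        # quotation mark inside a field cannot merge several rows into one.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        yield from reader


def count_matching(path: str, column: int, value: str) -> int:
    """Count rows whose field at zero-based position `column` equals `value`."""
    return sum(
        1
        for row in iter_rows(path)
        if len(row) > column and row[column] == value
    )
```

Which column to filter on depends on the field order given in the “Hathifiles Description” page; the sketch takes a position rather than guessing a field name.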
An update file is uploaded every day and contains a row for every item that has changed in the previous 24 hours. The filename begins with “hathi_upd_”. An item is included in an update file if any of the following has occurred: the item was newly deposited into the collection, a new copy of the digital item replaced the previous copy, the item's rights and access status changed, or the contributor provided a new bibliographic record.
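One way to use the update files is to overlay their rows on a snapshot built from a monthly file. The sketch below assumes that each update row is a complete replacement for the item's previous row and that one column (position 0 by default) uniquely identifies the item; neither assumption is stated on this page, and `apply_updates` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def apply_updates(
    snapshot: Dict[str, List[str]],
    update_rows: Iterable[List[str]],
    key_column: int = 0,
) -> Dict[str, List[str]]:
    """Overlay update rows on a snapshot keyed by item identifier.

    A row whose key already exists replaces the old row; a row with a new
    key is added. The snapshot is modified in place and also returned.
    """
    for row in update_rows:
        if len(row) <= key_column:
            continue  # skip blank or truncated rows
        snapshot[row[key_column]] = row
    return snapshot
```

Update files would need to be applied oldest first so that the most recent row for an item wins. Holding every row of a monthly file in a dictionary may need a lot of memory, so a database table keyed the same way may be a better fit for the full collection.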
A “header” file (hathi_field_list.txt in the listing below) is also included. This file contains one row of labels for the data elements included in the hathifiles. It can be combined with the regular hathifiles to make the data easier to work with. The header file is updated only when a new data element is added to the hathifiles.
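Combining the header file with a data file could look like the following sketch, which reads the labels once and then yields each data row as a dictionary keyed by label. It assumes the labels in the header file are separated by tabs, like the data rows, and that both files are UTF-8; `read_field_names` and `iter_records` are hypothetical names.

```python
import csv
import gzip
from typing import Dict, Iterator, List


def read_field_names(header_path: str) -> List[str]:
    """Read the single row of labels from the (uncompressed) header file."""
    with open(header_path, encoding="utf-8") as handle:
        # Assumption: labels are tab-separated, matching the data files.
        return handle.readline().rstrip("\r\n").split("\t")


def iter_records(data_path: str, header_path: str) -> Iterator[Dict[str, str]]:
    """Yield each row of a gzipped hathifile as a dict keyed by field label."""
    fields = read_field_names(header_path)
    with gzip.open(data_path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(
            handle,
            fieldnames=fields,       # labels come from the header file
            delimiter="\t",
            quoting=csv.QUOTE_NONE,  # treat quote characters as plain data
        )
        yield from reader
```

Reading the labels from the header file, rather than hard-coding them, means the code keeps working when a new data element is added.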
Display name | Created | Size | Modified | MIME type |
---|---|---|---|---|
hathi_upd_20220606.txt.gz | June 6, 2022 | 4.19 MB | June 6, 2022 | application/octet-stream |
hathi_full_20220601.txt.gz | June 2, 2022 | 1.03 GB | June 1, 2022 | application/octet-stream |
hathi_upd_20220605.txt.gz | June 5, 2022 | 3.5 MB | June 5, 2022 | application/octet-stream |
hathi_upd_20220625.txt.gz | June 25, 2022 | 1.89 MB | June 25, 2022 | application/octet-stream |
hathi_upd_20220423.txt.gz | May 10, 2022 | 1.57 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220527.txt.gz | May 27, 2022 | 2.34 MB | May 27, 2022 | application/octet-stream |
hathi_upd_20220428.txt.gz | May 10, 2022 | 1.61 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220425.txt.gz | May 10, 2022 | 3.54 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220524.txt.gz | May 24, 2022 | 2.18 MB | May 24, 2022 | application/octet-stream |
hathi_upd_20220523.txt.gz | May 23, 2022 | 5.13 MB | May 23, 2022 | application/octet-stream |
hathi_upd_20220603.txt.gz | June 3, 2022 | 2.84 MB | June 3, 2022 | application/octet-stream |
hathi_upd_20220511.txt.gz | May 13, 2022 | 2.07 MB | May 11, 2022 | application/octet-stream |
hathi_upd_20220624.txt.gz | June 24, 2022 | 4.81 MB | June 24, 2022 | application/octet-stream |
hathi_upd_20220519.txt.gz | May 19, 2022 | 3.01 MB | May 19, 2022 | application/octet-stream |
hathi_upd_20220626.txt.gz | June 26, 2022 | 17.89 KB | June 26, 2022 | application/octet-stream |
hathi_upd_20220617.txt.gz | June 17, 2022 | 901.28 KB | June 17, 2022 | application/octet-stream |
hathi_upd_20220614.txt.gz | June 14, 2022 | 24.14 KB | June 14, 2022 | application/octet-stream |
hathi_upd_20220430.txt.gz | May 10, 2022 | 2.3 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220507.txt.gz | May 10, 2022 | 482.59 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220608.txt.gz | June 8, 2022 | 3.09 MB | June 8, 2022 | application/octet-stream |
hathi_upd_20220512.txt.gz | May 13, 2022 | 1.6 MB | May 12, 2022 | application/octet-stream |
hathi_upd_20220419.txt.gz | May 10, 2022 | 931.72 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220502.txt.gz | May 10, 2022 | 3.39 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220621.txt.gz | June 21, 2022 | 2.33 MB | June 21, 2022 | application/octet-stream |
hathi_upd_20220529.txt.gz | May 29, 2022 | 2.01 MB | May 29, 2022 | application/octet-stream |
hathi_upd_20220420.txt.gz | May 10, 2022 | 997.41 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220530.txt.gz | May 30, 2022 | 3.49 MB | May 30, 2022 | application/octet-stream |
hathi_upd_20220521.txt.gz | May 21, 2022 | 2.84 MB | May 21, 2022 | application/octet-stream |
hathi_upd_20220613.txt.gz | June 13, 2022 | 132.39 KB | June 13, 2022 | application/octet-stream |
hathi_upd_20220528.txt.gz | May 28, 2022 | 2.84 MB | May 28, 2022 | application/octet-stream |
hathi_upd_20220504.txt.gz | May 10, 2022 | 2.06 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220526.txt.gz | May 26, 2022 | 2.72 MB | May 26, 2022 | application/octet-stream |
hathi_field_list.txt | April 1, 2022 | 307 bytes | April 1, 2022 | text/plain |
hathi_upd_20220508.txt.gz | May 10, 2022 | 1.94 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220610.txt.gz | June 10, 2022 | 2.62 MB | June 10, 2022 | application/octet-stream |
hathi_upd_20220520.txt.gz | May 20, 2022 | 2.3 MB | May 20, 2022 | application/octet-stream |
hathi_upd_20220427.txt.gz | May 10, 2022 | 1.69 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220601.txt.gz | June 1, 2022 | 21.52 MB | June 1, 2022 | application/octet-stream |
hathi_upd_20220522.txt.gz | May 22, 2022 | 2.27 MB | May 22, 2022 | application/octet-stream |
ucal_barcodes_dollarified_201403.txt.gz | March 13, 2014 | 686.82 KB | March 13, 2014 | application/octet-stream |
hathi_upd_20220619.txt.gz | June 19, 2022 | 10.41 KB | June 19, 2022 | application/octet-stream |
hathi_file_list.json | June 28, 2022 | 17.74 KB | June 28, 2022 | application/octet-stream |
hathi_upd_20220620.txt.gz | June 20, 2022 | 4.01 MB | June 20, 2022 | application/octet-stream |
hathi_upd_20220506.txt.gz | May 10, 2022 | 152.04 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220513.txt.gz | May 13, 2022 | 2.26 MB | May 13, 2022 | application/octet-stream |
hathi_upd_20220516.txt.gz | May 16, 2022 | 2.74 MB | May 16, 2022 | application/octet-stream |
hathi_upd_20220609.txt.gz | June 9, 2022 | 4.06 MB | June 9, 2022 | application/octet-stream |
hathi_upd_20220424.txt.gz | May 10, 2022 | 2.09 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220509.txt.gz | May 10, 2022 | 2.08 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220514.txt.gz | May 14, 2022 | 934.23 KB | May 14, 2022 | application/octet-stream |
hathi_upd_20220517.txt.gz | May 17, 2022 | 1.6 MB | May 17, 2022 | application/octet-stream |
hathi_upd_20220505.txt.gz | May 10, 2022 | 116.28 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220615.txt.gz | June 16, 2022 | 71.16 KB | June 16, 2022 | application/octet-stream |
hathi_upd_20220607.txt.gz | June 7, 2022 | 2.57 MB | June 7, 2022 | application/octet-stream |
hathi_upd_20220510.txt.gz | May 10, 2022 | 2.63 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220627.txt.gz | June 27, 2022 | 1.51 KB | June 27, 2022 | application/octet-stream |
hathi_upd_20220618.txt.gz | June 18, 2022 | 233.83 KB | June 18, 2022 | application/octet-stream |
hathi_upd_20220515.txt.gz | May 15, 2022 | 2.6 MB | May 15, 2022 | application/octet-stream |
hathi_upd_20220426.txt.gz | May 10, 2022 | 2.16 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220429.txt.gz | May 10, 2022 | 485.79 KB | May 10, 2022 | application/octet-stream |
hathi_upd_20220604.txt.gz | June 4, 2022 | 3.52 MB | June 4, 2022 | application/octet-stream |
hathi_upd_20220503.txt.gz | May 10, 2022 | 2.58 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220501.txt.gz | May 10, 2022 | 2.92 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220612.txt.gz | June 12, 2022 | 154.24 KB | June 12, 2022 | application/octet-stream |
hathi_upd_20220611.txt.gz | June 11, 2022 | 3.08 MB | June 11, 2022 | application/octet-stream |
hathi_upd_20220602.txt.gz | June 2, 2022 | 3.04 MB | June 2, 2022 | application/octet-stream |
hathi_upd_20220421.txt.gz | May 10, 2022 | 2.25 MB | May 10, 2022 | application/octet-stream |
hathi_full_20220501.txt.gz | May 10, 2022 | 1.03 GB | May 10, 2022 | application/octet-stream |
hathi_upd_20220623.txt.gz | June 23, 2022 | 2.12 MB | June 23, 2022 | application/octet-stream |
hathi_upd_20220518.txt.gz | May 18, 2022 | 2.71 MB | May 18, 2022 | application/octet-stream |
hathi_upd_20220622.txt.gz | June 22, 2022 | 43.03 KB | June 22, 2022 | application/octet-stream |
hathi_upd_20220525.txt.gz | May 25, 2022 | 3.05 MB | May 25, 2022 | application/octet-stream |
hathi_upd_20220628.txt.gz | June 28, 2022 | 342 bytes | June 28, 2022 | application/octet-stream |
hathi_upd_20220616.txt.gz | June 16, 2022 | 148.9 KB | June 16, 2022 | application/octet-stream |
hathi_upd_20220422.txt.gz | May 10, 2022 | 2.22 MB | May 10, 2022 | application/octet-stream |
hathi_upd_20220531.txt.gz | May 31, 2022 | 30.24 MB | May 31, 2022 | application/octet-stream |
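
The listing above is not sorted by date, so a script that rebuilds a current view of the collection first has to pick the newest monthly file and the update files that follow it. The sketch below does that from the filenames alone, assuming the digits after “hathi_full_” and “hathi_upd_” are a YYYYMMDD date, as the names above suggest. `parse_name` and `files_to_process` are hypothetical helpers. Whether the update file dated the same day as a monthly file is already reflected in it is not stated here, so the sketch includes that update as a cautious assumption.

```python
import re
from datetime import date
from typing import List, Optional, Tuple

# Matches names such as hathi_full_20220601.txt.gz or hathi_upd_20220606.txt.gz.
NAME_PATTERN = re.compile(r"^hathi_(full|upd)_(\d{4})(\d{2})(\d{2})\.txt\.gz$")


def parse_name(name: str) -> Optional[Tuple[str, date]]:
    """Return (kind, date) for a hathifile name, or None for anything else."""
    match = NAME_PATTERN.match(name)
    if match is None:
        return None
    kind, year, month, day = match.groups()
    try:
        return kind, date(int(year), int(month), int(day))
    except ValueError:
        return None  # digits that do not form a real calendar date


def files_to_process(names: List[str]) -> List[str]:
    """Return the newest full file, then same-day and later updates, oldest first."""
    dated = []
    for name in names:
        info = parse_name(name)
        if info is not None:
            dated.append((info[1], info[0], name))  # (date, kind, name)
    fulls = [entry for entry in dated if entry[1] == "full"]
    if not fulls:
        return []
    latest_full = max(fulls)
    updates = sorted(
        entry for entry in dated
        if entry[1] == "upd" and entry[0] >= latest_full[0]
    )
    return [latest_full[2]] + [entry[2] for entry in updates]
```

Files that do not match the pattern, such as the header file or the JSON file list, are ignored.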