The "Hathifiles" are tab-delimited text files that describe every item in the HathiTrust collection. They include information derived from the bibliographic record (e.g., title, publisher, language, and commonly used identifiers), rights and access codes, and information about the source of the item.
The “Hathifiles Description” page describes the fields included in the hathifiles and suggests potential use cases.
Files provided below
A full file is uploaded on the first of every month with a row for every item that is in the HathiTrust collection at the moment the file is created. The filename begins with “hathi_full_”. These files are large (about 1 GB compressed in the listing below) and may be difficult to open with standard spreadsheet software or text editors, so you may need to work with them programmatically (e.g., using Python to extract the desired data).
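As one way to work with a full file programmatically, the sketch below streams a gzipped, tab-delimited file one row at a time, so the whole file never has to be decompressed to disk or held in memory. The function names `iter_rows` and `select_rows` are hypothetical, UTF-8 encoding is an assumption, and the column position to filter on must be looked up in the field descriptions rather than taken from this example.

```python
import gzip
from typing import Iterator, List, Set


def iter_rows(path: str) -> Iterator[List[str]]:
    """Yield each row of a gzipped, tab-delimited file as a list of strings.

    Assumption: the file is UTF-8 text with one item per line.
    """
    # newline="\n" makes "\n" the only line terminator, so a stray
    # carriage return inside a field does not split a row in two.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="\n") as handle:
        for line in handle:
            # Split on tabs only; quote characters in titles are left as-is.
            yield line.rstrip("\r\n").split("\t")


def select_rows(path: str, column_index: int, wanted: Set[str]) -> Iterator[List[str]]:
    """Yield only the rows whose value at column_index is in wanted.

    column_index is a zero-based position chosen by the caller.
    """
    for row in iter_rows(path):
        if len(row) > column_index and row[column_index] in wanted:
            yield row
```

For example, `sum(1 for _ in iter_rows("hathi_full_20210301.txt.gz"))` would count the rows in the March 2021 full file after it has been downloaded to the working directory.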
An update file is uploaded every day and contains a row for every item that changed in the previous 24 hours. The filename begins with “hathi_upd_”. An item is included in an update file if any of the following has occurred: the item was newly deposited into the collection, a new copy of the digital item replaced the previous copy, the rights and access status changed, or a new bibliographic record was provided by the contributor.
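A local copy built from a full file can be kept current by applying the later update files in date order. The following is a minimal sketch of that idea, not a documented procedure. It assumes that the eight digits in each filename are a `YYYYMMDD` date, that update files dated on or after the full file's date are the ones to apply, and that the first column of each row uniquely identifies an item. The function names are hypothetical.

```python
import re
from datetime import date, datetime
from typing import Dict, Iterable, List, Optional

# Matches names such as hathi_upd_20210301.txt.gz and captures the date part.
UPDATE_NAME = re.compile(r"^hathi_upd_(\d{8})\.txt\.gz$")


def update_date(filename: str) -> Optional[date]:
    """Return the date embedded in an update filename, or None if it is not one."""
    match = UPDATE_NAME.match(filename)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def updates_since(filenames: Iterable[str], full_date: date) -> List[str]:
    """Return update filenames dated on or after full_date, oldest first."""
    dated = []
    for name in filenames:
        stamp = update_date(name)
        if stamp is not None and stamp >= full_date:
            dated.append((stamp, name))
    return [name for _, name in sorted(dated)]


def apply_update(
    items: Dict[str, List[str]],
    update_rows: Iterable[List[str]],
    key_index: int = 0,
) -> None:
    """Insert or overwrite rows in items, keyed by the value at key_index.

    Assumption: the column at key_index identifies the item.
    """
    for row in update_rows:
        items[row[key_index]] = row
```

Because each update row overwrites the stored row with the same key, applying an update file that overlaps the full file is harmless. A full file has a row for every item in the collection, so a dictionary like this is most practical for a filtered subset.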
A “header” file is also included below. This file contains one row of labels for the data elements included in the hathifiles. It can be combined with the regular hathifiles for ease of working with the data. This header file is only updated when a new data element is added to the hathifiles.
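One way to combine the header file with a data file is to read the single row of labels and pair it with each data row. The sketch below assumes the header file is plain, uncompressed text with tab-separated labels (it may be the `hathi_field_list.txt` entry in the listing, which is an assumption) and that both files are UTF-8. The function names `read_labels` and `iter_records` are hypothetical.

```python
import gzip
from typing import Dict, Iterator, List


def read_labels(header_path: str) -> List[str]:
    """Read the one-row header file and return its labels in order.

    Assumption: labels are tab-separated, like the data files.
    """
    with open(header_path, encoding="utf-8") as handle:
        return handle.readline().rstrip("\r\n").split("\t")


def iter_records(data_path: str, labels: List[str]) -> Iterator[Dict[str, str]]:
    """Yield each row of a gzipped data file as a dict keyed by label."""
    with gzip.open(data_path, mode="rt", encoding="utf-8", newline="\n") as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.rstrip("\r\n").split("\t")
            # A count mismatch suggests the header and data file come from
            # different versions of the field list, so stop instead of guessing.
            if len(fields) != len(labels):
                raise ValueError(
                    f"line {line_number}: expected {len(labels)} fields, "
                    f"found {len(fields)}"
                )
            yield dict(zip(labels, fields))
```

Raising on a field-count mismatch is a deliberate choice here: the header file only changes when a data element is added, so an older data file paired with a newer header would otherwise be mislabeled silently.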
Display name | Created | Size | Modified | MIME type |
---|---|---|---|---|
hathi_field_list.txt | March 1, 2021 | 307 bytes | March 1, 2021 | text/plain |
hathi_file_list.json | March 5, 2021 | 21.16 KB | March 5, 2021 | application/octet-stream |
hathi_full_20201001.txt.gz | October 1, 2020 | 1008.56 MB | October 1, 2020 | application/octet-stream |
hathi_full_20201101.txt.gz | November 1, 2020 | 1008.74 MB | November 1, 2020 | application/octet-stream |
hathi_full_20201201.txt.gz | December 1, 2020 | 1008.74 MB | December 1, 2020 | application/octet-stream |
hathi_full_20210101.txt.gz | January 1, 2021 | 1008.93 MB | January 1, 2021 | application/octet-stream |
hathi_full_20210201.txt.gz | February 1, 2021 | 1008.6 MB | February 1, 2021 | application/octet-stream |
hathi_full_20210301.txt.gz | March 1, 2021 | 1009.44 MB | March 1, 2021 | application/octet-stream |
hathi_upd_20210101.txt.gz | January 2, 2021 | 2.29 MB | January 2, 2021 | application/octet-stream |
hathi_upd_20210102.txt.gz | January 3, 2021 | 1.35 MB | January 3, 2021 | application/octet-stream |
hathi_upd_20210103.txt.gz | January 4, 2021 | 1.45 MB | January 4, 2021 | application/octet-stream |
hathi_upd_20210104.txt.gz | January 5, 2021 | 1.35 MB | January 5, 2021 | application/octet-stream |
hathi_upd_20210105.txt.gz | January 6, 2021 | 2.36 MB | January 6, 2021 | application/octet-stream |
hathi_upd_20210106.txt.gz | January 7, 2021 | 2.05 MB | January 7, 2021 | application/octet-stream |
hathi_upd_20210107.txt.gz | January 8, 2021 | 2.42 MB | January 8, 2021 | application/octet-stream |
hathi_upd_20210108.txt.gz | January 9, 2021 | 24.99 KB | January 9, 2021 | application/octet-stream |
hathi_upd_20210109.txt.gz | January 10, 2021 | 1.39 MB | January 10, 2021 | application/octet-stream |
hathi_upd_20210110.txt.gz | January 11, 2021 | 1.71 MB | January 11, 2021 | application/octet-stream |
hathi_upd_20210111.txt.gz | January 12, 2021 | 1.62 MB | January 12, 2021 | application/octet-stream |
hathi_upd_20210112.txt.gz | January 13, 2021 | 1.77 MB | January 13, 2021 | application/octet-stream |
hathi_upd_20210113.txt.gz | January 14, 2021 | 1.36 MB | January 14, 2021 | application/octet-stream |
hathi_upd_20210114.txt.gz | January 15, 2021 | 1.21 MB | January 15, 2021 | application/octet-stream |
hathi_upd_20210115.txt.gz | January 16, 2021 | 1.39 MB | January 16, 2021 | application/octet-stream |
hathi_upd_20210116.txt.gz | January 17, 2021 | 1.41 MB | January 17, 2021 | application/octet-stream |
hathi_upd_20210117.txt.gz | January 18, 2021 | 1.5 MB | January 18, 2021 | application/octet-stream |
hathi_upd_20210118.txt.gz | January 19, 2021 | 358.94 KB | January 19, 2021 | application/octet-stream |
hathi_upd_20210119.txt.gz | January 20, 2021 | 1.3 MB | January 20, 2021 | application/octet-stream |
hathi_upd_20210120.txt.gz | January 21, 2021 | 1.74 MB | January 21, 2021 | application/octet-stream |
hathi_upd_20210121.txt.gz | January 22, 2021 | 1.9 MB | January 22, 2021 | application/octet-stream |
hathi_upd_20210122.txt.gz | January 23, 2021 | 1.42 MB | January 23, 2021 | application/octet-stream |
hathi_upd_20210123.txt.gz | January 24, 2021 | 1.76 MB | January 24, 2021 | application/octet-stream |
hathi_upd_20210124.txt.gz | January 25, 2021 | 1.14 MB | January 25, 2021 | application/octet-stream |
hathi_upd_20210125.txt.gz | January 26, 2021 | 1.14 KB | January 26, 2021 | application/octet-stream |
hathi_upd_20210126.txt.gz | January 27, 2021 | 2.26 MB | January 27, 2021 | application/octet-stream |
hathi_upd_20210127.txt.gz | January 28, 2021 | 1.46 MB | January 28, 2021 | application/octet-stream |
hathi_upd_20210128.txt.gz | January 29, 2021 | 1.55 MB | January 29, 2021 | application/octet-stream |
hathi_upd_20210129.txt.gz | January 30, 2021 | 1.52 MB | January 30, 2021 | application/octet-stream |
hathi_upd_20210130.txt.gz | January 31, 2021 | 1.45 MB | January 31, 2021 | application/octet-stream |
hathi_upd_20210131.txt.gz | February 1, 2021 | 1.44 MB | February 1, 2021 | application/octet-stream |
hathi_upd_20210201.txt.gz | February 2, 2021 | 1.36 MB | February 2, 2021 | application/octet-stream |
hathi_upd_20210202.txt.gz | February 3, 2021 | 1.53 MB | February 3, 2021 | application/octet-stream |
hathi_upd_20210203.txt.gz | February 4, 2021 | 1.42 MB | February 4, 2021 | application/octet-stream |
hathi_upd_20210204.txt.gz | February 5, 2021 | 1.94 MB | February 5, 2021 | application/octet-stream |
hathi_upd_20210205.txt.gz | February 6, 2021 | 1.8 MB | February 6, 2021 | application/octet-stream |
hathi_upd_20210206.txt.gz | February 7, 2021 | 1.44 MB | February 7, 2021 | application/octet-stream |
hathi_upd_20210207.txt.gz | February 8, 2021 | 1.49 MB | February 8, 2021 | application/octet-stream |
hathi_upd_20210208.txt.gz | February 9, 2021 | 1.41 KB | February 9, 2021 | application/octet-stream |
hathi_upd_20210209.txt.gz | February 10, 2021 | 3.01 MB | February 10, 2021 | application/octet-stream |
hathi_upd_20210210.txt.gz | February 11, 2021 | 1.27 MB | February 11, 2021 | application/octet-stream |
hathi_upd_20210211.txt.gz | February 12, 2021 | 1.28 MB | February 12, 2021 | application/octet-stream |
hathi_upd_20210212.txt.gz | February 13, 2021 | 3.49 MB | February 13, 2021 | application/octet-stream |
hathi_upd_20210213.txt.gz | February 15, 2021 | 3.59 MB | February 15, 2021 | application/octet-stream |
hathi_upd_20210214.txt.gz | February 15, 2021 | 1.7 KB | February 15, 2021 | application/octet-stream |
hathi_upd_20210215.txt.gz | February 16, 2021 | 1.21 KB | February 16, 2021 | application/octet-stream |
hathi_upd_20210216.txt.gz | February 17, 2021 | 4.15 MB | February 17, 2021 | application/octet-stream |
hathi_upd_20210217.txt.gz | February 18, 2021 | 1.5 MB | February 18, 2021 | application/octet-stream |
hathi_upd_20210218.txt.gz | February 19, 2021 | 4.47 MB | February 19, 2021 | application/octet-stream |
hathi_upd_20210219.txt.gz | February 20, 2021 | 1.28 MB | February 20, 2021 | application/octet-stream |
hathi_upd_20210220.txt.gz | February 21, 2021 | 1.72 MB | February 21, 2021 | application/octet-stream |
hathi_upd_20210221.txt.gz | February 22, 2021 | 1.06 MB | February 22, 2021 | application/octet-stream |
hathi_upd_20210222.txt.gz | February 23, 2021 | 779.27 KB | February 23, 2021 | application/octet-stream |
hathi_upd_20210223.txt.gz | February 24, 2021 | 1.86 MB | February 24, 2021 | application/octet-stream |
hathi_upd_20210224.txt.gz | February 25, 2021 | 1.73 MB | February 25, 2021 | application/octet-stream |
hathi_upd_20210225.txt.gz | February 26, 2021 | 797.44 KB | February 26, 2021 | application/octet-stream |
hathi_upd_20210226.txt.gz | February 27, 2021 | 1.85 MB | February 27, 2021 | application/octet-stream |
hathi_upd_20210227.txt.gz | February 28, 2021 | 970.8 KB | February 28, 2021 | application/octet-stream |
hathi_upd_20210228.txt.gz | March 1, 2021 | 1.28 MB | March 1, 2021 | application/octet-stream |
hathi_upd_20210301.txt.gz | March 2, 2021 | 798.99 KB | March 2, 2021 | application/octet-stream |
hathi_upd_20210302.txt.gz | March 3, 2021 | 887.14 KB | March 3, 2021 | application/octet-stream |
hathi_upd_20210303.txt.gz | March 4, 2021 | 1.04 MB | March 4, 2021 | application/octet-stream |
hathi_upd_20210304.txt.gz | March 5, 2021 | 2.8 MB | March 5, 2021 | application/octet-stream |
ucal_barcodes_dollarified_201403.txt.gz | March 13, 2014 | 686.82 KB | March 13, 2014 | application/octet-stream |