Size of the HathiTrust Federal Documents Collection

HathiTrust contains 412,205 bibliographic records for US Federal Documents, including 970,315 separate digital objects for unique items. Of these bibliographic records, 387,766 represent monographs and 23,985 represent serial titles.

Items in HathiTrust's corpus of Federal documents carry a range of copyright determinations. Some of these determinations may indicate a weakness in HathiTrust's rights determination algorithm or in the Registry's identification of US Federal Documents.

Number of Publications Produced Per Year

The chart below was extracted from the publication dates reported in HathiTrust's holdings fields (974$y). of the digital objects do not include a $y. An alternative method for identifying publication date is to pull from the MARC 008 or 260$c fields. Among HathiTrust's bibliographic records, are missing publication dates in their 008 and 260$c.
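The date lookup described above could be sketched roughly as follows. This is only an illustration: the function name, the plain-string inputs, and the order of the fallback chain (974$y first, then 008 Date 1, then 260$c) are assumptions, not a description of HathiTrust's actual processing.

```python
import re

# Matches a plausible four-digit year (1500-2099) anywhere in a string.
YEAR_RE = re.compile(r"(1[5-9]\d{2}|20\d{2})")


def publication_year(item_y, field_008, field_260c):
    """Return the first usable year as an int, or None if none is found.

    item_y     -- value of 974$y, if present (hypothetical input shape)
    field_008  -- the MARC 008 fixed field as a string, if present
    field_260c -- value of 260$c, if present
    """
    # In MARC 21, Date 1 occupies character positions 07-10 of the 008.
    date1 = field_008[7:11] if field_008 else None
    for candidate in (item_y, date1, field_260c):
        if not candidate:
            continue
        match = YEAR_RE.search(candidate)
        if match:
            return int(match.group(1))
    return None
```

A partially coded 008 date such as "19uu" does not match the pattern, so the lookup falls through to 260$c, where free-text values such as "[1977?]" still yield a year.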

The Superintendent of Documents (SuDoc) call number is assigned to publications based primarily on the agency that published them. Of HathiTrust's bibliographic records, are missing an identifiable SuDoc number.
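Because a SuDoc number begins with letters identifying the issuing agency, records can be grouped by agency with a small helper. The sketch below is illustrative only: the function names are hypothetical, and it assumes the SuDoc numbers have already been pulled out of the records as plain strings.

```python
import re
from collections import Counter

# The leading run of letters in a SuDoc number identifies the agency,
# e.g. "Y" in "Y 4.F 76/2:S.HRG.101-5".
AGENCY_RE = re.compile(r"^\s*([A-Za-z]+)")


def sudoc_agency(sudoc):
    """Return the leading agency letters, or None if none can be identified."""
    if not sudoc:
        return None
    match = AGENCY_RE.match(sudoc)
    return match.group(1).upper() if match else None


def agency_counts(sudocs):
    """Count records per agency; records with no usable SuDoc fall under None."""
    return Counter(sudoc_agency(s) for s in sudocs)
```

The count stored under the `None` key corresponds to records with no identifiable SuDoc number.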


This collection includes materials in more than 100 different languages. Below is a table indicating the language, as pulled from the MARC 008 and 041 fields, along with the number of bibliographic records coded for that language. records are missing language information.
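One way to combine the two language sources is sketched below. It assumes the 008 field is available as a 40-character string and that the 041 language subfields arrive as a list of codes. Treating blank, "|||", and "und" (undetermined) values as missing is an assumption made for this illustration.

```python
# Values treated as "no language information" in this sketch (an assumption).
MISSING = {"", "|||", "und"}


def language_codes(field_008, subfields_041=()):
    """Return the set of language codes found in the 008 and 041 fields."""
    codes = []
    # In MARC 21, the language code occupies positions 35-37 of the 008.
    if field_008 and len(field_008) >= 38:
        codes.append(field_008[35:38])
    codes.extend(subfields_041)
    cleaned = {code.strip().lower() for code in codes}
    return cleaned - MISSING
```

A record for which this returns an empty set would count as missing language information.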

Comprehensiveness Tracking

The HathiTrust collection includes digital objects for % ( unique items) of the currently known corpus of Federal Documents Registry Records. The HathiTrust numbers are current as of and the Registry numbers are current as of .

In addition to tracking the comprehensiveness of HathiTrust's overall federal documents holdings, we have begun tracking the comprehensiveness of several key series and one government agency, based on data in the US Federal Documents Registry.

The number of items in the Registry is overstated for most of these series. Enumeration/chronology processing and deduplication have been performed on all except Foreign Relations and the Civil Rights Commission, but their numbers remain inflated because some enumeration/chronology data cannot be reconciled. The comprehensiveness percentages will improve as our duplicate identification processes progress.
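The effect of unreconciled duplicates on these percentages can be seen in a toy calculation. The normalization rule below is a deliberately crude stand-in and an assumption of this sketch, not the Registry's enumeration/chronology processing. The sample values are invented for illustration.

```python
def normalize_enumchron(value):
    """Drop case, spacing and punctuation so 'V. 1' and 'v.1' compare equal."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def raw_percent(held, registry):
    """Comprehensiveness with no deduplication of the Registry list."""
    return 100.0 * len(held) / len(registry) if registry else 0.0


def comprehensiveness(held, registry):
    """Percent of distinct Registry items that are also held."""
    registry_keys = {normalize_enumchron(v) for v in registry}
    if not registry_keys:
        return 0.0
    held_keys = {normalize_enumchron(v) for v in held} & registry_keys
    return 100.0 * len(held_keys) / len(registry_keys)


# Invented sample: five Registry entries that describe only three volumes.
registry = ["v.1", "V. 1", "v.2", "v.3 1950", "v.3 (1950)"]
held = ["v.1", "v.2"]
print(raw_percent(held, registry))        # 2 of 5 entries
print(comprehensiveness(held, registry))  # 2 of 3 distinct volumes
```

With the duplicate entries collapsed, the denominator shrinks from five to three and the percentage rises, which is the direction of change the paragraph above anticipates.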