U.S. Federal Documents Program Update, January 2018

Federal Documents within the HathiTrust Digital Library, as of January 1, 2018

  • 424,498 bibliographic records
  • 1,054,845 digital objects
  • 398,738 monographs
  • 25,415 serial titles

Year In Review

As 2018 begins, it is a good moment to reflect on the HathiTrust US Federal Documents Program’s progress over the past year. In 2017 we made progress in defining our collection boundaries and priorities, adding to the collection, improving discovery, and providing value by promoting awareness of our collection and making it available for new uses.

Additions to our collection

Approximately 33,000 monographic US federal documents were added to the HathiTrust Digital Library in 2017 from 35 libraries, including major contributions from the following member libraries:

  • University of California, Riverside (8,146)
  • University of Virginia (5,675)
  • Northwestern University (4,917)
  • University of Illinois at Urbana-Champaign (2,019)
  • The Ohio State University (1,998)
  • University of California, Los Angeles (including the UC SRLF) (1,730)
  • University of Michigan (1,523)
  • University of California, Berkeley (including the UC NRLF) (1,260)
  • Texas A&M University (1,205)
  • Also of note, 1,916 federal documents from the TRAIL project were ingested into HathiTrust.

Highlights of the new additions include:

  • More than 1,200 Flood Insurance Studies, the majority contributed by Texas A&M
  • 1,061 public and private laws; 1,336 treaties; Census of Agriculture volumes for 1978, 1982, and 1987 contributed by UC Riverside (in addition to many other publications!)
  • More than 1,500 publications from the Army, including more than 400 Army Technical Manuals, contributed by University of Virginia
  • 369 WPA publications contributed by Northwestern University
  • More than 500 publications from the National Recovery Administration, contributed by the University of Florida.

We also added hundreds of publications from agencies we’ve prioritized (EPA, NOAA, NASA, CIA, and BIA) and close to 3,200 volumes authored by Congress (including House and Senate Committees but not Congressional agencies, e.g., the Library of Congress), including at least 1,563 Hearings. More statistics on the overall federal documents collection, including a SuDoc breakdown, may be found in our regularly updated HathiTrust Collection Profile. Many thanks to all who have contributed!

Collection development

In the first full year of the Program’s existence, emphasis was on delineating our collection out of the mass digitized whole, digging in to understand it more comprehensively, and working with our member advisors to set collection development priorities.

We also began tackling gap filling in priority areas of the collection, beginning with a pilot project with a handful of member libraries. The project is intended to test our ability to provide actionable data on items needed to fill gaps, and to test libraries’ ability to provide those volumes for digitization via existing workflows. So far we have had mixed success, and have learned that the data HathiTrust provides is useful, but the digitization workflow integration piece has proved challenging for some libraries. We are continuing to work through the complexities and hope that the project will increase our knowledge and inform future strategies for identifying and filling in gaps in the whole HathiTrust collection.

A key tool in this process has been our US Federal Documents Registry. In 2017 we moved from a focus on building and refining the Registry to using it for collection building and gap identification. In summer 2017, we incorporated the Library of Congress Name Authority File (LCNAF) into the Registry, enabling us to better identify agency publications. Use of the Registry has enabled us to create a number of collections that provide additional access points for federal documents in HathiTrust. The collections are intended to highlight those collection priorities listed in the Federal Documents Collection Framework, and include:

The collections will be updated regularly until they are complete. Additionally, we have provided a number of the collections as worksets in the HathiTrust Research Center (HTRC), enabling new kinds of computational research using these federal documents. Worksets are defined slices of the overall corpus hosted by HTRC, and now include:

Improving discovery

The Federal Documents Program has driven improvements in HathiTrust infrastructure, serving as an initial case for upgrades in multiple HathiTrust service areas. The mechanisms of the HathiTrust “Collection Builder” (a tool for creating and maintaining collections, available to end users) were extended to enable building our very large collection of all federal documents held in HathiTrust, which provides a new, focused discovery point for end users. The ability to download a set of metadata for items in the collection has been extended to very large collections like ours, and includes a choice of downloading metadata for all items or only for full-view items. The HathiTrust Research Center extended its capability to provide extremely large worksets in order to construct a comprehensive workset of items identified as federal documents. Driven by inquiries to open up federal documents, we’ve looked at workflows and improved interactions with the Copyright Review Program. Seeking to improve the bibliographic records HathiTrust receives that correspond with deposits of digitized federal documents, we worked with our metadata management team to specify a “fed docs suggester” report that will go to our contributors when they add metadata; the report is planned for implementation this year. The metadata management team will also be exploring possibilities for improving the “best record” for a given federal document that is displayed to end users.

Shared Print

Also noteworthy, during 2017 our HathiTrust Shared Print Program secured close to 382,000 shared print commitments for federal documents in Phase 1 of program implementation.

2017 Publications & Presentations

For more information on our program and activities, the following publications and presentations we produced in 2017 may be of interest:

Onward into 2018

Program aims for 2018 include:

  • Continuation of our collaborative “gap-filling” project to increase the number of digitized documents in priority collection areas and gain insight into the processes and data that libraries need in order to contribute needed items. We plan to conclude the initial phase of the project in June 2018, and expect that it will inform targeted collection-building for the greater HathiTrust collection.

  • A federal documents-focused user needs investigation that includes an environmental scan, survey, and user research.

  • Envisioning and planning a project focused on collaborative metadata improvement. We have a significant need to improve bibliographic records for federal documents in order to 1) improve the identification of federal documents within the mass digitized corpus, 2) reliably match to library collections for shared print and other uses, 3) detect duplicates, and especially 4) provide better end user discovery and 5) open documents that are currently in limited view.

  • Revisiting the Federal Documents Registry as a service for members and/or mechanism for collaborative HathiTrust collection development.

  • Activities that foster connection with our community. We are eager to bring more users to federal documents and to engage you, our community, in collection building and improvement. We are aiming for the ALA Annual Meeting as an opportunity for an in-person event, so please stay tuned!