The HathiTrust Research Center at the Code4Lib Conference

April 21, 2023

The HathiTrust Research Center (HTRC) presented a pre-conference workshop at Code4Lib 2023 on March 14. The workshop highlighted a new API under development as part of the NEH-funded project Tools for Open Research and Computation with HathiTrust: Leveraging Intelligent Text Extraction (TORCHLITE). The API will provide quick and easy access to the Extracted Features dataset, a derived dataset consisting of volume metadata and word-level statistical data for the HathiTrust corpus. The HTRC team is building the API, along with an interactive dashboard, so that our user community can develop its own tools for working with data from the 17.5-million-volume HathiTrust Digital Library. The workshop introduced the Extracted Features dataset and the new TORCHLITE API, and set the stage for an NEH-funded hackathon in fall 2023.

The following teaching materials from the workshop are available for reuse and remixing: