HathiTrust Research Center Extends Non-Consumptive Research Tools to Copyrighted Materials: Expanding Research through Fair Use

September 20, 2018

HathiTrust has reached a major milestone in its history and in the services of the HathiTrust Research Center (HTRC).

Since 2011, HTRC has been developing services and tools that allow researchers to apply text and data mining methods to the HathiTrust collection. Until now, these services have been available only for the portion of the collection that is out of copyright. With the development of a landmark HathiTrust policy and an updated release of HTRC Analytics, HTRC now provides access to the text of the complete 16.7-million-item HathiTrust corpus, including items protected by copyright, for non-consumptive research such as data mining and computational analysis.

The ability to use copyrighted materials for non-consumptive research extends research access to the entire HathiTrust digital collection, which is sustained by HathiTrust’s 140+ member libraries. Researchers can use HTRC’s easy-to-use computational tools, which are well suited to beginners, as well as more complex tools for advanced data analysis.

HTRC Algorithms: A set of tools for assembling collections of digitized text from the HathiTrust corpus and performing text analysis on them. Includes copyrighted items for ALL USERS.
Extracted Features Dataset: A dataset that allows non-consumptive analysis of specific features extracted from the full text of the HathiTrust corpus. Includes copyrighted items for ALL USERS.
HathiTrust+Bookworm: A tool for visualizing and analyzing word usage trends in the HathiTrust corpus. Includes copyrighted items for ALL USERS.
HTRC Data Capsule: A secure computing environment for researcher-driven text analysis on the HathiTrust corpus. Public domain items are available to all users. Exclusive member benefit: full corpus access for the Data Capsule service, including copyrighted items.

How is This Possible?

This work has been several years in the making. A primary goal of HathiTrust is to enable the widest possible lawful research and educational uses of the HathiTrust collection. In recent years, US courts have recognized the solid legal basis for non-consumptive research on copyrighted materials. In 2016, HathiTrust established a working group to develop the Non-Consumptive Use Research Policy to ensure the responsible research use of copyrighted items.

The policy is now enacted in an updated release of HTRC Analytics, which allows researchers to conduct computational text analysis on copyrighted items as permitted under US copyright law. Non-consumptive research use DOES NOT change the legal status of items protected under copyright.

Thanks to all who have helped HathiTrust reach this milestone in our 10th anniversary year. HathiTrust looks forward to supporting researchers in using these resources.

Additional Resources

HTRC Analytics
HTRC Help & Documentation
Chart on HTRC Analytics Tool Access
Getting Started with HTRC Guide
HathiTrust Non-Consumptive Use Research Policy

If you have specific questions or need help getting started, please contact htrc-help@hathitrust.org. For media inquiries, please contact Jessica Rohr, Member Engagement & Communications Specialist, at jbelle@hathitrust.org.