
2022 HathiTrust Community Week

2022 Community Week July 11-14

Thank you for attending this year's Community Week! 
You can see all the presentation recordings on our 2022 Community Week YouTube playlist. Individual slide presentations and recordings are linked below.

This July, join colleagues from around the world for HathiTrust Community Week, four days of member-led sessions on local projects, research, and workshops on topics from text and data mining to science fiction. All sessions are open to any interested party affiliated with a member library. You may register for as many sessions as you wish.

HathiTrust Community Week is scheduled Monday, July 11 through Thursday, July 14. All sessions will be recorded.

Monday, July 11 Sessions
Tuesday, July 12 Sessions
Wednesday, July 13 Sessions
Thursday, July 14 Sessions

Monday, July 11 Sessions

11:00 AM EDT - 11:45 AM EDT
How to Read an Entire HathiTrust Collection (Slides)
Presentation Recording on YouTube

Speaker: Eric Lease Morgan, Librarian, Hesburgh Libraries, University of Notre Dame
Theme: Text & Data Mining
Audience: General audience
HathiTrust Staff Facilitator: Jessica Rohr, Member Engagement & Communications Specialist

This how-to presentation demonstrates how to analyze ("read") an entire HathiTrust collection. The process is divided into the following steps: 1) using OpenRefine to remove duplicates from a 'Trust collection file, 2) using a system called htid2books to exploit the 'Trust Data API and download both plain text and PDF versions of each item in the collection, 3) using a system called the Distant Reader Toolbox to create a data set from the plain text, and finally 4) using any number of different applications to model the data set. Through this process the student, researcher, or scholar affords themselves the ability to use and understand their collection by exploiting both close and distant reading techniques.

2:00 PM EDT - 3:00 PM EDT
Digitizing and Exhibiting Copyrighted Science Fiction (Slides)
Presentation Recording on YouTube

Speaker: Alex Wermer-Colan, Digital Scholarship Coordinator, Temple University Libraries
Theme: Collections, Text & Data Mining, Teaching and Learning
Audience: General, Technical, Research
HathiTrust Staff Facilitator: Melissa Stewart, Assistant to the Executive Director

In this talk, I'll overview an ongoing project at Temple University Libraries to digitize and curate the Special Collection Research Center's Paskow Science Fiction Collection. This talk will introduce our grassroots digitization effort at Temple University Libraries, involving ingestion of underrepresented materials into HathiTrust's Digital Library, with a focus on making these materials available as data through the HathiTrust Research Center. Besides discussing the ways I have sought to promote and teach HTRC's resources through workshops and related programming at the Scholars Studio, Temple's digital scholarship center, I will also discuss our collaboration with Prof. Inna Kouper at Indiana University to develop a portable data capsule appliance, modeled on HathiTrust's data capsule. Our ongoing exploration of data capsule software, as well as of other methods for curating and visualizing restricted data, will hopefully illustrate some emerging avenues for academic libraries and practitioners to adapt HTRC tools and methods for localized, decentralized purposes.

3:30 PM EDT - 4:30 PM EDT
Mining the Native American Authored Works in HathiTrust for Insights (Slides)
Presentation Recording on YouTube

Speakers: Kun Lu, Associate Professor; Raina Heaton, Assistant Professor of Native American Studies; and Raymond Orr, Associate Professor and Department Chair of Native American Studies (University of Oklahoma)
Theme: Collections, Text & Data Mining, Teaching and Learning
Audience: General, Technical, Research
HathiTrust Staff Facilitator: Jessica Rohr, Member Engagement & Communications Specialist

In 2020, the Andrew W. Mellon Foundation funded a special round of the HathiTrust Research Center’s Advanced Collaborative Support program, the Scholar-Curated Worksets for Analysis, Reuse & Dissemination (SCWAReD). The goal of these projects is to explore new methods for creating, analyzing, and reusing curated digital library collections, along with research data derived from these collections, with a focus on historically under-resourced and marginalized textual communities. This specific project seeks to compile a collection of Native American authored works in HathiTrust and apply various text mining methods to the collection to reveal the coverage, subjects, perspectives, and writing styles of Native authors. The project is expected to develop a database of Native American authors and the bibliographic information of their works, create a reusable workset of Native American authored works in HathiTrust, identify potential gaps in the HathiTrust corpus on this textual community, and provide insights into the characteristics of the community by text mining their works. Join the project team for an overview of their work and findings to date.


Tuesday, July 12 Sessions

11:00 AM EDT - 12:00 PM EDT
The power of page-level linking to draw visitors (Slides)
Presentation Recording on YouTube
(Re-recorded because of poor audio quality in the original.)
Speaker: Marie Concannon, Librarian IV, Head, Government Information and Data Archives, University of Missouri
Theme: Access and Discovery
Audience: General. "Open to all" session.
HathiTrust Staff Facilitator: Jessica Rohr, Member Engagement & Communications Specialist

The University of Missouri’s Prices and Wages by Decade guide is based largely on page-level links to HathiTrust and other digital libraries. It has brought a phenomenal 4.5 million hits to MU’s website, and is the third-largest referrer to HathiTrust. Now that Google-digitized books in HathiTrust have persistent links, it is easier than ever to maintain indexes like this one. In this session, you will learn the elements of success so you can create a similar guide on a topic of interest to you.

1:00 PM EDT - 2:00 PM EDT
Metadata Improvement Projects and HathiTrust: Harvard and the University of Michigan (Slides)
Presentation Recording on YouTube

Speakers: Leigh Billings, Metadata Management Librarian, University of Michigan and Chew Chiat Naun, Head of Metadata Creation, Harvard University
Theme: Metadata & Cataloging 
Audience: General
HathiTrust Staff Facilitator: Graham Dethmers, Metadata Analyst

Retrospective DEI enhancement of catalogue records (Harvard)
As librarians we are increasingly aware of the need to represent more fully the diversity of our users and collections. For cataloguers this work includes repairing some of the harms or omissions of the past, and attending to issues such as offensive language, problematical attributions, disclosure of sensitive information, and misidentification or conflation of demographic groups or languages. Access to full text greatly expands the scope of what cataloguers can do in this area. In this presentation I will discuss some of our plans.

SAZTEC cleanup project using HathiTrust ETAS access (UM)
I will be discussing a metadata quality improvement project that began shortly after the retrospective conversion of catalog cards to an online environment and is only now almost complete, more than 20 years later. I'll discuss the background of why the metadata needed to be improved and how the HathiTrust Emergency Temporary Access Service allowed this languishing cleanup project to kick into high gear during our campus closure as a remote work project. I'll also touch on workflows, training, complications, and (briefly) the continuation of the project after ETAS ended in August 2021.

3:00 PM EDT - 4:00 PM EDT
A Window into Your Needs: Analyzing and Moving Through HathiTrust Digital Collection Survey Results (Slides)
Presentation Recording on YouTube

Speaker: Wade Wyckoff, Associate University Librarian, Distinctive Collections, McMaster University and Chair of the HathiTrust Digital Collection Strategy Working Group
Theme: Collections, Member input and feedback
Audience: All HathiTrust members
HathiTrust Staff Facilitator: Melissa Stewart, Assistant to the Executive Director

The HathiTrust Digital Collection Strategy Working Group is charged in part with the goal of diversifying the HathiTrust corpus. In Spring 2022, the HT DCSWG distributed to the community a 10-question survey that focused on collections, digitization, and contributions to digital repositories. The survey results identify areas where community members felt their libraries would benefit from greater knowledge of the HathiTrust Digital Library’s content, its policies, and its processes. The survey also solicited input on what services and programs HathiTrust member libraries believed would advance the work of bringing greater diversity (e.g., language, region, underrepresented groups) to the HathiTrust Digital Library.

What did we learn from you? Join members of the Working Group at this session to find out! The Working Group members will also share thoughts on next steps that utilize and build upon the input received in the survey.


Wednesday, July 13 Sessions

2:00 PM EDT - 3:30 PM EDT
Opening access to post-1926 serials in HathiTrust: A presentation and workshop (Slides)
Presentation Recording on YouTube

Speakers: John Mark Ockerbloom, Digital library strategist and metadata architect, University of Pennsylvania with Kristina Hall, University of Michigan; Jonah McAllister-Erickson, West Virginia University; Nicolle Lynne Nicastro, Pennsylvania State University; Scott Pope, Texas State University.
Theme: Collections, Access and Discovery, Metadata/Cataloging, Things to do with Data, Copyright
Audience: Primarily librarians in HathiTrust member institutions, though scholars and administrators might find it of interest.
HathiTrust Staff Facilitators: Jessica Rohr, Member Engagement & Communications Specialist, and Graham Dethmers, Metadata Analyst

HathiTrust has opened access to hundreds of thousands of 20th century books over the last several years through their copyright review program. HathiTrust also has hundreds of thousands of 20th century serials, many of which also have volumes in the public domain more recent than the default 1926 cutoff. A small number of these volumes are now starting to open up to public access in HathiTrust. The process for opening them relies on data and methodology compiled and developed at the University of Pennsylvania, and on experiences from a pilot program in which librarians at HathiTrust member institutions reviewed serial volume copyrights with the help of this data and methodology. This program will begin with a presentation describing the development of actionable serials copyright data and serials copyright review workflows, and reporting on the experiences of librarians who have engaged in serial copyright reviews. Following the presentation will be a working session, in which participants can contribute further data and suggest areas of focus for opening further serial volumes of interest. 


Thursday, July 14 Sessions

1:00 PM EDT - 2:00 PM EDT
Birds of a Feather Discussion for Members Using (or Wanting to Use) Available User Data to Understand HathiTrust Use by Their Institution (Slides)
Presentation Slides on YouTube

Speaker: Renata Ewing, Access and Support Coordinator, Digitization & Digital Content, California Digital Library
Theme: Things to do with Data, user data/metrics
Audience: HathiTrust members who are currently gathering (or interested in gathering) user data to help understand engagement with HathiTrust materials by their institution’s community
HathiTrust Staff Facilitator: Natalie Fulkerson, Collection Services Librarian

A 60-minute birds of a feather discussion for HathiTrust members who are currently using (or interested in using) available user data/metrics to understand engagement with HathiTrust materials by their institution’s community. The discussion will be facilitated by a small panel (not yet determined but including someone from CDL) of HathiTrust members who are analyzing HathiTrust user data for their institution. Members engaged in these efforts will be able to share their methods, ideas, and concerns with each other, and members interested in beginning such investigations will learn about techniques for gathering useful metrics for their institution.
This discussion will focus on the following questions:
- How are HathiTrust members using existing data streams (Google Analytics, their ILS, or any other source) to gauge HathiTrust use by their institution’s stakeholder communities?
- What metrics are useful now that ETAS usage data is no longer available?
- What tips/tricks can you share for massaging Google Analytics data, given that it (1) is highly sampled and (2) does not associate a user with an institution unless the patron has authenticated (which is not common)?
- Do you have any other usage data gathering methods that you find useful that you are willing to share?
- How does your institution use available usage data to gauge value or make decisions? What kinds of regular data visualizations and other reporting have been effective?
Participants should be prepared to share their usage data gathering techniques and/or visualizations with other members.

3:00 PM EDT - 4:00 PM EDT
Rolling and Reaching Out: Introducing Your Users to HathiTrust (Slides)
Presentation Recording on YouTube

Speakers: Lora Lennertz, Data Services Librarian, University of Arkansas and L. Angie Ohler, AUL for Collections and Content Strategy, University of Minnesota 
Theme: Member input and feedback; Teaching & Learning
Audience: Outreach & Liaison/Subject Librarians; Electronic Resources
HathiTrust Staff Facilitator: Jessica Rohr, Member Engagement & Communications Specialist

This presentation will provide an overview of the steps taken by the University of Arkansas to promote our new HathiTrust membership. Join us for tips on promotions, staff orientation, and searching and research instruction.