Projects

HathiTrust is involved in a number of projects, both internal and external to the repository and partnership. These include grant projects, working groups HathiTrust has assembled on various issues, and repository development opportunities that the partners will engage in.

Grant Projects

  • National Science Foundation EAGER grant, in conjunction with Johns Hopkins University and CLIR (the Council on Library and Information Resources), to explore the feasibility of an open access repository for NSF-funded research.

Summary from the Update on September Activities:

Sayeed Choudhury of Johns Hopkins University, John Wilkin of the University of Michigan, and Amy Friedlander of the Council on Library and Information Resources (CLIR) are co-PIs in an NSF EAGER grant to determine the needs and requirements for developing an open-access repository for publications arising from NSF-funded research. The PIs will leverage Johns Hopkins’ experience in evaluating digital repositories, HathiTrust’s experience with large-scale infrastructure and ingest of digital objects, and CLIR’s experience and facility in bringing together groups of experts to determine next steps and directions on targeted issues. CLIR will host a series of workshops focusing on technical requirements, business and policy concerns, and organization and operations issues relating to the open-access repository. Johns Hopkins and HathiTrust will evaluate various technical systems based on the recommendations from the workshops. The creation of a sustainable, efficient, and scalable model to deliver the products of NSF-funded research to users at no cost will have a transformative impact on the dissemination and use of this valuable work.

  • Andrew W. Mellon Foundation grant led by University of Michigan professor Paul Conway.

Summary from the Update on October Activities:

With support from the Andrew W. Mellon Foundation, Associate Professor Paul Conway of the University of Michigan is leading a one-year research and planning project to find and test new procedures for validating the quality and usefulness of digital objects in HathiTrust. The short-term goal of the project is to prepare and submit a funding proposal to a federal granting agency to explore possibilities for validating these characteristics through manual and automated methods. The long-term goal is to develop criteria and methods to brand the trustworthiness of volumes in HathiTrust and other digital repositories for fulfilling specific purposes (such as reading, printing volumes on demand, performing computational research, and others). Such a branding or certification process would give assurance that content within a repository is worthy of preservation, and increase the value of that content in broader discussions about storage and management solutions for both digital and print collections. More information is available at http://blog.si.umich.edu/2009/09/28/mellon-grant-aids-researching-criter....

Development Opportunities

In the Update on October Activities, the first in a series of 'columns' was published outlining development opportunities or needs in HathiTrust that partner institutions have identified. Although reported in the updates, these will also be published below.

Ingest reporting

Description: The deposit of digital volumes and associated metadata into HathiTrust, referred to as “ingest,” involves a significant number of updates to administrative systems—bibliographic records added, digital volumes ingested, and access rights established. Many data elements will be of interest to the contributing institution, and each institution may drive local processes based on the current status of content in the repository (e.g., the percentage of in-copyright works may highlight the value of performing copyright determination work, or a low number of items available in the Google Return Interface may stimulate exploratory discussions with Google). A system that combines all of the available streams of administrative data into a simple web-based reporting system may have considerable value not only for transparency but also for local decision-making.

Resources available: Staff at the University of Michigan and the University of California have assembled a table of relevant data feeds with a brief description of each in the following document: http://bit.ly/2Jk5mm

Priority: moderate

Additional details: An institution that undertakes this work must:

  • outline a process for design and specifications with a group of interested HathiTrust partner libraries;
  • in consultation with partner libraries, give consideration to authentication and authorization needs for this system.
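
As a rough illustration of the kind of figure such a report might surface, the sketch below computes the per-institution volume count and the percentage of in-copyright works mentioned in the description above. The record layout (the "institution" and "rights" fields and their values) is hypothetical and does not reflect the actual HathiTrust data feeds.

```python
from collections import Counter


def summarize_ingest(records):
    """Return per-institution volume counts and in-copyright percentages.

    Each record is a dict with hypothetical keys "institution" and "rights";
    the real data feeds may be structured quite differently.
    """
    totals = Counter()
    in_copyright = Counter()
    for rec in records:
        totals[rec["institution"]] += 1
        if rec["rights"] == "in-copyright":
            in_copyright[rec["institution"]] += 1
    return {
        inst: {
            "volumes": totals[inst],
            "pct_in_copyright": round(100.0 * in_copyright[inst] / totals[inst], 1),
        }
        for inst in totals
    }
```

A real system would draw these records from the combined administrative data streams rather than from an in-memory list, and would sit behind whatever authentication and authorization the partners agree on.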

Usage reporting

Description: A clearer sense of the level of use of library materials in HathiTrust will help shape extended activities such as collection management and further digitization. Volumes in HathiTrust may, in some cases, be read in their entirety, while in other cases they may only be searched. To what extent are search-only materials viewed? Which works that are fully viewable are displayed? Where does that access originate? As HathiTrust introduces authentication, to what extent do users authenticate to get access to a fuller array of services? How frequently is the HathiTrust catalog searched, and how does that use compare to the use of full-text indexes? These are some of the questions that an improved service for usage reporting will help to answer.

Resources available: HathiTrust retains raw log data and registers some uses through Google Analytics.

Priority: moderate

Additional details: An institution that undertakes this work must:

  • clearly outline a commitment to undertake appropriate measures with regard to user privacy (e.g., with regard to IP addresses and, at such time that HathiTrust implements Shibboleth, user authentication information). Such efforts should include secure storage of sensitive data, appropriate aggregation of data so as to anonymize use by specific individuals, and a commitment not to transfer private user data to a third party;
  • outline a process for design and specifications with a group of interested partner libraries;
  • give consideration to producing reports consistent with appropriate library community standards (e.g., COUNTER and SUSHI).
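
The aggregation requirement above could be approached along the following lines. This is a minimal sketch, not a description of any existing HathiTrust service: the event fields ("timestamp", "access", "ip") and the suppression threshold are assumptions made for illustration. Events are counted by month and access type, the IP address is never copied into the output, and small counts are withheld so that individual use is harder to single out.

```python
from collections import Counter

# Hypothetical threshold: groups with fewer events than this are suppressed.
MIN_COUNT = 5


def aggregate_usage(events, min_count=MIN_COUNT):
    """Aggregate raw usage events into monthly counts per access type.

    Each event is a dict with hypothetical keys "timestamp" (ISO 8601 string),
    "access" (e.g. "full-view" or "search-only"), and "ip". The "ip" value is
    deliberately ignored so it never reaches the report.
    """
    counts = Counter()
    for ev in events:
        month = ev["timestamp"][:7]  # "YYYY-MM" prefix of the ISO timestamp
        counts[(month, ev["access"])] += 1
    # Drop small groups that could point to specific individuals.
    return {key: n for key, n in counts.items() if n >= min_count}
```

Reports that follow community standards such as COUNTER would need more dimensions than this sketch tracks; the point here is only that aggregation and suppression happen before any data leaves secure storage.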

Upcoming Opportunities

  • Print holdings database
  • Ingest transformation

Working Groups

Please see http://www.hathitrust.org/working_groups for information about HathiTrust working groups. Present and past working groups include:

  • Discovery Interface
  • Research Center
  • Collaborative Development Environment
  • Storage
  • Quality, Ingest and Error Rate
  • Communications