The HathiTrust Collections Committee is a standing committee of the HathiTrust partnership that reports to the HathiTrust Strategic Advisory Board. It is charged with making key recommendations about the content included in HathiTrust, covering the activities, policies, tools, and services that HathiTrust partners need to manage collections individually and collectively, as well as the process by which collection development and management decisions affecting HathiTrust content should be made.
The Committee is asked to make recommendations in the following areas:
1. Key issues: The HathiTrust Collections Committee will identify key collection development and management issues that the partnership should address and recommend appropriate strategies or groups to address them. Examples of such issues that have surfaced in recent discussions include the role of duplicates in HathiTrust and strategies for rights determination.
2. Collection development: To what extent should the partnership explore specific collection development opportunities, and how should such activities be prioritized and carried out? Examples of collection development activities that might be pursued include:
- developing particular types of collections within the HT corpus, such as comprehensive or distinctive collections in particular areas that build on participant strengths
- exploring opportunities for digitization and collaboration with other initiatives
- developing a shared approach to government documents that capitalizes on the work undertaken by CIC
- attempting to attract and aggregate additional public domain content
- leveraging the HathiTrust corpus to manage print collections both amongst and beyond the HT partner libraries, including extramural partnerships with third-party organizations
3. Decision-making and prioritization for new content types: Each new content type ingested into HathiTrust will have unique access, management, and preservation requirements and associated costs. What process can best assist the partnership in making strategic decisions about the addition of new content types – how should such recommendations be proposed, vetted and prioritized? Is there room for individual decision-making, or must these be collective decisions? How should associated costs be allocated across the partnership?
4. Tools and Services:
- What tools and/or services, if any, are needed to characterize the HathiTrust collection as it evolves? (e.g. analytical tools by subject, language, date, format, or other scope, especially mechanisms that would be useful to describe the corpus to a potential user)
- Collection builder functionality: beyond its use for personal collection-building, how should this be developed and/or utilized to support the development of specific collections within HathiTrust?
- Citation and versioning: how does the inevitable mutability of content in HathiTrust (e.g. continuously improved Google versions, the potential for de-duplication, etc.) affect the need for stable citations to support scholarly work?
5. Structure and Process: The HathiTrust Collections Committee is asked to recommend a persistent process for making recommendations affecting HathiTrust collection development and management. How should new collection-oriented services and activities be surfaced, vetted, prioritized, and communicated?
The HathiTrust Collections Committee shall have approximately seven members, including a minimum of two members each from the CIC and UC and 2-3 members from other member institutions. Members should ideally represent a diversity of HathiTrust partners and a mix of expertise, including collection development and management, rights management, digital collections, and connections to other areas of HathiTrust work (e.g. quality, tools and services, etc.).
Members are charged for an initial term through the end of 2012.
Bryan Skib (CIC – also on quality working group)
Tom Teper (University of Illinois)
Claire Stewart (Northwestern)
Ivy Anderson, Chair (CDL)
Sharon Farb (UCLA)
Bob Wolven (Columbia) (SAB liaison)
Ann Thornton (NYPL)
The HathiTrust Collections Committee shall report to the Strategic Advisory Board. The chair of the Strategic Advisory Board shall submit recommendations to the Executive Committee for review and/or approval.
Appendix: Key Issues for the HathiTrust Collections Committee to Consider
- Duplicates in HathiTrust: Duplicate content in HathiTrust may take two specific forms:
1) Duplicate copies of the same bibliographic item deposited by more than one partner (for example the same item digitized by Google at multiple partner libraries, or the same item digitized by multiple libraries using different capture methodologies); at present, no attempt is being made to detect or limit the storing of these duplicates;
2) Surrogate copies matching a bibliographic item owned by a Google partner that are made available to that partner by Google from a copy digitized at another library (“Google Surrogates”). The library from whose copy the original was digitized may or may not be a HathiTrust partner.
For both types of duplicate content, the Collections Committee should make recommendations addressing the following issues from a collection development and management perspective:
- Under what conditions should HathiTrust store multiple copies of the same bibliographic item, and under what conditions should such items be de-duplicated? Are there principles that can be articulated to guide such decisions?
- If copies are de-duplicated, what policies and procedures are needed to address description and ownership of de-duplicated content, access and collection management decision-making, the potential future withdrawal of content, and related collection management and ownership issues?
- How should the costs for storing de-duplicated content be allocated among the partners?
Update: In March 2012, the Collections Committee released a discussion document presenting its findings.
- Rights determination: How should we prioritize rights determination activities (and other rights clearing processes) for materials in the repository that are currently restricted? Given that there are more items in the repository than can likely be addressed in the next few years, are there specific areas that the partnership should focus on (e.g. later rather than earlier, or easier rather than harder determinations)?