Strategic Advisory Board Meeting Minutes - April 15, 2010

HathiTrust Strategic Advisory Board Meeting Minutes
Conference Call
Thursday, 15 April, 2010
10:30 AM - 12:00 PM (Central)
Participating: John Butler, Bruce Miller, Sarah Pritchard, Paul Soderdahl, Ed van Gemert (chair), John Wilkin, Bernie Hurley (recorder)
Absent: Patricia Cruse

1.  Approved SAB minutes for 18 March, 2010 conference call

2.  Operations/Development update, John Wilkin. 

  • No update from the Executive Committee, as they do not meet until April 19th.
  • Loading of UC materials digitized by the Internet Archive has begun (approx. 100K volumes)
  • Michigan staff are working with a few CIC libraries on local digitization issues
  • The large-scale search infrastructure will be extended to Indiana this month
  • HathiTrust ingest is slowing, but more large loads are on the horizon
  • Minnesota will be sending its first big shipment to Google soon (approx. 10,000 volumes per month)
  • CDL has implemented links to HathiTrust materials in its UC eLinks service (SFX-based) so that they are visible to end users (Bruce will send example URLs to the group)

3.   Review and discuss the handling of surrogates

The draft charge and membership of the proposed Google Surrogates Working Group were discussed.

Action:  Ed and Bernie will edit the draft charge to reflect suggestions from the group, which include:

- The need to keep both the Source of the Print Volume (SOPG) and the Contributor of the Digital Item to HathiTrust (CDIH) in the metadata

- Make recommendations on where the SOPG and CDIH should display, including related branding considerations and the cost of making these changes

- Consider the cost and preservation issues related to ingesting surrogates, as well as the costs and consequences of not ingesting them

Action: Bernie will be the SAB member on the committee and Ed will contact the other members nominated to serve – John Rothman and Heather Christenson. (Paul has agreed to be a consultant to this Working Group). 

Action:  Once the charge is finalized, Ed will share it, the “Surrogate Problem Overview” paper and membership with the Executive Committee with an invitation to suggest other members and to share all this information with others, as they see fit.

4.  Discussion of faculty awareness of the repository, S. Pritchard.

Sarah described the need for a communication strategy that explains HathiTrust content & services to faculty and other researchers.  One example would be a new section on the website called “Research uses of HathiTrust” that addresses their questions and concerns (e.g., image quality).

John Wilkin shared that he wanted to establish a Communications Working Group focused on large-scale outreach and on addressing the needs of the different HathiTrust audiences.  The SAB agreed this would be a good place to address the issues raised by Sarah.

Action: Sarah has agreed to be the SAB liaison to this working group and she will work with John W. to draft a charge. The SAB was asked to nominate membership for this group.

In a separate but related action, John W. has decided to rework the HathiTrust website to move the “brochure-ware” to the background and focus on search and services.  This change is planned for late summer/early fall.  John will involve the new Communications WG in this project.

5.  Updates:

Discovery Interface Working Group, John Butler.

Record loads into WorldCat are proceeding at a good pace.  There will be 3+ million loaded by mid-May.  Usability testing is also going well.  Recommendations are being developed based on the findings of this testing.  These recommendations will then be discussed with OCLC and worked into their development queue.  The Discovery Interface Group is writing a charge for a subgroup to investigate the integration of full-text search into catalog interfaces.

Error Rate Working Group, Paul Soderdahl.

No report

Collections Working Group

Ivy has prepared a draft of the group's charge, and it is being reviewed by the rest of the group: Sarah, Trisha, John W. and Ed.

6.  The new HathiTrust Cost Model

SAB members expressed a desire to better understand the new cost model recently approved by the Executive Committee. 

John W. began by summarizing the model.  Work on the new model began in 2008 with the goal of addressing how to charge new institutions that are not large content contributors but have significant content to share, as well as institutions that are interested in sharing the curation responsibilities of HathiTrust (i.e., that may have no content to contribute at all).  The original cost model was based on distributing the cost of storage and related technology infrastructure.  The new model instead distributes the cost based on the opportunity to share the benefits derived from HT curation.  It is based on holdings counts at the volume level, with a “multiplier” to fund additional development.  John noted that modeling exercises have shown that the cost to institutions would not change much under the new model unless there was a significant increase in membership, which would decrease the cost to existing institutions.  He also believes that a significant influx of new content from new or existing members will not significantly change institutional costs for others.  The cost of any new unique content would be borne by the contributing institution.  The additional cost to existing institutions due to non-unique materials would be blunted by the cost-sharing feature of the model.

John also spoke to the difficulty of building the holdings inventory database on which the model would depend.  He suggested starting simply with single-volume monographs and then working up to the more problematic formats, like multi-volume monographs.  John also said that we would need to be flexible and adapt the model as we learn how it works.

Bruce suggested that the cost model document should state more clearly that the cost of public domain materials to each institution is the total public domain cost for HathiTrust divided by the number of partner libraries.  Perhaps this could be done in the form of an equation.

For in-copyright works, Bruce confirmed that the cost is (storage cost * multiplier) divided by the number of libraries that hold or have held that volume.  He suggested this may be too complicated because the cost of any volume varies with the number of libraries that hold or have held it.  We would need all holdings for all members recorded by HathiTrust to implement this model in 2013.  A potential new member would have to give all their new holdings to HathiTrust just to get a quote.  Bruce suggested that a standard cost per in-copyright volume would be easier to implement.  John agreed that we would need to explore a fallback position if gathering holdings at the volume level becomes too complicated.  But he said that we need this holdings-level data anyway (legal needs, shared print storage, etc.), so it is worth trying to build the holdings database.
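
As an illustration only, the two calculations Bruce described can be written out as a short sketch.  It is not taken from the cost model document.  The function names and all dollar figures are hypothetical, and it assumes that the storage cost is expressed per volume and that an institution's total is its public domain share plus the sum of its per-volume in-copyright shares.

```python
from collections.abc import Iterable


def public_domain_share(total_pd_cost: float, partner_count: int) -> float:
    """Total public domain cost split equally across partner libraries."""
    if partner_count <= 0:
        raise ValueError("partner_count must be positive")
    return total_pd_cost / partner_count


def in_copyright_volume_share(
    storage_cost: float, multiplier: float, holder_count: int
) -> float:
    """(storage cost * multiplier) divided by the number of libraries
    that hold or have held the volume."""
    if holder_count <= 0:
        raise ValueError("holder_count must be positive")
    return storage_cost * multiplier / holder_count


def institution_cost(
    total_pd_cost: float,
    partner_count: int,
    storage_cost: float,
    multiplier: float,
    holder_counts: Iterable[int],
) -> float:
    """Assumed total for one institution: its public domain share plus
    the share of every in-copyright volume it holds or has held.

    holder_counts has one entry per such volume, giving the number of
    partner libraries (including this one) that hold or have held it.
    """
    pd_share = public_domain_share(total_pd_cost, partner_count)
    ic_share = sum(
        in_copyright_volume_share(storage_cost, multiplier, n)
        for n in holder_counts
    )
    return pd_share + ic_share


if __name__ == "__main__":
    # Hypothetical figures: 25 partners share 1000.00 of public domain
    # cost; this library holds three in-copyright volumes that are held
    # by 1, 2 and 4 libraries respectively.
    print(institution_cost(1000.0, 25, 0.5, 2.0, [1, 2, 4]))  # 41.75
```

The sketch makes Bruce's concern concrete: the `holder_counts` input requires volume-level holdings from every member before a single quote can be produced, whereas a standard cost per in-copyright volume would need only a count of volumes.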

7.  New business

Time ran out before any new business could be addressed.