Strategic Advisory Board Meeting Minutes - February 17, 2011

Present:  John Butler, Sarah Pritchard, Paul Soderdahl, Ed Van Gemert (recorder), Bob Wolven, Jeremy York.  Tricia Cruse, Bernie Hurley and Bruce Miller had indicated previous commitments.

1. The minutes from the January 20, 2011 SAB teleconference were approved.

2. Development updates: (Prepared and presented by Jeremy York) 


  • Passed 8 million volumes in early February
  • Will be beginning ingest from Library of Congress very soon (about 80,000 volumes)
  • Working with Yale to address issues discovered in ingested sample
  • Have begun working with Harvard on ingest of 50,000 volumes
  • Received locally-digitized volumes from Northwestern; are doing preliminary investigation
  • Will be receiving locally-digitized files from Illinois in near future.
  • Have received first files from University of Utah Press. MPublishing will be working on processing them to prepare for ingest.


  • We are planning a release of the new interface later this month or early March
  • Existing PageTurner will be default, with links to try new interface
  • Main features - scrolling view, flippy view, redesigned layout with more features above the page-scrolling threshold; viewing controls and links to information easier to find and use.

Collection Builder

  • In the middle of enhancements (moving Collection Builder from using its own index to use the full-text search index) to enable creation of collections with as many items as users would like
  • Recently discovered someone had scripted creation of several large collections
  • Will likely have limits programmed to require staff mediation for creation of collections above a certain size.

Large-scale Search 

  • Full-text index rebuilt in just 10 days as opposed to projected 40; smaller overall index size
  • Increase in speed due to upgraded Solr software and modifications to how index was constructed
  • Index includes bibliographic data, improved handling of non-Latin scripts

Storage Replacement 

  • Continues in Indiana.
  • Expect to complete in March
  • Non-disruptive to replace storage, but have paused ingest and full-text indexing at certain moments to be prepared to respond to unexpected problems.

Bib Mgmt

  • Monthly calls with staff at Michigan to check in.
  • So far have talked about requirements for bibliographic records, how records are received, processes for transforming partner records prior to ingest, transfer of records from UM to UC. 
  • UC working simultaneously on core system and processes around ingest. UC development has reached the point where these two systems could begin to interface.
  • Project tracking page posted on HathiTrust website with key milestones. Will link to monthly updates and provide additional information as appropriate.
  • Will be some adjustments to web page to add more granularity.

Creative Commons

  • Adjustments are being made to access interfaces and processes to accommodate CC licenses. Plan to go live March 1. 
  • ~350 volumes will be opened under variety of CC options (most cc-by-nc and cc-by-nc-sa)
  • Adjustments for CC licenses include RDFa in PageTurner (machine readable, provides proper way to attribute the work when you view the license on CC website). 
  • Made updates to COinS data, which makes it easy to add bib data from the PageTurner to systems like Zotero.
  • As part of CC development, over last couple of months have standardized language and delivery of access and use policies so have more consistency in statements across different services.

Print Holdings Database

  • Have data currently from 19 institutions.
  • Have been contacting institutions on individual basis and getting responses. Deadline is end of February.
  • Intend to use the data gathered to start offering services such as access for users with print disabilities [and section 108 services] in near term [this year]

Calls for Working Groups

  • Call went out for one new member of Communications group to replace someone who had to leave. Have had about a half-dozen nominations.
  • Call for participation in User Support group. Have started receiving nominations here as well. 
  • Hope to have membership of both finalized soon.


  • Several new institutions joined last month; currently working with a few more, including Madrid
  • Working with LoC and NYPL to find strategies that will work for each. Developed matrix of access scenarios and requirements. Have reached a point where possible strategies for each have diverged in the short term. Will work with each individually.

Development Environment

  • Continue to make enhancements - most recently to facilitate testing of new code prior to release. All development is happening there.
  • Plan to develop public documentation next month.
  • The environment is not yet ready for everyone to come in at once, but when we speak with partners we have been encouraging them to let us know if there is interest, and we will add folks to the environment to begin developing or just to explore.
  • Two UM staff have been taken on half-time each, supported by HathiTrust funds, as code managers for 2011 and 2012, recognizing the need for ongoing support and management of the environment.
  • [Will likely be seeing more of what we are calling "programmatic" activities - targeted development and support such as the Bibliographic Management system, code repository, print holdings database; special initiatives in reserve e.g., if we needed to migrate to new platform, etc.]


  • Received word at beginning of last week that report would be out in two weeks. Expect something next week.

On the Horizon

  • Mobile development, UM UX department - reading and bibliographic search
  • Over next 7 months

3. Review and discuss the consultant’s proposal to review HathiTrust

SAB reviewed the proposed revision, adding comments for further discussion with the consultant, e.g., include a confidentiality statement.

Moving toward finalizing an MOU and working with the CIC to administer the MOU and make payments to the consultant.


4. Working group and committee updates


  • Working on a report on duplicates in the repository
  • Adding Tom Teper to the committee
  • Reviewing proposal to have HT serve as a print management service expanding to a network of dark repositories


  • Reviewed and discussed the HT Communications and Marketing Plan 2011.  Agreed to the need for a larger assessment component.  There is a gap in how we receive, capture and refer feedback to appropriate groups.
  • Also reviewed and discussed the Strategic Objectives and Activities for 2011. 
  • SAB acknowledges the good work being done by the Communications WG.

Discovery Interface

  • The record load into WorldCat Local is now over 4.4 million records.  Used to create the HT WCL view
  • Traffic using the WCL view is declining each week.  Under 1K last week.
  • Continuing its work with usability testing

Full text interface subgroup making progress on identifying and prioritizing near-term enhancements to HathiTrust full-text search.  Developing 13 new features to be added by June including:  snippets, facets, multiple-word searches, greater use of MARC in relevancy ranking.