Sarah Pritchard, recorder.
1. Review and approve SAB minutes from 4/15/2010. Minor spelling corrections were noted.
2. HT Operations/Development update. John Wilkin reported. Major news items include:
3. Committee charges: Charges and reporting were clarified for the new committees on Communications and on User Research. John sent updated drafts. The charge and membership for the Communications Working Group were subsequently approved by the Executive Committee and can be found online at http://www.hathitrust.org/wg_communications_charge
4. Update on the Mellon workshop at Northwestern about digital humanities text initiatives: Sarah Pritchard reported on this one-day discussion meeting led by Martin Mueller, NU English Dept. The focus is on strategies for curating digital corpora of humanities texts (especially Early Modern literature), and the possible use of a crowdsourcing approach to correcting and annotating things like the works in the ESTC. The workshop looked at the content and approaches of ESTC, the NINES project, HathiTrust, TCP, and relevant computer science research projects. Sarah participated, and briefly updated the group on HathiTrust's ongoing work in areas of 1) quality control, and 2) subsetting. Key needs identified by the workshop are collaborative editing tools, and better metadata and text linkages among corpora and relevant bibliographical listings. The ESTC may contact HathiTrust in this regard. A report is forthcoming.
5. Working Group Updates:
1) HathiTrust WorldCat Local (HT-WCL) Record Loads -- WorldCat now contains over 2.5 million records for HathiTrust titles.
2) The HT-WCL software installation scheduled for May 16 was delayed for additional testing. OCLC is working to identify the earliest date possible to move forward with the install.
3) The HathiTrust Full-Text Search Services Assessment and Development effort is being launched as a subgroup of the working group.
1) Error Rate WG began drafting a "gating scenarios" document, identifying pros and cons of each scenario and projecting related development requirements and magnitude of costs for each. Current scenarios being explored: (a) gating at ingest, (b) gating at access, (c) no gating but disclose QC info to users, (d) no gating at all.
2) Now working on refining the various gating scenarios and developing a principles framework document. Will need to discuss applicability, if any, to non-Google objects. At some point, will need help getting a sense of efficiencies and costs for gating at ingest vs. access.
3) Question to SAB: Does HathiTrust intend to allow localized (partner-specific) decisions and practices, or will all partners need to abide by the same rules?