Preservation
HathiTrust is committed to preserving the intellectual content and, in many cases, the exact appearance and layout of materials digitized for deposit. HathiTrust stores and preserves metadata detailing the sequence of files for each digital object. Currently, HathiTrust relies on the extensive specifications for file formats, preservation metadata, and quality control methods detailed in the University of Michigan digitization specifications, dated May 1, 2007 (http://www.lib.umich.edu/files/UMichDigitizationSpecifications20070501.pdf). HathiTrust is committed to bit-level preservation and format migration of materials created according to these specifications as technology, standards, and best practices in the digital library community change.
Preservation Formats
HathiTrust currently ingests only documented, acceptable preservation formats, including TIFF ITU G4 files stored at 600 dpi, JPEG or JPEG 2000 files stored at several resolutions ranging from 200 dpi to 400 dpi, and XML files with an accompanying DTD (typically METS). HathiTrust supports these formats because they are broadly accepted as preservation formats and because they are documented, open, and standards-based, which gives HathiTrust an effective means to migrate its contents to successive preservation formats over time, as necessary. The Repository Administrators have undertaken such transformations in the past; moreover, HathiTrust offers end-user services that routinely transform digital objects stored in HathiTrust to “presentation” formats using many of the widely available software tools associated with HathiTrust’s preservation formats. HathiTrust gives attention to data integrity (e.g., through checksum validation) as part of format choice and migration.
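
As an illustration of the checksum validation mentioned above, the following minimal Python sketch computes a digest for a stored file and compares it with a previously recorded value. It is not HathiTrust's actual implementation: the function names file_checksum and verify_checksum are hypothetical, and the choice of SHA-256 is an assumption, since the policy does not name a checksum algorithm.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so that large page images do not have to
    fit in memory. The default algorithm (SHA-256) is an assumption made
    for this sketch, not something the policy specifies.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex, algorithm="sha256"):
    """Return True if the file's current digest matches the recorded one."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

In a migration workflow, a check of this kind could be run before a file is transformed, to confirm that the source is intact, and a fresh digest could be recorded for the output afterwards.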

