Help - Using the Digital Library

Logging in

What are the benefits of logging in as a member?

Members of partner institutions get access to the largest number of volumes and features by logging in with their institution. Logging in enables members of HathiTrust partner institutions to:

Members cannot view or download works that are “limited (search-only)”. See “Is it possible to view a volume that is Limited (search-only)?” for more information.

See “What can I access without logging in?” for information about what all users can do without logging in.

I'm from a partner institution that uses Google Apps. How should I log in?

You should log in by selecting your institution’s name from the menu. Do not log in to HathiTrust with a Google account. Logging in with a Google account will give you limited guest access, not full member access.

Why isn't my institution listed on the login page?

If your university or college isn’t listed, it has not joined HathiTrust. Contact a librarian at your institution to request that it become a member. See “Eligibility and Agreements” for more information.

Can I log in as a guest? What are the benefits?

Users who do not belong to partner institutions can create a guest account by logging in with an existing account (Facebook, Google, Twitter, etc.). Users can also create a University of Michigan "friend" account. Logged-in guests can:

See “What can I access without logging in?” for information about what all users can do without logging in.

What can I access without logging in?

All users can do the following without logging in:

  • Search across the entire collection;
  • Read and view works that are “full-view”;
  • Search within works that are “limited (search-only)”;
  • Download a single page at a time from works that have download restrictions (e.g., works that are in the public domain but were digitized by Google or other vendors with contractual limitations);
  • Download an entire work that doesn’t have download restrictions (e.g., works digitized by Internet Archive and other organizations, works that have been opened with a Creative Commons license).

Do I need to use a proxy server to access HathiTrust if my institution is a partner?

No, you should not use a web proxy server to access HathiTrust. Many libraries require their users to use a proxy server (e.g., EZProxy) or VPN client to access licensed databases from off-campus locations. Using a proxy service like EZProxy will interrupt your connection to HathiTrust and result in slow response times or pages that do not load. VPN clients will not interrupt your connection but are not required for member access. For the best experience, ensure that you have not connected to HathiTrust via a proxy server. See "Note on Proxy Servers" for more information.

Searching and Viewing Books

Is it possible to view a volume that is "Limited (search-only)"?

In most cases, no. Volumes are "Limited (search-only)" because 1) we have determined they are, or they are suspected to be, copyrighted, or 2) we do not have enough information to make a copyright determination.

Note: Access to volumes in HathiTrust is dependent on the country a user is accessing from. All users worldwide are able to view:

  1. open access or Creative Commons-licensed works
  2. U.S. works published prior to 1923
  3. Canadian or Australian works published prior to 1897
  4. works published in other countries prior to 1877

In addition, users accessing HathiTrust from the United States are able to view works that were published in other countries between the years (inclusive) 1877 and 1922. See Copyright for further details.
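The date rules above can be read as a small decision procedure. The following is a minimal sketch, not a rights determination: the `Work` type, the country codes, and the `viewable` function are hypothetical names introduced for illustration, and the sketch assumes that "other countries" in the rule for U.S. users means any country other than the United States.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Work:
    country: str                 # country of publication, e.g. "US", "CA", "AU", "FR"
    year: int                    # publication year
    open_licensed: bool = False  # open access or Creative Commons-licensed


def viewable(work: Work, user_in_us: bool) -> bool:
    """Rough reading of the date rules listed above (illustrative only)."""
    if work.open_licensed:
        return True
    if work.country == "US":
        return work.year < 1923
    # Cutoff for users everywhere: 1897 for Canada/Australia, 1877 elsewhere.
    worldwide_cutoff = 1897 if work.country in {"CA", "AU"} else 1877
    if work.year < worldwide_cutoff:
        return True
    # Assumption: the U.S.-user rule covers any non-U.S. work from 1877 to 1922.
    return user_in_us and 1877 <= work.year <= 1922


print(viewable(Work("FR", 1880), user_in_us=False))  # False
print(viewable(Work("FR", 1880), user_in_us=True))   # True
```

Actual access depends on the rights status recorded for each volume, so a volume may be restricted even when these date rules suggest otherwise.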

We do not provide selective access for persons or organizations to copyrighted works for which we have not received permission, except as permitted by law. Lawful uses include several broad and general rubrics of use, including services for users with print disabilities and U.S. Copyright Law Section 108 uses of materials. Where we have secured permissions, we may offer privileged access for all authenticated users at all partner institutions.

We can and do routinely open access to materials with the permission of the rights holder, which includes the possibility of applying Creative Commons licenses (see our Permissions Agreement). Users are free to contact publishers or authors of materials to request that the materials be opened in HathiTrust. If a volume was published in the United States from 1923 to 1963 and is "Limited (search-only)", it is possible that it could be in the public domain (see our Copyright page). Older works published outside the United States may also be out of copyright.

The Find in a Library link, available in the catalog record and when viewing the works themselves, can be used to locate the nearest print copy. Contact your local library about interlibrary loan options.

How can I search HathiTrust?

There are several ways to search works in HathiTrust, which are all accessible from http://www.hathitrust.org:

  1. Catalog Search: In the HathiTrust catalog at http://www.hathitrust.org, search bibliographic fields such as Title, Author, and Publication Date.
  2. Full-text Search: Use keywords to search the full-text of all works in HathiTrust.
  3. Collection Builder: Search inside a collection of materials that you or others have created.
  4. Single-volume Search: When viewing a work in the page-viewing application (see example), use keywords to search inside that volume alone.

How can I view the full MARC Cataloging Record for a title?

It is possible to view the full record by adding a .xml, .marc or .json extension to the URL when viewing the record in the catalog (e.g., http://catalog.hathitrust.org/Record/002128836.xml, http://catalog.hathitrust.org/Record/002128836.marc, or http://catalog.hathitrust.org/Record/002128836.json). Add a .mrc extension to the URL to download the full MARC record.
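As a minimal sketch of the URL pattern described above, the following builds a record URL for a given extension. The function name `record_url` is hypothetical; the base URL and the four extensions are the ones given in this answer, and the code only constructs the address without fetching it.

```python
CATALOG_BASE = "http://catalog.hathitrust.org/Record/"
# Extensions named in the answer above; .mrc downloads the full MARC record.
FORMATS = {"xml", "marc", "json", "mrc"}


def record_url(record_id: str, fmt: str) -> str:
    """Return the catalog URL for a record in the requested format."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{CATALOG_BASE}{record_id}.{fmt}"


print(record_url("002128836", "json"))
# http://catalog.hathitrust.org/Record/002128836.json
```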

Search Tips

Primary Catalog Search:

  • Phrase Searching: Use quotes to search an exact phrase: e.g., "occult fiction"
  • Wildcards: Use * or ? to search for alternate forms of a word. Use * to stand for several characters, and ? for a single character: e.g., optim* will find optimal, optimize or optimum; wom?n will find woman and women. If you would simply like to browse without entering a search term you can enter * by itself.
  • Boolean Searching: Use AND and OR (capitalized) between words to combine them with Boolean logic: e.g., (heart OR cardiac) AND surgery will find works about heart surgery or cardiac surgery
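The wildcard behavior described above resembles shell-style pattern matching, which Python's standard `fnmatch` module implements. The sketch below is only an illustration of what `*` and `?` mean; it is not how the catalog matches terms, and `fnmatch` lets `*` match zero characters as well as several. The helper name `matches` is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, word: str) -> bool:
    """Shell-style match: * stands for any run of characters, ? for exactly one."""
    return fnmatchcase(word.lower(), pattern.lower())


words = ["optimal", "optimize", "optimum", "option", "woman", "women", "womn"]

print([w for w in words if matches("optim*", w)])
# ['optimal', 'optimize', 'optimum']
print([w for w in words if matches("wom?n", w)])
# ['woman', 'women']
```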

Full-text Search:

  • Phrase Searching: Use quotes to search an exact phrase: e.g., "occult fiction"
  • Multiple Term Searching: When your search terms are not quoted phrases, avoid common words (such as: 'a', 'and', 'of', 'the', etc.) to speed up your search.
  • Boolean Searching: Use AND and OR (capitalized) between words to combine them with Boolean logic: e.g., heart OR cardiac will find works containing the word heart or the word cardiac; heart AND cardiac will find works containing both words. Use a minus (-) to remove words from the results: e.g., heart -cardiac will find works containing the word heart that do not include the word cardiac.
  • As in the catalog, if you would like to browse without entering a search term you can enter * by itself.
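The Boolean operators above correspond to set operations on the works that contain each term. The following sketch uses three made-up works and a hypothetical helper, `containing`, to show which works each query form would return; it illustrates the logic only, not the search engine itself.

```python
# Hypothetical full text of three works, keyed by a made-up identifier.
docs = {
    "work-a": "heart surgery",
    "work-b": "cardiac surgery of the heart",
    "work-c": "cardiac care",
}


def containing(term: str) -> set:
    """Return the ids of works whose text contains the term as a whole word."""
    return {doc_id for doc_id, text in docs.items() if term in text.split()}


heart, cardiac = containing("heart"), containing("cardiac")

print(sorted(heart | cardiac))  # heart OR cardiac  -> ['work-a', 'work-b', 'work-c']
print(sorted(heart & cardiac))  # heart AND cardiac -> ['work-b']
print(sorted(heart - cardiac))  # heart -cardiac    -> ['work-a']
```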

Full-text Advanced Search

  • In advanced full-text search, you can also use the "all of these words" dropdown instead of AND or OR. For instance, a search for

    [all of these words] heart cardiac

    is equivalent to a search for heart AND cardiac and will find works with both terms, while

    [any of these words] heart cardiac

    is equivalent to heart OR cardiac and will find works with either one or both words.

Can I search the full-text of a subset of works?

Yes. In Collection Builder, you can search all the works in a private or public collection. See “What can I do with my collection?” for more information on Collection Builder searching.

How do I conduct an advanced search?

Advanced search is available in the catalog search and full-text search. Please see the "Search Tips" above for further information.

Is non-linguistic content searchable (graphs, charts, illustrations, etc.)?

Non-linguistic content is not searchable at the present time.

Are "wildcard" searches possible?

Yes. In the primary catalog (http://www.hathitrust.org), a "wildcard" search is formed by appending the asterisk character (*) to the end of the term. For example, "fond*" would find "fond," "fondest," "fondly," etc.

Do I need to use diacritics and accents in my search?

In the primary catalog (http://www.hathitrust.org) and full-text search, if the work is in a Latin alphabet with diacritical marks, it is not necessary to enter the character with diacritical marks. For example, you may enter "Émile," "émile," "Emile," or "emile" and a page containing any of these forms will be found.
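One common way to get this kind of accent-insensitive matching is to fold both the query and the text to a form without diacritics or case differences before comparing them. The sketch below uses Python's standard `unicodedata` module; it is an illustration of the idea, not a description of how HathiTrust search is implemented, and the name `fold` is hypothetical.

```python
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks (accents) and ignore case."""
    decomposed = unicodedata.normalize("NFD", text)  # "É" becomes "E" + combining accent
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


print({fold(form) for form in ["Émile", "émile", "Emile", "emile"]})
# {'emile'}
```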

Can I search works printed in a non-Latin alphabet?

Yes. We are receiving searchable text for volumes in a variety of non-Western scripts including Russian, Greek, Hebrew, Chinese, Japanese, and Korean. Our goal is full international coverage. If you are having trouble searching a volume, please let us know by using the feedback link at the top of the page.

What does "Page not available" mean?

This message is displayed in 3 cases:

  1. Pages were missing from the library's print copy of the work. If a work does have a missing page, this generally means that two pages are missing, since publishers generally print on both sides of the page. So, if pages 81-82 of a work are missing, there should be two pages with the message "Page not available" between pages 80 and 83.
  2. One or more pages were not scanned.
  3. In some cases, Google will misidentify a page, leading them to believe that a page is missing when it is not. For example, if they misidentify p. 206 as "205" they will think p. 206 is missing. They will insert a page to display "Page not available," although there is no missing page. Please notify us using the feedback form if you believe that a page has been misidentified in this way.

What is the difference between page numbers and 'sequence numbers'?

Sequence numbers indicate the order of the physical pages of the work, and they rarely match up with pagination, which is inserted by printers. The sequence numbers start with the very first image in the work, which is usually the front cover. Thus, the sequence number and the page number will rarely be the same. We have sequence numbers for all works, but some works may not have page numbers (works digitized early in Google's process, in particular, are lacking page number tagging).
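The relationship between the two kinds of numbers can be pictured as a list of page images in scan order, each with an optional printed page label. The sketch below uses a made-up volume and assumes sequence numbers count from 1; `page_labels` and `sequence_for_page` are hypothetical names.

```python
# Hypothetical volume: one entry per page image, in scan order.
# None means the image (cover, blank leaf) carries no printed page number.
page_labels = [None, None, "i", "ii", "1", "2", "3"]


def sequence_for_page(label: str):
    """Return the sequence number (assumed 1-based) of the image with this label."""
    for seq, printed in enumerate(page_labels, start=1):
        if printed == label:
            return seq
    return None  # no image is tagged with that page number


print(sequence_for_page("1"))   # 5: printed page 1 is the fifth image
print(sequence_for_page("99"))  # None
```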

Can I view the page with the search term highlighted on it?

Search terms are not highlighted on the page images, but for works we can show you, they are highlighted in the text view. Do your search in the box in the upper right-hand corner of the page display. You'll see your search terms highlighted on the search results page. Follow the link to a page or sequence number, change the format to text, and you will see the search term highlighted on the text view in the page display.

Can I "bookmark" certain pages or portions of works or include them in a collection?

These are not features that we have available now, but we will be collecting requests as part of our development process. Please submit suggestions via our feedback form.

Do these works comply with accessibility standards for users with disabilities?

Our system conforms to Web Content Accessibility Guidelines, priority 1 <http://www.w3.org/TR/WCAG10-CORE-TECHS/>. The system can display each page either as an image or as text (although due to the limitations of Optical Character Recognition software the text may contain errors). Text is available one page at a time and is readable by screen readers. If you are aware of ways in which we can improve accessibility, please tell us how we can make it better using the feedback link provided at the top of each page. Please see our webpage on accessibility for more information.

Building Collections in Collection Builder

How do I build a collection?

Works can be added to collections either from the page-viewing application or from the search results page in Full-text Search. In the page-viewing application, the collection building feature is in the left sidebar. In the Full-text Search, items in the results list can be selected using the check boxes next to each item and added to a collection using the drop-down menu at the top of the list. In both cases, works can be added to a new or existing permanent collection, or a temporary collection (a temporary collection is automatically created if the user is not logged in). Permanent collections may be private to the user or shared publicly with others. Works that are saved to collections can be searched independently of the rest of the repository, allowing users to perform focused searches on subsets of HathiTrust materials.

What can I do with my collection?

The primary purposes of building a collection include being able to search within it, saving works for future reference, and sharing a collection of works with other users by making your collection public. In addition, you can download information about your collection.

Can I download information about my collection?

Yes, you can download select data from your collection. Under the description of your collection is a button that allows you to “download metadata.” When you click this button, you can select one of two files to download:

  • a TSV file that contains the data elements below about the items in the collection, which can be used for analysis of the collection, and
  • a JSON file that contains the data elements below as well as some data about the collection itself, which can be used to create a workset in the HathiTrust Research Center portal.

Each type of file contains the following data elements:

  • htitem_id - the HathiTrust item identifier which is used to uniquely identify every HathiTrust digital item or work
  • title
  • author
  • date - the publication date for the work in question. This date is derived from the catalog record.
  • rights - the copyright status for this work as determined by HathiTrust (for the full list of rights codes used in our system, please see “Attributes” on this page)
  • a series of identifiers commonly used by libraries: OCLC, LCCN, ISBN
  • catalog_url - the url for the catalog record with which the item in question is associated
  • handle_url - the permanent url for the HathiTrust digital item

In order to sort or otherwise work with the data, you may choose to copy the contents of this file into a spreadsheet.
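As an alternative to a spreadsheet, the TSV file can be read with a few lines of Python. The sketch below is illustrative: the rows are made-up sample data, the library-identifier columns are omitted because their exact column names are not given here, and `load_items` and `sorted_by_date` are hypothetical helpers. With a real download, read the file's text in place of `SAMPLE_TSV`.

```python
import csv
import io

COLUMNS = ["htitem_id", "title", "author", "date", "rights", "catalog_url", "handle_url"]
# Made-up rows for illustration only; the URL fields are left empty.
ROWS = [
    ["example.0002", "Second Sample Title", "Doe, Jane", "1910", "pd", "", ""],
    ["example.0001", "First Sample Title", "Roe, Richard", "1895", "pd", "", ""],
    ["example.0003", "Third Sample Title", "Poe, Pat", "1962", "ic", "", ""],
]
SAMPLE_TSV = "\n".join("\t".join(fields) for fields in [COLUMNS, *ROWS])


def load_items(tsv_text: str) -> list:
    """Parse tab-separated text into one dict per item, keyed by column name."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def sorted_by_date(items: list) -> list:
    """Sort items by the date column (string comparison, fine for 4-digit years)."""
    return sorted(items, key=lambda row: row["date"])


for row in sorted_by_date(load_items(SAMPLE_TSV)):
    print(row["date"], row["htitem_id"], row["title"])
```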

You can also perform a search within the collection and download the results of that search.

Can I search all works in the HathiTrust Digital Library?

Yes. Searching and faceted browsing of all works is available through the Catalog and the Full-text Search, available at www.hathitrust.org. Full-text search allows you to add search results directly to your collections.

Can I include works not in the HathiTrust Digital Library in my collection?

No, currently you can only include works that are held in the HathiTrust Digital Library.

Printing/Downloading

Can I download a whole work?

Members of HathiTrust partner institutions are able to download public domain works in their entirety as well as works made available under Creative Commons licenses. Guest users can download one page at a time of all public domain works or an entire work that doesn’t have download restrictions (e.g., works digitized by Internet Archive and other organizations, works that have been opened with a Creative Commons license). See "Why isn't full-PDF download publicly available for all viewable volumes?" below. There is significant overlap of volumes in HathiTrust and Google Book Search, and if a work is "full view" in HathiTrust, it is possible that the work can be downloaded from Google Book Search.

Can I download only a portion of a work such as an article or chapter?

Yes, you can manually select what pages you would like to download. For works with download restrictions (see "Can I download a whole work?" for more information), only members have the option to download portions of the work.

Click in the upper corner of the pages you would like to download in order to select those pages, and those pages will be added to your download selection. An orange border will appear, and the upper corner of the page will appear “folded” down. To deselect the page, click again in the corner.


Image of mouse clicking on corner of page


You can select pages and ranges that are not adjacent to each other.

To select a range of pages, select the first page of the range, navigate to the last page of your desired range, and shift+click to select the range. All pages in the range should now have an orange border to indicate they are selected. You can navigate to different ranges and pages that you have selected by clicking the selection dropdown menu that appears in the toolbar above the pages.

Image of toolbar above page, focusing on paperclip icon


When you have finished selecting all of the pages that you would like to include in your PDF, click the link in the left sidebar to “Download N pages (PDF)” where N indicates the number of pages you have selected. After you have downloaded the selected pages as a PDF, your selection will disappear.
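The selection steps above can be modeled as a set of page positions that clicks toggle and shift+clicks extend. This is only a sketch of the behavior as described in this answer, not the page viewer's actual code; the class `PageSelection` and its methods are hypothetical, and the choice to anchor a range at the most recently clicked page is an assumption.

```python
class PageSelection:
    """Toy model of selecting pages for a partial PDF download."""

    def __init__(self):
        self.pages = set()        # positions of the selected pages
        self.last_clicked = None  # assumed anchor for shift+click ranges

    def click(self, page: int) -> None:
        """Clicking a page corner toggles that page in or out of the selection."""
        if page in self.pages:
            self.pages.discard(page)
        else:
            self.pages.add(page)
        self.last_clicked = page

    def shift_click(self, page: int) -> None:
        """Shift+click selects every page between the last click and this page."""
        if self.last_clicked is None:
            self.click(page)
            return
        low, high = sorted((self.last_clicked, page))
        self.pages.update(range(low, high + 1))
        self.last_clicked = page

    def download_label(self) -> str:
        return f"Download {len(self.pages)} pages (PDF)"


selection = PageSelection()
selection.click(3)         # a single page
selection.click(10)        # first page of a non-adjacent range
selection.shift_click(12)  # last page of that range
print(sorted(selection.pages))     # [3, 10, 11, 12]
print(selection.download_label())  # Download 4 pages (PDF)
```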

Why isn't full-PDF download publicly available for all viewable volumes?

The uses of materials in HathiTrust may be defined by third-party agreements. For instance, libraries' agreements with Google require us to take steps to prevent bulk download of materials they have digitized. We offer full-PDF download of Google-digitized materials to partner institutions because we are able to work with partners to ensure that use is within acceptable parameters. Public domain works deposited in HathiTrust without restrictions (such as works scanned by Internet Archive or other organizations) and works made available by rights holders under Creative Commons licenses are available for full download.

When I attempt to print a PDF that I have downloaded, why do printing errors occur?

HathiTrust PDFs are not optimized for printing. Most content in HathiTrust was originally scanned as image files, and those image files are compressed and combined into PDF files for download. The conversion and compression process may result in PDF files that are not printer-friendly.

Can you send me a work (or more than one work) on a CD-ROM?

We cannot provide digital copies of these volumes.

Can I see statistics on how often a particular work is accessed?

Librarians from member institutions may request access to the HathiTrust Google Analytics profile in order to view statistics.

Where is the persistent URL for a work I'm viewing?

The persistent URL for each volume is displayed in the left sidebar when viewing a work. We use the "handle" form of persistent URLs.

How can I get a physical copy of something I'm viewing?

Click on the "Find in a library" link in the catalog or when viewing a work to find a copy at a library near you. If you do not have access to a nearby library or the work is not available locally, you may request it through interlibrary loan. Check with your host or local library for details.

Scanning/OCR Quality

How do I report page images that are hard to read, missing, or otherwise problematic?

Please use the feedback form that is available at the top of each page to report any problems you encounter. For problems that are found in images digitized by Google: Google is continually improving the quality of the images and OCR it delivers to HathiTrust partners. Over time we will be replacing low-quality scanned images with better ones.

Why is the text view for some books full of misspelled words, gibberish or words in the wrong language?

The searchable text of HathiTrust volumes is produced through a process called Optical Character Recognition (OCR). OCR software "reads" the page images and makes a best guess at identifying the letters and words on a page. This is an automated process, subject to error. In order to interpret the images of letters on a page, the OCR software must first determine the script and language of the page. When the OCR software misidentifies the script or the language of the volume or the page, it will tend to produce gibberish.

Sometimes the OCR software will misidentify an image (such as a picture of a dog) as text and produce gibberish. The OCR software also has trouble with tables, musical notation, math and scientific formulas and may try to interpret these as regular text.

Other factors causing OCR errors can result in misspelled words, or can even confuse the OCR software about the language of the page. These include: poor quality printing or poor quality scans, unusual fonts, light printing, or faded or discolored text, bleedthrough, stains, underlining/highlighting of text and unusual page layout.

Even the best quality OCR usually contains an error or two per page.

What type of volumes are most likely to contain OCR errors?

HathiTrust has volumes in over 400 languages. Works in the more common languages and works using the Latin script are more likely to be recognized correctly.

The OCR software does better with modern texts in English and European languages and has difficulty with non-Latin scripts such as Chinese, Japanese, Devanagari (used for Hindi and other Indic languages), and Arabic. Volumes where the text is written vertically instead of horizontally (such as many older Japanese and Chinese volumes) tend to have high error rates as well.

Older books tend to have more OCR errors due to various issues such as unusual fonts, faded or discolored pages, stains or bleedthrough, and unusual page layouts. Blackletter fonts such as Fraktur often produce high OCR error rates. Other physical quality problems can also cause OCR errors such as tight book gutters and damaged pages.

Does the quality of scanned images affect the way they can be searched?

Yes. Poor quality scanned images contribute to OCR errors. In some cases the poor quality images cause the OCR engine to guess the wrong language. In other cases only some occurrences of keywords may be affected. Depending on the severity of the OCR errors, the text may not be searchable at all, or searches for most words in the text may succeed. Text quality can be improved by improvements in OCR software and by human correction. We will incorporate better text whenever possible.

Do OCR errors affect the ability to search volumes?

Yes. The impact of OCR errors on searching depends on the type and severity of the errors.

If the OCR software was unable to recognize text on a page, then that page is not searchable. If the OCR engine incorrectly identified the script of a page, that page will not be searchable.

If only some sections of a volume are affected, such as words too close to the binding of the page or words obscured by stains or highlighting, most of the text will still be searchable. However, if the OCR consistently misrecognizes a word, that volume will not show up in search results for that word. OCR errors also affect term statistics which are used in relevance ranking. This can cause volumes that contain OCR errors for the query words to be moved to the bottom of the search results even if those volumes are the most relevant.
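The effect on search can be seen in a toy example. The sketch below compares three versions of the same sentence, one clean and two with the word "heart" misread as "hcart", and ranks them by a raw count of the query term. The raw count is a stand-in assumption for relevance ranking, which in practice uses more elaborate term statistics; the names `term_counts` and `rank` are hypothetical.

```python
from collections import Counter

# The same sentence with no OCR errors, one misread word, and both misread.
volumes = {
    "clean": "the heart pumps blood and the heart rests",
    "partly-misread": "the hcart pumps blood and the heart rests",
    "consistently-misread": "the hcart pumps blood and the hcart rests",
}


def term_counts(text: str) -> Counter:
    """Count how often each word occurs in the OCR text."""
    return Counter(text.lower().split())


def rank(query: str, texts: dict) -> list:
    """Return the names of matching volumes, highest term count first."""
    scored = [(term_counts(text)[query], name) for name, text in texts.items()]
    hits = [(count, name) for count, name in scored if count > 0]
    return [name for count, name in sorted(hits, key=lambda pair: (-pair[0], pair[1]))]


print(rank("heart", volumes))
# ['clean', 'partly-misread']
```

The consistently misread volume never appears for the query, and the partly misread one ranks below the clean copy even though the underlying printed text is identical.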

Can I submit corrected text for a page/title?

We do not yet have a system in place to accept corrected text. We hope to have one in place in the future since feedback and assistance from the user community will help us to improve quality.

Miscellaneous

How do I contact HathiTrust with suggestions?

HathiTrust is very interested in user feedback and in collecting suggestions for improving the repository. Please submit your comments through the feedback form at the top of every page. If you have questions about copyright, partnership, or general questions about HathiTrust, you may also contact us at feedback@issues.hathitrust.org.

I don't see my question here. Where can I go for more help?

You can click on the "Feedback" link at the top of every page, or send email to feedback@issues.hathitrust.org.
