How to Add ETAS Records to Your Catalog

April 8, 2020

Updated October 14, 2021 to correct the order of the columns in the Report Structure table.  The correct order is: oclc, local_id, item_type, rights, access

Update: Recording of HathiTrust ETAS Discovery Office Hours
Learn more on how to add other types of HathiTrust records to your catalog.

How to identify your ETAS records in your local catalog to users

This is an evolving page and will be updated as more examples and external documentation become available.

Once your institution has been approved for the HathiTrust Emergency Temporary Access Service (ETAS), there are three possible paths for identifying the HathiTrust records that are newly open for Full View to your users and making them discoverable. All three methods can be used in tandem with HathiTrust’s Automatic Login, which makes it easier for your users to encounter temporary access titles through your own catalog. Read more on setting up Automatic Login.

Overlap Report

Upon approval for ETAS, a new overlap report will be generated for your institution identifying which of your print holdings have a digital equivalent in HathiTrust. You can use this report to identify which items to mark in your catalog. You will find it in your library’s Dropbox folder: overlap_[your institution]_[date]. This report shows the overlap of your print collection with the current HathiTrust collection. For more information, including descriptions of the fields and a report example, see HathiTrust Overlap Reports.

Linking Syntax

There are two preferred URI syntaxes that will resolve for the patron to either the catalog record page or the item page for the PageTurner application:

Title Level:

https://catalog.hathitrust.org/Record/{ht_bib_key/clusterID}

Item Level:

https://hdl.handle.net/2027/{htid}

See details on where to find htid and ht_bib_key values under Method B: Using the HathiFiles below.
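As a minimal sketch of the two syntaxes above, the following Python helpers build each link from an identifier. The function and constant names are illustrative only and are not part of any HathiTrust tooling; the example values are taken from the sample API response further down this page.

```python
# Base URLs taken from the Title Level and Item Level syntaxes above.
CATALOG_BASE = "https://catalog.hathitrust.org/Record/"
HANDLE_BASE = "https://hdl.handle.net/2027/"


def title_level_url(ht_bib_key):
    """Link to the catalog record page for a HathiTrust record number."""
    return CATALOG_BASE + str(ht_bib_key).strip()


def item_level_url(htid):
    """Link to the item page for a HathiTrust volume identifier."""
    return HANDLE_BASE + str(htid).strip()


print(title_level_url("000578050"))
print(item_level_url("mdp.39015025315527"))
```

Because HathiTrust record numbers can change over time, item-level links built from the htid are likely to be the more stable of the two.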

Method A: The BibAPI

One approach to integrating these newly available records into your catalog is to use the HathiTrust Bibliographic API.

The BibAPI is intended for small batch calls of up to 20 items at a time. Calls should normally use the OCLC number, although other identifier types are also accepted, and can retrieve either ‘Full’ or ‘Brief’ records.

Syntax:

Brief:

https://catalog.hathitrust.org/api/volumes/brief/<id type>/<id value>.json

Full:

https://catalog.hathitrust.org/api/volumes/full/<id type>/<id value>.json

The two calls differ in the amount of information retrieved for each record: the ‘full’ call also returns the complete MARC-XML record in the JSON response.
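The syntax above can be sketched in Python as follows. This is an illustrative helper, not an official client: the names bib_api_url and fetch_record are hypothetical, and fetch_record needs network access, so it is defined here but not called.

```python
import json
import urllib.request

API_BASE = "https://catalog.hathitrust.org/api/volumes"


def bib_api_url(id_type, id_value, detail="brief"):
    """Build a single-identifier BibAPI URL following the syntax above."""
    if detail not in ("brief", "full"):
        raise ValueError("detail must be 'brief' or 'full'")
    return f"{API_BASE}/{detail}/{id_type}/{id_value}.json"


def fetch_record(id_type, id_value, detail="brief"):
    """Request the URL and return the parsed JSON (requires network access)."""
    url = bib_api_url(id_type, id_value, detail)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


print(bib_api_url("oclc", "00424023"))
print(bib_api_url("oclc", "00424023", detail="full"))
```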

Example API call and results from the Bib API:

Below is an example (from the BibAPI documentation linked above) of the result of a call for the brief record for OCLC number 00424023, using this URI: https://catalog.hathitrust.org/api/volumes/brief/oclc/00424023.json

(The call for the full record would be https://catalog.hathitrust.org/api/volumes/full/oclc/00424023.json. In the interest of space, the full record results are not included here.)

{
    "records":{
        "000578050":{
            "recordURL":"https:\/\/catalog.hathitrust.org\/Record\/000578050",
            "titles":["Infinite series"],
            "isbns":["9780030110405","9780030110405"],
            "issns":[],
            "oclcs":["424023"],
            "lccns":["62009520"],
            "publishDates":["1962"]
        }
    },
    "items":[
        {
            "orig":"University of Michigan",
            "fromRecord":"000578050",
            "htid":"mdp.39015025315527",
            "itemURL":"https:\/\/hdl.handle.net\/2027\/mdp.39015025315527",
            "rightsCode":"ic",
            "lastUpdate":"20200225",
            "enumcron":false,
            "usRightsString":"Limited (search-only)"
        },
        {
            "orig":"University of California",
            "fromRecord":"000578050",
            "htid":"uc1.b4405602",
            "itemURL":"https:\/\/hdl.handle.net\/2027\/uc1.b4405602",
            "rightsCode":"ic",
            "lastUpdate":"20190118",
            "enumcron":false,
            "usRightsString":"Limited (search-only)"
        }
    ]
}

Important fields from API results:

  • Volume Identifier – ‘htid’
    • This is the permanent HathiTrust item identifier. Each item identifier is unique. Used to make the ‘itemURL’.
  • HathiTrust Record Number – ‘fromRecord’
    •  HathiTrust’s record number for the associated bibliographic record. HathiTrust record numbers are not permanent and can change over time.
    • Used to make the ‘recordURL’.
  • OCLC Number – ‘oclcs’
    • OCLC number(s) for the bibliographic record. Multiple values are separated by a comma. At least one OCLC number will match that used for the API call.
  • Rights Code – ‘rightsCode’
    • A code (also referred to as “rights attribute”) that describes the copyright status, license or access.
  • Access String – ‘usRightsString’
    • Identifies whether the item is temporarily available through ETAS (e.g. ‘Limited (search-only)’) or normally available (e.g. ‘Full View’).
  • Catalog Record URL: ‘recordURL’
    • The URL for the catalog page.
  • Item Record URL: ‘itemURL’
    • The URL for a specific digital item
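To illustrate how the fields listed above fit together, here is a short Python sketch that pulls them out of a parsed response. The embedded sample is a trimmed copy of the example response on this page; the function name summarize_items and the output key names are hypothetical.

```python
import json

# Trimmed copy of the sample brief response shown earlier on this page.
SAMPLE_RESPONSE = json.loads("""
{
  "records": {
    "000578050": {
      "recordURL": "https://catalog.hathitrust.org/Record/000578050",
      "oclcs": ["424023"]
    }
  },
  "items": [
    {
      "fromRecord": "000578050",
      "htid": "mdp.39015025315527",
      "itemURL": "https://hdl.handle.net/2027/mdp.39015025315527",
      "rightsCode": "ic",
      "usRightsString": "Limited (search-only)"
    }
  ]
}
""")


def summarize_items(response):
    """Return one dict per item, joined to its bibliographic record."""
    records = response.get("records") or {}
    rows = []
    for item in response.get("items") or []:
        # 'fromRecord' is the key of the matching entry under 'records'.
        record = records.get(item.get("fromRecord"), {})
        rows.append({
            "htid": item.get("htid"),
            "item_url": item.get("itemURL"),
            "record_url": record.get("recordURL"),
            "oclcs": record.get("oclcs", []),
            "rights_code": item.get("rightsCode"),
            "access": item.get("usRightsString"),
        })
    return rows


for row in summarize_items(SAMPLE_RESPONSE):
    print(row["htid"], row["access"], row["item_url"])
```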

Below is a general outline of the logic that can be used to integrate the BibAPI into local discovery layers:

  • Pass an array of up to 20 OCLC numbers — just the digits — to the API
  • A call is made for each number, and then the API fetches data from HathiTrust based on that OCLC number passed in.
  • The brief or full record will be returned as a json response.
    • If no data is retrieved, or the number is not present in HathiTrust, return a null value.
  • Parse through the returned rights values and distinguish between ‘Full View’ and ‘Limited View’
  • Update the linking text from ‘Limited View’ to text identifying that the item is available through ETAS. Example: ‘Temporary Access’

Note: For ETAS, we have created a new ‘Temporary Access’ label on the HathiTrust catalog site. However, the ‘usRightsString’ values returned by the API have not been changed, so any link text should be modified locally to reflect this new access.
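The logic outlined above might be sketched in Python along these lines. This is only one possible arrangement under stated assumptions: the helper names are hypothetical, the response is assumed to be already parsed from JSON, and any access string beginning with "Limited" is treated as eligible for the local ‘Temporary Access’ label, which is appropriate only when the OCLC numbers come from your own overlapping holdings.

```python
ETAS_LABEL = "Temporary Access"  # example link text from the steps above


def digits_only(oclc_number):
    """Keep just the digits of an OCLC number; return None if there are none."""
    digits = "".join(ch for ch in str(oclc_number) if ch.isdigit())
    return digits or None


def batches(values, size=20):
    """Yield lists of at most `size` values (the BibAPI batch limit)."""
    values = list(values)
    for start in range(0, len(values), size):
        yield values[start:start + size]


def link_label(us_rights_string, etas_label=ETAS_LABEL):
    """Swap limited-access text for the local ETAS label; keep other text."""
    text = (us_rights_string or "").strip()
    if text.lower().startswith("limited"):
        return etas_label
    return text or None


def link_for(response):
    """Return {'url', 'label'} for the first item, or None if nothing matched."""
    items = (response or {}).get("items") or []
    if not items:
        return None
    item = items[0]
    return {"url": item.get("itemURL"),
            "label": link_label(item.get("usRightsString"))}


print(digits_only("(OCoLC)00424023"))
print(link_label("Limited (search-only)"))
```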

Method B: Using the HathiFiles

Another method for integrating these newly available item records into your catalog is to download the most recent monthly HathiFiles (Links: HathiFiles and the HathiFiles Description) and filter the results against the overlap report to extract only items that are now accessible to your library.

The HathiFiles is a tab-delimited file containing an entry for every item in HathiTrust.

ID values from the HathiFiles that are important for the ETA Service:

  • Volume Identifier – ‘htid’
    • This is the permanent HathiTrust item identifier. Each item identifier is unique.
  • HathiTrust Record Number – ‘ht_bib_key’
    •  HathiTrust’s record number for the associated bibliographic record. HathiTrust record numbers are not permanent and can change over time.
  • OCLC Number – ‘oclc_num’
    • OCLC number(s) for the bibliographic record. Multiple values are separated by a comma.
  • Rights Code – ‘rights’
    • A code (also referred to as “rights attribute”) that describes the copyright status, license or access.

For a more detailed breakdown of all of the fields present, please see the HathiFiles Description.
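As a rough sketch, one line of a HathiFile could be read in Python like this. The column positions are assumptions based on the HathiFiles Description (htid in column 1, rights in column 3, ht_bib_key in column 4, oclc_num in column 8) and should be verified against it. The example row is constructed for illustration and is not copied from a real HathiFile.

```python
# Zero-based column positions; assumptions to verify against the
# HathiFiles Description before relying on them.
HTID_COL, RIGHTS_COL, BIB_KEY_COL, OCLC_COL = 0, 2, 3, 7


def parse_hathifile_line(line):
    """Split one tab-delimited line and pick out the ETAS-relevant IDs."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) <= OCLC_COL:
        return None  # too few columns to contain an OCLC number
    return {
        "htid": fields[HTID_COL],
        "rights": fields[RIGHTS_COL],
        "ht_bib_key": fields[BIB_KEY_COL],
        # Multiple OCLC numbers are separated by commas.
        "oclc_nums": [n.strip() for n in fields[OCLC_COL].split(",") if n.strip()],
    }


# Constructed example row (illustration only, not real HathiFile data).
example = "\t".join(
    ["mdp.39015025315527", "deny", "ic", "000578050", "", "MIU", "", "424023,1234"]
)
print(parse_hathifile_line(example))
```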

Example Applications of the HathiFiles:

Cornell Library regularly makes use of the HathiFiles. For ETAS, they have modified the process slightly to take the overlap report into account and used that information to create links that force login for authentication. More details on what they did can be found here: http://blogs.cornell.edu/discoveryandaccess/2020/04/01/adding-hathitrust-emergency-access-links/ 

Temple University Library has also used the HathiFiles, though in a slightly different fashion. For members that have had issues with the size of the HathiFiles, Temple’s approach may be instructive. Chad Nelson, a developer for Temple University Libraries, has written a blog post that goes into detail about his process, which uses Temple’s overlap report and the HathiFiles together to create a smaller file (12MB) that is then used to identify the associated ht_bib_key values for items newly available to them via ETAS. You can find that post here: https://chads.space/words/libraries/2020/04/27/temple-libraries-hathi-trust.html

Below is a modified version of the summary of the post:

  1. Download the latest HathiTrust monthly file (e.g. hathi_full_20200401.txt.gz) from the HathiFiles page.
  2. Pare the monthly file down to just the needed data (HathiTrust bib key and OCLC number) with:
  • gunzip the monthly full .gz file
  • csvcut to limit to just the needed columns [ht_bib_key (column 4) and OCLC Number (column 8)]
  • csvgrep to eliminate rows without required fields [removes any row with an empty value in either of the two columns remaining after the previous step]
  • sort and uniq to eliminate duplicates [de-dupes the remaining values so only unique entries remain]

gunzip -c hathi_full_20200401.txt.gz | \
  csvcut -t -c 4,8 -z 1310720 | \
  csvgrep -c 1,2 -r ".+" | \
  sort | uniq > hathi_full_dedupe.csv

  3. Take your overlap report and extract the unique set of OCLC numbers:

csvgrep -t -c 4 -r ".+" [overlap report].tsv | \
  csvcut -c 1 | csvsort | uniq \
  > overlap_all_unique.csv

  4. Then filter the pared down HathiTrust data from step 2 using the overlap OCLC numbers from step 3 as the filter input:

csvgrep -c 2 -f overlap_all_unique.csv \
  hathi_full_dedupe.csv > hathi_filtered_by_overlap.csv

  5. The output file, hathi_filtered_by_overlap.csv, is a two-column CSV pairing HathiTrust bib keys with OCLC numbers for the items in your library that are available through ETAS. You can use it to construct links to HathiTrust items based on the OCLC numbers in your catalog records.
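For libraries that would rather not install csvkit, the same filtering idea can be sketched in plain Python. This is not the process from the blog post, and it differs from the pipeline above in one respect: it splits comma-separated OCLC values so that each number is matched separately. It assumes the overlap report is tab-delimited with a header row and the OCLC number in the first column, and that ht_bib_key and oclc_num are columns 4 and 8 of the HathiFile. All function names are hypothetical.

```python
BIB_KEY_COL = 3  # zero-based; assumed position of ht_bib_key
OCLC_COL = 7     # zero-based; assumed position of oclc_num


def read_overlap_oclcs(lines):
    """Collect OCLC numbers from overlap rows whose fourth column is non-empty."""
    oclcs = set()
    for index, line in enumerate(lines):
        if index == 0:
            continue  # skip the assumed header row
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 4 and cols[0].strip() and cols[3].strip():
            oclcs.add(cols[0].strip())
    return oclcs


def filter_hathifile(lines, overlap_oclcs):
    """Return sorted, de-duplicated (ht_bib_key, oclc) pairs found in the overlap."""
    pairs = set()
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) <= OCLC_COL:
            continue
        bib_key = cols[BIB_KEY_COL].strip()
        if not bib_key:
            continue
        for oclc in cols[OCLC_COL].split(","):
            oclc = oclc.strip()
            if oclc and oclc in overlap_oclcs:
                pairs.add((bib_key, oclc))
    return sorted(pairs)
```

Both functions accept any iterable of text lines, so the monthly file can be streamed with gzip.open(path, "rt", encoding="utf-8") rather than unpacked first. OCLC numbers are compared as exact strings, so leading zeros or prefixes would need to be normalized the same way in both files.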

Method C: Third-Party Discovery Services

In the event that your library makes use of third-party discovery services like OCLC WorldCat Discovery, Primo, Summon or EBSCO EDS and would rather activate ETAS via one of those discovery layers, we are in conversation with those partners to make this a viable alternative for temporary access. This space will be updated in the event of any developments.

OCLC:

If you are using WorldCat Discovery to set holdings on HathiTrust titles, you will need to create a custom KBART collection in your knowledge base with the HathiTrust titles, URLs and OCNs. If you need assistance with creating this collection or file, please contact OCLC Support.

When reaching out to OCLC Support, please include your HathiTrust overlap report and use the subject line “HathiTrust access during COVID shutdown”.

PRIMO:

For Primo, the University of Minnesota has created a package (available on GitHub or NPM) that supplements the results for locally held items with links to associated HathiTrust records. When search results are displayed, the “Primo explore HathiTrust availability” package can pass each record’s OCLC numbers to the HathiTrust Bib API. If at least one match is found, a link to the HathiTrust record is appended to the availability section. A recent update to this package allows the copyright status of the records to be ignored, so that matches will include all locally held items, not just the public domain ones. To enable this behavior, please follow the steps below:

1) Upgrade to version 2.4.0 of the primo-explore-hathitrust-availability package.

2) Set an “ignore-copyright” attribute on the component in your template. For example:

<hathi-trust-availability ignore-copyright="true"></hathi-trust-availability>

Examples of Each Method:

Method A: The BibAPI (Columbia University)

[Image: ETAS cornell BibAPI example]

Method B: The HathiFiles (Cornell University)

[Image: ETAS Cornell hathifiles example]

Method C: Third Party Vendors

Primo: Oxford University

[Image: ETAS Oxford Primo example]

OCLC WMS: University of Delaware

[Image: ETAS UofD WMS example]

EBSCO Discovery: Indiana University

[Image: ETAS IU EBSCO example]
