HTR Transcription: a project-based approach to automated transcription

How McGill University Library built upon its pilot digital collection in Quartex by exploring the potential of HTR Transcription to remove barriers to accessing and understanding primary sources in its Fur Trade Collection.

As told in a Quartex/Library Journal webinar by Jacquelyn Sundberg, Outreach and Special Projects at McGill University Library and McGill ROAAr (Rare Books, Osler Library of the History of Medicine, Visual Arts Collection and McGill University Archives), and Carolyn Pecoskie, Metadata and Electronic Resources Librarian at McGill Library.

Breaking down barriers to understanding manuscript materials

“At McGill Library and ROAAr, our archival collections, which include treasures such as historical atlases, travel narratives and natural history books, number around 250,000 rare materials. This includes a significant amount of manuscript or handwritten material, which presents unique challenges for access.”

“As much as possible, digitisation makes these items more accessible but it is only the first step; they also require cataloguing, metadata and a framework to be discoverable, as well as time-consuming manual or crowdsourced transcription.”

“In addition, even history students who receive the most training in primary source literacy can be put off by the time and effort required to use handwritten sources and interpret them for more than just the literal meaning of the text. This creates two significant barriers: primary source literacy skills and handwriting skills.”

Quartex Pilot Project

“As a first stage of piloting the use of Handwritten Text Recognition, or HTR, technology available in Quartex, we chose one collection, our entirely manuscript Doncaster Recipes Collection of papers, recipe books and medical receipts. The collection was ingested and published, alongside an accompanying digital exhibit, in April 2020. It made an interesting test case for HTR, which at that point couldn’t generate transcripts, but provided enhanced search functionality in the form of full-text search.”

“We then began working on a second collection, the Fur Trade Collection, with the enhancement of HTR Transcription. Thanks to a grant from the National Heritage Digitization Strategy, we were able to make digitally available a new swathe of materials to complement our existing fur trade collection, this time documenting the Colonial-era Fur Trade through the lens of the North West Company, the company through which the university founder, James McGill, made his fortune.”

“Primarily authored by the Company’s bourgeois and predominantly European senior management, the holdings in this collection nevertheless reveal, albeit indirectly, the presence of Indigenous peoples and the extent to which their knowledge was critical to the success of the Company and of the Fur Trade itself.”

“One of the reasons that HTR was actually quite an exciting prospect for the Fur Trade Collection is that it opens up new pathways to uncover the hidden and indirect content; the legacy of Indigenous knowledge.”

To continue the story, register to watch the full webinar.

Discover:

  • The team’s approach to metadata configuration as a way of making the published Fur Trade Collection as accessible and discoverable as possible.
  • The methodology and mechanics behind the collection’s pathways to discovery.
  • The published collection through a guided tour of search functionality and display features.
  • The effectiveness of automated HTR Transcription on a variety of materials in the collection.
  • The conclusions drawn on the effectiveness of HTR as a tool for breaking down barriers to using handwritten materials.
  • The Library’s plans to build upon this pilot project and extend its use of HTR into further collections.
