HTR: Introducing AI as an Aid to Manuscript Research in Adam Matthew Collections
The third module of Adam Matthew’s award-winning resource Colonial America: The American Revolution marks the release of revolutionary Handwritten Text Recognition (HTR) technology. Whilst the vast selection of material covering the American Revolutionary period in this collection was sure to pique many a budding American history enthusiast’s interest, this new HTR software has catapulted our latest collection to the forefront of academic and digital curation discussion.
This fantastic new technology makes manuscript material searchable for the first time and allows users of the resource to delve even deeper into documents of interest. What better resource to showcase it than Colonial America, which consists of thousands of blood-spattered and time-worn pages from the National Archives’ Colonial Office files?
Having worked across the first three modules of Colonial America, I know all too well how difficult some of these documents can be to read. Over the past years I have given myself many a migraine straining my eyes to decipher script that can sometimes be almost illegible. On one particularly memorable occasion I spent almost half an hour attempting to read ‘Mr Maitland's letter to Lord Germain from aboard a transport in the Savannah River, informing him of the rebel arson attack upon the 'Inverness'’. Although the letter was sent by Maitland, he tells us in his final line that it is predominantly written “by the hand of another my having only the left remaining”. Ironically, the only part of the letter I could read with relative ease was Maitland’s own sentence, written with the only hand he had remaining after the attack!
Mr Maitland's letter to Lord Germain from aboard a transport in the Savannah River, informing him of the rebel arson attack upon the 'Inverness', 1776 © The National Archives. Further reproduction prohibited without permission.
In the simplest sense, HTR acts as an advanced search feature, allowing users to run full-text searches within an individual document. Users of Colonial America need only find a document of interest and type in a term they’d like to search for, and they will be presented with a series of results. Results are shown both as a list of full-text hits beneath the document metadata and as thumbnails bordered in red. By clicking on either a snippet or a thumbnail, users are taken to the image viewer, where the hits are highlighted in yellow. Alternatively, users can select a term from our Search Directories and then select a document of interest; once the document is opened, an HTR search for the chosen term is run automatically.
Let’s take the ‘Letters from George Germain responding to news of Lord Cornwallis' capitulation and other matters in Henry Clintons' recent letters’ as an example. Once a user selects this document, they may begin running HTR searches for any term they are interested in, say ‘Henry Clinton’. To do so, they simply type ‘Henry Clinton’ into the search box and press enter. The document details will then collapse, and a list of full-text hits will be displayed beneath them as snippets. The image thumbnails that contain these hits will also be displayed with a red border.
Letters from George Germain responding to news of Lord Cornwallis' capitulation and other matters in Henry Clintons' recent letters, 1781 © The National Archives. Further reproduction prohibited without permission.
By clicking either on the snippets or bordered thumbnails the user will be taken to the image viewer where they can view the image in isolation. In this view all hits will be highlighted in yellow. Users are also able to click the ‘next hit’ button in the image viewer to move between hits.
Letters from George Germain responding to news of Lord Cornwallis' capitulation and other matters in Henry Clintons' recent letters, 1781 © The National Archives. Further reproduction prohibited without permission.
The types of hits HTR can identify are wide-ranging. Possessive name endings can still be presented as hits even if the user enters the name without the possessive ending. In the model document, for instance, a search for ‘Lord Cornwallis’ can return hits not only for that exact term but also for any instances of ‘Cornwallis’s’. Likewise, even archaic or incorrect spellings may be identified by HTR. This is particularly useful when searching this document for ‘Chesapeake’, as HTR can still return hits for ‘Chesapeak’.
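To make the idea of tolerant matching concrete, here is a minimal Python sketch. It is not how the HTR software in Colonial America works internally, which this post does not describe; it simply shows one common way of letting a search term match possessive forms and near spellings. The function names `normalise` and `is_hit` and the 0.85 similarity threshold are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalise(word: str) -> str:
    """Lower-case a word and strip a trailing possessive ('s or ')."""
    word = word.lower().replace("’", "'")  # treat curly and straight apostrophes alike
    return re.sub(r"'s?$", "", word)


def is_hit(query: str, candidate: str, threshold: float = 0.85) -> bool:
    """Return True if candidate is close enough to query to count as a hit.

    The threshold is an illustrative value, not one taken from the product.
    """
    similarity = SequenceMatcher(None, normalise(query), normalise(candidate)).ratio()
    return similarity >= threshold


# Possessive endings and an archaic spelling still count as hits.
print(is_hit("Cornwallis", "Cornwallis’s"))
print(is_hit("Chesapeake", "Chesapeak"))
# An unrelated word does not.
print(is_hit("Chesapeake", "Charleston"))
```

Lowering the threshold would catch more variant spellings at the cost of more false hits, which is the usual trade-off in this kind of matching.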
And how is this wizardry achieved, you may rightly ask? HTR uses complex algorithms to determine the possible combinations of characters in a manuscript and so generate full-text hits. The artificial intelligence then assigns a confidence rating to each result in order to return relevant hits. Hopefully this isn’t translating into techy gobbledegook on first read, but in case you need any further information, the team at Adam Matthew have taken pains to create a thorough introduction to this new feature on the Colonial America site, providing both a guidance video and Q&A pages. Users will also be able to quickly identify documents that are HTR searchable, as these are marked with a small pencil icon in the resource.
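For readers who like to see the idea in code, the confidence rating described above might be pictured roughly as follows. This is a hedged sketch only: the `Candidate` class, the `relevant_hits` function, the 0.6 cut-off and every value in the sample data are hypothetical, invented for illustration rather than taken from the HTR software itself.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One possible reading of a handwritten word (hypothetical structure)."""
    page: int
    text: str
    confidence: float  # 0.0 (a guess) to 1.0 (certain)


def relevant_hits(candidates, query, min_confidence=0.6):
    """Keep readings that match the query and clear the confidence cut-off,
    most confident first. The cut-off value is an assumption."""
    wanted = query.lower()
    hits = [
        c for c in candidates
        if c.text.lower() == wanted and c.confidence >= min_confidence
    ]
    return sorted(hits, key=lambda c: c.confidence, reverse=True)


# Made-up readings: the same scrawl on page 3 could be read two ways.
readings = [
    Candidate(page=2, text="Clinton", confidence=0.91),
    Candidate(page=3, text="Clinton", confidence=0.64),
    Candidate(page=3, text="Clifton", confidence=0.31),
    Candidate(page=5, text="Clinton", confidence=0.22),
]

for hit in relevant_hits(readings, "clinton"):
    print(hit.page, hit.confidence)
```

In this sketch the low-confidence reading on page 5 is dropped, while the two stronger readings are returned in order of confidence.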
The entire team at Adam Matthew are incredibly excited to be able to offer our users not only a fantastic new resource in Colonial America: The American Revolution, but also a truly innovative way of searching manuscript material. This state-of-the-art technology is also set to be released within both new and selected existing collections published by Adam Matthew. If that doesn’t call for a celebratory ‘Huzzah’, I don’t know what does!
Colonial America is available now. Full access to this resource is restricted to authenticated institutions who have purchased a licence. For more information, including trial access and price enquiries, please contact info@amdigital.co.uk.