Making Sense of Illustrated Handwritten Archives

The project Making Sense of Illustrated Handwritten Archives was submitted jointly by the Leiden Centre of Data Science (LCDS), Naturalis Biodiversity Center, the universities of Groningen (ALICE), Leiden (LIACS) and Twente (STePS), and the publisher Brill as creative industry partner. It was awarded €626,000 by the Creative Industry programme of NWO (the Netherlands Organisation for Scientific Research), matched by an investment of €268,000 from Brill (in cash and in kind).

The researchers will use an advanced system for handwriting and image recognition (Monk), complemented with contextual information on species, locations and habitats. Naturalis’ taxonomic expertise, in combination with history of science methods, will be used to refine the system further. The outcome of the project will allow Brill to offer the system as an online service for the heritage sector, strengthening its digital humanities profile. The service will benefit both curators of illustrated handwritten archives and researchers who wish to further the understanding of these collections.

The Monk system logo. Image credits: Lambert Schomaker

The four-year project includes the appointment of two computer science PhD students (Leiden, Groningen), a post-doctoral researcher in the history of science (Twente) and a specialist on 19th-century taxonomy and natural history (Naturalis). ‘The unique archive of the Natuurkundige Commissie serves as a perfect challenge to combine expertise from different universities and disciplines’, says Brill’s Senior Acquisitions Editor Michiel Thijssen. ‘The resulting technologies will advance the ways in which scholars can study the archived human cultural heritage.’

Page from a bundle of field notes, describing and depicting a mouse species. Source: Naturalis Biodiversity Center, Archief van de Natuurkundige Commissie voor Nederlands-Indië. Copyright: Public Domain Mark 1.0.

Drawing of Burro multicolor created in Buitenzorg, Java in 1827 by Pieter van Oort. Source: Naturalis Biodiversity Center, Archief van de Natuurkundige Commissie voor Nederlands-Indië. Copyright: Public Domain Mark 1.0.

For more information, please contact Michiel Thijssen, thijssen@brill.com, +31-71-5353594.

About Making Sense of Illustrated Handwritten Archives: Providing access to the hidden treasures of the Natuurkundige Commissie
Large and important parts of our cultural heritage are stored in archives that are difficult to access. Documents and notes are written in hard-to-read historical handwriting and are weakly structured, which precludes access by the wider public, and even by scientists and other experts. Computer-based recognition of connected-cursive script is, in general, still beyond the reach of current technology.

Our project will investigate this challenging problem by attempting to interpret the notes and illustrations of the Natuurkundige Commissie. This archive is one of the top collections of Naturalis Biodiversity Center and comprises 17,000 pages that give a rich account of the scientific exploration of the Indonesian Archipelago (1820-1850). Correctly interpreting illustrated handwritten historical archives is hard.

For handwriting recognition we use the MONK system, a state-of-the-art machine learning system for handwriting recognition. In addition, we can rely on (1) the circumstances of the committee’s voyages, and (2) contextual information on the species, locations and habitats. This information will be used to support the handwriting recognition of the historic collection. MONK will be extended with layout formatting and ontology elements. Furthermore, the Naturalis taxonomic expertise, in combination with history of science methods, will be used to bootstrap, train and refine the system.
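
As an illustration of how contextual information might support handwriting recognition, the sketch below re-ranks candidate readings of a word against a small list of known terms. It is not the MONK system or its interface: the function names, the example scores, the term list and the equal weighting are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rerank(candidates, lexicon, weight=0.5):
    """Re-rank (text, score) candidates using a lexicon of known terms.

    candidates: hypothetical recognizer output, scores assumed in [0, 1].
    lexicon:    contextual terms (place names, species names, ...).
    weight:     share of the final score given to the lexicon match.
    """
    rescored = []
    for text, score in candidates:
        # Best match of this reading against any known contextual term.
        context = max((similarity(text, term) for term in lexicon), default=0.0)
        combined = (1 - weight) * score + weight * context
        rescored.append((text, combined))
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical readings of one handwritten word, with made-up scores.
candidates = [("Buitenzory", 0.48), ("Buitenzorg", 0.45), ("Bultenzarg", 0.30)]
# Hypothetical contextual terms, e.g. places visited on the voyages.
lexicon = ["Buitenzorg", "Java", "Batavia"]

best = rerank(candidates, lexicon)[0][0]
print(best)
```

In this toy case the reading that matches a known place name overtakes a slightly higher-scoring misreading. A real system would need properly calibrated scores and a much richer model of context than a flat term list.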

The project aims to develop a technologically advanced and user-centered digital environment that provides access to archives containing handwritten notes and illustrations. This tool, which combines image and text recognition, will for the first time allow an integrated study of underexplored scientific heritage collections, and of archives in general.