Description

The rise of transformer language models such as BERT has opened up possibilities for using contextualized word embeddings in downstream text processing tasks, including applications in humanities research. However, methods for properly using these models in a humanities context, and particularly for historical research, are still very much under development. The aim of this working paper is to present guidelines for using transformer language models to study change over time. The paper is based on a workshop held at the NL eScience Center Amsterdam on 9 December 2022, which brought together experts in the computational analysis of historical text.

Sophie Arnoult - VU Amsterdam
Sara Budts - Antwerp University
Andreas van Cranenburgh - Groningen University
Mirjam Cuper - National Library The Hague
Ronald Dekker - KNAW Humanities Cluster
Pieter Delobelle - KU Leuven
Lauren Fonteyn - Leiden University
Anastasia Giachanou - Utrecht University
Julian Gonggrijp - Utrecht University
Flavio Hafner - NL eScience Center Amsterdam
Pim Huijnen - Utrecht University
Ali Hürriyetoğlu - KNAW Humanities Cluster
Marijn Koolen - KNAW Humanities Cluster
Ken Krige - Utrecht University
Malte Luken - NL eScience Center Amsterdam
Enrique Manjavacas - Leiden University
Dong Nguyen - Utrecht University
Laura Ootes - NL eScience Center Amsterdam
Luka van der Plas - Utrecht University
Carsten Schnober - NL eScience Center Amsterdam
Erik Tjong Kim Sang - NL eScience Center Amsterdam
Stella Verkijk - VU Amsterdam
Arjen Versloot - University of Amsterdam
Leon van Wissen - University of Amsterdam
Parisa Zahedi - Utrecht University

Published

2023-01-31

Website

https://github.com/Semantics-of-Sustainability/2022-12-09-workshop/blob/main/paper-draft.md