Computational Paleography of Medieval Hebrew Scripts

Berat Kurar-Barakat*, Daria Vasyutinsky-Shapira, Sharva Gogawale, Mohammad Suliman, Nachum Dershowitz

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

We present ongoing work as part of an international multidisciplinary project, called MiDRASH, on the computational analysis of medieval manuscripts. We focus here on clustering manuscripts written in Ashkenazi square script using a dataset of 206 pages from 59 manuscripts. Collaborating with expert paleographers, we identified ten critical features and trained a multi-label CNN, achieving high accuracy in feature prediction. This should make it possible to computationally predict the subclusters already known to paleographers and those yet to be discovered. We identified visible clusters using PCA and χ² feature selection. In future work, we aim to enhance feature extraction using deep learning algorithms and provide computational tools to ease paleographers’ work. We plan to develop new methodologies for analyzing Hebrew scripts and refining our understanding of medieval Hebrew manuscripts.
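
As a rough illustration of the clustering step mentioned in the abstract, the sketch below scores binary script features with a χ² statistic and projects the highest-scoring ones onto two principal components. It is not the authors' code. The data are synthetic, the group labels are hypothetical (the abstract does not say which labels the χ² selection used), and the helper names `chi2_scores` and `pca_project` are invented for this example. Only the sizes (206 pages, ten features) come from the abstract.

```python
import numpy as np


def chi2_scores(X, y):
    """Chi-squared statistic between each non-negative feature and the labels."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    # One-hot encode the labels: shape (n_samples, n_classes).
    Y = (y[:, None] == classes[None, :]).astype(float)
    observed = Y.T @ X                       # per-class feature totals
    class_prob = Y.mean(axis=0)              # empirical class frequencies
    feature_total = X.sum(axis=0)            # overall feature totals
    expected = np.outer(class_prob, feature_total)
    # A feature that is zero everywhere gives 0/0; treat its score as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = (observed - expected) ** 2 / expected
    return np.nansum(terms, axis=0)


def pca_project(X, n_components=2):
    """Project mean-centred rows of X onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Synthetic stand-in data: 206 pages, ten binary features (assumed encoding).
rng = np.random.default_rng(0)
n_pages, n_features = 206, 10
groups = rng.integers(0, 2, size=n_pages)            # hypothetical labels
X = rng.integers(0, 2, size=(n_pages, n_features))   # uninformative features
X[:, 0] = groups        # feature 0 follows the hypothetical grouping
X[:, 1] = 1 - groups    # feature 1 is its complement

scores = chi2_scores(X, groups)
top = np.argsort(scores)[::-1][:4]    # indices of the four best-scoring features
coords = pca_project(X[:, top], n_components=2)
```

Plotting `coords` coloured by `groups` would show whether the selected features separate the pages visually. On real data, the binary feature matrix would presumably come from expert annotation or from the predictions of the multi-label CNN.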

Original language: English
Pages (from-to): 707-717
Number of pages: 11
Journal: CEUR Workshop Proceedings
Volume: 3834
State: Published - 2024
Event: 2024 Computational Humanities Research Conference, CHR 2024 - Aarhus, Denmark
Duration: 4 Dec 2024 - 6 Dec 2024

Funding

Funders (funder number):
European Commission
European Research Council Executive Agency
European Research Council (101071829)

    Keywords

    • Medieval Hebrew manuscripts
    • computational paleography
    • convolutional neural networks
    • image clustering
    • recurrent neural networks
