Using Text Mining to Search for Neolithic Vlaardingen Culture Sites in the Rhine-Meuse-Scheldt Delta

Bibliographic Details
Main Authors: Lasse Van den Dikkenberg, Alex Brandsen
Format: Article
Language: English
Published: Ubiquity Press 2025-03-01
Series: Journal of Computer Applications in Archaeology
Online Access: https://account.journal.caa-international.org/index.php/up-j-jcaa/article/view/205
Description
Summary: This paper presents a study on Vlaardingen Culture (3400–2500 BCE) sites in the Rhine-Meuse-Scheldt delta using AGNES, an intelligent search engine for Dutch and Flemish archaeological grey literature. The aims of this paper are twofold: 1) to provide an up-to-date overview of Vlaardingen Culture sites; 2) to evaluate the performance of AGNES in searching for period-specific sites. Vlaardingen Culture (VLC) sites usually consist of artefact scatters without clearly discernible house plans. These scatters are often found amongst abundant remains from later periods. This type of ‘by-catch’ is usually not recorded in the metadata of archaeological reports and can only be recovered through full-text searches. AGNES uses text mining and large language models to allow searches on archaeological concepts (in this case an archaeological culture) in full texts extracted from three major repositories for Dutch (DANS and ARCHIS) and Flemish (Onroerend Erfgoed) archaeology. This paper presents a search for VLC sites and a comparison of the retrieved information with a recently compiled overview of VLC sites in the area. Using eight queries, we retrieved 4532 hits, which were subdivided into relevant hits (n = 430), semi-relevant hits (n = 2133), and irrelevant hits (n = 1960). We recovered 30 previously unknown Vlaardingen Culture sites, amounting to 19% of the total number of VLC sites (n = 158). Not all sites could be found in AGNES; older archaeological sites are often published in scientific and semi-scientific journals, theses, or books. These publications are absent from the repositories that can be accessed through AGNES and, by extension, cannot be retrieved. As such, AGNES does not provide an alternative to traditional search methods. Nevertheless, most of the newly found sites cannot be found by searching the metadata of reports in DANS and ARCHIS. Therefore, AGNES proved to be an essential and effective addition to traditional search methods.
Finally, our study highlighted the fact that clear terminology to describe Vlaardingen Culture sites is presently lacking. As such, the study provided interesting insights into the terminologies employed in development-led archaeology.
ISSN: 2514-8362