Word-based largest chunks for Agreement Groups processing: Cross-linguistic observations

Bibliographic Details
Main Author: László Drienkó
Format: Article
Language: English
Published: The John Paul II Catholic University of Lublin 2020-12-01
Series: LingBaW
Online Access: https://czasopisma.kul.pl/index.php/LingBaW/article/view/11831
Description
Summary: The present study reports results from a series of computer experiments seeking to combine word-based Largest Chunk (LCh) segmentation and Agreement Groups (AG) sequence processing. The AG model is based on groups of similar utterances that enable combinatorial mapping of novel utterances. LCh segmentation is concerned with cognitive text segmentation, i.e. with detecting word boundaries in a sequence of linguistic symbols. Our observations are based on the text of Le petit prince (The little prince) by Antoine de Saint-Exupéry in three languages: French, English, and Hungarian. The data suggest that word-based LCh segmentation is not very efficient with respect to utterance boundaries; however, it can provide useful word combinations for AG processing. Typological differences between the languages are also reflected in the results.
ISSN: 2450-5188