Constant-delay enumeration for SLP-compressed documents

We study the problem of enumerating results from a query over a compressed document. The model we use for compression are straight-line programs (SLPs), which are defined by a context-free grammar that produces a single string. For our queries, we use a model called Annotated Automata, an extension...

Full description

Saved in:
Bibliographic Details
Main Authors: Martín Muñoz, Cristian Riveros
Format: Article
Language:English
Published: Logical Methods in Computer Science e.V. 2025-02-01
Series:Logical Methods in Computer Science
Subjects:
Online Access:http://lmcs.episciences.org/12495/pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1849344756511932416
author Martín Muñoz
Cristian Riveros
author_facet Martín Muñoz
Cristian Riveros
author_sort Martín Muñoz
collection DOAJ
description We study the problem of enumerating results from a query over a compressed document. The model we use for compression are straight-line programs (SLPs), which are defined by a context-free grammar that produces a single string. For our queries, we use a model called Annotated Automata, an extension of regular automata that allows annotations on letters. This model extends the notion of Regular Spanners as it allows arbitrarily long outputs. Our main result is an algorithm that evaluates such a query by enumerating all results with output-linear delay after a preprocessing phase which takes linear time on the size of the SLP, and cubic time over the size of the automaton. This is an improvement over Schmid and Schweikardt's result, which, with the same preprocessing time, enumerates with a delay that is logarithmic on the size of the uncompressed document. We achieve this through a persistent data structure named Enumerable Compact Sets with Shifts which guarantees output-linear delay under certain restrictions. These results imply constant-delay enumeration algorithms in the context of regular spanners. Further, we use an extension of annotated automata which utilizes succinctly encoded annotations to save an exponential factor from previous results that dealt with constant-delay enumeration over vset automata. Lastly, we extend our results in the same fashion Schmid and Schweikardt did to allow complex document editing while maintaining the constant delay guarantee.
format Article
id doaj-art-bf4e158b28b74231ac55e48987a250ac
institution Kabale University
issn 1860-5974
language English
publishDate 2025-02-01
publisher Logical Methods in Computer Science e.V.
record_format Article
series Logical Methods in Computer Science
spelling doaj-art-bf4e158b28b74231ac55e48987a250ac2025-08-20T03:42:35ZengLogical Methods in Computer Science e.V.Logical Methods in Computer Science1860-59742025-02-01Volume 21, Issue 110.46298/lmcs-21(1:17)202512495Constant-delay enumeration for SLP-compressed documentsMartín MuñozCristian RiverosWe study the problem of enumerating results from a query over a compressed document. The model we use for compression are straight-line programs (SLPs), which are defined by a context-free grammar that produces a single string. For our queries, we use a model called Annotated Automata, an extension of regular automata that allows annotations on letters. This model extends the notion of Regular Spanners as it allows arbitrarily long outputs. Our main result is an algorithm that evaluates such a query by enumerating all results with output-linear delay after a preprocessing phase which takes linear time on the size of the SLP, and cubic time over the size of the automaton. This is an improvement over Schmid and Schweikardt's result, which, with the same preprocessing time, enumerates with a delay that is logarithmic on the size of the uncompressed document. We achieve this through a persistent data structure named Enumerable Compact Sets with Shifts which guarantees output-linear delay under certain restrictions. These results imply constant-delay enumeration algorithms in the context of regular spanners. Further, we use an extension of annotated automata which utilizes succinctly encoded annotations to save an exponential factor from previous results that dealt with constant-delay enumeration over vset automata. Lastly, we extend our results in the same fashion Schmid and Schweikardt did to allow complex document editing while maintaining the constant delay guarantee.http://lmcs.episciences.org/12495/pdfcomputer science - data structures and algorithmscomputer science - formal languages and automata theorycomputer science - logic in computer science
spellingShingle Martín Muñoz
Cristian Riveros
Constant-delay enumeration for SLP-compressed documents
Logical Methods in Computer Science
computer science - data structures and algorithms
computer science - formal languages and automata theory
computer science - logic in computer science
title Constant-delay enumeration for SLP-compressed documents
title_full Constant-delay enumeration for SLP-compressed documents
title_fullStr Constant-delay enumeration for SLP-compressed documents
title_full_unstemmed Constant-delay enumeration for SLP-compressed documents
title_short Constant-delay enumeration for SLP-compressed documents
title_sort constant delay enumeration for slp compressed documents
topic computer science - data structures and algorithms
computer science - formal languages and automata theory
computer science - logic in computer science
url http://lmcs.episciences.org/12495/pdf
work_keys_str_mv AT martinmunoz constantdelayenumerationforslpcompresseddocuments
AT cristianriveros constantdelayenumerationforslpcompresseddocuments