Exploring Legislative Textual Data in Brazilian Portuguese: Readability Analysis and Knowledge Graph Generation

Legislative documents are crucial to democratic societies, defining the legal framework for social life. In Brazil, legislative texts are particularly complex due to extensive technical jargon, intricate sentence structures, and frequent references to prior legislation. The country’s civil law tradi...

Bibliographic Details
Main Authors: Gisliany Lillian Alves de Oliveira, Breno Santana Santos, Marianne Silva, Ivanovitch Silva
Format: Article
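Language: English
Published: MDPI AG 2025-07-01
Series: Data
Subjects: legislative documents, knowledge graphs, large language models, laws, readability analysis, exploratory data analysis
Online Access: https://www.mdpi.com/2306-5729/10/7/106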
collection DOAJ
description Legislative documents are crucial to democratic societies, defining the legal framework for social life. In Brazil, legislative texts are particularly complex due to extensive technical jargon, intricate sentence structures, and frequent references to prior legislation. The country’s civil law tradition and multicultural context introduce further interpretative and linguistic challenges. Moreover, the study of Brazilian Portuguese legislative texts remains underexplored, lacking legal-specific models and datasets. To address these gaps, this work proposes a data-driven approach utilizing large language models (LLMs) to analyze these documents and extract knowledge graphs (KGs). A case study was conducted using 1869 proposals from the Legislative Assembly of Rio Grande do Norte (ALRN), spanning January 2019 to April 2024. The Llama 3.2 3B Instruct model was employed to extract KGs representing entities and their relationships. The findings support the method’s effectiveness in producing coherent graphs faithful to the original content. Nevertheless, challenges remain in resolving entity ambiguity and achieving full relationship coverage. Additionally, readability analyses using metrics for Brazilian Portuguese revealed that ALRN proposals require advanced reading skills due to their technical style. Ultimately, this study advances legal artificial intelligence by providing insights into Brazilian legislative texts and promoting transparency and accessibility through natural language processing techniques.
format Article
id doaj-art-831b553512f8488eb413fe3e3bd39daa
institution DOAJ
issn 2306-5729
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Data
spelling doaj-art-831b553512f8488eb413fe3e3bd39daa
Record timestamp: 2025-08-20T03:08:09Z
Language code: eng
Publisher: MDPI AG
Journal: Data
ISSN: 2306-5729
Publication date: 2025-07-01
Volume: 10
Issue: 7
Article number: 106
DOI: 10.3390/data10070106
Author affiliations (in author order):
Gisliany Lillian Alves de Oliveira: UFRN-PPgEEC, Postgraduate Program in Electrical and Computer Engineering, Federal University of Rio Grande do Norte, Natal 59078-970, Brazil
Breno Santana Santos: Information System Department, Federal University of Sergipe, Itabaiana 49400-000, Brazil
Marianne Silva: Campus Arapiraca, Federal University of Alagoas, Penedo 57200-000, Brazil
Ivanovitch Silva: UFRN-PPgEEC, Postgraduate Program in Electrical and Computer Engineering, Federal University of Rio Grande do Norte, Natal 59078-970, Brazil
topic legislative documents
knowledge graphs
large language models
laws
readability analysis
exploratory data analysis
url https://www.mdpi.com/2306-5729/10/7/106