Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis

Abstract Domain-specific vocabulary, which is crucial in fields such as Information Retrieval and Natural Language Processing, requires continuous updates to remain effective. Incremental Learning, unlike conventional methods, updates existing knowledge without retraining from scratch. This paper pr...

Bibliographic Details
Main Authors: Mansi Jain, Harmeet Kaur, Bhavna Gupta, Jaya Gera, Vandana Kalra
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Subjects: Unigrams; Bigrams; Incremental learning; n-grams; Text analytics; Natural language processing
Online Access: https://doi.org/10.1038/s41598-024-78785-6
author Mansi Jain
Harmeet Kaur
Bhavna Gupta
Jaya Gera
Vandana Kalra
collection DOAJ
description Abstract Domain-specific vocabulary, which is crucial in fields such as Information Retrieval and Natural Language Processing, requires continuous updates to remain effective. Incremental Learning, unlike conventional methods, updates existing knowledge without retraining from scratch. This paper presents an incremental learning algorithm for updating domain-specific vocabularies. It introduces DocLib, an archive used to capture a compact footprint of previously seen data and vocabulary terms. Task-based evaluation measures the effectiveness of the updated vocabulary by using vocabulary terms to perform a downstream task of text classification. The classification accuracy gauges the effectiveness of the vocabulary in discerning unseen documents related to the domain. Experiments illustrate that multiple incremental updates maintain vocabulary relevance without compromising its effectiveness. The proposed algorithm ensures bounded memory and processing requirements, distinguishing it from conventional approaches. Novel algorithms are introduced to assess the stability and plasticity of the proposed approach, demonstrating its ability to assimilate new knowledge while retaining old insights. The generalizability of the vocabulary is tested across datasets, achieving 97.89% accuracy in identifying domain-related data. A comparison with state-of-the-art techniques using a benchmark dataset confirms the effectiveness of the proposed approach. Importantly, this approach extends beyond classification tasks, potentially benefiting other research fields.
format Article
id doaj-art-be6ad87bd5844f7b8dbf24d11687dc27
institution Kabale University
issn 2045-2322
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-be6ad87bd5844f7b8dbf24d11687dc27
Record timestamp: 2025-01-05T12:22:14Z
Language: eng
Publisher: Nature Portfolio
Journal: Scientific Reports
ISSN: 2045-2322
Publication date: 2025-01-01
Volume 15, issue 1, pages 1-16
DOI: 10.1038/s41598-024-78785-6
Title: Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis
Authors and affiliations:
Mansi Jain (0): Department of Computer Science, Shyama Prasad Mukherji College for Women, University of Delhi
Harmeet Kaur (1): Department of Computer Science, Hansraj College, University of Delhi
Bhavna Gupta (2): Department of Computer Science, Keshav Mahavidyalaya, University of Delhi
Jaya Gera (3): Department of Computer Science, Shyama Prasad Mukherji College for Women, University of Delhi
Vandana Kalra (4): Department of Computer Science, Sri Guru Gobind Singh College of Commerce, University of Delhi
URL: https://doi.org/10.1038/s41598-024-78785-6
Keywords: Unigrams; Bigrams; Incremental learning; n-grams; Text analytics; Natural language processing
title Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis
topic Unigrams
Bigrams
Incremental learning
n-grams
Text analytics
Natural language processing
url https://doi.org/10.1038/s41598-024-78785-6