Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis
Abstract Domain-specific vocabulary, which is crucial in fields such as Information Retrieval and Natural Language Processing, requires continuous updates to remain effective. Incremental Learning, unlike conventional methods, updates existing knowledge without retraining from scratch. This paper pr...
Main Authors: | Mansi Jain, Harmeet Kaur, Bhavna Gupta, Jaya Gera, Vandana Kalra |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Scientific Reports |
Subjects: | Unigrams, Bigrams, Incremental learning, n-grams, Text analytics, Natural language processing |
Online Access: | https://doi.org/10.1038/s41598-024-78785-6 |
_version_ | 1841559621298290688 |
---|---|
author | Mansi Jain, Harmeet Kaur, Bhavna Gupta, Jaya Gera, Vandana Kalra |
author_facet | Mansi Jain, Harmeet Kaur, Bhavna Gupta, Jaya Gera, Vandana Kalra |
author_sort | Mansi Jain |
collection | DOAJ |
description | Abstract Domain-specific vocabulary, which is crucial in fields such as Information Retrieval and Natural Language Processing, requires continuous updates to remain effective. Incremental Learning, unlike conventional methods, updates existing knowledge without retraining from scratch. This paper presents an incremental learning algorithm for updating domain-specific vocabularies. It introduces DocLib, an archive used to capture a compact footprint of previously seen data and vocabulary terms. Task-based evaluation measures the effectiveness of the updated vocabulary by using vocabulary terms to perform a downstream task of text classification. The classification accuracy gauges the effectiveness of the vocabulary in discerning unseen documents related to the domain. Experiments illustrate that multiple incremental updates maintain vocabulary relevance without compromising its effectiveness. The proposed algorithm ensures bounded memory and processing requirements, distinguishing it from conventional approaches. Novel algorithms are introduced to assess the stability and plasticity of the proposed approach, demonstrating its ability to assimilate new knowledge while retaining old insights. The generalizability of the vocabulary is tested across datasets, achieving 97.89% accuracy in identifying domain-related data. A comparison with state-of-the-art techniques using a benchmark dataset confirms the effectiveness of the proposed approach. Importantly, this approach extends beyond classification tasks, potentially benefiting other research fields. |
format | Article |
id | doaj-art-be6ad87bd5844f7b8dbf24d11687dc27 |
institution | Kabale University |
issn | 2045-2322 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj-art-be6ad87bd5844f7b8dbf24d11687dc27; 2025-01-05T12:22:14Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-01-01; 15; 1; 1; 16; 10.1038/s41598-024-78785-6; Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis; Mansi Jain [0], Harmeet Kaur [1], Bhavna Gupta [2], Jaya Gera [3], Vandana Kalra [4]; [0] Department of Computer Science, Shyama Prasad Mukherji College for Women, University of Delhi; [1] Department of Computer Science, Hansraj College, University of Delhi; [2] Department of Computer Science, Keshav Mahavidyalaya, University of Delhi; [3] Department of Computer Science, Shyama Prasad Mukherji College for Women, University of Delhi; [4] Department of Computer Science, Sri Guru Gobind Singh College of Commerce, University of Delhi; Abstract Domain-specific vocabulary, which is crucial in fields such as Information Retrieval and Natural Language Processing, requires continuous updates to remain effective. Incremental Learning, unlike conventional methods, updates existing knowledge without retraining from scratch. This paper presents an incremental learning algorithm for updating domain-specific vocabularies. It introduces DocLib, an archive used to capture a compact footprint of previously seen data and vocabulary terms. Task-based evaluation measures the effectiveness of the updated vocabulary by using vocabulary terms to perform a downstream task of text classification. The classification accuracy gauges the effectiveness of the vocabulary in discerning unseen documents related to the domain. Experiments illustrate that multiple incremental updates maintain vocabulary relevance without compromising its effectiveness. The proposed algorithm ensures bounded memory and processing requirements, distinguishing it from conventional approaches. Novel algorithms are introduced to assess the stability and plasticity of the proposed approach, demonstrating its ability to assimilate new knowledge while retaining old insights. The generalizability of the vocabulary is tested across datasets, achieving 97.89% accuracy in identifying domain-related data. A comparison with state-of-the-art techniques using a benchmark dataset confirms the effectiveness of the proposed approach. Importantly, this approach extends beyond classification tasks, potentially benefiting other research fields.; https://doi.org/10.1038/s41598-024-78785-6; Unigrams, Bigrams, Incremental learning, n-grams, Text analytics, Natural language processing |
spellingShingle | Mansi Jain, Harmeet Kaur, Bhavna Gupta, Jaya Gera, Vandana Kalra; Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis; Scientific Reports; Unigrams, Bigrams, Incremental learning, n-grams, Text analytics, Natural language processing |
title | Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis |
title_full | Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis |
title_fullStr | Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis |
title_full_unstemmed | Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis |
title_short | Incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis |
title_sort | incremental learning algorithm for dynamic evolution of domain specific vocabulary with its stability and plasticity analysis |
topic | Unigrams, Bigrams, Incremental learning, n-grams, Text analytics, Natural language processing |
url | https://doi.org/10.1038/s41598-024-78785-6 |
work_keys_str_mv | AT mansijain incrementallearningalgorithmfordynamicevolutionofdomainspecificvocabularywithitsstabilityandplasticityanalysis AT harmeetkaur incrementallearningalgorithmfordynamicevolutionofdomainspecificvocabularywithitsstabilityandplasticityanalysis AT bhavnagupta incrementallearningalgorithmfordynamicevolutionofdomainspecificvocabularywithitsstabilityandplasticityanalysis AT jayagera incrementallearningalgorithmfordynamicevolutionofdomainspecificvocabularywithitsstabilityandplasticityanalysis AT vandanakalra incrementallearningalgorithmfordynamicevolutionofdomainspecificvocabularywithitsstabilityandplasticityanalysis |