Multilingual Automatic Term Extraction in Low-Resource Domains


Bibliographic Details
Main Authors: NGOC TAN LE, Fatiha Sadat
Format: Article
Language: English
Published: LibraryPress@UF 2021-04-01
Series: Proceedings of the International Florida Artificial Intelligence Research Society Conference
Online Access: https://journals.flvc.org/FLAIRS/article/view/128502
collection DOAJ
description With the emergence of neural network-based approaches, research on information extraction has benefited from large-scale raw text, exploited through pre-trained embeddings and other data augmentation techniques, to address challenges in Natural Language Processing tasks. In this paper, we propose an approach that uses sequence-to-sequence neural network models for term extraction in low-resource domains. Our experiments on the multilingual ACTER dataset, provided in the LREC-TermEval 2020 shared task on automatic term extraction, show that a deep learning approach is effective for automatic term extraction in low-data settings.
id doaj-art-9552b2908c5d43a7b725ae9a62e54ee2
institution DOAJ
issn 2334-0754
2334-0762
volume 34
doi 10.32473/flairs.v34i1.128502
affiliation Universite du Quebec a Montreal (both authors)