Combining the Pre-Trained Model Roberta with a Two-Layer Bidirectional Long- and Short-Term Memory Network and a Multi-Head Attention Mechanism for a Rice Phenomics Entity Classification Study

Bibliographic Details
Main Authors: Dayu Xu, Xinyu Zhu, Xuyao Zhang, Fang Xia
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: AgriEngineering
Online Access: https://www.mdpi.com/2624-7402/7/4/94
Description
Summary: As global food security comes under pressure, phenomics research on rice, a major food crop, has become increasingly important. In-depth analysis of rice phenotypic characteristics is key to promoting the genetic improvement of rice and sustainable agricultural development. However, accurately identifying and classifying entities in the large volume of rice phenotypic data is a challenging task. In this study, a deep learning model, Roberta-two-layer BiLSTM-MHA, was constructed for rice phenomics entity classification. First, the pre-trained Roberta model performs deep feature extraction on the rice phenotype text to capture its underlying semantic information. Next, a two-layer bidirectional long short-term memory network (BiLSTM) models the contextual information and captures the long-term dependencies in the text sequences. Finally, a multi-head attention (MHA) mechanism enables the model to focus adaptively on key features at different levels, which improves the classification accuracy for complex phenotypic information. In the experiments, the model achieved accuracy, recall, and F1-score values of 89.56%, 86.40%, and 87.90%, respectively. The result provides an efficient and precise entity classification tool for rice phenomics research and a reference method for other crop phenomics analyses, and it is expected to promote technological innovation in crop genetic breeding and agricultural production.
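The three-stage pipeline described in the summary (encoder features, two-layer BiLSTM, multi-head attention, then a classifier) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name, the hidden sizes, the number of heads, the number of entity classes, and the mean pooling before the classifier are all assumptions made for illustration, since the record does not state them. To keep the block runnable offline, random tensors stand in for the encoder output; in practice these would be the per-token hidden states of a pre-trained RoBERTa model.

```python
import torch
import torch.nn as nn


class BiLSTMAttentionClassifier(nn.Module):
    """Hypothetical classification head: 2-layer BiLSTM -> multi-head attention -> linear."""

    def __init__(self, encoder_dim=768, lstm_hidden=256, num_heads=8,
                 num_classes=5, dropout=0.1):
        super().__init__()
        # Two stacked bidirectional LSTM layers over the encoder's token features.
        self.bilstm = nn.LSTM(
            input_size=encoder_dim,
            hidden_size=lstm_hidden,
            num_layers=2,
            bidirectional=True,
            batch_first=True,
            dropout=dropout,
        )
        # Self-attention over the BiLSTM outputs (forward + backward = 2 * hidden).
        self.attention = nn.MultiheadAttention(
            embed_dim=2 * lstm_hidden,
            num_heads=num_heads,
            batch_first=True,
        )
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, encoder_states, padding_mask=None):
        # encoder_states: (batch, seq_len, encoder_dim)
        # padding_mask:   (batch, seq_len), True where a position is padding.
        # Note: padded positions still pass through the LSTM in this sketch.
        lstm_out, _ = self.bilstm(encoder_states)
        attn_out, _ = self.attention(
            lstm_out, lstm_out, lstm_out, key_padding_mask=padding_mask
        )
        if padding_mask is None:
            pooled = attn_out.mean(dim=1)
        else:
            # Average only over real (non-padding) tokens.
            keep = (~padding_mask).unsqueeze(-1).to(attn_out.dtype)
            pooled = (attn_out * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1.0)
        return self.classifier(pooled)  # (batch, num_classes) logits


torch.manual_seed(0)
model = BiLSTMAttentionClassifier()
model.eval()

# Stand-in for encoder output: 2 sequences, 12 tokens, 768 features each.
states = torch.randn(2, 12, 768)
mask = torch.zeros(2, 12, dtype=torch.bool)
mask[1, 8:] = True  # the second sequence has 4 padding tokens

logits = model(states, mask)
print(logits.shape)  # torch.Size([2, 5])
```

Mean pooling yields one label per input text. If the task is instead framed as labelling each token, the pooling step would be dropped and the linear layer applied to every position.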
ISSN: 2624-7402