Toward Effective Comparative Opinion Mining: A Novel Vietnamese Product Review Corpus and Benchmark Approach

Bibliographic Details
Main Authors: Duy-Cat Can, Khanh-Vinh Nguyen, Hung-Manh Hoang, Duc-Loc Vu, Mai-Vu Tran, Hoang-Quynh Le
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10990144/
Description
Summary: Comparative opinion mining is an important sub-task of aspect-based sentiment analysis that focuses on identifying and interpreting comparisons between entities. It plays a vital role in understanding how users evaluate competing products or services. However, most existing studies are limited to extracting comparative elements from a single sentence. This narrow scope overlooks the complexity of real-world reviews, where multiple comparisons often appear within the same sentence or extend across sentences. Furthermore, comparative opinion mining remains underdeveloped in low-resource languages such as Vietnamese, due to the scarcity of annotated datasets. To address these challenges, we introduce VCOM, a Vietnamese corpus for comparative opinion mining, constructed from real-world product reviews. VCOM contains 2,468 annotated comparative tuples across 9,174 sentences, capturing both intra- and inter-sentence comparisons. Each tuple is labeled with a structured quintuple: subject, object, aspect, predicate, and comparison type. We also propose a deep learning model that leverages a sliding window mechanism to capture contextual dependencies across multiple sentences. This model improves the extraction of complex comparative structures that are often overlooked by sentence-level approaches. To validate our contributions, we conducted extensive experiments that demonstrate the effectiveness of our model on the VCOM corpus. We also organized a shared task based on the intra-sentence subset of VCOM to encourage further research in this domain. Our work serves as a foundation for future advancements in comparative opinion mining, particularly in low-resource settings.
ISSN: 2169-3536
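
Illustrative sketch: The summary describes two concrete ideas, a quintuple annotation (subject, object, aspect, predicate, comparison type) and a sliding window over sentences. The Python below is a minimal sketch of how those two ideas might be represented. It is not the authors' model or the VCOM file format. The class name `ComparativeQuintuple`, the function `sliding_windows`, and the `size` and `stride` defaults are all assumptions made for illustration, and the record does not give the set of comparison-type labels.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ComparativeQuintuple:
    """One annotated comparison; the five fields mirror the summary's quintuple."""
    subject: Optional[str]      # entity being compared (may be implicit)
    object: Optional[str]       # entity it is compared against
    aspect: Optional[str]       # feature under comparison
    predicate: Optional[str]    # comparative expression
    comparison_type: str        # label set is not specified in this record


def sliding_windows(sentences: List[str], size: int = 3, stride: int = 1) -> List[List[str]]:
    """Group consecutive sentences into overlapping windows.

    `size` and `stride` are hypothetical defaults, not values from the paper.
    """
    if size < 1 or stride < 1:
        raise ValueError("size and stride must be positive")
    if not sentences:
        return []
    if len(sentences) <= size:
        # A short review fits in a single window.
        return [list(sentences)]
    last_start = len(sentences) - size
    windows = [sentences[i:i + size] for i in range(0, last_start + 1, stride)]
    # When the stride skips the final start position, add a tail window
    # so the end of the review is still covered.
    if last_start % stride != 0:
        windows.append(sentences[-size:])
    return windows
```

For example, `sliding_windows(["s1", "s2", "s3", "s4"], size=2)` yields three overlapping two-sentence windows, so a comparison whose subject appears in `s2` and whose predicate appears in `s3` falls inside a single window. A sentence-level extractor would see those two parts separately.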