MultiSHTM: Multi-Level Attention Enabled Bi-Directional Model for the Summarization of Chart Images

Bibliographic Details
Main Authors: Indra Kumari, Hansung Lee
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/11000319/
Description
Summary: Chart-to-text conversion is an emerging research area focused on extracting useful information from chart images to improve understanding and analysis. Deep learning methods help identify important details and patterns in charts; however, existing models struggle to analyze charts because of the mix of textual and graphical elements. To address this problem, we propose a new method called MultiSHTM (Multi-level Stacked Houghless Network-based Bi-LSTM), which improves accuracy, reduces complexity, and generalizes across different chart types. MultiSHTM integrates two key innovations: 1) a multi-level attention mechanism in a stacked Houghless network, which accurately identifies key points in charts without relying on traditional Hough Transform-based methods, and 2) a Bi-LSTM model enhanced with a Hierarchical and Channel Attention module, which effectively captures contextual relationships to generate precise summaries of chart images. Compared to existing methods, MultiSHTM performs better, achieving scores of ROUGE: 0.55, BLEU: 0.45, CIDEr: 0.80, METEOR: 0.25, and SPICE: 25.60.
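The reported numbers are standard text-generation metrics (ROUGE, BLEU, CIDEr, METEOR, SPICE) that compare a generated chart summary against reference text. As a rough, stdlib-only illustration of how one such score works (this is not the authors' evaluation pipeline, and real evaluations use established implementations), a minimal sentence-level BLEU with clipped n-gram precision and a brevity penalty can be sketched as:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(reference, candidate, max_n=2):
    """Minimal sentence-level BLEU: geometric mean of clipped n-gram
    precisions (up to max_n) times a brevity penalty. Illustrative only."""
    ref, cand = reference.split(), candidate.split()
    precisions = []
    for n in range(1, max_n + 1):
        ref_counts, cand_counts = ngrams(ref, n), ngrams(cand, n)
        # Clipped overlap: a candidate n-gram counts at most as often
        # as it appears in the reference.
        overlap = sum((cand_counts & ref_counts).values())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_avg)
```

An exact match scores 1.0, while a shorter partial match is penalized by both the bigram precision and the brevity penalty; metrics such as CIDEr and SPICE refine this idea with TF-IDF weighting and scene-graph matching, respectively.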
ISSN:2169-3536