Unified Visual-Aware Representations for Data Analytics
Main Authors: | , , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Online Access: | https://ieeexplore.ieee.org/document/10854212/ |
Summary: | One characteristic of big data is its internal complexity and variety, manifested in the many types of datasets that must be managed, searched, or analyzed. In their natural form, some data entities are unstructured, such as texts or multimedia objects, while others are structured but too complex (e.g., high-dimensional tabular data). Because so many different forms of data arise in domain-specific problems, many different data representations are in use, each tailored to a specific data form, domain, and task. In this paper, we propose a framework for universal visual representations of complex data. The desired property of the visualizations is the ability to visually encode the semantic features of the original data. Hence, processing the visualizations (images) with generic deep learning models yields deep feature vectors that can be used uniformly in standard data retrieval and analytics tasks. Specifically, we develop a semi-automated transfer learning pipeline that transforms arbitrary input tabular data into visual representations. The visual representations support data analytics tasks performed by human users and also serve as universal data representations for machine learning models in automated tasks. We show in a large study that visual representations of complex data are effective in a number of domains, and we propose a recommender to help parameterize the entire pipeline for particular domains and use cases. In summary, the proposed framework enables rapid prototyping of data representations in an arbitrary domain using a shared concept: visual representations applicable in data analytics using generic deep learning models. |
---|---|
ISSN: | 2169-3536 |
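
The summary describes turning tabular rows into images whose features can then be compared uniformly. This record does not specify the encoding the authors use, so the following is only a minimal sketch under stated assumptions: each numeric column is min-max scaled to a 0..255 pixel value, the row is zero-padded into a square grid, and cosine similarity on raw pixels stands in for the deep feature vectors that a generic pretrained model would produce. The function names `rows_to_images` and `cosine_similarity` are hypothetical and do not come from the paper.

```python
import math


def rows_to_images(rows):
    """Encode each numeric row as a square grayscale image (nested lists of 0..255 ints).

    Assumption: a simple per-column min-max scaling, NOT the paper's encoding.
    """
    n_cols = len(rows[0])
    # Per-column minimum and maximum, used for min-max scaling.
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    # Smallest side length such that side * side >= n_cols.
    side = math.isqrt(n_cols - 1) + 1

    images = []
    for r in rows:
        pixels = []
        for j, value in enumerate(r):
            span = highs[j] - lows[j]
            # Constant columns carry no information; map them to 0.
            scaled = (value - lows[j]) / span if span > 0 else 0.0
            pixels.append(round(scaled * 255))
        # Zero-pad so the pixels fill the square grid.
        pixels += [0] * (side * side - n_cols)
        images.append([pixels[i * side:(i + 1) * side] for i in range(side)])
    return images


def cosine_similarity(img_a, img_b):
    """Cosine similarity of two images' flattened pixels (stand-in for deep features)."""
    a = [p for row in img_a for p in row]
    b = [p for row in img_b for p in row]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


# Illustrative data only: three rows with five numeric features each.
rows = [
    [1.0, 10.0, 100.0, 0.5, 3.0],
    [1.1, 11.0, 110.0, 0.6, 3.0],
    [9.0, 90.0, 900.0, 5.0, 3.0],
]
images = rows_to_images(rows)
similarity = cosine_similarity(images[1], images[2])
```

In a fuller pipeline, the pixel grids would be rendered as image files and passed through a pretrained vision model, and the similarity would be computed on that model's feature vectors rather than on raw pixels.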