A Machine-Learning-Based Data Science Framework for Effectively and Efficiently Processing, Managing, and Visualizing Big Sequential Data

Bibliographic Details
Main Authors: Alfredo Cuzzocrea, Islam Belmerabet, Abderraouf Hafsaoui, Carson K. Leung
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Computers
Subjects: information visualization; big data; sequences; data science; visual data science; data mining
Online Access: https://www.mdpi.com/2073-431X/14/7/276
author Alfredo Cuzzocrea
Islam Belmerabet
Abderraouf Hafsaoui
Carson K. Leung
collection DOAJ
description In recent years, the open data initiative has led to the willingness of many governments, researchers, and organizations to share their data and make it publicly available. Healthcare, disease, and epidemiological data, such as privacy statistics on patients who have suffered from epidemic diseases such as the Coronavirus disease 2019 (COVID-19), are examples of open big data. Therefore, huge volumes of valuable data have been generated and collected at high speed from a wide variety of rich data sources. Analyzing these open big data can be of social benefit. For example, people gain a better understanding of disease by analyzing and mining disease statistics, which can inspire them to participate in disease prevention, detection, control, and combat. Visual representation further improves data understanding and corresponding results for analysis and mining, as a picture is worth a thousand words. In this paper, we present a visual data science solution for the visualization and visual analysis of large sequence data. These ideas are illustrated by the visualization and visual analysis of sequences of real epidemiological data of COVID-19. Through our solution, we enable users to visualize the epidemiological data of COVID-19 over time. It also allows people to visually analyze data and discover relationships between popular features associated with COVID-19 cases. The effectiveness of our visual data science solution in improving the user experience of visualization and visual analysis of large sequence data is demonstrated by the real-life evaluation of these sequenced epidemiological data of COVID-19.
format Article
id doaj-art-bf9d8b1c1d16413eb896ccbf742cd194
institution Kabale University
issn 2073-431X
language English
publishDate 2025-07-01
publisher MDPI AG
record_format Article
series Computers
spelling doaj-art-bf9d8b1c1d16413eb896ccbf742cd194
2025-08-20T03:36:19Z
eng
MDPI AG
Computers
2073-431X
2025-07-01
Volume 14, Issue 7, Article 276
10.3390/computers14070276
Alfredo Cuzzocrea, Islam Belmerabet, Abderraouf Hafsaoui: iDEA Lab, University of Calabria, 87036 Rende, Italy
Carson K. Leung: Department of Computer Science, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
title A Machine-Learning-Based Data Science Framework for Effectively and Efficiently Processing, Managing, and Visualizing Big Sequential Data
topic information visualization
big data
sequences
data science
visual data science
data mining
url https://www.mdpi.com/2073-431X/14/7/276