Synthetic Data-Based Algorithm Selection for Medical Image Classification Under Limited Data Availability

Algorithm selection improves performance by dynamically choosing the best algorithm for each input instance. Although this selection strategy has been studied extensively, the amount and nature of the data required have not yet been investigated with respect to meta-learning, particularly in s...

Bibliographic Details
Main Authors: Maxim Zhabinets, Benjamin Tyler, Martin Lukac, Shinobu Nagayama, Ferdinand Molnár, Michitaka Kameyama
Format: Article
Language: English
Published: MDPI AG 2025-05-01
Series: Algorithms
Subjects: algorithm selection; medical image classification; synthetic data; GAN
Online Access: https://www.mdpi.com/1999-4893/18/6/310
collection DOAJ
description Algorithm selection improves performance by dynamically choosing the best algorithm for each input instance. Although this selection strategy has been studied extensively, the amount and nature of the data required have not yet been investigated with respect to meta-learning, particularly in scenarios with limited data availability. This paper addresses a critical challenge: when additional data may not be available for training an algorithm selector, the data needed to implement the selection mechanism must be generated. Focusing on medical image classification, we investigate whether synthetic data can effectively train an algorithm selector when real training data are scarce. Our methodology generates data with a Generative Adversarial Network (GAN). To determine whether an algorithm selector trained on synthetic data can match the accuracy of one trained on real-world data, we systematically evaluate the generative model using the smallest amount of data needed to choose the right algorithm and reach the expected level of accuracy. Our experimental results show that a small number of real samples can give a GAN enough information to synthesize a new dataset that, when used to train the algorithm selector, improves image classification in some cases.
id doaj-art-391e3f61bc9d472ca3b39b22cc8379fa
institution Kabale University
issn 1999-4893
spelling doaj-art-391e3f61bc9d472ca3b39b22cc8379fa, 2025-08-20T03:26:24Z
Algorithms, vol. 18, no. 6, article 310 (2025-05-01), doi:10.3390/a18060310
Maxim Zhabinets: School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan
Benjamin Tyler: School of Engineering and Digital Sciences, Nazarbayev University, Astana 010000, Kazakhstan
Martin Lukac: Graduate School of Information Sciences, Hiroshima City University, Hiroshima 731-3166, Japan
Shinobu Nagayama: Graduate School of Information Sciences, Hiroshima City University, Hiroshima 731-3166, Japan
Ferdinand Molnár: School of Sciences and Humanities, Nazarbayev University, Astana 010000, Kazakhstan
Michitaka Kameyama: Emeritus of Graduate School of Information Sciences, Tohoku University, Sendai 980-8577, Japan
topic algorithm selection
medical image classification
synthetic data
GAN