Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach

Predicting innovation outcomes at the firm level continues to be an important but challenging goal for researchers and practitioners alike. In this study, multiple machine learning models, encompassing both ensemble-based and single-model approaches, were applied to data from the Community Innovatio...

Bibliographic Details
Main Authors: Marko Martinović, Kristian Dokic, Dalibor Pudić
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Applied Sciences
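Subjects: innovation prediction; machine learning; ensemble methods; cross-validation; classification performance; computational efficiency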
Online Access: https://www.mdpi.com/2076-3417/15/7/3636
_version_ 1850188246718349312
author Marko Martinović
Kristian Dokic
Dalibor Pudić
author_facet Marko Martinović
Kristian Dokic
Dalibor Pudić
author_sort Marko Martinović
collection DOAJ
description Predicting innovation outcomes at the firm level continues to be an important but challenging goal for researchers and practitioners alike. In this study, multiple machine learning models, encompassing both ensemble-based and single-model approaches, were applied to data from the Community Innovation Survey. Methods included random forests, gradient boosting frameworks, support vector machines, neural networks, and logistic regression, each with hyperparameters optimized through Bayesian search routines and evaluated using corrected cross-validation techniques. The results showed that tree-based boosting algorithms consistently outperformed other models in accuracy, precision, F1-score, and ROC-AUC, while the kernel-based approach excelled in recall. Logistic regression proved to be the most computationally efficient model despite its weaker predictive power. The statistical analyses made it clear that the choice of an appropriate cross-validation protocol and accounting for overlapping data splits are crucial to reduce bias and ensure reliable comparisons. Overall, the results indicate that ensemble methods generally provide robust classification performance for innovation prediction tasks. However, individual models may still prove advantageous under certain metric-specific conditions or computational constraints. These observations emphasize the need to match model selection with data structure, performance objectives, and practical resource constraints when predicting and improving innovation outcomes at the firm level.
format Article
id doaj-art-6684046e6f8d44bb9904d58ec9e59754
institution OA Journals
issn 2076-3417
language English
publishDate 2025-03-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj-art-6684046e6f8d44bb9904d58ec9e59754
Timestamp: 2025-08-20T02:15:55Z
Language: eng
Publisher: MDPI AG
Journal: Applied Sciences
ISSN: 2076-3417
Publication date: 2025-03-01
Volume 15, Issue 7, Article 3636
DOI: 10.3390/app15073636
Title: Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach
Authors and affiliations:
Marko Martinović (0): Technical Department, University of Slavonski Brod, Trg Ivane Brlić Mažuranić 2, 35000 Slavonski Brod, Croatia
Kristian Dokic (1): Department of Information and Communication Sciences, Faculty of Tourism and Rural Development, University of Osijek, Vukovarska 17, 34000 Požega, Croatia
Dalibor Pudić (2): Department of Business Economics, University North, Ulica Jurja Križanića 31b, 42000 Varaždin, Croatia
Online access: https://www.mdpi.com/2076-3417/15/7/3636
Keywords: innovation prediction; machine learning; ensemble methods; cross-validation; classification performance; computational efficiency
title Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach
title_sort comparative analysis of machine learning models for predicting innovation outcomes an applied ai approach
topic innovation prediction
machine learning
ensemble methods
cross-validation
classification performance
computational efficiency
url https://www.mdpi.com/2076-3417/15/7/3636
work_keys_str_mv AT markomartinovic comparativeanalysisofmachinelearningmodelsforpredictinginnovationoutcomesanappliedaiapproach
AT kristiandokic comparativeanalysisofmachinelearningmodelsforpredictinginnovationoutcomesanappliedaiapproach
AT daliborpudic comparativeanalysisofmachinelearningmodelsforpredictinginnovationoutcomesanappliedaiapproach