Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach
Predicting innovation outcomes at the firm level continues to be an important but challenging goal for researchers and practitioners alike. In this study, multiple machine learning models, encompassing both ensemble-based and single-model approaches, were applied to data from the Community Innovatio...
| Main Authors: | Marko Martinović, Kristian Dokic, Dalibor Pudić |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Applied Sciences |
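| Subjects: | innovation prediction; machine learning; ensemble methods; cross-validation; classification performance; computational efficiency |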
| Online Access: | https://www.mdpi.com/2076-3417/15/7/3636 |
| _version_ | 1850188246718349312 |
|---|---|
| author | Marko Martinović Kristian Dokic Dalibor Pudić |
| author_facet | Marko Martinović Kristian Dokic Dalibor Pudić |
| author_sort | Marko Martinović |
| collection | DOAJ |
| description | Predicting innovation outcomes at the firm level continues to be an important but challenging goal for researchers and practitioners alike. In this study, multiple machine learning models, encompassing both ensemble-based and single-model approaches, were applied to data from the Community Innovation Survey. Methods included random forests, gradient boosting frameworks, support vector machines, neural networks, and logistic regression, each with hyperparameters optimized through Bayesian search routines and evaluated using corrected cross-validation techniques. The results showed that tree-based boosting algorithms consistently outperformed other models in accuracy, precision, F1-score, and ROC-AUC, while the kernel-based approach excelled in recall. Logistic regression proved to be the most computationally efficient model despite its weaker predictive power. The statistical analyses made it clear that the choice of an appropriate cross-validation protocol and accounting for overlapping data splits are crucial to reduce bias and ensure reliable comparisons. Overall, the results indicate that ensemble methods generally provide robust classification performance for innovation prediction tasks. However, individual models may still prove advantageous under certain metric-specific conditions or computational constraints. These observations emphasize the need to match model selection with data structure, performance objectives, and practical resource constraints when predicting and improving innovation outcomes at the firm level. |
| format | Article |
| id | doaj-art-6684046e6f8d44bb9904d58ec9e59754 |
| institution | OA Journals |
| issn | 2076-3417 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Applied Sciences |
| spelling | doaj-art-6684046e6f8d44bb9904d58ec9e59754; 2025-08-20T02:15:55Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2025-03-01; vol. 15, no. 7, art. 3636; 10.3390/app15073636; Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach; Marko Martinović [0]; Kristian Dokic [1]; Dalibor Pudić [2]; [0] Technical Department, University of Slavonski Brod, Trg Ivane Brlić Mažuranić 2, 35000 Slavonski Brod, Croatia; [1] Department of Information and Communication Sciences, Faculty of Tourism and Rural Development, University of Osijek, Vukovarska 17, 34000 Požega, Croatia; [2] Department of Business Economics, University North, Ulica Jurja Križanića 31b, 42000 Varaždin, Croatia; Predicting innovation outcomes at the firm level continues to be an important but challenging goal for researchers and practitioners alike. In this study, multiple machine learning models, encompassing both ensemble-based and single-model approaches, were applied to data from the Community Innovation Survey. Methods included random forests, gradient boosting frameworks, support vector machines, neural networks, and logistic regression, each with hyperparameters optimized through Bayesian search routines and evaluated using corrected cross-validation techniques. The results showed that tree-based boosting algorithms consistently outperformed other models in accuracy, precision, F1-score, and ROC-AUC, while the kernel-based approach excelled in recall. Logistic regression proved to be the most computationally efficient model despite its weaker predictive power. The statistical analyses made it clear that the choice of an appropriate cross-validation protocol and accounting for overlapping data splits are crucial to reduce bias and ensure reliable comparisons. Overall, the results indicate that ensemble methods generally provide robust classification performance for innovation prediction tasks. However, individual models may still prove advantageous under certain metric-specific conditions or computational constraints. These observations emphasize the need to match model selection with data structure, performance objectives, and practical resource constraints when predicting and improving innovation outcomes at the firm level.; https://www.mdpi.com/2076-3417/15/7/3636; innovation prediction; machine learning; ensemble methods; cross-validation; classification performance; computational efficiency |
| spellingShingle | Marko Martinović Kristian Dokic Dalibor Pudić Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach Applied Sciences innovation prediction machine learning ensemble methods cross-validation classification performance computational efficiency |
| title | Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach |
| title_full | Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach |
| title_fullStr | Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach |
| title_full_unstemmed | Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach |
| title_short | Comparative Analysis of Machine Learning Models for Predicting Innovation Outcomes: An Applied AI Approach |
| title_sort | comparative analysis of machine learning models for predicting innovation outcomes an applied ai approach |
| topic | innovation prediction machine learning ensemble methods cross-validation classification performance computational efficiency |
| url | https://www.mdpi.com/2076-3417/15/7/3636 |
| work_keys_str_mv | AT markomartinovic comparativeanalysisofmachinelearningmodelsforpredictinginnovationoutcomesanappliedaiapproach AT kristiandokic comparativeanalysisofmachinelearningmodelsforpredictinginnovationoutcomesanappliedaiapproach AT daliborpudic comparativeanalysisofmachinelearningmodelsforpredictinginnovationoutcomesanappliedaiapproach |
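
The description above notes that accounting for overlapping data splits matters when comparing models under cross-validation. The record does not say which correction the authors applied, so the following is only a minimal sketch of one widely used option, the Nadeau-Bengio corrected resampled t statistic. The function name, the score differences, and the split sizes are hypothetical and chosen purely for illustration.

```python
import math
import statistics


def corrected_resampled_t(diffs, n_train, n_test):
    """Nadeau-Bengio corrected resampled t statistic (illustrative sketch).

    diffs   : per-split score differences (model A minus model B)
    n_train : number of training samples in each split
    n_test  : number of test samples in each split
    Returns (t_statistic, degrees_of_freedom).
    """
    j = len(diffs)
    if j < 2:
        raise ValueError("need at least two splits")
    mean_diff = statistics.fmean(diffs)
    var_diff = statistics.variance(diffs)  # unbiased sample variance
    if var_diff == 0:
        raise ValueError("score differences have zero variance")
    # The naive variance of the mean is var_diff / j. Training sets from
    # different splits overlap, so the differences are not independent;
    # the extra n_test / n_train term inflates the variance to compensate.
    corrected_var = (1.0 / j + n_test / n_train) * var_diff
    return mean_diff / math.sqrt(corrected_var), j - 1


# Hypothetical accuracy differences from 5 splits of 1000 samples (800/200).
example_diffs = [0.02, 0.01, 0.03, 0.02, 0.02]
t_stat, dof = corrected_resampled_t(example_diffs, n_train=800, n_test=200)
print(f"corrected t = {t_stat:.3f} with {dof} degrees of freedom")
```

Because the correction only enlarges the variance estimate, the corrected statistic is always smaller in magnitude than the uncorrected paired t statistic on the same differences, which makes spurious "significant" differences between models less likely.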