Increasing comprehensiveness and reducing workload in a systematic review of complex interventions using automated machine learning
Background As part of our ongoing systematic review of complex interventions for the primary prevention of cardiovascular diseases, we have developed and evaluated automated machine-learning classifiers for title and abstract screening. The aim was to develop a high-performing algorithm comparable t...
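| Main Authors: | Olalekan A Uthman, Rachel Court, Jodie Enderby, Lena Al-Khudairy, Chidozie Nduka, Hema Mistry, GJ Melendez-Torres, Sian Taylor-Phillips, Aileen Clarke |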
|---|---|
| Format: | Article |
| Language: | English |
| Published: | NIHR Journals Library, 2022-11-01 |
| Series: | Health Technology Assessment |
| Subjects: | text classification; reducing workload; machine learning |
| Online Access: | https://doi.org/10.3310/UDIR6682 |
| _version_ | 1849340397788069888 |
|---|---|
| author | Olalekan A Uthman; Rachel Court; Jodie Enderby; Lena Al-Khudairy; Chidozie Nduka; Hema Mistry; GJ Melendez-Torres; Sian Taylor-Phillips; Aileen Clarke |
| author_sort | Olalekan A Uthman |
| collection | DOAJ |
| description | Background As part of our ongoing systematic review of complex interventions for the primary prevention of cardiovascular diseases, we have developed and evaluated automated machine-learning classifiers for title and abstract screening. The aim was to develop a high-performing algorithm comparable to human screening. Methods We followed a three-phase process to develop and test an automated machine learning-based classifier for screening potential studies on interventions for primary prevention of cardiovascular disease. In the first phase, we labelled a total of 16,611 articles. In the second phase, we used the labelled articles to develop a machine learning-based classifier. In the third phase, we examined how well the classifiers labelled the papers. We evaluated five deep-learning models [i.e. parallel convolutional neural network (CNN), stacked CNN, parallel-stacked CNN, recurrent neural network (RNN) and CNN–RNN]. The models were evaluated using recall, precision and work saved over sampling at no less than 95% recall. Results We labelled a total of 16,611 articles, of which 676 (4.0%) were tagged as ‘relevant’ and 15,935 (96%) were tagged as ‘irrelevant’. The recall ranged from 51.9% to 96.6%. The precision ranged from 64.6% to 99.1%. The work saved over sampling ranged from 8.9% to as high as 92.1%. The best-performing model was parallel CNN, yielding a 96.4% recall, as well as 99.1% precision, and a potential workload reduction of 89.9%. Future work and limitations We used words from the title and the abstract only. Further work is needed to assess how performance changes when additional features, such as full document text, are added. The approach may also not transfer to other complex systematic reviews on different topics. Conclusion Our study shows that machine learning has the potential to significantly aid the labour-intensive screening of abstracts in systematic reviews of complex interventions. Future research should concentrate on enhancing the classifier system and determining how it can be integrated into the systematic review workflow. Funding This project was funded by the National Institute for Health and Care Research (NIHR) Health Technology Assessment programme and will be published in Health Technology Assessment. See the NIHR Journals Library website for further project information. |
| format | Article |
| id | doaj-art-a0b261f9584e40ed81201d50b18be74f |
| institution | Kabale University |
| issn | 2046-4924 |
| language | English |
| publishDate | 2022-11-01 |
| publisher | NIHR Journals Library |
| record_format | Article |
| series | Health Technology Assessment |
| spelling | doaj-art-a0b261f9584e40ed81201d50b18be74f; 2025-08-20T03:43:55Z; eng; NIHR Journals Library; Health Technology Assessment; 2046-4924; 2022-11-01; 2937; 10.3310/UDIR6682; NIHR135482. Affiliations: Olalekan A Uthman, Rachel Court, Jodie Enderby, Lena Al-Khudairy, Chidozie Nduka, Hema Mistry, Sian Taylor-Phillips and Aileen Clarke: Warwick Medical School, University of Warwick, Coventry, UK. GJ Melendez-Torres: Peninsula Technology Assessment Group (PenTAG), College of Medicine and Health, University of Exeter, Exeter, UK. |
| title | Increasing comprehensiveness and reducing workload in a systematic review of complex interventions using automated machine learning |
| topic | text classification; reducing workload; machine learning |
| url | https://doi.org/10.3310/UDIR6682 |
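
The description in this record reports recall, precision and work saved over sampling (WSS) for each classifier. As a minimal sketch of how these three measures are commonly computed from a screening confusion matrix, the Python below uses the widely cited definition WSS = (TN + FN) / N − (1 − recall). The record does not give the authors' exact formula or code, so the definition, the `ScreeningCounts` class, the function names and the example counts are all illustrative assumptions, not the study's data or implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Confusion-matrix counts for a title/abstract screening classifier.

    Hypothetical helper type; 'positive' means the article is tagged relevant.
    """
    tp: int  # relevant articles the classifier kept
    fp: int  # irrelevant articles the classifier kept
    tn: int  # irrelevant articles the classifier excluded
    fn: int  # relevant articles the classifier excluded (missed)


def recall(c: ScreeningCounts) -> float:
    """Share of truly relevant articles that the classifier kept."""
    return c.tp / (c.tp + c.fn)


def precision(c: ScreeningCounts) -> float:
    """Share of kept articles that are truly relevant."""
    return c.tp / (c.tp + c.fp)


def work_saved_over_sampling(c: ScreeningCounts) -> float:
    """WSS = (TN + FN) / N - (1 - recall).

    The first term is the fraction of articles a reviewer no longer reads;
    the second is the fraction of relevant articles a random sample of the
    same size would be expected to miss. This definition is an assumption
    here, since the record does not state the formula used.
    """
    n = c.tp + c.fp + c.tn + c.fn
    return (c.tn + c.fn) / n - (1.0 - recall(c))


# Made-up counts for illustration only (not taken from the study).
example = ScreeningCounts(tp=96, fp=4, tn=896, fn=4)
example_recall = recall(example)
example_precision = precision(example)
example_wss = work_saved_over_sampling(example)
```

When WSS is quoted "at 95% recall", the decision threshold is first chosen so that recall is at least 0.95, and the counts passed to these functions are those observed at that threshold.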