Semi-Automated Training of AI Vision Models

The adoption of AI vision models in specialized industries is often hindered by the substantial requirement for extensive, manually annotated image datasets. Even when employing transfer learning, robust model development typically necessitates tens of thousands of such images, a process that is time-consuming, costly, and demands consistent expert annotation. This technical note introduces a semi-automated method to significantly reduce this annotation burden. The proposed approach utilizes two general-purpose vision-transformer-to-caption (GP-ViTC) models to generate descriptive text from images. These captions are then processed by a custom-developed semantic classifier (SC), which requires only minimal training to predict the correct image class. This GP-ViTC + SC system demonstrated exemplary classification rates in test cases and can subsequently be used to automatically annotate large image datasets. While the inference speed of the GP-ViTC models is not suited for real-time applications (approximately 10 s per image), this method substantially lessens the labor and expertise required for dataset creation, thereby facilitating the development of new, high-speed, custom AI vision models for niche applications. This work details the approach and its successful application, offering a cost-effective pathway for generating tailored image training sets.


Bibliographic Details
Main Authors: Mathew G. Pelletier, John D. Wanjura, Greg A. Holt
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series:AgriEngineering
Subjects: machine vision; plastic contamination; cotton; automated inspection
Online Access: https://www.mdpi.com/2624-7402/7/7/225
collection DOAJ
description The adoption of AI vision models in specialized industries is often hindered by the substantial requirement for extensive, manually annotated image datasets. Even when employing transfer learning, robust model development typically necessitates tens of thousands of such images, a process that is time-consuming, costly, and demands consistent expert annotation. This technical note introduces a semi-automated method to significantly reduce this annotation burden. The proposed approach utilizes two general-purpose vision-transformer-to-caption (GP-ViTC) models to generate descriptive text from images. These captions are then processed by a custom-developed semantic classifier (SC), which requires only minimal training to predict the correct image class. This GP-ViTC + SC system demonstrated exemplary classification rates in test cases and can subsequently be used to automatically annotate large image datasets. While the inference speed of the GP-ViTC models is not suited for real-time applications (approximately 10 s per image), this method substantially lessens the labor and expertise required for dataset creation, thereby facilitating the development of new, high-speed, custom AI vision models for niche applications. This work details the approach and its successful application, offering a cost-effective pathway for generating tailored image training sets.
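The caption-then-classify pipeline the abstract describes can be sketched in miniature. The snippet below is an illustrative stand-in, not the authors' implementation: the captions, class labels, and the bag-of-words scoring are all hypothetical, standing in for real GP-ViTC caption output and the paper's custom semantic classifier. It shows the key economic point, that only a small set of captions needs human labeling before the classifier can label captions for new images automatically.

```python
from collections import Counter

STOPWORDS = {"a", "an", "the", "of", "on", "in", "with", "and", "through"}

def tokenize(text):
    # lowercase, strip trailing punctuation, drop common stopwords
    words = (w.lower().strip(".,") for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]

class SemanticClassifier:
    """Tiny bag-of-words classifier: scores a caption against per-class
    word counts accumulated from a handful of labeled captions."""
    def __init__(self):
        self.class_counts = {}

    def fit(self, captions, labels):
        for caption, label in zip(captions, labels):
            self.class_counts.setdefault(label, Counter()).update(tokenize(caption))

    def predict(self, caption):
        words = set(tokenize(caption))
        # pick the class whose accumulated vocabulary best overlaps the caption
        return max(self.class_counts,
                   key=lambda label: sum(self.class_counts[label][w] for w in words))

# Hypothetical GP-ViTC captions with manually assigned class labels;
# only this small set requires human annotation.
train = [
    ("a pile of white cotton fibers on a conveyor belt", "clean"),
    ("fluffy raw cotton moving through a gin stand", "clean"),
    ("a piece of yellow plastic wrap tangled in cotton", "contaminated"),
    ("shredded plastic bag fragments mixed with cotton lint", "contaminated"),
]
sc = SemanticClassifier()
sc.fit([c for c, _ in train], [l for _, l in train])
print(sc.predict("a strip of plastic film lying on cotton"))  # contaminated
```

In the paper's setting the predicted class would then be attached to the source image as its annotation, so the slow captioning models run once offline while the resulting labeled dataset trains a fast custom vision model.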
format Article
id doaj-art-bfa4e81413804730b2bf03cdb0ee5297
issn 2624-7402
language English
publishDate 2025-07-01
publisher MDPI AG
series AgriEngineering
doi 10.3390/agriengineering7070225
affiliation Lubbock Gin-Laboratory, Cotton Production and Processing Research Unit, United States Department of Agriculture, Agricultural Research Services, Lubbock, TX 79403, USA (all three authors)
topic machine vision
plastic contamination
cotton
automated inspection
url https://www.mdpi.com/2624-7402/7/7/225