Location Research and Picking Experiment of an Apple-Picking Robot Based on Improved Mask R-CNN and Binocular Vision

With the advancement of agricultural automation technologies, apple-harvesting robots have gradually become a focus of research. As their “perceptual core,” machine vision systems directly determine picking success rates and operational efficiency. However, existing vision systems still exhibit sign...

Full description

Saved in:
Bibliographic Details
Main Authors: Tianzhong Fang, Wei Chen, Lu Han
Format: Article
Language:English
Published: MDPI AG 2025-07-01
Series:Horticulturae
Subjects:
Online Access:https://www.mdpi.com/2311-7524/11/7/801
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:With the advancement of agricultural automation technologies, apple-harvesting robots have gradually become a focus of research. As their “perceptual core,” machine vision systems directly determine picking success rates and operational efficiency. However, existing vision systems still exhibit significant shortcomings in target detection and positioning accuracy in complex orchard environments (e.g., uneven illumination, foliage occlusion, and fruit overlap), which hinders practical applications. This study proposes a visual system for apple-harvesting robots based on improved Mask R-CNN and binocular vision to achieve more precise fruit positioning. The binocular camera (ZED2i) carried by the robot acquires dual-channel apple images. An improved Mask R-CNN is employed to implement instance segmentation of apple targets in binocular images, followed by a template-matching algorithm with parallel epipolar constraints for stereo matching. Four pairs of feature points from corresponding apples in binocular images are selected to calculate disparity and depth. Experimental results demonstrate average coefficients of variation and positioning accuracy of 5.09% and 99.61%, respectively, in binocular positioning. During harvesting operations with a self-designed apple-picking robot, the single-image processing time was 0.36 s, the average single harvesting cycle duration reached 7.7 s, and the comprehensive harvesting success rate achieved 94.3%. This work presents a novel high-precision visual positioning method for apple-harvesting robots.
ISSN:2311-7524