Semantic Segmentation Method for High-Resolution Tomato Seedling Point Clouds Based on Sparse Convolution

Semantic segmentation of three-dimensional (3D) plant point clouds at the stem-leaf level is foundational and indispensable for high-throughput tomato phenotyping systems. However, existing semantic segmentation methods often suffer from issues such as low precision and slow inference speed. To addr...

Bibliographic Details
Main Authors: Shizhao Li, Zhichao Yan, Boxiang Ma, Shaoru Guo, Hongxia Song
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Agriculture
Subjects: 3D point clouds; semantic segmentation; tomato; sparse convolution
Online Access: https://www.mdpi.com/2077-0472/15/1/74
author Shizhao Li
Zhichao Yan
Boxiang Ma
Shaoru Guo
Hongxia Song
collection DOAJ
description Semantic segmentation of three-dimensional (3D) plant point clouds at the stem-leaf level is foundational for high-throughput tomato phenotyping systems. However, existing semantic segmentation methods often suffer from low precision and slow inference. To address these challenges, we propose VSCAFF, an encoder-decoder network that combines voxel sparse convolution (SpConv) with attention-based feature fusion to improve semantic segmentation of high-resolution tomato seedling point clouds. Tomato seedling point clouds from the Pheno4D dataset, labeled with the semantic classes ‘leaf’, ‘stem’, and ‘soil’, are used for the experiments. To reduce the number of parameters and thereby improve inference speed, the SpConv module combines a skeleton convolution kernel and a regular convolution kernel through residual concatenation. The attention-based feature fusion module assigns attention weights to the voxel diffusion features and the point features; this suppresses noise and avoids the ambiguity introduced by the diffusion module, in which points with different semantics end up with the same features. Finally, to counter the class bias in training caused by the uneven distribution of point cloud classes, a composite loss function of Lovász-Softmax and weighted cross-entropy is used to supervise model training. VSCAFF achieves an mIoU of 86.96%, outperforming PointNet, PointNet++, and DGCNN. Its per-class IoU is 99.63% for soil, 64.47% for stem, and 96.72% for leaf. Its inference latency of 35 ms is lower than that of PointNet++ and DGCNN. The results demonstrate that VSCAFF offers high accuracy and inference speed for semantic segmentation of high-resolution tomato point clouds and can provide technical support for high-throughput automatic phenotypic analysis of tomato plants.
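The composite loss mentioned in the description can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the integer class indices (0 = soil, 1 = stem, 2 = leaf), the inverse-frequency class weights, the weight-normalized cross-entropy, and the mixing factor `lam` are all assumptions made for illustration, since the record does not state how the two terms are weighted or combined.

```python
import math


def inverse_frequency_weights(labels, num_classes):
    """Assumed weighting scheme: rarer classes get larger weights."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    # Classes absent from the batch get weight 0 to avoid division by zero.
    return [n / (num_classes * c) if c > 0 else 0.0 for c in counts]


def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Cross-entropy where each point is scaled by its class weight,
    normalized by the sum of the weights actually used."""
    num = sum(-weights[y] * math.log(max(p[y], eps))
              for p, y in zip(probs, labels))
    den = sum(weights[y] for y in labels)
    return num / den


def lovasz_grad(fg_sorted):
    """Gradient of the Lovasz extension of the Jaccard loss, given
    foreground flags ordered by decreasing prediction error."""
    total_fg = sum(fg_sorted)
    grad, prev_jaccard = [], 0.0
    cum_fg = cum_bg = 0
    for fg in fg_sorted:
        cum_fg += fg
        cum_bg += 1 - fg
        jaccard = 1.0 - (total_fg - cum_fg) / (total_fg + cum_bg)
        grad.append(jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return grad


def lovasz_softmax(probs, labels, num_classes):
    """Lovasz-Softmax averaged over the classes present in the labels."""
    losses = []
    for c in range(num_classes):
        fg = [1 if y == c else 0 for y in labels]
        if sum(fg) == 0:
            continue  # skip classes that do not occur in this batch
        errors = [abs(f - p[c]) for f, p in zip(fg, probs)]
        order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
        grad = lovasz_grad([fg[i] for i in order])
        losses.append(sum(errors[i] * g for i, g in zip(order, grad)))
    return sum(losses) / len(losses) if losses else 0.0


def composite_loss(probs, labels, num_classes, lam=1.0):
    """Weighted cross-entropy plus lam times Lovasz-Softmax (lam is hypothetical)."""
    weights = inverse_frequency_weights(labels, num_classes)
    return (weighted_cross_entropy(probs, labels, weights)
            + lam * lovasz_softmax(probs, labels, num_classes))


# Toy batch dominated by soil points, with uninformative uniform predictions.
labels = [0, 0, 0, 0, 1, 2]
uniform = [[1 / 3, 1 / 3, 1 / 3] for _ in labels]
print(composite_loss(uniform, labels, num_classes=3))
```

In this sketch the cross-entropy term up-weights the rare stem points, while the Lovász-Softmax term acts as a surrogate for per-class IoU, which is the metric the description reports.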
format Article
id doaj-art-011bc01d4d9f4936bd45aa44739ace09
institution OA Journals
issn 2077-0472
language English
publishDate 2024-12-01
publisher MDPI AG
record_format Article
series Agriculture
spelling doaj-art-011bc01d4d9f4936bd45aa44739ace09; 2025-08-20T02:36:12Z; eng; MDPI AG; Agriculture; 2077-0472; 2024-12-01; vol. 15, no. 1, art. 74; 10.3390/agriculture15010074
Shizhao Li, Zhichao Yan, Boxiang Ma, Shaoru Guo: School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
Hongxia Song: College of Horticulture, Shanxi Agricultural University, Jinzhong 030801, China
title Semantic Segmentation Method for High-Resolution Tomato Seedling Point Clouds Based on Sparse Convolution
topic 3D point clouds
semantic segmentation
tomato
sparse convolution
url https://www.mdpi.com/2077-0472/15/1/74