Estimating strawberry weight for grading by picking robot with point cloud completion and multimodal fusion network
Abstract Strawberry grading by picking robots can eliminate manual classification, reducing labor costs and minimizing damage to the fruit. Strawberry size or weight is a key factor in grading, with accurate weight estimation being crucial for proper classification. In this paper, we collect...
| Main Authors: | Yiming Chen, Wei Wang, Junchao Chen, Jizhou Deng, Yuanping Xiang, Bo Qiao, Xinghui Zhu, Changyun Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | Scientific Reports |
| Subjects: | Deep learning; Point cloud completion; Weight Estimation; Feature fusion |
| Online Access: | https://doi.org/10.1038/s41598-025-92641-1 |
| _version_ | 1850265230536343552 |
|---|---|
| author | Yiming Chen, Wei Wang, Junchao Chen, Jizhou Deng, Yuanping Xiang, Bo Qiao, Xinghui Zhu, Changyun Li |
| author_sort | Yiming Chen |
| collection | DOAJ |
| description | Abstract Strawberry grading by picking robots can eliminate manual classification, reducing labor costs and minimizing damage to the fruit. Strawberry size or weight is a key factor in grading, with accurate weight estimation being crucial for proper classification. In this paper, we collected 1521 sets of strawberry RGB-D images using a depth camera and manually measured the weight and size of the strawberries to construct a training dataset for the strawberry weight regression model. To address the issue of incomplete depth images caused by environmental interference with depth cameras, this study proposes a multimodal point cloud completion method specifically designed for symmetrical objects, leveraging RGB images to guide the completion of depth images in the same scene. The method follows a process of locating strawberry pixel regions, calculating centroid coordinates, determining the symmetry axis via PCA, and completing the depth image. Based on this approach, a multimodal fusion regression model for strawberry weight estimation, named MMF-Net, is developed. The model uses the completed point cloud and RGB image as inputs, and extracts features from the RGB image and point cloud with EfficientNet and PointNet, respectively. These features are then integrated at the feature level through gradient blending, combining the strengths of both modalities. Using the Percent Correct Weight (PCW) metric as the evaluation standard, this study compares the performance of four traditional machine learning methods (Support Vector Regression (SVR), Multilayer Perceptron (MLP), Linear Regression, and Random Forest Regression), four point cloud-based deep learning models (PointNet, PointNet++, PointMLP, and Point Cloud Transformer), and two image-based deep learning models (EfficientNet and ResNet) on single-modal datasets. The results indicate that among traditional machine learning methods, the SVR model achieved the best performance with an accuracy of 77.7% (PCW@0.2). Among deep learning methods, the image-based EfficientNet model obtained the highest accuracy, reaching 85% (PCW@0.2), while the PointNet++ model demonstrated the best performance among point cloud-based models, with an accuracy of 54.3% (PCW@0.2). The proposed multimodal fusion model, MMF-Net, achieved an accuracy of 87.66% (PCW@0.2), significantly outperforming both traditional machine learning methods and single-modal deep learning models in terms of precision. |
| format | Article |
| id | doaj-art-80c27e42124241bda62dd78060260742 |
| institution | OA Journals |
| issn | 2045-2322 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Scientific Reports |
| spelling | doaj-art-80c27e42124241bda62dd78060260742; 2025-08-20T01:54:30Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2025-04-01; vol. 15, no. 1, pp. 1-19; 10.1038/s41598-025-92641-1; Yiming Chen (Hunan Agricultural University); Wei Wang (Hunan Agricultural University); Junchao Chen (Tencent Music Entertainment); Jizhou Deng (Hunan Agricultural University); Yuanping Xiang (Hunan Agricultural University); Bo Qiao (Hunan Agricultural University); Xinghui Zhu (Hunan Agricultural University); Changyun Li (Hunan Agricultural University) |
| title | Estimating strawberry weight for grading by picking robot with point cloud completion and multimodal fusion network |
| topic | Deep learning; Point cloud completion; Weight Estimation; Feature fusion |
| url | https://doi.org/10.1038/s41598-025-92641-1 |
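
The abstract outlines two steps that are easy to make concrete: the symmetry-based depth completion (locate the fruit region, compute its centroid, find the symmetry axis via PCA, fill the missing depth) and the PCW@0.2 score. The Python sketch below illustrates both under stated assumptions and is not the authors' implementation. It assumes the fruit mask is already available from the RGB image, that PCA is applied to the 2-D pixel coordinates of the mask, that missing depth is encoded as 0, and that PCW at threshold t is the fraction of samples whose relative weight error is at most t. The function names `symmetry_axis`, `complete_depth`, and `pcw` are hypothetical.

```python
import numpy as np


def symmetry_axis(mask):
    """Centroid and principal-axis direction of a boolean mask, via PCA.

    Assumption: the symmetry axis is taken as the largest-variance
    direction of the mask's (x, y) pixel coordinates.
    """
    rows, cols = np.nonzero(mask)
    pts = np.stack([cols, rows], axis=1).astype(float)  # (x, y) per pixel
    centroid = pts.mean(axis=0)
    cov = np.cov((pts - centroid).T)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return centroid, eigvecs[:, -1]   # last column: largest variance


def complete_depth(depth, mask):
    """Fill zero-valued depth inside the mask from the mirrored pixel.

    Assumption: a depth of 0 marks a missing measurement.
    """
    centroid, d = symmetry_axis(mask)
    out = depth.copy()
    rows, cols = np.nonzero(mask & (depth == 0))
    for r, c in zip(rows, cols):
        v = np.array([c, r], dtype=float) - centroid
        # Reflect v across the line through the centroid with direction d.
        mirrored = centroid + 2.0 * np.dot(v, d) * d - v
        mc, mr = np.rint(mirrored).astype(int)
        inside = 0 <= mr < depth.shape[0] and 0 <= mc < depth.shape[1]
        if inside and mask[mr, mc] and depth[mr, mc] > 0:
            out[r, c] = depth[mr, mc]
    return out


def pcw(y_true, y_pred, threshold=0.2):
    """Fraction of predictions with relative error <= threshold.

    Assumption: this is how PCW@threshold is defined; the record itself
    does not spell out the formula.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel_err = np.abs(y_pred - y_true) / y_true
    return float(np.mean(rel_err <= threshold))
```

Because the mask comes from the RGB image rather than the depth map, holes in the depth data do not shift the estimated axis, which is one plausible reading of "leveraging RGB images to guide the completion of depth images". Pixels whose mirror image is also missing are left unfilled in this sketch.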