A Patch-Level Data Synthesis Pipeline Enhances Species-Level Crop and Weed Segmentation in Natural Agricultural Scenes
Species-level crop and weed semantic segmentation in agricultural field images enables plant identification and enhanced precision weed management. However, the scarcity of labeled data poses significant challenges for model development. Here, we report a patch-level synthetic data generation pipeli...
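
The record's full description says the pipeline creates training samples by pasting patches of segmented plants onto soil backgrounds. A minimal sketch of that copy-paste step is given here for orientation only. It is not the authors' implementation: the function name `paste_patch`, its parameters, and the toy arrays are all hypothetical, and the published pipeline may include steps this sketch omits.

```python
import numpy as np


def paste_patch(background, label_map, patch, patch_mask, class_id, top, left):
    """Paste the masked pixels of `patch` onto a copy of `background`.

    Hypothetical helper, not taken from the article.
    background: (H, W, 3) soil image; label_map: (H, W) class ids.
    patch: (h, w, 3) plant crop; patch_mask: (h, w) bool, True on plant pixels.
    """
    h, w = patch_mask.shape
    if top < 0 or left < 0 or top + h > background.shape[0] or left + w > background.shape[1]:
        raise ValueError("patch does not fit inside the background")

    out_img = background.copy()
    out_lbl = label_map.copy()
    region = (slice(top, top + h), slice(left, left + w))
    # Basic slicing returns a view, so the masked assignment writes into the copies.
    out_img[region][patch_mask] = patch[patch_mask]
    out_lbl[region][patch_mask] = class_id
    return out_img, out_lbl


# Toy data: uniform "soil", one square "plant" patch labelled as class 1.
soil = np.full((64, 64, 3), 120, dtype=np.uint8)
labels = np.zeros((64, 64), dtype=np.uint8)
plant = np.full((16, 16, 3), 200, dtype=np.uint8)
mask = np.zeros((16, 16), dtype=bool)
mask[4:12, 4:12] = True

image, target = paste_patch(soil, labels, plant, mask, class_id=1, top=10, left=20)
```
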
Main Authors: | Tang Li, James Burridge, Pieter M. Blok, Wei Guo |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2025-01-01 |
Series: | Agriculture |
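Subjects: | data synthesis; data augmentation; generative model; semantic segmentation; precision weed management |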
Online Access: | https://www.mdpi.com/2077-0472/15/2/138 |
_version_ | 1832589493513748480 |
---|---|
author | Tang Li; James Burridge; Pieter M. Blok; Wei Guo |
author_facet | Tang Li; James Burridge; Pieter M. Blok; Wei Guo |
author_sort | Tang Li |
collection | DOAJ |
description | Species-level crop and weed semantic segmentation in agricultural field images enables plant identification and enhanced precision weed management. However, the scarcity of labeled data poses significant challenges for model development. Here, we report a patch-level synthetic data generation pipeline that improves semantic segmentation performance in natural agricultural scenes by creating realistic training samples, achieved by pasting patches of segmented plants onto soil backgrounds. This pipeline effectively preserves foreground context and ensures diverse and accurate samples, thereby enhancing model generalization. The semantic segmentation performance of the baseline model was higher when trained solely on data synthesized by our proposed method than when trained solely on real data, with the mean intersection over union (mIoU) increasing by approximately 1.1% (from 0.626 to 0.633). Building on this, we created hybrid datasets by combining synthetic and real data and investigated the impact of synthetic data volume. By increasing the number of synthetic images in these hybrid datasets from 1× to 20×, we observed a substantial performance improvement, with mIoU increasing by 15% at 15×. However, the gains diminish beyond this point, with the optimal balance between accuracy and efficiency achieved at 10×. These findings highlight synthetic data as a scalable and effective augmentation strategy for addressing the challenges of limited labeled data in agriculture. |
format | Article |
id | doaj-art-ae896e633ad641f6a84a11027af2d15b |
institution | Kabale University |
issn | 2077-0472 |
language | English |
publishDate | 2025-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Agriculture |
spelling | doaj-art-ae896e633ad641f6a84a11027af2d15b; 2025-01-24T13:15:51Z; eng; MDPI AG; Agriculture; 2077-0472; 2025-01-01; 15; 2; 138; 10.3390/agriculture15020138; A Patch-Level Data Synthesis Pipeline Enhances Species-Level Crop and Weed Segmentation in Natural Agricultural Scenes; Tang Li, James Burridge, Pieter M. Blok, Wei Guo (all: Laboratory of Field Phenomics, Institute for Sustainable Agro-Ecosystem Services, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 188-0002, Japan); https://www.mdpi.com/2077-0472/15/2/138; data synthesis; data augmentation; generative model; semantic segmentation; precision weed management |
spellingShingle | Tang Li; James Burridge; Pieter M. Blok; Wei Guo; A Patch-Level Data Synthesis Pipeline Enhances Species-Level Crop and Weed Segmentation in Natural Agricultural Scenes; Agriculture; data synthesis; data augmentation; generative model; semantic segmentation; precision weed management |
title | A Patch-Level Data Synthesis Pipeline Enhances Species-Level Crop and Weed Segmentation in Natural Agricultural Scenes |
title_sort | patch level data synthesis pipeline enhances species level crop and weed segmentation in natural agricultural scenes |
topic | data synthesis; data augmentation; generative model; semantic segmentation; precision weed management |
url | https://www.mdpi.com/2077-0472/15/2/138 |
work_keys_str_mv | AT tangli apatchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes AT jamesburridge apatchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes AT pietermblok apatchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes AT weiguo apatchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes AT tangli patchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes AT jamesburridge patchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes AT pietermblok patchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes AT weiguo patchleveldatasynthesispipelineenhancesspecieslevelcropandweedsegmentationinnaturalagriculturalscenes |
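
The description reports results as mean intersection over union (mIoU) and states a gain of approximately 1.1% (from 0.626 to 0.633). The sketch below shows one common way to compute mIoU from label maps and checks that the stated percentage is consistent with a relative increase. The function names `mean_iou` and `relative_gain` and the toy arrays are assumptions for illustration; the article's exact evaluation protocol (for example, how absent classes are handled) is not given in this record.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Average per-class IoU, skipping classes absent from both maps (an assumed convention)."""
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class appears in neither prediction nor ground truth
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")


def relative_gain(baseline, improved):
    """Relative change from `baseline` to `improved`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Toy 2x2 label maps with two classes.
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 1]])
toy_miou = mean_iou(pred, target, num_classes=2)

# Figures quoted in the description: real-only 0.626, synthetic-only 0.633.
gain = relative_gain(0.626, 0.633)
print(f"relative gain: {gain:.2f}%")
```

Read as a relative change, (0.633 - 0.626) / 0.626 is about 1.12%, which matches the "approximately 1.1%" in the description; the absolute difference is 0.007 mIoU.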