SD-YOLOv8: SAM-Assisted Dual-Branch YOLOv8 Model for Tea Bud Detection on Optical Images

Bibliographic Details
Main Authors: Xintong Zhang, Dasheng Wu, Fengya Xu
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Agriculture
Online Access: https://www.mdpi.com/2077-0472/15/7/712
Description
Summary: Accurate tea bud detection in optical images with complex backgrounds is challenging because the foregrounds and backgrounds of these images remain difficult to distinguish. Although several studies have implicitly separated foregrounds from backgrounds via various attention mechanisms, explicit foreground–background distinction has seldom been explored. Inspired by recent successful applications of the Segment Anything Model (SAM) in computer vision, this study proposes a SAM-assisted dual-branch YOLOv8 model, named SD-YOLOv8, for tea bud detection that addresses this challenge through explicit foreground–background distinction. The SD-YOLOv8 model consists of two key components: (1) a SAM-based foreground segmenter (SFS) that generates foreground masks of tea bud images without any training, and (2) a heterogeneous feature extractor that captures, in parallel, color features from the optical images and edge features from the foreground masks. Experimental results show that the proposed SD-YOLOv8 significantly improves tea bud detection through this explicit distinction between foregrounds and backgrounds. The mean Average Precision (mAP) of SD-YOLOv8 reaches 86.0%, surpassing YOLOv8 (mAP 81.6%) by 4.4 percentage points and outperforming recent object detection models, including Faster R-CNN (mAP 60.7%), DETR (mAP 64.6%), YOLOv5 (mAP 72.4%), and YOLOv7 (mAP 80.6%), which demonstrates its superior capability in detecting tea buds against complex backgrounds. Additionally, this study presents a self-built tea bud dataset spanning three seasons to address data shortages in tea bud detection.
ISSN: 2077-0472
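
The abstract above outlines a two-stage pipeline: a training-free SAM pass produces a foreground mask, and a dual-branch extractor then processes the RGB image and the mask in parallel before fusing their features. The paper's own code is not part of this record; the sketch below is only a minimal illustration of that idea, combining the public segment-anything API (facebookresearch/segment-anything) with a toy PyTorch fusion module. The module name DualBranchExtractor, the checkpoint path, the channel widths, the union-of-masks foreground heuristic, and the concatenation fusion are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the SD-YOLOv8 idea described in the abstract:
# (1) a training-free SAM pass yields a foreground mask (the "SFS" role),
# (2) two parallel branches encode the RGB image and the mask,
# (3) their features are fused for a downstream detection head.
# Module names, channel widths, and the fusion scheme are assumptions.

import numpy as np
import torch
import torch.nn as nn


# --- (1) SAM-based foreground mask, no training required ----------------
# Uses the public segment-anything API; the checkpoint path is a placeholder.
def sam_foreground_mask(image_rgb: np.ndarray) -> np.ndarray:
    from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # placeholder
    masks = SamAutomaticMaskGenerator(sam).generate(image_rgb)
    # Naive heuristic: union of all proposed masks. How SD-YOLOv8 actually
    # selects tea-bud foreground regions is described in the paper itself.
    fg = np.zeros(image_rgb.shape[:2], dtype=bool)
    for m in masks:
        fg |= m["segmentation"]
    return fg.astype(np.float32)


# --- (2)+(3) dual-branch feature extraction and fusion -------------------
class DualBranchExtractor(nn.Module):
    """Toy stand-in for the heterogeneous feature extractor: one branch sees
    the 3-channel image (color cues), the other the 1-channel foreground mask
    (edge/shape cues); features are concatenated channel-wise and projected."""

    def __init__(self, out_ch: int = 64):
        super().__init__()
        self.color_branch = nn.Sequential(
            nn.Conv2d(3, out_ch, 3, stride=2, padding=1), nn.SiLU())
        self.edge_branch = nn.Sequential(
            nn.Conv2d(1, out_ch, 3, stride=2, padding=1), nn.SiLU())
        self.fuse = nn.Conv2d(2 * out_ch, out_ch, 1)

    def forward(self, image: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        f_color = self.color_branch(image)  # B x C x H/2 x W/2
        f_edge = self.edge_branch(mask)     # B x C x H/2 x W/2
        return self.fuse(torch.cat([f_color, f_edge], dim=1))


if __name__ == "__main__":
    img = torch.rand(1, 3, 640, 640)
    msk = torch.rand(1, 1, 640, 640)  # stand-in for a SAM foreground mask
    feats = DualBranchExtractor()(img, msk)
    print(feats.shape)  # torch.Size([1, 64, 320, 320])
```

The key property the sketch mirrors is that the SAM stage requires no training, so the mask acts as a fixed auxiliary input: only the two convolutional branches and the fusion layer would be learned alongside the detector.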