Positive Anchor Area Merge Algorithm: A Knowledge Distillation Algorithm for Fruit Detection Tasks Based on Yolov8

Bibliographic Details
Main Authors: Xiangqun Shi, Xian Zhang, Yifan Su, Xun Zhang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10897963/
Description
Summary: In agriculture, machine-vision fruit detection has significant research value and broad application prospects, including fruit growth monitoring, yield prediction, and fruit sorting. The Yolov8 model, as the latest model in the field of object detection, offers high execution efficiency and detection accuracy. However, in fruit object detection, that is, counting and locating target fruits in an image, the Yolov8 model performs noticeably worse than it does on the standard COCO dataset. Knowledge distillation, a highly versatile method in which a large teacher model guides the training of a smaller student model, can address this issue by improving the student model's detection accuracy. This article proposes a Yolov8 knowledge distillation method tailored for fruit recognition tasks, based on positive anchor area merging, to enhance detection accuracy. On our self-constructed fruit dataset, which contains over 3,000 images per category, we compared our model with similar state-of-the-art models in terms of resource consumption and detection accuracy. While maintaining a low resource overhead, our model achieved an mAP(50) of 99.47%, higher than the other models, which range from 99.1% to 99.3%. In the ablation experiments, we also demonstrated the practical significance of dividing the positive sample area. Finally, we deployed the model on an embedded system for real-time detection of on-site images. These experiments illustrate the practicality of our method for recognizing fruits in real-world scenarios.
ISSN: 2169-3536
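
Note on the teacher-student idea in the summary: the sketch below illustrates generic soft-target knowledge distillation, in which a student is penalized for diverging from a teacher's softened class distribution. It is not the article's positive anchor area merge method, whose details this record does not give. The function names, the example logits, and the temperature value are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic soft-target distillation (an assumption for illustration),
    not the loss used in the article.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits for one prediction over three classes.
teacher = [4.0, 1.0, 0.5]
student = [2.5, 1.5, 1.0]

print(distillation_loss(student, teacher))  # positive: student differs from teacher
print(distillation_loss(teacher, teacher))  # 0.0: identical distributions
```

In practice a term like this is usually added to the ordinary detection loss with a weighting factor, so the student learns from both the ground-truth labels and the teacher.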