Lightweight-Improved YOLOv5s Model for Grape Fruit and Stem Recognition

Document type: Foreign-language journal

First author: Zhao, Junhong

Authors: Zhao, Junhong; Yao, Xingzhi; Wang, Yu; Yi, Zhenfeng; Zhao, Junhong; Xie, Yuming; Zhou, Xingxing

Author affiliations:

Keywords: YOLOv5s; lightweight; target detection; mechanized picking; grape fruits and stems

Journal: AGRICULTURE-BASEL (impact factor: 3.6; five-year impact factor: 3.6)

ISSN:

Year, volume, issue: 2024, Vol. 14, No. 5

Pages:

Indexed in: SCI

Abstract: Mechanized harvesting is a key technology for addressing the high cost and low efficiency of manual harvesting, and it depends on accurate, fast identification and localization of targets. This paper proposes a lightweight improved YOLOv5s model for efficiently identifying grape fruits and stems. First, the CSP module in YOLOv5s is improved with the Ghost module, which reduces model parameters through ghost feature maps and cost-effective linear operations. Second, traditional convolutions are replaced with deep convolutions to further reduce the model's computational load. The model is trained on datasets covering different conditions (normal light, low light, strong light, noise) to improve its generalization and robustness. Applied to the recognition of grape fruits and stems, the model achieves an overall accuracy, recall rate, mAP, and F1 score of 96.8%, 97.7%, 98.6%, and 97.2%, respectively. The average detection time on a GPU is 4.5 ms, with a frame rate of 221 FPS, and the weight file generated during training is 5.8 MB. Compared to the original YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x models under the specific orchard environment of a grape greenhouse, the proposed model improves accuracy by 1%, decreases the recall rate by 0.2%, increases the F1 score by 0.4%, and maintains the same mAP. The weight size is 61.1% smaller than that of the original model and is only 1.8% and 5.5% of the Faster-RCNN and SSD model sizes, respectively. The FPS is 43.5% higher than that of the original model and is 11.05 times and 8.84 times that of the Faster-RCNN and SSD models, respectively. On a CPU, the average detection time is 23.9 ms, with a frame rate of 41.9 FPS, a 31% improvement over the original model. The test results demonstrate that the proposed lightweight-improved YOLOv5s model significantly reduces model size and increases recognition speed while maintaining accuracy, and can provide fast and accurate identification and localization for robotic harvesting.
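
The arithmetic behind the figures in the abstract can be checked with a short Python sketch. It is illustrative only and is not the authors' code. The metric helpers use the standard definitions (F1 as the harmonic mean of precision and recall, FPS as the reciprocal of per-frame latency), under the assumption that the abstract's "accuracy" plays the role of precision. The parameter-count helpers follow the usual Ghost-module formulation (a primary convolution producing 1/ratio of the output channels, with the rest generated by cheap depthwise operations) and assume that "deep convolutions" refers to depthwise convolutions. The layer shape (128 channels, 3x3 kernels), `ratio=2`, and `cheap_k=3` are hypothetical values chosen for illustration, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second implied by an average per-frame latency in ms."""
    return 1000.0 / latency_ms


def size_before_reduction(size_after: float, reduction: float) -> float:
    """Original size implied by a reduced size and a fractional reduction."""
    return size_after / (1.0 - reduction)


def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_params(channels: int, k: int) -> int:
    """Weight count of a depthwise k x k convolution (one filter per channel)."""
    return channels * k * k


def ghost_params(c_in: int, c_out: int, k: int,
                 ratio: int = 2, cheap_k: int = 3) -> int:
    """Weight count of a Ghost module (assumed formulation, bias ignored).

    A primary convolution produces c_out // ratio intrinsic feature maps;
    the remaining maps come from (ratio - 1) cheap depthwise operations.
    Assumes c_out is divisible by ratio.
    """
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    cheap = (ratio - 1) * depthwise_params(intrinsic, cheap_k)
    return primary + cheap


if __name__ == "__main__":
    # Values reported in the abstract.
    print("F1 (%):", round(100 * f1_score(0.968, 0.977), 1))
    print("GPU FPS from 4.5 ms:", round(fps_from_latency_ms(4.5), 1))
    print("CPU FPS from 23.9 ms:", round(fps_from_latency_ms(23.9), 1))
    print("Implied original weight size (MB):",
          round(size_before_reduction(5.8, 0.611), 1))

    # Hypothetical 128 -> 128 channel layer with 3x3 kernels.
    print("standard conv params:", conv_params(128, 128, 3))
    print("ghost module params:", ghost_params(128, 128, 3))
    print("depthwise conv params:", depthwise_params(128, 3))
```

The frame rates computed from the rounded latencies differ slightly from the reported 221 FPS and 41.9 FPS, which is what one would expect if the published latencies were rounded to one decimal place. For the hypothetical layer, the Ghost variant needs roughly half the weights of the standard convolution, which matches the parameter reduction the abstract attributes to ghost feature maps.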

Classification number:
