Rep-ViG-Apple: A CNN-GCN Hybrid Model for Apple Detection in Complex Orchard Environments

Document type: Foreign journal article

First author: Han, Bo

Authors: Han, Bo; Lu, Ziao; Zhang, Jingjing; Dong, Luan; Almodfer, Rolla; Wang, Zhengting; Sun, Wei

Author affiliations:

Keywords: apple detection; orchard complex environments; structural reparameterization; SVGA; data augmentation; model pruning; intelligent detection; machine vision

Journal: AGRONOMY-BASEL (Impact factor: 3.4; 5-year impact factor: 3.8)

ISSN:

Year/Volume/Issue: 2024, Vol. 14, Issue 8

Pages:

Indexed in: SCI

Abstract: Accurately recognizing apples in complex environments is essential for automating apple picking operations, particularly under challenging natural conditions such as cloudy, snowy, foggy, and rainy weather, as well as low-light situations. To overcome the challenges of reduced apple target detection accuracy due to branch occlusion, apple overlap, and variations between near and far field scales, we propose the Rep-ViG-Apple algorithm, an advanced version of the YOLO model. The Rep-ViG-Apple algorithm features a sophisticated architecture designed to enhance apple detection performance in difficult conditions. To improve feature extraction for occluded and overlapping apple targets, we developed the inverted residual multi-scale structural reparameterized feature extraction block (RepIRD Block) within the backbone network. We also integrated the sparse graph attention mechanism (SVGA) to capture global feature information, concentrate attention on apples, and reduce interference from complex environmental features. Moreover, we designed a feature extraction network with a CNN-GCN architecture, termed Rep-Vision-GCN. This network combines the local multi-scale feature extraction capabilities of a convolutional neural network (CNN) with the global modeling strengths of a graph convolutional network (GCN), enhancing the extraction of apple features. The RepConvsBlock module, embedded in the neck network, forms the Rep-FPN-PAN feature fusion network, which improves the recognition of apple targets across various scales, both near and far. Furthermore, we implemented a channel pruning algorithm based on LAMP scores to balance computational efficiency with model accuracy. Experimental results demonstrate that the Rep-ViG-Apple algorithm achieves precision, recall, and average accuracy of 92.5%, 85.0%, and 93.3%, respectively, marking improvements of 1.5%, 1.5%, and 2.0% over YOLOv8n. Additionally, the Rep-ViG-Apple model benefits from a 22% reduction in size, enhancing its efficiency and suitability for deployment in resource-constrained environments while maintaining high accuracy.
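The pruning step in the abstract is based on LAMP scores. As a point of reference only, below is a minimal sketch, assuming PyTorch, of the standard LAMP (layer-adaptive magnitude-based pruning) weight-scoring rule with a global ranking step; the paper itself prunes at the channel level, and its exact channel grouping and sparsity targets are not given here, so the function names, the restriction to `Conv2d` layers, and the 30% sparsity value are illustrative assumptions rather than the authors' procedure.

```python
# Sketch: LAMP scoring of layer weights, followed by a global pruning mask.
# Assumes PyTorch; names and the 0.3 sparsity target are illustrative.
import torch


def lamp_scores(weight: torch.Tensor) -> torch.Tensor:
    """Return LAMP scores with the same shape as `weight`.

    For each weight w in a layer, the LAMP score is
        w^2 / (sum of w'^2 over all weights w' in the layer with |w'| >= |w|),
    so scores are normalized within a layer and comparable across layers.
    """
    flat = weight.detach().flatten()
    sq = flat.pow(2)
    # Sort squared magnitudes in descending order; the cumulative sum then gives,
    # for each weight, the denominator over weights at least as large as itself.
    sorted_sq, order = torch.sort(sq, descending=True)
    denom_sorted = torch.cumsum(sorted_sq, dim=0)
    scores_sorted = sorted_sq / denom_sorted
    # Scatter the scores back to the original weight positions.
    scores = torch.empty_like(sq)
    scores[order] = scores_sorted
    return scores.view_as(weight)


def global_lamp_mask(model: torch.nn.Module, sparsity: float = 0.3) -> dict:
    """Rank all Conv2d weights by LAMP score and mask out the lowest fraction."""
    named_scores = {
        name: lamp_scores(m.weight)
        for name, m in model.named_modules()
        if isinstance(m, torch.nn.Conv2d)
    }
    all_scores = torch.cat([s.flatten() for s in named_scores.values()])
    threshold = torch.quantile(all_scores, sparsity)
    # Weights whose score falls below the global threshold are pruned (mask = 0).
    return {name: (s >= threshold).float() for name, s in named_scores.items()}
```

Because the scores are normalized within each layer before being ranked globally, LAMP assigns each layer its own effective sparsity instead of a uniform ratio, which is what makes it attractive for trading model size against accuracy as described in the abstract.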

Classification code:
