
A Low-Altitude Remote Sensing Inspection Method on Rural Living Environments Based on a Modified YOLOv5s-ViT

Document type: Foreign-language journal article

Authors: Wang, Chunshan 1; Sun, Wei 1; Wu, Huarui 2; Zhao, Chunjiang 2; Teng, Guifa 1; Yang, Yingru 4; Du, Pengfei 4

Author affiliations: 1.Hebei Agr Univ, Sch Informat Sci & Technol, Baoding 071001, Peoples R China

2.Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China

3.Hebei Key Lab Agr Big Data, Baoding 071001, Peoples R China

4.Shijiazhuang Acad Agr & Forestry Sci, Shijiazhuang 050041, Hebei, Peoples R China

Keywords: Vision Transformer; attention mechanism; target detection; unmanned aerial vehicle (UAV); YOLOv5

Journal: REMOTE SENSING (impact factor: 5.349; five-year impact factor: 5.786)

ISSN:

Year, volume, issue: 2022, vol. 14, no. 19

Pages:

Indexed in: SCI

Abstract: The governance of rural living environments is an important task in implementing the rural revitalization strategy. At present, illegal random construction and random storage in public spaces seriously undermine the effectiveness of this governance. Supervision of such problems currently relies mainly on manual inspection. Because the rural areas to be inspected are numerous and widely distributed, manual inspection suffers from low detection efficiency, long inspection times, and heavy consumption of human resources, so it struggles to meet the requirements of efficient and accurate inspection. To address these difficulties, this paper proposes a low-altitude remote sensing inspection method for rural living environments based on a modified YOLOv5s-ViT (YOLOv5s-Vision Transformer). First, the BottleNeck structure was modified to enhance the model's multi-scale feature capture capability. Then, the SimAM attention mechanism module was embedded to strengthen the model's attention to key features without increasing the number of parameters. Finally, the Vision Transformer component was incorporated to improve the model's ability to perceive global features in the image. Test results showed that, compared with the original YOLOv5 network, the Precision, Recall, and mAP of the modified YOLOv5s-ViT model improved by 2.2%, 11.5%, and 6.5%, respectively; the total number of parameters was reduced by 68.4%; and the computation volume was reduced by 83.3%. Relative to other mainstream detection models, YOLOv5s-ViT achieved a good balance between detection performance and model complexity. This study provides new ideas for improving the digital capability of the governance of rural living environments.
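
The abstract notes that SimAM adds attention without adding parameters. The sketch below illustrates why, using the commonly published SimAM formulation for a single channel of a feature map. It is not the authors' code: the function name `simam_channel`, the list-of-rows input format, and the default `e_lambda=1e-4` are assumptions made for illustration only.

```python
import math


def simam_channel(fmap, e_lambda=1e-4):
    """Reweight one channel (a list of rows of floats) in the SimAM style.

    Hypothetical helper for illustration; e_lambda is an assumed
    regularization constant, not a value taken from the paper.
    """
    values = [v for row in fmap for v in row]
    n = len(values) - 1  # number of "other" positions in the channel
    if n < 1:
        raise ValueError("feature map needs at least two elements")

    # Channel statistics: there are no learnable weights anywhere.
    mu = sum(values) / len(values)
    sq_dev = [[(v - mu) ** 2 for v in row] for row in fmap]
    var = sum(d for row in sq_dev for d in row) / n

    out = []
    for row, dev_row in zip(fmap, sq_dev):
        out_row = []
        for v, d in zip(row, dev_row):
            # Inverse energy: positions far from the channel mean score higher.
            inv_energy = d / (4.0 * (var + e_lambda)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid gate
            out_row.append(v * weight)
        out.append(out_row)
    return out


if __name__ == "__main__":
    fmap = [[1.0, 1.0, 1.0],
            [1.0, 5.0, 1.0],
            [1.0, 1.0, 1.0]]
    for row in simam_channel(fmap):
        print(["%.3f" % v for v in row])
```

In this toy input the center value stands out from its neighbors, so it receives a larger gate weight than the surrounding positions. Every quantity is computed from the feature map itself, which is why a module of this kind adds no trainable parameters.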
