Intelligent survey method of rice diseases and pests using AR glasses and image-text multimodal fusion model

Document type: Foreign-language journal

First author: Chen, Xiangfu

Authors: Chen, Xiangfu; Liu, Yongjian; Wu, Jian; Yao, Qing; Yang, Baojun; Luo, Ju; Liu, Shuhua; Feng, Zelin; Lyu, Jun

Author affiliations:

Keywords: Rice pest and disease; Field survey; Multimodal; RDP-Detector; AR glasses

Journal: COMPUTERS AND ELECTRONICS IN AGRICULTURE (Impact factor: 8.9; five-year impact factor: 9.3)

ISSN: 0168-1699

Year, volume, issue: 2025, Vol. 237

Pages:

Indexed in: SCI

Abstract: The timely and accurate detection of the types and dynamics of rice diseases and insect pests occurring in the field is a fundamental requirement for scientific prevention and control. Current survey methods rely heavily on the expertise and experience of surveyors, which leads to limited data traceability, high labor demands, and low efficiency. The complex environmental conditions in rice fields, coupled with the diversity of pests and diseases (many of which coexist and exhibit significant intra-species variation and inter-species similarity), further complicate detection. When identification models are trained on only a limited set of image samples, they often generalize poorly, which undermines the accuracy of pest and disease forecasts. To overcome these challenges, a rapid, efficient, and precise intelligent survey method that uses AR glasses and an image-text multimodal fusion model to detect rice pests and diseases was proposed. The AR glasses are wearable, hands-free, and voice-controlled, which makes them convenient for collecting rice images in paddy fields. A two-stage image-text multimodal fusion model, RDP-Detector, was developed to improve the detection accuracy of rice pests and disease lesions in the images. In the first stage, an improved YOLOv5X model with AF-FPN, a Decoupled Head, and Soft-NMS post-processing improved detection ability. In the second stage, the text modality is introduced, and prompt tuning is used to perform transfer learning for the downstream task on the basis of the CLIP model. To improve the accuracy of pest detection, detection boxes with low confidence in the first stage are re-identified in the second stage. Compared with state-of-the-art models, the RDP-Detector achieved a precision, recall, and mAP of 82.3%, 86.5%, and 87.4%, respectively, on the detection of seven rice pests and diseases. Compared with object detection models that do not incorporate text modalities, the proposed approach demonstrated a 14.6 percentage point improvement in precision. The intelligent survey method established in our study, which uses AR glasses and a multimodal model, not only enhances survey efficiency but also reduces reliance on the professional expertise of surveyors, while achieving high accuracy in pest and disease identification.
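
The abstract states that detection boxes with low first-stage confidence are re-identified in the second stage. A minimal Python sketch of that routing step is given below. It is an illustration under stated assumptions, not the authors' implementation: the `Detection` type, the `refine_detections` function, the `reclassify` callable (a stand-in for the prompt-tuned CLIP classifier), the 0.5 threshold, and the choice to overwrite the label and score with the second-stage result are all hypothetical, since the abstract specifies none of them.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

# Hypothetical second-stage interface: given a box, return (label, score).
# In the paper this role is played by a prompt-tuned CLIP model; here it is
# just a callable so the routing logic can be shown on its own.
Reclassifier = Callable[[Box], Tuple[str, float]]


@dataclass(frozen=True)
class Detection:
    box: Box
    label: str
    confidence: float
    reidentified: bool = False  # True if the second stage relabeled this box


def refine_detections(
    detections: List[Detection],
    reclassify: Reclassifier,
    threshold: float = 0.5,  # assumed value; the abstract gives no threshold
) -> List[Detection]:
    """Keep confident first-stage boxes; send the rest to the second stage."""
    refined: List[Detection] = []
    for det in detections:
        if det.confidence >= threshold:
            # Confident detections pass through unchanged.
            refined.append(det)
            continue
        # Low-confidence detections are re-identified by the second stage.
        # Overwriting label and score is an assumption of this sketch.
        label, score = reclassify(det.box)
        refined.append(
            replace(det, label=label, confidence=score, reidentified=True)
        )
    return refined
```

For example, calling `refine_detections` with one box at confidence 0.9 and one at 0.3 leaves the first untouched and passes only the second to `reclassify`. Keeping the second stage behind a plain callable means the expensive image-text model runs only on the uncertain boxes.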

Classification number:
