A large language model for multimodal identification of crop diseases and pests

Document type: Foreign-language journal

First author: Wang, Yiqun

Authors: Wang, Yiqun; Wang, Fahai; Chen, Wenbai; Lv, Bowen; Liu, Mengchen; Kong, Xiangyuan; Pan, Zhaocen; Zhao, Chunjiang

Author affiliations:

Keywords: Large language model; Crop disease identification; Agricultural questions and answers; Multimodal

Journal: SCIENTIFIC REPORTS (Impact factor: 3.9; 5-year impact factor: 4.3)

ISSN: 2045-2322

Year/Volume/Issue: 2025, Vol. 15, Issue 1

Pages:

Indexed in: SCI

Abstract: Pests and diseases significantly impact the growth and development of crops. When asked to identify disease characteristics in crop images through dialogue, existing multimodal models face numerous challenges and often misinterpret the image or return incorrect disease information. This paper proposes a large language model for multimodal identification of crop diseases and pests, called LLMI-CDP. It builds on the VisualGLM model and introduces improvements to achieve precise identification of agricultural crop disease and pest images, along with professional recommendations for relevant preventive measures. Low-Rank Adaptation (LoRA), which adjusts the weights of pre-trained models, yields significant performance improvements with a minimal increase in parameters. This supports precise capture and efficient identification of crop pest and disease characteristics, greatly enhancing the model's flexibility and accuracy in pest and disease recognition. The model also incorporates the Q-Former framework for effective modal alignment between the language model and image features. With this approach, LLMI-CDP can more deeply understand and process the complex relationships between language and visual information, further enhancing its performance in multimodal recognition tasks. Experiments are carried out on self-built datasets. The results demonstrate that LLMI-CDP surpasses five leading multimodal large language models on the relevant evaluation metrics, confirming its strong performance in Chinese multimodal dialogues related to agriculture.

Classification number:
