CDIP-ChatGLM3: A dual-model approach integrating computer vision and language modeling for crop disease identification and prescription

Document type: Foreign-language journal article

First author: Yan, Changqing

Authors: Yan, Changqing; Liang, Zeyun; Cheng, Han; Li, Shuyang; Yang, Guangpeng; Li, Zhiwei; Yin, Ling; Qu, Junjie; Wang, Jing; Wu, Genghong; Tian, Qi; Yu, Qiang; Zhao, Gang

Author affiliations:

Keywords: Crop disease; Large language model; Fine-tuning; ChatGLM3; Deep learning; Disease identification; Crop protection

Journal: COMPUTERS AND ELECTRONICS IN AGRICULTURE (Impact factor: 8.9; 5-year impact factor: 9.3)

ISSN: 0168-1699

Year/volume/issue: 2025, vol. 236

Pages:

Indexed in: SCI

Abstract: Deep learning (DL) models have shown exceptional accuracy in plant disease identification, yet their practical utility for farmers remains limited due to a lack of professional and actionable guidance. To bridge this gap, we developed CDIP-ChatGLM3, an innovative framework that synergizes a state-of-the-art DL-based computer vision model with a fine-tuned large language model (LLM), designed specifically for Crop Disease Identification and Prescription (CDIP). EfficientNet-B2, evaluated among 10 DL models across 48 diseases and 13 crops, achieved top performance with 97.97% ± 0.16% accuracy at a 95% confidence level. Building on this, we fine-tuned the widely used ChatGLM3-6B LLM using Low-Rank Adaptation (LoRA) and Freeze-tuning, optimizing its ability to deliver precise disease management prescriptions. We compared two training strategies, multi-task learning (MTL) and Dual-stage Mixed Fine-Tuning (DMT), using different combinations of domain-specific and general datasets. Freeze-tuning with DMT led to substantial performance gains, achieving a 33.16% improvement in BLEU-4 and a 27.04% increase in the Average ROUGE F-score, surpassing the original model and state-of-the-art competitors such as Qwen-max, Llama-3.1-405B-Instruct, and GPT-4o. The dual-model architecture of CDIP-ChatGLM3 leverages the complementary strengths of computer vision for image-based disease detection and LLMs for contextualized, domain-specific text generation, offering unmatched specialization, interpretability, and scalability. Unlike resource-intensive multimodal models that blend modalities, our dual-model approach maintains efficiency while achieving superior performance in both disease identification and actionable prescription generation.
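
The dual-model flow described in the abstract (an image classifier names the disease, then a language model writes the prescription) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' code: the names `Diagnosis`, `identify`, `build_prompt`, and `prescribe` are hypothetical, the prompt wording is invented, and the classifier and the LLM are represented only by a list of class scores and a caller-supplied `generate` function.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class Diagnosis:
    """Output of the vision stage (hypothetical structure)."""
    crop: str
    disease: str
    confidence: float


def identify(scores: Sequence[float],
             labels: Sequence[Tuple[str, str]]) -> Diagnosis:
    """Pick the top-scoring (crop, disease) class from classifier scores."""
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")
    best = max(range(len(scores)), key=lambda i: scores[i])
    crop, disease = labels[best]
    return Diagnosis(crop, disease, scores[best])


def build_prompt(diagnosis: Diagnosis) -> str:
    """Turn the diagnosis into a text prompt (wording is an assumption)."""
    return (
        f"The crop is {diagnosis.crop} and the identified disease is "
        f"{diagnosis.disease}. Describe the symptoms and recommend a "
        "management plan."
    )


def prescribe(scores: Sequence[float],
              labels: Sequence[Tuple[str, str]],
              generate: Callable[[str], str]) -> str:
    """Chain the two stages: classify first, then ask the text model."""
    return generate(build_prompt(identify(scores, labels)))


if __name__ == "__main__":
    # Stand-ins for the real models: fixed scores and an echoing "LLM".
    demo_labels = [("tomato", "early blight"), ("apple", "scab")]
    print(prescribe([0.2, 0.8], demo_labels, lambda prompt: "[LLM] " + prompt))
```

Keeping the two stages behind separate functions mirrors the separation the abstract emphasizes: either model can be swapped or retrained without touching the other, and only a short text prompt crosses the boundary.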

Classification number:
