
DFYOLOv5m-M2transformer: Interpretation of vegetable disease recognition results using image dense captioning techniques

Document type: Foreign-language journal article

Authors: Sun, Wei 1 ; Wang, Chunshan 1 ; Wu, Huarui 1 ; Miao, Yisheng 1 ; Zhu, Huaji 1 ; Guo, Wang 1 ; Li, Jiuxi 2

Author affiliations: 1.Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China

2.Hebei Agr Univ, Baoding 071000, Peoples R China

3.Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China

4.Minist Agr & Rural Affairs, Key Lab Digital Rural Technol, Beijing 100097, Peoples R China

5.Hebei Key Lab Agr Big Data, Baoding 071000, Peoples R China

Keywords: Image captioning; Disease recognition; YOLOv5m; M2-Transformer; Two-stage; NWD

Journal: COMPUTERS AND ELECTRONICS IN AGRICULTURE (Impact factor: 8.3; Five-year impact factor: 8.3)

ISSN: 0168-1699

Year/volume/issue: 2023, Volume 215

Pages:

Indexed in: SCI

Abstract: The latest advances in deep learning technology make it possible to recognize vegetable diseases from leaf images. Existing disease recognition methods based on computer vision have shown exciting achievements in terms of accuracy, stability, and portability. However, these methods cannot provide a decision-making basis for the final results, and lack a text basis to support the users' judgement. Disease diagnosis is a risky decision. If the detection method lacks transparency, users will not be able to fully trust the recognition results, which greatly limits the application of recognition methods based on deep learning. To address the low "man-machine" credibility that arises because deep learning-based methods cannot provide a decision-making basis, this paper proposes a two-stage image dense captioning model named "DFYOLOv5m-M2Transformer", which generates sentences describing the visible disease features of each recognized diseased area. Firstly, we established a target detection dataset and a dense captioning dataset containing leaf images of 10 diseases, involving 2 vegetables, i.e., cucumber and tomato. Secondly, we chose the DFYOLOv5m network as the disease detector to extract the diseased area from the image, and the M2-Transformer network as the decision basis generator to generate description sentences of disease features. Then, the Bi-Level Routing Attention module was introduced to extract fine-grained features under complex backgrounds in order to resolve the problem of poor feature extraction in case of mixed diseases. Finally, we used Atrous Convolution to expand the receptive field of the model, and fused NWD and CIoU to improve the model's performance in detecting small targets. The experimental results show that, under the joint IoU and METEOR evaluation indicator, DFYOLOv5m-M2Transformer achieved a mean Average Precision (mAP) of 94.7% on the dense captioning dataset, which was 7.2% higher than that of Veg-DenseCap, the best-performing model in the control group. Moreover, the decision basis automatically generated by the proposed model is characterized by high accuracy, correct grammar, and large sentence variety. The outcome of this study provides a new idea for optimizing the user experience of vegetable disease recognition models.
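
The abstract mentions fusing NWD and CIoU for small-target detection but does not say how the two are combined. The following is a minimal sketch, not the authors' implementation: it assumes a simple weighted sum of the two terms, and the function names, the `ratio` weight, and the NWD constant `c` (a dataset-dependent normaliser; the default here is an arbitrary placeholder) are all hypothetical. Boxes are `(x1, y1, x2, y2)` tuples with positive width and height.

```python
import math

EPS = 1e-9  # guards against division by zero


def to_cxcywh(box):
    """Convert (x1, y1, x2, y2) to (centre x, centre y, width, height)."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0, x2 - x1, y2 - y1


def iou(a, b):
    """Plain intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ciou(pred, target):
    """Complete IoU: IoU minus a centre-distance term and an aspect-ratio term."""
    i = iou(pred, target)
    pcx, pcy, pw, ph = to_cxcywh(pred)
    tcx, tcy, tw, th = to_cxcywh(target)
    # Squared diagonal of the smallest box enclosing both boxes.
    ew = max(pred[2], target[2]) - min(pred[0], target[0])
    eh = max(pred[3], target[3]) - min(pred[1], target[1])
    c2 = ew * ew + eh * eh + EPS
    rho2 = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (math.atan(tw / (th + EPS)) - math.atan(pw / (ph + EPS))) ** 2
    alpha = v / (1.0 - i + v + EPS)
    return i - rho2 / c2 - alpha * v


def nwd(pred, target, c=12.8):
    """Normalized Gaussian Wasserstein distance between two boxes.

    Each box is modelled as a 2-D Gaussian with mean (cx, cy) and
    covariance diag(w^2/4, h^2/4); `c` is an assumed placeholder constant.
    """
    pcx, pcy, pw, ph = to_cxcywh(pred)
    tcx, tcy, tw, th = to_cxcywh(target)
    w2 = math.sqrt(
        (pcx - tcx) ** 2 + (pcy - tcy) ** 2
        + ((pw - tw) / 2.0) ** 2 + ((ph - th) / 2.0) ** 2
    )
    return math.exp(-w2 / c)


def fused_box_loss(pred, target, ratio=0.5, c=12.8):
    """Assumed fusion: weighted sum of the CIoU loss and the NWD loss."""
    return ratio * (1.0 - ciou(pred, target)) + (1.0 - ratio) * (1.0 - nwd(pred, target, c))


if __name__ == "__main__":
    a = (0.0, 0.0, 4.0, 4.0)
    b = (5.0, 0.0, 9.0, 4.0)  # same size, shifted so the boxes do not overlap
    print(iou(a, b), nwd(a, b), fused_box_loss(a, b))
```

For the two small, non-overlapping boxes in the usage example, IoU is zero while NWD still varies smoothly with the centre distance, which is the usual motivation for adding an NWD term when targets are small.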
