Welcome to the Institutional Knowledge Base of the Beijing Academy of Agriculture and Forestry Sciences!

Veg-DenseCap: Dense Captioning Model for Vegetable Leaf Disease Images

Document type: Foreign-language journal article

Authors: Sun, Wei 1; Wang, Chunshan 1; Gu, Jingqiu 1; Sun, Xiang 1; Li, Jiuxi 4; Liang, Fangfang 4

Author affiliations: 1. Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China

2. Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China

3. Minist Agr & Rural Affairs, Key Lab Digital Rural Technol, Beijing 100097, Peoples R China

4. Hebei Agr Univ, Sch Informat Sci & Technol, Baoding 071001, Peoples R China

Keywords: image captioning; disease detection; text generation; Faster R-CNN; attention mechanism

Journal: AGRONOMY-BASEL (Impact Factor: 3.7; 5-Year Impact Factor: 4.0)

ISSN:

Year/Volume/Issue: 2023, Vol. 13, Issue 7

Pages:

Indexed by: SCI

Abstract: Plant disease recognition models based on deep learning have shown strong performance potential. However, their high complexity and nonlinearity lead to low transparency and poor interpretability, which greatly limit the deployment and application of such models in field scenarios. To address these problems, we propose a dense caption generation model, Veg-DenseCap. The model takes vegetable leaf images as input, uses object detection to locate abnormal regions of the leaf, and identifies the disease. More importantly, it describes the disease features it detects in natural language, so that users can judge whether those features are semantically consistent with human cognition. First, a dataset of Chinese feature-description sentences was established for images of 10 leaf diseases affecting two vegetables (cucumber and tomato). Second, Faster R-CNN was used as the disease detector to extract visual features of diseases, and an LSTM was used as the language generator to produce description sentences for the disease features. Finally, the Convolutional Block Attention Module (CBAM) and the Focal Loss function were employed to overcome the imbalance between positive and negative samples and Faster R-CNN's weakness in capturing key features. Evaluated with the joint Intersection-over-Union (IoU) and Meteor metric, Veg-DenseCap achieved a mean Average Precision (mAP) of 88.0% on the dense-captioning dataset of vegetable leaf disease images, which is 9.1% higher than that of the classical FCLN model. The automatically generated description sentences are characterized by accurate feature descriptions, correct grammar, and high diversity.
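The abstract names Focal Loss as the remedy for the positive/negative sample imbalance in the detector. The paper's exact loss configuration is not given in this record; as a reference point, the standard binary focal loss of Lin et al. (2017) can be sketched as below (NumPy, with the conventional defaults alpha = 0.25, gamma = 2.0 as illustrative assumptions):

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).

    p: predicted probability of the positive class (array-like in (0, 1))
    y: ground-truth labels in {0, 1}
    The (1 - p_t)**gamma factor down-weights easy, well-classified
    examples, so abundant easy negatives no longer dominate training.
    """
    p = np.asarray(p, dtype=float)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1.0 - p)            # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
```

With gamma = 0 this reduces to alpha-weighted cross-entropy; increasing gamma shrinks the loss contribution of confident predictions while leaving hard examples nearly unaffected.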
