A vision transformer-based robotic perception for early tea chrysanthemum flower counting in field environments
Document type: Foreign-language journal
Authors: Qi, Chao 1 ; Chen, Kunjie 2 ; Gao, Junfeng 3
Author affiliations: 1.Jiangsu Acad Agr Sci, Inst Agr Informat, Nanjing, Peoples R China
2.Nanjing Agr Univ, Coll Engn, Dept Agr Machinery, Nanjing, Peoples R China
3.Univ Lincoln, Lincoln Agrirobot Ctr, Lincoln Inst Agrifood Technol, Riseholme PK, Lincoln LN2 2LG, England
4.Univ Lincoln, Lincoln Ctr Autonomous Syst L CAS, Lincoln, England
Keywords: agricultural robotics; density map estimation; high density object counting; visual transformer
Journal: JOURNAL OF FIELD ROBOTICS (Impact factor: 4.2; 5-year impact factor: 7.2)
ISSN: 1556-4959
Year, volume, issue: 2024
Pages:
Indexed in: SCI
Abstract: The current mainstream approaches for plant organ counting are based on convolutional neural networks (CNNs), which have a solid local feature extraction capability. However, CNNs inherently have difficulty extracting robust global features because of their limited receptive fields. The visual transformer (ViT) offers a new opportunity to complement CNNs, since it can easily model global context. In this context, we propose a deep learning network based on a convolution-free ViT backbone (tea chrysanthemum-visual transformer [TC-ViT]) to achieve accurate, real-time counting of TCs at their early flowering stage in unstructured environments. First, all cropped fixed-size original image patches are linearly projected into a one-dimensional vector sequence and fed into a progressive multiscale ViT backbone to capture multiple scaled feature sequences. Subsequently, the obtained feature sequences are reshaped into two-dimensional image features, and a multiscale perceptual field module is used as a regression head to detect the overall scale and density variance. The resulting model was tested on 400 field images in the collected TC test data set. The proposed TC-ViT achieved a mean absolute error of 12.32 and a mean square error of 15.06, with an inference speed of 27.36 FPS (512 × 512 image size) on an NVIDIA Tesla V100 GPU. Light variation had the greatest effect on TC counting, whereas blurring had the least effect. The proposed method enables accurate counting of high-density and occluded objects in field environments, and this perception system could be deployed on a robotic platform for selective harvesting and flower phenotyping.
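The first steps of the abstract (linear projection of fixed-size patches into a token sequence, then reshaping a sequence back into a two-dimensional feature map) and the two reported error measures can be illustrated with a minimal NumPy sketch. This is not the authors' TC-ViT code: the patch size, embedding width, random weights, and the function names `patch_embed`, `tokens_to_feature_map`, and `counting_errors` are all assumptions made for illustration. The abstract also does not say whether its "mean square error" is the plain mean of squared errors or its square root, so the sketch returns both.

```python
import numpy as np


def patch_embed(image, patch_size, weight, bias):
    """Split an (H, W, C) image into non-overlapping patches and project each linearly."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(gh * gw, -1)
    tokens = patches @ weight + bias  # one embedding vector per patch
    return tokens, (gh, gw)


def tokens_to_feature_map(tokens, grid):
    """Reshape a (num_patches, dim) token sequence into a (gh, gw, dim) feature map."""
    gh, gw = grid
    return tokens.reshape(gh, gw, tokens.shape[-1])


def counting_errors(predicted_counts, true_counts):
    """Return (MAE, mean squared error, root mean squared error) over per-image counts."""
    diff = np.asarray(predicted_counts, dtype=float) - np.asarray(true_counts, dtype=float)
    mae = float(np.mean(np.abs(diff)))
    mse = float(np.mean(diff ** 2))
    return mae, mse, float(np.sqrt(mse))


# Hypothetical settings: only the 512 x 512 input size comes from the abstract.
rng = np.random.default_rng(0)
image = rng.random((512, 512, 3))
patch_size, embed_dim = 16, 64
weight = rng.normal(scale=0.02, size=(patch_size * patch_size * 3, embed_dim))
bias = np.zeros(embed_dim)

tokens, grid = patch_embed(image, patch_size, weight, bias)
feature_map = tokens_to_feature_map(tokens, grid)
print(tokens.shape, feature_map.shape)  # (1024, 64) (32, 32, 64)
```

In a density-map counting model, the regression head would map such a feature map to a density map whose sum is the predicted count; `counting_errors` then compares those counts with ground-truth counts across the test images.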
- Related literature
Other papers by the authors
-
Effects of Ammonification-Steam Explosion Pretreatment on the Production of True Protein from Rice Straw during Solid-State Fermentation
Authors: Li, Bin;Zhao, Chao;Zhao, Xiangjun;Xu, Lijun;Yang, Zidong;Peng, Hehuan;Sun, Qian;Chen, Kunjie
Keywords: steam explosion; lignocellulose; solid substrate fermentation; true protein; rice straw
-
Drying process optimization of garlic slices in closed-loop heat pump drying system by Box-Behnken design
Authors: Liu, Haolu;Yousaf, Khurram;Nyalala, Innocent;Chen, Kunjie;Liu, Haolu;Yousaf, Khurram;Chattha, Muhammad Waqas Alam;Yu, Zhenwei;Riaz, Asad
Keywords:
-
Rice straw addition and biological inoculation promote the maturation of aerobic compost of rice straw biogas residue
Authors: Du, Xiaorong;Li, Bin;Zhao, Chao;Xu, Lijun;Yang, Zidong;Chen, Kunjie;Sun, Qian;Chandio, Farman Ali;Wu, Guiru
Keywords: Biogas residue; Rice straw; Inoculation; Composting; Maturity