MMVSL: A multi-modal visual semantic learning method for pig pose and action recognition

Document type: Foreign journal article

First author: Guan, Zhibin

Authors: Guan, Zhibin; Chai, Xiujuan

Affiliations:

Keywords: MMVSL; Multi-modal; Pig pose estimation; Action recognition; Improved HRNet

Journal: COMPUTERS AND ELECTRONICS IN AGRICULTURE (Impact factor: 8.9; 5-year impact factor: 9.3)

ISSN: 0168-1699

Year/Volume: 2025, Vol. 229

Pages:

Indexed in: SCI

Abstract: Pig health monitoring depends on the rapid detection and evaluation of pose and behavior traits in complex environments. However, no intelligent recognition approach currently exists for the posture and action of pigs raised in large enclosures. To address the shortage of datasets for vision-based pig pose and action understanding, we collected multi-object pig images in a large-enclosure environment and constructed datasets for multi-grained pig pose estimation and multi-modal action recognition. These datasets support both skeleton-based and RGB-based pig action recognition. Furthermore, we present MMVSL, a multi-modal visual semantic learning approach for pig pose and action understanding. It performs global visual semantic extraction, local pose semantic learning, and fine-grained topological semantic analysis of pigs, and comprises three key modules: global feature learning, local feature learning, and topological semantic feature learning. Experimental results demonstrate that the proposed method performs strongly on both multi-grained pig pose estimation and multi-modal pig action recognition, outperforming ST-GCN on skeleton-based and RepVGG on RGB-based action recognition. Average precision reaches 96.8% for dense pose estimation, 92.1% for semi-dense pose estimation, and 80.7% for sparse pose estimation. Skeleton-based action recognition achieves a top-1 accuracy of 94.3%, 0.5% higher than ST-GCN, and RGB-based action recognition reaches 99.4% accuracy, 0.2% higher than RepVGG.
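The abstract names the three modules but this record does not reproduce the paper's architecture (e.g., the improved HRNet backbone). As a rough, hypothetical PyTorch sketch of the three-branch fusion idea described above: the class name MMVSLSketch, all layer sizes, the toy backbones, and the single graph-convolution step are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class MMVSLSketch(nn.Module):
    """Hypothetical three-branch fusion: global RGB semantics, local pose
    semantics, and skeleton-topology semantics, concatenated for action
    classification. Not the paper's actual MMVSL architecture."""

    def __init__(self, num_joints=16, num_actions=4, dim=128):
        super().__init__()
        # Global branch: tiny CNN over the RGB frame (stand-in for the
        # RepVGG/HRNet-scale backbone the abstract alludes to).
        self.global_branch = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )
        # Local branch: MLP over estimated (x, y) keypoint coordinates.
        self.local_branch = nn.Sequential(
            nn.Flatten(), nn.Linear(num_joints * 2, dim), nn.ReLU(),
        )
        # Topological branch: one graph-convolution step, X' = A_hat X W,
        # over the skeleton adjacency (stand-in for an ST-GCN-style module).
        self.gcn_weight = nn.Linear(2, dim)
        self.classifier = nn.Linear(3 * dim, num_actions)

    def forward(self, rgb, keypoints, adj):
        # rgb: (B, 3, H, W); keypoints: (B, J, 2); adj: (J, J) normalized.
        g = self.global_branch(rgb)
        l = self.local_branch(keypoints)
        t = self.gcn_weight(adj @ keypoints).mean(dim=1)  # pool over joints
        return self.classifier(torch.cat([g, l, t], dim=-1))


# Usage with placeholder inputs (16 joints, 4 action classes):
model = MMVSLSketch()
rgb = torch.randn(2, 3, 128, 128)
kpts = torch.randn(2, 16, 2)
adj = torch.eye(16)  # placeholder for the normalized skeleton adjacency
logits = model(rgb, kpts, adj)  # shape: (2, 4)
```

The design point the sketch illustrates is late fusion: each modality is embedded independently and the concatenated features feed one classifier, matching the abstract's separation into global, local, and topological feature learning.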

Classification code:
