A New Chinese Named Entity Recognition Method for Pig Disease Domain Based on Lexicon-Enhanced BERT and Contrastive Learning
Document type: Foreign-language journal
Authors: Peng, Cheng 1; Wang, Xiajun 1; Li, Qifeng 1; Yu, Qinyang 1; Jiang, Ruixiang 1; Ma, Weihong 1; Wu, Wenbiao 1; Meng, Rui 1; Li, Haiyan 1; Huai, Heju 1; Wang, Shuyan 1; He, Longjuan 5
Author affiliations: 1.Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China
2.Natl Innovat Ctr Digital Technol Anim Husb, Beijing 100097, Peoples R China
3.Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China
4.Hubei Univ, Fac Resources & Environm Sci, Wuhan 430061, Peoples R China
5.Chinese Acad Agr Sci, Inst Agr Econ & Dev, Beijing 100081, Peoples R China
Keywords: pig disease; Chinese named entity recognition; lexicon-enhanced BERT; contrastive learning; small sample
Journal: APPLIED SCIENCES-BASEL (Impact factor: 2.5; five-year impact factor: 2.7)
ISSN:
Year, volume, issue: 2024, Volume 14, Issue 16
Pages:
Indexed in: SCI
Abstract: Featured Application: Our work provides reliable technical support for the information extraction of pig diseases in Chinese. It can be applied to other domain-specific fields, thereby facilitating seamless adaptation for named entity identification across diverse contexts. Abstract: Named Entity Recognition (NER) is a fundamental and pivotal stage in the development of various knowledge-based support systems, including knowledge retrieval and question-answering systems. In the domain of pig diseases, Chinese NER models encounter several challenges, such as the scarcity of annotated data, domain-specific vocabulary, diverse entity categories, and ambiguous entity boundaries. To address these challenges, we propose PDCNER, a Pig Disease Chinese Named Entity Recognition method leveraging lexicon-enhanced BERT and contrastive learning. Firstly, we construct a domain-specific lexicon and pre-train word embeddings in the pig disease domain. Secondly, we integrate lexicon information of pig diseases into the lower layers of BERT using a Lexicon Adapter layer, which employs char-word pair sequences. Thirdly, to enhance feature representation, we propose a lexicon-enhanced contrastive loss layer on top of BERT. Finally, a Conditional Random Field (CRF) layer is employed as the model's decoder. Experimental results show that our proposed model demonstrates superior performance over several mainstream models, achieving a precision of 87.76%, a recall of 86.97%, and an F1-score of 87.36%. The proposed model outperforms BERT-BiLSTM-CRF and LEBERT by 14.05% and 6.8%, respectively, with only 10% of the samples available, showcasing its robustness in data scarcity scenarios. Furthermore, the model exhibits generalizability across publicly available datasets.
Our work provides reliable technical support for the information extraction of pig diseases in Chinese and can be easily extended to other domains, thereby facilitating seamless adaptation for named entity identification across diverse contexts.
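The abstract mentions that the Lexicon Adapter operates on char-word pair sequences but does not spell out how they are built. A minimal sketch of one common construction follows, assuming each character is paired with every lexicon word that covers it in the sentence. The function name `build_char_word_pairs`, the `max_word_len` parameter, and the two-word toy lexicon are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple


def build_char_word_pairs(
    sentence: str, lexicon: Set[str], max_word_len: int = 4
) -> List[Tuple[str, List[str]]]:
    """Pair each character with the lexicon words that cover it.

    Assumption: a word is matched by exact substring lookup, and a matched
    word is assigned to every character inside its span. The paper may use
    a different matching strategy.
    """
    matched: List[List[str]] = [[] for _ in sentence]
    n = len(sentence)
    for start in range(n):
        # Only consider multi-character candidates up to max_word_len.
        for end in range(start + 2, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word in lexicon:
                for i in range(start, end):
                    if word not in matched[i]:
                        matched[i].append(word)
    return list(zip(sentence, matched))


# Hypothetical toy lexicon: "猪瘟" (swine fever) and "发热" (fever).
toy_lexicon = {"猪瘟", "发热"}
pairs = build_char_word_pairs("猪瘟引起发热", toy_lexicon)
for char, words in pairs:
    print(char, words)
```

In a full model, the words paired with each character would be looked up in the pre-trained domain word embeddings and fused with that character's BERT representation. That fusion step is omitted here because the abstract gives no detail on it.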
- Related literature
Other papers by the authors
-
Effect of combined nitrogen and phosphorus fertilization on summer maize yield and soil fertility in coastal saline-alkali land
Authors: Ma, Changjian; Wang, Yue; Liu, Lining; Wang, Xuejun; Sun, Zeqiang; Li, Yan; Wu, Wenbiao; Hou, Peng; Li, Bowen; Yuan, Huabin
Keywords: Grain yield; Biomass yield; Fertilizer physiological efficiency; Coastal saline-alkali land
-
DASNet a dual branch multi level attention sheep counting network
Authors: Chen, Yini; Gao, Ronghua; Li, Qifeng; Wang, Rong; Ding, Luyu; Li, Xuwen; Zhao, Hongtao
Keywords:
-
Combining UAV Remote Sensing with Ensemble Learning to Monitor Leaf Nitrogen Content in Custard Apple (Annona squamosa L.)
Authors: Jiang, Xiangtai; Xu, Xingang; Wu, Wenbiao; Yang, Guijun; Meng, Yang; Feng, Haikuan; Li, Yafeng; Xue, Hanyu; Chen, Tianen; Gao, Lutao
Keywords: canopy nitrogen content; UAV remote sensing; ensemble learning; Lasso model
-
Hyperspectral estimation of chlorophyll content in grapevine based on feature selection and GA-BP
Authors: Li, Yafeng; Xu, Xingang; Wu, Wenbiao; Jiang, Xiangtai; Meng, Yang; Yang, Guijun; Xue, Hanyu; Zhu, Yaohui; Gao, Lutao
Keywords: Data preprocessing; Feature selection; Machine learning; Hyperspectral monitoring
-
Using XGBoost-SHAP for understanding the ecosystem services trade-off effects and driving mechanisms in ecologically fragile areas
Authors: Du, Peiyu; Huai, Heju; Wang, Hongjia; Liu, Wen; Tang, Xiumei; Wu, Xiaoyang
Keywords: ecosystem services; trade-offs and synergies; XGBoost-SHAP; driving mechanism; ecologically fragile areas
-
Construction and Completion of the Knowledge Graph for Cow Estrus with the Association Rule Mining
Authors: Cheng, Zhiwei; Yu, Helong; Ding, Luyu; Peng, Cheng; Yang, Baozhu; Yu, Ligen; Li, Qifeng
Keywords: cow estrus; knowledge graph; knowledge complementation; association rule algorithm
-
Wearable Sensors-Based Intelligent Sensing and Application of Animal Behaviors: A Comprehensive Review
Authors: Ding, Luyu; Zhang, Chongxian; Yue, Yuxiao; Yao, Chunxia; Li, Zhuo; Hu, Yating; Yang, Baozhu; Ma, Weihong; Yu, Ligen; Gao, Ronghua; Li, Qifeng
Keywords: behavior monitoring; contact sensing; algorithms; tiny machine learning; monitoring applications



