Chat-RGIE: precision extraction of rice germplasm data using large language models and prompt engineering
Document type: Foreign-language journal
First author: Wei, Yijin
Authors: Wei, Yijin; Fan, Jingchao
Author affiliations:
Keywords: Data extraction; Large language model (LLM); Rice germplasm; Agriculture
Journal: JOURNAL OF BIG DATA (Impact factor: 6.4; five-year impact factor: 13.4)
ISSN:
Year, volume, issue: 2025, Vol. 12, Issue 1
Pages:
Indexed in: SCI
Abstract: Varietal improvement is a key aspect of breeding, and as this work progresses, crop varietal data become more complex and require more resources to extract. We therefore developed Chat-RGIE, a rice germplasm data extraction strategy based on conversational large language models (LLMs) and prompt engineering, to extract rice germplasm data in a zero-shot manner. The technique uses multi-response voting to reduce the chance of hallucinations, together with an additional calibration component that selects the best extraction results. We evaluated Chat-RGIE in a performance evaluation and in a real-life data extraction evaluation. The scheme obtained 0.9102 precision, 0.9941 recall, and 0.9554 accuracy in the performance evaluation, and 0.6351 precision, 1.0 recall, and 0.8225 accuracy in the real-life data extraction evaluation, demonstrating the effectiveness of the scheme. The data extraction procedure also mitigates, to some extent, the risk that bias from a single large model leads to hallucinations: the incidence of hallucinations in the two evaluations was 0.0015 and 0.005, respectively, a very minor influence. In addition, we used Restraint Rate, a statistic that quantifies how strongly the prompt constrains LLM replies; its values of 0.9265 and 0.911 in the two evaluations indicate normative, well-constrained responses. Finally, when we examined the data extraction results, we found that when confronted with an unanswerable question, the LLM is affected by the stress imposed by the prompt: the higher the stress, the more likely it is to violate constraints, which is similar to how humans behave under stress. We therefore believe that some countermeasures used for the corresponding human behavior also have the potential to improve LLM performance.
Classification number:
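The multi-response voting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the voting rule, so everything here is an assumption made for illustration: the `normalize` and `vote` helpers, the strict-majority threshold `min_share`, and the sample replies are all hypothetical and are not taken from Chat-RGIE itself.

```python
from collections import Counter
from typing import Optional, Sequence


def normalize(answer: str) -> str:
    """Lower-case and collapse whitespace so trivially different replies match."""
    return " ".join(answer.lower().split())


def vote(responses: Sequence[str], min_share: float = 0.5) -> Optional[str]:
    """Return the reply given by more than `min_share` of responses, else None.

    Assumption: a strict-majority rule. The paper's actual rule may differ.
    """
    if not responses:
        return None
    counts = Counter(normalize(r) for r in responses)
    answer, n = counts.most_common(1)[0]
    return answer if n / len(responses) > min_share else None


# Hypothetical replies from five independent LLM calls for one field.
replies = ["115 cm", "115 cm", "115  CM", "120 cm", "not mentioned"]
print(vote(replies))
```

Returning `None` when no reply wins a majority lets a later step (for example, a calibration component like the one the abstract mentions) decide what to do instead of accepting a possibly hallucinated value.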
- Related literature
Other papers by the authors
-
Analysis of the genetic basis of fiber-related traits and flowering time in upland cotton using machine learning
Authors: Li, Weinan; Peng, Jun; Zhang, Jianhua; Zhang, Mingjun; Yang, Zhaoen; Chai, Mao; Fan, Jingchao; Lan, Yubin
Keywords:
-
Extracting Fruit Disease Knowledge from Research Papers Based on Large Language Models and Prompt Engineering
Authors: Fei, Yunqiao; Fan, Jingchao; Zhou, Guomin
Keywords: research papers; knowledge extraction; large language models; prompt engineering; fruit tree diseases
-
TAL-SRX: an intelligent typing evaluation method for KASP primers based on multi-model fusion
Authors: Chen, Xiaojing; Fan, Jingchao; Yan, Shen; Zhou, Guomin; Zhang, Jianhua; Huang, Longyu
Keywords: KASP fractal evaluation; multi-model fusion; stacking integration; deep learning; hyperparameter tuning
-
KASP-IEva: an intelligent typing evaluation model for KASP primers
Authors: Chen, Xiaojing; Fan, Jingchao; Yan, Shen; Zhang, Jianhua; Huang, Longyu; Zhou, Guomin
Keywords: intelligent evaluation; KASP marker; decision tree; genotyping; cotton; molecular marker-assisted selection
-
DFN-PSAN: Multi-level deep information feature fusion extraction network for interpretable plant disease classification
Authors: Dai, Guowei; Fan, Jingchao; Tian, Zhimin; Sunil, C. K.; Dewi, Christine
Keywords: Deep learning; Image processing; Feature fusion; Multilevel features; Pixel attention; Disease classification
-
Diagnosis of Custard Apple Disease Based on Adaptive Information Entropy Data Augmentation and Multiscale Region Aggregation Interactive Visual Transformers
Authors: Cui, Kunpeng; Huang, Jianbo; Dai, Guowei; Fan, Jingchao; Dewi, Christine
Keywords: plant disease; convolutional neural network; adaptive data augmentation; feature fusion; visual transformer
-
PPLC-Net: Neural network-based plant disease identification model supported by weather data augmentation and multi-level attention mechanism
Authors: Dai, Guowei; Fan, Jingchao; Tian, Zhimin; Wang, Chaoyu
Keywords: Convolutional neural network; Dilated convolutions; Global average pooling; Attention mechanism (CBAM); Weather data augmentation; Leaf disease recognition