An Enhanced Retrieval Scheme for a Large Language Model with a Joint Strategy of Probabilistic Relevance and Semantic Association in the Vertical Domain
Document type: Foreign-language journal
Authors: Chen, Qi (1); Zhou, Weifeng (2); Cheng, Jian (1); Yang, Ji (1)
Author affiliations: 1. Zhejiang Ocean Univ, Sch Informat Engn, Zhoushan 316022, Peoples R China
2. Chinese Acad Fishery Sci, East China Sea Fisheries Res Inst, Shanghai 200090, Peoples R China
Keywords: large language model; information retrieval; BM25; retrieval-augmented generation
Journal: APPLIED SCIENCES-BASEL (Impact factor: 2.5; 5-year impact factor: 2.7)
ISSN:
Year, volume, issue: 2024, vol. 14, no. 24
Pages:
Indexed in: SCI
Abstract: Large language model (LLM) processing, with natural language as its core, carries out information retrieval through intelligent Q&A. It has a wide range of application scenarios and is commonly considered a kind of generative AI. However, when LLMs handle generation tasks, the results generated by fundamental LLMs with an insufficient comprehensive performance, specifically in the vertical domain, are often inaccurate due to a poor generalization ability, resulting in the so-called "hallucination" phenomenon. To solve these problems, in this study, an enhanced retrieval scheme for LLM processing was developed, named the BM-RAGAM (BM25 retrieval-augmented generation attention mechanism), by constructing a vectorized knowledge base, utilizing a hybrid joint retrieval strategy of keyword matching through searching and a semantic-enhanced association with an attention mechanism and taking ocean-front- and eddy-related knowledge in oceanography as an example. This scheme realized accurate word-based matching with the BM25 algorithm and text generation through a semantic-enhanced association using RAG, and it was used to construct a vector database of the text knowledge on ocean fronts and eddies. The output was compared and analyzed with the fundamental LLM of Qwen2-72B using the proposed scheme, and an ablation experiment was conducted. The results show that the proposed scheme greatly reduced hallucination generation in the process of text generation, making its outputs more interpretable.
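The joint strategy described in the abstract (BM25 keyword matching combined with a semantic association score) can be illustrated with a minimal sketch. This is not the authors' BM-RAGAM implementation: the abstract does not give the fusion formula, so the weighted sum with weight `alpha`, the min-max normalization, and the function names (`bm25_scores`, `cosine_scores`, `hybrid_rank`) are all assumptions made here for illustration. The bag-of-words cosine similarity is only a stand-in for the embedding-based, attention-enhanced semantic score the paper refers to.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would use a domain-aware one.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    # Okapi BM25 with the non-negative IDF variant log(1 + (N - df + 0.5) / (df + 0.5)).
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine_scores(query, docs):
    # ASSUMPTION: bag-of-words cosine stands in for dense embedding similarity.
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    out = []
    for d in docs:
        v = Counter(tokenize(d))
        dot = sum(q[t] * v[t] for t in q)
        d_norm = math.sqrt(sum(x * x for x in v.values()))
        out.append(dot / (q_norm * d_norm) if q_norm and d_norm else 0.0)
    return out


def minmax(xs):
    # Rescale scores to [0, 1] so the two signals are comparable before fusion.
    lo, hi = min(xs), max(xs)
    return [0.0] * len(xs) if hi == lo else [(x - lo) / (hi - lo) for x in xs]


def hybrid_rank(query, docs, alpha=0.5, top_k=2):
    # ASSUMPTION: weighted-sum fusion; alpha weights the lexical (BM25) signal.
    lex = minmax(bm25_scores(query, docs))
    sem = minmax(cosine_scores(query, docs))
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lex, sem)]
    order = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
    return [(docs[i], fused[i]) for i in order[:top_k]]


# Hypothetical toy knowledge base, loosely themed on ocean fronts and eddies.
docs = [
    "ocean fronts separate water masses with different temperature and salinity",
    "mesoscale eddies transport heat and nutrients across the ocean",
    "the stock market closed higher on friday",
]
query = "ocean eddies heat"

for text, score in hybrid_rank(query, docs):
    print(f"{score:.3f}  {text}")
```

In a retrieval-augmented generation pipeline, the top-ranked passages returned by a function like `hybrid_rank` would be placed in the LLM prompt as context. Setting `alpha` closer to 1 favors exact term matches, while values closer to 0 favor the semantic signal.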
- Related literature
Other papers by the authors
-
Screening and Analysis of Potential Aquaculture Spaces for Larimichthys crocea in China's Surrounding Waters Based on Environmental Temperature Suitability
Authors: Yang, Ling; Zhou, Weifeng; Cui, Xuesen; Lu, Yanan; Liu, Qin; Yang, Ling
Keywords: Larimichthys crocea; deep-sea aquaculture; potential spaces; spatial analysis; China
-
Effects of western boundary currents and sea surface temperature anomalies on interannual variability of chub mackerel abundance in the Northwest Pacific
Authors: Li, Jiasheng; Zhou, Weifeng; Dai, Yang; Tang, Fenghua; Wu, Yumei; Zhang, Heng; Fan, Xiumei; Cui, Xuesen; Li, Jiasheng
Keywords: Chub mackerel; Abundance; Kuroshio; Oyashio; Sea surface temperature anomaly; Northwest Pacific
-
Unsupervised Classification of Global Temperature Profiles Based on Gaussian Mixture Models
Authors: Ye, Xiaotian; Zhou, Weifeng; Ye, Xiaotian
Keywords: ocean temperature; Gaussian Mixture Models; the optimal model; global distribution
-
A Data Cleaning Method for the Identification of Outliers in Fishing Vessel Trajectories Based on a Geocoding Algorithm
Authors: Zhang, Li; Zhou, Weifeng; Zhang, Li
Keywords: Geohash; fishing vessel; trajectory data; outliers; data cleaning; data mining
-
Multiscale variation analysis of sea surface temperature in the fishing grounds of pelagic fisheries
Authors: Lai, Qixiang; Zhou, Weifeng; Lai, Qixiang
Keywords: pelagic fisheries; sea surface temperature; Ocean Nino Index; trend decomposition; variance analysis; the interquartile ranges; change point analysis
-
Fishing operation type recognition based on multi-branch convolutional neural network using trajectory data
Authors: Jiang, Bohui; Zhou, Weifeng; Jiang, Bohui
Keywords: Geohash; Vessel trajectory; Deep convolutional neural network; Deep learning; Spatiotemporal context; Fishing operation type; Embedding
-
Identification of glass eel capture equipment in the Yangtze River estuary based on high-spatial-resolution imagery and an improved YOLOv8 model
Authors: Zhu, Pengfei; Zhou, Weifeng; Zhu, Pengfei
Keywords: YOLOv8; Small target detection; Glass eel; Deep learning



