RDFAdaptor: Efficient ETL Plugins for RDF Data Process

Document type: Foreign-language journal

First author: Li, Jiao

Authors: Li, Jiao; Xian, Guojian; Zhao, Ruixue; Huang, Yongwen; Kou, Yuantao; Luo, Tingting; Sun, Tan

Author affiliations:

Keywords: RDF ETL solution; RDF data processing; Linked data; Portable plugins

Journal: JOURNAL OF DATA AND INFORMATION SCIENCE (Impact factor: 1.5; 5-year impact factor: 3.9)

ISSN: 2096-157X

Year, volume, issue: 2021, Vol. 6, No. 3

Pages:

Indexed in: SCI

Abstract: Purpose: The interdisciplinary nature and rapid development of the Semantic Web have led to the mass publication of RDF data in a large number of widely accepted serialization formats, creating the need for RDF data processing with specific purposes. The paper reports on an assessment of the chief challenges of RDF data endpoints and introduces RDFAdaptor, a set of plugins for RDF data processing that covers the whole life cycle with high efficiency. Design/methodology/approach: RDFAdaptor is built on the prominent ETL tool Pentaho Data Integration, which provides a user-friendly and intuitive interface and allows connection to various data sources and formats. It reuses the Java framework RDF4J as middleware to access data repositories, SPARQL endpoints, and all leading RDF database solutions with SPARQL 1.1 support. It supports easy-to-use services with various configuration templates in multi-scenario applications and helps extend data processing tasks in other services or tools to complement missing functions. Findings: The proposed comprehensive RDF ETL solution, RDFAdaptor, provides an easy-to-use and intuitive interface, supports data integration and federation over multi-source heterogeneous repositories or endpoints, and manages linked data in hybrid storage mode. Research limitations: The plugin set supports several application scenarios of RDF data processing, but error detection and checking, as well as interaction with other graph repositories, remain to be improved. Practical implications: The plugin set provides a user interface and configuration templates that make it usable in various applications, including RDF data generation, multi-format data conversion, remote RDF data migration, and RDF graph update in semantic query processing.
Originality/value: This is the first attempt to develop components, rather than systems, that can extract, consolidate, and store RDF data on the basis of an ecologically mature data warehousing environment.

Classification number:
