
Paper Abstract

A PROV-Based Representation of Data Provenance for ETL Process

Authors: Ke Jie (Computer School, Wuhan Univ.); Dong Hongbin (International School of Software, Wuhan Univ.); Liang Yiwen (Computer School, Wuhan Univ.); Tan Chengyu (Computer School, Wuhan Univ.); Ai Yong (College of Computer Sci., South Central Univ. for Nationalities)

Received: 2015-01-11          Year, Volume(Issue): Pages: 2015, 47(5): 123-129

Journal Name: Advanced Engineering Sciences (工程科学与技术)

Key words: ETL; data provenance; interoperability; PROV; OPM

Funding: Supported by the General Program of the National Natural Science Foundation of China (61170306)

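Abstract (translated from the Chinese)

In heterogeneous environments, current data provenance research mainly relies on the OPM model to represent how data originates within ETL. This leaves provenance concepts inconsistent, vocabulary usage confused, and no standardized means of access. Based on the W3C PROV model, a unified representation mechanism for ETL provenance information is proposed. The mechanism first gives a unified description of the provenance concepts of the ETL process and the relationships among them. Then, to meet the particular semantic expression needs of the ETL process, a multi-granularity ETL provenance vocabulary is established. Finally, a standardized query mechanism built on RDF improves the accessibility of the provenance information.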

Abstract (English)

In heterogeneous environments, data provenance information in ETL is represented on the basis of OPM. However, there is still a lack of consensus on the conceptual representation of ETL provenance information, the usage of provenance vocabulary, and a consolidated access mode. A unified provenance representation mechanism for ETL, based on PROV, is proposed. First, it presents a concept representation mechanism for ETL that sets out the primary provenance concepts and their relationships. Second, it constructs a multi-granularity vocabulary to meet the requirement of expressing provenance information at different abstraction levels. Finally, a standard access mode is proposed in which provenance information is organized into two levels: the bottom level is described with RDF, and the upper level is formed from queries over the bottom level.
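
The two-level access mode described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes provenance is held as plain (subject, predicate, object) triples and uses only a few well-known PROV-O terms (`prov:Entity`, `prov:Activity`, `prov:used`, `prov:wasGeneratedBy`). The `ex:` resources, the `etl:Transform` and `etl:Load` types, and the helper names `match` and `sources_of` are hypothetical, chosen only for illustration. The paper's actual vocabulary is not given on this page.

```python
# Illustrative sketch only. Names under the "etl:" and "ex:" prefixes are
# hypothetical and are NOT the vocabulary defined in the paper.

Triple = tuple[str, str, str]

# Bottom level: provenance stored as RDF-style (subject, predicate, object) triples.
TRIPLES: set[Triple] = {
    ("ex:orders_raw", "rdf:type", "prov:Entity"),
    ("ex:orders_clean", "rdf:type", "prov:Entity"),
    ("ex:orders_fact", "rdf:type", "prov:Entity"),
    ("ex:cleanse_job", "rdf:type", "prov:Activity"),
    ("ex:cleanse_job", "rdf:type", "etl:Transform"),  # hypothetical ETL term
    ("ex:load_job", "rdf:type", "prov:Activity"),
    ("ex:load_job", "rdf:type", "etl:Load"),          # hypothetical ETL term
    ("ex:cleanse_job", "prov:used", "ex:orders_raw"),
    ("ex:orders_clean", "prov:wasGeneratedBy", "ex:cleanse_job"),
    ("ex:load_job", "prov:used", "ex:orders_clean"),
    ("ex:orders_fact", "prov:wasGeneratedBy", "ex:load_job"),
}


def match(s=None, p=None, o=None) -> list[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def sources_of(entity: str) -> list[str]:
    """Upper level: a lineage query built from pattern matches on the triples.

    Follows prov:wasGeneratedBy to an activity, then prov:used to its inputs,
    and repeats until no further upstream entities are found.
    """
    found: list[str] = []
    frontier = [entity]
    while frontier:
        current = frontier.pop()
        for _, _, activity in match(s=current, p="prov:wasGeneratedBy"):
            for _, _, used in match(s=activity, p="prov:used"):
                if used not in found:
                    found.append(used)
                    frontier.append(used)
    return found


if __name__ == "__main__":
    print(sources_of("ex:orders_fact"))
```

In a real deployment the triples would more plausibly live in an RDF store and the upper-level query would be written in a standard query language such as SPARQL. The in-memory set here only keeps the sketch self-contained.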


Copyright © 2020 Sichuan University Journal Press. All rights reserved.

Address: No. 24, South Section 1, Yihuan Road, Chengdu

Postal code: 610065