一种基于h指数变体的软件网络节点重要性度量方法
A Method of Node Importance Measurement in Software Network Based on the Variations of h-Index
作者:丁沂(武汉大学 计算机学院 软件工程国家重点实验室, 湖北 武汉 430072;武汉软件工程职业学院 计算机学院, 湖北 武汉 430205);李兵(武汉大学 国际软件学院 软件工程国家重点实验室, 湖北 武汉 430072;武汉大学 复杂网络研究中心, 湖北 武汉 430072);程璨(武汉大学 国际软件学院 软件工程国家重点实验室, 湖北 武汉 430072);赵玉琦(武汉大学 国际软件学院 软件工程国家重点实验室, 湖北 武汉 430072)
Author:DING Yi(State Key Lab of Software Eng. & School of Computer, Wuhan Univ., Wuhan 430072, China;School of Computer, Wuhan Vocational College of Software and Eng., Wuhan 430205, China);LI Bing(International School of Software & State Key Lab. of Software Eng., Wuhan Univ., Wuhan 430072, China;Research Center of Complex Network, Wuhan Univ., Wuhan 430072, China);CHENG Can(International School of Software & State Key Lab. of Software Eng., Wuhan Univ., Wuhan 430072, China);ZHAO Yuqi(International School of Software & State Key Lab. of Software Eng., Wuhan Univ., Wuhan 430072, China)
Received: 2016-10-25 Year, Volume(Issue), Pages: 2017, 49(4): 136-144
期刊名称:工程科学与技术
Journal Name:Advanced Engineering Sciences
关键字:关键类;h指数;软件网络;节点重要性;中心性度量
Key words:key class;h-index;software network;node importance;centrality measure
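Funding: National Key Research and Development Program of China (2016YFB0800401); National Key Basic Research Development Program of China (2014CB340401); National Natural Science Foundation of China (61572371); China Postdoctoral Science Foundation (2015M582272); Fundamental Research Funds for the Central Universities (2042016kf0033); Natural Science Foundation of Hubei Province (2016CFB158); Wuhan Yellow Crane Talents (Special) Program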
中文摘要
新成员在参与软件项目开发和维护系统时,往往需要花费大量时间去理解系统的结构和功能,为了加速新成员对系统的理解,通常优先推荐他们关注一些系统中更重要的类。大量研究表明软件系统具有明显的复杂网络拓扑形态,可以将软件系统抽象为软件网络模型,通过网络节点重要性度量方法识别软件系统中更重要的类,辅助新成员快速掌握系统的核心结构和功能。目前,关于网络节点重要性度量的方法很多,大多数方法仅考虑邻居节点的度或边的权重。另外,h指数作为一种成功用于定量评估研究人员学术成就的指标也很少应用于软件网络中重要类的识别。作者以Ant、Jung和Maven项目为研究对象,构建对应的加权软件网络模型,结合节点的度和连边的权重信息提出H-NWD、A-NWD和G-NWD 3个h指数的变体指标来度量软件系统中类的重要性,并与已有的度中心性、介数中心性、接近度中心性、特征向量中心性、PageRank中心性5个常用的复杂网络中心性度量指标进行对比。实验结果表明,本文所提的H-NWD和G-NWD指标与已有的度量指标交集达到80%以上,能够很好地识别软件系统中重要类;在确定类的修改情况下,H-NWD指标与度中心性、特征向量中心性、PageRank中心性共同识别的重要类节点rank值更靠前,且被识别的其他类节点修改更频繁,相比于已有指标在识别关键类上更准确。
Abstract
When new members join the development and maintenance of a software project, they usually need to spend much time understanding the architecture and functions of the system. To speed up this process, they are generally advised to look first at the more important classes in the system. A large number of studies have shown that software systems exhibit the topology of complex networks. A software system can therefore be abstracted as a software network model, and its important classes can be identified with measures of network node importance, which helps new members quickly grasp the core structure and functions of the system. Many methods exist for measuring the importance of nodes in a network, but most of them consider only the degree of neighboring nodes or the weight of edges. Moreover, the h-index, a metric used successfully to evaluate the academic achievement of researchers, has rarely been applied to identifying important classes in software networks. In this paper, weighted software network models are built for three open-source projects (Ant, Jung, and Maven), and three variations of the h-index (H-NWD, A-NWD, and G-NWD), which combine node degree with edge weight, are proposed to measure the importance of classes. The proposed measures are compared with five commonly used centrality measures of complex networks: degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and PageRank centrality. The results show that the classes identified by H-NWD and G-NWD overlap by more than 80% with those identified by the existing measures, so the two measures identify important classes well. When the modification history of classes is taken into account, the important classes identified jointly by H-NWD and by degree, eigenvector, and PageRank centrality rank higher under H-NWD, and the other classes identified by H-NWD are modified more frequently. H-NWD is therefore more accurate than the existing measures in identifying key classes.
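The abstract does not give the formulas for H-NWD, A-NWD, or G-NWD, so the following is only an illustrative sketch of the general idea: an h-index computed over a node's neighbors in a weighted network, plus the top-k overlap used to compare two rankings. The scoring rule (edge weight times neighbor degree), the undirected toy graph, and all function names are assumptions made for this example, not the authors' definitions.

```python
from typing import Dict, List

# Weighted graph as an adjacency dict: node -> {neighbor: edge weight}.
Graph = Dict[str, Dict[str, float]]


def h_index(values: List[float]) -> int:
    """Return the largest h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def node_h_index(graph: Graph, node: str) -> int:
    """h-index of a node over per-neighbor scores.

    ASSUMPTION: each neighbor is scored as (edge weight * neighbor degree).
    This is a stand-in for illustration, not the paper's H-NWD formula.
    """
    scores = [weight * len(graph[nbr]) for nbr, weight in graph[node].items()]
    return h_index(scores)


def top_k_overlap(rank_a: List[str], rank_b: List[str], k: int) -> float:
    """Fraction of the top-k nodes shared by two rankings."""
    return len(set(rank_a[:k]) & set(rank_b[:k])) / k


# Hypothetical undirected class-dependency network (symmetric weights).
toy_graph: Graph = {
    "A": {"B": 2, "C": 1, "D": 3},
    "B": {"A": 2, "C": 1},
    "C": {"A": 1, "B": 1},
    "D": {"A": 3},
}

if __name__ == "__main__":
    for name in toy_graph:
        print(name, node_h_index(toy_graph, name))
    print(top_k_overlap(["A", "B", "C", "D"], ["B", "A", "D", "C"], 2))
```

Real software networks are usually directed (one class depends on another), so a faithful implementation would have to decide how in-degree and out-degree enter the score; the sketch ignores direction for brevity.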