Paper abstract

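The research topics identification with multiple data source based on causal regression

Chinese title: 基于因果岭回归的多数据源科研主题识别方法 (literally, "A method for identifying research topics from multiple data sources based on causal ridge regression")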

Authors: HE Zeng-Ying (Network and Information Technology Center, Lingnan Normal University, Zhanjiang 524048, China); CHEN Jian-Rui (Network and Information Technology Center, Lingnan Normal University, Zhanjiang 524048, China); ZHONG Zu-Feng (Business School, Lingnan Normal University, Zhanjiang 524048, China)

Received: 2017-11-24; Year, volume (issue), pages: 2018, 55(6): 1204-1210

Journal Name: Journal of Sichuan University (Natural Science Edition) (四川大学学报: 自然科学版)

Key words: Multiple Data Sources; Scientific Topics; Identification Method; Morphological Characteristics; Causal Ridge Regression

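Funding: Special Fund for Public Welfare Research and Capacity Building, Department of Science and Technology of Guangdong Province (2015A020219013)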

Abstract

To address the problem of identifying research topics across multiple data sources, this paper presents a new identification method based on causal ridge regression. The method first defines evaluation indicators for the key parameters of multi-source research topic identification, such as the citation weight and status density of topic terms. It then establishes a feature function from the morphological characteristics of research topics and gives a concrete identification procedure based on causal ridge regression. Finally, simulation experiments examine the key factors that affect the method. The results show that, compared with Naive Bayes, KNN and the MGe-LDA algorithm, the proposed method has considerable advantages in terms of value citation, citation weight and similarity with frontier topics.
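For readers unfamiliar with the regression technique named in the abstract, the following is a minimal sketch of ordinary ridge regression only. The abstract does not give the paper's causal formulation, its feature function or its indicator definitions, so this is not the authors' method. The function name `ridge_fit`, the toy data and the two column meanings are hypothetical and chosen purely for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Plain ridge regression in closed form: solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Hypothetical toy data (not from the paper): each row is a candidate topic,
# the two columns stand in for made-up indicator scores such as a
# citation-weight score and a status-density score.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

w_ols = ridge_fit(X, y, lam=0.0)    # lam = 0 reduces to ordinary least squares
w_ridge = ridge_fit(X, y, lam=1.0)  # lam > 0 shrinks the coefficients

print("least squares:", w_ols)
print("ridge (lam=1):", w_ridge)
```

The penalty `lam` trades a small bias for more stable coefficients when indicators are correlated, which is the usual reason for preferring ridge regression to ordinary least squares.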

Copyright © 2020 四川大学期刊社. All rights reserved.

Address: No. 24, South Section 1, Yihuan Road, Chengdu

Postal code: 610065