Paper Abstract

Document sentiment modeling based on a topic-attention hierarchical memory network

Authors: Liu Guangfeng (College of Computer Science and Engineering, Chongqing University of Technology); Huang Xianying (Chongqing University of Technology); Liu Xiaoyang (Chongqing University of Technology); Fan Haibo (Chongqing University of Technology)

Received: 2018-10-30          Year, volume (issue), pages: 2019, 56(5): 0833-0842

Journal Name: Journal of Sichuan University (Natural Science Edition)

Key words: Document classification; Sentiment analysis; Hierarchical memory network; Attention mechanism; Word embedding

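Funding: National Social Science Fund of China (17XXW004); Ministry of Education Fund Project (16YJC860010); 2017 Humanities and Social Sciences Research Project of the Chongqing Municipal Education Commission (17SKG144); 2018 Technology Innovation and Application Demonstration Project of the Chongqing Science and Technology Commission (cstc2018jscx-msybX0049); Science and Technology Research Youth Project of the Chongqing Municipal Education Commission (KJQN201801104); Graduate Innovation Project of Chongqing University of Technology (ycx2018245)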

Abstract

Traditional models for document-level sentiment analysis depend on prior knowledge and capture semantics insufficiently. To address these problems, this paper proposes TWE ANN (Attention Neural Networks based on Topic enhanced Word Embedding), a sentiment analysis model based on an attention mechanism and hierarchical network feature representation. A CBOW-based word2vec model is trained on the corpus to produce word vectors and reduce their sparsity, and an LDA algorithm based on Gibbs sampling computes the document-topic distribution matrix. A hierarchical LSTM neural network then captures more complete contextual information from the text and extracts deep sentiment features. The document-topic distribution matrix serves as the model's attention mechanism for extracting document features, which are then used for sentiment classification. Experimental results show that the proposed TWE ANN model classifies better than the TSA and HAN models: the F-scores on the Yelp2015, IMDB, and Amazon datasets improve by 1.1%, 0.3%, and 1.8%, respectively, and the RMSE values on the Yelp2015 and Amazon datasets improve by 1.3% and 2.1%, respectively.
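The abstract does not state the attention formula, so the following is only a minimal sketch of one common way a document-topic distribution could steer attention over LSTM hidden states: project the topic vector into the hidden space, score each hidden state by a dot product, and take a softmax-weighted sum. The function name `topic_attention_pool`, the `projection` matrix, and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topic_attention_pool(hidden_states, topic_vector, projection):
    """Pool hidden states with attention weights driven by a topic vector.

    hidden_states: list of T vectors of size d (e.g. LSTM outputs).
    topic_vector:  list of K topic probabilities for the document
                   (e.g. one row of an LDA document-topic matrix).
    projection:    K x d matrix (hypothetical learned parameter) that maps
                   the topic vector into the hidden-state space.
    Returns (weights, pooled): T attention weights and one d-sized vector.
    """
    d = len(hidden_states[0])
    # Query vector: topic distribution projected into the hidden space.
    query = [
        sum(topic_vector[k] * projection[k][j] for k in range(len(topic_vector)))
        for j in range(d)
    ]
    # Dot-product relevance of each hidden state to the topic query.
    scores = [sum(h[j] * query[j] for j in range(d)) for h in hidden_states]
    weights = softmax(scores)
    # Attention-weighted sum of the hidden states.
    pooled = [sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(d)]
    return weights, pooled


# Toy example: three hidden states of size 2 and a two-topic distribution.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
topic_vector = [0.8, 0.2]
projection = [[1.0, 0.0], [0.0, 1.0]]  # identity, chosen only for readability
weights, pooled = topic_attention_pool(hidden_states, topic_vector, projection)
```

In a hierarchical setting, the same pooling could plausibly be applied twice: once over word-level hidden states to form sentence vectors, and again over sentence-level hidden states to form the document vector that feeds the classifier.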

