基于微博数据的“新冠肺炎疫情”舆情演化时空分析
Spatial and temporal analysis on public opinion evolution of epidemic situation about novel coronavirus pneumonia based on micro blog data
作者:陈兴蜀(四川大学网络空间安全学院;四川大学网络空间安全研究院);常天祐(四川大学吴玉章学院;四川大学计算机学院);王海舟(四川大学网络空间安全学院, 成都 610207; 四川大学网络空间安全研究院, 成都 610065);赵志龙(四川大学吴玉章学院, 成都 610207; 四川大学计算机学院, 成都 610207);张杰(四川大学网络空间安全学院, 成都 610207)
Author:CHEN XingShu(College of Cybersecurity, Sichuan University;Cybersecurity Research Institute, Sichuan University);CHANG TianYou(Wu Yuzhang College, Sichuan University; College of Computer Science, Sichuan University);WANG HaiZhou(College of Cybersecurity, Sichuan University, Chengdu 610207, China; Cybersecurity Research Institute, Sichuan University, Chengdu 610065, China);ZHAO ZhiLong(Wu Yuzhang College of Sichuan University, Chengdu 610207, China; College of Computer Science, Sichuan University, Chengdu 610207, China);ZHANG Jie(College of Cybersecurity, Sichuan University, Chengdu 610207, China)
Received:2020-03-03 Year, Volume(Issue), Pages:2020,57(2):409-416
期刊名称:四川大学学报: 自然科学版
Journal Name:Journal of Sichuan University (Natural Science Edition)
关键字:新浪微博;新冠肺炎疫情;分布式爬虫;情感分析;文本聚类;地理统计分析
Key words:Sina micro blog; Novel coronavirus pneumonia; Distributed crawler; Sentiment analysis; Text clustering; Geostatistical analysis
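Funding:Sichuan Provincial Department of Science and Technology, Novel Coronavirus Epidemic Prevention and Control Science and Technology Project (2020YFS0007); Sichuan University COVID-19 Emergency Project (2020scunCoV应急20012); Sichuan University Undergraduate Innovation and Entrepreneurship Program (C2020109217)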
Chinese abstract
本文依托2020年1月1日至2月29日期间共计6万条新浪微博博文与1.5万条微博热门评论,基于分布式爬虫技术、分布式数据库系统、SnowNLP情感分析模型以及K-Means文本聚类算法,对与“新冠肺炎疫情”相关的话题展开舆情分析,可视化地展现本次疫情事件中网络舆情的时空演化过程。在时间维度层面,通过文本聚类与情感分析,发现网民对于此次肺炎疫情的态度大致经历了三个阶段,即起伏不定的紧张焦虑期、缓慢攀升的团结振作期以及波动很小的自信平稳期,总体上呈现积极大于消极、正面大于负面的情绪状态。在空间维度层面,通过地理统计分析,发现疫情最严重地区网民评论人数最多,同时情感值也最低。
English abstract
Drawing on a total of 60 thousand Sina micro-blog posts and 15 thousand popular micro-blog comments published between January 1 and February 29, 2020, this article analyzes public opinion on topics related to the novel coronavirus pneumonia epidemic, using distributed crawler technology, a distributed database system, the SnowNLP sentiment analysis model, and the K-Means text clustering algorithm. The analysis visualizes the spatial and temporal evolution of online public opinion during the epidemic. In the temporal dimension, text clustering and sentiment analysis show that netizens' attitudes toward the epidemic went through roughly three periods: a period of tension and anxiety with large fluctuations, a period of unity and resolve in which sentiment rose slowly, and a period of confidence and stability with only slight fluctuations. Overall, positive sentiment outweighed negative sentiment. In the spatial dimension, geostatistical analysis shows that the region hit hardest by the epidemic had the largest number of commenting users and also the lowest sentiment value.
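The temporal and spatial aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the records, the region names, and the helper `mean_by` are hypothetical, and the sentiment scores are assumed to have been computed beforehand by a model such as SnowNLP, with values in [0, 1] where higher means more positive.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (date, region, sentiment score in [0, 1]).
# The scores are illustrative values, not data from the study.
comments = [
    ("2020-01-21", "Hubei", 0.25),
    ("2020-01-21", "Sichuan", 0.5),
    ("2020-02-20", "Hubei", 0.5),
    ("2020-02-20", "Sichuan", 0.75),
]


def mean_by(records, key_index):
    """Group sentiment scores by one field and return the mean per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: mean(scores) for key, scores in groups.items()}


daily = mean_by(comments, 0)     # temporal dimension: mean score per day
regional = mean_by(comments, 1)  # spatial dimension: mean score per region
```

Plotting `daily` over time would give a sentiment curve of the kind the abstract divides into three periods, and `regional` is the sort of per-region value a geostatistical map could display.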