基于GPU加速的恶意代码字节码特征提取方法研究
Research on feature extraction of malware bytecode based on GPU acceleration
Authors:Zhou Zizhan (周紫瞻), College of Computer Science, Sichuan University, Chengdu 610065, China; Wang Junfeng (王俊峰), College of Computer Science, Sichuan University, Chengdu 610065, China
Received:2018-08-18 Year, Volume(Issue), Pages:2019, 56(2): 227-234
Journal Name:Journal of Sichuan University (Natural Science Edition) (四川大学学报: 自然科学版)
Key words:Malware; Feature extraction; CUDA; Bytecode sequence; N-gram
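Funding:National Key Research and Development Program of China (2016YFB0800605, 2016QY06X1205), Equipment Pre-Research Joint Fund of the Ministry of Education (6141A02033304, 6141A02011607), and Sichuan Province Key Research and Development Program (18ZDYF3867, 18ZDYF2039)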
Abstract
As the number and variety of malware samples grow, fast and effective malware detection becomes essential, and feature extraction is one of its key technologies. To address the slow extraction of bytecode sequence features in existing methods, a GPU-accelerated method for extracting bytecode sequence features from malware is proposed. Using the mature Compute Unified Device Architecture (CUDA), the compute-intensive tasks of the traditional extraction pipeline, such as extracting bytecode N-gram features and computing TF-IDF features, are offloaded to the GPU for parallel computation. Experiments show that the method achieves a speedup of 2 to 4 times or more on data sets with different sample file sizes, substantially increasing the speed of bytecode sequence feature extraction.
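The two compute-intensive steps named in the abstract, byte N-gram counting and TF-IDF weighting, can be illustrated with a small CPU reference sketch. This is not the authors' CUDA implementation. The function names `byte_ngrams` and `tfidf` are hypothetical, and the TF-IDF variant used here (term frequency as count divided by total N-grams, inverse document frequency as log(N/df)) is an assumption, since the abstract does not give the formula.

```python
import math
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> Counter:
    """Count every overlapping n-byte window in one sample."""
    # A sample shorter than n yields an empty range and an empty Counter.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def tfidf(samples, n: int = 2):
    """Return one {ngram: weight} dict per sample (assumed TF-IDF variant)."""
    counts = [byte_ngrams(s, n) for s in samples]

    # Document frequency: number of samples that contain each N-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    num_docs = len(samples)
    weights = []
    for c in counts:
        total = sum(c.values())  # total N-grams in this sample
        weights.append({
            gram: (k / total) * math.log(num_docs / df[gram])
            for gram, k in c.items()
        })
    return weights


if __name__ == "__main__":
    # Two tiny made-up byte strings, used only to exercise the functions.
    demo = [b"\x4d\x5a\x90\x00", b"\x4d\x5a\x00\x00"]
    for sample_weights in tfidf(demo, n=2):
        print(sample_weights)
```

Each N-gram window and each sample is counted independently of the others, which is the property that makes this workload a natural fit for data-parallel execution on a GPU.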