Paper Abstract

Feature Selection for Single Nucleotide Polymorphisms Based on Multi-group Genetic Algorithm

Authors: Jiang Shengli 蒋胜利 (1. School of Computer Sci. and Eng., Xidian Univ., Xi’an 710071, China; 2. Academy of Info. Technol., Luoyang Normal Univ., Luoyang 471022, China); Zhang Junying 张军英 (School of Computer Sci. and Eng., Xidian Univ.)

Received: 2009-04-11          Year, Volume(Issue): Pages: 2010, 42(2): 132-138

Journal Name: Advanced Engineering Sciences (工程科学与技术)

Key words: Genetic algorithm; Mutual information; Single nucleotide polymorphism (SNP); Feature selection; Machine learning

Funding: Supported by the National Natural Science Foundation of China (60574039; 60674106)

Abstract

Applying statistical machine learning methods to study the association between large-scale single nucleotide polymorphism (SNP) sets and complex diseases runs into the curse of dimensionality, so the first and key task is to reduce a large-scale SNP set to a smaller one. To this end, a multi-group genetic algorithm (MGA) is proposed for rough feature selection of SNPs. The method is the first to use mutual information (MI) to measure how closely SNPs are associated with a disease, and it takes MI as the fitness value of the genetic algorithm (GA). The GA is run multiple times, and the optimal SNP subsets found by these runs are merged to form the candidate feature SNP set. Experiments on simulated SNP datasets, together with a performance comparison against the maximum entropy (ME) method, show that the method discards as many disease-unrelated SNPs as possible while retaining the disease-related SNPs. It therefore provides SNP data of an appropriate size for further research and can be applied to medium-scale or large-scale SNP sets.
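The two ideas stated in the abstract, MI as the GA fitness and merging the results of several GA runs, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not give the chromosome encoding, the genetic operators, or any parameters, so the fixed-size index encoding, the elitist selection, the crossover and mutation rules, and all names and default values (`run_ga`, `multi_group_ga`, `k`, `pop_size`, `p_mut`, and so on) are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def fitness(subset, genotypes, labels):
    """MI between the joint genotype over the chosen SNP columns and disease status."""
    joint = [tuple(row[j] for j in subset) for row in genotypes]
    return mutual_information(joint, labels)


def run_ga(genotypes, labels, k, rng, pop_size=30, generations=40, p_mut=0.2):
    """One GA run (assumed design): an individual is a sorted tuple of k SNP indices."""
    n_snps = len(genotypes[0])

    def random_individual():
        return tuple(sorted(rng.sample(range(n_snps), k)))

    def crossover(a, b):
        # Child draws k indices from the union of both parents' indices.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, k)))

    def mutate(ind):
        # With probability p_mut, drop one index and refill with a random one.
        genes = set(ind)
        if rng.random() < p_mut:
            genes.discard(rng.choice(sorted(genes)))
        while len(genes) < k:
            genes.add(rng.randrange(n_snps))
        return tuple(sorted(genes))

    def score(ind):
        return fitness(ind, genotypes, labels)

    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # keep the better half unchanged
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=score)


def multi_group_ga(genotypes, labels, k=2, n_runs=5, seed=0):
    """Run the GA n_runs times and merge (union) the best subsets found."""
    rng = random.Random(seed)
    selected = set()
    for _ in range(n_runs):
        selected.update(run_ga(genotypes, labels, k, rng))
    return sorted(selected)
```

Here `genotypes` is assumed to be a list of rows with one discrete genotype code (for example 0, 1, 2) per SNP, and `labels` a list of case/control codes. Note that the empirical MI of a joint genotype is biased upward when `k` is large relative to the sample size, so a small `k` is the safer choice in this sketch.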
