Paper Abstract

Dimension Reduction for Gene Expression Data Using Class Preserving Projection

Authors: Wang Wenjun (Xidian University); Zhang Junying (Xidian University); Yang Liying (Xidian University)

Received: 2008-09-18          Year, Volume (Issue), Pages: 2009, 41(6): 153-157

Journal Name: Advanced Engineering Sciences (工程科学与技术)

Key words: gene expression data; Class Preserving Projection; visualization; clustering analysis

Funding: National Natural Science Foundation of China (60533010); NSFC-Microsoft Research Asia Joint Funding Project (60933009)

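Chinese Abstract

To address the high-dimensionality, small-sample-size problem of gene expression data, a new linear dimensionality reduction method is proposed. The method applies locality preserving projection, combined with the class information of the samples, to project gene expression data onto a feature subspace. Unlike principal component analysis (PCA), which seeks the directions of maximal variance, class preserving projection (CPP) seeks a feature subspace that reflects the class structure of the samples. Reducing the dimensionality with this method also causes the samples to cluster according to their class labels. Dimensionality reduction for visualization and k-means clustering analysis were carried out on real gene expression data and compared experimentally with PCA. The results show that CPP identifies the class characteristics of the samples better while reducing dimensionality, so its visualization is much better than that of PCA, and it also yields good clustering results.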

English Abstract

To overcome the high dimensionality of gene expression data, a linear method for dimensionality reduction, Class Preserving Projection (CPP), is proposed. Using Locality Preserving Projections (LPP) combined with class information, the gene expression data are mapped into a feature subspace. Unlike Principal Component Analysis (PCA), which searches for the directions of maximal variance, CPP seeks a feature subspace that reflects the class information of the samples, so the dimensionality of the data is reduced while the class information is preserved. Experimental results on real gene expression data, compared with the classical linear technique PCA, show that CPP identifies class features better and produces much better visualization after dimensionality reduction than PCA.
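The abstract does not give the CPP objective function, so the following is only a minimal sketch of one plausible reading: an LPP-style projection whose affinity matrix is built from class labels rather than nearest neighbours. The function name `class_preserving_projection`, the 0/1 same-class weights, the PCA pre-reduction step, and the `reg` ridge term are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def class_preserving_projection(X, y, n_components=2, reg=1e-6):
    """Hypothetical class-aware LPP sketch (not the paper's exact method).

    X: (n_samples, n_genes) expression matrix; y: (n_samples,) class labels.
    Returns a (n_genes, n_components) projection matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xc = X - X.mean(axis=0)

    # Assumed pre-step: PCA via SVD, so the small-sample problem
    # (n_genes >> n_samples) does not make the matrices below singular.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[s > 1e-10 * s.max()].T          # (n_genes, r)
    Z = Xc @ P                             # (n_samples, r)

    # Assumed affinity: 1 if two samples share a class label, else 0.
    W = (y[:, None] == y[None, :]).astype(float)
    D = np.diag(W.sum(axis=1))
    L = D - W                              # graph Laplacian

    # LPP criterion: minimise a^T Z^T L Z a subject to a^T Z^T D Z a = 1.
    A = Z.T @ L @ Z
    B = Z.T @ D @ Z + reg * np.eye(Z.shape[1])

    # Solve the generalised eigenproblem A v = lambda B v through B^(-1/2).
    b_vals, b_vecs = np.linalg.eigh(B)
    B_inv_sqrt = b_vecs @ np.diag(b_vals ** -0.5) @ b_vecs.T
    M = B_inv_sqrt @ A @ B_inv_sqrt
    _, V = np.linalg.eigh((M + M.T) / 2)   # eigenvalues in ascending order

    # Smallest eigenvalues give the directions that keep classes together.
    return P @ (B_inv_sqrt @ V[:, :n_components])
```

Under these assumptions, a 2-D embedding for plotting or k-means would be computed as `(X - X.mean(axis=0)) @ class_preserving_projection(X, y)`. With far more genes than samples, a projection like this can collapse each training class almost to a single point, so any visual separation should be checked on held-out samples.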

