一种混合聚类算法及其应用
A Hybrid Clustering Algorithm and Its Application
作者:胡瑞飞(四川大学 制造科学与工程学院 ,四川 成都 610065);殷国富(四川大学 制造科学与工程学院 ,四川 成都 610065);谭颖(四川大学 制造科学与工程学院 ,四川 成都 610065)
Author:HU Ruifei (School of Manufacturing Sci. and Eng., Sichuan Univ., Chengdu 610065, China);YIN Guofu (School of Manufacturing Sci. and Eng., Sichuan Univ., Chengdu 610065, China);TAN Ying (School of Manufacturing Sci. and Eng., Sichuan Univ., Chengdu 610065, China)
Received:2005-10-20 Year, volume(issue), pages:2006,38(5):156-161
期刊名称:工程科学与技术
Journal Name:Advanced Engineering Sciences
关键字:数据挖掘;聚类;种子对象
Key words:data mining;clustering;seed object
Funding:National Natural Science Foundation of China (50575153)
Chinese abstract
通过分析基于网格与基于密度的聚类算法特征,提出了一种基于网格和密度的混合聚类算法,通过分阶段聚类并选取代表单元中的种子对象来扩展类, 从而减少区域查询次数,实现快速聚类。该算法保持了基于密度的聚类算法可以发现任意形状的聚类和对噪声数据不敏感的优点,同时保持了基于网格的聚类算法的高效性,适合对大规模数据的挖掘。实验数据分析验证了算法的有效性,对数据挖掘应用于设备状态监测和故障诊断具有指导意义。
English abstract
Based on an analysis of the characteristics of grid-based and density-based clustering methods, a hybrid clustering algorithm based on grid and density is proposed. By clustering in two phases and using only a small number of seed objects in representative units to expand each cluster, the number of region queries is reduced, and consequently so is the running time. An equivalence rule is proposed so that the clustering parameters of the two phases can be converted smoothly into one another. The algorithm retains the advantages of both families of methods: like density-based clustering, it can discover clusters of arbitrary shape and is insensitive to noise, and like grid-based clustering, it is efficient, which makes it suitable for data mining on large databases. Application of the hybrid algorithm to the analysis of accelerometer data demonstrates its effectiveness. The results offer guidance for applying data mining to equipment condition monitoring and fault diagnosis.
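The general idea of combining a grid with a density threshold can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the seed-object expansion, the region query, or the parameter equivalence rule, so all of these are omitted. The function name `grid_density_cluster`, the parameters `cell_size` and `min_pts`, the rule that a cell is dense when it holds at least `min_pts` points, and the use of 8-neighbour cell adjacency are assumptions made only for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def grid_density_cluster(points, cell_size, min_pts):
    """Label each 2-D point with a cluster id, or -1 for noise.

    Hypothetical simplification: clusters are connected groups of dense
    grid cells; points in non-dense cells are treated as noise.
    """
    # Phase 1 (grid): bin every point into a square cell.
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)

    # Assumed density rule: a cell is dense if it holds >= min_pts points.
    dense = {c for c, members in cells.items() if len(members) >= min_pts}

    # Phase 2 (density): flood-fill over adjacent dense cells so that
    # clusters of arbitrary shape are followed cell by cell.
    cell_label = {}
    next_id = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_id
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for dx, dy in product((-1, 0, 1), repeat=2):
                nb = (cx + dx, cy + dy)
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = next_id
                    queue.append(nb)
        next_id += 1

    # Copy each cell's label to its points; everything else stays noise.
    labels = [-1] * len(points)
    for cell, cid in cell_label.items():
        for idx in cells[cell]:
            labels[idx] = cid
    return labels
```

Because each point is visited once for binning and each dense cell once for expansion, the cost of this sketch grows with the number of points and occupied cells rather than with the number of point-to-point distance computations, which is the efficiency argument usually made for grid-based methods.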