自适应时间平滑的演化谱聚类
CSTR:
作者:
作者单位:

扬州大学信息工程学院, 扬州 225009

通讯作者:

徐晓华,男,副教授,E-mail:arterx@gmail.com。

中图分类号:

TP181

基金项目:

国家自然科学基金(61402395)资助项目;江苏省自然科学基金(BK20201430,BK20151314,BK20140492)资助项目。


Adaptive Time-Smoothed Evolutionary Spectral Clustering
Author:
Affiliation:

College of Information Engineering, Yangzhou University, Yangzhou 225009, China

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献 [17]
  • |
  • 相似文献 [20]
  • | | |
  • 文章评论
    摘要:

    传统的聚类算法一般只适用于静态数据的处理,而真实世界的数据往往数据量大且变化多,静态的聚类算法不能为动态数据提供其演化规律的分析学习。演化数据的聚类,一方面要正确反映每一时刻数据的合理簇划分,另一方面又要使动态的聚类结果在演化过程中尽可能平滑。本文提出了一种自适应时间平滑的演化聚类框架,该模型考虑到当前时刻数据与历史时刻数据的未知关联,通过限定时间回溯的范围,自适应地寻找与当前快照最相关的历史快照,并通过有机融合基于Itakura-Saito距离的静态相似度和基于时间序列的动态相似度,计算各个时间片快照上的相似度矩阵。本文进一步提出了两种自适应时间平滑的演化谱聚类算法,从不同的角度定义时间代价,得到不同的演化聚类结果。在真实数据集上的实验表明这两种算法能够有效地利用历史数据,在聚类结果上准确性更高,时间平滑性也更好。

    Abstract:

    Traditional clustering algorithms are generally only suitable for static data processing, while the real world data are often large and changeable, so static clustering algorithms cannot provide the analysis and learning of evolution rules for dynamic data. On one hand, the clustering of evolutionary data needs to reflect the reasonable cluster partition of data at each snapshot; on the other hand, it needs to make sure the dynamic clustering results are as smooth as possible. This paper proposes an adaptive time-smoothed evolutionary clustering framework, which takes into account of the unknown relationship between the current data and the historical data. By imposing a time window for backtracking, it adaptively finds the most relevant historical snapshot to the current snapshot. Meanwhile, it fuses the static similarity based on the Itakura-Saito distance and the dynamic similarity based on the time series to compute, so as so compute the similarity matrix on each snapshot. Under this framework, this paper further proposes two adaptive time-smoothed evolutionary spectral clustering algorithms, which define the time cost from different aspects, and obtain different evolutionary clustering results. Experiments on real datasets show that the two proposed algorithms can effectively utilize historical data, and achieve better clustering performance as well as better temporal smoothness.

    图1 演化数据示例Fig.1 Illustration of evolutionary data
    图2 非线性演化数据示例Fig.2 Illustration of non-linear evolutionary data
    图3 MNIST部分数据样本示例Fig.3 Illustration of data samples in MNIST dataset
    图4 Iris数据集快照聚类质量的比较Fig.4 Comparison of snapshots quality on Iris dataset
    图5 Iris数据集时间损失的比较Fig.5 Comparison of history cost on Iris dataset
    图6 Iris数据集NMI的比较Fig.6 Comparison of NMI on Iris dataset
    图7 MNIST数据集快照聚类质量的比较Fig.7 Comparison of snapshots quality on MNIST dataset
    图8 MNIST数据集时间损失的比较Fig.8 Comparison of history cost on MNIST dataset
    图9 MNIST数据集NMI的比较Fig.9 Comparison of NMI on MNIST dataset
    参考文献
    [1] 张长水, 张见闻.演化数据的学习[J]. 计算机学报, 2013, 36(2): 310-316.
    [2] Chakrabarti D, Kumar R, Tomkins A. Evolutionary clustering[C]//Proceedings of 12th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining.Philadelphia PA USA: ACM, 2006: 554-560.
    [3] Chakrabarti D, SAMMUT C, WEBB G. Graph mining[C]//Proceedings of Encyclopedia of Machine Learning and Data Mining. Boston, MA:Springer,2017: 581-584.
    [4] Chakrabarti D, Funiak S, Chang J, et al. Joint label inference in networks[C]//Proceedings of 27th International World Wide Web Conference. Lyon France: ACM,2018: 483-487.
    [5] Chi Y, Song X, Zhou D, et al. Evolutionary spectral clustering by incorporating temporal smoothness[C]//Proceedings of 13th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. San Jose California USA: ACM, 2007: 153-162.
    [6] Lin Y R, Chi Y, Zhu S, et al. Analyzing communities and their evolutions in dynamic social networks[J]. ACM Transactions on Knowledge Discovery from Data, 2009, 3(2): 1-31.
    [7] Tang L, Wang X, Liu H. Community detection via heterogeneous interaction analysis[J]. Data Mining and Knowledge Discovery, 2012, 25(1): 1-33.
    [8] Folino F, Pizzuti C. An evolutionary multiobjective approach for community discovery in dynamic networks[J]. IEEE Transactions on Knowledge & Data Engineering, 2014, 26(8): 1838-1852.
    [9] Folino F, Pizzuti C. Multiobjective evolutionary community detection for dynamic networks[C]//Proceedings of Genetic and Evolutionary Computation Conference. Portland, Oregon, USA: ACM, 2010: 535-536.
    [10] Rana C, Jain S K. An evolutionary clustering algorithm based on temporal features for dynamic recommender systems[J]. Swarm & Evolutionary Computation, 2014, 14: 21-30.
    [11] Giulio R, Luca P, Dino P, et al. Tiles: An online algorithm for community discovery in dynamic social networks[J]. Machine Learning,2017, 106(8): 1231-1241.
    [12] Xu K S, Kliger M, Hero A O. Adaptive evolutionary clustering[J]. Data Mining & Knowledge Discovery, 2014, 28(2): 304-336.
    [13] Xu K S, Hero A O. Dynamic stochastic blockmodels for time-evolving social networks[J]. IEEE Journal of Selected Topics in Signal Processing,2014, 8(4): 552-562.
    [14] Yu S, Liu M, Dou W, et al. Networking for big data: A survey[J]. IEEE Communications Surveys & Tutorials, 2017, 19(1): 531-549.
    [15] Li R, Zhang W, Zhao Y, et al. Sparsity learning formulations for mining time-varying data[J]. IEEE Transactions on Knowledge and Data Engineering, 2014, 27(5): 1411-1423.
    [16] Shi J, Malik J M. Normalized cuts and image segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(8): 888-905.
    [17] Bregman L M. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming[J]. USSR Computational Mathematics & Mathematical Physics, 1967, 7(3): 200-217.
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

何萍,姜玉麟,徐晓华,林惠惠,葛方毅,方威,仁祥.自适应时间平滑的演化谱聚类[J].南京航空航天大学学报,2021,53(5):700-707

复制
分享
文章指标
  • 点击次数:800
  • 下载次数: 1890
  • HTML阅读次数: 727
  • 引用次数: 0
历史
  • 收稿日期:2020-09-25
  • 最后修改日期:2020-11-09
  • 在线发布日期: 2021-10-05
文章二维码
您是第6842775位访问者
网站版权 © 南京航空航天大学学报
技术支持:北京勤云科技发展有限公司
请使用 Firefox、Chrome、IE10、IE11、360极速模式、搜狗极速模式、QQ极速模式等浏览器,其他浏览器不建议使用!