
Dominant-Set Based Clustering For Functional Data

Source: School of Mathematical Sciences   Published: 2024-07-15

Speaker: Prof. Honglang Wang (汪洪浪), Indiana University-Purdue University Indianapolis

Time:  July 25, 2024, 13:00-17:00

Venue:  Hainayuan (海纳苑) 2102


About the speaker: Dr. Honglang Wang is an Associate Professor of Statistics in the Department of Mathematical Sciences at Indiana University-Purdue University Indianapolis (IUPUI). He received his PhD in Statistics from Michigan State University in 2015. His research interests focus on Statistical Analysis for Longitudinal and Functional Data, High Dimensional Statistical Inference and its Applications, Causal Inference, Machine Learning/Deep Learning, Nonparametric Statistics, Empirical Likelihood Methods and their Applications, and Statistical Genetics/Genomics.


Abstract: Dominant-set based clustering is a sequential partitioning of data that maximizes within-cluster similarity using a concept from graph theory. This distinguishes it from common existing methods such as K-means clustering, agglomerative hierarchical clustering, and spectral clustering. We propose a hierarchical bipartition procedure under a penalized optimization framework, with the tuning parameter selected by maximizing the modularity of the resulting two clusters. The proposed dominant-set based hierarchical clustering method is applied to functional data clustering with a flexible choice of similarity measures between curves. It is robust not only to imbalanced groups but also to outliers, which overcomes a limitation of many existing clustering methods.
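As a rough illustration of the ideas in this paragraph, the sketch below extracts one dominant set from a similarity matrix between curves and scores the resulting two-cluster split by modularity. It is not the speaker's method: the abstract does not specify the penalized optimization, so the classical replicator-dynamics iteration for dominant sets is used in its place. The function names (`gaussian_similarity`, `dominant_set`, `modularity`), the Gaussian kernel, the bandwidth, and the toy curves are all assumptions made for this example.

```python
import math

def gaussian_similarity(curves, bandwidth=1.0):
    """Similarity matrix (zero diagonal) from mean squared distances between
    curves sampled on a common grid. The kernel choice is an assumption."""
    n = len(curves)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(curves[i], curves[j])) / len(curves[i])
            sim[i][j] = sim[j][i] = math.exp(-d2 / (2.0 * bandwidth ** 2))
    return sim

def dominant_set(sim, iters=2000, tol=1e-10, cutoff=1e-4):
    """Replicator dynamics started from uniform weights; the indices whose
    weight stays above `cutoff` are returned as the dominant set."""
    n = len(sim)
    x = [1.0 / n] * n
    for _ in range(iters):
        payoff = [sum(sim[i][j] * x[j] for j in range(n)) for i in range(n)]
        mean_payoff = sum(x[i] * payoff[i] for i in range(n))
        if mean_payoff <= 0.0:
            break
        new_x = [x[i] * payoff[i] / mean_payoff for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return [i for i in range(n) if x[i] > cutoff]

def modularity(sim, labels):
    """Weighted Newman modularity of a partition given by `labels`."""
    n = len(sim)
    degree = [sum(row) for row in sim]
    total = sum(degree)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += sim[i][j] - degree[i] * degree[j] / total
    return q / total

# Toy, imbalanced data: five similar curves and two curves of a different shape.
grid = [k / 10 for k in range(11)]
big = [[math.sin(2 * math.pi * t) + 0.05 * i for t in grid] for i in range(5)]
small = [[3.0 + math.cos(2 * math.pi * t) + 0.05 * i for t in grid] for i in range(2)]
sim = gaussian_similarity(big + small)

core = dominant_set(sim)  # indices of the first extracted cluster
labels = [1 if i in core else 0 for i in range(len(sim))]
print(core, round(modularity(sim, labels), 3))
```

In a hierarchical procedure, the same bipartition step would be applied again to each resulting group, with modularity used to decide how to tune and whether to accept each split.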

We further propose a semi-supervised clustering method that learns the metric from the labeled portion of the data by maximizing modularity over a linear combination of candidate similarity metrics, and then performs hierarchical dominant-set based clustering tuned by modularity maximization. The proposed algorithm can learn not only a global metric but also an individual metric for each cluster, which permits clustering with overlapping clusters. The method is general, and it is particularly well suited to functional data, which by nature admit a variety of metrics for comparing curves. Empirical investigations using simulation studies and real data applications demonstrate the advantages of our proposed methods.
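The metric-learning step described above can be pictured with a small sketch, given here under strong simplifying assumptions: only two candidate similarities are combined (one on curve values, one on first differences), the combination weight is chosen by a coarse grid search, and the objective is the modularity of the known labels. The names `learn_weights`, `first_differences`, and `gaussian_similarity`, as well as the toy labeled curves, are hypothetical; the optimization used in the talk may differ.

```python
import math

def gaussian_similarity(features, bandwidth=1.0):
    """Similarity matrix (zero diagonal) from mean squared feature distances."""
    n = len(features)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j])) / len(features[i])
            sim[i][j] = sim[j][i] = math.exp(-d2 / (2.0 * bandwidth ** 2))
    return sim

def first_differences(curve, step):
    """Difference quotients, a crude stand-in for the derivative of a curve."""
    return [(b - a) / step for a, b in zip(curve, curve[1:])]

def modularity(sim, labels):
    """Weighted Newman modularity of a partition given by `labels`."""
    n = len(sim)
    degree = [sum(row) for row in sim]
    total = sum(degree)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += sim[i][j] - degree[i] * degree[j] / total
    return q / total

def learn_weights(sim_a, sim_b, labels, steps=10):
    """Grid search over w in [0, 1] for the mix w * sim_a + (1 - w) * sim_b
    that maximizes the modularity of the labeled partition."""
    best_weights, best_q = (0.0, 1.0), float("-inf")
    for k in range(steps + 1):
        w = k / steps
        mixed = [[w * a + (1.0 - w) * b for a, b in zip(row_a, row_b)]
                 for row_a, row_b in zip(sim_a, sim_b)]
        q = modularity(mixed, labels)
        if q > best_q:
            best_weights, best_q = (w, 1.0 - w), q
    return best_weights, best_q

# Toy labeled curves: the labels follow the slope (shape), not the level.
step = 0.1
grid = [k / 10 for k in range(11)]
curves = [
    [t for t in grid],          # rising, level 0
    [t + 5.0 for t in grid],    # rising, level 5
    [-t for t in grid],         # falling, level 0
    [-t + 5.0 for t in grid],   # falling, level 5
]
labels = [0, 0, 1, 1]

sim_values = gaussian_similarity(curves)
sim_shape = gaussian_similarity([first_differences(c, step) for c in curves])
weights, q = learn_weights(sim_values, sim_shape, labels)
print(weights, round(q, 3))
```

The learned combination would then serve as the similarity matrix for clustering the unlabeled curves. Learning a separate weight vector per cluster, as the abstract mentions, is not covered by this sketch.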




