Spectral Clustering: Methodology and Statistical Analysis
2023-06-02 21:39:00
Title: Spectral Clustering: Methodology and Statistical Analysis
Speaker: Zhang Ye (University of Pennsylvania)
Time: June 2, 2023 (Friday) 16:00
Location: 312, Building 2, Hainayuan
Summary: Spectral clustering is one of the most popular algorithms for grouping high-dimensional data. It is easy to implement, computationally efficient, and has achieved success in many applications. The idea behind spectral clustering is dimensionality reduction. It first performs a spectral decomposition of the dataset and keeps only the leading few spectral components to reduce the dimension of the data. It then applies a standard method such as k-means in the low-dimensional space to do the clustering. In this talk, we justify the success of spectral clustering by providing a sharp statistical analysis of its performance under mixture models. For isotropic Gaussian mixture models, we show spectral clustering is optimal. For sub-Gaussian mixture models, we derive empirical error rates for spectral clustering. To establish these results, we develop a new spectral perturbation analysis for singular subspaces.
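
The two-step procedure described in the summary (spectral decomposition, then k-means on the leading components) can be sketched as follows. This is a minimal illustration and not the speaker's implementation: it assumes the data are the rows of a matrix X, uses a singular value decomposition as the spectral decomposition, and relies on a small hand-written k-means. The function names, the initialization rule, and the simulated two-component isotropic Gaussian mixture are all assumptions made for the example.

```python
import numpy as np


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with a farthest-point initialization (illustrative)."""
    rng = np.random.default_rng(seed)
    # Start from one random point, then repeatedly add the point farthest
    # from the centers chosen so far.
    centers = [points[rng.integers(len(points))]]
    for _ in range(1, k):
        d2 = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d2)])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_clustering(X, k):
    """Keep the top-k singular components of X, then run k-means on them."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    embedding = U[:, :k] * s[:k]  # n x k low-dimensional representation
    return kmeans(embedding, k)


def demo_accuracy(n=200, p=50, seed=1):
    """Simulate a two-component isotropic Gaussian mixture and report accuracy."""
    rng = np.random.default_rng(seed)
    mu = np.full(p, 4.0 / np.sqrt(p))        # center with Euclidean norm 4
    truth = rng.integers(0, 2, size=n)        # hidden cluster labels
    signs = np.where(truth == 1, 1.0, -1.0)
    X = signs[:, None] * mu[None, :] + rng.standard_normal((n, p))
    labels = spectral_clustering(X, k=2)
    agree = np.mean(labels == truth)
    return max(agree, 1.0 - agree)            # labels are defined up to a swap


if __name__ == "__main__":
    print("clustering accuracy:", demo_accuracy())
```

Because cluster labels are only identified up to permutation, the demo scores the better of the two possible label matchings. The mixture centers here are well separated relative to the noise level, so the sketch is meant to show the mechanics rather than to probe the regime the talk analyzes.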