Data Science | Stochastic Optimization for AUC Maximization in Machine Learning
Stochastic optimization algorithms such as stochastic gradient descent (SGD) update the model sequentially with cheap per-iteration costs, making them well suited to large-scale streaming data analysis. However, most existing studies focus on classification accuracy and cannot be directly applied to the important problems of maximizing the area under the ROC curve (AUC) in imbalanced classification and bipartite ranking.
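To make the objective concrete: the AUC of a scoring function equals the probability that a randomly drawn positive example is scored above a randomly drawn negative one (the Wilcoxon–Mann–Whitney statistic). A minimal sketch of this pairwise definition, with the function name `pairwise_auc` chosen for illustration:

```python
import numpy as np

def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as one half. Labels are assumed to be +1 / -1."""
    pos = scores[labels == 1]
    neg = scores[labels == -1]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

scores = np.array([0.9, 0.8, 0.3, 0.1])
labels = np.array([1, -1, 1, -1])
print(pairwise_auc(scores, labels))  # 0.75: three of four pairs ranked correctly
```

Because the objective couples pairs of examples rather than individual ones, a vanilla per-example SGD update does not apply directly, which is what motivates the reformulations discussed in the talk.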
In this talk, I will present our recent work on developing novel SGD-type algorithms for AUC maximization. In particular, we establish minimax-optimal rates for the existing AUC maximization algorithms using a stability and generalization trade-off framework. We then exploit the problem structure to develop fast stochastic optimization algorithms for AUC maximization with the least-squares loss; the main idea is to reformulate it as a stochastic saddle-point (min-max) problem. Furthermore, we extend the developed saddle-point formulation to the setting of general losses via Bernstein polynomial approximation, and to the setting of deep neural networks. This talk is based on a series of joint works with Michael Natole, Neyo Yang, Wei Shen, Siwei Lyu, and Tianbao Yang.
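The saddle-point idea can be illustrated with a toy stochastic primal-dual sketch: for the least-squares AUC loss, the pairwise objective admits a min-max form over a model `w`, two scalar auxiliary variables `a`, `b`, and a scalar dual variable `alpha`, all of which can be updated from a single streamed example. Everything below is illustrative rather than the talk's actual algorithm: the function name `solam_sketch`, the constant step size, the unconstrained updates, and the simplified objective are assumptions; the published methods additionally use projections and carefully chosen step sizes.

```python
import numpy as np

def solam_sketch(stream, d, eta=0.01):
    """Toy stochastic primal-dual update for least-squares AUC maximization:
    gradient descent on the primal variables (w, a, b) and gradient ascent on
    the scalar dual variable alpha, one example at a time. Simplified sketch."""
    w = np.zeros(d)
    a = b = alpha = 0.0
    p, t = 0.5, 0  # online running estimate of P(y = +1)
    for x, y in stream:
        t += 1
        p += (float(y == 1) - p) / t  # running mean of the positive-class ratio
        s = w @ x  # current score of the streamed example
        if y == 1:
            gw = 2 * (1 - p) * (s - a) * x - 2 * (1 + alpha) * (1 - p) * x
            ga = -2 * (1 - p) * (s - a)
            gb = 0.0
            galpha = -2 * (1 - p) * s - 2 * p * (1 - p) * alpha
        else:
            gw = 2 * p * (s - b) * x + 2 * (1 + alpha) * p * x
            gb = -2 * p * (s - b)
            ga = 0.0
            galpha = 2 * p * s - 2 * p * (1 - p) * alpha
        # descent on the primal block, ascent on the dual variable
        w -= eta * gw
        a -= eta * ga
        b -= eta * gb
        alpha += eta * galpha
    return w, a, b
```

The key point of the sketch is structural: each update touches only one example, so the per-iteration cost matches plain SGD even though the AUC objective is defined over pairs.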