Computing and Applications Seminar
Title: Privacy introduces minor trade-offs in fine-tuning: a perspective from training dynamics and representations
Speaker: Chendi Wang (Xiamen University)
Time: 10:30 a.m., Wednesday, November 12, 2025
Venue: Room 920, Building 2, Hainayuan
Abstract: In this talk, we study the behavior of representations and training dynamics when fine-tuning foundation models under differential privacy (DP). Specifically, we employ recently developed representation learning tools, such as the laws of data separation and next-token prediction separability metrics, to analyze how DP noise affects feature quality across transformer blocks.
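For readers unfamiliar with these diagnostics, the minimal sketch below shows how one might score feature quality block by block. The between-class to within-class scatter ratio used here is only an illustrative proxy for the data-separation law, not the speaker's exact metric, and `features_per_block` is a hypothetical list of per-block feature tensors.

import torch

def separation_score(feats: torch.Tensor, labels: torch.Tensor) -> float:
    """Illustrative proxy for the 'law of data separation': ratio of
    between-class to within-class scatter of one block's features.
    feats: (n_samples, dim); labels: (n_samples,)."""
    mu = feats.mean(dim=0)                   # global feature mean
    between = feats.new_zeros(())
    within = feats.new_zeros(())
    for c in labels.unique():
        fc = feats[labels == c]
        mu_c = fc.mean(dim=0)
        between = between + len(fc) * (mu_c - mu).pow(2).sum()  # class-mean spread
        within = within + (fc - mu_c).pow(2).sum()              # intra-class spread
    return (between / within).item()

# Usage (hypothetical): score each transformer block to see where DP noise hurts.
# for i, f in enumerate(features_per_block):
#     print(f"block {i}: separation = {separation_score(f, labels):.3f}")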
Our experiments span a wide range of settings, including Vision Transformers fine-tuned on CIFAR-10 and Llama 3.2 models of various scales fine-tuned on mathematical reasoning benchmarks. We find that while poorly tuned hyperparameters can severely distort learned representations, carefully tuned hyperparameters preserve high-quality features during fine-tuning. This observation explains why public pretraining effectively mitigates the privacy–utility trade-off. Beyond representation quality, we further examine the training dynamics of DP fine-tuning in large language models. Taken together, our findings suggest that, from a representation learning perspective, privacy introduces only minor trade-offs in fine-tuning.
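As background on the privacy mechanism being evaluated, DP fine-tuning is typically implemented with DP-SGD: each per-sample gradient is clipped to a norm bound C, and Gaussian noise scaled by a multiplier sigma is added to the averaged update. The sketch below is a minimal plain-PyTorch illustration under these assumptions (C and sigma are illustrative hyperparameters, and production code would normally use a dedicated DP library rather than this per-sample loop); it is not the speaker's implementation.

import torch

def dp_sgd_step(model, loss_fn, xs, ys, optimizer, C=1.0, sigma=1.0):
    """One DP-SGD step: clip each per-sample gradient to norm C,
    sum, add Gaussian noise with std sigma * C, then average."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):                          # per-sample gradients
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (C / (norm + 1e-12)).clamp(max=1.0)   # clipping factor
        for s, g in zip(summed, grads):
            s.add_(g * scale)
    n = len(xs)
    for p, s in zip(params, summed):                  # noisy averaged gradient
        p.grad = (s + torch.randn_like(s) * sigma * C) / n
    optimizer.step()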
Bio: Chendi Wang is an assistant professor at the Wang Yanan Institute for Studies in Economics (WISE) and the School of Economics, Xiamen University. He received his Ph.D. from The Hong Kong Polytechnic University and his bachelor's degree from Beijing Normal University. From 2021 to 2024, he was a visiting scholar in the Department of Statistics and Data Science at the Wharton School, University of Pennsylvania. His research focuses on data privacy and machine learning, with work accepted or published in top journals such as the Proceedings of the National Academy of Sciences (PNAS) and at leading AI conferences including ICML, ICLR, and NeurIPS, among them an ICML 2024 oral presentation (top 1.5%) and a NeurIPS 2025 spotlight (top 3%). His collaborative research on the privacy of U.S. Census data has also been covered by the magazine New Scientist.