[Livestream] "USTC Statistical and Computational Physics, Youjin Deng Group" Column | Deep Learning Theory through the Lens of Statistical Physics
Academic
2024-11-25 00:01
Anhui
"USTC Statistical and Computational Physics, Youjin Deng Group" Column
Deep Learning Theory through the Lens of Statistical Physics
November 27, 2024, 14:30
On November 26, 2024, Dr. Wei Huang will give the academic talk "Deep Learning Theory through the Lens of Statistical Physics". The talk is hosted by the Statistical and Computational Physics group of Youjin Deng at the University of Science and Technology of China (USTC); details are below.
Scan the QR code on 蔻享学术 (Koushare) to watch the livestream.
About the Speaker
黄伟 (Wei Huang)
Dr. Wei Huang is a Research Scientist in the Deep Learning Theory Team at RIKEN AIP, Tokyo. He obtained his PhD in Computer Science from the University of Technology Sydney, holds a Master's degree in Statistical Physics from the University of Science and Technology of China, and earned a Bachelor's degree in Physics from the University of Science and Technology of China. Dr. Huang focuses on the theoretical foundations of deep learning, as well as on developing new methods that improve interpretability and performance in large language models, graph neural networks, and computer vision. His work has been published at venues such as NeurIPS, ICLR, and ICML.

Abstract
Deep learning has become a cornerstone of modern artificial intelligence, achieving remarkable success across various domains. However, the mechanisms governing the training and generalization of deep models remain elusive, particularly regarding the relationship between overparameterization and generalization. Statistical physics provides a powerful framework for analyzing complex systems, offering insights into deep learning through concepts such as disordered systems, energy landscapes, and phase transitions. This talk will explore how statistical-physics methods can unravel the training dynamics and generalization of neural networks, focusing on applications in Neural Tangent Kernels (NTKs) and Mean-Field Feature Learning. Additionally, we will discuss the growing significance of diffusion models, emphasizing their theoretical foundations and applications. By bridging insights from statistical physics with deep learning paradigms, this session aims to provide a comprehensive understanding of the mechanisms underlying modern neural network architectures and inspire new directions for research and applications.
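The Neural Tangent Kernel mentioned in the abstract can be illustrated with a small numerical sketch. The snippet below is not part of the talk materials; it is a minimal sketch assuming a one-hidden-layer ReLU network under the standard 1/sqrt(m) NTK parameterization, and it computes the empirical NTK Θ(x, x') = ∇_θ f(x) · ∇_θ f(x') for two inputs. As the hidden width m grows, this kernel concentrates around a deterministic infinite-width limit and stays nearly constant during training, which is the regime the abstract alludes to.

```python
import numpy as np

def empirical_ntk(x1, x2, W, a):
    """Empirical NTK of a one-hidden-layer ReLU network
    f(x) = (1/sqrt(m)) * a . relu(W @ x), i.e. the inner product of the
    parameter gradients grad_theta f(x1) . grad_theta f(x2)."""
    m = W.shape[0]
    pre1, pre2 = W @ x1, W @ x2                     # pre-activations, shape (m,)
    act1, act2 = np.maximum(pre1, 0), np.maximum(pre2, 0)
    d1, d2 = (pre1 > 0).astype(float), (pre2 > 0).astype(float)  # ReLU derivatives
    # Output-weight contribution: sum_j relu(w_j.x1) * relu(w_j.x2) / m
    k_a = act1 @ act2 / m
    # Input-weight contribution: sum_j a_j^2 * relu'(w_j.x1) * relu'(w_j.x2) * (x1.x2) / m
    k_w = (a**2 * d1 * d2).sum() * (x1 @ x2) / m
    return k_a + k_w

rng = np.random.default_rng(0)
d, m = 5, 10_000                     # input dimension, hidden width
W = rng.standard_normal((m, d))      # hidden-layer weights at initialization
a = rng.standard_normal(m)           # output weights at initialization
x1, x2 = rng.standard_normal(d), rng.standard_normal(d)

print("empirical NTK(x1, x2) =", empirical_ntk(x1, x2, W, a))
```

Rerunning the sketch with different random seeds at large m shows only small fluctuations in the kernel value, a hands-on view of the concentration effect that underlies NTK-based analyses of training dynamics.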