Theory Center Frontier Lecture Series | Online Lecture: Revealing the Reasoning Mechanisms of Large Language Models - Level-2 Reasoning Skill Beyond Human



The 20th session of the Microsoft Research Asia Theory Center Frontier Lecture Series will take place on Wednesday, September 25, from 10:00 to 11:00 a.m.


In this session, Tian Ye, a PhD student at Carnegie Mellon University, will give a talk titled "Revealing the Reasoning Mechanisms of Large Language Models—Level-2 Reasoning Skill Beyond Human." You are welcome to join via Teams!



The Theory Center Frontier Lecture Series is an ongoing lecture series hosted by Microsoft Research Asia. It invites researchers at the forefront of theoretical research from around the world to present their findings, covering theoretical advances in big data, artificial intelligence, and other related fields. The lectures are delivered as live online broadcasts combined with offline discussions. Through this series, we hope to explore the latest frontiers of theoretical research together with you and to build an active theory research community.


Faculty and students interested in theoretical research are welcome to attend the lectures and join the community (see below for how to join), to jointly advance theoretical research, strengthen interdisciplinary collaboration, help break through bottlenecks in AI development, and achieve substantive progress in computer technology!



How to Participate


You are welcome to join via Teams and interact with the speaker.

Meeting link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_OTExNmEwMjYtOGVmYi00MmZmLWE4Y2QtZWMyNmEzYTA0ZTgy%40thread.v2/0?context=%7b%22Tid%22%3a%2272f988bf-86f1-41af-91ab-2d7cd011db47%22%2c%22Oid%22%3a%2259a635d1-33e7-4d85-a6c4-c84917190ee5%22%7d

Meeting ID: 256 449 644 193

Meeting password: dJu5sx

Meeting time: Wednesday, September 25, 10:00 - 11:00 a.m.


Lecture Information


Tian Ye

Carnegie Mellon University

PhD Student


Tian Ye is a PhD student in Machine Learning at Carnegie Mellon University and currently a Research Scientist Intern at Meta. His research focuses primarily on the reasoning mechanisms of large language models, and he has published papers at top conferences such as NeurIPS. He twice qualified for the Chinese Mathematical Olympiad national training team, and he earned his bachelor's degree from the Yao Class at Tsinghua University.


Talk Title:

Revealing the Reasoning Mechanisms of Large Language Models—Level-2 Reasoning Skill Beyond Human


The latest language models have demonstrated near-perfect accuracy on elementary math test sets (such as GSM8K), indicating their capability to solve mathematical reasoning problems. To investigate how language models address these issues, we designed a series of controlled variable experiments and explored the following questions: First, have language models truly learned genuine reasoning abilities, or do they merely rely on memorizing answer templates? Second, what is the nature of the internal reasoning process within the models? Third, do the models employ human-like techniques to solve mathematical problems? Fourth, can models trained on datasets like GSM8K learn reasoning skills beyond what is necessary to solve GSM8K problems? Fifth, what causes the models to make reasoning errors? Sixth, how large or deep must a model be to effectively solve math problems at the GSM8K level? Our research reveals many hidden mechanisms by which language models solve mathematical problems and provides insights that surpass the current understanding of large language models.



Previous Lecture Recap


In the previous lecture, we invited Shang-Hua Teng, Professor of Computer Science at the University of Southern California, to give a talk titled "Regularization and Optimal Multiclass Learning," exploring the role of regularization in settings where ERM fails.


For details of past lectures, please click "Read More" at the end of this article or visit the link below:

https://www.microsoft.com/en-us/research/event/msr-asia-theory-lecture-series/



Join the Theory Research Community


You are welcome to scan the QR code to join the theory research community and exchange ideas with fellow researchers interested in theoretical research. The latest information about the Microsoft Research Asia Theory Center Frontier Lecture Series will also be shared in the group.


[WeChat group QR code]



You can also subscribe to lecture updates by sending an email with the subject "Subscribe the Lecture Series" to MSRA.TheoryCenter@outlook.com.


About the Microsoft Research Asia Theory Center

The Microsoft Research Asia Theory Center was officially established in December 2021. By building a hub for international academic exchange and collaboration, the center aims to promote the deep integration of theoretical research with big data and artificial intelligence technologies, advancing theoretical research while strengthening interdisciplinary collaboration, helping break through bottlenecks in AI development, and achieving substantive progress in computer technology. The center currently brings together members from different teams and research backgrounds within Microsoft Research Asia, focusing on fundamental problems in areas including deep learning, reinforcement learning, dynamical systems learning, and data-driven optimization.


For more information about the Theory Center, please visit:
https://www.microsoft.com/en-us/research/group/msr-asia-theory-center/


