TIME
Tuesday, December 3, 2024, 14:00-15:00
VENUE
Conference Room 308, School of Information Management
SPEAKER
Muning Wen (温睦宁) is a third-year Ph.D. student at Shanghai Jiao Tong University, supervised by Professor Weinan Zhang. He has extensive theoretical and practical experience in reinforcement learning, multi-agent systems, and LLM agents. His recent research focuses on developing RL/MARL algorithms that enhance the sequential decision-making capabilities of LLM agents in dynamic environments, and on applying these algorithms in fields such as data science, mathematics, and embodied intelligence. Over the past three years, Muning has published more than ten papers at top-tier conferences, including NeurIPS, ICML, and ICLR. Since 2023, he has also served as a reviewer for these conferences.
PERSONAL HOMEPAGE
https://scholar.google.com/citations?user=Zt1WFtQAAAAJ
TITLE
Building Generalizable Sequential Decision-Making Systems: Multi-Agent Reinforcement Learning in the Era of LLMs
ABSTRACT
In this talk, the speaker will discuss the feasibility of building a sequential decision-making system with strong generalization abilities, drawing on his research in multi-agent reinforcement learning and LLM agents. He will first introduce the Multi-Agent Advantage Decomposition Theorem and its application in multi-agent reinforcement learning. This theorem allows the MARL problem to be transformed into a sequence modeling problem, which can then be optimized with sequence models such as Transformers. The speaker will then present his latest work on improving the performance of LLM agents, including a framework for LLM agent reinforcement learning: Bellman backup with Action Decomposition (BAD) and Policy Optimization with Action Decomposition (POAD). The framework aims to bridge the theoretical gaps between reinforcement learning and language model optimization and to improve learning efficiency. Lastly, the speaker will explore the alignment between multi-agent sequence modeling methods and the current generative paradigm of language agents, discussing the potential and challenges of applying multi-agent reinforcement learning to systems involving multiple language agents.
All are welcome to attend!