Paper Roundup | Recent Advances in Agent Research
From the 34 papers collected between 2024-12-04 and 2024-12-09, we have selected 5 outstanding works to share with our readers.
1. HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
2. Distributed Task Allocation for Multi-Agent Systems: A Submodular Optimization Approach
3. MTSpark: Enabling Multi-Task Learning with Spiking Neural Networks for Generalist Agents
4. Mobile Cell-Free Massive MIMO with Multi-Agent Reinforcement Learning: A Scalable Framework
5. Out-of-Distribution Detection for Neurosymbolic Autonomous Cyber Agents
1.HyperMARL: Adaptive Hypernetworks for Multi-Agent RL
Authors: Kale-ab Abebe Tessera, Arrasy Rahman, Stefano V. Albrecht
https://arxiv.org/abs/2412.04233
Abstract
Balancing individual specialisation and shared behaviours is a critical challenge in multi-agent reinforcement learning (MARL). Existing methods typically focus on encouraging diversity or leveraging shared representations. Full parameter sharing (FuPS) improves sample efficiency but struggles to learn diverse behaviours when required, while no parameter sharing (NoPS) enables diversity but is computationally expensive and sample inefficient. To address these challenges, we introduce HyperMARL, a novel approach using hypernetworks to balance efficiency and specialisation. HyperMARL generates agent-specific actor and critic parameters, enabling agents to adaptively exhibit diverse or homogeneous behaviours as needed, without modifying the learning objective or requiring prior knowledge of the optimal diversity. Furthermore, HyperMARL decouples agent-specific and state-based gradients, which empirically correlates with reduced policy gradient variance, potentially offering insights into its ability to capture diverse behaviours. Across MARL benchmarks requiring homogeneous, heterogeneous, or mixed behaviours, HyperMARL consistently matches or outperforms FuPS, NoPS, and diversity-focused methods, achieving NoPS-level diversity with a shared architecture. These results highlight the potential of hypernetworks as a versatile approach to the trade-off between specialisation and shared behaviours in MARL. (All code will be made available soon.)
Brief Review
This paper presents HyperMARL, an innovative approach to balancing individual specialisation and shared behaviour in multi-agent reinforcement learning. By integrating hypernetworks into MARL, the method improves sample efficiency and reduces policy-gradient variance while still allowing diverse or homogeneous behaviours to emerge as needed. Experiments show that HyperMARL achieves marked improvements over existing methods across multiple MARL benchmarks. The paper also examines how hypernetworks decouple agent-specific and state-based gradients, improving training stability. Overall, the work offers a fresh perspective and a practical solution for MARL, and is a meaningful contribution to the field.
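The core mechanism, generating each agent's policy parameters from one shared hypernetwork conditioned on an agent embedding, can be sketched in a few lines. This is an illustrative NumPy toy, not the authors' implementation; all dimensions, variable names, and the single-linear-layer hypernetwork are assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, ACT_DIM, HID = 4, 3, 8   # toy sizes (assumptions)
EMB_DIM = 5                        # size of the per-agent ID embedding

# Hypernetwork: one linear map from an agent embedding to the flattened
# parameters of that agent's small policy MLP (shared across all agents).
N_PARAMS = OBS_DIM * HID + HID + HID * ACT_DIM + ACT_DIM
W_hyper = rng.normal(scale=0.1, size=(EMB_DIM, N_PARAMS))
agent_embeddings = rng.normal(size=(2, EMB_DIM))  # two agents

def policy_logits(agent_id: int, obs: np.ndarray) -> np.ndarray:
    """Generate agent-specific policy parameters, then run the policy."""
    theta = agent_embeddings[agent_id] @ W_hyper   # agent-specific weights
    i = 0
    W1 = theta[i:i + OBS_DIM * HID].reshape(OBS_DIM, HID); i += OBS_DIM * HID
    b1 = theta[i:i + HID]; i += HID
    W2 = theta[i:i + HID * ACT_DIM].reshape(HID, ACT_DIM); i += HID * ACT_DIM
    b2 = theta[i:i + ACT_DIM]
    h = np.tanh(obs @ W1 + b1)
    return h @ W2 + b2

obs = rng.normal(size=OBS_DIM)
logits_a, logits_b = policy_logits(0, obs), policy_logits(1, obs)
```

Both agents share `W_hyper`, so learning updates flow through one set of weights, yet the generated `W1`/`W2` differ per agent, which is how specialisation can emerge without maintaining separate per-agent networks.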
2.Distributed Task Allocation for Multi-Agent Systems: A Submodular Optimization Approach
Authors: Jing Liu, Fangfei Li, Xin Jin, Yang Tang
https://arxiv.org/abs/2412.02146
Abstract
This paper investigates dynamic task allocation for multi-agent systems (MASs) under resource constraints, with a focus on maximizing the global utility of agents while ensuring a conflict-free allocation of targets. We present a more adaptable submodular maximization framework for MAS task allocation under resource constraints. Our proposed distributed greedy bundles algorithm (DGBA) is specifically designed to address communication limitations in MASs and provides rigorous approximation guarantees for submodular maximization under-independent systems, with low computational complexity. Specifically, DGBA can generate a feasible task allocation policy within polynomial time complexity, significantly reducing space complexity compared to existing methods. To demonstrate the practical viability of our approach, we apply DGBA to the scenario of active observation information acquisition within a micro-satellite constellation, transforming the NP-hard task allocation problem into a tractable submodular maximization problem under a-independent system constraint. Our method not only provides a specific performance bound but also surpasses benchmark algorithms in metrics such as utility, cost, communication time, and running time.
Brief Review
This paper studies dynamic task allocation in multi-agent systems (MASs) and proposes a distributed greedy bundles algorithm (DGBA) that maximizes global utility while guaranteeing conflict-free target assignment. The authors detail the method's advantages: a submodular-maximization framework with rigorous approximation guarantees and low computational complexity. Simulation experiments validate DGBA's effectiveness and its superiority over existing methods, demonstrating clear practical relevance. In sum, the paper addresses a pressing challenge and offers a new line of attack on similar problems, with both theoretical and practical value.
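The flavour of greedy submodular allocation can be illustrated with a plain, centralized marginal-gain loop for conflict-free assignment. This is a simplified sketch, not the distributed DGBA algorithm from the paper; the square-root coverage utility, bundle capacity, and all names are illustrative assumptions.

```python
from itertools import product

def greedy_allocate(agents, targets, utility, capacity):
    """Repeatedly give the (agent, target) pair with the largest marginal
    utility gain its target, keeping the allocation conflict-free
    (each target assigned at most once, bundles capped at `capacity`)."""
    alloc = {a: set() for a in agents}
    free = set(targets)
    total = lambda: sum(utility(a, alloc[a]) for a in agents)
    while free:
        base, best, best_gain = total(), None, 0.0
        for a, t in product(agents, free):
            if len(alloc[a]) >= capacity:
                continue
            alloc[a].add(t)                  # tentatively assign t to a
            gain = total() - base
            alloc[a].remove(t)
            if gain > best_gain:
                best, best_gain = (a, t), gain
        if best is None:                     # no positive marginal gain left
            break
        a, t = best
        alloc[a].add(t)
        free.remove(t)
    return alloc

# Toy instance: square-root-of-total-weight utility (diminishing returns,
# hence submodular); weights and capacity are made up for illustration.
weights = {"t1": 4.0, "t2": 1.0, "t3": 1.0}
alloc = greedy_allocate(["A", "B"], weights,
                        lambda a, S: sum(weights[t] for t in S) ** 0.5,
                        capacity=2)
```

On this instance greedy yields A = {t1}, B = {t2, t3}: once A holds the heavy target, diminishing returns make the remaining targets worth more to B.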
3.MTSpark: Enabling Multi-Task Learning with Spiking Neural Networks for Generalist Agents
Authors: Avaneesh Devkota, Rachmad Vidya Wicaksana Putra, Muhammad Shafique
https://arxiv.org/abs/2412.04847
Abstract
Currently, state-of-the-art RL methods excel in single-task settings, but they still struggle to generalize across multiple tasks due to “catastrophic forgetting” challenges, where previously learned tasks are forgotten as new tasks are introduced. This multi-task learning capability is significantly important for generalist agents, where adaptation features are highly required (e.g., autonomous robots). On the other hand, Spiking Neural Networks (SNNs) have emerged as alternative energy-efficient neural network algorithms due to their sparse spike-based operations. Toward this, we propose MTSpark, a novel methodology to enable multi-task RL using spiking networks. Specifically, MTSpark develops a Deep Spiking Q-Network (DSQN) with active dendrites and a dueling structure by leveraging task-specific context signals. Specifically, each neuron computes task-dependent activations that dynamically modulate inputs, forming specialized sub-networks for each task. Moreover, this bio-plausible network model also benefits from SNNs, enhancing energy efficiency and making the model suitable for hardware implementation. Experimental results show that our MTSpark effectively learns multiple tasks with higher performance compared to the state-of-the-art. Specifically, MTSpark successfully achieves high scores in three Atari games (i.e., Pong: -5.4, Breakout: 0.6, and Enduro: 371.2), reaching human-level performance (i.e., Pong: -3, Breakout: 31, and Enduro: 368), where state-of-the-art methods struggle to achieve. In addition, our MTSpark also shows better accuracy in image classification tasks than the state-of-the-art. These results highlight the potential of our MTSpark methodology to develop generalist agents that can learn multiple tasks by leveraging both RL and SNN concepts.
Brief Review
This paper introduces MTSpark, a new method for the “catastrophic forgetting” problem common in multi-task learning. It builds on Spiking Neural Networks (SNNs), with active dendrites at its core, and adopts a dueling structure to improve the SNN's learning efficiency. To validate the approach, the authors run experiments on several Atari games and image-classification tasks, obtaining encouraging results. Overall, by combining active dendrites with a dueling structure, MTSpark offers an effective solution for multi-task learning and seems likely to see wide use in future research.
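The two ingredients the review highlights, task-conditioned active dendrites and a dueling head, can be sketched with ordinary (non-spiking) activations for clarity. This toy omits the spiking dynamics entirely and invents all dimensions and weights; it only illustrates how a task-context signal can gate hidden units into task-specific sub-networks feeding a dueling Q-head.

```python
import numpy as np

rng = np.random.default_rng(1)
IN, HID, ACT, N_TASKS = 6, 10, 4, 3   # toy sizes (assumptions)

W = rng.normal(scale=0.3, size=(IN, HID))
# One dendritic segment per (hidden unit, task): its response to the task
# context gates that unit, carving out a task-specific sub-network.
dendrites = rng.normal(size=(HID, N_TASKS))
w_val = rng.normal(scale=0.3, size=(HID, 1))    # dueling value stream V(s)
w_adv = rng.normal(scale=0.3, size=(HID, ACT))  # dueling advantage stream A(s,a)

def dueling_q(x: np.ndarray, task_id: int) -> np.ndarray:
    context = np.eye(N_TASKS)[task_id]                    # one-hot task signal
    gate = 1.0 / (1.0 + np.exp(-(dendrites @ context)))   # sigmoid dendrite gate
    h = np.maximum(x @ W, 0.0) * gate                     # task-modulated units
    v, a = h @ w_val, h @ w_adv
    return (v + a - a.mean(axis=-1, keepdims=True)).ravel()  # Q = V + A - mean(A)

x = rng.normal(size=IN)
q_task0, q_task1 = dueling_q(x, 0), dueling_q(x, 1)  # same input, different task
```

The same input produces different Q-values under different task contexts, which is the mechanism that lets one network serve several tasks without overwriting earlier ones.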
4.Mobile Cell-Free Massive MIMO with Multi-Agent Reinforcement Learning: A Scalable Framework
Authors: Ziheng Liu, Jiayi Zhang, Yiyang Zhu, Enyu Shi, Bo Ai
https://arxiv.org/abs/2412.02581
Abstract
Cell-free massive multiple-input multiple-output (mMIMO) offers significant advantages in mobility scenarios, mainly due to the elimination of cell boundaries and strong macro diversity. In this paper, we examine the downlink performance of cell-free mMIMO systems equipped with mobile-APs utilizing the concept of unmanned aerial vehicles, where mobility and power control are jointly considered to effectively enhance coverage and suppress interference. However, the high computational complexity, poor collaboration, limited scalability, and uneven reward distribution of conventional optimization schemes lead to serious performance degradation and instability. These factors complicate the provision of consistent and high-quality service across all user equipments in downlink cell-free mMIMO systems. Consequently, we propose a novel scalable framework enhanced by multi-agent reinforcement learning (MARL) to tackle these challenges. The established framework incorporates a graph neural network (GNN)-aided communication mechanism to facilitate effective collaboration among agents, a permutation architecture to improve scalability, and a directional decoupling architecture to accurately distinguish contributions. In the numerical results, we present comparisons of different optimization schemes and network architectures, which reveal that the proposed scheme can effectively enhance system performance compared to conventional schemes due to the adoption of advanced technologies. In particular, appropriately compressing the observation space of agents is beneficial for achieving a better balance between performance and convergence.
Brief Review
This paper proposes a cell-free massive MIMO system that integrates mobile access points (mobile-APs) with multi-agent reinforcement learning (MARL) to enhance downlink performance. By introducing a graph neural network (GNN)-aided communication mechanism and a permutation architecture, the framework addresses coverage, interference suppression, and computational complexity. The study also explores using MARL to jointly optimize the mobility and power control of the mobile-APs, an approach that promises significant downlink gains for massive MIMO systems. The authors evaluate the proposed framework comprehensively against multiple baselines, demonstrating its potential performance improvements.
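The GNN-aided communication mechanism rests on permutation equivariance: the same weights serve any number and ordering of agents, which is what makes the framework scalable. A minimal mean-aggregation message-passing round, with invented shapes and no claim to match the paper's actual architecture, looks like this:

```python
import numpy as np

def gnn_round(obs: np.ndarray, adj: np.ndarray,
              W_self: np.ndarray, W_msg: np.ndarray) -> np.ndarray:
    """One message-passing round: each agent averages its neighbours'
    features and mixes them with its own. Mean aggregation makes the
    layer permutation-equivariant and agent-count-agnostic."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)  # avoid divide-by-zero
    msgs = (adj @ obs) / deg                          # mean over neighbours
    return np.tanh(obs @ W_self + msgs @ W_msg)

rng = np.random.default_rng(2)
obs = rng.normal(size=(3, 2))                          # 3 agents, 2 features each
adj = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
W_self, W_msg = rng.normal(size=(2, 4)), rng.normal(size=(2, 4))
out = gnn_round(obs, adj, W_self, W_msg)
```

Relabelling the agents permutes the outputs identically, so trained weights transfer directly to a differently sized or reordered team.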
5.Out-of-Distribution Detection for Neurosymbolic Autonomous Cyber Agents
Authors: Ankita Samaddar, Nicholas Potteiger, Xenofon Koutsoukos
https://arxiv.org/abs/2412.02875
Abstract
Autonomous agents for cyber applications take advantage of modern defense techniques by adopting intelligent agents with conventional and learning-enabled components. These intelligent agents are trained via reinforcement learning (RL) algorithms and can learn, adapt to, reason about, and deploy security rules to defend networked computer systems while maintaining critical operational workflows. However, the knowledge available during training about the state of the operational network and its environment may be limited. The agents should be trustworthy so that they can reliably detect situations they cannot handle and hand them over to cyber experts. In this work, we develop an out-of-distribution (OOD) Monitoring algorithm that uses a Probabilistic Neural Network (PNN) to detect anomalous or OOD situations of RL-based agents with discrete states and discrete actions. To demonstrate the effectiveness of the proposed approach, we integrate the OOD monitoring algorithm with a neurosymbolic autonomous cyber agent that uses behavior trees with learning-enabled components. We evaluate the proposed approach in a simulated cyber environment under different adversarial strategies. Experimental results over a large number of episodes illustrate the overall efficiency of our proposed approach.
Brief Review
This paper presents a method for improving the reliability of autonomous cyber-defense systems. It uses a Probabilistic Neural Network (PNN) to perform out-of-distribution (OOD) detection for automated cyber agents, aiming to raise their safety and trustworthiness by flagging anomalous situations in complex environments. The proposed strategy combines OOD monitoring with behavior trees and is evaluated in a simulated cyber environment under different adversarial strategies. Experimental results show good performance in simulation, offering a useful reference for ensuring safety and reliability. In short, the paper proposes an innovative approach to a key problem in automated cyber defense, with high practical value.
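A PNN-style OOD monitor boils down to class-conditional kernel density estimates plus a rejection threshold: if no training class explains the current observation well, hand control to a human expert. The sketch below is a generic Parzen-window PNN, not the paper's algorithm; the bandwidth, threshold, and data are illustrative assumptions.

```python
import numpy as np

def pnn_scores(x, train_X, train_y, sigma=0.5):
    """PNN pattern/summation layers: average Gaussian-kernel density of x
    under each class seen during training."""
    scores = {}
    for c in np.unique(train_y):
        Xc = train_X[train_y == c]
        d2 = ((Xc - x) ** 2).sum(axis=1)                  # squared distances
        scores[c] = np.exp(-d2 / (2 * sigma ** 2)).mean()
    return scores

def is_ood(x, train_X, train_y, threshold=1e-3):
    """Flag x as out-of-distribution when no class explains it well."""
    return max(pnn_scores(x, train_X, train_y).values()) < threshold

# Two tight training clusters, one per class (illustrative data).
train_X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.2, 4.9]])
train_y = np.array([0, 0, 1, 1])
```

A point near either cluster is accepted; a far-away point scores near zero under every class and is flagged, which is the moment a monitored agent would defer to a cyber expert.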
We welcome your valuable suggestions in the comments! These include, but are not limited to:
Pointing out shortcomings in this post's paper reviews! Sharing recent papers you find even more worth recommending, along with your reasons!
END