A Fascinating New Paper: Multi-Agent Architecture Search via Agentic Supernet
An Overview
Introduction
Multi-agent systems powered by Large Language Models (LLMs) have emerged as an approach to enhance the capabilities of individual agents through collaborative and interactive mechanisms. However, the construction of such systems often involves intricate manual design processes, including prompt engineering, agent profiling, and inter-agent communication pipeline design. Recent research has focused on automating these design aspects, but existing methods typically aim to find a single, static multi-agent architecture that may not be optimal for all tasks or queries. This paper introduces a novel approach called Multi-agent Architecture Search (MaAS), which shifts the paradigm from searching for a single optimal solution to optimizing a distribution of multi-agent architectures, termed the "agentic supernet".
The Agentic Supernet
The agentic supernet is a probabilistic and continuous distribution of agentic architectures, modeled as a cascaded multi-layer workflow. Each layer consists of multiple agentic operators, such as Chain-of-Thought (CoT), Multi-agent Debate, and ReAct, along with parameterized probability distributions governing the selection of operators at each layer. This probabilistic representation allows for dynamic allocation of inference resources based on the specific characteristics of each query, addressing the limitations of static multi-agent systems.
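To make the layered structure concrete, here is a minimal sketch of a supernet as a list of layers, each holding candidate operators and a probability distribution over them. The names `Layer` and `sample_workflow` are hypothetical, and the sketch simplifies the description above in two ways: it draws exactly one operator per layer, and the probabilities are fixed rather than conditioned on the query.

```python
import random
from dataclasses import dataclass


@dataclass
class Layer:
    """One supernet layer: candidate operators plus selection probabilities."""
    operators: list[str]
    probs: list[float]  # one probability per operator, expected to sum to 1

    def sample(self, rng: random.Random) -> str:
        # Draw a single operator according to this layer's distribution.
        return rng.choices(self.operators, weights=self.probs, k=1)[0]


def sample_workflow(layers: list[Layer], rng: random.Random) -> list[str]:
    """Sample one operator per layer, giving one concrete cascaded workflow."""
    return [layer.sample(rng) for layer in layers]


# Illustrative values only; these probabilities are not taken from the paper.
supernet = [
    Layer(["CoT", "Debate", "ReAct"], [0.6, 0.2, 0.2]),
    Layer(["CoT", "Debate", "ReAct"], [0.3, 0.4, 0.3]),
]
workflow = sample_workflow(supernet, random.Random(0))
print(workflow)
```

Each call to `sample_workflow` yields one architecture from the distribution, which is the sense in which the supernet represents many multi-agent systems at once.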
Multi-Agent Architecture Search (MaAS)
MaAS is an automated framework that leverages the agentic supernet to sample query-dependent multi-agent systems. During the training phase, a controller network samples architectures from the supernet based on input queries. The distribution parameters and operators are then jointly updated based on feedback from the environment, with gradients approximated using Monte Carlo sampling and textual gradient estimation, respectively. During inference, MaAS samples a suitable multi-agent system for each query, delivering high-quality solutions while optimizing resource allocation.
Methodology
The MaAS framework operates as follows:
1. Input: MaAS takes diverse queries as input, each potentially varying in difficulty and domain.
2. Controller Network: A controller network samples a subnetwork from the agentic supernet for each query, effectively tailoring a customized multi-agent system.
3. Execution and Feedback: The sampled multi-agent system executes the query, and MaAS receives feedback from the environment based on the system's performance.
4. Supernet Optimization: The feedback is used to jointly optimize the parameterized distribution of the supernet and the agentic operators themselves, ensuring that the supernet evolves to generate more effective and efficient multi-agent systems over time.
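The four stages above can be sketched as a simple loop. This is only an outline of the control flow under stated assumptions: `run_search` and the three `toy_*` callables are hypothetical stand-ins for the controller, the execution environment, and the optimizer, and none of them reflects the paper's actual implementation.

```python
from typing import Callable, List, Tuple

System = List[str]  # a sampled multi-agent system, here just operator names


def run_search(
    queries: List[str],
    sample: Callable[[str], System],
    execute: Callable[[System, str], float],
    update: Callable[[str, System, float], None],
) -> List[Tuple[str, System, float]]:
    """Run the sample -> execute -> update cycle once per query."""
    history = []
    for query in queries:                  # Input
        system = sample(query)             # Controller Network
        feedback = execute(system, query)  # Execution and Feedback
        update(query, system, feedback)    # Supernet Optimization
        history.append((query, system, feedback))
    return history


# Toy stand-ins (assumptions): longer queries get a larger system, and the
# feedback score simply penalizes system size.
def toy_sample(query: str) -> System:
    return ["CoT"] if len(query) < 20 else ["CoT", "Debate"]


def toy_execute(system: System, query: str) -> float:
    return 1.0 - 0.1 * len(system)


updates: List[Tuple[System, float]] = []


def toy_update(query: str, system: System, feedback: float) -> None:
    updates.append((system, feedback))  # a real optimizer would adjust parameters here


history = run_search(
    ["2 + 2 = ?", "Prove that the sum of two odd numbers is even."],
    toy_sample, toy_execute, toy_update,
)
```

Keeping the three stages as separate callables mirrors the description: the controller, the environment, and the optimizer can each be replaced without touching the loop.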
Search Space and Optimization Objective
The search space of MaAS is defined by the set of available agentic operators, which are composite LLM-agent invocation processes involving multiple LLM calls and tool usage. The optimization objective is to maximize the expected utility of the sampled multi-agent systems while minimizing their cost, with the cost typically measured in terms of token usage.
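One way to write this objective down is shown here. The notation is an assumption made for this overview rather than a quotation from the paper: \pi denotes the distribution parameters, \mathbb{O} the operator set, \mathcal{D} the dataset of query-answer pairs (q, a), \mathcal{G} a sampled multi-agent system, U its utility, C its cost (for example, token usage), and \lambda a trade-off weight.

```latex
\max_{\pi,\, \mathbb{O}} \;
\mathbb{E}_{(q, a) \sim \mathcal{D},\; \mathcal{G} \sim p_{\pi}(\cdot \mid q)}
\big[\, U(\mathcal{G};\, q, a) \;-\; \lambda \, C(\mathcal{G};\, q) \,\big]
```

A larger \lambda pushes the search toward cheaper systems, while \lambda = 0 recovers pure utility maximization.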
Agentic Architecture Sampling
The controller network within MaAS employs a Mixture-of-Experts (MoE)-style architecture to sample operators at each layer of the supernet. The sampling process is query-dependent, with operators selected sequentially based on their activation scores until a cumulative score threshold is reached. This dynamic selection mechanism enables MaAS to allocate resources adaptively based on the complexity of the query.
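The threshold rule described above can be illustrated with a short sketch. It assumes that activation scores are normalized with a softmax and that operators are taken greedily in descending score order; the paper's controller may sample stochastically instead. The function name `select_operators` and the score values are hypothetical, and in a real system the scores would be produced by the controller from the query.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Numerically stable softmax over raw activation scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_operators(scores: dict[str, float], threshold: float) -> list[str]:
    """Add operators in descending score order until the cumulative
    normalized score reaches the threshold."""
    names = list(scores)
    probs = softmax([scores[n] for n in names])
    ranked = sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)
    chosen: list[str] = []
    cumulative = 0.0
    for name, p in ranked:
        chosen.append(name)
        cumulative += p
        if cumulative >= threshold:
            break
    return chosen


# A query where one operator dominates: a single operator clears the threshold.
easy = select_operators({"CoT": 4.0, "Debate": 1.0, "ReAct": 0.5}, threshold=0.5)
# A query with near-uniform scores: more operators are needed to reach it.
hard = select_operators({"CoT": 1.0, "Debate": 1.2, "ReAct": 0.9}, threshold=0.5)
print(easy, hard)
```

When the controller is confident about one operator, the layer stays small; when scores are spread out, more operators are activated, which is how the number of operators per layer can vary with the query.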
Cost-Constrained Supernet Optimization
MaAS optimizes the supernet under cost constraints, aiming to find multi-agent architectures that achieve high performance with minimal token cost. The gradient with respect to the distribution parameters is estimated using an empirical Bayes Monte Carlo procedure. For the operators, which involve non-differentiable components like natural language prompts, MaAS utilizes agent-based textual gradients to approximate the backpropagation process.
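To give a feel for estimating a gradient from samples when the reward itself is not differentiable, here is a small sketch. It uses a generic score-function (REINFORCE-style) Monte Carlo estimator on a single layer, swapped in for illustration; it is not the paper's empirical Bayes procedure, and it does not cover textual gradients for prompts. The utilities, costs, and `LAMBDA` are made-up numbers.

```python
import math
import random


def softmax(logits: list[float]) -> list[float]:
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mc_gradient(logits, reward_fn, num_samples, rng):
    """Monte Carlo estimate of d E[reward] / d logits for a categorical
    distribution, using (reward - baseline) * grad log p(sample)."""
    probs = softmax(logits)
    n = len(logits)
    samples = rng.choices(range(n), weights=probs, k=num_samples)
    rewards = [reward_fn(i) for i in samples]
    # Mean-reward baseline reduces variance (at the price of a small bias).
    baseline = sum(rewards) / num_samples
    grad = [0.0] * n
    for i, r in zip(samples, rewards):
        advantage = r - baseline
        for j in range(n):
            indicator = 1.0 if j == i else 0.0
            # d log p(i) / d logit_j for a softmax is 1[j == i] - p_j.
            grad[j] += advantage * (indicator - probs[j])
    return [g / num_samples for g in grad]


# Hypothetical per-operator utility and token cost (in thousands of tokens).
OPERATORS = ["CoT", "Debate", "ReAct"]
UTILITY = [0.60, 0.90, 0.80]
COST = [1.0, 8.0, 2.0]
LAMBDA = 0.05  # cost trade-off weight (assumption)


def reward(i: int) -> float:
    return UTILITY[i] - LAMBDA * COST[i]  # utility minus weighted cost


rng = random.Random(0)
logits = [0.0, 0.0, 0.0]
for _ in range(300):
    grad = mc_gradient(logits, reward, num_samples=32, rng=rng)
    logits = [x + 1.0 * g for x, g in zip(logits, grad)]  # gradient ascent step

best = OPERATORS[max(range(len(logits)), key=lambda i: logits[i])]
```

In this toy setup the most accurate operator is also the most expensive, so the cost penalty shifts probability mass toward the operator with the best utility-cost balance rather than the one with the highest raw utility.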
Experimental Evaluation
MaAS was evaluated on six public benchmarks covering math reasoning, code generation, and tool use. The results show that MaAS outperforms existing handcrafted and automated multi-agent systems in both task performance and resource efficiency. Case studies further illustrate its adaptive resource allocation: the framework adjusts the complexity of the sampled multi-agent system to the difficulty of the query.
Conclusion
By optimizing a distribution of architectures, MaAS enables dynamic resource allocation and adaptation to diverse tasks and queries. This research opens up new possibilities for the development of self-organizing and self-evolving collective intelligence systems.


