Quantity is power! Tencent reveals: The greater the number of agents, the better the effect of the large language model

PHPz

Released: 2024-02-29 15:55:20

Tencent's research team studied how LLM agents scale. They found that, with simple sampling and voting, the performance of large language models (LLMs) increases with the number of instantiated agents. The study is the first to verify that this phenomenon holds across a variety of scenarios. It also compares the approach with more complex methods, explores the reasons behind the phenomenon, and proposes ways to strengthen the scaling effect.

Paper title: More Agents Is All You Need
Paper address: https://arxiv.org/abs/2402.05120
Code address: https://github.com/MoreAgentsIsAllYouNeed/More-Agents-Is-All-You-Need

In this paper, researchers from Tencent found that, with a simple sampling-and-voting method, the performance of large language models increases as the number of instantiated agents increases. This scaling property (scalability) appears without any complex multi-LLM-agent collaboration framework or prompt engineering method. Furthermore, the method is orthogonal to existing sophisticated methods and, when combined with them, can further enhance LLMs to a degree that correlates with task difficulty. The paper presents the first study of the scaling property of raw agents (LLM agents that do not rely on complex prompt engineering or collaboration frameworks). It runs comprehensive experiments on various LLM benchmarks to verify the generality of this finding and examines strategies that can facilitate it. The code is now open source.

## Multiple small models surpass a large model

The paper discusses in detail a range of related work on LLM ensembles, including LLM self-ensemble, heterogeneous LLM ensemble, and multi-LLM-agent collaboration frameworks. Comparing these with the proposed method shows that the paper's study and analysis are more comprehensive.

To study how the performance of large language models improves as the number of instantiated agents increases, the paper uses a simple sampling-and-voting method (the authors write "simple(st)", suggesting they regard it as possibly one of the simplest methods available). Notably, this method can be combined orthogonally with existing complex methods. It has two stages:

1. Feed the task query into a single LLM or a multi-LLM-agent collaboration framework to generate multiple outputs.
2. Determine the final result by majority voting.
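The two stages can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released code: the function names are hypothetical, `make_noisy_agent` is a made-up stand-in for a real LLM call, and exact string matching is assumed to be enough to decide which outputs agree.

```python
import random
from collections import Counter
from typing import Callable, List


def sample_outputs(query: str, generate: Callable[[str], str], n_agents: int) -> List[str]:
    """Stage 1: call the same generator n_agents times and collect the outputs."""
    return [generate(query) for _ in range(n_agents)]


def majority_vote(outputs: List[str]) -> str:
    """Stage 2: return the most frequent output (the first one seen wins a tie)."""
    counts = Counter(o.strip() for o in outputs)
    return counts.most_common(1)[0][0]


def sample_and_vote(query: str, generate: Callable[[str], str], n_agents: int) -> str:
    """Run both stages: sample n_agents outputs, then take the majority."""
    return majority_vote(sample_outputs(query, generate, n_agents))


def make_noisy_agent(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: answers "42" about 60% of the time."""
    rng = random.Random(seed)

    def generate(query: str) -> str:
        if rng.random() < 0.6:
            return "42"
        return rng.choice(["41", "43", "44"])

    return generate


if __name__ == "__main__":
    agent = make_noisy_agent(seed=0)
    print(sample_and_vote("What is 6 * 7?", agent, n_agents=15))
```

Because `generate` is just a callable, the same wrapper could in principle sit on top of a single model or a more elaborate pipeline, which is one way to read the article's point that the method is orthogonal to other approaches.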

The paper selects language models of different scales from the Llama2 and GPT series and evaluates them on task datasets covering multiple domains, such as reasoning and generation. The experiments show that, across all tasks and across LLMs of different types and sizes, performance increases with the number of instantiated agents.
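One way to build intuition for why voting over more samples can help is an idealized binomial model. This is not the paper's analysis: it assumes a task with only two possible answers, samples that are independent of each other, and a fixed per-sample accuracy `p`. The function name `majority_accuracy` is hypothetical.

```python
from math import comb


def majority_accuracy(p: float, n_agents: int) -> float:
    """Probability that more than half of n_agents independent samples are correct.

    Assumes a two-answer task where each sample is correct with probability p,
    and an odd n_agents so that a tie cannot occur.
    """
    if n_agents % 2 == 0:
        raise ValueError("use an odd number of agents to avoid ties")
    needed = n_agents // 2 + 1  # smallest number of correct samples that forms a majority
    return sum(
        comb(n_agents, k) * p**k * (1 - p) ** (n_agents - k)
        for k in range(needed, n_agents + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5, 11, 21):
        print(n, round(majority_accuracy(0.6, n), 3))
```

Under these assumptions the voted accuracy rises with the number of agents only when `p` is above 0.5. Real LLM samples are correlated and tasks usually have more than two answers, so the numbers from this model are illustrative only.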

For example, the improvement is 12% to 24% on the GSM8K task and 6% to 10% on the MATH task. Interestingly, ensembles of multiple small LLMs can match or even exceed the performance of larger LLMs.

In one case, an ensemble of multiple Llama2-13B agents reached 59% accuracy on GSM8K, exceeding the 54% accuracy of a single Llama2-70B.

Further, the authors explored the method's compatibility with other methods. Although these methods are implemented differently, combining sampling and voting with them improves performance further, and the gains again grow as more agents are instantiated. The experiments show gains ranging from 1% to 27%, indicating that this simple method can further enhance LLM performance when used orthogonally with other methods.

(Figure panels: based on Llama2-13B, based on Llama2-70B, based on GPT-3.5-Turbo.)

In addition, the paper analyzes the relationship between performance improvement and problem difficulty.

(Figure: nodes represent steps and dashed lines represent possible alternative steps. Node depth indicates the number of steps, and color intensity indicates the level of inherent difficulty.) The illustration shows how task complexity is measured along these dimensions.

Based on this analysis, the paper proposes two optimization strategies to further improve the effectiveness of the method.
