Agents divide work and collaborate like people, and can also exchange information through 'group chat'

王林

Released: 2024-02-04 14:36:30

Intelligent agents need a "specification manual" too!

A study called MetaGPT greatly improves agent performance by clearly dividing agents' roles and requiring multiple agents to use a unified, standardized "communication format" when they collaborate.

The project currently has 33.6k stars on GitHub, and the paper was accepted as an Oral at ICLR 2024, a top deep learning conference.

In general, MetaGPT imitates the human division of labor and collaboration: it encodes the standard operating procedures of various tasks into a "specification manual" for the agents, and agents in different roles are responsible for different professional tasks.

For example, the product manager role can use web search tools, while the engineer role can execute code.

In this way, multiple agents collaborate to complete the task.

The researchers set up a "message sharing group" for the agents, and the agents can freely view relevant messages sent by other agents.

In tests, MetaGPT set new state-of-the-art (SOTA) results on the public datasets HumanEval and MBPP for code completion tasks, reaching 85.9% and 87.7% respectively.

The work is now open source and has drawn wide attention online.

What does MetaGPT look like?

This research was jointly proposed by the DeepWisdom team and scholars from KAUST AI Center, Xiamen University, CUHK(SZ), Nanjing University, UPenn, UCB and many other universities and institutions.

As the capabilities of large models continue to improve, there is growing interest in academia and industry in using large model-based agents to solve various tasks.

It is worth noting that research on using multiple agents to collaborate to solve problems in specific fields is still in its early stages. Existing research mainly focuses on enhancing task understanding and reasoning decision-making capabilities through role-playing mechanisms and communication topology settings. Despite some progress, these methods still rely on direct dialogue forms and lack standard specifications and constraints on agent behavior.

Some recent work has also pointed out that dialogue-based multi-agent systems may suffer from inconsistent or ambiguous information, useless repetition, and infinite loops.

In contrast, standard operating procedures (SOPs) in human workflows not only clearly define the division of labor and the topology of the participating roles, but also set standards for what each role must output.

Research shows that clearly defined SOPs can improve the consistency and accuracy of task execution and ensure that the end result meets specified quality standards. Therefore, to address the challenges of multi-agent collaboration, the researchers designed MetaGPT, a meta-programming framework for large model-based agents.

MetaGPT requires agents to participate in collaboration as experts and generate structured output as required, such as high-quality requirements documents, architectural design diagrams, and flow charts.

For a single agent, the structured output acts as a higher-level chain of thought (Chain-of-Thought); for downstream roles, it is a context (Context) with clear semantics and clear goals.

Within the MetaGPT framework, the researchers map the concept of SOPs onto three designs: role specialization, communication protocol design, and iterative executable feedback.

Role Specialization

Through clearly defined roles, complex work can be broken down into smaller, more specific tasks.

As shown in the figure below, different professional roles are initialized with different goals and constraints, as well as different professional skills. For example, the product manager role can use web search tools, while the engineer role can execute code. In addition, each role follows the ReAct behavior pattern by default.

[Figure: professional roles with their goals, constraints, and skills]

Role specialization enables each agent to focus on specific tasks within its domain, thereby improving the output quality of large models.

In software development, passing work from role to role lets this division of labor bridge the gap from natural language to programming language more effectively. The role ablation experiment in the paper provides further evidence for the effect of this design.
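As a rough illustration of role specialization, the sketch below gives each role its own goal, constraints, and set of permitted skills. All names here (`Role`, `fake_web_search`, `fake_run_code`) are hypothetical and are not MetaGPT's actual API; the tools are simple stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


# Hypothetical sketch; these names are illustrative, not MetaGPT's real API.
@dataclass
class Role:
    name: str
    goal: str
    constraints: str
    # Maps a skill name to the callable this role is allowed to use.
    skills: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def can(self, skill: str) -> bool:
        return skill in self.skills

    def use(self, skill: str, arg: str) -> str:
        if not self.can(skill):
            raise PermissionError(f"{self.name} has no skill {skill!r}")
        return self.skills[skill](arg)


def fake_web_search(query: str) -> str:
    # Stand-in for a real web search tool.
    return f"results for: {query}"


def fake_run_code(source: str) -> str:
    # Stand-in for a sandboxed code runner.
    return f"executed {len(source)} characters of code"


product_manager = Role(
    name="ProductManager",
    goal="Write a clear requirements document",
    constraints="Stay within the user's stated requirements",
    skills={"web_search": fake_web_search},
)
engineer = Role(
    name="Engineer",
    goal="Write working code that satisfies the design",
    constraints="Follow the agreed architecture",
    skills={"run_code": fake_run_code},
)
```

Keeping the skills on the role object means a product manager that tries to run code fails loudly instead of silently stepping outside its remit.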

Communication Protocol Design

In practice, natural language is semantically rich but unstructured, so information is often distorted, or important content is lost, as messages are passed along.

To solve this problem, the authors constrain agents to collaborate through structured output (including documents and charts), which improves the clarity and completeness of the information. To verify this design, they built a variety of software development tasks and used the executability of the generated code and productivity metrics to show how critical structured output is to collaboration.
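One way to picture structured output is a fixed schema that a role must fill in before its message is accepted. The sketch below is only an assumption about what such a schema could look like: the class `RequirementsDoc` and its field names are hypothetical and are not taken from the paper or the MetaGPT code base.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


# Hypothetical schema; field names are illustrative only.
@dataclass
class RequirementsDoc:
    product_goals: List[str]
    user_stories: List[str]
    requirement_pool: List[str]

    def validate(self) -> None:
        # Reject documents with a missing section instead of passing
        # incomplete information to downstream roles.
        for name, value in asdict(self).items():
            if not value:
                raise ValueError(f"section {name!r} must not be empty")

    def to_message(self) -> str:
        self.validate()
        return json.dumps(asdict(self), indent=2)


doc = RequirementsDoc(
    product_goals=["Let users play a tile game in the browser"],
    user_stories=["As a player, I want to slide tiles with the arrow keys"],
    requirement_pool=["P0: core tile-merging logic"],
)
message = doc.to_message()
```

Because every section is required, a downstream role can rely on the same keys being present in every requirements message.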

[Figure: shared message pool and publish-subscribe mechanism]

To improve communication efficiency during multi-agent collaboration, MetaGPT introduces a publish-subscribe mechanism (Publish-Subscribe Mechanism) based on message sharing.

As shown in the figure above, the shared message pool allows messages to be exchanged directly, and any agent can transparently access messages from other agents without asking and waiting for a response. The subscription mechanism makes the agent more inclined to receive information related to its own tasks and avoid being distracted by irrelevant details. At the same time, each agent can directly retrieve the required information from the shared message pool to form self-memory.
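The shared pool and subscription filter described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not MetaGPT's implementation: `Message`, `MessagePool`, `Agent`, and the topic strings are all hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Set


# Hypothetical sketch of a shared message pool with subscriptions.
@dataclass(frozen=True)
class Message:
    sender: str
    topic: str
    content: str


class MessagePool:
    def __init__(self) -> None:
        self._messages: List[Message] = []

    def __len__(self) -> int:
        return len(self._messages)

    def publish(self, message: Message) -> None:
        self._messages.append(message)

    def read_since(self, cursor: int, topics: Set[str]) -> List[Message]:
        # Return messages published at or after `cursor` on the given topics.
        return [m for m in self._messages[cursor:] if m.topic in topics]


class Agent:
    def __init__(self, name: str, subscriptions: Set[str], pool: MessagePool) -> None:
        self.name = name
        self.subscriptions = subscriptions
        self.pool = pool
        self.memory: List[Message] = []  # the agent's own memory
        self._cursor = 0                 # how much of the pool has been read

    def publish(self, topic: str, content: str) -> None:
        self.pool.publish(Message(self.name, topic, content))

    def observe(self) -> List[Message]:
        # Pull only new messages on subscribed topics, then remember them.
        new = self.pool.read_since(self._cursor, self.subscriptions)
        self._cursor = len(self.pool)
        self.memory.extend(new)
        return new


pool = MessagePool()
pm = Agent("ProductManager", set(), pool)
architect = Agent("Architect", {"requirements"}, pool)
engineer = Agent("Engineer", {"design"}, pool)

pm.publish("requirements", "Build a tile game")
seen_by_architect = architect.observe()
seen_by_engineer = engineer.observe()
```

Here the architect picks up the requirements message without asking the product manager for it, while the engineer, subscribed only to design messages, sees nothing yet.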

Executable feedback

An agent that optimizes itself and actively updates based on feedback from its environment is exhibiting autonomy.

For software development tasks, MetaGPT gives the engineer agent an executable feedback mechanism that automatically improves code quality.

Specifically, the engineer agent writes and runs the corresponding unit test cases, then uses the observed execution results to make decisions and prompt itself recursively, which achieves automatic debugging. This design-test-feedback iteration continues until the unit tests pass or the maximum number of retries is reached.
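The loop described above (generate, run the tests, feed the error back, stop on success or after a retry limit) can be sketched as follows. This is a simplified illustration: `fake_engineer` stands in for a large model call, the unit test is hard-coded, and none of these names come from MetaGPT itself.

```python
from typing import Callable, Optional, Tuple


def run_unit_tests(source: str) -> Optional[str]:
    """Run the candidate code against a tiny test; return an error string or None."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # a real system would sandbox this step
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def feedback_loop(
    generate: Callable[[Optional[str]], str], max_retries: int = 3
) -> Tuple[str, bool, int]:
    """Regenerate code from test feedback until tests pass or retries run out."""
    feedback: Optional[str] = None
    source = ""
    for attempt in range(1, max_retries + 1):
        source = generate(feedback)
        feedback = run_unit_tests(source)
        if feedback is None:
            return source, True, attempt
    return source, False, max_retries


def fake_engineer(feedback: Optional[str]) -> str:
    # Stand-in for a model call: the first draft is buggy, and the
    # "model" fixes it once it has seen any test feedback.
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


final_source, passed, attempts = feedback_loop(fake_engineer)
```

In this toy run the first draft fails its test, the error message is passed back as feedback, and the second draft passes.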

New SOTA on multiple benchmarks

To measure code generation ability, the researchers used two public benchmark datasets, HumanEval and MBPP, and reported the Pass@1 metric.
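For readers unfamiliar with the metric, Pass@1 is the k = 1 case of pass@k, which is commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k) for a problem with n sampled solutions of which c pass the tests. The sketch below implements that standard estimator; the article does not say how many samples per problem were drawn for MetaGPT, so the sample counts in the example are made up.

```python
from math import comb
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples, c of them correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples are incorrect,
    # so the result is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: List[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each entry is (n_samples, n_correct)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Made-up per-problem results: (samples drawn, samples that passed).
example_results = [(10, 10), (10, 5), (10, 0), (10, 3)]
score = benchmark_pass_at_k(example_results, k=1)
```

With k = 1 the estimator reduces to c / n per problem, so the benchmark score is simply the average fraction of correct samples.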

They also collected SoftwareDev, a dataset covering 70 typical software development tasks (such as mini-games, data visualization, and image processing), and used it to compare several open-source multi-agent frameworks, with statistical analysis and qualitative description of executability and productivity across the tasks.

As shown in the figure below, MetaGPT outperforms previous methods on both the HumanEval and MBPP benchmarks, reaching 85.9% and 87.7% respectively. Compared with GPT-4, MetaGPT achieves a relative improvement of 28.2% on HumanEval, and adding the executable feedback mechanism yields gains of 4.2% and 5.4% on HumanEval and MBPP respectively.

[Figure: Pass@1 results on HumanEval and MBPP]
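The relative-improvement figure can be sanity-checked with simple arithmetic. Assuming "relative improvement" means (new - baseline) / baseline, the sketch below back-solves the GPT-4 baseline implied by the two numbers quoted in the article (85.9% and 28.2%). The baseline it computes is a derived value, not one stated in the article.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


def implied_baseline(new: float, improvement_pct: float) -> float:
    """Baseline score implied by a new score and a relative improvement."""
    return new / (1.0 + improvement_pct / 100.0)


# Numbers quoted in the article: 85.9% on HumanEval, 28.2% relative gain.
baseline = implied_baseline(85.9, 28.2)
check = relative_improvement(baseline, 85.9)
```

Under that reading, the two quoted numbers imply a GPT-4 HumanEval score of roughly 67%.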

On the challenging SoftwareDev dataset, MetaGPT achieves an executability score of 3.75, very close to 4, with a shorter running time (503 seconds). The number of generated lines of code increased 2.24-fold relative to the baseline framework, while the number of tokens consumed per line of code dropped by 50%.

These results highlight the efficiency improvements brought by SOPs during multi-agent collaboration.

MetaGPT’s high executability and relatively short running time in software development tasks demonstrate its practicality and efficiency in real-world applications.

Focusing on software development, the researchers also provide a qualitative comparison of the capabilities of different agent frameworks.

They found that MetaGPT can generate files in multiple modalities, and that it is the only open-source framework among those compared that fully covers the real-world software development process.

In general, MetaGPT is a novel multi-agent framework that combines meta-programming ideas and embeds SOPs to enhance the capabilities of large models in multi-agent collaboration.

Through role specialization, workflow management, and a flexible messaging mechanism, it serves as a highly versatile and portable multi-agent framework.

Combined with the iterative feedback mechanism, MetaGPT has achieved SOTA performance on multiple benchmarks.

SOPs drawn from human social practice can inspire future research on multi-agent societies, and MetaGPT can also be seen as an early attempt to regulate large model-based multi-agent frameworks.

Paper link: https://arxiv.org/abs/2308.00352
Code link: https://github.com/geekan/MetaGPT
