Does DeepSeek R1 Overthink Too? Performance Drops with Overthinking, and Thinking Less Can Cut Compute Costs by 43%

Mary-Kate Olsen

Released: 2025-03-12 14:06:01

Large language models (LLMs) can also "overthink" when performing tasks, which makes them inefficient or even causes them to fail. Researchers from UC Berkeley, UIUC, ETH Zurich, CMU, and other institutions recently studied this phenomenon in depth and published a paper titled "The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks" (paper link: //m.sbmmt.com/link/d12e9ce9949f610ac6075ea1edbade93).

The researchers found that in real-time interactive environments, LLMs are often torn between "acting directly" and "planning carefully." When a model overthinks, it spends a great deal of time building elaborate action plans that it then struggles to carry out, so it ends up expending twice the effort for half the result.

To understand the issue in depth, the research team used real-world software engineering tasks as the experimental framework and tested a range of LLMs, including o1, DeepSeek R1, and Qwen2.5. They built a controlled environment in which each LLM has to balance information gathering, reasoning, and action while maintaining context throughout.

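The researchers identify three patterns of overthinking: Analysis Paralysis (planning at length instead of acting), Rogue Actions (issuing several actions at once without waiting for feedback from the environment), and Premature Disengagement (ending the task based on the model's own predictions rather than on feedback from the environment). They developed an LLM-based evaluation framework, used it to quantitatively analyze 4,018 model trajectories, and released an open-source dataset to support further research.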
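
As a rough illustration of how labeled trajectories might be tallied by pattern, here is a minimal Python sketch. The `OverthinkingPattern` enum, the `tally_patterns` helper, and the sample labels are all hypothetical; the article does not describe the actual data format of the framework, and this sketch simply assumes at most one pattern label per trajectory.

```python
from collections import Counter
from enum import Enum


class OverthinkingPattern(Enum):
    """The three overthinking patterns named in the article."""
    ANALYSIS_PARALYSIS = "analysis paralysis"
    ROGUE_ACTIONS = "rogue actions"
    PREMATURE_DISENGAGEMENT = "premature disengagement"


def tally_patterns(labels):
    """Count occurrences of each pattern.

    `labels` holds one entry per trajectory: an OverthinkingPattern,
    or None when no overthinking was flagged (an assumption of this sketch).
    """
    counts = Counter(label for label in labels if label is not None)
    # Report every pattern, including those that never occurred.
    return {pattern: counts.get(pattern, 0) for pattern in OverthinkingPattern}


# Hypothetical labels for six trajectories (made up, not from the paper).
labels = [
    OverthinkingPattern.ANALYSIS_PARALYSIS,
    None,
    OverthinkingPattern.ROGUE_ACTIONS,
    OverthinkingPattern.ANALYSIS_PARALYSIS,
    OverthinkingPattern.PREMATURE_DISENGAGEMENT,
    None,
]

summary = tally_patterns(labels)
for pattern, count in summary.items():
    print(f"{pattern.value}: {count}")
```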

The results show that overthinking is significantly negatively correlated with the problem-solving rate. Reasoning models overthink almost three times as much as non-reasoning models and are therefore more susceptible to the problem.
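
To make "negatively correlated" concrete, the following Python sketch computes a Pearson correlation coefficient between overthinking scores and solve rates. The numbers are invented purely for illustration and are not taken from the paper, and the article does not say which correlation statistic the authors used, so Pearson is an assumption here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-model data (made up): a higher overthinking score
# paired with a lower fraction of solved tasks.
overthinking_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
solve_rates = [0.50, 0.42, 0.35, 0.30, 0.20]

r = pearson(overthinking_scores, solve_rates)
print(f"r = {r:.3f}")  # a negative r means more overthinking, fewer solved tasks
```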

To mitigate overthinking, the researchers propose two approaches, native function calling and selective reinforcement learning, and report promising results. A simpler measure also helps: selecting the solution with the lowest overthinking score from runs of a model at low reasoning effort greatly reduces computational costs (by 43%) while maintaining a high task completion rate.
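
The idea of trading one expensive run for several cheaper ones can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Run` record, its field names, the helper functions, and all costs and scores are hypothetical, and the example numbers are not the paper's.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One attempt at a task (hypothetical record layout)."""
    solution: str
    overthinking_score: float  # assumed: higher means more overthinking
    cost: float                # compute cost in arbitrary units


def pick_least_overthinking(runs):
    """Return the run whose overthinking score is lowest."""
    if not runs:
        raise ValueError("no runs to choose from")
    return min(runs, key=lambda run: run.overthinking_score)


def cost_reduction(baseline_cost, new_cost):
    """Fraction of the baseline cost that is saved."""
    if baseline_cost <= 0:
        raise ValueError("baseline cost must be positive")
    return (baseline_cost - new_cost) / baseline_cost


# Two cheap low-effort runs versus one expensive high-effort run (made-up values).
low_effort_runs = [
    Run(solution="patch_a", overthinking_score=6.0, cost=2.0),
    Run(solution="patch_b", overthinking_score=2.0, cost=2.0),
]
high_effort_cost = 8.0

best = pick_least_overthinking(low_effort_runs)
total_low_cost = sum(run.cost for run in low_effort_runs)
saved = cost_reduction(high_effort_cost, total_low_cost)
print(f"chosen: {best.solution}, cost saved: {saved:.0%}")
```

The selection step itself costs nothing extra beyond scoring each run, which is why the total is dominated by how many low-effort runs are sampled.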

The study also found a negative correlation between model size and overthinking: smaller models are more prone to overthinking. In addition, increasing the number of reasoning tokens effectively suppresses overthinking, while the size of the context window has no significant effect.

The study offers valuable insight into understanding and addressing overthinking in LLMs, which can help improve their efficiency and reliability in practical applications.
