Their remarkable generalization ability has made large models a ray of hope for "artificial general intelligence".
However, as the saying goes, reading ten thousand books is no substitute for traveling ten thousand miles: to truly understand complex tasks and solve practical problems in open environments, large models must "walk" into the physical world.
Recently, Professor Li Xuelong's team conducted innovative research on autonomous drone swarms in open environments. Using a domestically developed large model, they realized human-machine and machine-machine dialogue interaction in the open environment, breaking down the interaction barrier between humans and machines. The research further expands the application scenarios of vicinagearth security, allowing large-model-driven drones to take flight in real life.
Inspired by human cognitive models, the team distilled highly autonomous cognition into the triadic interaction of "thinking computing - entity control - environment perception" and built a "group chat-style" autonomous drone control framework driven by the open-source "Scholar·Puyu" large model. Each drone is equipped with an intelligent brain, so the swarm can collaborate dynamically through language communication and achieve intelligent interaction, active perception, and autonomous control in open environments and complex tasks. This greatly improves the autonomy of drone mission execution.
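As a rough illustration only, the sketch below shows what one round of such a "group chat-style" control loop could look like: each drone is a language-model-backed participant that reads a shared chat log and posts its reply for the user and the other drones to see. The Message and DroneAgent classes and the llm_reply placeholder are assumptions made for this sketch, not the team's actual "Scholar·Puyu"-based implementation.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str   # "user" or a drone id such as "uav_1"
    text: str     # natural-language content of the chat message


@dataclass
class DroneAgent:
    name: str

    def llm_reply(self, chat_log: list[Message]) -> str:
        """Placeholder for a call to a large language model that would read the
        full shared chat and return this drone's next utterance or action plan."""
        user_msgs = [m.text for m in chat_log if m.sender == "user"]
        task = user_msgs[-1] if user_msgs else "no task yet"
        return f"received task '{task}', planning my route."


def group_chat_round(agents: list[DroneAgent], chat_log: list[Message]) -> None:
    """One round of the group chat: every drone reads the shared log,
    produces a reply, and appends it so the others (and the user) can see it."""
    for agent in agents:
        reply = agent.llm_reply(chat_log)
        chat_log.append(Message(sender=agent.name, text=reply))


if __name__ == "__main__":
    chat = [Message("user", "Search the east field for a red box.")]
    drones = [DroneAgent("uav_1"), DroneAgent("uav_2")]
    group_chat_round(drones, chat)
    for m in chat:
        print(f"[{m.sender}] {m.text}")
```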
In general, the main capabilities of autonomous drone swarms include human-like conversational interaction, active environment perception, and autonomous entity control.
Figure 1 Drone group chat communication
Exploring the interaction between human users and drones, so that drones can understand user needs in complex missions, is a prerequisite for realizing autonomous drones.
To this end, the team proposed a "group chat-style" dialogue interaction method: a large model converts information such as sound, images, and the drone's own status into natural-language dialogue, enabling users and drones to converse with each other and interact in a natural, intuitive way.
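For instance, a small helper like the one below could render raw telemetry as a natural-language status line that fits into such a chat; the field names and phrasing are purely illustrative assumptions, not the actual conversion performed by the team's large model.

```python
def state_to_message(drone_id: str, battery_pct: float,
                     position: tuple[float, float, float],
                     detected: list[str]) -> str:
    """Render drone telemetry as a short natural-language status line
    that can be posted into the group chat (illustrative only)."""
    x, y, z = position
    objects = ", ".join(detected) if detected else "nothing of interest"
    return (f"{drone_id}: at ({x:.1f}, {y:.1f}, {z:.1f}) m, "
            f"battery {battery_pct:.0f}%, currently seeing {objects}.")


print(state_to_message("uav_1", 72.0, (12.0, -3.5, 20.0), ["red box"]))
# uav_1: at (12.0, -3.5, 20.0) m, battery 72%, currently seeing red box.
```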
To improve the stability and safety of complex tasks, the team designed an efficient real-time feedback mechanism. It enables the drone to report its status through dialogue and to seek user confirmation at key nodes of mission execution, which also greatly improves the efficiency of task execution.
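A minimal sketch of such a report-and-confirm checkpoint is shown below; the function name and the console-based confirmation are assumptions for illustration, standing in for whatever confirmation channel the real system uses.

```python
def confirm_checkpoint(drone_id: str, node: str, status: str) -> bool:
    """Report status at a key mission node and wait for the user's go-ahead."""
    print(f"[{drone_id}] Reached key node '{node}'. Status: {status}")
    answer = input(f"[{drone_id}] Proceed with the next phase? (y/n) ")
    return answer.strip().lower().startswith("y")


if confirm_checkpoint("uav_1", "target located", "hovering 5 m above the red box"):
    print("[uav_1] Confirmation received, starting descent.")
else:
    print("[uav_1] Holding position and awaiting new instructions.")
```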
Figure 2 Actively discover and approach the target
Figure 3 Dynamic environment obstacle avoidance
During flight, the drone actively senses the external environment and adjusts the mission plan in real time, which is a key step in completing complex tasks.
To deal with this, the team developed a task-guided active perception mechanism and proposed algorithms for multi-sensor-fusion low-altitude search, dynamic obstacle avoidance, and visual positioning.
During actual mission execution, the drone's flight path and observation pose are adjusted dynamically based on the perceived information and the mission goals. By observing its surroundings from different angles and positions, the drone gradually reduces environmental uncertainty and achieves efficient information collection and task execution.
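The toy example below captures the spirit of this idea under strong simplifying assumptions: the target's location is modelled as probability mass over a coarse grid, and the drone greedily picks the reachable viewpoint expected to resolve the most uncertainty. It is not the team's multi-sensor fusion, obstacle-avoidance, or visual-positioning algorithm.

```python
import math


def expected_uncertainty_reduction(viewpoint: tuple[float, float],
                                   belief: dict[tuple[int, int], float]) -> float:
    """Score a candidate viewpoint by the probability mass (possible target
    locations) that falls within the drone's sensing range from that point."""
    vx, vy = viewpoint
    sensing_range = 10.0
    return sum(p for (cx, cy), p in belief.items()
               if math.hypot(cx - vx, cy - vy) <= sensing_range)


def choose_next_view(candidates: list[tuple[float, float]],
                     belief: dict[tuple[int, int], float]) -> tuple[float, float]:
    """Greedy next-best-view selection over the set of reachable viewpoints."""
    return max(candidates, key=lambda v: expected_uncertainty_reduction(v, belief))


# Belief over a coarse grid of where the target might be (probabilities sum to 1).
belief = {(0, 0): 0.1, (15, 0): 0.6, (30, 30): 0.3}
candidates = [(0.0, 0.0), (14.0, 2.0), (28.0, 28.0)]
print(choose_next_view(candidates, belief))   # -> (14.0, 2.0)
```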
Figure 4 Autonomous target grasping
Figure 5 Heterogeneous drone cluster collaborative control
A key research question is how to explore composite agent forms that enhance the ability to handle complex tasks; in the era of large models, this is a central direction for a new generation of intelligent agents.
To address this, the R&D team designed end effectors such as grippers on the drone platform, upgrading the traditional drone into a "flying robot" endowed with grasping capability.
At the same time, a heterogeneous UAV cluster collaborative control mechanism was established; combined with environmental perception feedback, the flight state of the UAV formation is adjusted in real time so that the cluster can divide labor and cooperate to perform tasks such as area search, target positioning, and grasping.
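The sketch below illustrates one simple way such a division of labor could be organized: each drone advertises its capabilities (search, locate, grasp), and sub-tasks are greedily assigned to the first idle drone able to perform them. The capability tags and the greedy rule are assumptions for illustration, not the team's actual collaborative control mechanism.

```python
from dataclasses import dataclass


@dataclass
class Drone:
    name: str
    capabilities: set[str]   # e.g. {"search"} or {"grasp", "locate"}


def assign_tasks(drones: list[Drone], tasks: list[str]) -> dict[str, str]:
    """Greedily assign each sub-task to the first idle drone able to perform it."""
    assignment: dict[str, str] = {}
    busy: set[str] = set()
    for task in tasks:
        for drone in drones:
            if drone.name not in busy and task in drone.capabilities:
                assignment[task] = drone.name
                busy.add(drone.name)
                break
    return assignment


fleet = [Drone("uav_1", {"search"}),
         Drone("uav_2", {"search", "locate"}),
         Drone("flying_robot_1", {"grasp", "locate"})]
print(assign_tasks(fleet, ["search", "locate", "grasp"]))
# {'search': 'uav_1', 'locate': 'uav_2', 'grasp': 'flying_robot_1'}
```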
The team successfully applied the triadic interaction model of biological intelligence, "thinking computing - entity control - environment perception", to autonomous agents, forming a large-model-driven autonomous drone cluster. The cluster uses large language models, drone platforms, and a variety of sensors to achieve conversational interaction, active perception, and autonomous control. This technology is of great significance for applications in vicinagearth security scenarios such as security inspection, disaster rescue, and air logistics.
Reference: Li Xuelong, Vicinagearth Security, Communications of the China Computer Federation, 18(11), 44-52, 2022.