Workshop home page: //m.sbmmt.com/link/f73850aa36d8564629a0d62c51009acf
Overview
This workshop aims to explore the gap between current state-of-the-art autonomous driving technology and comprehensive, reliable intelligent autonomous driving agents. In recent years, multimodal large language models (MLLMs), such as GPT-4V, have demonstrated unprecedented progress in multi-modal perception and understanding. However, using MLLMs to handle complex autonomous driving scenarios, especially rare but critical hard cases, remains an unsolved challenge. The workshop seeks to promote innovative research in multi-modal large model perception and understanding, the application of advanced AIGC technology in autonomous driving systems, and end-to-end autonomous driving.
Workshop Call for Papers
This call for papers covers topics such as multi-modal perception and understanding of autonomous driving scenes, image and video generation of autonomous driving scenes, end-to-end autonomous driving, and next-generation industrial-grade autonomous driving solutions, including but not limited to:
Submission rules:
Submissions undergo double-blind review through the OpenReview platform and are accepted in two forms:
Submission portal:
Challenge on Multimodal Understanding and Video Generation of Difficult Autonomous Driving Scenes
This competition aims to improve multi-modal models' perception and understanding of extreme situations in autonomous driving, as well as their ability to generate videos depicting such situations. We offer generous prizes and sincerely invite you to participate!
Track 1: Perception and understanding of difficult autonomous driving scenarios
This track focuses on the perception and understanding capabilities of multimodal large language models (MLLMs) in difficult autonomous driving scenarios, including overall scene understanding, regional understanding, and driving suggestions. It aims to promote the development of more reliable and explainable autonomous driving agents.
Track 2: Video Generation of Difficult Autonomous Driving Scenarios
This track focuses on the ability of diffusion models to generate multi-view videos of autonomous driving scenes. Given the 3D geometric structure of a scene, the model must generate the corresponding video while ensuring temporal consistency, multi-view consistency, and the specified resolution and video duration.
Competition time: June 15, 2024 to August 15, 2024
Prizes (per track): US$1,000 for the winner, US$800 for the runner-up, and US$600 for third place
Important dates (AoE time, UTC-12)
| Full Paper Submission | |
| Full Paper Submission Deadline | 1st Aug, 2024 |
| Full Paper Notification to Authors | 10th Aug, 2024 |
| Full Paper Camera Ready Deadline | 15th Aug, 2024 |
| Abstract Paper Submission Deadline | 1st Sep, 2024 |
| Abstract Paper Notification to Authors | 7th Sep, 2024 |
| Abstract Paper Camera Ready Deadline | 10th Sep, 2024 |
| Challenge | |
| Challenge Open to Public | 15th Jun, 2024 |
| Challenge Submission Deadline | 15th Aug, 2024 |
| Challenge Notification to Winner | 1st Sep, 2024 |
| | 30th Sep, 2024 |
If you have any questions about the Workshop and the Challenge, please contact: w-coda2024@googlegroups.com.
The call for papers and the challenge for the ECCV 2024 Workshop on Multi-modal Understanding and Video Generation of Autonomous Driving Difficult Scenarios are now open.