Community Learn Tools Library Leisure

English

Home > Technology peripherals > AI > New title: Peking University opens a new era: a new paradigm of category-level 6D object pose estimation achieves the latest and best results at NeurIPS 2023

New title: Peking University opens a new era: a new paradigm of category-level 6D object pose estimation achieves the latest and best results at NeurIPS 2023

WBOY

Release： 2023-12-04 18:41:41

forward

1418 people have browsed it

Researchers from Peking University have proposed a new category-level 6D object pose estimation method. This is a basic and important problem and is widely used in fields such as robotics, virtual reality and augmented reality. . They achieved new SOTA results in this paper, and it has been accepted by NeurIPS 2023, a top conference in the field of machine learning

6D object pose estimation as an important task in the field of computer vision , which has numerous applications in fields such as robotics, virtual reality, and augmented reality. Although significant progress has been made in instance-level object pose estimation, it requires prior knowledge of the object's characteristics and therefore cannot be easily applied to new objects, which limits its practical application. To solve this problem, in recent years, more and more research efforts have focused on category-level object pose estimation. Category-level pose estimation requires algorithms that do not rely on the CAD model of the object and can be directly applied to new objects of the same category as those in the training data.

At present, the currently widely used 6D object pose estimation methods can be divided into two major categories: one is the end-to-end method of direct regression, and the other is the two-stage method based on the object category prior. However, these methods all model the problem as a regression task, so special designs are needed to deal with multi-solution problems when dealing with symmetric objects and partially visible objects

To overcome these challenges, a research team from Peking University proposed A new category-level 6D object pose estimation paradigm is developed, which redefines the problem as a conditional distribution modeling problem, thereby achieving the latest optimal performance. They have also successfully applied this method to robot manipulation tasks such as pouring water as shown in the video.

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

Please click the following link to view the paper: https://arxiv.org/abs/2306.10531

Multiple solution problems in category-level 6D object attitude estimation

At the category level of 6D object attitude estimation, multiple solution problems refer to the problems that may exist under the same observation conditions Multiple reasonable pose estimates. This situation is mainly caused by two factors, as shown in Figure 1: symmetric objects and partial observations. For symmetrical objects, such as spherical or cylindrical objects, they may be exactly the same when viewed in different directions, so theoretically they have an infinite number of possible true values for their attitude. At the same time, a single perspective cannot obtain a complete observation of an object, such as a mug. Without observing the cup handle, there are infinitely many possible true values of the posture

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Figure 1. Source of multiple solution problems: symmetric objects and partial observations}

Method introduction

How to deal with the above How about solving multiple problems? The authors view this problem as a conditional distribution modeling problem and propose a method called GenPose, which utilizes a diffusion model to estimate the conditional distribution of object poses. The method first uses a score-based diffusion model to generate object pose candidates. The candidates are then aggregated in two steps: first, outliers are filtered out through likelihood estimation, and then the remaining candidate poses are aggregated through average pooling. In order to avoid the need for tedious integral calculations when estimating likelihood, the study authors also introduced an energy-based diffusion model training method to achieve end-to-end likelihood estimation

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Re-expressed as: Picture 2 shows the framework structure of GenPose}

The score-based diffusion model is used to generate object pose candidates

Rewritten content: The purpose of this step is to solve the multi-solution problem, so how to model the conditional probability distribution of the object's pose? The authors adopted a fraction-based diffusion model and constructed a continuous diffusion process using VE SDE (variational Euler stochastic differential equations). During the training process of the model, the goal is to estimate the fractional function of the perturbed conditional attitude distribution, and finally sample the candidate object attitude from the conditional distribution through Probability Flow ODE (ordinary differential equation)

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Generate object pose candidates based on the diffusion model of the score, as shown in Figure 3}

Apply to improve the accuracy of object recognition

Through the trained conditional distribution, an infinite number of object pose candidates can be generated. From these candidates, how to derive the final object pose? The simplest method is random sampling, but this method may not guarantee the stability of the prediction results. Is it possible to aggregate these pose candidates through average pooling? However, this aggregation method does not consider the quality of pose candidates and is easily affected by outliers. The author believes that the quality of pose candidates can be considered and aggregated through likelihood estimation. Specifically, based on the results of likelihood estimation, the object pose candidates are sorted, outliers with lower likelihood estimates are filtered out, and then the remaining pose candidates are average pooled to obtain the aggregated pose estimation results. However, using the diffusion model for likelihood estimation requires complex integral calculations, which seriously affects the inference speed and limits its practical application. In order to solve this problem, the author proposed to train an energy-based diffusion model, which is directly used for end-to-end likelihood estimation, thereby achieving rapid aggregation of candidates

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Figure 4. Energy-based diffusion model for likelihood estimation and object pose candidate aggregation}

Experiments and results

The author is at The performance of GenPose was verified on the REAL275 data set. It can be seen that GenPose is significantly better than the previous method in various indicators. Even when compared with methods that use more modal information, GenPose still has a big lead. , Table 1 shows the advantages of the generative object pose estimation paradigm proposed by the author. Figure 5 is the visualization result.

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{The content that needs to be rewritten is: comparison with other methods}

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{The fifth picture shows the prediction visualization effect of different methods}

The author also studied different aggregation methods (random sampling, random sorting and aggregation, Based on the influence of energy sorting and post-aggregation, GT sorting and post-aggregation). The results show that ranking using energy models significantly outperforms random sampling methods. In addition, the energy-based diffusion model proposed by the author to aggregate object pose candidates is also significantly better than the average pooling method after random sampling and random sorting

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Table 2. Comparison of different aggregation methods}

In order to better analyze the impact of the energy model, the author further studied the relationship between the estimated pose error and the predicted energy. Correlation. As shown in Figure 4, there is a general negative correlation between predicted pose error and energy. The energy model performs better when identifying postures with larger errors, but performs worse when identifying postures with smaller errors, which explains why the predicted energy is used to remove outliers instead of directly selecting the one with the largest energy. Candidate

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Figure 6. Energy and prediction error correlation analysis}

The author also demonstrated the method In terms of cross-category generalization capabilities, this method does not rely on category prior knowledge, and its performance in cross-category generalization is also significantly better than the previous method

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Table 3 shows the cross-category generalization effect. The slash on the left represents the performance when the test category is included in the training data set, and the slash on the right represents the performance after the test category is removed during training}

At the same time, due to the diffusion model Closed-loop generation process, the single-frame attitude estimation framework in the article can also be directly used for 6D object attitude tracking tasks without any special design. This method is better than the most advanced 6D object attitude tracking method in multiple indicators. The results are as shown in Table 4 shown.

新标题：北京大学开创新纪元：类别级6D物体位姿估计新范式在NeurIPS 2023取得最新最佳结果

^{Table 4. Category-level 6D object attitude tracking performance comparison}

Summary and outlook

This work proposes a new paradigm for category-level 6D object pose estimation. The training process does not require any special design for the multi-solution problems caused by symmetric objects and partial observations, and has achieved a new SOTA performance. Future work will leverage recent advances in diffusion models to accelerate the inference process and consider incorporating reinforcement learning to achieve active 6D object pose estimation.

Introduction to the research team:

The corresponding author of this study, Dong Hao, is an assistant professor, doctoral supervisor, liberal arts young scholar, and Intellectual Property Scholar at Peking University. It was founded and Leads the Hyperplane Lab at Peking University.

The co-authors of the paper Zhang Jiyao and Wu Mingdong are doctoral students at Peking University, and their supervisor is Professor Dong Hao. For details, please see their personal homepage. The content that needs to be rewritten is: Zhang Jiyao and Wu Mingdong are doctoral students at Peking University. They co-wrote a paper, and Mr. Dong Hao is their supervisor. For specific information, please check their personal homepage

What needs to be rewritten is: https://jiyao06.github.io/
https: //aaronanima.github.io/

The above is the detailed content of New title: Peking University opens a new era: a new paradigm of category-level 6D object pose estimation achieves the latest and best results at NeurIPS 2023. For more information, please follow other related articles on the PHP Chinese website!

Related labels：

industry 6d object pose estimation neurips 2023

source：jiqizhixin.com

Previous article：It’s not that large models cannot afford global fine-tuning, it’s just that LoRA is more cost-effective and the tutorial is ready. Next article：R-CNN author Ross Girshick resigned, He Kaiming and Xie Saining returned to academia, how many great minds have emerged from Meta CV

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Latest Articles by Author

What is a NullPointerException, and how do I fix it?

2024-10-22 09:46:29
From Novice to Coder: Your Journey Begins with C Fundamentals

2024-10-13 13:53:41
Unlocking Web Development with PHP: A Beginner's Guide

2024-10-12 12:15:51
Demystifying C: A Clear and Simple Path for New Programmers

2024-10-11 22:47:31
Unlock Your Coding Potential: C Programming for Absolute Beginners

2024-10-11 19:36:51
Unleash Your Inner Programmer: C for Absolute Beginners

2024-10-11 15:50:41
Automate Your Life with C: Scripts and Tools for Beginners

2024-10-11 15:07:41
PHP Made Easy: Your First Steps in Web Development

2024-10-11 14:21:21
Build Anything with Python: A Beginner's Guide to Unleashing Your Creativity

2024-10-11 12:59:11
The Key to Coding: Unlocking the Power of Python for Beginners

2024-10-11 12:17:31

Latest Issues

Select woocommerce related products using custom taxonomy with 3 level hierarchy I have a woocommerce store with a custom classification of "Sports". The classif...

From 2024-04-06 20:05:30

0

1

544

CSS styles not applied to site I'm making a website using Bootstrap5 but the index.css properties are not being applied t...

From 2024-04-06 17:12:23

0

1

336

Solving Vue3 webcomponents production build issues I'm trying to migrate my vue2web components to vue3, although the problem arises when I cr...

From 2024-04-06 12:43:37

0

1

473

Symfony Redis cannot connect to the host defined in the env file, which defaults to localhost We have a new Symfony setup with Redis as the caching mechanism. We want to connect to a s...

From 2024-04-06 10:53:02

0

1

375

Axios related errors encountered when building React applications using vite Axios works perfectly in production but when building the application I get this error. &g...

From 2024-04-05 13:20:02

0

1

326

Related Topics

More>

Popular Recommendations

Popular Tutorials

More>

Related Tutorials

Popular Recommendations

Latest courses

Latest Downloads

More>

Web Effects

Website Source Code

Website Materials

Front End Template