"The only true journey of discovery is not to visit a strange land, but to observe the universe through the eyes of others." - Marcel Proust
The sci-fi, poetic (and terrifying) idea of seeing the world through the eyes of others has come true!
"Black Mirror" Season 1 "The Whole History of You"
Now, we only need to use the reflection of the eyes to reconstruct the object the person is observing in three dimensions.
Yes, this is very Black Mirror.
Recently, a team from the University of Maryland proposed a brand-new method: using portraits that contain eye reflections to reconstruct, in three dimensions, scenes that the camera never captured.
Paper address: https://arxiv.org/abs/2306.09348
Project address: https://world-from-eyes.github.io/
Scenes from classic science fiction have come true?
Using eye reflections to reconstruct a radiance field? The idea may seem crazy, but it has a solid theoretical basis.
The authors say that because the human eye is highly reflective, it is possible to reconstruct and render the 3D scene a person is observing using only the eye reflections in a series of frames that capture head movement.
Given how very "Black Mirror" this concept is, and that the new season of "Black Mirror" was announced to be online only a few hours after the paper was released, the coincidence makes one wonder whether the show's director also noticed this paper. (Just kidding.)
Black Mirror Season 6 is online today
As soon as this study came out, netizens went crazy.
So, we’ve fast forwarded to this point?
Isn’t this the scene from “Ghost in the Shell” in the 2000s? All these fictions have become reality!
100% Blade Runner, give me a copy now.
Jules Verne's "Brothers Kip" comes true!
Of course, some people are horrified by this: this technology should never be used for investigation and evidence collection.
Today, we already have the Varjo eye-tracking camera, as well as Apple's Vision Pro and other headsets, and these devices can capture a large amount of footage. Combined with this new technology, countless new science fiction scenes may soon come true...
By exploiting the tiny reflections of light on the human eye, the research team developed a method to reconstruct the scene a person observes (outside the camera's direct view) from a sequence of monocular images taken at a fixed camera position.
However, simply training the radiance field on the observed reflections is not enough, for several reasons: 1) the inherent noise in cornea localization, 2) the complexity of the iris texture, and 3) the low resolution of the reflections captured in each image.
To address these challenges, the team introduced corneal pose optimization and iris texture decomposition during the training process, with the help of a radial texture regularization loss based on the human iris.
Unlike traditional neural field training methods that require moving the camera, the method they used places the camera at a fixed viewpoint and relies entirely on the user's movement.
Because the eye pose is difficult to estimate accurately and the iris texture is intertwined with the scene reflection, the task is quite challenging.
To solve this problem, the authors jointly optimize the eye pose, the radiance field describing the scene, and the observer's iris texture.
Specifically, there are three main contributions:
1. New 3D Reconstruction
A new method is proposed to reconstruct the 3D scene of the observer's world from eye images, combining earlier foundational work with the latest advances in neural rendering.
2. Radial prior of iris
A radial prior for iris texture decomposition is introduced, which significantly improves the quality of the reconstructed radiance field.
3. Optimization of Corneal Posture
A corneal pose optimization procedure was developed to mitigate the noise in eye pose estimation, overcoming the unique challenge of extracting features from the human eye.
The results show that with this new method, multiple views of the scene can be obtained from the eye reflections of a moving subject, ultimately yielding a complete scene reconstruction.
What's even more amazing is that the team also tried to use music videos of Miley Cyrus and Lady Gaga to recreate the scenes reflected in their eyes.
The authors stated that they successfully reconstructed the objects that appeared in Miley's eyes, and that the upper body of a person seemed to be seen through Lady Gaga's eyes.
However, since the quality of these videos is not high enough, it cannot be concluded that the reconstruction results are accurate.
Lady Gaga
Miley Cyrus
It is well known that the corneal geometry of healthy adults is almost identical.
So, simply by measuring the pixel size of a person's cornea in the image, the position of their eye can be estimated.
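The reasoning here is the ordinary pinhole-camera relation: if the physical size of the cornea is roughly known, its apparent size in pixels tells you how far away it is. Below is a minimal sketch of that idea, not the authors' code. The function name `estimate_cornea_depth` and the default radius of 5.9 mm are illustrative assumptions, and the focal length is assumed to be expressed in pixels.

```python
def estimate_cornea_depth(focal_px, radius_px, radius_mm=5.9):
    """Approximate distance from camera to cornea with a pinhole model.

    focal_px  : camera focal length in pixels (assumed known)
    radius_px : cornea radius measured in the image, in pixels
    radius_mm : assumed physical cornea radius (illustrative value)
    """
    if radius_px <= 0:
        raise ValueError("radius_px must be positive")
    # Similar triangles: radius_px / focal_px = radius_mm / depth_mm
    return focal_px * radius_mm / radius_px


if __name__ == "__main__":
    depth_mm = estimate_cornea_depth(focal_px=5000.0, radius_px=50.0)
    print(f"approximate depth: {depth_mm:.0f} mm")
```

Because the depth is inversely proportional to the measured pixel radius, a small error in the measured radius shifts the whole eye position, which is why the article later stresses pose optimization.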
Next, the authors trained the radiance field on the eye reflections by shooting rays from the camera and reflecting them off the approximate eye geometry.
To keep the iris from appearing in the reconstruction, the authors also trained a two-dimensional texture map that learns the iris texture, thereby decomposing the texture from the reflection.
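To make the ray step concrete, here is a small sketch that treats the cornea as a sphere, which is a simplifying assumption for illustration rather than the paper's exact eye model. It intersects a camera ray with the sphere and mirrors it about the surface normal; the reflected ray is the one that would be used to query the scene. The helper name `reflect_off_sphere` is hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    return tuple(x / norm for x in v)


def reflect_off_sphere(origin, direction, center, radius):
    """Return (hit_point, reflected_direction), or None if the ray misses.

    The cornea is approximated by a sphere (an assumption for this sketch).
    """
    d = _normalize(direction)
    oc = tuple(o - c for o, c in zip(origin, center))
    # Solve |oc + t*d|^2 = radius^2 for the nearest positive t.
    b = _dot(oc, d)
    c = _dot(oc, oc) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)
    if t <= 0:
        return None
    hit = tuple(o + t * x for o, x in zip(origin, d))
    normal = _normalize(tuple(h - c for h, c in zip(hit, center)))
    # Mirror reflection: r = d - 2 (d . n) n
    k = 2.0 * _dot(d, normal)
    reflected = tuple(x - k * n for x, n in zip(d, normal))
    return hit, reflected


if __name__ == "__main__":
    print(reflect_off_sphere((0, 0, 0), (0, 0, 1), (0, 0, 10), 2.0))
```

A ray that hits the sphere head-on bounces straight back toward the camera, while rays hitting near the edge fan out widely, which is how a tiny reflection can cover a large part of the scene.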
Synthetic data evaluation
First, the author conducted an evaluation on synthetic data by placing a human eye model in the Blender scene.
The image below shows a scene reconstructed using only eye reflections.
Since the cornea cannot be perfectly estimated in real life, the authors also evaluated the robustness of corneal pose optimization to noise in the estimated cornea radius.
To simulate the depth estimation errors that might be encountered in real data, the authors corrupted the observed cornea radius r_img by scaling it with a different noise level in each image.
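The article does not spell out the exact noise model, so the following is only one plausible reading: each image's measured radius is multiplied by a random factor close to 1. The function name `corrupt_radii`, the uniform distribution, and the `noise_level` parameter are all assumptions made for illustration.

```python
import random


def corrupt_radii(radii_px, noise_level, seed=0):
    """Scale each per-image cornea radius by a random factor.

    Assumed model: factor = 1 + eps, with eps drawn uniformly from
    [-noise_level, +noise_level] independently for every image.
    """
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    return [r * (1.0 + rng.uniform(-noise_level, noise_level)) for r in radii_px]


if __name__ == "__main__":
    clean = [50.0, 52.0, 49.5]
    for level in (0.0, 0.02, 0.05):
        print(level, corrupt_radii(clean, level))
```

Since depth is derived from the pixel radius, perturbing the radius this way is equivalent to injecting a per-image depth error, which is the quantity the robustness test is probing.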
The following figure shows the performance changes under different noise levels.
It is worth noting that as the noise increases, the authors' pose-optimized reconstruction remains more robust in both geometry and color than the reconstruction without pose optimization.
This proves that pose optimization is critical for real-world scenarios, as the fit from the projected cornea to the initial ellipse in the image is not perfect.
Additionally, quantitative comparisons with and without texture decomposition show that the authors' method performs better in terms of SSIM and LPIPS when texture decomposition is used.
It is worth noting that the author did not calculate PSNR because in the setup, the difference in lighting between reflections and the scene itself is very large.
Real World Assessment
To ensure a realistic field of view, the authors shot with a Sony RX IV camera and post-processed the images in Adobe Lightroom to reduce noise in the corneal reflections. They also added light sources on both sides of the subject to illuminate the target object.
During the process, the person being photographed needs to move within the camera's field of view so that the team can capture 5-15 frames of images in each scene.
Due to the large dynamic range of scene lighting, the authors used 16-bit images in all experiments to avoid losing information in the observed reflections.
On average, the cornea only covers about 0.1% of the area in each image, while the target object takes up about 20x20 pixels, interleaved with the iris texture.
Data processing
The authors first estimate the corneal center and radius from the image to obtain an initial estimate of the cornea's position.
Then, the three-dimensional position of the cornea is calculated using a direct approximation of the average depth and the focal length of the camera, and its surface normal is calculated.
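As a rough illustration of this step, the sketch below back-projects the cornea's pixel center to a 3D point using a given depth and the focal length, then computes the outward normal at a point on a spherical cornea. It assumes a pinhole camera with the principal point at `(cx, cy)` and a spherical cornea; the function names are hypothetical and not taken from the paper.

```python
import math


def backproject_center(u, v, depth, focal_px, cx, cy):
    """Lift pixel (u, v) to a 3D point in camera coordinates.

    depth is the distance along the optical axis; focal_px is the focal
    length in pixels; (cx, cy) is the assumed principal point.
    """
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return (x, y, depth)


def sphere_normal(point, center):
    """Unit normal of a sphere centered at `center`, evaluated at `point`."""
    diff = tuple(p - c for p, c in zip(point, center))
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("point coincides with the sphere center")
    return tuple(d / norm for d in diff)


if __name__ == "__main__":
    center = backproject_center(1100.0, 500.0, 600.0, 5000.0, 1000.0, 500.0)
    # A surface point slightly closer to the camera than the center.
    surface = (center[0], center[1], center[2] - 7.8)
    print(center, sphere_normal(surface, center))
```

The depth passed in would come from the pixel-size estimate described earlier, so any error there propagates directly into the 3D center and the normals.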
To automate this process, the authors use Grounding DINO to locate the bounding box of the eye and ELLSeg to fit an ellipse to the iris.
Although the cornea is usually occluded, we only need the unoccluded area, so we can use Segment Anything to obtain a segmentation mask for the iris.
Real results
As can be seen from the results shown in the picture below, the authors' method is able to reconstruct 3D scenes from real-world portrait images, despite the inaccuracy of the corneal position and geometry estimates.
Due to the ambiguity of the corneal boundary, it is very difficult to achieve precise positioning in the image.
Additionally, 3D reconstruction will also be more difficult for certain eye colors, such as green and blue, because the iris texture is brighter.
In addition, when the texture is not explicitly modeled, more "floaters" appear in the reconstruction.
In order to solve these problems, the quality of reconstruction can be improved by increasing the degree of radial regularization.
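The article does not give the formula for the radial regularizer, so the following is only a guess at its general shape. It assumes the iris texture is sampled on a polar grid and penalizes changes along the radial direction, so that a texture whose color is constant along each radius costs nothing. The name `radial_smoothness_loss` and the `weight` parameter, which stands in for the "degree" of regularization, are illustrative assumptions.

```python
def radial_smoothness_loss(texture, weight=1.0):
    """Penalize variation of an iris texture along the radial direction.

    texture[i][j] is the intensity at radial index i and angular index j
    (an assumed polar sampling). The loss is the weighted mean squared
    difference between radially adjacent samples at the same angle.
    """
    total, count = 0.0, 0
    for inner, outer in zip(texture, texture[1:]):
        for a, b in zip(inner, outer):
            total += (b - a) ** 2
            count += 1
    return weight * total / count if count else 0.0


if __name__ == "__main__":
    radially_constant = [[0.2, 0.8, 0.5]] * 4
    print(radial_smoothness_loss(radially_constant))
    print(radial_smoothness_loss([[0.0, 0.0], [1.0, 1.0]], weight=0.5))
```

Raising `weight` pushes the learned texture toward a radially constant pattern, which leaves less room for scene content to leak into the iris layer, at the cost of fitting real iris detail less faithfully.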
However, this method still has two main limitations.
First of all, the current real-world results are based on "laboratory settings", such as zooming in on faces, using additional light sources to illuminate the scene, etc. In a freer environment, you need to face greater challenges such as lower sensor resolution, smaller dynamic range, and motion blur.
Secondly, current assumptions about iris texture (e.g. constant texture, radially constant color) may be oversimplified, so the method may fail when the eye rotates significantly.
Co-author Kevin Zhang is currently a doctoral student at the University of Maryland.
Brandon Y. Feng received his PhD in computer science from the University of Maryland. His research interests focus on computational imaging, mid-level vision, and computational photography. He has developed machine learning algorithms for image and 3D data processing, with applications ranging from mixed reality to natural sciences.
Jia-Bin Huang is an associate professor at the University of Maryland and previously earned a PhD from UIUC. His research interests focus on the intersection of computer vision, computer graphics and machine learning.