Stable Video 3D makes a striking debut: a single image generates a 3D video with no blind spots, and the model weights are released

Mar 20, 2024, 10:31 PM
ai data

Stability AI has added a new member to its family of large models.

Yesterday, following Stable Diffusion and Stable Video Diffusion, Stability AI released a large 3D video generation model, "Stable Video 3D" (SV3D), to the community.

The model builds on Stable Video Diffusion, and its main advantage is significantly improved 3D generation quality and multi-view consistency. It outperforms both Stability AI's earlier Stable Zero123 and the jointly open-sourced Zero123-XL.

Stable Video 3D is currently available for commercial use, which requires a Stability AI Membership, and for non-commercial use, for which users can download the model weights from Hugging Face.

Stability AI provides two model variants, SV3D_u and SV3D_p. SV3D_u generates orbital videos from a single input image without camera conditioning. SV3D_p extends this capability by accepting both a single image and orbital views, allowing users to create 3D videos along a specified camera path.

The research paper on Stable Video 3D has also been released, with three core authors.

  • Paper: https://stability.ai/s/SV3D_report.pdf
  • Blog: https://stability.ai/news/introducing-stable-video-3d
  • Hugging Face: https://huggingface.co/stabilityai/sv3d

Technology Overview

Stable Video 3D delivers significant advances in 3D generation, especially in novel view synthesis (NVS).

Previous approaches often struggle with limited viewing angles and inconsistent outputs, whereas Stable Video 3D can provide a coherent view from any given angle and generalizes well. As a result, the model not only increases pose controllability but also keeps the object's appearance consistent across views, addressing key issues that affect realistic and accurate 3D generation.

As the comparison figure shows, Stable Video 3D generates novel multi-view images with finer detail, greater faithfulness to the input image, and better consistency across viewpoints than Stable Zero123 and Zero123-XL.

In addition, Stable Video 3D leverages its multi-view consistency to optimize 3D Neural Radiance Fields (NeRF) and thereby improve the quality of the 3D meshes generated directly from the novel views.

To this end, Stability AI designed a masked score distillation sampling (SDS) loss that further enhances 3D quality in regions not visible in the predicted views. To alleviate baked-in lighting, Stable Video 3D also uses a disentangled illumination model that is optimized jointly with the 3D shape and texture.

The image below shows an example of improved 3D mesh generation through 3D optimization when using the Stable Video 3D model and its output.

The following figure shows the comparison of the 3D mesh results generated using Stable Video 3D with those generated by EscherNet and Stable Zero123.

Architecture details

As shown in Figure 2, the Stable Video 3D architecture builds on that of Stable Video Diffusion. It contains a UNet with multiple layers, each consisting of one residual block with Conv3D layers followed by two transformer blocks (spatial and temporal) with attention layers.

The specific changes are as follows:

(i) The "fps id" and "motion bucket id" conditioning inputs are removed, because they are irrelevant to Stable Video 3D;

(ii) The conditioning image is embedded into the latent space by the VAE encoder of Stable Video Diffusion and then concatenated with the noisy latent state input z_t that the UNet receives at noise timestep t;

(iii) The CLIP embedding matrix of the conditioning image is provided to the cross-attention layers of each transformer block as the keys and values, while the queries are the features of the corresponding layer;

(iv) The camera trajectory is fed into the residual blocks along with the diffusion noise timestep. The camera pose angles e_i and a_i and the noise timestep t are first embedded into sinusoidal position embeddings. The camera pose embeddings are then concatenated, linearly transformed, and added to the noise timestep embedding. The result is fed into every residual block, where it is added to the block's input features.
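Step (iv) can be illustrated with a minimal sketch in plain Python. It is not Stability AI's implementation: the embedding width `dim`, the random `weight` matrix standing in for the learned linear layer, the `max_period` constant, and all function names are assumptions made for illustration.

```python
import math
import random

def sinusoidal_embedding(value, dim, max_period=10000.0):
    """Map a scalar to `dim` sines and cosines at geometrically spaced frequencies (dim must be even)."""
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(value * f) for f in freqs] + [math.cos(value * f) for f in freqs]

def linear(x, weight, bias):
    """Matrix-vector product; `weight` has shape (out_dim, in_dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def conditioning_vector(elevation, azimuth, timestep, weight, bias, dim):
    """Embed e_i, a_i and t, project the concatenated pose embedding, add it to the timestep embedding."""
    pose = sinusoidal_embedding(elevation, dim) + sinusoidal_embedding(azimuth, dim)  # concatenation, length 2*dim
    pose_proj = linear(pose, weight, bias)  # linear transformation back to length dim
    t_emb = sinusoidal_embedding(timestep, dim)
    return [p + t for p, t in zip(pose_proj, t_emb)]

# Hypothetical sizes and weights; in a trained model the linear layer is learned.
dim = 8
rng = random.Random(0)
weight = [[rng.gauss(0.0, 0.02) for _ in range(2 * dim)] for _ in range(dim)]
bias = [0.0] * dim

cond = conditioning_vector(math.radians(10.0), math.radians(45.0), 500.0, weight, bias, dim)

# Inside a residual block, the conditioning vector is added to the block's input features.
features = [0.0] * dim
features = [f + c for f, c in zip(features, cond)]
```

One such vector would be computed per frame, since each frame of the orbit has its own elevation and azimuth.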

In addition, Stability AI designed static orbits and dynamic orbits to study the impact of camera pose conditioning, as shown in Figure 3.

On a static orbit, the camera rotates around the object at equally spaced azimuth angles, using the same elevation angle as the conditioning image. The disadvantage is that, depending on that elevation angle, the views may reveal nothing about the top or bottom of the object. On a dynamic orbit, the azimuth angles can be unevenly spaced, and the elevation angle can differ from view to view.
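A static orbit as described above can be sketched in a few lines of Python. The function name, the use of degrees, and the convention that the last frame returns to the conditioning azimuth are assumptions made for illustration, not details taken from the paper.

```python
def static_orbit(num_frames, cond_elevation_deg, cond_azimuth_deg=0.0):
    """Return (elevation, azimuth) pairs in degrees for a static orbit.

    Azimuths are equally spaced over a full turn; the elevation is fixed at
    the conditioning image's elevation. The last frame lands back on the
    conditioning azimuth (an assumed convention).
    """
    orbit = []
    for i in range(1, num_frames + 1):
        azimuth = (cond_azimuth_deg + 360.0 * i / num_frames) % 360.0
        orbit.append((cond_elevation_deg, azimuth))
    return orbit

# Illustrative values: 4 frames at 10 degrees elevation.
for elevation, azimuth in static_orbit(4, 10.0):
    print(f"elevation={elevation:.1f}  azimuth={azimuth:.1f}")
```

Because the elevation never changes, a low conditioning elevation means no frame ever looks down onto the top of the object, which is the limitation noted above.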

To build a dynamic orbit, Stability AI samples a static orbit, then adds small random noise to its azimuth angles and a randomly weighted combination of sinusoids of different frequencies to its elevation angles. This keeps the trajectory temporally smooth and ensures that it loops back to end at the same azimuth and elevation as the conditioning image.
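The construction can be sketched as follows. This is a rough illustration only: the jitter range, the elevation amplitude, the number of sinusoids, and the choice of integer frequencies are all assumed values, and the function name is hypothetical.

```python
import math
import random

def dynamic_orbit(num_frames, cond_elevation_deg, rng,
                  azimuth_jitter_deg=2.0, elevation_amplitude_deg=10.0,
                  num_sinusoids=3):
    """Perturb a static orbit into a dynamic one; returns (elevation, azimuth) pairs in degrees."""
    # Random weights for the sinusoids, normalized to sum to 1 so the
    # elevation offset stays within +/- elevation_amplitude_deg.
    weights = [rng.random() for _ in range(num_sinusoids)]
    total = sum(weights)
    weights = [w / total for w in weights]

    orbit = []
    for i in range(1, num_frames + 1):
        phase = 2.0 * math.pi * i / num_frames
        # Small azimuth noise on every frame except the last, so the
        # trajectory ends exactly on the conditioning azimuth.
        jitter = 0.0 if i == num_frames else rng.uniform(-azimuth_jitter_deg, azimuth_jitter_deg)
        azimuth = (360.0 * i / num_frames + jitter) % 360.0
        # Integer-frequency sinusoids all vanish at phase 2*pi, so the last
        # frame returns to the conditioning elevation.
        offset = sum(w * math.sin((k + 1) * phase) for k, w in enumerate(weights))
        elevation = cond_elevation_deg + elevation_amplitude_deg * offset
        orbit.append((elevation, azimuth))
    return orbit

orbit = dynamic_orbit(21, 10.0, random.Random(0))
```

Using sinusoids whose frequencies are whole multiples of one turn is one simple way to get both properties the text mentions: the elevation varies smoothly, and it closes the loop at the end of the orbit.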

Experimental Results

Stability AI evaluated the multi-view synthesis quality of Stable Video 3D on static and dynamic orbits using the unseen GSO and OmniObject3D datasets. The results, reported in Tables 1 through 4, show that Stable Video 3D achieves state-of-the-art performance in novel multi-view synthesis.

Tables 1 and 3 compare Stable Video 3D with other models on static orbits and show that even SV3D_u, the variant without pose conditioning, outperforms all previous methods.

The ablation results show that SV3D_c and SV3D_p outperform SV3D_u on static orbits, even though SV3D_u is trained exclusively on static orbits.

Tables 2 and 4 show the results on dynamic orbits for the pose-conditioned models SV3D_c and SV3D_p, with SV3D_p achieving state-of-the-art results on all metrics.

The visual comparison in Figure 6 further demonstrates that the images generated by Stable Video 3D are more detailed, more faithful to the conditioning image, and more consistent across viewing angles.

Please refer to the original paper for more technical details and experimental results.
