NeVRF: Neural Video-based Radiance Fields for Long-duration Sequences

KU Leuven
International Conference on 3D Vision 2024

Abstract

Adopting Neural Radiance Fields (NeRF) to long-duration dynamic sequences has been challenging. Existing methods struggle to balance between quality and storage size and encounter difficulties with complex scene changes such as topological changes and large motions. To tackle these issues, we propose a novel neural video-based radiance fields (NeVRF) representation. NeVRF marries neural radiance field with image-based rendering to support photo-realistic novel view synthesis on long-duration dynamic inward-looking scenes. We introduce a novel multi-view radiance blending approach to predict radiance directly from multi-view videos. By incorporating continual learning techniques, NeVRF can efficiently reconstruct frames from sequential data without revisiting previous frames, enabling long-duration free-viewpoint video. Furthermore, with a tailored compression approach, NeVRF can compactly represent dynamic scenes, making dynamic radiance fields more practical in real-world scenarios. Our extensive experiments demonstrate the effectiveness of NeVRF in enabling long-duration sequence rendering, sequential data reconstruction, and compact data storage.

Method

NeVRF Rendering

NeVRF exploits a multi-view radiance blending method to predict the RGB color at each sample point, while the point's density is directly interpolated from the density grid.
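Below is a minimal sketch of this two-part query, assuming a PyTorch setup: density comes from trilinear interpolation on a voxel grid, and color comes from softmax-blending the colors that the point observes in the source views. The grid shape, the feature dimensions, and the blending MLP interface are illustrative placeholders, not the paper's exact implementation.

# Hedged sketch: density from grid interpolation, color from multi-view blending.
import torch
import torch.nn.functional as F

def query_density(density_grid, pts):
    """Trilinearly interpolate density at 3D points normalized to [-1, 1]^3.

    density_grid: (1, 1, D, H, W) voxel grid of densities.
    pts:          (N, 3) sample coordinates.
    """
    grid_coords = pts.view(1, -1, 1, 1, 3)                     # (1, N, 1, 1, 3)
    sigma = F.grid_sample(density_grid, grid_coords, align_corners=True)
    return sigma.view(-1)                                      # (N,)

def blend_radiance(blend_mlp, src_rgbs, src_feats, view_dirs):
    """Predict per-point RGB by blending colors sampled from the source views.

    src_rgbs:  (N, V, 3) colors of each point projected into V source views.
    src_feats: (N, V, C) image features at those projections (assumed inputs).
    view_dirs: (N, V, 4) relative viewing-direction encoding per source view.
    """
    x = torch.cat([src_rgbs, src_feats, view_dirs], dim=-1)    # (N, V, 3+C+4)
    weights = torch.softmax(blend_mlp(x).squeeze(-1), dim=-1)  # (N, V)
    rgb = (weights.unsqueeze(-1) * src_rgbs).sum(dim=1)        # (N, 3)
    return rgb

The blended colors and interpolated densities would then be composited along each ray with standard volume rendering.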


NeVRF Training

We develop a ray-based retraining strategy to combat catastrophic forgetting in the blending network when processing sequential data.
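One way to realize such a ray-based strategy is to keep a bounded buffer of rays from earlier frames and mix them into each batch while fitting the current frame, so the blending network is never trained on new data alone. The sketch below assumes this replay-style formulation; the buffer size, mixing ratio, and the model.render interface are illustrative assumptions rather than the paper's exact procedure.

# Hedged sketch: replay a subset of old rays alongside new-frame rays.
import random
import torch

class RayReplayBuffer:
    """Stores a bounded set of (ray, target-color) pairs from past frames."""
    def __init__(self, capacity=200_000):
        self.capacity = capacity
        self.rays, self.rgbs = [], []

    def add(self, rays, rgbs):
        for r, c in zip(rays, rgbs):
            if len(self.rays) < self.capacity:
                self.rays.append(r); self.rgbs.append(c)
            else:  # replace a random stored ray to keep the buffer bounded and mixed
                idx = random.randrange(self.capacity)
                self.rays[idx], self.rgbs[idx] = r, c

    def sample(self, n):
        idx = random.sample(range(len(self.rays)), min(n, len(self.rays)))
        return (torch.stack([self.rays[i] for i in idx]),
                torch.stack([self.rgbs[i] for i in idx]))

def train_frame(model, optimizer, new_rays, new_rgbs, buffer,
                steps=1000, batch=4096, replay_ratio=0.5):
    """Fit the current frame while replaying old rays to reduce forgetting."""
    n_old = int(batch * replay_ratio)
    for _ in range(steps):
        i = torch.randint(0, new_rays.shape[0], (batch - n_old,))
        rays, rgbs = new_rays[i], new_rgbs[i]
        if len(buffer.rays) > 0:
            old_rays, old_rgbs = buffer.sample(n_old)
            rays = torch.cat([rays, old_rays]); rgbs = torch.cat([rgbs, old_rgbs])
        loss = ((model.render(rays) - rgbs) ** 2).mean()  # photometric loss
        optimizer.zero_grad(); loss.backward(); optimizer.step()
    buffer.add(new_rays, new_rgbs)  # retain rays from this frame for later replay

Because only stored rays are replayed, previous frames never need to be reloaded or re-rendered in full, which is what makes sequential, long-duration reconstruction tractable.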


NeVRF Compression

We employ Singular Value Decomposition (SVD) to decompose the density grids, retaining only the top singular values and discarding insignificant signal components. The multi-view videos are compressed with the H.265 codec at a bitrate of 1 Mbps per view.
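The density-grid side of this scheme amounts to rank-truncated SVD. The NumPy sketch below illustrates the idea under assumed settings; the grid resolution, rank, and flattening layout are placeholders, not the paper's configuration, and the video streams would be handled separately by a standard H.265 encoder.

# Hedged sketch: rank-truncated SVD compression of a per-frame density grid.
import numpy as np

def compress_density_grid(grid, rank=32):
    """Flatten a (D, H, W) density grid to a 2D matrix and keep only the
    top-`rank` singular values, dropping insignificant signal components."""
    D, H, W = grid.shape
    mat = grid.reshape(D, H * W)
    U, S, Vt = np.linalg.svd(mat, full_matrices=False)
    return U[:, :rank], S[:rank], Vt[:rank], (D, H, W)   # compact factors

def decompress_density_grid(U, S, Vt, shape):
    """Reconstruct an approximate density grid from the truncated factors."""
    return (U @ np.diag(S) @ Vt).reshape(shape)

# Example: a 160^3 grid stored as rank-32 factors instead of a dense volume.
grid = np.random.rand(160, 160, 160).astype(np.float32)
factors = compress_density_grid(grid, rank=32)
approx = decompress_density_grid(*factors)

Storing only the truncated factors replaces the dense volume with a far smaller set of matrices, which together with the 1 Mbps video streams keeps per-frame storage compact.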


Results


Gallery of rendered examples. Our neural pipeline enables efficient training and photo-realistic rendering of dynamic scenes. All results are rendered from novel viewpoints.

Please refer to our paper for more results.