TeTriRF: Temporal Tri-Plane Radiance Fields for Efficient Free-Viewpoint Video

KU Leuven
CVPR 2024

Abstract

Neural Radiance Fields (NeRF) revolutionize the realm of visual media by providing photorealistic Free-Viewpoint Video (FVV) experiences, offering viewers unparalleled immersion and interactivity. However, the technology's significant storage requirements and the computational complexity involved in generation and rendering currently limit its broader application. To close this gap, this paper presents Temporal Tri-Plane Radiance Fields (TeTriRF), a novel technology that significantly reduces the storage size for FVV while maintaining low-cost generation and rendering. TeTriRF introduces a hybrid representation with tri-planes and voxel grids to support scaling up to long-duration sequences and scenes with complex motions or rapid changes. We propose a group training scheme tailored to achieving high training efficiency and yielding temporally consistent, low-entropy scene representations. Leveraging these properties, we introduce a compression pipeline built on off-the-shelf video codecs, achieving an order of magnitude smaller storage than the state of the art. Our experiments demonstrate that TeTriRF achieves competitive rendering quality at a substantially higher compression rate.

Method

TeTriRF Representation

(a) For each frame in the stream, we factorize the radiance field into a tri-plane and a 3D density grid. This hybrid approach captures high-dimensional appearance features in compact planes and enables efficient point sampling through the explicit density grid, striking a balance between compactness and representational power.
(b) Building upon this hybrid representation, we adopt a deferred shading model paired with lightweight MLP decoders to bring real-time rendering within reach (see the sketch below).
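
The following is a minimal PyTorch sketch of how such a hybrid representation can be queried and rendered with deferred shading. All names, resolutions, and channel counts here are illustrative assumptions for exposition, not the actual TeTriRF implementation.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch of a per-frame hybrid representation: a tri-plane for
# appearance features plus an explicit 3D density grid. Sizes are assumptions.
class HybridFrame(torch.nn.Module):
    def __init__(self, plane_res=256, grid_res=128, feat_dim=16):
        super().__init__()
        # Three axis-aligned feature planes: XY, XZ, YZ.
        self.planes = torch.nn.Parameter(
            0.1 * torch.randn(3, feat_dim, plane_res, plane_res))
        # Explicit density grid enables cheap point sampling / space skipping.
        self.density = torch.nn.Parameter(
            torch.zeros(1, 1, grid_res, grid_res, grid_res))

    def query(self, xyz):
        """xyz in [-1, 1]^3, shape (N, 3); returns features (N, 3C), sigma (N,)."""
        n = xyz.shape[0]
        # Project each point onto the three planes and bilinearly sample.
        proj = torch.stack([xyz[:, [0, 1]], xyz[:, [0, 2]], xyz[:, [1, 2]]])
        feat = F.grid_sample(self.planes, proj.unsqueeze(2),
                             align_corners=True)              # (3, C, N, 1)
        feat = feat.squeeze(-1).permute(2, 0, 1).reshape(n, -1)
        # Trilinearly sample density from the explicit grid.
        sigma = F.grid_sample(self.density, xyz.view(1, n, 1, 1, 3),
                              align_corners=True).view(n)
        return feat, F.softplus(sigma)

# Lightweight decoder for deferred shading: one MLP call per ray, not per sample.
decoder = torch.nn.Sequential(
    torch.nn.Linear(3 * 16 + 3, 64), torch.nn.ReLU(),  # +3 for view direction
    torch.nn.Linear(64, 3), torch.nn.Sigmoid())

def render_ray(field, pts, deltas, viewdir):
    """pts: (S, 3) samples along one ray; deltas: (S,) step sizes."""
    feat, sigma = field.query(pts)
    alpha = 1.0 - torch.exp(-sigma * deltas)
    trans = torch.cumprod(torch.cat([alpha.new_ones(1), 1 - alpha[:-1]]), 0)
    weights = (alpha * trans).unsqueeze(-1)               # (S, 1)
    ray_feat = (weights * feat).sum(0)                    # composite features
    return decoder(torch.cat([ray_feat, viewdir]))        # decode once per ray
```

Because density lives in an explicit grid, empty space can be skipped before any feature lookup, and the color MLP runs once per ray rather than once per sample, which is what makes real-time rendering plausible.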


TeTriRF Training

The proposed training strategy groups consecutive frames from the sequential data and reduces the entropy of the frame representations by imposing temporal consistency through intra-group and inter-group regularizers. By sharing temporal information during training, TeTriRF dramatically accelerates training compared to per-frame training methods.
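
A hedged sketch of what such regularizers might look like in PyTorch is given below; the specific loss forms and weights are assumptions, since the paper defines its own intra-group and inter-group terms.

```python
import torch

# Illustrative sketch: encourage temporal consistency with simple L1 penalties
# between the tri-plane representations of adjacent frames.
def intra_group_loss(group_planes):
    """group_planes: list of (3, C, H, W) tensors for consecutive frames,
    all optimized jointly within the current group."""
    loss = 0.0
    for prev, curr in zip(group_planes[:-1], group_planes[1:]):
        loss = loss + (curr - prev).abs().mean()
    return loss

def inter_group_loss(curr_first, prev_last):
    """Tie the first frame of the current group to the (frozen) last frame
    of the previous group, so consistency holds across group boundaries."""
    return (curr_first - prev_last.detach()).abs().mean()

# Example: a group of 10 frames plus the anchor from the previous group.
planes = [torch.randn(3, 16, 256, 256, requires_grad=True) for _ in range(10)]
prev_last = torch.randn(3, 16, 256, 256)
reg = intra_group_loss(planes) + 0.1 * inter_group_loss(planes[0], prev_last)
reg.backward()
```

Penalizing differences between adjacent frames serves double duty: it shares information across the group (faster convergence) and it shrinks inter-frame residuals, lowering the entropy that the compression stage must pay for.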


TeTriRF Compression

We develop a compression pipeline specifically for TeTriRF, which comprises value quantization, empty-space removal, serialization of the representations into 2D images, and subsequent video encoding.
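
The sketch below illustrates this style of pipeline with NumPy and imageio (the imageio-ffmpeg backend is assumed); the quantization range, tiling layout, and codec settings are placeholders rather than the paper's exact choices.

```python
import numpy as np
import imageio  # requires the imageio-ffmpeg backend for mp4 output

def quantize(x, vmin=-2.0, vmax=2.0):
    """Clip to a fixed range and quantize to 8-bit for video encoding.
    The range is an assumed placeholder."""
    x = np.clip(x, vmin, vmax)
    return np.round((x - vmin) / (vmax - vmin) * 255).astype(np.uint8)

def serialize_frame(planes, occupancy_mask):
    """planes: (3, C, H, W) floats; occupancy_mask: (3, H, W) in {0, 1},
    derived from the density grid. Returns one 2D uint8 image with the
    3*C channels tiled side by side."""
    planes = planes * occupancy_mask[:, None]             # remove empty space
    c, h, w = planes.shape[1], planes.shape[2], planes.shape[3]
    tiles = quantize(planes).reshape(3 * c, h, w)
    return tiles.transpose(1, 0, 2).reshape(h, 3 * c * w)  # (H, 3*C*W)

writer = imageio.get_writer("planes.mp4", fps=30, codec="libx264",
                            output_params=["-crf", "28"])
for t in range(60):  # e.g. a 60-frame sequence with dummy data
    planes = np.random.randn(3, 8, 128, 128).astype(np.float32)
    mask = np.ones((3, 128, 128), dtype=np.float32)
    writer.append_data(serialize_frame(planes, mask))
writer.close()
```

Serializing the per-frame planes into a consistent 2D layout matters: the temporal consistency learned during group training translates directly into small inter-frame residuals, which is exactly what off-the-shelf video codecs are built to exploit.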


Results


Comparative analysis of training speed and storage size on the ReRF dataset.

Comparison on the NHR and ReRF Datasets

Qualitative comparisons on the NHR and ReRF datasets.

Comparison on the DyNeRF Dataset

Qualitative comparisons on the DyNeRF dataset.

Ablation

Qualitative results of the complete TeTriRF model, its variants, and ReRF. The variants are compared at approximately matched storage sizes.


Please refer to our paper for more results.