Abstract

The conventional mesh-based Level of Detail (LoD) technique, exemplified by applications such as Google Earth and many game engines, can holistically represent a large scene, even the entire Earth, and renders it with a space complexity of O(log 𝑛). This bounded data requirement not only improves rendering efficiency but also facilitates dynamic data fetching, enabling a seamless 3D navigation experience for users.
In this work, we extend this proven LoD technique to Neural Radiance Fields (NeRF) by introducing an octree structure that represents the scene at different scales. This approach provides a mathematically simple and elegant representation with a rendering space complexity of O(log 𝑛), in line with the efficiency of mesh-based LoD techniques. We also present a novel training strategy that maintains a complexity of O(𝑛). This strategy allows for parallel training with minimal overhead, ensuring the scalability and efficiency of our proposed method. Our contribution lies not only in extending the capabilities of existing techniques but also in establishing a foundation for scalable and efficient large-scale scene representation using NeRF and octree structures.

Overview

NeRF with LoD


The core of InfNeRF is a LoD structure, essentially an octree in which each node represents a specific cubic region of the scene. InfNeRF begins with the entire scene encapsulated within a root node, which is then divided into eight smaller cubes corresponding to its child nodes. This division continues recursively, partitioning the scene into increasingly finer parts, both in terms of space (size of the area) and scale (level of detail). With this LoD partition, InfNeRF requires only a minimal subset of the nodes for rendering, chosen based on their proximity to the camera and their intersection with the view frustum. As illustrated in (a), when zooming out, only the root node is required. In (b), when zooming in, only the leaf node is required. In (c), when looking at the horizon, approximately O(log 𝑛) nodes are required, which is the upper bound for InfNeRF. In (d), by contrast, other methods require all the blocks when zooming out, resulting in an upper bound of O(𝑛). InfNeRF therefore significantly reduces the memory burden and the I/O time for retrieving parameters from disk or cloud storage.
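The node selection described above can be sketched as a recursive octree traversal. This is a minimal illustration under stated assumptions, not the InfNeRF implementation: the names `Node`, `build`, `box_distance`, and `select`, and the `ratio` threshold, are hypothetical, and the frustum intersection test is omitted for brevity. A node is accepted when its cube is small relative to its distance from the camera; otherwise the traversal descends into its eight children.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    center: tuple          # cube center (x, y, z)
    half: float            # half of the cube edge length
    depth: int             # 0 for the root
    children: list = field(default_factory=list)


def build(center, half, depth, max_depth):
    """Recursively split a cube into eight children down to max_depth."""
    node = Node(center, half, depth)
    if depth < max_depth:
        h = half / 2
        for dx in (-h, h):
            for dy in (-h, h):
                for dz in (-h, h):
                    child_center = (center[0] + dx, center[1] + dy, center[2] + dz)
                    node.children.append(build(child_center, h, depth + 1, max_depth))
    return node


def box_distance(node, cam):
    """Distance from the camera to the nearest point of the cube (0 if inside)."""
    gaps = (max(abs(c - p) - node.half, 0.0) for c, p in zip(node.center, cam))
    return math.sqrt(sum(g * g for g in gaps))


def select(node, cam, ratio=1.0):
    """Return the nodes needed for a camera position (assumed criterion:
    accept a node if its edge length is at most ratio * distance)."""
    is_leaf = not node.children
    if is_leaf or 2 * node.half <= ratio * box_distance(node, cam):
        return [node]
    selected = []
    for child in node.children:
        selected.extend(select(child, cam, ratio))
    return selected


root = build((0.0, 0.0, 0.0), 1.0, 0, max_depth=3)
far_nodes = select(root, (1000.0, 0.0, 0.0))   # zoomed out
near_nodes = select(root, (0.1, 0.1, 0.1))     # camera inside the scene
```

In this sketch a distant camera selects only the root, while a camera inside the scene selects fine nodes nearby and coarser nodes farther away, so the selected cubes still tile the whole scene without overlap.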

Result on Window of the World, Shenzhen, rendering with < 17% of the model

Result on Residential, rendering with < 16% of the model

Result on Sci-Art, rendering with < 22% of the model

Anti-aliasing

The parent nodes in the LoD tree act as a smoother, low-pass-filtered version of the scene. Therefore, high-frequency components in distant views are automatically removed, which effectively reduces the aliasing artifacts.

InfNeRF

Nerfacto

No Popup

Randomly perturbing the radius of the sample sphere in InfNeRF can effectively mitigate the pop-up effect commonly observed in traditional LoD rendering techniques.
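One way to picture this is as dithering between adjacent LoD levels. The sketch below is an assumption-laden illustration, not the method as implemented in InfNeRF: the level formula (`floor(log2(root_size / radius))`), the log-uniform jitter, and the names `level_for_radius`, `perturbed_level`, and `spread` are all hypothetical. Without jitter, every sample of a given radius maps to one level, so the level switches abruptly as the camera moves; with jitter, samples near a boundary are spread across two neighbouring levels.

```python
import math
import random


def level_for_radius(radius, root_size, max_level):
    """Assumed mapping: halving the sample radius moves one level deeper."""
    level = math.floor(math.log2(root_size / radius))
    return max(0, min(max_level, level))


def perturbed_level(radius, root_size, max_level, rng, spread=0.5):
    """Jitter the radius by a random factor in [2**-spread, 2**spread]
    before choosing the level (hypothetical perturbation scheme)."""
    jitter = 2.0 ** rng.uniform(-spread, spread)
    return level_for_radius(radius * jitter, root_size, max_level)


rng = random.Random(0)
fixed = level_for_radius(1.0, 1024.0, max_level=12)
jittered = {perturbed_level(1.0, 1024.0, 12, rng) for _ in range(1000)}
```

Here `fixed` is a single level, while `jittered` contains the two levels on either side of the boundary, which is what softens the visible transition.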

TDOM

InfNeRF can generate a high-resolution, high-quality TDOM (true digital orthophoto map). Download the full-resolution (10240 × 10240, 100 MB) TDOM here.

Citation

Acknowledgements

We would like to thank DJI for providing us with the Window of the World dataset.
The website template was borrowed from Michaël Gharbi.