Introduction

David Marr, a pioneer of computer vision, defined the ultimate computer vision problem as recovering the position and shape of a three-dimensional object from a two-dimensional image; in Marr's framework, other tasks such as recognition and detection are "pattern recognition" problems rather than "computer vision" problems. The distinction matters because Marr argued that if this ultimate problem could be solved, all the other problems could be solved as well. In that sense, NeRF (Neural Radiance Fields) addresses the most fundamental problem in computer vision, and the results it demonstrates represent progress on that most fundamental front.

View synthesis

SDF

Neural volume

RGB-alpha volume rendering for view synthesis

NeRF

NeRF (Neural Radiance Fields) fits a radiance field with a neural network, specifically a multilayer perceptron (MLP). It is an implicit scene representation: unlike a point cloud, mesh, or voxel grid, the 3D model cannot be inspected directly. Instead, the network represents the scene as the volume density and color at any point in space. Given a scene represented this way, we can render it with volume rendering and synthesize images from new viewpoints, as sketched below.
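To make this concrete, here is a minimal sketch in PyTorch. The class name TinyNeRF, the helper composite, and all layer sizes are illustrative assumptions, not the architecture of the original paper: an MLP maps a 3D point and viewing direction to a density and a color, and a standard alpha-compositing step turns the samples along one camera ray into a pixel color.

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """Minimal NeRF-style MLP (illustrative sizes, not the paper's exact architecture).

    Maps a 3D point and a viewing direction to a volume density and an RGB color,
    which a volume renderer composites along camera rays.
    """
    def __init__(self, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)        # volume density at the point
        self.rgb_head = nn.Sequential(                # color, conditioned on view direction
            nn.Linear(hidden + 3, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir):
        h = self.trunk(xyz)
        sigma = torch.relu(self.sigma_head(h))                 # density >= 0
        rgb = self.rgb_head(torch.cat([h, view_dir], dim=-1))  # color in [0, 1]
        return sigma, rgb

def composite(sigmas, rgbs, deltas):
    """Alpha-composite per-sample (density, color) pairs along a single ray."""
    alphas = 1.0 - torch.exp(-sigmas * deltas)           # opacity of each ray segment
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alphas[:1]), 1.0 - alphas + 1e-10])[:-1], dim=0
    )                                                    # transmittance up to each sample
    weights = alphas * trans
    return (weights[:, None] * rgbs).sum(dim=0)          # expected color of the ray

# Usage sketch: query sample points along one ray, then composite to a pixel color.
model = TinyNeRF()
pts = torch.rand(64, 3)                                  # 64 sample points (placeholder values)
dirs = torch.nn.functional.normalize(torch.rand(64, 3), dim=-1)
deltas = torch.full((64,), 0.01)                         # spacing between adjacent samples
sigma, rgb = model(pts, dirs)
pixel = composite(sigma.squeeze(-1), rgb, deltas)
```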

Radiance field

Volume rendering

Neural radiance field

Features

Positional encoding

Hierarchical volume sampling

Follow-up work