Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance#

  • Authors: Lior Yariv, Yoni Kasten, Dror Moran, Meirav Galun, Matan Atzmon, Ronen Basri, Yaron Lipman

  • Affiliations: Weizmann Institute of Science

  • NeurIPS 2020

  • Links: arXiv, Project Page, Code

Summary#

In this work the authors introduce a neural network architecture that simultaneously learns the unknown geometry, the camera parameters, and a neural renderer that approximates the light reflected from the surface towards the camera. Trained on real-world 2D images from the DTU MVS dataset, covering objects with different material properties and lighting conditions and starting from noisy camera initializations, the model produces state-of-the-art 3D surface reconstructions with high fidelity, resolution, and detail.

![](../_images/fig-022.png)

Key Ideas#

The goal is to reconstruct the geometry of an object from masked 2D images with possibly rough or noisy camera information. There are three unknowns:

  • geometry $\theta \in \mathbb{R}^m$

  • appearance $\gamma \in \mathbb{R}^n$

  • cameras $\tau \in \mathbb{R}^k$

![](../_images/fig-012.png)

The geometry is represented as the zero level set of an MLP $f$:

$$S_\theta = \{ x \in \mathbb{R}^3 \mid f(x; \theta) = 0 \}$$
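As a toy sketch (not the paper's network), the role of $f$ can be illustrated by substituting an analytic signed distance function for the MLP:

```python
import numpy as np

# Toy stand-in for the geometry MLP f(x; theta): the signed distance
# to a sphere of radius 1 (negative inside, positive outside).
def f(x, radius=1.0):
    return np.linalg.norm(x, axis=-1) - radius

# Points on the zero level set S_theta satisfy f(x; theta) = 0.
surface_point = np.array([1.0, 0.0, 0.0])   # f = 0, on the surface
inside_point = np.array([0.2, 0.0, 0.0])    # f < 0, inside the object
```

In the paper, $f$ is an MLP whose weights $\theta$ are learned, regularized (via the Eikonal term below) to behave approximately like a signed distance function.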

IDR forward model. Let $p$ be a pixel and $R_p(\tau) = \{ c_p + t v_p \mid t \ge 0 \}$ the ray through $p$, where $c_p$ is the camera center and $v_p$ the ray direction. Let $\hat{x}_p = \hat{x}_p(\theta, \tau)$ denote the first intersection of $R_p$ with the surface $S_\theta$. The rendered color of the pixel is given by

$$L_p(\theta, \gamma, \tau) = M(\hat{x}_p, \hat{n}_p, \hat{z}_p, v_p; \gamma)$$

where $\hat{n}_p$ is the surface normal and $\hat{z}_p$ a global geometry feature vector at $\hat{x}_p$.
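The first intersection $\hat{x}_p$ can be sketched with plain sphere tracing, assuming $f$ is a signed distance function; the paper combines a sphere-tracing variant with a root-finding step, so the helper below is only illustrative:

```python
import numpy as np

def f(x, radius=1.0):
    # Stand-in SDF for the geometry network f(x; theta).
    return np.linalg.norm(x, axis=-1) - radius

def first_intersection(c, v, f, t_max=10.0, eps=1e-6, max_steps=100):
    # Sphere tracing: an SDF value is a safe step size along the ray,
    # so march until f vanishes (hit) or the ray leaves the scene.
    t = 0.0
    for _ in range(max_steps):
        x = c + t * v
        d = f(x)
        if d < eps:
            return x        # hat{x}_p: first intersection with S_theta
        t += d
        if t > t_max:
            return None     # ray misses the surface
    return None

c = np.array([0.0, 0.0, -3.0])   # camera center c_p
v = np.array([0.0, 0.0, 1.0])    # unit ray direction v_p
x_hat = first_intersection(c, v, f)   # approx. [0, 0, -1]
```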

Approximation of the surface light field. The surface light field radiance $L$ is determined by two functions: the bidirectional reflectance distribution function (BRDF) and the light emitted in the scene.

  • The BRDF $B(x, n, w^o, w^i)$ describes the proportion of reflected radiance leaving the surface point $x$ with normal $n$ in direction $w^o$ with respect to the incoming radiance from direction $w^i$.

  • The light sources are described by a function $L^e(x, w^o)$ measuring the emitted radiance of light at point $x$ in direction $w^o$.

The overall rendering equation is given by

$$L(\hat{x}, w^o) = L^e(\hat{x}, w^o) + \int_\Omega B(\hat{x}, \hat{n}, w^i, w^o)\, L^i(\hat{x}, w^i)\, dw^i = M_0(\hat{x}, \hat{n}, v)$$

where $L^i$ is the incoming radiance and $M$ is a sufficiently large MLP approximating $M_0$. For $M$ to be able to represent the correct light reflected from a surface point $\hat{x}$, i.e., be $\mathcal{P}$-universal, it has to receive $v$ and $\hat{n}$ as arguments as well.
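A toy illustration of why $M$ must see the normal (and, for non-Lambertian materials, the view direction): even the simplest special case, a Lambertian surface under a single directional light, depends on $\hat{n}$ through a cosine term. All names and values here are illustrative, not the paper's renderer:

```python
import numpy as np

def M(x, n, v, albedo=np.array([0.8, 0.5, 0.3]),
      light_dir=np.array([0.0, 0.0, 1.0])):
    # Lambertian special case of the surface light field: radiance
    # depends on the normal n via a clamped cosine. A renderer that
    # never sees n could not represent even this simple case.
    return albedo * max(0.0, float(n @ light_dir))

x = np.zeros(3)
v = np.array([0.0, 0.0, -1.0])  # view direction (unused here: a
                                # Lambertian surface is view-independent)
facing = M(x, np.array([0.0, 0.0, 1.0]), v)   # normal toward light -> full albedo
grazing = M(x, np.array([1.0, 0.0, 0.0]), v)  # normal orthogonal -> black
```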

Masked rendering. Consider the indicator function $S_p(\theta, \tau)$ identifying whether a certain pixel $p$ is occupied by the rendered object. It is approximated by the differentiable

$$S_{p,\alpha}(\theta, \tau) = \mathrm{sigmoid}\Big( -\alpha \min_{t \ge 0} f(c_p + t v_p; \theta) \Big)$$

which converges to $S_p(\theta, \tau)$ as $\alpha \to \infty$.
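A small numerical sketch of $S_{p,\alpha}$, reusing a sphere SDF as a stand-in for $f$ and a brute-force minimum over sampled $t$ values (only illustrative; the paper handles the minimum along the ray differently):

```python
import numpy as np

def f(x, radius=1.0):
    # Stand-in SDF for the geometry network.
    return np.linalg.norm(x, axis=-1) - radius

def soft_mask(c, v, alpha, ts=np.linspace(0.0, 10.0, 1001)):
    # S_{p,alpha} = sigmoid(-alpha * min_{t>=0} f(c + t*v; theta)),
    # with the minimum approximated on a dense grid of t values.
    min_f = np.min(f(c[None, :] + ts[:, None] * v[None, :]))
    return 1.0 / (1.0 + np.exp(alpha * min_f))

c = np.array([0.0, 0.0, -3.0])
hit = soft_mask(c, np.array([0.0, 0.0, 1.0]), alpha=10.0)   # ray hits -> near 1
miss = soft_mask(c, np.array([1.0, 0.0, 0.0]), alpha=10.0)  # ray misses -> near 0
```

Larger $\alpha$ sharpens the sigmoid toward the hard 0/1 indicator.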

Loss. The loss is given by

$$\mathcal{L}(\theta, \gamma, \tau) = \mathcal{L}_{\mathrm{RGB}}(\theta, \gamma, \tau) + \rho\, \mathcal{L}_{\mathrm{mask}}(\theta, \tau) + \lambda\, \mathcal{L}_{\mathrm{E}}(\theta)$$

$$\mathcal{L}_{\mathrm{RGB}}(\theta, \gamma, \tau) = \frac{1}{|P|} \sum_{p \in P^{\mathrm{in}}} \big| I_p - L_p(\theta, \gamma, \tau) \big|$$

$$\mathcal{L}_{\mathrm{mask}}(\theta, \tau) = \frac{1}{\alpha |P|} \sum_{p \in P^{\mathrm{out}}} \mathrm{CE}\big( O_p, S_{p,\alpha}(\theta, \tau) \big)$$

$$\mathcal{L}_{\mathrm{E}}(\theta) = \mathbb{E}_x \big( \| \nabla_x f(x; \theta) \| - 1 \big)^2$$

where $I_p$ is the observed pixel color, $O_p \in \{0, 1\}$ is the object mask, $P^{\mathrm{in}} \subseteq P$ contains the pixels whose ray intersects the surface and whose mask is on, $P^{\mathrm{out}} = P \setminus P^{\mathrm{in}}$, $\mathrm{CE}$ is the cross-entropy, and $\mathcal{L}_{\mathrm{E}}$ is the Eikonal regularization term of Implicit Geometric Regularization (IGR) [1].
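The Eikonal term can be checked numerically: a true signed distance function satisfies $\|\nabla_x f\| = 1$ everywhere, so its loss is near zero. A sketch with finite differences standing in for autograd (the paper differentiates the MLP analytically; everything here is illustrative):

```python
import numpy as np

def f(x, radius=1.0):
    # Stand-in for f(x; theta): an exact SDF, so the Eikonal loss ~ 0.
    return np.linalg.norm(x, axis=-1) - radius

def grad_f(x, h=1e-4):
    # Central finite differences in place of automatic differentiation.
    e = np.eye(3)
    return np.array([(f(x + h * e[i]) - f(x - h * e[i])) / (2 * h)
                     for i in range(3)])

def eikonal_loss(points):
    # Monte Carlo estimate of E_x (||grad_x f(x; theta)|| - 1)^2.
    return float(np.mean([(np.linalg.norm(grad_f(x)) - 1.0) ** 2
                          for x in points]))

rng = np.random.default_rng(0)
pts = rng.uniform(-2.0, 2.0, size=(128, 3))
loss = eikonal_loss(pts)   # near zero for a true SDF
```

A network that is not distance-like would yield gradient norms far from 1, and the term pushes it back toward an SDF.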

Technical Details#

Notes#

References#

[1] A. Gropp, L. Yariv, N. Haim, M. Atzmon, Y. Lipman. Implicit Geometric Regularization for Learning Shapes. In ICML, 2020.