I am reading NVIDIA's GET3D paper, in which they train an ML model to generate 3D models. In "Related Work" they write:
Early approaches aimed to directly extend the 2D [Convolutional Neural Network] generators to 3D voxel grids, but the high memory footprint and computational complexity of 3D convolutions hinder the generation process at high resolution
But then they use a Convolutional Neural Network to create a Signed Distance Field, which is also a 3D grid. So they are still doing 3D convolutions.
Why are 3D convolutions so much more expensive for voxel grids than for an SDF? Does this only have to do with the number of points in the grid?
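
To make the "number of points" part of my question concrete, here is a rough sketch of how I assume the cost of a single dense 3D convolution layer scales with grid resolution. The function name, the channel count, the kernel size, and float32 storage are my own illustrative assumptions, not values taken from the paper.

```python
def dense_grid_cost(resolution, channels=32, kernel=3, bytes_per_value=4):
    """Rough cost of one dense 3D convolution layer on a cubic grid.

    All defaults are illustrative assumptions, not numbers from GET3D.
    """
    # Number of cells in a cubic grid with `resolution` cells per side.
    cells = resolution ** 3
    # Memory for one feature map: one value per cell per channel.
    activation_bytes = cells * channels * bytes_per_value
    # Multiply-accumulates: every output cell and output channel reads
    # a kernel^3 neighbourhood across all input channels.
    macs = cells * channels * channels * kernel ** 3
    return cells, activation_bytes, macs


for r in (32, 64, 128, 256):
    cells, mem, macs = dense_grid_cost(r)
    print(f"{r:>4}^3: {cells:>12,} cells, "
          f"{mem / 2**20:>8.0f} MiB activations, {macs:.2e} MACs")
```

Under these assumptions, doubling the resolution multiplies the cell count, the activation memory, and the operation count by 8, whether the dense grid stores occupancy values or signed distances.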