Thank you for the great work!
I am about to start some experiments with Pi3X conditioning.
I have a set of camera positions and yaws for my building, but no data on pitch, roll, or intrinsics.
Do all of these always need to be supplied in multimodal mode, or is there a way to let the model fill in the missing values?