TY - JOUR
T1 - Online Training of Stereo Self-Calibration Using Monocular Depth Estimation
AU - Gil, Yotam
AU - Elmalem, Shay
AU - Haim, Harel
AU - Marom, Emanuel
AU - Giryes, Raja
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Stereo imaging is the most common passive method for producing reliable depth maps. Calibration is a crucial step for every stereo-based system, and despite all the advancements in the field, most calibrations are still done by the same tedious method using a checkerboard target. Monocular-based depth estimation methods do not require extrinsic calibration but generally achieve inferior depth accuracy. In this paper, we present a novel online self-calibration method, which makes use of both stereo and monocular depth maps to find the transformation required for extrinsic calibration by enforcing consistency between both maps. The proposed method works in a closed loop and exploits the pre-trained networks' global context, and thus avoids feature matching and outlier issues. In addition to presenting our method using an image-based monocular depth estimation method, which can be implemented in most systems without additional changes, we also show that adding a phase-coded aperture mask leads to even better and faster convergence. We demonstrate our method on road scenes from the KITTI vision benchmark and real-world scenes using our prototype camera. Our code is publicly available at https://github.com/YotYot/CalibrationNet.
KW - Stereo imaging
KW - calibration
KW - consistency loss
KW - monocular depth estimation
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85112631823&partnerID=8YFLogxK
U2 - 10.1109/TCI.2021.3098927
DO - 10.1109/TCI.2021.3098927
M3 - Article
AN - SCOPUS:85112631823
SN - 2573-0436
VL - 7
SP - 812
EP - 823
JO - IEEE Transactions on Computational Imaging
JF - IEEE Transactions on Computational Imaging
M1 - 9495157
ER -