\begin{tabular}{c | c | c | c | c | c | c | c}
{\bf Method} & {\bf Setting} & {\bf SILog} & {\bf sqErrorRel} & {\bf absErrorRel} & {\bf iRMSE} & {\bf Runtime / Environment} & {\bf Reference}\\ \hline
UniDepth & & 8.13 & 1.09 \% & 6.54 \% & 8.24 & 0.1 s / GPU & L. Piccinelli, Y. Yang, C. Sakaridis, M. Segu, S. Li, L. Van Gool and F. Yu: UniDepth: Universal Monocular Metric Depth Estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024.\\
1PNet & & 9.46 & 1.45 \% & 7.62 \% & 9.75 & 0.1 s / 1 core & \\
MSFusion & & 9.59 & 1.56 \% & 7.81 \% & 10.32 & 0.1 s / 1 core & \\
DCDepth & & 9.60 & 1.54 \% & 7.83 \% & 10.12 & 0.07 s / 1 core & \\
AssFusionNet & & 9.62 & 1.57 \% & 7.82 \% & 10.33 & 0.1 s / 1 core & \\
NDDepth & & 9.62 & 1.59 \% & 7.75 \% & 10.62 & 0.1 s / GPU & S. Shao, Z. Pei, W. Chen, X. Wu and Z. Li: NDDepth: Normal-Distance Assisted Monocular Depth Estimation. International Conference on Computer Vision (ICCV) 2023.\\
IEBins & & 9.63 & 1.60 \% & 7.82 \% & 10.68 & 0.1 s / GPU & S. Shao, Z. Pei, X. Wu, Z. Liu, W. Chen and Z. Li: IEBins: Iterative Elastic Bins for Monocular Depth Estimation. Advances in Neural Information Processing Systems (NeurIPS) 2023.\\
VMDepth & & 9.69 & 1.68 \% & 7.23 \% & 9.60 & 0.1 s / 1 core & \\
VA-DepthNet & & 9.84 & 1.66 \% & 7.96 \% & 10.44 & 0.1 s / 1 core & C. Liu, S. Kumar, S. Gu, R. Timofte and L. Van Gool: VA-DepthNet: A Variational Approach to Single Image Depth Prediction. International Conference on Learning Representations (ICLR) 2023.\\
DiffusionDepth-I & & 9.85 & 1.64 \% & 8.06 \% & 10.58 & 0.2 s / 1 core & Y. Duan, X. Guo and Z. Zhu: Diffusiondepth: Diffusion denoising approach for monocular depth estimation. arXiv preprint arXiv:2303.05021 2023.\\
iDisc & & 9.89 & 1.77 \% & 8.11 \% & 10.73 & 0.1 s / 1 core & L. Piccinelli, C. Sakaridis and F. Yu: iDisc: Internal Discretization for Monocular Depth Estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2023.\\
MG & & 9.93 & 1.68 \% & 7.99 \% & 10.63 & 0.1 s / 1 core & C. Liu, S. Kumar, S. Gu, R. Timofte and L. Van Gool: Single Image Depth Prediction Made Better: A Multivariate Gaussian Take. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023.\\
URCDC-Depth & & 10.03 & 1.74 \% & 8.24 \% & 10.71 & 0.1 s / 1 core & S. Shao, Z. Pei, W. Chen, R. Li, Z. Liu and Z. Li: URCDC-Depth: Uncertainty Rectified Cross-Distillation with CutFlip for Monocular Depth Estimation. IEEE Transactions on Multimedia (TMM) 2023.\\
BinsFormer & & 10.14 & 1.69 \% & 8.23 \% & 10.90 & 0.1 s / 1 core & Z. Li, X. Wang, X. Liu and J. Jiang: BinsFormer: Revisiting Adaptive Bins for Monocular Depth Estimation. arXiv preprint arXiv:2204.00987 2022.\\
TrapNet & & 10.15 & 1.66 \% & 7.92 \% & 10.45 & 0.1 s / 1 core & C. Ning and H. Gan: Trap Attention: Monocular Depth Estimation with Manual Traps. Proceedings of the IEEE/CVF International Conference on Computer Vision and Pattern Recognition 2023.\\
PixelFormer & & 10.28 & 1.82 \% & 8.16 \% & 10.84 & 0.1 s / 1 core & A. Agarwal and C. Arora: Attention Attention Everywhere: Monocular Depth Prediction with Skip Attention. WACV 2023.\\
glformer & & 10.28 & 1.73 \% & 8.19 \% & 11.09 & 0.05 s / 1 core & \\
RED-T & & 10.36 & 1.92 \% & 8.11 \% & 10.82 & 0.1 s / GPU & K. Shim, J. Kim, G. Lee and B. Shim: Depth-Relative Self Attention for Monocular Depth Estimation. 2023.\\
ZDepth & & 10.36 & 1.89 \% & 8.53 \% & 11.23 & 0.1 s / GPU & \\
NeWCRFs & & 10.39 & 1.83 \% & 8.37 \% & 11.03 & 0.1 s / 1 core & W. Yuan, X. Gu, Z. Dai, S. Zhu and P. Tan: NeWCRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2022.\\
DepthFormer & & 10.69 & 1.84 \% & 8.68 \% & 11.39 & 0.1 s / 1 core & Z. Li, Z. Chen, X. Liu and J. Jiang: Depthformer: Exploiting long-range correlation and local information for accurate monocular depth estimation. arXiv preprint arXiv:2203.14211 2022.\\
ViP-DeepLab & & 10.80 & 2.19 \% & 8.94 \% & 11.77 & 0.1 s / GPU & S. Qiao, Y. Zhu, H. Adam, A. Yuille and L. Chen: ViP-DeepLab: Learning Visual Perception with Depth-aware Video Panoptic Segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2021.\\
CoGF-Depth & & 10.99 & 2.04 \% & 8.82 \% & 11.23 & 1 s / 1 core & \\
SideRT & & 11.42 & 2.25 \% & 9.28 \% & 11.88 & 0.02 s / GPU & C. Shu, Z. Chen, L. Chen, K. Ma, M. Wang and H. Ren: SideRT: A Real-time Pure Transformer Architecture for Single Image Depth Estimation. 2022.\\
PWA & & 11.45 & 2.30 \% & 9.05 \% & 12.32 & 0.06 s / GPU & S. Lee, J. Lee, B. Kim, E. Yi and J. Kim: Patch-Wise Attention Network for Monocular Depth Estimation. Proceedings of the AAAI Conference on Artificial Intelligence 2021.\\
BANet & & 11.55 & 2.31 \% & 9.34 \% & 12.17 & 0.04 s / GPU & S. Aich, J. Vianney, M. Islam, M. Kaur and B. Liu: Bidirectional Attention Network for Monocular Depth Estimation. IEEE International Conference on Robotics and Automation (ICRA) 2021.\\
BTS & & 11.67 & 2.21 \% & 9.04 \% & 12.23 & 0.06 s / GPU & J. Lee, M. Han, D. Ko and I. Suh: From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation. 2019.\\
DL\_61 (DORN) & & 11.77 & 2.23 \% & 8.78 \% & 12.98 & 0.5 s / GPU & H. Fu, M. Gong, C. Wang, K. Batmanghelich and D. Tao: Deep Ordinal Regression Network for Monocular Depth Estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018.\\
RefinedMPL & & 11.80 & 2.31 \% & 10.09 \% & 13.39 & 0.05 s / GPU & J. Vianney, S. Aich and B. Liu: RefinedMPL: Refined Monocular PseudoLiDAR for 3D Object Detection in Autonomous Driving. arXiv preprint arXiv:1911.09712 2019.\\
DLE & & 11.81 & 2.22 \% & 9.09 \% & 12.49 & 0.09 s / & C. Liu, S. Gu, L. Van Gool and R. Timofte: Deep Line Encoding for Monocular 3D Object Detection and Depth Prediction. Proceedings of the British Machine Vision Conference (BMVC) 2021.\\
PFANet & & 11.84 & 2.46 \% & 9.23 \% & 12.63 & 0.1 s / GPU & Y. Xu, C. Peng, M. Li, Y. Li and S. Du: Pyramid Feature Attention Network for Monocular Depth Prediction. IEEE International Conference on Multimedia and Expo (ICME) 2021.\\
GAC & & 12.13 & 2.61 \% & 9.41 \% & 12.65 & 0.05 s / GPU & Y. Liu, Y. Yuan and M. Liu: Ground-aware Monocular 3D Object Detection for Autonomous Driving. IEEE Robotics and Automation Letters 2021.\\
DL\_SORD\_SL & & 12.39 & 2.49 \% & 10.10 \% & 13.48 & 0.8 s / GPU & R. Diaz and A. Marathe: Soft Labels for Ordinal Regression. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
VNL & & 12.65 & 2.46 \% & 10.15 \% & 13.02 & 0.5 s / 1 core & Y. Wei, Y. Liu, C. Shen and Y. Yan: Enforcing geometric constraints of virtual normal for depth prediction. 2019.\\
P3Depth & & 12.82 & 2.53 \% & 9.92 \% & 13.71 & 0.1 s / GPU & V. Patil, C. Sakaridis, A. Liniger and L. Van Gool: P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022.\\
MS-DPT & & 12.83 & 3.62 \% & 11.01 \% & 13.43 & 0.1 s / GPU & J. Song and S. Lee: Knowledge Distillation of Multi-scale Dense Prediction Transformer for Self-supervised Depth Estimation. 2023.\\
DS-SIDENet\_ROB & & 12.86 & 2.87 \% & 10.03 \% & 14.40 & 0.35 s / GPU & H. Ren, M. El-Khamy and J. Lee: Deep Robust Single Image Depth Estimation Neural Network Using Scene Understanding. IEEE Conference on Computer Vision and Pattern Recognition Workshop (CVPRW) 2019.\\
DL\_SORD\_SQ & & 13.00 & 2.95 \% & 10.38 \% & 13.78 & 0.88 s / GPU & R. Diaz and A. Marathe: Soft Labels for Ordinal Regression. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
PAP & & 13.08 & 2.72 \% & 10.27 \% & 13.95 & 0.18 s / GPU & Z. Zhang, Z. Cui, C. Xu, Y. Yan, N. Sebe and J. Yang: Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019.\\
CADepth-Net & & 13.34 & 3.33 \% & 10.67 \% & 13.61 & 0.08 s / 1 core & J. Yan, H. Zhao, P. Bu and Y. Jin: Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation. 2021.\\
VGG16-UNet & & 13.41 & 2.86 \% & 10.60 \% & 15.06 & 0.16 s / GPU & X. Guo, H. Li, S. Yi, J. Ren and X. Wang: Learning monocular depth by distilling cross-domain stereo networks. Proceedings of the European Conference on Computer Vision (ECCV) 2018.\\
DORN\_ROB & & 13.53 & 3.06 \% & 10.35 \% & 15.96 & 2 s / GPU & H. Fu, M. Gong, C. Wang, K. Batmanghelich and D. Tao: Deep Ordinal Regression Network for Monocular Depth Estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018.\\
g2s & & 14.16 & 3.65 \% & 11.40 \% & 15.53 & 0.04 s / GPU & H. Chawla, A. Varma, E. Arani and B. Zonooz: Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation. IEEE International Conference on Robotics and Automation (ICRA) 2021.\\
MT-SfMLearner & & 14.25 & 3.72 \% & 12.52 \% & 15.83 & 0.04 s / GPU & A. Varma, H. Chawla, B. Zonooz and E. Arani: Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics. Proceedings of the 17th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 4: VISAPP, 2022.\\
MLDA-Net & & 14.42 & 3.41 \% & 11.67 \% & 16.12 & 0.2 s / 1 core & X. Song, W. Li, D. Zhou, Y. Dai, J. Fang, H. Li and L. Zhang: MLDA-Net: Multi-Level Dual Attention-Based Network for Self-Supervised Monocular Depth Estimation. IEEE Transactions on Image Processing 2021.\\
DABC\_ROB & & 14.49 & 4.08 \% & 12.72 \% & 15.53 & 0.7 s / GPU & R. Li, K. Xian, C. Shen, Z. Cao, H. Lu and L. Hang: Deep attention-based classification network for robust depth prediction. Proceedings of the Asian Conference on Computer Vision (ACCV) 2018.\\
BTSREF\_RVC & & 14.67 & 3.12 \% & 12.42 \% & 16.84 & 0.1 s / 1 core & J. Lee, M. Han, D. Ko and I. Suh: From big to small: Multi-scale local planar guidance for monocular depth estimation. arXiv preprint arXiv:1907.10326 2019.\\
SDNet & & 14.68 & 3.90 \% & 12.31 \% & 15.96 & 0.2 s / GPU & M. Ochs, A. Kretz and R. Mester: SDNet: Semantic Guided Depth Estimation Network. German Conference on Pattern Recognition (GCPR) 2019.\\
APMoE\_base\_ROB & & 14.74 & 3.88 \% & 11.74 \% & 15.63 & 0.2 s / GPU & S. Kong and C. Fowlkes: Pixel-wise Attentional Gating for Parsimonious Pixel Labeling. arXiv preprint arXiv:1805.01556 2018.\\
DiPE & & 14.84 & 4.04 \% & 12.28 \% & 15.69 & 0.01 s / GPU & H. Jiang, L. Ding, Z. Sun and R. Huang: DiPE: Deeper into Photometric Errors for Unsupervised Learning of Depth and Ego-motion from Monocular Videos. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020.\\
CSWS\_E\_ROB & & 14.85 & 3.48 \% & 11.84 \% & 16.38 & 0.2 s / 1 core & M. Bo Li: Monocular Depth Estimation with Hierarchical Fusion of Dilated CNNs and Soft-Weighted-Sum Inference. 2018.\\
R-MSFM & & 15.09 & 3.57 \% & 11.80 \% & 17.60 & 1 s / 1 core & Z. Zhou, X. Fan, P. Shi and Y. Xin: R-msfm: Recurrent multi-scale feature modulation for monocular depth estimating. Proceedings of the IEEE/CVF international conference on computer vision 2021.\\
HBC & & 15.18 & 3.79 \% & 12.33 \% & 17.86 & 0.05 s / GPU & H. Jiang and R. Huang: Hierarchical Binary Classification for Monocular Depth Estimation. IEEE International Conference on Robotics and Biomimetics 2019.\\
SGDepth & & 15.30 & 5.00 \% & 13.29 \% & 15.80 & 0.1 s / GPU & M. Klingner, J. Termöhlen, J. Mikolajczyk and T. Fingscheidt: Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance. ECCV 2020.\\
DHGRL & & 15.47 & 4.04 \% & 12.52 \% & 15.72 & 0.2 s / GPU & Z. Zhang, C. Xu, J. Yang, Y. Tai and L. Chen: Deep hierarchical guidance and regularization learning for end-to-end depth estimation. Pattern Recognition 2018.\\
GCNDepth & & 15.54 & 4.26 \% & 12.75 \% & 15.99 & 0.05 s / GPU & A. Masoumian, H. Rashwan, S. Abdulwahab, J. Cristiano and D. Puig: GCNDepth: Self-supervised Monocular Depth Estimation based on Graph Convolutional Network. arXiv preprint arXiv:2112.06782 2021.\\
packnSFMHR\_RVC & & 15.80 & 4.73 \% & 12.28 \% & 17.96 & 0.5 s / GPU & V. Guizilini, R. Ambrus, S. Pillai, A. Raventos and A. Gaidon: 3D Packing for Self-Supervised Monocular Depth Estimation. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).\\
MultiDepth & & 16.05 & 3.89 \% & 13.82 \% & 18.21 & 0.01 s / GPU & L. Liebel and M. Körner: MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification. IEEE Intelligent Transportation Systems Conference (ITSC) 2019.\\
LSIM & & 17.92 & 6.88 \% & 14.04 \% & 17.62 & 0.08 s / GPU & M. Goldman, T. Hassner and S. Avidan: Learn Stereo, Infer Mono: Siamese Networks for Self-Supervised, Monocular, Depth Estimation. Computer Vision and Pattern Recognition Workshops (CVPRW) 2019.
\end{tabular}