Method

Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer [HART]
https://github.com/ZYangChen/HART

Submitted on 1 Feb. 2024 04:25 by
Ziyang Chen (Guizhou University)

Running time: 0.34 s
Environment: NVIDIA Tesla A100 (Python)

Method Description:
Given recent advances in transformers, existing work proposes stereo transformers as a solution to binocular stereo matching. However, constrained by the low-rank bottleneck and quadratic complexity of attention mechanisms, stereo transformers still fail to demonstrate sufficient nonlinear expressiveness within a reasonable inference time. Because they do not focus on key homonymous points, their representations are vulnerable to challenging conditions such as reflections and weak textures. Slow inference also hinders practical application. To overcome these difficulties, we present the Hadamard Attention Recurrent Stereo Transformer (HART).
Parameters:
use mixed precision: True
batch size used during training: 8
crop size: 320 × 720
max learning rate: 0.0002
length of training schedule: 200000
recurrent-number during training: 22
weight decay in optimizer: 0.00001
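The parameters above can be gathered into a single configuration object. The following is a minimal sketch only: the class name and field names are hypothetical and are not taken from the HART repository, and reading the crop size as (height, width) is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the training parameters listed above."""
    mixed_precision: bool = True            # use mixed precision
    batch_size: int = 8                     # batch size used during training
    crop_size: Tuple[int, int] = (320, 720) # assumed (height, width) order
    max_lr: float = 2e-4                    # max learning rate
    num_steps: int = 200_000                # length of training schedule
    train_iters: int = 22                   # recurrent-number during training
    weight_decay: float = 1e-5              # weight decay in optimizer


cfg = TrainConfig()
```

Freezing the dataclass keeps the listed values from being changed accidentally during a run; overriding a value means constructing a new instance, for example `TrainConfig(batch_size=4)`.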
Latex Bibtex:
@article{chen2025hart,
  title={Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer},
  author={Chen, Ziyang and Zhang, Yongjun and Li, Wenting and Wang, Bingshu and Wu, Yabo and Zhao, Yong and Chen, CL},
  journal={arXiv preprint arXiv:2501.01023},
  year={2025}
}

Detailed Results

This page provides detailed results for the method(s) selected. For each of the first 20 test images, the table gives the percentage of erroneous pixels at each error threshold. Underneath, the left input image, the disparity / end-point error map and the estimated (and interpolated) disparity / optical flow map are shown. The error map scales linearly between 0 (black) and >=5 (white) pixels error. Red denotes occluded pixels that fall outside the image boundaries. The false color map is scaled to the largest ground truth disparity / flow value.
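As an illustration of how the table columns and the error map can be read, here is a minimal sketch, not the benchmark's evaluation code. It assumes a pixel counts as erroneous when its absolute disparity error exceeds the threshold, and that the choice of valid mask (non-occluded pixels only, or all ground-truth pixels) decides between the Noc and All columns. The function names and the sample values are hypothetical.

```python
def disparity_errors(pred, gt, valid, thresholds=(2, 3, 4, 5)):
    """Return ({threshold: outlier percentage}, average absolute error in px).

    pred, gt: flat sequences of disparities; valid: flat sequence of booleans
    marking the pixels that take part in the evaluation.
    """
    errs = [abs(p - g) for p, g, v in zip(pred, gt, valid) if v]
    if not errs:
        raise ValueError("no valid ground-truth pixels")
    # Assumed rule: a pixel is an outlier if its error is above the threshold.
    outliers = {t: 100.0 * sum(e > t for e in errs) / len(errs)
                for t in thresholds}
    avg = sum(errs) / len(errs)
    return outliers, avg


def error_map_intensity(err, max_err=5.0):
    """Linear grey level: 0.0 (black) at zero error, 1.0 (white) at >= max_err."""
    return min(err, max_err) / max_err


# Made-up disparities for four pixels, all treated as valid.
pred = [10.0, 12.5, 20.0, 7.0]
gt = [10.5, 10.0, 14.0, 7.0]
valid = [True, True, True, True]
outliers, avg = disparity_errors(pred, gt, valid)
```

Because the average error does not depend on the threshold, the Avg-Noc and Avg-All values repeat across the four rows of each table.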

Test Set Average

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.70 % 2.12 % 0.4 px 0.4 px
3 pixels 1.11 % 1.38 % 0.4 px 0.4 px
4 pixels 0.84 % 1.05 % 0.4 px 0.4 px
5 pixels 0.69 % 0.86 % 0.4 px 0.4 px

Reflective Regions

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 6.18 % 7.38 % 0.8 px 0.8 px
3 pixels 3.14 % 3.92 % 0.8 px 0.8 px
4 pixels 1.99 % 2.49 % 0.8 px 0.8 px
5 pixels 1.41 % 1.75 % 0.8 px 0.8 px

Test Image 0

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.08 % 1.31 % 0.3 px 0.4 px
3 pixels 0.87 % 1.11 % 0.3 px 0.4 px
4 pixels 0.72 % 0.96 % 0.3 px 0.4 px
5 pixels 0.59 % 0.83 % 0.3 px 0.4 px

Test Image 1

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 2.85 % 3.38 % 0.5 px 0.5 px
3 pixels 1.73 % 2.06 % 0.5 px 0.5 px
4 pixels 1.31 % 1.65 % 0.5 px 0.5 px
5 pixels 1.13 % 1.48 % 0.5 px 0.5 px

Test Image 2

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 0.63 % 0.68 % 0.2 px 0.2 px
3 pixels 0.41 % 0.46 % 0.2 px 0.2 px
4 pixels 0.32 % 0.37 % 0.2 px 0.2 px
5 pixels 0.27 % 0.31 % 0.2 px 0.2 px

Test Image 3

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 0.77 % 1.15 % 0.3 px 0.3 px
3 pixels 0.20 % 0.22 % 0.3 px 0.3 px
4 pixels 0.17 % 0.18 % 0.3 px 0.3 px
5 pixels 0.15 % 0.16 % 0.3 px 0.3 px

Test Image 4

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 0.61 % 0.76 % 0.3 px 0.3 px
3 pixels 0.24 % 0.31 % 0.3 px 0.3 px
4 pixels 0.21 % 0.27 % 0.3 px 0.3 px
5 pixels 0.17 % 0.23 % 0.3 px 0.3 px

Test Image 5

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 0.83 % 0.96 % 0.3 px 0.3 px
3 pixels 0.57 % 0.64 % 0.3 px 0.3 px
4 pixels 0.48 % 0.53 % 0.3 px 0.3 px
5 pixels 0.42 % 0.46 % 0.3 px 0.3 px

Test Image 6

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 3.74 % 4.95 % 0.5 px 0.6 px
3 pixels 1.85 % 2.28 % 0.5 px 0.6 px
4 pixels 1.27 % 1.68 % 0.5 px 0.6 px
5 pixels 1.00 % 1.37 % 0.5 px 0.6 px

Test Image 7

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.10 % 1.46 % 0.5 px 0.5 px
3 pixels 0.81 % 0.97 % 0.5 px 0.5 px
4 pixels 0.68 % 0.76 % 0.5 px 0.5 px
5 pixels 0.59 % 0.64 % 0.5 px 0.5 px

Test Image 8

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.19 % 1.25 % 0.3 px 0.3 px
3 pixels 0.88 % 0.94 % 0.3 px 0.3 px
4 pixels 0.69 % 0.70 % 0.3 px 0.3 px
5 pixels 0.62 % 0.62 % 0.3 px 0.3 px

Test Image 9

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.71 % 2.99 % 0.5 px 0.7 px
3 pixels 1.22 % 1.87 % 0.5 px 0.7 px
4 pixels 1.10 % 1.33 % 0.5 px 0.7 px
5 pixels 0.97 % 1.16 % 0.5 px 0.7 px

Test Image 10

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.03 % 1.06 % 0.3 px 0.3 px
3 pixels 0.70 % 0.73 % 0.3 px 0.3 px
4 pixels 0.57 % 0.59 % 0.3 px 0.3 px
5 pixels 0.53 % 0.56 % 0.3 px 0.3 px

Test Image 11

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.32 % 1.28 % 0.4 px 0.4 px
3 pixels 0.80 % 0.78 % 0.4 px 0.4 px
4 pixels 0.52 % 0.50 % 0.4 px 0.4 px
5 pixels 0.40 % 0.38 % 0.4 px 0.4 px

Test Image 12

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 2.27 % 3.82 % 0.5 px 0.6 px
3 pixels 1.73 % 3.06 % 0.5 px 0.6 px
4 pixels 1.48 % 2.56 % 0.5 px 0.6 px
5 pixels 1.27 % 2.12 % 0.5 px 0.6 px

Test Image 13

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 1.13 % 1.69 % 0.4 px 0.4 px
3 pixels 0.79 % 1.20 % 0.4 px 0.4 px
4 pixels 0.62 % 0.92 % 0.4 px 0.4 px
5 pixels 0.56 % 0.79 % 0.4 px 0.4 px

Test Image 14

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 0.27 % 0.26 % 0.2 px 0.2 px
3 pixels 0.18 % 0.18 % 0.2 px 0.2 px
4 pixels 0.16 % 0.16 % 0.2 px 0.2 px
5 pixels 0.14 % 0.14 % 0.2 px 0.2 px

Test Image 15

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 2.00 % 2.09 % 0.4 px 0.4 px
3 pixels 1.38 % 1.44 % 0.4 px 0.4 px
4 pixels 0.85 % 0.89 % 0.4 px 0.4 px
5 pixels 0.57 % 0.59 % 0.4 px 0.4 px

Test Image 16

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 2.08 % 2.04 % 0.5 px 0.5 px
3 pixels 1.06 % 1.04 % 0.5 px 0.5 px
4 pixels 0.68 % 0.66 % 0.5 px 0.5 px
5 pixels 0.51 % 0.51 % 0.5 px 0.5 px

Test Image 17

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 3.52 % 3.78 % 0.5 px 0.5 px
3 pixels 2.23 % 2.37 % 0.5 px 0.5 px
4 pixels 1.71 % 1.81 % 0.5 px 0.5 px
5 pixels 1.48 % 1.58 % 0.5 px 0.5 px

Test Image 18

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 0.45 % 0.44 % 0.2 px 0.2 px
3 pixels 0.33 % 0.33 % 0.2 px 0.2 px
4 pixels 0.29 % 0.29 % 0.2 px 0.2 px
5 pixels 0.24 % 0.24 % 0.2 px 0.2 px

Test Image 19

Error Out-Noc Out-All Avg-Noc Avg-All
2 pixels 0.93 % 0.90 % 0.3 px 0.3 px
3 pixels 0.64 % 0.62 % 0.3 px 0.3 px
4 pixels 0.51 % 0.49 % 0.3 px 0.3 px
5 pixels 0.45 % 0.43 % 0.3 px 0.3 px