Method

LumiNet: A Multi-Modal 3-Stream Feature Fusion Network for 3D Object Detection [LumiNet]


Submitted on 16 Feb. 2026 04:31 by
Fazal Ghaffar (Deakin University)

Running time: 0.1 s
Environment: 1 core @ 2.5 GHz (Python)

Method Description:
LumiNet is a novel 3D object detection framework
that combines LiDAR point clouds, RGB images, and
depth data. By
fusing these complementary modalities, the
approach provides robust and reliable detection
for applications like autonomous vehicles. LumiNet
integrates semantic information from RGB images
into point features using a dedicated fusion
module and leverages depth features to strengthen
the representation of LiDAR and RGB data. This
multi-modal fusion enables accurate 3D bounding
box predictions and improves scene understanding,
particularly through reliable depth estimation
critical for real-world environments. The
framework incorporates a Strong Attention
mechanism and a 3-Stream multi-modal loss to
enhance cross-modal feature learning and fusion.
LumiNet's performance is evaluated on the KITTI
and JRDB datasets, with experimental results
highlighting the effectiveness of its multi-modal
fusion framework compared to state-of-the-art 3D
detectio
Parameters:
0.2

Detailed Results

Object detection and orientation estimation results. Results for object detection are given in terms of average precision (AP) and results for joint object detection and orientation estimation are provided in terms of average orientation similarity (AOS).
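The two metrics named above can be illustrated with a minimal sketch. It assumes the commonly used formulation: a per-detection orientation similarity of (1 + cos(delta)) / 2, where delta is the angle difference between predicted and ground-truth orientation, and an interpolated average over evenly spaced recall thresholds. The function names, the choice of 40 thresholds, and the restriction to already-matched detections are assumptions made for illustration; the benchmark's own evaluation code remains the authoritative definition.

```python
import math


def orientation_similarity(delta_thetas):
    """Mean of (1 + cos(delta)) / 2 over matched detections.

    Returns 1.0 when every orientation is exact and 0.0 when every
    orientation is off by pi. Unmatched detections are not modelled
    here (a simplifying assumption).
    """
    if not delta_thetas:
        return 0.0
    total = sum((1.0 + math.cos(d)) / 2.0 for d in delta_thetas)
    return total / len(delta_thetas)


def interpolated_average(recalls, values, num_points=40):
    """Average, over evenly spaced recall thresholds r, of the best
    value reached at any recall >= r.

    Passing precision values gives an AP-style score; passing
    orientation-similarity values gives an AOS-style score.
    num_points=40 is an assumed setting.
    """
    total = 0.0
    for i in range(1, num_points + 1):
        r = i / num_points
        candidates = [v for rec, v in zip(recalls, values) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / num_points
```

Because the orientation similarity never exceeds 1, an AOS-style score computed this way cannot exceed the corresponding detection AP, which matches the pattern in the table that follows (each Orientation row is slightly below its Detection row).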


Benchmark                       Easy      Moderate  Hard
Car (Detection)                 99.23 %   96.27 %   88.94 %
Car (Orientation)               99.09 %   95.87 %   88.47 %
Car (3D Detection)              91.76 %   83.32 %   78.29 %
Car (Bird's Eye View)           95.79 %   90.13 %   85.06 %
Pedestrian (Detection)          72.01 %   61.38 %   58.94 %
Pedestrian (Orientation)        66.85 %   55.80 %   53.17 %
Pedestrian (3D Detection)       53.54 %   45.26 %   41.55 %
Pedestrian (Bird's Eye View)    57.64 %   50.44 %   46.74 %
Cyclist (Detection)             88.45 %   74.76 %   67.89 %
Cyclist (Orientation)           87.99 %   74.03 %   67.13 %
Cyclist (3D Detection)          80.43 %   62.31 %   55.72 %
Cyclist (Bird's Eye View)       85.56 %   68.42 %   61.65 %


Figures (not reproduced), repeated as three sets of four plots:
2D object detection results.
Orientation estimation results.
3D object detection results.
Bird's eye view results.



