This page contains our raw data recordings, sorted by category (see menu above). So far, we have included only sequences that either have 3D object labels or occur in our odometry benchmark training set. The dataset comprises the following information, captured and synchronized at 10 Hz:

  • Raw (unsynced+unrectified) and processed (synced+rectified) grayscale stereo sequences (0.5 Megapixels, stored in png format)
  • Raw (unsynced+unrectified) and processed (synced+rectified) color stereo sequences (0.5 Megapixels, stored in png format)
  • 3D Velodyne point clouds (100k points per frame, stored as binary float matrix)
  • 3D GPS/IMU data (location, speed, acceleration, meta information, stored as text file)
  • Calibration (Camera, Camera-to-GPS/IMU, Camera-to-Velodyne, stored as text file)
  • 3D object tracklet labels (cars, trucks, trams, pedestrians, cyclists, stored as xml file)
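
As an illustration of the "binary float matrix" storage mentioned for the Velodyne point clouds, the following is a minimal sketch of a reader. It assumes little-endian 32-bit floats with four values per point (for example x, y, z and reflectance); neither the byte order nor the column count is stated on this page, so check both against the sensor documentation. The function name `load_point_cloud` and the parameter `floats_per_point` are hypothetical.

```python
import struct
from pathlib import Path


def load_point_cloud(path, floats_per_point=4):
    """Read a flat binary file of float32 values into a list of point tuples.

    Assumptions (not stated on this page): little-endian float32 values,
    and `floats_per_point` values per point (4 by default).
    """
    raw = Path(path).read_bytes()
    point_size = 4 * floats_per_point  # bytes per point (4 bytes per float32)
    if len(raw) % point_size != 0:
        raise ValueError(
            f"file size {len(raw)} is not a multiple of {point_size} bytes"
        )
    count = len(raw) // 4  # total number of float32 values in the file
    values = struct.unpack(f"<{count}f", raw)
    # Group the flat value sequence into one tuple per point.
    return [
        values[i:i + floats_per_point]
        for i in range(0, count, floats_per_point)
    ]
```

For frames of roughly 100k points, reading the same bytes with `numpy.fromfile(path, dtype=numpy.float32)` and reshaping to the assumed column count is likely to be faster; the standard-library version is shown only to keep the sketch dependency-free.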

Here, "unsynced+unrectified" refers to the raw input frames where images are distorted and the frame indices do not correspond, while "synced+rectified" refers to the processed data where images have been rectified and undistorted and where the data frame numbers correspond across all sensor streams. For both settings, files with timestamps are provided. Most people require only the "synced+rectified" version of the files.
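
Because frame indices do not correspond in the "unsynced+unrectified" data, frames from different sensors have to be associated through the provided timestamps. The sketch below shows one common approach, nearest-timestamp matching; it is not necessarily how the "synced+rectified" version was produced. It assumes the timestamps have already been parsed into ascending lists of seconds (the timestamp file format is not described here), and the function name `nearest_indices` is hypothetical.

```python
from bisect import bisect_left


def nearest_indices(reference_times, other_times):
    """For each reference timestamp, return the index of the closest
    timestamp in `other_times`.

    Assumption: both inputs are numbers in seconds and `other_times`
    is sorted in ascending order.
    """
    if not other_times:
        raise ValueError("other_times must not be empty")
    matches = []
    for t in reference_times:
        pos = bisect_left(other_times, t)
        if pos == 0:
            matches.append(0)  # t precedes every entry
        elif pos == len(other_times):
            matches.append(len(other_times) - 1)  # t follows every entry
        else:
            before = other_times[pos - 1]
            after = other_times[pos]
            # Pick whichever neighbour is closer; ties go to the earlier one.
            matches.append(pos - 1 if t - before <= after - t else pos)
    return matches
```

For example, with camera timestamps `[0.00, 0.10, 0.20]` and Velodyne timestamps `[0.02, 0.11, 0.19, 0.31]`, each camera frame is paired with the Velodyne frame at the same position in the list, and the last Velodyne frame is left unmatched.
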
More detailed information about the sensors, data format and calibration can be found here:

Note: We were not able to annotate all sequences and provide only those tracklet annotations that passed the third human validation stage, i.e., those of very high quality. For sequences for which tracklets are available, you will find the link [tracklets] in the download category.

