Dense 3D Reconstruction from Stereo (using LIBELAS)
March 24, 2017
Introduction
This article is a tutorial of the code implemented in the repository stereo-dense-reconstruction. The theory is not covered in detail, so some basic reading on the topic is suggested before one delves into the code. This ROS package computes dense disparity maps using LIBELAS and converts them into 3D point clouds. The method for generation of disparity maps can be changed based on user preferences. I had written an article on generating dense disparity maps using ORB descriptors earlier. That method can be implemented real time by using optimized code and parallelization.
An overview of the working code can be seen in the above video. Generation of dense disparity maps and the resulting point clouds is shown. The visualizations are shown in rviz
. Towards the end of the video, point cloud transformation is demonstrated using rqt_reconfigure
.
Stereo Calibration
A calibrated pair of stereo cameras is required for dense reconstruction. The calibration file is stored as a YAML file containing the following information about the stereo cameras:
Name | Description |
---|---|
K1 |
Intrinsic matrix of left camera |
K2 |
Intrinsic matrix of right camera |
D1 |
Distortion coefficients of left camera |
D2 |
Distortion coefficients of right camera |
R |
Rotation matrix from left camera to right camera |
T |
Translation vector from left camera to right camera |
XR |
Rotation matrix for point cloud transformation |
XT |
Translation matrix for point cloud transformation |
See a sample calibration file. A YAML calibration file can be generated by using this stereo camera calibration tool. For more information see this detailed tutorial on stereo camera calibration.
Although we require only the matrices mentioned in the above table, the sample calibration file contains other matrices found after rectification (e.g., P1
, P2
, Q
etc.). These matrices are found by OpenCV’s stereoRectify
function and they depend on the output image resolution of the rectified stereo images. Hence it is better to run stereoRectify
online than storing those matrices in a calibration file. This gives us the flexibility to work at different image resolutions.
The use of XR
and XT
matrices is explained later. For initial values XR
can be set as the $3 \times 3$ identity matrix while XT
can be set as the $3 \times 1$ zero vector.
Stereo Rectification
Disparity maps are generated from rectified stereo images. The raw stereo images should be published by a topic which publishes messages of the type sensor_msgs/Image
. The code for rectification of stereo images is implemented in src/stereo_rectify.cpp
.
Declare all the calibration matrices as Mat
variables. calib_img_size
is the resolution of the images used for stereo calibration. out_img_size
is the output resolution of the rectified images. This can be set at our own will. Lower resolution images require less computation time for dense reconstruction. pub_img_left
and pub_img_right
are the publishers for the rectified images. The output images are published as a sensor_msgs/CompressedImage
message. This is helpful while collecting large datasets, so that file size is reduced.
This function undistorts and rectifies the src
image into dst
. The homographic mappings lmapx
, lmapy
, rmapx
, and rmapy
are found from OpenCV’s initUndistortRectifyMap
function.
This function computes all the projection matrices and the rectification transformations using the stereoRectify
and initUndistortRectifyMap
functions respectively.
This callback function takes a raw stereo (left) image as input. Then it undistorts and rectifies the image using the functions defined above and publishes on the rectified image topic using pub_img_left
. Similarly imgRightCallback
is defined for the right stereo image.
In the main function, the stereo calibration information is read into calib_file
. sub_img_left
and sub_img_right
subscribes to the topics publishing the raw stereo image data. The rectified output images are published to the following topics:
/camera_left_rect/image_color
/camera_right_rect/image_color
The ros node can be run as
Dense 3D Reconstruction
Generating dense 3D reconstructions involve two major steps: (1) computing a disparity map (2) converting the disparity map into a 3D point cloud. The code for dense reconstruction is implemented in src/dense_reconstruction.cpp
.
Having time synced stereo images is important for generating accurate disparity maps. To sync the left and right image messages by their header time stamps, ApproximateTime
is used. The procedure is implemented in the main function of src/dense_reconstruction.cpp
.
This syncs the two rectified image topics, and the callback function imgCallback
receives the time synced messages.
Disparity Maps using LIBELAS
LIBELAS (Efficient Large-Scale Stereo Matching) is a great open source software for generating dense disparity maps. It does not use global optimization techniques and yet produces great results at almost real time. More information about ELAS can be found here.
This function computes the dense disparity map using LIBELAS, and publishes it as a 8-bit grayscale image to the topic /camera_left_rect/disparity_map
. The disparity map is constructed with the left image as reference. The parameters for LIBELAS can be changed in the file src/elas/elas.h
.
Any method other than LIBELAS can be implemented inside the generateDisparityMap
function to generate disparity maps. One can use OpenCV’s StereoBM
class as well. The output should be a 8-bit grayscale image.
3D Reconstruction
Given a disparity map, a corresponding 3D point cloud can be easily constructed. The Q
matrix stored in the calibration file is used for this conversion. The reconstruction is mathematically expressed by the following matrix equation.
where $(X,Y,Z,W)^\top$ denotes the 3D point in homogeneous coordinates. $d(x,y)$ is the disparity of a point $(x,y)$ in the left image. The $4 \times 4$ matrix denotes the Q
matrix. This matrix multiplication in homogeneous coordinates is implemented in the publishPointCloud
function. It publishes a sensor_msgs/PointCloud
message to the topic camera_left_rect/point_cloud
. The point cloud generated is in the reference frame of the left camera. Hence a transformation (XR
, XT
) is applied to transform the point cloud into a different reference frame (as required by the user). The transformation equation is as follows
where $P_A$ is a point in reference frame $A$, $P_B$ is a point in the reference frame $B$. $R$ and $T$ denote the rotation and translation matrices respectively. In this case $R$ = XR
and $T$ = XT
. Figure 1 shows some visualizations of the resulting point clouds from dense disparity maps.
Run the dense reconstruction node as
Visualize the point cloud in rviz
Point Cloud Transformation
For ground robots with mounted stereo cameras, usually the point clouds are formed in the reference frame of the left camera. Often the point clouds need to be transformed to a different frame e.g., a reference frame with the origin at the centre of rotation of the robot projected on to the ground plane. These transformations are hard to calculate mathematically. Hence it is easier to find these transformations visually, if the geometry of the scene is known.
The transformation from one reference frame to another is represented by the rotation matrix XR
and the translation vector XT
. XR
can be decomposed into Euler angles ($\phi_x, \phi_y, \phi_z$) while XT
contains the translation values in the $X,Y,$ and $Z$ direction. A config file cfg/CamToRobotCalibParams.cfg
stores these six values. The config file is bound with the dense_reconstruction
node using dynamic_reconfigure
.
The above piece of code implements dynamic_reconfigure
to configure the transformation parameters.
The above two functions are used to compose the matrices XR
and XT
. composeRotationCamToRobot
finds a rotation matrix from three Euler angles by the $ZYX$ convention. composeTranslationCamToRobot
returns a $3\times 1$ translation matrix.
Initially, the values of XR
and XT
in the calibration file were the $3\times 3$ identity matrix and the $3\times 1$ zero matrix. The function debugCamToRobotTransformation(XR, XT)
can be used to dynamically change the values of XR
and XT
with the help of dynamic_reconfigure
. If the desired transformation is found, the resulting values of XR
and XT
can be updated in the calibration file. To modify the transformation values start dynamic_reconfigure
by
Refer to the video in the Introduction section to have a better understanding of how the point cloud transformation works. It is shown towards the end of the video.
Build and Run
Clone the repository stereo-dense-reconstruction in your ros catkin workspace. Build using catkin_make
. Further instructions on running each node is given in the README of the repository.
References and Related Stuff
- Stereo Camera Calibration
- LIBELAS webpage
- Efficient Large-Scale Stereo Matching (paper)
- elas_ros (another implementation of ELAS on ROS)
- Finding 3D transformations