
Dr.-Ing. Klaus H. Strobl ‒ icra2011

Complementary videos to the ICRA 2011 publication "Vision-Based Pose Estimation for 3-D Modeling in Rapid, Hand-Held Motion."

This work is an extension of "The Self-Referenced DLR 3D-Modeler" presented at IROS 2009 [1]; the videos accompanying that publication can be found here. This is the current accompanying video:

strobl_icra11_video.avi (codec: msmpeg4v2, size: 4.7 MB)

strobl_icra11_video-hd.wmv (codec: MS Windows Media Video 9, res.: 1920x1080, size: 85 MB)

The video demonstrates the real-time performance of the current approach while a scene is being scanned, together with online meshing of the resulting point cloud (work by Tim Bodenmüller) and its 3-D visualization. A notebook equipped with an Intel® Core™ 2 Duo P8700 processor handles everything in real time at 25 Hz (i.e., every 40 ms). Feature matching and subsequent pose estimation from a new image typically take less than 10 ms (with occasional stereo computations on a separate thread); laser stripe profiler operation takes 16 ms; and online meshing together with visualization takes 18 ms. The computations are thus handled with ease. Note that, at 25 Hz, the scanning speed is limited by the desired density of 3-D points on the model's surface. Hectic movements are also performed to show the actual agility of the system.

The next videos focus on the tracking robustness (agility) of the algorithm.

First, we show the same challenging sequence already presented in Ref. [1], now using the current algorithm, i.e., without support from an inertial measurement unit. The motion is clearly beyond the requirements for rapid 3-D modeling. Note the decisive effect of the first two active features, whose search radii amount to only 40 and 20 pixels, respectively (a sketch of such a radius-limited search follows the video links). Two versions of the same sequence are shown, in real time and in slow motion:

challenging-active_100.avi (codec: msmpeg4v2, size: 17 MB)

challenging-active_50.avi (codec: msmpeg4v2, size: 34 MB)
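
As a rough illustration of what such a limited search radius means, the following sketch (not our actual implementation; the template size, NCC matching, and all function names here are mere assumptions) scans a square window of the given radius around the predicted feature location and keeps the best-scoring match:

    # Minimal sketch of a radius-limited feature search (illustrative only).
    # Assumptions: features are small grayscale templates, matching uses
    # normalized cross-correlation (NCC), and 'image' is a 2-D numpy array.
    import numpy as np

    def ncc(a, b):
        # Normalized cross-correlation of two equally sized patches.
        a = a - a.mean()
        b = b - b.mean()
        return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + 1e-12))

    def search_feature(image, template, center, radius):
        # Scan a (2*radius+1)^2 window around 'center' = (x, y) for the best
        # NCC match of 'template'; returns the best position and its score.
        h, w = template.shape
        cx, cy = int(round(center[0])), int(round(center[1]))
        best_xy, best_score = None, -1.0
        for y in range(cy - radius, cy + radius + 1):
            for x in range(cx - radius, cx + radius + 1):
                top, left = y - h // 2, x - w // 2
                if top < 0 or left < 0 or top + h > image.shape[0] or left + w > image.shape[1]:
                    continue
                score = ncc(image[top:top + h, left:left + w], template)
                if score > best_score:
                    best_xy, best_score = (x, y), score
        return best_xy, best_score

With a radius of 20 pixels instead of 40, the number of candidate locations drops from 81^2 = 6561 to 41^2 = 1681, i.e., roughly by a factor of four.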

Next, we compare the tracking performance of the current approach with that of other approaches. In Ref. [1] we showed the following videos:

In the first video, neither IMU data nor image flow extrapolation is used: the starting feature locations for tracking correspond to the last tracked positions of the features or, if a feature was lost, to its expected projection computed from the known structure and the last available pose estimate. The tracking is very unstable and only very slow motion is allowed: optical flow beyond 7-8 pixels is no longer tolerated, a limit that is easily exceeded in rotation (a sketch of this baseline prediction follows the video link):

tracking_last.avi (codec: msmpeg4v2, size: 3.0MB)
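
A simplified sketch of this baseline prediction (illustrative only; the variable names and the pinhole projection with intrinsics K, rotation R, and translation t are assumptions):

    import numpy as np

    def predict_baseline(last_tracked_xy, lost, X, K, R, t):
        # If the feature is still tracked, start the search at its last position;
        # otherwise reproject its known 3-D point X with the last available pose.
        if not lost:
            return np.asarray(last_tracked_xy)
        p = K @ (R @ X + t)      # pinhole projection of the 3-D point
        return p[:2] / p[2]      # predicted pixel coordinates (x, y)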

In the second video, the last image flow is estimated for every feature and then added to its last tracked position; this is called image flow extrapolation (a sketch follows the video link). Performance is good in translation but not in rotation, because the rotational part of the image flow varies much more strongly than the translational part:

tracking_optical.avi (codec: msmpeg4v2, size: 5.8MB)
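
Image flow extrapolation reduces to a one-line prediction; a minimal sketch (variable names assumed):

    import numpy as np

    def predict_flow_extrapolation(prev_xy, last_xy):
        # prev_xy, last_xy: feature positions in images k-2 and k-1.
        last_flow = np.asarray(last_xy) - np.asarray(prev_xy)
        return np.asarray(last_xy) + last_flow   # predicted position in image k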

In the last video we show the satisfactory performance of the hybrid prediction scheme presented in the paper under highly dynamic conditions. IMU data are used to predict the rotational image flow of every feature. For the translational image flow, the last translational image flow is first estimated by subtracting the last predicted rotational image flow (from the IMU) from the last measured image flow (from the tracking results); the current translational image flow prediction is then the extrapolation of this last translational flow. The resulting hybrid image flow is, again, added to the last tracked positions of the features (a sketch follows the video link):

tracking_hybrid.avi (codec: msmpeg4v2, size: 3.5MB)
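
A simplified sketch of this hybrid prediction (illustrative only; computing the rotational flow from the infinite homography K * dR * K^-1, which is exact for a pure rotation, is an assumption of this sketch):

    import numpy as np

    def rotational_flow(xy, dR, K):
        # Image displacement of pixel 'xy' under a pure camera rotation dR.
        p = np.array([xy[0], xy[1], 1.0])
        q = K @ dR @ np.linalg.inv(K) @ p
        return q[:2] / q[2] - p[:2]

    def predict_hybrid(prev_xy, last_xy, dR_last_imu, dR_new_imu, K):
        # prev_xy, last_xy: feature positions in images k-2 and k-1;
        # dR_last_imu, dR_new_imu: inter-frame rotations predicted by the IMU.
        last_flow  = np.asarray(last_xy) - np.asarray(prev_xy)   # measured flow
        last_rot   = rotational_flow(prev_xy, dR_last_imu, K)    # rotational part, k-2 -> k-1
        last_trans = last_flow - last_rot                        # translational part
        new_rot    = rotational_flow(last_xy, dR_new_imu, K)     # predicted rotation, k-1 -> k
        return np.asarray(last_xy) + new_rot + last_trans        # prediction in image k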

Using our current, novel algorithm, however, we now achieve even better performance (agility) with even fewer resources, and without resorting to a synchronized inertial measurement unit:

tracking_active.avi (codec: msmpeg4v2, size: 8.4MB)

by Klaus Strobl, on September 10th, 2010.


[1] K. H. Strobl, E. Mair, T. Bodenmüller, S. Kielhöfer, W. Sepp, M. Suppa, D. Burschka, and G. Hirzinger, "The Self-Referenced DLR 3D-Modeler," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), St. Louis, MO, USA, October 2009, pp. 21-28, best paper finalist. [PDF]

Last updated on Tuesday, 13 May 2014 by Klaus H. Strobl