As mentioned previously [1], we strongly believe that giving a robot the ability to autonomously map its environment provides valuable information for Building Information Modeling (BIM). For example, our first tests [8] successfully detected out-of-date building plans, a common occurrence following remodelling (Figure 1).

However impressive, the displayed map is still limited: while it is internally consistent, fine details could not be properly captured using a bare-bones robot equipped only with an inexpensive lidar. Moreover, the data produced is a horizontal slice of the world at the height of the robot (~20 cm), which makes it hard to interpret for a human used to seeing the world from much higher up (~175 cm). Both of these problems can be fixed using 3D scanning (Figure 2).

Traditionally, a full 3D scan of a building’s interior involves a human lugging a 3D scanner from room to room [2]; this process turns out to be very expensive for a variety of reasons. First, in order to achieve metrological quality, the scanner used has to be of metrological quality itself: fast, accurate and long range, hence expensive. Second, humans are needed to create the scans: to select the best scanning spots, to move the equipment, to launch and validate scans and finally to stitch the scans together into a cohesive map. Many of those steps require specialist knowledge, which is also expensive. Finally, to limit the time (and thus the expense) required to complete operations, only the bare minimum of scans required to cover the space is performed. This can lead to underwhelming results as details are missed.

An alternative method that we support relies on short range, low cost scanners. Sensors in this category can be exceptionally precise: for example, at a 30 cm standoff, a small-baseline stereo pair can achieve micrometer accuracy [3]. The problems with using them to scan buildings are twofold:

  1. They are short-sighted, so the number of scans necessary to completely model a building is an order of magnitude greater than with typical large 3D sensors. While the amount of work and precision required might make such an approach impractical for humans, robots are well suited to it.
  2. The maps they produce are imprecise at long distances, meaning that weaving individual scans into a cohesive whole without accumulating errors is very hard. The traditional ways to correct this use scale bars or other calibration methods [7]; in addition to requiring extra equipment, these are time-consuming. Luckily, the whole error accumulation problem can be avoided entirely by precisely positioning the sensor within an exact 2D map such as the one in Figure 1 (even if this map lacks fine details).
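
To make the error accumulation argument concrete, here is a minimal sketch comparing the two strategies. It assumes a deliberately simple error model that is not taken from our system: the robot moves in a straight line, and each scan-to-scan registration adds the same small yaw error. The function names and numeric values are hypothetical and chosen only for illustration.

```python
import math


def chained_drift(n_scans: int, step_m: float, yaw_bias_deg: float) -> float:
    """Position error after chaining n_scans registrations along a straight line.

    Assumed error model: every scan-to-scan registration adds the same
    small yaw error, which is then inherited by all later scans.
    """
    x = y = heading = 0.0
    bias = math.radians(yaw_bias_deg)
    for _ in range(n_scans):
        heading += bias  # the heading error keeps growing
        x += step_m * math.cos(heading)
        y += step_m * math.sin(heading)
    # The true path ends at (n_scans * step_m, 0).
    return math.hypot(x - n_scans * step_m, y)


def anchored_error(n_scans: int, localization_error_m: float) -> float:
    """Error when each scan is placed independently within the 2D map.

    Each scan only carries the localization error of its own pose,
    so the result does not depend on how many scans came before it.
    """
    return localization_error_m


if __name__ == "__main__":
    for n in (10, 25, 50):
        print(n, round(chained_drift(n, 1.0, 0.5), 2), anchored_error(n, 0.02))
```

With these illustrative values, the chained estimate drifts by around 11 m after 50 scans, while the anchored error stays at the assumed 2 cm localization error however many scans are taken.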

One such short range scanner is the Intel Realsense D435 [4], an active stereo camera that projects a pseudo-random light pattern onto the scene in order to recover a 3D point cloud using the triangulation principle. While its theoretical maximum range is 10 m, this sensor suffers from the typical stereo camera accuracy drop, with the depth error growing with the square of the measured distance [6]. Indeed, at 4 m, the expected RMS error of a reading is about 60 mm (or 180 mm peak-to-peak). Worse, in practice, this error can be considerably higher depending on lighting conditions and the type of surface being measured. Since we aim to produce metrological quality models (accurate within 0.1% of the true value), the Intel Realsense D435 might appear to be a poor choice. However, as we’ll discuss in an upcoming paper, its shortcomings can be corrected through the smart use of already available data.
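
The accuracy drop can be estimated with the usual stereo triangulation error model, in which the RMS depth error grows with the square of the distance. The sketch below is only an estimate under assumed parameters: the 50 mm baseline, 90° horizontal field of view, 848 px horizontal resolution and 0.08 px subpixel accuracy are illustrative values for a D435-like sensor, not measurements from our setup.

```python
import math


def focal_length_px(h_res_px: float, hfov_deg: float) -> float:
    """Focal length in pixels from horizontal resolution and field of view."""
    return 0.5 * h_res_px / math.tan(math.radians(hfov_deg) / 2.0)


def depth_rms_error_mm(distance_mm: float, baseline_mm: float,
                       focal_px: float, subpixel: float) -> float:
    """Stereo depth RMS error: distance^2 * subpixel / (focal * baseline)."""
    return distance_mm ** 2 * subpixel / (focal_px * baseline_mm)


# Assumed, illustrative parameters for a D435-like sensor.
BASELINE_MM = 50.0
SUBPIXEL = 0.08
FOCAL_PX = focal_length_px(848, 90.0)

if __name__ == "__main__":
    for d_m in (1.0, 2.5, 4.0):
        err = depth_rms_error_mm(d_m * 1000.0, BASELINE_MM, FOCAL_PX, SUBPIXEL)
        print(f"{d_m:.1f} m -> {err:.1f} mm RMS")
```

Under these assumptions the model gives roughly 60 mm RMS at 4 m, in line with the figure quoted above. Doubling the distance quadruples the error, which is why restricting the sensor to short ranges pays off so quickly.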

For the moment, to illustrate the viability of short range and low cost 3D scanning, we mounted a RealSense D435 camera on a robot in addition to a LIDAR-Lite sensor [5] for long range positioning. The procedure used to generate a full 3D scan of a building is simple. First, an accurate (but lacking in detail) outline of the space is built using the technique described in our previous paper [8]. Using this contour, the robot automatically generates a map refinement and scanning trajectory (Figure 3). At each stop along this path, the robot captures a panorama of colored 3D points. By progressively stitching those panoramas together, a full model is built (Figure 4). No human is involved in the whole process (other than to click “Go” and open doors, that is).
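
The stitching step can be sketched as follows. This is a simplified illustration rather than our actual implementation: it assumes each panorama is a list of (x, y, z) points in a sensor frame with x forward, y left and z up, that the robot pose in the 2D map is given as (x, y, yaw), and that the sensor sits at a fixed height. The names `scan_to_map` and `stitch` are hypothetical, and color is omitted for brevity.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
Pose2D = Tuple[float, float, float]  # x (m), y (m), yaw (rad) in the 2D map


def scan_to_map(points: List[Point], pose: Pose2D,
                sensor_height_m: float) -> List[Point]:
    """Express one scan's points in the map frame using the known robot pose."""
    rx, ry, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [
        (rx + c * px - s * py,   # rotate about z, then translate
         ry + s * px + c * py,
         sensor_height_m + pz)   # assumed constant sensor height
        for px, py, pz in points
    ]


def stitch(scans: List[Tuple[List[Point], Pose2D]],
           sensor_height_m: float) -> List[Point]:
    """Merge scans by placing each one independently in the map frame."""
    model: List[Point] = []
    for points, pose in scans:
        model.extend(scan_to_map(points, pose, sensor_height_m))
    return model
```

Because each scan is placed using its own pose in the map, no scan depends on how well the previous ones were aligned.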

The scans in Figure 4 are raw: no post-processing was applied to fuse the data, filter it or correct the colors and illumination artefacts. As such, surfaces that are far from the D435 during an individual scan (such as the top of the walls) are noticeably wavier because of increased noise. However, since the robot position is known precisely through the use of the building outline, these errors do not accumulate. In other words, by limiting the D435 data to measurements within 2.5 m, the expected error is (in our experience) 50 mm, or 2% of the distance to the sensor; relative to the size of the building, this error is within a 0.2% threshold. As mentioned previously, this noise can be corrected through clever processing of the collected information, the subject of our next paper.
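
The arithmetic behind these percentages can be checked with a few lines. The sketch below only restates the figures quoted above. The building size is not stated in this article, so the second function simply computes the extent at which a given absolute error equals a given relative threshold; both function names are hypothetical.

```python
def relative_error(error_mm: float, reference_mm: float) -> float:
    """Error expressed as a fraction of a reference length."""
    return error_mm / reference_mm


def extent_for_threshold(error_mm: float, threshold: float) -> float:
    """Smallest reference length (mm) for which error_mm stays within threshold."""
    return error_mm / threshold


if __name__ == "__main__":
    # 50 mm error on a 2.5 m measurement, relative to the sensor distance.
    print(f"{relative_error(50.0, 2500.0):.1%}")
    # Extent (in metres) at which 50 mm corresponds to 0.2%.
    print(f"{extent_for_threshold(50.0, 0.002) / 1000.0:.1f} m")
```

In other words, a 50 mm error stays within 0.2% for any building dimension of about 25 m or more, provided the error does not grow from one scan to the next.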

In short, we’ve shown that creating a complete 3D colored model of a building is possible using only a short range 3D sensor and a 2D map serving as a guide. Such models have many uses for BIM, from structural inspection during construction to planning renovations or repurposing, among many others. The best part of the presented system is its cost: whilst 3D scanners are traditionally expensive to buy and operate, the combination of a robot with two inexpensive sensors lets us create accurate 3D models at a fraction of the price.


[1] From Design to Demolition : Robotics Hold a Promise to Revolutionize Every Stage of a Building’s Life Cycle
[2] Matterport, Company main page.
[3] Creaform, HandySCAN 3D product page.
[4] Intel, Intel Realsense Depth Camera D435 product page.
[5] Garmin, LIDAR-Lite v3HP product page.
[6] Anders Grunnet-Jepsen, Tuning D435 and D415 Camera for Optimal Performance.
[7] Chris Aher, Automatic Accuracy : Getting the most out of DotProduct data with the accuscale-DP scale bar kit.
[8] Cost-Effective Map Building