Introduction to SLAM

Every homeowner, building manager and construction foreman should have precise, detailed and up-to-date maps of their buildings. Due to cost and manpower constraints, we believe that autonomous robots will be the cornerstone in creating those maps [1]. To do this, the computational problem of modelling an unknown environment and tracking the robot’s position within it at the same time must be solved; this problem is usually referred to as SLAM (simultaneous localisation and mapping).

SLAM is something of a chicken-and-egg problem. First, the robot needs a perfect map to precisely track its position using visual sensors such as cameras, sonars and lidars. Second, the robot needs to know its position and orientation perfectly to add new information correctly to the map it is building. Any imprecision in one of those two steps cascades into the other, resulting in an increasingly inaccurate map and robot pose. Figure 1 illustrates this phenomenon.

There are two components to the mapping system: the sensors used to gather information and the SLAM algorithm itself. Both of those components can be optimized to minimize error accumulation, but we’ll concentrate here on the former.

Better sensors mean that the robot’s position and orientation can be estimated more accurately as it moves, both by dead reckoning (inertial navigation, akin to walking with your eyes closed) and with feedback from the environment (comparing the sensors’ readings to the map). Good visual sensors also mean that the information added to the map is more precise, leading to a virtuous circle of better pose estimation and thus better mapping.

Metrologic Mapping Requirements

Knowing that high-quality sensors are important to the SLAM process, and that we intend our robots to provide building information modeling (BIM) to a large array of customers in a variety of environmental conditions, our sensor and algorithm selection needs to take the following constraints into account:

  • Accuracy: BIM requires maps of metrological quality; as such, our goal is creating maps accurate within 0.1% of the true distances.
  • Cost: In a business-to-business (B2B) setting, the cost of a metrology device is more forgiving than when selling to consumers (B2C). Ideally, however, a single platform should be usable for both. We thus believe that a bare-bones mapping robot should cost below $500.
  • Ambient light: Robots should be able to work in the dark: the building in which the robots operate might be unfinished, without electricity or simply unoccupied at night. Conversely, robots should also be able to work in places where only ambient light is available (such as unoccupied buildings during the day). This last environment is surprisingly harsh, with luminosity varying greatly from one spot to the next and direct sunlight being far brighter (roughly 1000x [3]) than common office lighting.
  • Range: In commercial spaces, 30 meters is the maximal expected distance between the robot and any surface directly observable from its position.
  • Computational complexity: Computations should not slow down as the amount of data collected or the size of the map increases.
  • Power consumption: The robot should be able to map 10,000 m² (the size of an average department store [2]) on a single charge. Minimising power requirements allows greater autonomy or a smaller battery pack.
  • Reliability: The robot design should minimize the number of fragile or moving components.

Choosing a visual sensor

Visual sensors are the backbone of a robot’s ability to produce good quality maps; as discussed previously, they affect the quality of the SLAM directly, as they gather the data added to the map, and indirectly, as they help track the robot’s position. Choosing the right sensor is thus critical to building a metrology-grade robot. There are many options when it comes to sensing devices that might be used for SLAM. Those options are usually split into two categories: passive and active sensors.

Passive sensors simply “take in” the environment; some examples extracting 3D information from a scene are passive stereo and feature tracking cameras. Advantages of this class of sensor include their relative physical simplicity, low power consumption and low cost. Unfortunately, since one of our requirements is to be able to function at long ranges in complete darkness, such sensors are unsuitable as they rely on ambient visible light.

Active sensors, such as active stereo (e.g. structured light sensors), lidars, sonars and radars, produce easily identifiable signals and then measure their reflection off the environment. Obviously, this self-illumination solves the darkness constraint; however, sensors of this class suffer from other problems.

For instance, stereo cameras intrinsically have a very limited range because their depth error grows quadratically with distance [4]; Figure 2 illustrates the phenomenon for a camera configuration similar to the Intel RealSense series [5]. From this, it is clear that at the required range of 30 m, such a system would incur significant imprecision. Lidars generally do not suffer from this problem: given sufficient return signal strength, they have a constant error regardless of the measured distance (except for very short ranges).
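As a rough illustration of why stereo range is limited, the sketch below evaluates the usual first-order error model for a rectified stereo pair, in which depth error scales with the square of the distance. The focal length, baseline and disparity error are hypothetical values chosen for illustration; they are not the specifications of any particular camera.

```python
def stereo_depth_error(z_m, focal_px, baseline_m, disparity_err_px):
    """First-order depth uncertainty of a rectified stereo pair.

    Depth is z = f * b / d, so a disparity error dd maps to
    dz ~ z**2 / (f * b) * dd.
    """
    return z_m ** 2 / (focal_px * baseline_m) * disparity_err_px


# Hypothetical values, for illustration only.
FOCAL_PX = 640.0        # focal length expressed in pixels
BASELINE_M = 0.05       # distance between the two cameras
DISPARITY_ERR_PX = 0.1  # assumed sub-pixel matching error

for z in (1.0, 5.0, 10.0, 30.0):
    err = stereo_depth_error(z, FOCAL_PX, BASELINE_M, DISPARITY_ERR_PX)
    print(f"{z:5.1f} m -> +/- {err:.3f} m ({100 * err / z:.2f} % of range)")
```

With these assumed numbers the relative error itself grows linearly with range, which is why a fixed percentage target such as 0.1% becomes harder to meet the farther the surface is.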

A flash lidar would therefore seem like a natural choice for our application. However, flash lidars present a power consumption problem: unless the light is collimated, the power per unit area arriving at the surface a lidar points at is inversely proportional to the square of the distance between the two. This makes lighting the whole scene at a distance of 30 m an incredibly power-hungry task, which is unacceptable given battery costs.
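To put a number on the inverse-square relationship, the following sketch computes how much more emitter power is needed to keep the same irradiance on a surface as the distance grows. It only models the outbound path of uncollimated light and ignores the return path, surface reflectivity and optics, so it should be read as a lower bound on the real penalty. The function name and the 1 m reference distance are illustrative choices.

```python
def relative_emitter_power(distance_m, reference_m=1.0):
    """Emitter power needed to keep the same irradiance on a surface,
    relative to the power needed at reference_m (inverse-square law)."""
    return (distance_m / reference_m) ** 2


for d in (1.0, 5.0, 10.0, 30.0):
    factor = relative_emitter_power(d)
    print(f"{d:5.1f} m -> {factor:6.0f}x the power needed at 1 m")
```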

We therefore limit ourselves to a single laser beam sensor that uses time of flight (ToF) technology. There seem to be many options when it comes to this technology due to recent developments in the autonomous automotive industry [6]. Indeed, many of the industry players believe that lidar is the way to go, leading to massive investments in several dozen tech companies. Low-cost indoor solutions, on the other hand, are fewer. Typical examples of rotating single beam sensors are those made by Hokuyo, whose UTM-30LX [7] model costs over $6,000 for the required range. Leica’s BLK360 [8], which rotates the lidar about two axes to achieve 3D scanning, is currently their entry-level metrological 3D scanner at the prohibitive cost of roughly $16,000. At the other end of the price spectrum, sensors like the $100 RPLidar A1M8 [9] are too short range for our needs and raise serious durability concerns. This leaves us with the seemingly impossible task of finding a sensor that fits all our criteria. It turns out, however, that there is a satisfactory off-the-shelf solution: LIDAR-Lite [10], a single point ToF sensor produced by PulsedLight (recently acquired by Garmin).

The LIDAR-Lite is marketed as a ranging solution for drone, robot, or unmanned vehicle applications. With a retail price of $150, it sports the following specifications:

Using LIDAR-Lite to create precise maps

Using the LIDAR-Lite for our application (autonomous robotic mapping) might not be an obvious choice. After all, being a single point fixed sensor, it lacks the ability to produce scans (i.e. contours or curves of the environment around the sensor). The lack of precision is also obvious: with a measurement bias of 1% (mean) and measurement noise equivalent to 1% of the distance measured (ripple), this sensor does not attain our metrological grade objective of 0.1% error out of the box. Nevertheless, software can correct what the LIDAR-Lite lacks in hardware capabilities.

The first missing element, the scanning capability, can easily be provided by the robot. Indeed, making the robot turn on the spot around the focal point of the sensor provides the contour of the visible environment. To achieve this, the robot provides its angular position, time-synchronized with the sensor’s distance readings. In addition to the angular position synchronization, our proprietary auto-calibrating technology allows for accurate readings without knowing the exact angular speed. Overall, this technique gives us robustness and a significant cost advantage over mounting the sensor on a standalone motor.
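The geometric core of this idea can be sketched in a few lines: each time-synchronized pair of heading angle and range becomes one point of the contour. This is only a minimal illustration that assumes the sensor's measurement origin sits exactly on the rotation axis; it does not represent the proprietary auto-calibration mentioned above, and the function name and sample data are made up.

```python
import math


def scan_to_points(samples):
    """Convert (angle_rad, range_m) samples into (x, y) points in the robot frame.

    Assumes the sensor's measurement origin lies on the rotation axis.
    """
    return [(r * math.cos(a), r * math.sin(a)) for a, r in samples]


# Synthetic example: the robot spins in place at the centre of a
# circular room of radius 2 m and takes one reading every 90 degrees.
samples = [(math.radians(deg), 2.0) for deg in range(0, 360, 90)]
for x, y in scan_to_points(samples):
    print(f"{x:+.2f}, {y:+.2f}")
```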

The second missing element is accuracy: the 1% error bias and 1% noise standard deviation are more than an order of magnitude worse than metrology grade accuracy. Our processing algorithms solve this through the integration (averaging) of data. To illustrate how this works (Figure 3), consider multiple measurements of the same point corrupted with an arbitrarily large noise. If the error is zero-mean (i.e. the mean of an infinite number of samples equals the true value), then the true measure can be obtained with arbitrary accuracy by integrating a sufficient number of samples.
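The effect of averaging can be checked numerically. The sketch below simulates readings of a single point with zero-mean Gaussian noise and shows the averaged estimate next to the theoretical standard deviation of the mean, which shrinks as 1/sqrt(n) for independent samples. The Gaussian noise model, the 10 m range and the sample counts are assumptions made for illustration.

```python
import random
import statistics


def averaged_measurement(true_range_m, noise_std_m, n_samples, rng):
    """Mean of n_samples readings corrupted by zero-mean Gaussian noise."""
    readings = [rng.gauss(true_range_m, noise_std_m) for _ in range(n_samples)]
    return statistics.fmean(readings)


rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_RANGE_M = 10.0
NOISE_STD_M = 0.10      # 1% of 10 m, in line with the noise figure quoted above

for n in (1, 10, 100, 10_000):
    estimate = averaged_measurement(TRUE_RANGE_M, NOISE_STD_M, n, rng)
    std_of_mean = NOISE_STD_M / n ** 0.5
    print(f"n={n:6d}  estimate={estimate:.4f} m  std of mean={std_of_mean:.4f} m")
```

Under these assumptions, bringing 1% noise down to 0.1% takes on the order of 100 independent samples of the same point, and averaging does nothing for a bias, which is why the calibration step matters.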

Therefore, an important step is to calibrate the LIDAR-Lite to make it a zero-mean error sensor. This is accomplished via a reference range finder (in our case, a slow but precise Bosch 50cx rangefinder) mounted on the robot. By capturing simultaneous measurements from the two sensors over a range of distances (0.25 m – 25 m) and then using a polynomial model to represent the difference between them, the LIDAR-Lite bias can be corrected. Other than this calibration, only shape-preserving filtering is applied to the LIDAR-Lite data to make it ready for SLAM.
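A minimal sketch of such a calibration is given below. The text does not state the degree of the polynomial, so a first-degree model (a straight line) fitted by ordinary least squares is assumed here for brevity; the function names and the synthetic data are hypothetical and only meant to show the principle of modelling the bias against a reference and subtracting it.

```python
def fit_linear_bias(lidar_m, reference_m):
    """Least-squares line through (lidar reading, lidar - reference) pairs.

    Returns (slope, intercept) such that bias ~ slope * reading + intercept.
    """
    n = len(lidar_m)
    bias = [l - r for l, r in zip(lidar_m, reference_m)]
    mean_x = sum(lidar_m) / n
    mean_y = sum(bias) / n
    sxx = sum((x - mean_x) ** 2 for x in lidar_m)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lidar_m, bias))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct(reading_m, slope, intercept):
    """Subtract the modelled bias from a raw reading."""
    return reading_m - (slope * reading_m + intercept)


# Synthetic data: the reference readings are taken as truth and the
# simulated lidar reads 1% long plus a 2 cm offset (made-up numbers).
reference = [0.25, 1.0, 5.0, 10.0, 25.0]
lidar = [1.01 * r + 0.02 for r in reference]

slope, intercept = fit_linear_bias(lidar, reference)
for raw, ref in zip(lidar, reference):
    print(f"raw={raw:.3f} m  corrected={correct(raw, slope, intercept):.3f} m  reference={ref:.3f} m")
```

A higher-degree polynomial would follow the same pattern, with the fit replaced by a general least-squares routine.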

The process described above reduces the noise found in individual LIDAR-Lite measurements; it is however not the only method used to smooth out noisy data. First, as described previously, individual range measurements are organised into 360° scans of the environment through the movement of the robot. Knowing the relationship between measurements, it is possible to filter out noise. Second, when two such scans overlap, the overlap can be used to refine the estimated pose of the robot during the scans (effectively eliminating the effect of noise on it). The overlap region can also be averaged, further reducing the noise in the map model. Consequently, this allows us to incrementally build the map whilst keeping the error low.


The procedure described above produces very impressive results for a very low cost autonomous mapping robot. The accuracy obtained by measuring part of a multi-room environment of approximately 30 m x 20 m reaches 0.006% of the longest measured distance, 0.085% on average and 0.147% in the worst case. The results illustrated in Figure 4 and tabulated in Table 2 were obtained using three different robots over a total of 8 runs from random starting locations. The robots mapped the space without human intervention.

An interesting feature of those trials was the discovery of inaccuracies in the official CAD drawings of the office space in Figure 4. One flagrant example is the rightmost wall of the open space: a double wall was installed during construction without any updates to the plans, reducing the advertised space length by about 30 cm (this was confirmed using the precise Bosch 50cx rangefinder). Other inaccuracies stem from furniture occupying the space; unfortunately, since the described robot setup only produces 2D curves, those items are impossible to distinguish from walls. This is where inexpensive 3D cameras could come to the rescue.


[1] From Design to Demolition: Robotics Hold a Promise to Revolutionize Every Stage of a Building’s Life Cycle.
[2] Wikipedia, Walmart, May 2019.
[3] Wikipedia, Lux, May 2019.
[4] Rajesh Rao, CSE 455: Computer Vision, Lecture 16 – Stereo and 3D Vision, September 2019.
[5] Intel, Intel RealSense product page.
[6] Timothy B. Lee, How 10 companies are trying to make powerful, low-cost lidar, February 2019.
[7] Hokuyo, UTM-30LX product page.
[8] Leica, BLK360 product page.
[9] Slamtec, RPLidar A1 product page.
[10] Garmin, LIDAR-Lite v3HP product page.