Introduction
Autonomous navigation is a vital capability for robotic systems. One of the essential elements of autonomous mobile robot (AMR) navigation is "visual SLAM," or visual simultaneous localization and mapping. Visual SLAM is a technique that uses images captured by a camera to simultaneously map the environment and estimate the robot's position. Mars exploration systems such as the Phoenix lander, the Opportunity rover, and the Ingenuity helicopter have used visual SLAM algorithms for autonomous navigation and landing. Further applications include robot vacuum cleaners, endoscopy, and unmanned aerial, ground, and underwater robots, where navigation tasks are automated.
Visual SLAM requires significant computational power to estimate poses and generate a map from images acquired by a camera. Most visual SLAM (VSLAM) algorithms run on the CPU, which limits real-time pose estimation and map creation. Isaac ROS (Robot Operating System) Visual SLAM and Nvblox are GPU-accelerated ROS 2 packages that estimate poses and build a map of the surrounding environment in real time for autonomous mobile robot navigation. We carried out experiments to evaluate how well Isaac ROS VSLAM and Nvblox work with a RealSense camera for autonomous navigation of an AMR.
NVIDIA Isaac ROS Visual SLAM and Nvblox Pipeline
Figure 1 shows how the NVIDIA Isaac ROS Visual SLAM and Nvblox pipeline works and how its components are integrated. VSLAM estimates poses, and Nvblox reconstructs a map of the surrounding environment for autonomous navigation of the robot.
Figure 1: Block Diagram of Isaac ROS VSLAM and Nvblox
NVIDIA Isaac ROS VSLAM is a ROS 2 package that performs simultaneous stereo visual localization and mapping. It computes stereo visual inertial odometry from a time-synchronized pair of stereo images using the GPU-accelerated Isaac Elbrus library. Isaac ROS VSLAM consists of three components: visual inertial odometry, optimization, and mapping. Visual odometry (VO) estimates the camera's position relative to its starting point; the VO pose shown in Figure 1 is produced by this component. The visual simultaneous localization and mapping (VSLAM) method is built on top of the VO pose and improves its quality through loop-closure detection, estimating SLAM poses from previously seen parts of the environment. In addition to visual data, Elbrus can use measurements from an inertial measurement unit (IMU). When VO is unable to estimate a pose, such as under poor lighting or when a long featureless surface is in front of the camera, the pipeline falls back to the IMU. Figure 1 shows the SLAM poses estimated by VSLAM.
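As an illustration of how this node is typically brought up, the launch-file sketch below starts the Isaac ROS VSLAM component and remaps its stereo inputs to the two RealSense infrared streams. The topic names, parameter names, and frame names are assumptions based on a typical Isaac ROS release and may differ in your version; treat this as a sketch rather than a drop-in configuration.

```python
# Minimal ROS 2 launch sketch (assumed topic/parameter names) that starts the
# Isaac ROS Visual SLAM node and feeds it the RealSense infrared image pair.
from launch import LaunchDescription
from launch_ros.actions import ComposableNodeContainer
from launch_ros.descriptions import ComposableNode


def generate_launch_description():
    visual_slam_node = ComposableNode(
        package='isaac_ros_visual_slam',
        plugin='nvidia::isaac_ros::visual_slam::VisualSlamNode',
        name='visual_slam',
        parameters=[{
            'enable_imu': False,        # set True to fuse RealSense IMU data (assumed parameter name)
            'map_frame': 'map',
            'odom_frame': 'odom',
            'base_frame': 'base_link',
        }],
        # Remap the VSLAM stereo inputs to the RealSense infra-1/infra-2 topics (assumed names).
        remappings=[
            ('stereo_camera/left/image', '/camera/infra1/image_rect_raw'),
            ('stereo_camera/left/camera_info', '/camera/infra1/camera_info'),
            ('stereo_camera/right/image', '/camera/infra2/image_rect_raw'),
            ('stereo_camera/right/camera_info', '/camera/infra2/camera_info'),
        ],
    )

    container = ComposableNodeContainer(
        name='visual_slam_container',
        namespace='',
        package='rclcpp_components',
        executable='component_container',
        composable_node_descriptions=[visual_slam_node],
        output='screen',
    )
    return LaunchDescription([container])
```

Running the node as a composable component in a container keeps the image transport zero-copy on the GPU path, which is the main reason the Isaac ROS packages are launched this way.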
NVIDIA Isaac ROS Nvblox is a ROS 2 package for real-time 3D reconstruction of the environment around an AMR from camera images. The reconstruction is intended to be used by path planners to generate a safe navigation path, and Nvblox uses NVIDIA CUDA to accelerate this process and enable real-time operation. The repository contains the ROS 2 integration for the Nvblox core library. The Nvblox algorithm comprises three key components: the truncated signed distance function (TSDF), the mesh, and the Euclidean signed distance function (ESDF). Nvblox takes an RGB image, a depth image, and a SLAM pose as input, and from the stream of RGB and depth images and the corresponding pose of the depth image it generates a 2D distance map slice and a 3D mesh. The RGB image stream, which is used to color the 3D reconstruction for visualization, is optional. The 2D distance map slice gives the distance from each point to the nearest reconstructed obstacle and is used for path planning, while the 3D mesh is used for visualization in RViz.
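The sketch below shows one plausible way to launch the Nvblox node and wire its depth, color, and camera-info inputs to the RealSense topics. The executable, parameter, and topic names here are assumptions based on a typical nvblox_ros release and may need to be adapted to your installed version.

```python
# Minimal launch sketch for the Nvblox ROS 2 node (assumed names throughout).
from launch import LaunchDescription
from launch_ros.actions import Node


def generate_launch_description():
    nvblox_node = Node(
        package='nvblox_ros',
        executable='nvblox_node',
        name='nvblox_node',
        output='screen',
        parameters=[{
            'voxel_size': 0.05,     # reconstruction resolution in meters (assumed parameter)
            'esdf_2d': True,        # publish a 2D distance map slice for planning (assumed parameter)
            'mesh': True,           # publish a mesh for RViz visualization (assumed parameter)
            'global_frame': 'odom', # frame in which the reconstruction is built
        }],
        # Remap Nvblox inputs to the RealSense color and depth topics (assumed names).
        remappings=[
            ('depth/image', '/camera/depth/image_rect_raw'),
            ('depth/camera_info', '/camera/depth/camera_info'),
            ('color/image', '/camera/color/image_raw'),
            ('color/camera_info', '/camera/color/camera_info'),
        ],
    )
    return LaunchDescription([nvblox_node])
```

Because the color stream is optional, the two color remappings can be dropped if only the distance map slice is needed for navigation.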
Explanation of Isaac ROS VSLAM, Nvblox, and the Nav2 stack pipeline
The entire pipeline for AMR autonomous navigation using Isaac ROS VSLAM, Nvblox, and the Nav2 stack is depicted in Figure 2. The pipeline is made up of five nodes: the RealSense camera node, the Isaac ROS VSLAM node, the Isaac ROS Nvblox node, the Nav2 node, and the RViz node. The following paragraph explains each block.
Figure 2: Isaac ROS VSLAM, Nvblox, and Nav2 pipeline
The RealSense camera node captures images from the RealSense camera and publishes RGB, depth, infra-1, and infra-2 images; infra-1 and infra-2 are infrared images. The Isaac ROS VSLAM node subscribes to the infra-1 and infra-2 images and publishes the pose and tf. The Isaac ROS Nvblox node generates a distance map slice by subscribing to the RGB image, depth image, pose, and tf. The Nav2 node uses the distance map slice for path planning and navigation: it is the control system that allows the AMR to reach a goal state autonomously given the current pose, the map, and a goal such as a destination pose. The Nav2 node plans a path to the goal state and sends commands to the AMR to follow the planned path to the goal position. RViz is a useful tool for visualizing images, odometry, and the generated map, and for providing a goal pose to the AMR.
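To illustrate the Nav2 end of this pipeline, the sketch below sends a single goal pose programmatically using the nav2_simple_commander helper that ships with Nav2. The frame name, goal coordinates, and the choice to skip waiting for AMCL (since localization is provided by VSLAM in this pipeline) are illustrative assumptions, not part of the setup described above; in practice the same goal can simply be set from RViz.

```python
# Sketch: send one navigation goal to the Nav2 stack and wait for the result.
import rclpy
from geometry_msgs.msg import PoseStamped
from nav2_simple_commander.robot_navigator import BasicNavigator


def main():
    rclpy.init()
    navigator = BasicNavigator()
    # Localization comes from VSLAM here, so do not wait for AMCL to activate.
    navigator.waitUntilNav2Active(localizer='bt_navigator')

    # Goal pose in the map frame (coordinates are placeholders).
    goal = PoseStamped()
    goal.header.frame_id = 'map'
    goal.header.stamp = navigator.get_clock().now().to_msg()
    goal.pose.position.x = 2.0
    goal.pose.position.y = 1.0
    goal.pose.orientation.w = 1.0

    navigator.goToPose(goal)
    while not navigator.isTaskComplete():
        feedback = navigator.getFeedback()  # e.g., monitor distance remaining
        # ... optionally cancel on timeout ...

    print('Navigation result:', navigator.getResult())
    rclpy.shutdown()


if __name__ == '__main__':
    main()
```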
Experimental Results
Figure 3 shows the AMR, the view from the RealSense camera mounted on it, and the resulting 3D point cloud map rendered as a mesh. Figure 4 shows the created map with the given goal state, the planned path, and the AMR safely arriving at the goal state by following the planned path.
Conclusion
To navigate the AMR autonomously in the lab, an Intel RealSense camera was integrated with the Isaac ROS VSLAM, Isaac ROS Nvblox, and Nav2 stacks. The AMR moved inside a lab environment for mapping and navigation. Isaac ROS VSLAM successfully estimated the AMR's odometry and, together with Nvblox, created a map of the surrounding environment while simultaneously localizing the robot within it for safe navigation. Thanks to GPU acceleration, the NVIDIA Isaac ROS VSLAM and Nvblox pipeline delivers real-time performance. We hope that sharing these experiments will help robotics engineers develop various commercial robotics products. For more details, you may contact the authors.
Authors
Sagar Dhatrak
Sagar Dhatrak completed his M.Sc. in electronics science in 2011 and submitted his Ph.D. thesis on monocular visual SLAM in 2021. He is currently a VSLAM specialist at Einfochips (an Arrow company), where he works on autonomous navigation of autonomous mobile robots using visual SLAM. He has worked on embedded systems and robotics-related projects for around six years.
Naitik Nakrani
Naitik Nakrani is currently a Solution Engineer at Einfochips (an Arrow company) in Ahmedabad. He holds a Ph.D. and has over nine years of R&D expertise in areas such as robotics, SLAM, navigation, control system design, AI/ML, signal processing, and computer vision. He is currently working on various PoCs for ROS/ROS 2-based algorithms to solve problems in AMR system design and development. He has experience with various embedded platforms, range sensors, vision sensors, and system profiling and benchmarking.