Scene understanding is a crucial factor in the development of robots that can act effectively in uncontrolled, dynamic, unstructured, and unknown environments, such as those found in real-world scenarios. In this context, a higher-level understanding of the scene (e.g., identifying and localizing objects and people) is usually required to perform effective navigation and perception tasks.

Here, we propose an open framework for building hybrid maps that combine environment structure (a metric map) and environment semantics (object classes) to support autonomous robot perception and navigation tasks.
We detect and model objects in the environment from RGB-D images, using convolutional neural networks to capture higher-level information. Finally, the metric map is augmented with the extracted semantic information, i.e., the object categories.
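To make the augmentation step concrete, the sketch below (Python, illustrative only and not our implementation) back-projects the center of a detected bounding box to a 3D point with a pinhole camera model and quantizes it into a 2D grid cell where the predicted class label could be stored. The intrinsics, the grid resolution, and the function name detection_to_map_cell are assumptions.

    # Minimal sketch, not the actual implementation: lift a 2D detection to 3D
    # and map it to a grid cell. Intrinsics and resolution are placeholders.
    import numpy as np

    FX, FY, CX, CY = 570.3, 570.3, 319.5, 239.5   # assumed pinhole intrinsics
    GRID_RES = 0.05                               # assumed map resolution (m/cell)

    def detection_to_map_cell(bbox, depth_image, world_T_cam):
        """bbox: (u_min, v_min, u_max, v_max) in pixels;
        depth_image: HxW depth in meters;
        world_T_cam: 4x4 camera-to-map transform (e.g., from odometry/TF)."""
        u = (bbox[0] + bbox[2]) // 2              # pixel at the box center
        v = (bbox[1] + bbox[3]) // 2
        z = float(depth_image[v, u])
        if not np.isfinite(z) or z <= 0.0:
            return None                           # no valid depth reading
        # Back-project the pixel into the camera frame (pinhole model).
        p_cam = np.array([(u - CX) * z / FX, (v - CY) * z / FY, z, 1.0])
        p_map = world_T_cam @ p_cam               # express the point in the map frame
        # Quantize x, y into the grid cell that would receive the class label.
        return int(p_map[0] / GRID_RES), int(p_map[1] / GRID_RES)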

We also make available datasets containing robot sensor data in the form of rosbag files, allowing experiments to be reproduced on machines without access to the physical robot and its sensors.

 

DOI

Coming Soon.

ArXiv

Methodology and Visual Results

 

Publications


[LARS 2018] Dhiego Bersan, Renato Martins, Mario Campos and Erickson R. Nascimento. Semantic Map Augmentation for Robot Navigation: A Learning Approach Based on Visual and Depth Data. IEEE Latin American Robotics Symposium (LARS), 2018.
Visit the page for more information and paper access.

 

Datasets


We make available two datasets for offline experiments, i.e., experiments that do not require a physical robot. They contain raw sensor streams recorded from the robot using the rosbag toolkit. They are provided with a bash script, play.sh, which runs the rosbag play command with the configurations used in our experiments (playback rate, simulated time, topic name remapping, static transforms, etc.) to stream the recorded data.
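For fully offline processing, the bags can also be read directly with the rosbag Python API, without roscore or play.sh. The snippet below is only a sketch: the file name dataset1.bag and the topic name /scan are assumptions, and the actual names may differ (see the repository Readme).

    # Sketch: iterate over the recorded laser scans without replaying the bag.
    # 'dataset1.bag' and '/scan' are assumed names; adjust to the actual files.
    import rosbag

    with rosbag.Bag('dataset1.bag') as bag:
        for topic, msg, t in bag.read_messages(topics=['/scan']):
            # msg is a sensor_msgs/LaserScan recorded at ~15 Hz
            print('%.3f: %d beams' % (t.to_sec(), len(msg.ranges)))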

The data were recorded in the following format:

RGB-D: Both RGB and depth images have 640 × 480 resolution, 0.6m to 8.0m depth range and 60° horizontal × 49.5° vertical field of view. The camera used is the Orbbec Astra, throttled to 15 Hz to reduce storage size.

Laser Scan: 180° scanning range with 0.36° angular resolution, and 0.02m to 5.6m measurement range. The data were recorded using a Hokuyo URG-04LX-UG01 at 15 Hz.

Odometry: Provided by the Kobuki base at 20 Hz. Maximum linear speed: 0.57m/s; maximum angular speed: 1.07rad/s.
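When the streams are replayed with play.sh, they can be consumed by a regular ROS node; RGB and depth frames can be paired with an approximate time synchronizer. The sketch below is illustrative only, and the topic names are assumptions that depend on the remappings actually applied by the script.

    # Sketch of a consumer node for the replayed streams. All topic names are
    # assumptions; check the remappings in play.sh for the real ones.
    import rospy
    import message_filters
    from sensor_msgs.msg import Image, LaserScan
    from nav_msgs.msg import Odometry

    def rgbd_callback(rgb, depth):
        rospy.loginfo('RGB-D pair at t=%.3f', rgb.header.stamp.to_sec())

    def scan_callback(scan):
        rospy.loginfo('Laser scan with %d beams', len(scan.ranges))

    def odom_callback(odom):
        p = odom.pose.pose.position
        rospy.loginfo('Odometry x=%.2f y=%.2f', p.x, p.y)

    rospy.init_node('dataset_consumer')
    # Pair RGB and depth images whose timestamps are within 50 ms of each other.
    rgb_sub = message_filters.Subscriber('/camera/rgb/image_raw', Image)
    depth_sub = message_filters.Subscriber('/camera/depth/image_raw', Image)
    sync = message_filters.ApproximateTimeSynchronizer([rgb_sub, depth_sub],
                                                       queue_size=10, slop=0.05)
    sync.registerCallback(rgbd_callback)
    rospy.Subscriber('/scan', LaserScan, scan_callback)
    rospy.Subscriber('/odom', Odometry, odom_callback)
    rospy.spin()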

Download Dataset 1

Download Dataset 2

More information about the dataset files and their usage can be found in the repository Readme.

 

Source Code


The source code is publicly available through our Git repository.

Source Code

 

Acknowledgment


This project is supported by FAPEMIG.
 

Team



Dhiego Bersan

Undergraduate Student

Renato José Martins

Post-doctoral Researcher