arXiv daily

Robotics (cs.RO)

Fri, 16 Jun 2023

1. Enabling BIM-Driven Robotic Construction Workflows with Closed-Loop Digital Twins

Authors: Xi Wang, Hongrui Yu, Wes McGee, Carol C. Menassa, Vineet R. Kamat

Abstract: Robots can greatly alleviate physical demands on construction workers while enhancing both the productivity and safety of construction projects. Leveraging a Building Information Model (BIM) offers a natural and promising approach to drive a robotic construction workflow. However, because of uncertainties inherent on construction sites, such as discrepancies between the designed and as-built workpieces, robots cannot solely rely on the BIM to guide field construction work. Human workers are adept at improvising alternative plans with their creativity and experience and thus can assist robots in overcoming uncertainties and performing construction work successfully. This research introduces an interactive closed-loop digital twin system that integrates a BIM into human-robot collaborative construction workflows. The robot is primarily driven by the BIM, but it adaptively adjusts its plan based on actual site conditions while the human co-worker supervises the process. If necessary, the human co-worker intervenes in the robot's plan by changing the task sequence or target position, requesting a new motion plan, or modifying the construction component(s)/material(s) to help the robot navigate uncertainties. To investigate the physical deployment of the system, a drywall installation case study is conducted with an industrial robotic arm in a laboratory. In addition, a block pick-and-place experiment is carried out to evaluate system performance. Integrating the flexibility of human workers and the autonomy and accuracy afforded by the BIM, the system significantly increases the robustness of construction robots in the performance of field construction work.
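
To make the closed-loop idea concrete, here is a minimal Python sketch (ours, not the authors' code; all object and method names are hypothetical) of a BIM-driven task loop in which perception of as-built conditions and optional human intervention precede execution:

def bim_driven_installation(bim_tasks, robot, scanner, human):
    """Hedged sketch of the closed-loop workflow: the BIM drives the nominal plan,
    perception reveals as-built deviations, and the human supervisor may revise
    the task before the robot executes it (all names are hypothetical)."""
    for task in bim_tasks:
        as_built = scanner.measure(task.target_region)    # observe actual site state
        plan = robot.plan_motion(task, as_built)           # adapt the BIM plan to reality
        if human.wants_to_intervene(task, plan):
            task, plan = human.revise(task, plan)          # resequence / retarget / replan
        robot.execute(plan)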

2. Modelling, identification and geometric control of autonomous quadcopters for agile maneuvering

Authors: Péter Antal, Tamás Péni, Roland Tóth

Abstract: This paper presents a multi-step procedure to construct the dynamic motion model of an autonomous quadcopter, identify the model parameters, and design a model-based nonlinear trajectory tracking controller. The aim of the proposed method is to speed up the commissioning of a new quadcopter design, i.e., to enable the drone to perform agile maneuvers with high precision in the shortest time possible. After a brief introduction to the theoretical background of the modelling and control design, the steps of the proposed method are presented using the example of a self-developed quadcopter platform. The performance of the method is tested and evaluated by real flight experiments.
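
For context, the rigid-body quadrotor model that such identification and geometric tracking-control pipelines are typically built on (a standard formulation, not necessarily the paper's exact one) is, in LaTeX notation:

m \ddot{p} = -m g e_3 + f R e_3
J \dot{\omega} = -\omega \times J \omega + \tau
\dot{R} = R \hat{\omega}

where p is the position, R \in SO(3) the attitude, \omega the body angular velocity, f the collective thrust, \tau the body torque, \hat{\cdot} the skew-symmetric (hat) map, and m, J the mass and inertia matrix that the parameter identification step must estimate.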

3. Synchronizing Machine Learning Algorithms, Realtime Robotic Control and Simulated Environment with o80

Authors: Vincent Berenz, Felix Widmaier, Simon Guist, Bernhard Schölkopf, Dieter Büchler

Abstract: Robotic applications require the integration of various modalities, encompassing perception, control of real robots and possibly the control of simulated environments. While state-of-the-art robotic software solutions such as ROS 2 provide most of the required features, flexible synchronization between algorithms, data streams and control loops can be tedious. o80 is a versatile C++ framework for robotics which provides a shared memory model and a command framework for real-time critical systems. It enables expert users to set up complex robotic systems and generate Python bindings for scientists. o80's unique feature is its flexible synchronization between processes, including traditional blocking commands and a novel "bursting mode", which allows user code to control the execution of the lower-level control loop. This makes it particularly useful for setups that mix real and simulated environments.
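
The "bursting mode" idea (user code explicitly granting the lower-level loop a fixed number of iterations and waiting for them to complete) can be illustrated with a generic Python sketch; this is not the o80 API, only a minimal mimic of the synchronization pattern described above:

import threading

class BurstingLoop:
    """Generic illustration: the backend control loop only advances when the
    frontend grants it a 'burst' of iterations (not the o80 API)."""
    def __init__(self):
        self._granted = threading.Semaphore(0)   # iterations the backend may run
        self._done = threading.Semaphore(0)      # iterations completed

    def backend_step(self, controller):
        self._granted.acquire()       # block until the frontend grants one iteration
        controller.step()             # hypothetical: read sensors, compute and apply command
        self._done.release()

    def burst(self, n):
        """Frontend: let the backend run exactly n control iterations, then wait."""
        for _ in range(n):
            self._granted.release()
        for _ in range(n):
            self._done.acquire()

In a real setup the backend step would be called repeatedly by the process owning the robot or simulator, so a frontend call such as loop.burst(10) advances that process by exactly ten control cycles.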

4. A Signal Temporal Logic Planner for Ergonomic Human-Robot Collaboration

Authors: Giuseppe Silano, Amr Afifi, Martin Saska, Antonio Franchi

Abstract: This paper proposes a method for designing human-robot collaboration tasks and generating corresponding trajectories. The method uses high-level specifications, expressed as a Signal Temporal Logic (STL) formula, to automatically synthesize task assignments and trajectories. To illustrate the approach, we focus on a specific task: a multi-rotor aerial vehicle performing object handovers in a power line setting. The motion planner considers limitations, such as payload capacity and recharging constraints, while ensuring that the trajectories are feasible. Additionally, the method enables users to specify robot behaviors that take into account human comfort (e.g., ergonomics, preferences) while using high-level goals and constraints. The approach is validated through numerical analyses in MATLAB and realistic Gazebo simulations using a mock-up scenario.
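
As a small illustration of how an STL requirement can be evaluated quantitatively (our sketch; the handover predicate, distances and horizon are assumptions, not taken from the paper), the robustness of an "eventually within [0, T]" specification over a sampled trajectory is simply the maximum predicate margin in that window:

import numpy as np

def robustness_eventually(signal, predicate, t_start, t_end):
    """Quantitative semantics of 'eventually_[t_start, t_end] predicate' over a
    sampled signal: the maximum predicate margin within the window.
    Positive robustness means the STL requirement is satisfied."""
    window = signal[t_start:t_end + 1]
    return max(predicate(x) for x in window)

# Hypothetical example: the aerial vehicle must come within 0.5 m of the
# handover point at some time step in [0, 50].
handover_point = np.array([3.0, 1.0, 2.0])
trajectory = np.random.rand(60, 3) * 4.0            # placeholder trajectory samples
margin = lambda p: 0.5 - np.linalg.norm(p - handover_point)
print(robustness_eventually(trajectory, margin, 0, 50))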

5. Efficient Search and Detection of Relevant Plant Parts using Semantics-Aware Active Vision

Authors: Akshay K. Burusa, Joost Scholten, David Rapado Rincon, Xin Wang, Eldert J. van Henten, Gert Kootstra

Abstract: To automate harvesting and de-leafing of tomato plants using robots, it is important to search and detect the relevant plant parts, namely tomatoes, peduncles, and petioles. This is challenging due to high levels of occlusion in tomato greenhouses. Active vision is a promising approach which helps robots to deliberately plan camera viewpoints to overcome occlusion and improve perception accuracy. However, current active-vision algorithms cannot differentiate between relevant and irrelevant plant parts, making them inefficient for targeted perception of specific plant parts. We propose a semantic active-vision strategy that uses semantic information to identify the relevant plant parts and prioritises them during view planning using an attention mechanism. We evaluated our strategy using 3D models of tomato plants with varying structural complexity, which closely represented occlusions in the real world. We used a simulated environment to gain insights into our strategy, while ensuring repeatability and statistical significance. At the end of ten viewpoints, our strategy was able to correctly detect 85.5% of the plant parts, about 4 parts more on average per plant compared to a volumetric active-vision strategy. Also, it detected 5 and 9 parts more compared to two predefined strategies and 11 parts more compared to a random strategy. It also performed reliably with a median of 88.9% correctly-detected objects per plant in 96 experiments. Our strategy was also robust to uncertainty in plant and plant-part position, plant complexity, and different viewpoint sampling strategies. We believe that our work could significantly improve the speed and robustness of automated harvesting and de-leafing in tomato crop production.
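
A hedged sketch of what attention-driven view scoring can look like (our illustration; the function and weighting scheme are hypothetical, not the paper's exact formulation): each candidate viewpoint is scored by the voxels it is expected to reveal, weighted by semantic relevance and attention, and the best-scoring view is selected next.

import numpy as np

def viewpoint_utility(expected_visible_voxels, semantic_relevance, attention):
    """Hypothetical score for a candidate viewpoint: newly observable voxels,
    weighted by the semantic relevance of the plant part each voxel belongs to
    and by an attention weight that prioritises target parts."""
    return float(np.sum(expected_visible_voxels * semantic_relevance * attention))

def select_next_view(candidates):
    # candidates: list of (viewpoint_pose, visibility_mask, relevance, attention)
    scores = [viewpoint_utility(v, r, a) for _, v, r, a in candidates]
    return candidates[int(np.argmax(scores))][0]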

6. Pose Graph Optimization for a MAV Indoor Localization Fusing 5GNR TOA with an IMU

Authors: Meisam Kabiri, Claudio Cimarelli, Hriday Bavle, Jose Luis Sanchez-Lopez, Holger Voos

Abstract: This paper explores the potential of 5G new radio (NR) Time-of-Arrival (TOA) data for indoor drone localization under different scenarios and conditions when fused with inertial measurement unit (IMU) data. Our approach performs graph-based optimization to estimate the drone's position and orientation from the multiple sensor measurements. Due to the lack of real-world data, we use the MATLAB 5G Toolbox and the QuaDRiGa (quasi-deterministic radio channel generator) channel simulator to generate TOA measurements for the EuRoC MAV indoor dataset, which provides IMU readings and ground-truth 6DoF poses of a flying drone. We create twelve sequences by combining three predefined indoor scenario setups of QuaDRiGa with 2 to 5 base station antennas. Experimental results demonstrate that, for a sufficient number of base stations and a high-bandwidth 5G configuration, the pose graph optimization approach achieves accurate drone localization, with an average error of less than 15 cm over the whole trajectory. Furthermore, the adopted graph-based optimization algorithm is fast and can be easily implemented for onboard real-time pose tracking on a micro aerial vehicle (MAV).
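
A minimal sketch of the kind of measurement residual such a fusion uses (ours, assuming a standard range factor; the paper's exact factor formulation may differ): a 5G NR TOA measurement constrains the distance between the estimated drone position and a known base station antenna.

import numpy as np

def toa_residual(drone_position, base_station, measured_range):
    """Range residual for one 5G NR TOA measurement: predicted distance between
    the estimated drone position and a fixed base station antenna, minus the
    measured range (speed of light times time-of-arrival)."""
    predicted = np.linalg.norm(drone_position - base_station)
    return predicted - measured_range

# In a pose-graph back end, residuals like this are stacked with IMU factors
# and minimised over the whole trajectory, e.g. with Gauss-Newton or
# Levenberg-Marquardt.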

7. Can robots mold soft plastic materials by shaping depth images?

Authors: Ege Gursoy, Sonny Tarbouriech, Andrea Cherubini

Abstract: Can robots mold soft plastic materials by shaping depth images? The short answer is no: current-day robots can't. In this article, we address the problem of shaping plastic material with an anthropomorphic arm/hand robot, which observes the material with a fixed depth camera. Robots capable of molding could assist humans in many tasks, such as cooking, scooping or gardening. Yet, the problem is complex, due to its high dimensionality at both the perception and control levels. To address it, we design three alternative data-based methods for predicting the effect of robot actions on the material. The robot can then plan the sequence of actions and their positions to mold the material into a desired shape. To make the prediction problem tractable, we rely on two original ideas. First, we prove that under reasonable assumptions, the shaping problem can be mapped from point cloud to depth image space, with many benefits (simpler processing, no need for registration, lower computation time and memory requirements). Second, we design a novel, simple metric for quickly measuring the distance between two depth images. The metric is based on the inherent point cloud representation of depth images, which enables direct and consistent comparison of image pairs through a non-uniform scaling approach, and therefore opens promising perspectives for designing depth image-based robot controllers. We assess our approach in a series of unprecedented experiments, where a robotic arm/hand molds flour from initial to final shapes, either with its own dataset or by transfer learning from a human dataset. We conclude the article by discussing the limitations of our framework and those of current-day hardware, which make human-like robot molding a challenging open research problem.
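
To illustrate the "inherent point cloud representation of depth images" that the metric builds on, here is a naive Python sketch (ours; the intrinsics and the per-pixel comparison are assumptions, and the paper's non-uniform scaling is not reproduced):

import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image into its inherent point cloud using the
    pinhole camera model (intrinsics fx, fy, cx, cy)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

def naive_depth_distance(depth_a, depth_b, intrinsics):
    """Naive per-pixel point-cloud distance between two registered depth images;
    the paper's metric additionally applies a non-uniform scaling, which is not
    reproduced here."""
    pa = depth_to_points(depth_a, *intrinsics)
    pb = depth_to_points(depth_b, *intrinsics)
    return float(np.mean(np.linalg.norm(pa - pb, axis=1)))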

8. Actor-Critic Model Predictive Control

Authors: Angel Romero, Yunlong Song, Davide Scaramuzza

Abstract: Despite its success, Model Predictive Control (MPC) often requires intensive task-specific engineering and tuning. On the other hand, Reinforcement Learning (RL) architectures minimize this effort, but need extensive data collection and lack interpretability and safety. An open research question is how to combine the advantages of RL and MPC to exploit the best of both worlds. This paper introduces a novel modular RL architecture that bridges these two approaches. By placing a differentiable MPC at the heart of an actor-critic RL agent, the proposed system enables short-term predictions and optimization of actions based on system dynamics, while retaining the end-to-end training benefits and exploratory behavior of an RL agent. The proposed approach effectively handles two different time-horizon scales: short-term decisions managed by the actor MPC and long-term ones managed by the critic network. This provides a promising direction for RL, which combines the advantages of model-based and end-to-end learning methods. We validate the approach in simulated and real-world experiments on a quadcopter platform performing different high-level tasks, and show that the proposed method can learn complex behaviours end-to-end while retaining the properties of an MPC.
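
A schematic PyTorch-style sketch of the described architecture (ours, not the authors' implementation; mpc_layer, the cost parameterization and the network sizes are assumptions): the differentiable MPC plays the actor for short-horizon action selection, while a value network plays the critic for long-horizon return.

import torch
import torch.nn as nn

class ActorCriticMPC(nn.Module):
    """Schematic sketch: a differentiable MPC acts as the actor for short-horizon
    decisions, while a value network (critic) captures long-horizon return for
    end-to-end actor-critic training."""
    def __init__(self, mpc_layer, obs_dim):
        super().__init__()
        self.mpc = mpc_layer                               # differentiable MPC solver (hypothetical callable)
        self.cost_params = nn.Parameter(torch.zeros(8))    # learnable MPC cost weights
        self.critic = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, obs):
        action = self.mpc(obs, self.cost_params)   # short term: solve MPC over a few steps
        value = self.critic(obs)                   # long term: estimated return
        return action, value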

9. Learning to Summarize and Answer Questions about a Virtual Robot's Past Actions

Authors: Chad DeChant, Iretiayo Akinola, Daniel Bauer

Abstract: When robots perform long action sequences, users will want to easily and reliably find out what they have done. We therefore demonstrate the task of learning to summarize and answer questions about a robot agent's past actions using natural language alone. A single system with a large language model at its core is trained to both summarize and answer questions about action sequences given ego-centric video frames of a virtual robot and a question prompt. To enable training of question answering, we develop a method to automatically generate English-language questions and answers about objects, actions, and the temporal order in which actions occurred during episodes of robot action in the virtual environment. Training one model to both summarize and answer questions enables zero-shot transfer of representations of objects learned through question answering to improved action summarization.
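
In the spirit of the automatic question generation described above, a hypothetical template-based sketch (ours; the episode format and the template are assumptions) could turn a logged action sequence into temporal-order questions with known answers:

import random

def generate_temporal_qa(episode):
    """Hypothetical generator: from a logged episode (list of (action, object)
    steps), produce a question about temporal order and its ground-truth answer."""
    i, j = sorted(random.sample(range(len(episode)), 2))
    (act_i, obj_i), (act_j, obj_j) = episode[i], episode[j]
    question = f"Did the robot {act_i} the {obj_i} before it {act_j} the {obj_j}?"
    return question, "yes"   # answer is 'yes' by construction since i < j

# Example call:
# generate_temporal_qa([("pick up", "red block"), ("place", "red block"), ("push", "bowl")])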

10. Tactile-Reactive Roller Grasper

Authors: Shenli Yuan, Shaoxiong Wang, Radhen Patel, Megha Tippur, Connor Yako, Edward Adelson, Kenneth Salisbury

Abstract: Manipulation of objects within a robot's hand is one of the most important challenges in achieving robot dexterity. The "Roller Graspers" are a family of non-anthropomorphic hands that use motorized, rolling fingertips to achieve in-hand manipulation. These graspers manipulate grasped objects by commanding the rollers to exert forces that propel the object in the desired motion directions. In this paper, we explore the possibility of robot in-hand manipulation through tactile-guided rolling. We do so by developing the Tactile-Reactive Roller Grasper (TRRG), which incorporates camera-based tactile sensing with compliant, steerable cylindrical fingertips, together with accompanying sensor information processing and control strategies. We demonstrated that the combination of tactile feedback and actively rolling surfaces enables a variety of robust in-hand manipulation applications. In addition, we demonstrated object reconstruction techniques using tactile-guided rolling. A controlled experiment was conducted to provide insight into the benefits of tactile-reactive rollers for manipulation. We considered two manipulation cases: when the fingers manipulate purely through rolling, and when they periodically break and reestablish contact, as in regrasping. We found that tactile-guided rolling can improve manipulation robustness by allowing the grasper to perform necessary fine grip adjustments in both cases, indicating that hybrid rolling-fingertip and finger-gaiting designs may be a promising research direction.

11. Robot Learning with Sensorimotor Pre-training

Authors: Ilija Radosavovic, Baifeng Shi, Letian Fu, Ken Goldberg, Trevor Darrell, Jitendra Malik

Abstract: We present a self-supervised sensorimotor pre-training approach for robotics. Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens. Given a sequence of camera images, proprioceptive robot states, and past actions, we encode the interleaved sequence into tokens, mask out a random subset, and train a model to predict the masked-out content. We hypothesize that if the robot can predict the missing content, it has acquired a good model of the physical world that can enable it to act. RPT is designed to operate on latent visual representations, which makes prediction tractable, enables scaling to 10x larger models, and supports 10 Hz inference on a real robot. To evaluate our approach, we collect a dataset of 20,000 real-world trajectories over 9 months using a combination of motion planning and model-based grasping algorithms. We find that pre-training on this data consistently outperforms training from scratch, leads to 2x improvements in the block stacking task, and has favorable scaling properties.
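
A minimal PyTorch sketch of the mask-and-reconstruct pre-training objective described above (ours, not the released RPT code; the tokenization, model sizes and mask ratio are assumptions):

import torch
import torch.nn as nn

class MaskedSensorimotorModel(nn.Module):
    """Minimal sketch of masked sensorimotor pre-training: interleaved
    image/proprioception/action tokens are corrupted by replacing a random
    subset with a learned mask token, and a Transformer is trained to
    reconstruct the masked-out tokens."""
    def __init__(self, token_dim=256, n_layers=4, n_heads=8):
        super().__init__()
        self.mask_token = nn.Parameter(torch.zeros(token_dim))
        layer = nn.TransformerEncoderLayer(token_dim, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(token_dim, token_dim)

    def forward(self, tokens, mask_ratio=0.5):
        # tokens: (batch, sequence, token_dim) interleaved sensorimotor tokens
        mask = torch.rand(tokens.shape[:2], device=tokens.device) < mask_ratio
        corrupted = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)
        pred = self.head(self.encoder(corrupted))
        loss = ((pred - tokens) ** 2)[mask].mean()   # reconstruct only the masked tokens
        return loss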