Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives

A new ICRA 2019 paper using PyBullet:

Abhik Singla, Shounak Bhattacharya and Dhaivat Dholakiya are with the Robert Bosch Centre for Cyber-Physical Systems, IISc, Bangalore, India.

Humans and animals are believed to use a very minimal set of trajectories to perform a wide variety of tasks, including walking. Our main objective in this paper is twofold: 1) obtain an effective tool to realize these basic motion patterns for quadrupedal walking, called kinematic motion primitives (kMPs), via trajectories learned from deep reinforcement learning (D-RL), and 2) realize a set of behaviors, namely trot, walk, gallop and bound, from these kinematic motion primitives in our custom four-legged robot, called the ‘Stoch’. D-RL is a data-driven approach that has been shown to be very effective for realizing all kinds of robust locomotion behaviors, both in simulation and in experiment. On the other hand, kMPs are known to capture the underlying structure of walking and yield a set of derived behaviors. We first generate walking gaits from D-RL, which uses policy-gradient-based approaches. We then analyze the resulting walking using principal component analysis (PCA). We observe that the kMPs extracted via PCA follow a similar pattern irrespective of the type of gait generated. Leveraging this underlying structure, we then realize walking in Stoch by a straightforward reconstruction of joint trajectories from kMPs. This methodology improves the transferability of these gaits to real hardware, lowers the on-board computational overhead, and avoids multiple training iterations by generating a set of derived behaviors from a single learned gait.

See also https://arxiv.org/abs/1810.03842 and a video here: https://www.youtube.com/watch?v=kiLKSqI4KhE&feature=youtu.be
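For readers who want to experiment with the idea, here is a minimal sketch of kMP extraction via PCA, assuming the learned gait is stored as one trajectory per joint over a gait cycle; the function names, array shapes, and number of primitives kept are illustrative assumptions, not the authors' code.

import numpy as np

def extract_kmps(joint_trajectories, n_primitives=4):
    # joint_trajectories: (n_joints, n_timesteps), one gait cycle per joint.
    mean = joint_trajectories.mean(axis=0)
    centered = joint_trajectories - mean
    # PCA via SVD: the leading right singular vectors are basis trajectories,
    # i.e. the kinematic motion primitives.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    kmps = vt[:n_primitives]          # (n_primitives, n_timesteps)
    weights = centered @ kmps.T       # per-joint weight of each primitive
    return mean, kmps, weights

def reconstruct_gait(mean, kmps, weights):
    # A rank-n_primitives reconstruction approximates the original gait.
    return mean + weights @ kmps

# Usage with stand-in data: 8 actuated joints, 200 samples per gait cycle.
traj = np.random.randn(8, 200)        # replace with a learned D-RL gait
mean, kmps, weights = extract_kmps(traj)
approx = reconstruct_gait(mean, kmps, weights)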

Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning

A new ICLR 2019 paper using PyBullet:

Michael Lutter, Christian Ritter & Jan Peters, Department of Computer Science, Technische Universität Darmstadt

Deep learning has achieved astonishing results on many tasks with large amounts of data and generalization within the proximity of training data. For many important real-world applications, these requirements are infeasible and additional prior knowledge on the task domain is required to overcome the resulting problems. In particular, learning physics models for model-based control requires robust extrapolation from fewer samples – often collected online in real-time – and model errors may lead to drastic damage to the system.
Directly incorporating physical insight has enabled us to obtain a novel deep model-learning approach that extrapolates well while requiring fewer samples. As a first example, we propose Deep Lagrangian Networks (DeLaN), a deep network structure upon which Lagrangian mechanics have been imposed. DeLaN can efficiently learn the equations of motion of a mechanical system (i.e., system dynamics) with a deep network while ensuring physical plausibility.
The resulting DeLaN network performs very well at robot tracking control. The proposed method not only outperformed previous model learning approaches in learning speed but also exhibits substantially improved and more robust extrapolation to novel trajectories, and it learns online in real time.

See also https://openreview.net/forum?id=BklHpjCqKm
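To make the structural prior concrete, below is a heavily simplified sketch of the DeLaN idea in PyTorch, not the authors' implementation: a network predicts a Cholesky factor of the mass matrix and the gravity torques, and the Euler-Lagrange equations then yield inverse dynamics differentiably. Layer sizes and names are illustrative assumptions.

import torch
import torch.nn as nn

class DeLaNSketch(nn.Module):
    def __init__(self, n_dof, hidden=64):
        super().__init__()
        self.n_dof = n_dof
        self.backbone = nn.Sequential(
            nn.Linear(n_dof, hidden), nn.Softplus(),
            nn.Linear(hidden, hidden), nn.Softplus())
        self.l_head = nn.Linear(hidden, n_dof * (n_dof + 1) // 2)  # entries of L(q)
        self.g_head = nn.Linear(hidden, n_dof)                     # gravity torques g(q)

    def mass_matrix(self, q):
        # Predict a lower-triangular L(q); H = L L^T is positive definite by construction.
        L = q.new_zeros(q.shape[0], self.n_dof, self.n_dof)
        rows, cols = torch.tril_indices(self.n_dof, self.n_dof)
        L[:, rows, cols] = self.l_head(self.backbone(q))
        d = torch.arange(self.n_dof)
        # Pass the diagonal through softplus to keep it strictly positive.
        L = L - torch.diag_embed(L[:, d, d]) + torch.diag_embed(nn.functional.softplus(L[:, d, d]))
        return L @ L.transpose(1, 2)

    def inverse_dynamics(self, q, qd, qdd):
        # Euler-Lagrange: tau = d/dt(dT/dqd) - dT/dq + g, with T = 1/2 qd^T H(q) qd.
        q = q.detach().requires_grad_(True)   # take d/dq locally via autograd
        H = self.mass_matrix(q)
        T = 0.5 * torch.einsum('bi,bij,bj->b', qd, H, qd).sum()
        dTdq = torch.autograd.grad(T, q, create_graph=True)[0]
        Hqd = torch.einsum('bij,bj->bi', H, qd)
        # d/dt(H qd) = H qdd + (d(H qd)/dq) qd; one autograd pass per degree of freedom.
        dHqd_dq = torch.stack(
            [torch.autograd.grad(Hqd[:, i].sum(), q, create_graph=True)[0]
             for i in range(self.n_dof)], dim=1)
        g = self.g_head(self.backbone(q))
        return (torch.einsum('bij,bj->bi', H, qdd)
                + torch.einsum('bij,bj->bi', dHqd_dq, qd) - dTdq + g)

Training would then regress inverse_dynamics(q, qd, qdd) against measured torques, so the learned dynamics stay physically plausible by construction.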

Comparing Task Simplifications to Learn Closed-Loop Object Picking Using Deep Reinforcement Learning

A new paper using PyBullet from ETH Zurich (Michel Breyer, Fadri Furrer, Tonci Novkovic, Roland Siegwart, and Juan Nieto)

Enabling autonomous robots to interact in unstructured environments with dynamic objects requires manipulation capabilities that can deal with clutter, changes, and objects’ variability. This paper presents a comparison of different reinforcement-learning-based approaches for object picking with a robotic manipulator. We learn closed-loop policies mapping depth camera inputs to motion commands and compare different approaches to keep the problem tractable, including reward shaping, curriculum learning, and using a policy pre-trained on a task with a reduced action set to warm-start the full problem. For efficient and more flexible data collection, we train in simulation and transfer the policies to a real robot. We show that using curriculum learning, policies learned with a sparse reward formulation can be trained at rates similar to those with a shaped reward, and that these policies achieve success rates comparable to the policy initialized on the simplified task. We could successfully transfer these policies to the real robot with only minor modifications to the depth-image filtering. We found that using a heuristic to warm-start the training was useful to enforce desired behavior, while the policies trained from scratch using a curriculum learned to cope better with unseen scenarios where objects are removed.

See also https://arxiv.org/abs/1803.04996 and a video here: https://www.youtube.com/watch?v=ii16Zejmf-E&feature=youtu.be
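As a rough illustration of the curriculum idea, the sketch below grows the difficulty of the picking task as the measured success rate improves; the parameter names, window size, and thresholds are illustrative assumptions, not the paper's exact schedule.

class PickingCurriculum:
    # Hypothetical curriculum: episodes start easy (gripper close to the
    # object, small workspace) and expand toward the full task.
    def __init__(self, success_threshold=0.7, window=100):
        self.difficulty = 0.0                 # 0 = easiest, 1 = full task
        self.success_threshold = success_threshold
        self.window = window
        self.results = []

    def episode_params(self):
        # Interpolate episode settings between easy and full-task values.
        return {
            "start_height": 0.10 + 0.20 * self.difficulty,      # meters, illustrative
            "workspace_radius": 0.10 + 0.20 * self.difficulty,  # meters, illustrative
        }

    def report(self, success):
        self.results.append(float(success))
        if len(self.results) >= self.window:
            rate = sum(self.results) / len(self.results)
            # With a sparse reward, only advance once the agent reliably succeeds.
            if rate > self.success_threshold:
                self.difficulty = min(1.0, self.difficulty + 0.1)
            self.results = []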

PyBullet and “Sim-to-Real: Learning Agile Locomotion For Quadruped Robots”

PyBullet receives regular updates; you can see the latest version here: https://pypi.org/project/pybullet
Installation and update is simple:
pip install -U pybullet

Check out the PyBullet Quickstart Guide and clone the GitHub repository for more PyBullet examples and OpenAI Gym environments.
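As a quick smoke test of the installation, the following minimal script loads a plane and a robot from the bundled pybullet_data and steps the simulation:

import pybullet as p
import pybullet_data

p.connect(p.GUI)  # use p.DIRECT for headless operation
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")
robot = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 0.5])

for _ in range(240):  # one simulated second at the default 240 Hz
    p.stepSimulation()

print(p.getBasePositionAndOrientation(robot))
p.disconnect()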

A while ago, our RSS 2018 paper “Sim-to-Real: Learning Agile Locomotion For Quadruped Robots” was accepted (with Jie Tan, Tingnan Zhang, Erwin Coumans, Atil Iscen, Yunfei Bai, Danijar Hafner, Steven Bohez, and Vincent Vanhoucke).

See also the video and paper on Arxiv.

Erwin @ twitter

DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills

An excellent SIGGRAPH 2018 paper using Bullet Physics to simulate physics-based character locomotion, by Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel van de Panne.

See https://xbpeng.github.io/projects/DeepMimic/index.html

Update: there is also an implementation using PyBullet:

pip3 install pybullet
python3 -m pybullet_envs.deep_mimic.testrl --arg_file run_humanoid3d_backflip_args.txt

Gibson Env: Real-World Perception for Embodied Agents

The Gibson project, by Stanford University AI Lab uses PyBullet:

Perception and being active (i.e. having a certain level of motion freedom) are closely tied. Learning active perception and sensorimotor control in the physical world is cumbersome, as existing algorithms are too slow to efficiently learn in real-time and robots are fragile and costly. This has given rise to learning in simulation, which in turn raises the question of how to transfer to the real world. In this paper, we study learning perception for active agents in the real world, propose a virtual environment for this purpose, and demonstrate complex learned locomotion abilities. The primary characteristics of the learning environments, which transfer into the trained agents, are I) being from the real world and reflecting its semantic complexity, II) having a mechanism that removes the need for further domain adaptation prior to deploying the results in the real world, and III) embodiment of the agent, making it subject to constraints of space and physics.

See also http://gibsonenv.stanford.edu/method/

Rocket League using Bullet Physics in Unreal Engine 3

Rocket League integrated Bullet Physics into Unreal Engine 3 for deterministic networked physics simulation.

‘Rocket League’ owes a large part of its success to how fun it is to play for people of all ages. But why does the game feel so good? One of Psyonix’s main programmers, Jared Cone, will take session attendees through an inside look at the specific game design decisions and implementation details that made ‘Rocket League’ and its networked physics the success that it is today!

See the GDC slides here: https://www.gdcvault.com/play/1024972/It-IS-Rocket-Science-The

Sim2Real View Invariant Visual Servoing by Recurrent Control

A paper from Google Brain Robotics using PyBullet:

Humans are remarkably proficient at controlling their limbs and tools from a wide range of viewpoints and angles, even in the presence of optical distortions. In robotics, this ability is referred to as visual servoing: moving a tool or end-point to a desired location using primarily visual feedback. In this paper, we study how viewpoint-invariant visual servoing skills can be learned automatically in a robotic manipulation scenario. To this end, we train a deep recurrent controller that can automatically determine which actions move the end-point of a robotic arm to a desired object. The problem that must be solved by this controller is fundamentally ambiguous: under severe variation in viewpoint, it may be impossible to determine the actions in a single feedforward operation. Instead, our visual servoing system must use its memory of past movements to understand how the actions affect the robot motion from the current viewpoint, correcting mistakes and gradually moving closer to the target. This ability is in stark contrast to most visual servoing methods, which either assume known dynamics or require a calibration phase. We show how we can learn this recurrent controller using simulated data and a reinforcement learning objective. We then describe how the resulting model can be transferred to a real-world robot by disentangling perception from control and only adapting the visual layers. The adapted model can servo to previously unseen objects from novel viewpoints on a real-world Kuka IIWA robotic arm.

Youtube video: https://www.youtube.com/watch?v=oLgM2Bnb7fo
Arxiv paper: https://arxiv.org/abs/1712.07642
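In the same spirit (though not the authors' architecture), a recurrent visual-servoing policy can be sketched as a CNN encoder feeding an LSTM whose hidden state accumulates evidence about the unknown viewpoint; layer sizes and names below are illustrative assumptions.

import torch
import torch.nn as nn

class RecurrentServoPolicy(nn.Module):
    def __init__(self, action_dim=3, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(          # illustrative conv stack for 84x84 RGB
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten())
        self.lstm = nn.LSTM(64 * 7 * 7, hidden, batch_first=True)
        self.action_head = nn.Linear(hidden, action_dim)

    def forward(self, frames, state=None):
        # frames: (batch, time, 3, 84, 84). The LSTM's memory of past
        # observations is what disambiguates the viewpoint over time.
        b, t = frames.shape[:2]
        feats = self.encoder(frames.reshape(b * t, *frames.shape[2:]))
        out, state = self.lstm(feats.reshape(b, t, -1), state)
        return self.action_head(out), state   # one action per timestep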

Server Migration

The Bullet website and forum have just migrated to a new server. Things are under construction, and http://pybullet.org and http://bulletphysics.org will be merged into one.

Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation

A new paper from Google Brain and X using PyBullet:

Learning-based approaches to robotic manipulation are limited by the scalability of data collection and accessibility of labels. In this paper, we present a multi-task domain adaptation framework for instance grasping in cluttered scenes by utilizing simulated robot experiments. Our neural network takes monocular RGB images and the instance segmentation mask of a specified target object as inputs, and predicts the probability of successfully grasping the specified object for each candidate motor command. The proposed transfer learning framework trains a model for instance grasping in simulation and uses a domain-adversarial loss to transfer the trained model to real robots using indiscriminate grasping data, which is available both in simulation and the real world. We evaluate our model in real-world robot experiments, comparing it with alternative model architectures as well as an indiscriminate grasping baseline.

See also https://sites.google.com/corp/view/multi-task-domain-adaptation and
https://arxiv.org/abs/1710.06422
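A common way to implement such a domain-adversarial loss is a gradient reversal layer (after Ganin et al.); the sketch below shows that generic construction, not the paper's exact network, and the classifier and label convention are illustrative assumptions.

import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; reverses (and scales) gradients in the
    # backward pass, so the feature extractor learns domain-invariant features.
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def domain_adversarial_loss(features, domain_labels, domain_classifier, lam=1.0):
    # features: shared features from sim and real batches; domain_labels: 0 = sim, 1 = real.
    logits = domain_classifier(GradReverse.apply(features, lam))
    return nn.functional.binary_cross_entropy_with_logits(
        logits.squeeze(-1), domain_labels.float())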

Home of Bullet and PyBullet: physics simulation for games, visual effects, robotics and reinforcement learning.