I am experimenting with reinforcement learning in pybullet. It worked well on the ant model, but that model is made extra easy by the limits on the motor angles. For example, if the policy outputs zero torques, the ant does not collapse, because the motor limits hold it up.
I modified the ant model, so each leg has three joints. I could barely train it by starting with restrictive limits and gradually relaxing them. Even so, I could not remove the limits completely and the gait looks ugly.
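To make the "gradually relaxing" part concrete, here is a minimal sketch of the kind of schedule I mean. The function name, the linear interpolation, and the numeric limits are all illustrative assumptions, not the exact values I used. In pybullet the resulting limits could presumably be applied with `changeDynamics(..., jointLowerLimit=..., jointUpperLimit=...)`, but that call is left out so the snippet runs on its own.

```python
def relaxed_limits(progress, start=(-0.2, 0.2), final=(-1.5, 1.5)):
    """Linearly interpolate joint limits (radians) over training.

    progress: fraction of training completed, clamped to [0, 1].
    start, final: hypothetical (lower, upper) limits at the beginning
    and at the end of the curriculum.
    """
    p = min(max(progress, 0.0), 1.0)
    lower = start[0] + p * (final[0] - start[0])
    upper = start[1] + p * (final[1] - start[1])
    return lower, upper


# Example: limits a quarter, half, and all the way through training.
for frac in (0.25, 0.5, 1.0):
    print(frac, relaxed_limits(frac))
```

A smoother or performance-gated schedule (relax only once the average return passes a threshold) might behave differently; the linear version is just the simplest one to state.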
If I remove the limits from the outset, it fails to learn at all. That is, if I start in the equilibrium configuration (all legs vertical), it barely learns to keep its balance with a two-layer policy NN. If I increase the number of layers, it cannot master even that. Note that the minimal policy for a quadruped to stand upright without collapsing is trivial: a simple PD controller on every joint will suffice.
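For reference, this is roughly the PD policy I have in mind, written as a NumPy sketch. The gains, the torque clip, and the unit-inertia toy joint used to sanity-check it are assumptions chosen for illustration; they are not tuned for the ant, and the toy simulation ignores gravity and contact.

```python
import numpy as np


def pd_torques(q, qd, q_target, kp=25.0, kd=10.0, max_torque=5.0):
    """Per-joint PD law: tau = kp * (q_target - q) - kd * qd, then clipped.

    q, qd, q_target: joint angles, joint velocities, and target angles.
    kp, kd, max_torque: hypothetical gains and torque bound.
    """
    q = np.asarray(q, dtype=float)
    qd = np.asarray(qd, dtype=float)
    q_target = np.asarray(q_target, dtype=float)
    tau = kp * (q_target - q) - kd * qd
    return np.clip(tau, -max_torque, max_torque)


def simulate_joint(q0, steps=2000, dt=0.001):
    """Toy check: one unit-inertia joint driven back to angle 0."""
    q, qd = float(q0), 0.0
    for _ in range(steps):
        tau = float(pd_torques([q], [qd], [0.0])[0])
        qd += tau * dt  # semi-implicit Euler: update velocity first
        q += qd * dt    # then position, using the new velocity
    return q


print(simulate_joint(0.3))  # final angle after 2 simulated seconds
```

The point is that the mapping from observations (joint angles and velocities) to torques is linear plus a clip, so a two-layer network should have no trouble representing it.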
I used the PPO algorithm. Any ideas on why it performs so poorly?