Reset an environment to an arbitrary state

Official Python bindings with a focus on reinforcement learning and robotics.
abef
Posts: 3
Joined: Wed Jan 29, 2020 2:38 am

Reset an environment to an arbitrary state

Post by abef »

Pardon if this question has been answered before -- I searched and could not find what I'm looking for.

I would like to be able to reset a pybullet environment to a given state.
As a concrete example, suppose I have two instances of the same environment (e.g. HopperBulletEnv), call them env1 and env2.
I want to be able to reset env2 to the same state that env1 is in.
Just using saveState and restoreState won't work (as far as I can tell), since that limits me to only those states that the environment instance has already traversed.

In principle, this seems possible using saveBullet, which should allow me to dump the state of env1 to disk, and then restore that state from disk to env2. However, I'd like to avoid disk IO operations as this would be too slow.

Is there a way to return the state of env1 as a python dict, and then pass that dict to env2 and just reset all the values? What are the values that I need to pass in order to fully recreate the state? (From what I gather, I need at least the joint angles and velocities, but possibly much else besides).

Thanks,
Abe

Re: Reset an environment to an arbitrary state

Post by abef »

Ah, so this question has indeed been asked before here: viewtopic.php?t=12460
The response was:

Erwin Coumans wrote: Tue Nov 06, 2018 4:59 am
You will have to manually reset the state for all objects. See resetJointState and resetBasePositionAndOrientation, resetBaseVelocity in the PyBullet Quickstart Guide.

Could I get a bit more detail here? Does this mean the state is fully defined by the joint states, base position and orientation, and base velocity?

Re: Reset an environment to an arbitrary state

Post by abef »

OK, so it turns out Erwin's response that I linked above is sufficient; I just had to actually try it out :lol:
Instead of deleting this thread, I'll go ahead and post my solution in case others find it useful.
Here is a basic way to extract and reset the state of a pybullet environment without using the saveState and restoreState APIs (at least for the locomotion envs):

import gym
import pybullet_envs
import numpy as np

env1 = gym.make("HopperBulletEnv-v0")
env2 = gym.make("HopperBulletEnv-v0")
env1.reset()
env2.reset()

# get the pybullet connections
p1 = env1.env._p
p2 = env2.env._p

# step the envs randomly for a bit
for _ in range(5):
    env1.step(np.random.uniform(low=-1,high=1,size=3))
    env2.step(np.random.uniform(low=-1,high=1,size=3))
    
# get the state of env1
base_po = [] # position and orientation of base for each body
base_v = [] # velocity of base for each body
joint_states = [] # joint states for each body
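for i in range(p1.getNumBodies()):
    body = p1.getBodyUniqueId(i)  # unique IDs are not guaranteed to equal the loop index
    base_po.append(p1.getBasePositionAndOrientation(body))
    base_v.append(p1.getBaseVelocity(body))
    # each joint state is (position, velocity, reaction forces, applied torque)
    joint_states.append([p1.getJointState(body, j) for j in range(p1.getNumJoints(body))])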

# reset env2 to the current state of env1
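for i in range(p2.getNumBodies()):
    body = p2.getBodyUniqueId(i)  # unique IDs are not guaranteed to equal the loop index
    p2.resetBasePositionAndOrientation(body, *base_po[i])
    p2.resetBaseVelocity(body, *base_v[i])
    for j in range(p2.getNumJoints(body)):
        # only joint position and velocity (the first two entries) can be reset
        p2.resetJointState(body, j, *joint_states[i][j][:2])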

# check that they now follow the same trajectory
for _ in range(5):
    act = np.random.uniform(low=-1,high=1,size=3)
    out1 = env1.step(act)
    out2 = env2.step(act)
    print(np.allclose(out1[0],out2[0]))
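
To get the Python dict asked for in the first post, the same calls can be wrapped in two small helpers. This is only a sketch: get_state and set_state are hypothetical names, not part of pybullet, and p is assumed to be a pybullet connection such as env.env._p above. It also assumes both environments have loaded the same bodies with the same unique IDs. It captures only base pose, base velocity, and joint position/velocity, so anything else (for example Python-side bookkeeping inside the gym env) is not copied.

```python
def get_state(p):
    """Return a dict mapping body unique ID -> base pose, base velocity, joint states."""
    state = {}
    for i in range(p.getNumBodies()):
        body = p.getBodyUniqueId(i)
        pos, orn = p.getBasePositionAndOrientation(body)
        lin_vel, ang_vel = p.getBaseVelocity(body)
        # getJointState returns (position, velocity, reaction forces, applied torque);
        # keep only position and velocity, which are what resetJointState accepts.
        joints = [tuple(p.getJointState(body, j)[:2])
                  for j in range(p.getNumJoints(body))]
        state[body] = {
            "base_pos": pos,
            "base_orn": orn,
            "base_lin_vel": lin_vel,
            "base_ang_vel": ang_vel,
            "joints": joints,
        }
    return state


def set_state(p, state):
    """Write a dict produced by get_state into the simulation behind connection p."""
    for body, s in state.items():
        p.resetBasePositionAndOrientation(body, s["base_pos"], s["base_orn"])
        p.resetBaseVelocity(body, s["base_lin_vel"], s["base_ang_vel"])
        for j, (q, qd) in enumerate(s["joints"]):
            p.resetJointState(body, j, q, qd)
```

With the variables from the snippet above, the copy would then be set_state(p2, get_state(p1)), with no disk IO involved.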