SegFault when training RL agent using camera

bjt
Posts: 2
Joined: Mon Aug 13, 2018 7:49 pm

Post by bjt »

Hello,

I am trying to train an RL agent that has a body-mounted camera, using MPI to run several processes. Training works well without camera input, but with the camera enabled I get a segfault after a variable number of steps:

[brendan:05577] *** Process received signal ***
[brendan:05577] Signal: Segmentation fault (11)
[brendan:05577] Signal code: Address not mapped (1)
[brendan:05577] Failing at address: 0xfffffffe02660ad0
[brendan:05577] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f258a516390]
[brendan:05577] [ 1] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(_ZN6Shader8fragmentE3vecILm3EfER8TGAColor+0x175)[0x7f25889ae815]
[brendan:05577] [ 2] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(_Z15triangleClippedR3matILm4ELm3EfES1_R7IShaderR8TGAImagePfPiRKS_ILm4ELm4EfEi+0x7c9)[0x7f25889a8659]
[brendan:05577] [ 3] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(_ZN12TinyRenderer12renderObjectER20TinyRenderObjectData+0x1b3f)[0x7f25889ac8ff]
[brendan:05577] [ 4] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(_ZN32TinyRendererVisualShapeConverter6renderEPKfS1_+0x768)[0x7f2588a2cf98]
[brendan:05577] [ 5] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(_ZN29PhysicsServerCommandProcessor32processRequestCameraImageCommandERK19SharedMemoryCommandR18SharedMemoryStatusPci+0x2db)[0x7f25889d911b]
[brendan:05577] [ 6] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(_ZN29PhysicsServerCommandProcessor14processCommandERK19SharedMemoryCommandR18SharedMemoryStatusPci+0x4de)[0x7f25889fa1ce]
[brendan:05577] [ 7] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(_ZN13PhysicsDirect13processCameraERK19SharedMemoryCommand+0x76)[0x7f25889cd716]
[brendan:05577] [ 8] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(b3SubmitClientCommand+0x14)[0x7f2588a1f864]
[brendan:05577] [ 9] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(b3SubmitClientCommandAndWaitStatus+0x80)[0x7f2588a1f900]
[brendan:05577] [10] /usr/local/lib/python3.5/dist-packages/pybullet.cpython-35m-x86_64-linux-gnu.so(+0x17a6eb)[0x7f25889956eb]
[brendan:05577] [11] python3(PyCFunction_Call+0x77)[0x4e9ba7]
[brendan:05577] [12] python3(PyEval_EvalFrameEx+0x59f5)[0x53c6d5]
[brendan:05577] [13] python3(PyEval_EvalFrameEx+0x4b04)[0x53b7e4]
[brendan:05577] [14] python3[0x540199]
[brendan:05577] [15] python3(PyEval_EvalCode+0x1f)[0x540e4f]
[brendan:05577] [16] python3[0x60c272]
[brendan:05577] [17] python3(PyRun_FileExFlags+0x9a)[0x60e71a]
[brendan:05577] [18] python3(PyRun_SimpleFileExFlags+0x1bc)[0x60ef0c]
[brendan:05577] [19] python3(Py_Main+0x456)[0x63fb26]
[brendan:05577] [20] python3(main+0xe1)[0x4cfeb1]
[brendan:05577] [21] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f258a15b830]
[brendan:05577] [22] python3(_start+0x29)[0x5d6049]
[brendan:05577] *** End of error message ***

You can find a simple demonstration at the following repo:

https://bitbucket.org/brendantidd/pybul ... rc/master/

Run with:

mpirun -np 6 python3 run.py

I'm not sure if the issue is related to
https://github.com/bulletphysics/bullet3/issues/1295

I am assuming that memory for the tiny renderer is being shared among processes, but I am unsure of how to fix this. Any help would be appreciated!
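
One way to narrow down which Python call precedes a crash like this is the standard library faulthandler module, which prints the Python-level traceback when the process receives SIGSEGV. The sketch below is only a debugging aid, not a fix. It assumes Open MPI (which sets the OMPI_COMM_WORLD_RANK environment variable), and the helper name is made up for illustration.

```python
import faulthandler
import os
import sys


def enable_crash_traceback():
    """Hypothetical helper: dump the Python traceback if this process segfaults."""
    # Assumption: Open MPI is the launcher; it exports the rank in this variable.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", "0")
    # On SIGSEGV, write the Python stack of every thread to stderr.
    faulthandler.enable(file=sys.stderr, all_threads=True)
    return rank


rank = enable_crash_traceback()
print("crash tracebacks enabled for rank", rank, file=sys.stderr)
```

Calling this at the top of run.py, before any pybullet call, should show which Python line was executing in the rank that crashed.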

Kind regards

Brendan

Re: SegFault when training RL agent using camera

Post by bjt »

It seems that p.computeProjectionMatrixFOV() could be the culprit (at least, when I run the code without it I haven't been able to reproduce the crash yet).

I have tried using p.computeProjectionMatrix() instead, though I haven't been able to get it to do what I want (and I'm not sure whether it would fix my issue). The docs list the following arguments; how can I work them out? There doesn't seem to be a usage example for p.computeProjectionMatrix().

left screen (canvas) coordinate
right screen (canvas) coordinate
bottom screen (canvas) coordinate
top screen (canvas) coordinate
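
For a symmetric frustum, these four values can be derived from a field of view, an aspect ratio, and the near-plane distance. The sketch below shows that relation, assuming the fov argument of p.computeProjectionMatrixFOV() is the vertical field of view in degrees and that the left/right/bottom/top values are measured at the near plane (the usual OpenGL frustum convention). The helper name and the numeric values are illustrative only.

```python
import math


def frustum_from_fov(fov_deg, aspect, near):
    """Return (left, right, bottom, top) at the near plane for a symmetric frustum."""
    # Half the frustum height at the near plane.
    top = near * math.tan(math.radians(fov_deg) / 2.0)
    # Half the frustum width; aspect is width / height.
    right = top * aspect
    return -right, right, -top, top


left, right, bottom, top = frustum_from_fov(fov_deg=60.0, aspect=1.0, near=0.1)

# With pybullet imported as p, these two calls should then describe the same frustum
# (0.1 and 100.0 are example near/far distances):
#   p.computeProjectionMatrix(left, right, bottom, top, 0.1, 100.0)
#   p.computeProjectionMatrixFOV(60.0, 1.0, 0.1, 100.0)
```

Since the two calls should produce the same matrix under these assumptions, switching between them is unlikely to change the rendering itself.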

I am trying to use a camera that moves with the body; if anyone knows a better way to do this, I would appreciate it.
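
One common way to get a body-mounted camera is to rebuild the view matrix every step from the body's base pose. The sketch below is one possible approach, not the only one. It assumes the camera sits at the base origin, looks along the body's local +x axis, and uses local +z as up. The helper names, image size, and look distance are made up for illustration; quaternions are in pybullet's (x, y, z, w) order.

```python
def body_camera_vectors(position, quaternion, look_distance=1.0):
    """Return (eye, target, up) in world coordinates for a camera fixed to a body."""
    x, y, z, w = quaternion
    # First column of the rotation matrix: the body's local +x axis in the world frame.
    forward = (1 - 2 * (y * y + z * z), 2 * (x * y + z * w), 2 * (x * z - y * w))
    # Third column of the rotation matrix: the body's local +z axis in the world frame.
    up = (2 * (x * z + y * w), 2 * (y * z - x * w), 1 - 2 * (x * x + y * y))
    eye = tuple(position)
    target = tuple(e + look_distance * f for e, f in zip(eye, forward))
    return eye, target, up


def render_from_body(body_id, projection_matrix, width=64, height=64):
    """Render one frame from the body's point of view (needs a connected pybullet client)."""
    import pybullet as p  # imported lazily so the pure-math helper works without pybullet

    position, quaternion = p.getBasePositionAndOrientation(body_id)
    eye, target, up = body_camera_vectors(position, quaternion)
    view_matrix = p.computeViewMatrix(eye, target, up)
    return p.getCameraImage(
        width, height, viewMatrix=view_matrix, projectionMatrix=projection_matrix
    )
```

The projection matrix only depends on the lens parameters, so it can be computed once at startup and reused; only the view matrix needs to be recomputed each step.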

Kind regards

Brendan