Page 1 of 1

Is it safe to dispose of shapes/rigid bodies in a thread?

Posted: Wed Nov 19, 2014 12:50 am
by Slight0
So I've managed to integrate Bullet physics into Unity. Aside from the fact that bullet is faster than Unity's PhysX implementation, I can create many BvhTriangleMeshShape within a thread and then pass it to the main thread for simulation. However, I end up having to delete 100s of these BtRidigBody+BvhTriangleMeshShape classes at a time and it causes a noticable hitch in the main thread.

I was wondering if I could just run a cleanup thread that would handle disposing of these objects instead of the main thread; would this create any issues later on? I don't have time right now to test it, so I figured I'd ask here.

Edit: Upon closer inspection I'm going to guess that destroying rigidbodys and shapes is not thread-safe.

Edit2: I'm using the C# wrapper BulletSharpPInvoke to call the bullet API functions from Unity code. SimpleProfiler is a thin wrapper for .NET's StopWatch class which is capable of measuring microseconds accurately. Upon doing more profiling, I've found that World.RemoveCollisionObject is what's taking up all the time.

Edit3: I've identified the issue as being a call to Unity's Debug.Log which I omitted from the below code snippet and was taking up 99% of the time. I'm surprised at how slow it was, sorry for the false dilemma.

Why is destroying these objects taking so long? Each one is taking 1.5ms

Code being called per rigid body destruction

Code: Select all

    private void OnDestroy()
    {
        SimpleProfiler.StartSample("BtRigidBody Destroy");
        if (world != null)
            world.World.RemoveCollisionObject(btRigidBody);
        if (btRigidBody.MotionState != null)
            btRigidBody.MotionState.Dispose();
        btRigidBody.Dispose();
        SimpleProfiler.EndSample();
    }
From profiler output

Code: Select all

[...]
[11/18/2014] 11:31:29 PM BtRigidBody Destroy: 1ms (17680 tick(s))
[11/18/2014] 11:31:29 PM BtRigidBody Destroy: 2ms (21268 tick(s))
[11/18/2014] 11:31:29 PM BtRigidBody Destroy: 1ms (18813 tick(s))
[11/18/2014] 11:31:29 PM BtRigidBody Destroy: 1ms (16155 tick(s))
[11/18/2014] 11:31:29 PM BtRigidBody Destroy: 1ms (15530 tick(s))
[11/18/2014] 11:31:29 PM BtRigidBody Destroy: 1ms (19113 tick(s))
[11/18/2014] 11:31:29 PM BtRigidBody Destroy: 1ms (15306 tick(s))
[...]

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Wed Nov 19, 2014 10:53 am
by c6burns
I don't understand where that OnDestroy method comes from. Are you using some bullet implementation for Unity where that exists? Also what does "dispose" mean? Free up the memory? Or perform all of the operations required to remove the btRigidBody from the world *and then* free up the memory? If the latter then no, that's not thread safe. You shouldn't change anything about the world while it is being stepped in another thread.

Anyway, what is actually happening inside those methods being called in OnDestroy is unclear.

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Wed Nov 19, 2014 12:39 pm
by Basroil
Deleting rigid bodies shouldn't take 1ms even on an old 386, there's definitely something going on that you can improve on. My personal guess is that your profiler is just mistaken (removing and deleting should be in the micro-second level rather than millisecond), though "destroy" function might be at fault if it's not the profiler.

Bodies must be removed only when the world is not being stepped, so that part must be done sequentially after stepping ends and before the next. After that you can take care of cleanup whenever/whereever you like.

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Wed Nov 19, 2014 1:04 pm
by xexuxjy
Out of interest , how have you integrated bullet into Unity, is it using Andres Traks wrapper, or a pure managed code version. Regardless, i'd be very surprised if it was thread safe. I would also suspect that SimpleProfiler really won't give you very good timings for such a small operation.

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Wed Nov 19, 2014 10:27 pm
by Slight0
Sorry, perhaps I should have mentioned I'm using the C# wrapper BulletSharpPInvoke (repository link)
c6burns wrote:I don't understand where that OnDestroy method comes from. Are you using some bullet implementation for Unity where that exists? Also what does "dispose" mean? Free up the memory? Or perform all of the operations required to remove the btRigidBody from the world *and then* free up the memory? If the latter then no, that's not thread safe. You shouldn't change anything about the world while it is being stepped in another thread.

Anyway, what is actually happening inside those methods being called in OnDestroy is unclear.
I'm using my own implementation for Unity that uses BulletSharpPInvoke to call into the Bullet API. OnDestroy is a callback from Unity that is called when a GameObject is being destroyed and removed from memory; a destructor.

The first function that gets called is RemoveCollisionObject which is defined here. That just calls AlignedCollisionObjectArray.Remove (Upon further inspection this looks to be the culprit, I'm not sure if it's the brute force searching method they use or the actual API calls, but something here is taking over a millisecond to complete when there are over a thousand objects in the world.)

The second function is MotionState.Dispose defined here.

The last is RigidBody.Dispose which is actually CollisionObject.Dipose defined here.

In those links, you'll find calls to things like btCollisionObject_delete and btMotionState_delete which are really just C++ wrappers that call delete obj.
Basroil wrote:Deleting rigid bodies shouldn't take 1ms even on an old 386, there's definitely something going on that you can improve on. My personal guess is that your profiler is just mistaken (removing and deleting should be in the micro-second level rather than millisecond), though "destroy" function might be at fault if it's not the profiler.

Bodies must be removed only when the world is not being stepped, so that part must be done sequentially after stepping ends and before the next. After that you can take care of cleanup whenever/whereever you like.
I figured it shouldn't take more than a few microseconds. The profiler is just a basic wrapper for .NET's StopWatch class which is pretty accurate. It is very capable of measuring the microsecond level (those "ticks" are the number of elapsed hardware timer ticks since the start of the stopwatch). I've used StopWatch a lot to measure very minute tasks.
xexuxjy wrote:Out of interest , how have you integrated bullet into Unity, is it using Andres Traks wrapper, or a pure managed code version. Regardless, i'd be very surprised if it was thread safe. I would also suspect that SimpleProfiler really won't give you very good timings for such a small operation.
Yes, as I linked above, I'm using BulletSharpPInvoke by Andres Traks. Like I mentioned above, SimpleProfiler just wraps StopWatch which is capable of measuring below the millisecond level accurately. It rounds milliseconds down, not up. ~10,000 ticks = 1ms on my machine.

I did some more detailed profiling and found that World.RemoveCollisionObject is taking up the most time.
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; World.RemoveCollisionObject: 1ms (15316 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; MotionState.Dispose: 0ms (28 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; RigidBody.Dispose: 0ms (28 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; World.RemoveCollisionObject: 1ms (12782 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; MotionState.Dispose: 0ms (28 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; RigidBody.Dispose: 0ms (33 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; World.RemoveCollisionObject: 1ms (13020 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; MotionState.Dispose: 0ms (33 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; RigidBody.Dispose: 0ms (28 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; World.RemoveCollisionObject: 1ms (12647 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; MotionState.Dispose: 0ms (28 tick(s))
[11/19/2014] 5:37:26 PM BtRigidBody Destroy; RigidBody.Dispose: 0ms (28 tick(s))
Is this typical? I'd like to note that I have about 1331 BvhTriangleMesh+RigidBody objects in the world all of which are static (mass = 0).

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Wed Nov 19, 2014 11:51 pm
by c6burns
Looks like a hot mess inside that wrapper ... I don't understand why they are maintaining their own lists of objects when bullet already does this for you in the dynamics world. For example, btDiscreteDynamicsWorld already maintains a btAlignedObjectArray<btRigidBody *> of all the rigid bodies in the world ... so why duplicate that in the wrapper?

Were this my problem, my approach would be to clarify whether the issue is in the wrapper or in bullet. I don't see removing rigid bodies from the world taking 1ms in bullet, so my guess is the wrapper, but it's just my guess.

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Thu Nov 20, 2014 8:25 am
by Slight0
c6burns wrote:Looks like a hot mess inside that wrapper ... I don't understand why they are maintaining their own lists of objects when bullet already does this for you in the dynamics world. For example, btDiscreteDynamicsWorld already maintains a btAlignedObjectArray<btRigidBody *> of all the rigid bodies in the world ... so why duplicate that in the wrapper?

Were this my problem, my approach would be to clarify whether the issue is in the wrapper or in bullet. I don't see removing rigid bodies from the world taking 1ms in bullet, so my guess is the wrapper, but it's just my guess.
While it is strange that they are basically duplicating a list that bullet already maintains (I'm guess he was thinking it was faster for other functions to use the c# list instead of having the query the c++ wrapper API), it seems even that runs pretty fast. The actual problem was a Debug.Log call that I removed from the example in my OP because I underestimated just how slow it was. Unity's Debug.Log function is really really slow and accounted for 99% of the slowdown.

The function inside the wrapper was running at 20-30 ticks (barely a microsecond) and my call to that exact wrapper function is taking about 200-300 ticks. While I don't understand where those extra hundred or so ticks are coming from (cross dll call overhead??), it's good enough.

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Thu Nov 20, 2014 6:15 pm
by c6burns
Ah OK understood ... not too surprised I am wrong having only spent a few moments looking at the wrapper hehe

Cool though, glad you got it sorted :)

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Thu Nov 20, 2014 10:28 pm
by kingchurch
Some comments on the wrapper and duplicate object list:

P-Invoke interface is notoriasly slow. Even though the C# code in the wrapper function only takes micro seconds and the C++ side Bullet code only takes microseconds, the P-Invoke API marshaling code that transforms managed call convention into extern C call convention can be very slow. That's basically the .Net interpreter code who does P-Invoke mar shelling online. In order to TRUEly find out the bottle neck, I suggest you insert Profiler probes into the Bullet native code so you know how much Bullet really takes. Then subtract bullet execution time from the C# wrapper time you will understand the P-Invoke overhead. It's generally 50-100 instructions overhead and heavily depends on the complexity of the parameters.

For the above reason, I suspect that's why the BulletSharp author choose to cache the rigid body list locally at the C# side. So when you walk through the linked list of rigid body you don't incur thousands of P-Invoke calls which is TERRIBLE.

As a rule of thumb, never do P-Invoke stuff from within a loop. That's why it's almost never a good idea to do a direct function-to-function C# wrapper for a C / C++ library.

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Fri Nov 21, 2014 11:56 am
by anthrax11
Disclaimer: I'm the BulletSharp/BulletSharpPInvoke dev. :)

Even though the issue wasn't related to Bullet, I'd say multithreading is still a bit of a gray area. For example, there is a very similar issue described here:
https://code.google.com/p/bulletsharp/i ... tail?id=67
The problem there seems to be that the profiler in Bullet (not in the wrapper), is not implemented in a thread-safe manner. You can see that there are static variables that are accessed without locks every time that BT_PROFILE("...") is called. The profiler is called in many places in btCollisionWorld and other parts of the simulation. It's not entirely clear what was happening, but compiling Bullet without the profiler did fix the issue. I didn't have time to investigate further, but there's a tree structure that is being written to and read from, possibly getting stuck in some sort of loop.
https://github.com/erwincoumans/bullet3 ... rof.h#L146

Anyway, Bullet has multithreading capabilities, but not everything was designed with multithreading in mind. Certainly nothing should be done from another thread while the simulation is running.

The Dispose method on RigidBody doesn't remove the body from the world, so you have to make sure it is removed before cleaning up. Otherwise there's a deleted object in the simulation and a crash is guaranteed.

While P/Invoke has some overhead, it actually turned out to be faster than I had anticipated. In most cases, the performance is comparable to the original C++/CLI version of BulletSharp. But yes, all kinds of iterations over native arrays can still be a bottleneck. However, it's the same thing with C++/CLI. That's just the price to pay for using .NET. I'd say in general, the performance is not bad at all, but perhaps not enough for AAA games or heavy raytracing and things like that.

For the collision world, maintaining a separate list of managed collision objects is actually something that got rid of the managed-unmanaged overhead almost completely (you guys got the right idea :)). Firstly, each access to the native array was already a managed-unmanaged call. Getting a reference to a managed object from the native pointer took even more time.

The list is also a way for the world to maintain references to objects. If the reference was lost, then the managed object could be implicitly disposed by the garbage collector and then the managed destructor would in turn delete the btCollisionObject while it is still in the world. Previously, each object would keep a reference to itself, but obviously the garbage collector in that case had no way to clean it up if it was leaked.

Just my 2 cents.

Re: Is it safe to dispose of shapes/rigid bodies in a thread

Posted: Fri Nov 21, 2014 12:11 pm
by c6burns
anthrax11 wrote:The list is also a way for the world to maintain references to objects. If the reference was lost, then the managed object could be implicitly disposed by the garbage collector and then the managed destructor would in turn delete the btCollisionObject while it is still in the world. Previously, each object would keep a reference to itself, but obviously the garbage collector in that case had no way to clean it up if it was leaked.
Ohhhhhh that makes a lot of sense. It's been so long since I used anything with a GC I didn't even think of this. Thanks for the explanation :)