Does btParallelConstraintSolver give any speedup?

alexey.rodriguez
Posts: 2
Joined: Mon Dec 10, 2012 10:45 pm

Post by alexey.rodriguez »

Hi there,

We posted a message earlier [1] about parallelization experiments we performed on a modified version of a Bullet benchmark. The benchmark is AppBenchmarks, specifically the 1000 stacks example, which we modified to increase the number of islands.

AppBenchmarks can be compiled to use two parallel algorithms from the Bullet Physics distribution. The first is a parallel constraint solver (btParallelConstraintSolver) and the second is a parallel implementation of the collision dispatcher (SpuGatheringCollisionDispatcher). In our tests we didn't get any benefit from the parallel constraint solver. In fact, we have observed that the parallel constraint solver is often slower than the sequential one. For the parallel collision dispatcher we did observe speedups from using multiple cores.

Is it known whether btParallelConstraintSolver gives any speedup? Under what workloads does that happen?

Thank you!

Cheers,

Kristian Kolev and Alexey Rodriguez

[1] http://bulletphysics.org/Bullet/phpBB3/ ... f=6&t=8660
Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: Does btParallelConstraintSolver give any speedup?

Post by Erwin Coumans »

"In our tests we didn't get any benefit from the parallel constraint solver. In fact, we have observed that the parallel constraint solver is often slower than the sequential one."
Indeed, the btParallelConstraintSolver usually doesn't give a speedup: it lacks several of the optimizations that the regular btSequentialImpulseConstraintSolver has.

Multithreading the btSequentialImpulseConstraintSolver using constraint splitting (see CustomSplitConstraints in btParallelConstraintSolver.cpp) will likely give a good speedup, both for single large islands and for multiple islands.

If you have a patch to run separate simulation islands multithreaded, please share it and I'll see whether we can apply it.
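
For readers following along, here is a rough sketch of what running separate islands on multiple threads could look like. It is not Bullet code: `Island`, `solveIsland` and `solveIslandsParallel` are hypothetical names, and the "solve" step is a placeholder. The only assumption it relies on is that islands share no bodies or constraints, so each one can be solved without locking.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for one simulation island: a group of bodies whose
// constraints never reference a body outside the group.
struct Island {
    std::vector<float> velocities;
};

// Hypothetical per-island solve. A real solver would iterate over the
// island's contact and joint constraints; this placeholder only touches
// data owned by the island it is given.
void solveIsland(Island& island) {
    for (float& v : island.velocities) v *= 0.5f;
}

// Worker threads claim island indices from a shared atomic counter, so each
// island is solved exactly once and no locks are needed.
void solveIslandsParallel(std::vector<Island>& islands, unsigned numThreads) {
    std::atomic<std::size_t> next{0};
    auto worker = [&]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= islands.size()) return;
            solveIsland(islands[i]);
        }
    };
    std::vector<std::thread> threads;
    const unsigned count = std::max(1u, numThreads);
    for (unsigned t = 0; t < count; ++t) threads.emplace_back(worker);
    for (std::thread& th : threads) th.join();
}
```

Claiming islands from a counter instead of pre-assigning fixed ranges helps when island sizes differ a lot, but a single very large island would still be solved by one thread, which is where constraint splitting comes in.
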
"For the parallel collision dispatcher we did observe speedups from using multiple cores."
The SpuGatheringCollisionDispatcher can improve the narrowphase collision detection performance, but it doesn't support all features/collision shape types.

The latest version of the regular btCollisionDispatcher should be much easier to parallelize, with full features. The only shared resource is the (de)allocation of contact manifolds in the methods btCollisionDispatcher::getNewManifold and btCollisionDispatcher::releaseManifold. Once those two functions are made thread-safe (for example using an atomic compare-and-swap), the narrowphase becomes embarrassingly parallel.

Thanks and please keep us updated!
Erwin

By the way, most effort in Bullet went into the regular (sequential) physics pipeline, the PlayStation 3 SPU version and nowadays the OpenCL parallel version.