CPU multithreading is working!

Post Reply
lunkhound
Posts: 99
Joined: Thu Nov 21, 2013 8:57 pm

Re: CPU multithreading is working!

Post by lunkhound » Thu Jun 18, 2015 8:04 am

ai-music wrote:Finally, after tests i changed sheduler to PPL (at MSVC 2010 OpenMP works not smoothly).
But for MSVC 2010 need to change code for PPL in ParallelFor.h (because partitioner-argument is not supported for parallel_for) like this:


And now my gameEngine works even faster. Thanks a lot for your work! :)

PS: Erwin has to know about it and maybe add MT-support officialy.
Thanks for the feedback! I've fixed those issues you found and updated my repository.

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Thu Jul 23, 2015 11:08 pm

Hi lunkhound,

I've implemented your code into my project, and the speed difference is quite impressive!

There are two things I noticed though:

First, enabling USE_PARALLEL_NARROWPHASE seems to kill determinism. Unfortunately, this option seems to give the most speed gain (in my case anyway) and determinism is important for my project. I haven't tried to figure out why this happens yet, just an observation.

Also, I've been getting crashes when using kinematic bodies. It doesn't always happen at the same time during simulation, but it always crashes at this assertion, and it seems to always assert because solverBodyIdA == m_tmpSolverBodyPool.size(). I haven't tried figuring this one out either yet, but maybe have an idea why this is happening.

Code: Select all

btSequentialImpulseConstraintSolver.cpp

   717   int btSequentialImpulseConstraintSolver::getOrInitSolverBody(btCollisionObject& body,btScalar timeStep)
   ...
   725   solverBodyIdA = body.getCompanionId();
>  726   btAssert(solverBodyIdA < m_tmpSolverBodyPool.size());
Stack trace:

Code: Select all

btSequentialImpulseConstraintSolver::getOrInitSolverBody(btCollisionObject & body, float timeStep)
btSequentialImpulseConstraintSolver::convertContact(btPersistentManifold * manifold, const btContactSolverInfo & infoGlobal)
btSequentialImpulseConstraintSolver::convertContacts(btPersistentManifold * * manifoldPtr, int numManifolds, const btContactSolverInfo & infoGlobal)
btSequentialImpulseConstraintSolver::solveGroupCacheFriendlySetup(btCollisionObject * * bodies, int numBodies, btPersistentManifold * * manifoldPtr, int numManifolds, btTypedConstraint * * constraints, int numConstraints, const btContactSolverInfo & infoGlobal, btIDebugDraw * debugDrawer)
btSequentialImpulseConstraintSolver::solveGroup(btCollisionObject * * bodies, int numBodies, btPersistentManifold * * manifoldPtr, int numManifolds, btTypedConstraint * * constraints, int numConstraints, const btContactSolverInfo & infoGlobal, btIDebugDraw * debugDrawer, btDispatcher * __formal)
MyConstraintSolverPool::solveGroup(btCollisionObject * * bodies, int numBodies, btPersistentManifold * * manifolds, int numManifolds, btTypedConstraint * * constraints, int numConstraints, const btContactSolverInfo & info, btIDebugDraw * debugDrawer, btDispatcher * dispatcher)
InplaceSolverIslandCallback::processIsland(btCollisionObject * * bodies, int numBodies, btPersistentManifold * * manifolds, int numManifolds, btTypedConstraint * * constraints, int numConstraints, int islandId)
UpdateIslandDispatcher::operator()(const tbb::blocked_range<int> & range)
Do you have an idea what the cause of this could be?

Thanks!
Martijn

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Thu Jul 23, 2015 11:38 pm

I just noticed that the companion id is a member of btCollisionObject, so I'm guessing that multiple threads are changing this value concurrently (body.setCompanionId). That doesn't explain why it only seems to be crashing when using kinematic objects though..

Martijn

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Fri Jul 24, 2015 12:49 pm

I'm not sure, but I think the parallel narrowphase kills determinism because the order of the manifolds changes, which in turn causes the solver to yield different results.

Martijn

lunkhound
Posts: 99
Joined: Thu Nov 21, 2013 8:57 pm

Re: CPU multithreading is working!

Post by lunkhound » Fri Jul 24, 2015 6:17 pm

Tinos wrote:I'm not sure, but I think the parallel narrowphase kills determinism because the order of the manifolds changes, which in turn causes the solver to yield different results.

Martijn
Yes, the multithreaded narrowphase is currently implemented in a very simple way which doesn't ensure a deterministic ordering of manifolds. There is some discussion of this in the comments on the pull request. The way to fix it is discussed as well (having each thread gather a separate list of new manifolds, then after the parallel part, combine all of the lists, sort the list into a deterministic order, then finally add the deterministicly ordered manifolds).

As for the crash problem, I'm not sure what that is. It looks to me like a bug in the way the constraint solver or the island manager is bookkeeping kinematic bodies.
Do you know of a way to reproduce this with some small changes to the example browser? I would very much like to fix it, but I need a way to reproduce the problem.

lunkhound
Posts: 99
Joined: Thu Nov 21, 2013 8:57 pm

Re: CPU multithreading is working!

Post by lunkhound » Fri Jul 24, 2015 6:26 pm

Tinos wrote:I just noticed that the companion id is a member of btCollisionObject, so I'm guessing that multiple threads are changing this value concurrently (body.setCompanionId). That doesn't explain why it only seems to be crashing when using kinematic objects though..

Martijn
Well, a rigid body can only ever exist in a single simulation island, and can only be touched by that island's thread. A kinematic body can be part of more than one island, however, so multiple threads could be touching it. Sounds like a good hypothesis.

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Fri Jul 24, 2015 8:24 pm

lunkhound wrote:Well, a rigid body can only ever exist in a single simulation island, and can only be touched by that island's thread. A kinematic body can be part of more than one island, however, so multiple threads could be touching it. Sounds like a good hypothesis.
Ah, of course! I didn't realize non-kinematic bodies can only ever exist in one island. So it does in fact explain why it crashes when using kinematic objects. Thanks for pointing me to the pull request btw, I somehow missed it.

Cheers,
Martijn

lunkhound
Posts: 99
Joined: Thu Nov 21, 2013 8:57 pm

Re: CPU multithreading is working!

Post by lunkhound » Fri Jul 24, 2015 10:59 pm

Tinos wrote:
lunkhound wrote:Well, a rigid body can only ever exist in a single simulation island, and can only be touched by that island's thread. A kinematic body can be part of more than one island, however, so multiple threads could be touching it. Sounds like a good hypothesis.
Ah, of course! I didn't realize non-kinematic bodies can only ever exist in one island. So it does in fact explain why it crashes when using kinematic objects. Thanks for pointing me to the pull request btw, I somehow missed it.

Cheers,
Martijn
Btw, I was able to reproduce this issue by setting the CF_KINEMATIC_OBJECT flag on the ground object in the multithreaded demo. So I've got a good test case now. I'll dig into it when I get some time.

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Fri Jul 24, 2015 11:36 pm

lunkhound wrote:
Tinos wrote:
lunkhound wrote:Well, a rigid body can only ever exist in a single simulation island, and can only be touched by that island's thread. A kinematic body can be part of more than one island, however, so multiple threads could be touching it. Sounds like a good hypothesis.
Ah, of course! I didn't realize non-kinematic bodies can only ever exist in one island. So it does in fact explain why it crashes when using kinematic objects. Thanks for pointing me to the pull request btw, I somehow missed it.

Cheers,
Martijn
Btw, I was able to reproduce this issue by setting the CF_KINEMATIC_OBJECT flag on the ground object in the multithreaded demo. So I've got a good test case now. I'll dig into it when I get some time.
Thanks lunkhound, I'll try to come up with a solution for this as well.

lunkhound
Posts: 99
Joined: Thu Nov 21, 2013 8:57 pm

Re: CPU multithreading is working!

Post by lunkhound » Sat Jul 25, 2015 4:37 am

I created a new branch in my repo with a fix for this issue. I added a new branch for it because I'm not comfortable adding this to my pull-request just yet.

So its really not good for the constraint solver to be writing to the companionId field of CollisionBody objects that are potentially shared between threads. So I got rid of that, and instead I added a uniqueId field to CollisionBody which gets set before the constraint solvers. Each constraint solver can then use the uniqueId to create its own private mapping table, and not clash with solvers on other threads.

So basically the companionId is no longer used at all for the sequential-impulse-constraint-solver. However there is one place in the code that still depends on companionId being set -- it's in the featherstone point-to-point constraint (btMultiBodyPoint2Point.cpp). I'm not sure how to nicely eliminate that dependence. So currently, this "fix" most likely breaks the multi-body point-to-point constraint.

@Tinos, can you let me know if this fixes your crash?

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Sat Jul 25, 2015 2:18 pm

lunkhound wrote:I created a new branch in my repo with a fix for this issue. I added a new branch for it because I'm not comfortable adding this to my pull-request just yet.

So its really not good for the constraint solver to be writing to the companionId field of CollisionBody objects that are potentially shared between threads. So I got rid of that, and instead I added a uniqueId field to CollisionBody which gets set before the constraint solvers. Each constraint solver can then use the uniqueId to create its own private mapping table, and not clash with solvers on other threads.

So basically the companionId is no longer used at all for the sequential-impulse-constraint-solver. However there is one place in the code that still depends on companionId being set -- it's in the featherstone point-to-point constraint (btMultiBodyPoint2Point.cpp). I'm not sure how to nicely eliminate that dependence. So currently, this "fix" most likely breaks the multi-body point-to-point constraint.

@Tinos, can you let me know if this fixes your crash?
Yep, that fixes it for me!

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Sat Jul 25, 2015 7:56 pm

Tinos wrote:
lunkhound wrote:I created a new branch in my repo with a fix for this issue. I added a new branch for it because I'm not comfortable adding this to my pull-request just yet.

So its really not good for the constraint solver to be writing to the companionId field of CollisionBody objects that are potentially shared between threads. So I got rid of that, and instead I added a uniqueId field to CollisionBody which gets set before the constraint solvers. Each constraint solver can then use the uniqueId to create its own private mapping table, and not clash with solvers on other threads.

So basically the companionId is no longer used at all for the sequential-impulse-constraint-solver. However there is one place in the code that still depends on companionId being set -- it's in the featherstone point-to-point constraint (btMultiBodyPoint2Point.cpp). I'm not sure how to nicely eliminate that dependence. So currently, this "fix" most likely breaks the multi-body point-to-point constraint.

@Tinos, can you let me know if this fixes your crash?
Yep, that fixes it for me!
I did some more testing and found that it does in fact still crash. I might have forgotten to switch back to the parallel island solver before running those earlier tests...

For some reason, the following assertion fails:

Code: Select all

int	btSequentialImpulseConstraintSolver::getOrInitSolverBody(btCollisionObject& body, btScalar timeStep)
{
	int uniqueId = body.getUniqueId();
	btAssert(uniqueId < m_bodyUniqueIdToSolverBodyTable.size());
The 'body' happens to be my kinematic object, and its uniqueId == m_bodyUniqueIdToSolverBodyTable.size(). So, for some reason, it appears that the kinematic object is not included in the btCollisionObject** bodies array that is passed to the solveGroupCacheFriendlySetup function.

Debugging the code indeed shows that numBodies = 312 here:

Code: Select all

solveGroup( bodies, numBodies, manifolds, numManifolds, constraints, numConstraints, info, debugDrawer, dispatcher );
And the above assertion fails because uniqueId (again, the kinematic object) = 313.

Hope this helps,
Martijn

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Sat Jul 25, 2015 11:02 pm

Tinos wrote:I did some more testing and found that it does in fact still crash. I might have forgotten to switch back to the parallel island solver before running those earlier tests...

For some reason, the following assertion fails:

Code: Select all

int	btSequentialImpulseConstraintSolver::getOrInitSolverBody(btCollisionObject& body, btScalar timeStep)
{
	int uniqueId = body.getUniqueId();
	btAssert(uniqueId < m_bodyUniqueIdToSolverBodyTable.size());
The 'body' happens to be my kinematic object, and its uniqueId == m_bodyUniqueIdToSolverBodyTable.size(). So, for some reason, it appears that the kinematic object is not included in the btCollisionObject** bodies array that is passed to the solveGroupCacheFriendlySetup function.

Debugging the code indeed shows that numBodies = 312 here:

Code: Select all

solveGroup( bodies, numBodies, manifolds, numManifolds, constraints, numConstraints, info, debugDrawer, dispatcher );
And the above assertion fails because uniqueId (again, the kinematic object) = 313.

Hope this helps,
Martijn
Probably not the best solution, but I just posted a pull request here that fixes the issue for me.
I think the problem is that kinematic bodies do not receive an island tag (btSimulationIslandManager::updateActivationState), so they're not included in the union. Which means (I think) that they're also not included in the bodies array (btSequentialImpulseConstraintSolver::solveGroupCacheFriendlySetup). I must admit that I am not all too familiar with bullet's internals, so I might be completely wrong.

Cheers,
Martijn

Tinos
Posts: 10
Joined: Thu Jul 23, 2015 10:29 pm

Re: CPU multithreading is working!

Post by Tinos » Sun Jul 26, 2015 3:00 pm

lunkhound wrote:
Tinos wrote:I'm not sure, but I think the parallel narrowphase kills determinism because the order of the manifolds changes, which in turn causes the solver to yield different results.

Martijn
Yes, the multithreaded narrowphase is currently implemented in a very simple way which doesn't ensure a deterministic ordering of manifolds. There is some discussion of this in the comments on the pull request. The way to fix it is discussed as well (having each thread gather a separate list of new manifolds, then after the parallel part, combine all of the lists, sort the list into a deterministic order, then finally add the deterministicly ordered manifolds).
I've posted another pull request with a possible fix to ensure determinism. Basically, when dispatchAllCollisionPairs has done it's work, the list of manifolds is rebuilt by calling getAllContactManifolds on each of the overlapping pair's collision algorithms (much like what was suggested in the discussion here).

Martijn

lunkhound
Posts: 99
Joined: Thu Nov 21, 2013 8:57 pm

Re: CPU multithreading is working!

Post by lunkhound » Mon Jul 27, 2015 5:29 am

Thanks for those fixes!

I haven't merged the determinism one yet because I want to see what the performance implications are. I'll have a look at it later.

Post Reply