Running 'stepSimulation' as a parallel task...

Zeal
Posts: 47
Joined: Thu Oct 18, 2007 6:49 am

Running 'stepSimulation' as a parallel task...

Post by Zeal »

So I know you can thread some of the 'innards' of this call (world->stepSimulation()), but I would really like to take the whole thing and run it as a parallel task (since it is usually the single biggest serial bottleneck in a lot of applications). The trouble is, if you run the entire call in parallel, you can no longer safely read the world state from any other thread. This got me thinking: what if you maintained TWO 'Bullet states', one that represents the LAST frame's state and is read-only, and another that is writable and is used exclusively by stepSimulation? This way you could be updating the physics (for the next frame) in one thread, AND all of your other n threads could still safely do things like raycasts, reading body transforms, etc.

Is such a double buffered approach the best way to solve this problem? Or am I missing something?
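
A minimal sketch of the double-buffered idea described above, in plain C++ with no Bullet types. `Transform` and `DoubleBufferedState` are hypothetical names made up for illustration, and the sketch assumes that `swap()` is only called at a point where no other thread is reading.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a body transform (not a Bullet type).
struct Transform { float x, y, z; };

// Two copies of the per-body state: the "front" copy is the last completed
// frame and is only read; the "back" copy is written by the physics thread.
class DoubleBufferedState {
public:
    explicit DoubleBufferedState(std::size_t bodyCount) {
        assert(bodyCount > 0);
        buffers_[0].resize(bodyCount);
        buffers_[1].resize(bodyCount);
    }

    // Read-only view of the last completed frame (any thread).
    const std::vector<Transform>& front() const {
        return buffers_[frontIndex_.load(std::memory_order_acquire)];
    }

    // Writable buffer for the frame being simulated (physics thread only).
    std::vector<Transform>& back() {
        return buffers_[1 - frontIndex_.load(std::memory_order_relaxed)];
    }

    // Flip the roles of the two buffers. Assumption: called once per frame
    // at the sync point, while no reader is using front().
    void swap() {
        frontIndex_.store(1 - frontIndex_.load(std::memory_order_relaxed),
                          std::memory_order_release);
    }

private:
    std::array<std::vector<Transform>, 2> buffers_;
    std::atomic<int> frontIndex_{0};
};
```

After a swap the back buffer still holds the state from two frames ago, so the physics thread has to rewrite every entry before the next swap. As the replies point out, this only covers plain data such as transforms; it does not make raycasts against a live world safe.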
S.Lundmark
Posts: 50
Joined: Thu Jul 09, 2009 1:46 pm

Re: Running 'stepSimulation' as a parallel task...

Post by S.Lundmark »

Yes that is a typical approach that works well.

I would suggest using queues for physics operations that you apply at the sync point. The sync point is the place where you update your mirrored transforms from Bullet. It is also a good place to perform queued operations such as adding impulses.

However, I wouldn't keep a copy of the entire rigid bodies. Their transforms should be sufficient.
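
One way the queue-plus-sync-point suggestion could look, as a rough sketch in plain C++. `Body`, `Command`, `CommandQueue` and `syncPoint` are hypothetical names, and `Body` is a toy stand-in rather than a Bullet rigid body; the order of mirroring and draining inside the sync point is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a rigid body; only what the sketch needs.
struct Body { float position = 0.0f; float velocity = 0.0f; };

// An operation that other threads want applied to the bodies.
using Command = std::function<void(std::vector<Body>&)>;

class CommandQueue {
public:
    // Any thread: enqueue an operation for the next sync point.
    void push(Command op) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(op));
    }

    // Physics thread only: apply every queued operation, return how many ran.
    std::size_t drain(std::vector<Body>& bodies) {
        std::vector<Command> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);  // hold the lock only while taking the batch
        }
        for (Command& op : batch) op(bodies);
        return batch.size();
    }

private:
    std::mutex mutex_;
    std::vector<Command> pending_;
};

// Sync point: mirror the positions produced by the last step, then apply
// the operations that were queued while that step was running.
inline void syncPoint(CommandQueue& queue, std::vector<Body>& bodies,
                      std::vector<float>& mirror) {
    mirror.resize(bodies.size());
    for (std::size_t i = 0; i < bodies.size(); ++i) mirror[i] = bodies[i].position;
    queue.drain(bodies);
}
```

Only the mirror is visible to other threads, which matches the advice to copy transforms rather than whole bodies. The mirror itself still needs protection (a lock or a double buffer) if readers are active during the sync point.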

Hope this helps,
Regards,
Simon
Zeal
Posts: 47
Joined: Thu Oct 18, 2007 6:49 am

Re: Running 'stepSimulation' as a parallel task...

Post by Zeal »

OK, so assuming my core idea is sound, my next question is this: it seems like if I want to be able to run raycasts WHILE I am stepping the simulation (kind of the whole point of doing all this extra work), I would need two COMPLETE Bullet states. But you suggest only mirroring things like body transforms. Can you elaborate? Like I said, my goal is to allow ALL of my threads access to the 'read only' goodies that Bullet provides (raycasts etc.), so what exactly do I need to copy over to my mirrored state in order to make that possible? How can a raycast work unless I copy over the entire 'world' object state?

So just to recap - I want to have one streamlined, super tight thread loop like this...

Code:

while (true)
{
    readMessages();                  // apply queued messages: impulses, added bodies, etc.
    world->stepSimulation(timeStep); // advance the physics; timeStep is the elapsed time in seconds
    syncWithFrame();                 // sleep until the next frame begins
}
And while that is running, I want a SECOND 'world' object (representing the last frame) that I can still use for read only tasks.
Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: Running 'stepSimulation' as a parallel task...

Post by Erwin Coumans »

The ray test uses the broadphase acceleration structures. These are updated within stepSimulation, so you will need to deal with that. What other tasks, other than ray tests/convexSweepTest, do you want to use?

Did you already check out where performance goes, by adding the line CProfileManager::dumpAll(); straight after the stepSimulation call?
We are trying to parallelize CPU intensive parts within Bullet, rather than overlapping the simulation with graphics/ray tests.
Have you tried out the multi threaded collision dispatcher?

We are working on better parallel optimizations for Bullet 3.x, but this will still take a while. What are the timelines for your project?
Thanks,
Erwin
Zeal
Posts: 47
Joined: Thu Oct 18, 2007 6:49 am

Re: Running 'stepSimulation' as a parallel task...

Post by Zeal »

Erwin Coumans wrote: What other tasks, other than ray tests/convexSweepTest, do you want to use?
Well anything that deals with reading the state of objects really. From transforms, to individual nodes on a soft body (so I can send the results to my graphics system). I suppose other than that (which would only require copying individual bodies), the only other thing I can think of is raycasting... ARE there any other neat things I could do given access to a mirrored read only state?

But just to clarify, let's say I did want to do a raycast. What would I need to mirror? It seems like it would require a LOT of copying...
Erwin Coumans wrote: We are trying to parallelize CPU intensive parts within Bullet, rather than overlapping the simulation with graphics/ray tests.
I have not played around much with the multithreaded collision dispatcher, but it was my understanding that it only works for the broadphase? It just seems like unless you could keep n cores busy for the entire duration of the 'stepSimulation' call, it will always be a significant bottleneck, no?
ola
Posts: 169
Joined: Sun Jan 14, 2007 7:56 pm
Location: Norway

Re: Running 'stepSimulation' as a parallel task...

Post by ola »

I've done this in one project (running on a PC with Linux). I recommend you make one thread for rendering only, and another thread for everything else, including the physics simulation. You'll only have to deal with protecting the transforms that are written by the simulation and read by the rendering thread. If you have more than two cores on your CPU then you could even use Bullet's multithreaded features as well.

One advantage is, you can enable vsync on your rendering without caring about it slowing down anything else.
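
A small sketch of what "protecting the transforms" between a simulation thread and a render thread might look like, using a single mutex. `Transform` and `SharedTransforms` are hypothetical names and nothing here is Bullet API; it is one possible arrangement, not a description of the project mentioned above.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a body transform (not a Bullet type).
struct Transform { float x, y, z; };

class SharedTransforms {
public:
    // Simulation thread: publish the transforms produced by the last step.
    void publish(const std::vector<Transform>& latest) {
        std::lock_guard<std::mutex> lock(mutex_);
        transforms_ = latest;
        ++version_;
    }

    // Render thread: copy out a consistent snapshot and return its version,
    // so the renderer can tell whether anything changed since its last copy.
    unsigned snapshot(std::vector<Transform>& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        out = transforms_;
        return version_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<Transform> transforms_;
    unsigned version_ = 0;
};
```

The lock is held only for the duration of a copy, so a render thread waiting on vsync never blocks the simulation for longer than that.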

Cheers,
Ola
Calder
Posts: 11
Joined: Fri May 15, 2009 10:01 pm

Re: Running 'stepSimulation' as a parallel task...

Post by Calder »

Erwin Coumans wrote: We are working on better parallel optimizations for Bullet 3.x, but this will still take a while. What are the timelines for your project?
Are there any plans to make Bullet thread-safe as described, or will this just be left up to the individual developer? I also use a different thread for each subsystem, as much for cleanliness of code as for performance, and while I am using the renderer for raycasts, I'm sure I'll run into some situation where I need to reference the Bullet world outside of the physics thread.
Zeal
Posts: 47
Joined: Thu Oct 18, 2007 6:49 am

Re: Running 'stepSimulation' as a parallel task...

Post by Zeal »

Calder wrote: I'm sure I'll run into some situation where I need to reference the Bullet world outside of the Physics thread.
I could think of lots of cases where you would want one subsystem to reference the state of another. For example, the AI system might need to do a raycast to check line of sight. Or maybe you want the graphics system to render based on how fast an object is moving. Sure, you could do SOME things via a simple message queue, but it sure would be nice just to have a thread-safe, read-only state that ALL systems/threads can access. And remember, the performance gains don't have to be great to justify such an approach, since the only real 'cost' is some memory overhead (which is cheap in most cases) and one frame of latency (which is very hard to detect at 60 fps).

On the other hand, assuming we could manage to keep 100% of the CPU cores busy during the 'stepSimulation' call, it might make sense to just run physics in a serial fashion. By serial I mean that, for each frame, you would first complete one entire physics step BEFORE you run any other tasks/threads. This would guarantee that all other systems could safely read the physics state. But short of that, it seems like double buffering is the only solution...
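
The 'serial' alternative can be sketched as two phases per frame: a write phase that finishes the physics step, then a read phase in which other tasks run in parallel against state nobody is mutating. Everything below is illustrative; `World`, `stepPhysics`, `FrameResult` and `runFrame` are made-up names, and the two read-only tasks are placeholders for things like an AI query or a render pass.

```cpp
#include <algorithm>
#include <cassert>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical world state: one height value per body.
struct World { std::vector<float> heights; };

// Results gathered by the read-only tasks of one frame.
struct FrameResult { float total; float highest; };

// Placeholder for the real physics step: moves every body up by dt.
inline void stepPhysics(World& world, float dt) {
    for (float& h : world.heights) h += dt;
}

inline FrameResult runFrame(World& world, float dt) {
    assert(!world.heights.empty());

    // Phase 1: the physics step is the only writer and runs to completion.
    stepPhysics(world, dt);

    // Phase 2: read-only tasks run concurrently on the now-frozen state.
    const World& frozen = world;
    auto total = std::async(std::launch::async, [&frozen] {
        return std::accumulate(frozen.heights.begin(), frozen.heights.end(), 0.0f);
    });
    auto highest = std::async(std::launch::async, [&frozen] {
        return *std::max_element(frozen.heights.begin(), frozen.heights.end());
    });
    return FrameResult{total.get(), highest.get()};
}
```

No locks or copies are needed because writers and readers never overlap. The price is that the readers sit idle for the whole of phase 1, which is why this only pays off if the step itself can use all the cores.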
Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: Running 'stepSimulation' as a parallel task...

Post by Erwin Coumans »

Zeal wrote: On the other hand, assuming we could manage to keep 100% of the cpu cores busy during the 'stepSimulation' call, it might make sense to just run physics in a serial fashion. By serial I mean - for each frame, you would first complete one entire physics step BEFORE you run any other tasks/threads. This would guarantee that all other systems could safely read the physics state.
This has been the assumption for Bullet 2.x work, that physics can keep all cores/SPUs/threads 100% busy during the stepSimulation call. We will reconsider double-buffering for Bullet 3.x.

Thanks,
Erwin
Calder
Posts: 11
Joined: Fri May 15, 2009 10:01 pm

Re: Running 'stepSimulation' as a parallel task...

Post by Calder »

Cool, thanks!
Zeal
Posts: 47
Joined: Thu Oct 18, 2007 6:49 am

Re: Running 'stepSimulation' as a parallel task...

Post by Zeal »

Erwin Coumans wrote: This has been the assumption for Bullet 2.x work, that physics can keep all cores/SPUs/threads 100% busy during the stepSimulation call
And currently how close are we to that goal? Can the current implementation really keep all cores busy during the stepSimulation call? I just always assumed you could only parallelize so much (like broadphase detection). It sounds like keeping all cores at 100% would be very tricky, to say the least.
sparkprime
Posts: 508
Joined: Fri May 30, 2008 2:51 am
Location: Ossining, New York

Re: Running 'stepSimulation' as a parallel task...

Post by sparkprime »

I agree with Ola -- any modern graphics engine will be maxing out a whole core. That is mostly due to the design philosophy behind d3d9 and gl.

All other aspects of the system (physics, AI, preparation of resources) can have their work chopped up into discrete independent units and farmed out to a thread pool (one thread for each available hardware thread) and therefore as long as there is enough work, the system will be used very efficiently. This is a very nice way to get good adaptive utilisation of cores with an unpredictable and dynamic work-load. I imagine this is what Bullet is doing internally.
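
A toy illustration of chopping work into independent units and farming them out to worker threads. `runWorkUnits` is a hypothetical helper that spawns its workers on every call for brevity; a real pool would keep the threads alive between frames, and this is not a description of what Bullet does internally.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Run every unit exactly once, spread across workerCount threads.
// Units must be independent of each other, since they run in any order.
inline void runWorkUnits(const std::vector<std::function<void()>>& units,
                         unsigned workerCount) {
    assert(workerCount > 0);

    // Shared cursor: each worker atomically claims the next unclaimed unit,
    // so fast workers naturally pick up more units than slow ones.
    std::atomic<std::size_t> next{0};
    auto worker = [&units, &next] {
        for (std::size_t i = next.fetch_add(1); i < units.size();
             i = next.fetch_add(1)) {
            units[i]();
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < workerCount; ++t) pool.emplace_back(worker);
    for (std::thread& th : pool) th.join();  // returns once all units are done
}
```

In practice `workerCount` would usually come from `std::thread::hardware_concurrency()`, which matches the "one thread for each available hardware thread" rule of thumb.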
Kafu
Posts: 8
Joined: Sat Jun 21, 2008 1:51 pm

Re: Running 'stepSimulation' as a parallel task...

Post by Kafu »

For reference there was a similar discussion here: http://bulletphysics.org/Bullet/phpBB3/ ... 23&start=0 .