Minor refactoring in cd-pipeline.

S.Lundmark · Post by **S.Lundmark** » Tue Oct 27, 2009 4:28 pm

Hi,

I've refactored the collision-detection pipeline. Instead of passing btCollisionObject*'s, I added a new struct, btCollider, that contains the information necessary to traverse the collision-detection pipeline. I did this because btCollisionObject has its collision shape switched during compound collision detection, and I need read access to the same instance of a btCompoundCollisionShape from multiple threads at once.
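A minimal sketch of the idea, using simplified stand-in types rather than Bullet's real classes. The `Collider` struct and both helper functions are hypothetical; the actual btCollider from the patch is not shown in this thread. It only illustrates how a small value type lets the compound code substitute a child shape without writing to the shared object.

```cpp
#include <cassert>

// Simplified stand-ins for illustration only; not Bullet's real classes.
struct Shape { int id; };
struct Transform { float x, y, z; };

struct CollisionObject {
    const Shape* shape;        // shared across threads, should stay read-only
    Transform worldTransform;
};

// Hypothetical lightweight view passed through the narrowphase.
// It carries its own shape pointer and transform, so a child shape can be
// substituted here instead of being written into the CollisionObject.
struct Collider {
    const CollisionObject* object;
    const Shape* shape;
    Transform worldTransform;
};

// Wrap a collision object without modifying it.
inline Collider makeCollider(const CollisionObject& obj) {
    return Collider{&obj, obj.shape, obj.worldTransform};
}

// Build a collider for one child of a compound; the parent object is untouched.
inline Collider makeChildCollider(const Collider& parent,
                                  const Shape& child,
                                  const Transform& childWorld) {
    assert(parent.object != nullptr);
    return Collider{parent.object, &child, childWorld};
}
```

Because every thread builds its own `Collider` values, the shared object and its compound shape are only ever read.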

I'm wondering if this feature is something that you guys want added into the repository? I guess that this would also include my quantized bvh-implementation for the btCompoundShape.

I'd also like to ask a question regarding the btCompoundCollisionAlgorithm.

At the end of the processCollision() function, an iteration is done over all childCollisionAlgorithms. For each algorithm that exists, an aabb-test is done, and if the test fails the algorithm is removed. In the case of a list-shape that does not change, could this test be skipped? Since each collision pair from the broadphase creates a unique collision algorithm, the childCollisionAlgorithms are only used in the case of actual aabb overlap (assuming that a spatial lookup such as the dbvt or the quantized bvh is used). In my case, this loop takes a lot of time. I have a really large compoundShape, and the general case is that objects are colliding against it.
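The cleanup pass being asked about can be sketched roughly as below. This is a simplified model with hypothetical types (`CompoundPairState`, `pruneChildAlgorithms`, and the `skipCleanup` flag are not Bullet names); it only shows the shape of the loop and where a skip would go.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Aabb { float min[3]; float max[3]; };

// Standard interval test on each axis.
inline bool overlaps(const Aabb& a, const Aabb& b) {
    for (int i = 0; i < 3; ++i) {
        if (a.max[i] < b.min[i] || b.max[i] < a.min[i]) return false;
    }
    return true;
}

struct ChildAlgorithm { int childIndex; };

// Hypothetical per-pair state held by a compound collision algorithm.
struct CompoundPairState {
    std::vector<Aabb> childAabbs;             // world-space AABB per child
    std::vector<ChildAlgorithm*> childAlgos;  // nullptr when none exists
};

// Frees child algorithms whose child AABB no longer overlaps the other body
// and returns how many were removed. With skipCleanup set, the scan over all
// children is bypassed entirely.
inline std::size_t pruneChildAlgorithms(CompoundPairState& s,
                                        const Aabb& other,
                                        bool skipCleanup) {
    assert(s.childAabbs.size() == s.childAlgos.size());
    if (skipCleanup) return 0;
    std::size_t removed = 0;
    for (std::size_t i = 0; i < s.childAlgos.size(); ++i) {
        if (s.childAlgos[i] && !overlaps(s.childAabbs[i], other)) {
            delete s.childAlgos[i];
            s.childAlgos[i] = nullptr;
            ++removed;
        }
    }
    return removed;
}
```

In this sketch, skipping the scan means stale entries stay alive until the pair itself is destroyed. Whether that is acceptable in Bullet depends on how stale manifolds are handled, which this thread does not settle.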

Also, would it not be faster to do the check for the dynamic aabb-tree outside the loop in the preAllocateChildAlgorithms, and just use a memset instead?
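The suggestion amounts to hoisting a loop-invariant branch. A rough sketch with hypothetical names (not the actual preAllocateChildAlgorithms code), assuming that when a tree is present the child algorithms are created lazily and the array only needs to be cleared:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct ChildAlgorithm { int childIndex; };

// Variant A: the tree check is evaluated once per child inside the loop.
inline void preallocateBranchInLoop(std::vector<ChildAlgorithm*>& algos, bool hasTree) {
    for (std::size_t i = 0; i < algos.size(); ++i) {
        if (hasTree) {
            algos[i] = nullptr;  // created later, only on AABB overlap
        } else {
            algos[i] = new ChildAlgorithm{static_cast<int>(i)};
        }
    }
}

// Variant B: check once, then clear the whole array with a single memset.
// Assumes an all-zero bit pattern is a null pointer, which holds on
// mainstream platforms.
inline void preallocateHoisted(std::vector<ChildAlgorithm*>& algos, bool hasTree) {
    if (hasTree) {
        if (!algos.empty()) {
            std::memset(algos.data(), 0, algos.size() * sizeof(ChildAlgorithm*));
        }
        return;
    }
    for (std::size_t i = 0; i < algos.size(); ++i) {
        algos[i] = new ChildAlgorithm{static_cast<int>(i)};
    }
}

// Frees every allocated entry and resets it to nullptr.
inline void destroyAll(std::vector<ChildAlgorithm*>& algos) {
    for (std::size_t i = 0; i < algos.size(); ++i) {
        delete algos[i];
        algos[i] = nullptr;
    }
}
```

Both variants leave the array in the same state; whether the hoisted form is measurably faster would need profiling.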

Cheers,
Simon

S.Lundmark · Post by **S.Lundmark** » Fri Oct 30, 2009 8:16 am

Hey,

I've refactored the mentioned parts, passing btCollider's instead of btCollisionObject's through the narrowphase collision pipeline, and it seems to work very well. I haven't tested all of the demos in the demo framework with it, but I'll try to do that sometime during next week.

Is there a guide as to how I can generate appropriate changelists for you to review?

/Simon.

S.Lundmark · Post by **S.Lundmark** » Fri Oct 30, 2009 8:40 am

Another question:

Is there any reason why the btPersistentManifold pointer / m_ownsManifold bool found in pretty much all collisionAlgorithms isn't located in the basic btCollisionAlgorithm class? Moving it to the base class could save quite a lot of code.

/Simon

ola · Post by **ola** » Fri Oct 30, 2009 9:46 am

Hi Simon,

great work!

You can post your patch at the issue tracker, found here: http://code.google.com/p/bullet/issues/list
Then it won't get lost in the threads here in the forum.

Cheers,
Ola

S.Lundmark · Post by **S.Lundmark** » Wed Nov 04, 2009 7:15 am

Hi Ola,

Thanks! I'm currently reviewing my changelist (which was a lot larger than I first thought).

It also involves passing a btCollisionProcessInfo struct to the processCollision() function(s). This allowed me to pass the dispatcher through the calls rather than having each collisionAlgorithm hold a pointer to it. In another changelist, which I've finished, the pointers that gjk needs are also accessed through the dispatcher. This removes the need for creating structs that have specific implementations of how to create a collisionAlgorithm.
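A rough sketch of passing the dispatcher per call instead of storing it in the algorithm. All names here are stand-ins; the real btCollisionProcessInfo from the patch is not shown in this thread, so the fields are assumptions.

```cpp
#include <cassert>

// Stand-in for a per-thread dispatcher; fields are illustrative only.
struct Dispatcher {
    int threadId;
    int pairsProcessed;
};

// Hypothetical per-call context, in the spirit of the btCollisionProcessInfo
// struct described above.
struct ProcessInfo {
    Dispatcher* dispatcher;
    float timeStep;
};

// The algorithm holds no dispatcher pointer of its own, so it is not bound to
// the thread (or dispatcher) that created it.
struct Algorithm {
    int lastThreadId;
    Algorithm() : lastThreadId(-1) {}

    void processCollision(const ProcessInfo& info) {
        assert(info.dispatcher != nullptr);
        ++info.dispatcher->pairsProcessed;
        lastThreadId = info.dispatcher->threadId;
    }
};
```

With this layout the same algorithm instance can be driven by whichever dispatcher happens to process the pair in a given step.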

The reason I did this is that my in-house multithreading of the Bullet collision pipeline (for non-PS3 platforms) involves using one dispatcher per thread and allowing the collisionAlgorithms to be used on any thread. This seems to separate the ownerships better and creates what I think is a cleaner API in the cd pipeline. I'll try to send that changelist as soon as I'm done with this one. However, as my code involves some other optimizations that I don't think you are interested in (some ugly cache prefetching calls), it is taking some time to go through the code and clean it up before submission. I don't have more than maybe 1 hour or so per day to work on this either. Hopefully I'll be able to send you guys the first changelist sometime during next week.

If there are any objections to the changes that I've made, I'd be very glad to hear them as soon as possible, so that I'm not doing this work only for it to get thrown out the window.

I assume that you guys will want to review the code quite a bit before it's passed through though.

Do the changes sound OK?
/Simon

Erwin Coumans · Post by **Erwin Coumans** » Wed Nov 04, 2009 11:07 am

Hi Simon,

Thanks for all the work, and for considering sharing it.

As Ola already mentioned, please create an issue, with preferably a few patches for each major change (with description) and post it at http://code.google.com/p/bullet/issues/list
(please try to separate "quantized bvh-implementation for the btCompoundShape" patch from the refactoring patch if possible)

Note that we are now ramping up with Bullet 3.x, optimizing for GPGPU (OpenCL/Larrabee etc), changing the API a bit, so your refactoring contribution could influence this 3.x work.
Bullet 2.x is going to stabilize (bug fixes, no changes in API). Note that we didn't put a persistent manifold in the collision algorithm for non-SpuGatheringCollisionDispatcher so that we can have more than 1 persistent manifold (GIMPACT, for example, provides multiple manifolds, and so does the compound shape).
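The point about multiple manifolds can be illustrated with a small sketch. The types below are stand-ins, not Bullet's classes: each algorithm reports however many manifolds it holds, which a single manifold pointer in the base class would not cover.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Manifold { int id; };  // stand-in, not btPersistentManifold

// Base interface: no manifold member, only a way to enumerate manifolds.
struct Algorithm {
    virtual ~Algorithm() {}
    virtual void getAllManifolds(std::vector<const Manifold*>& out) const = 0;
};

// A simple pair algorithm owns exactly one manifold.
struct ConvexConvexAlgorithm : Algorithm {
    Manifold manifold;
    explicit ConvexConvexAlgorithm(int id) : manifold{id} {}
    void getAllManifolds(std::vector<const Manifold*>& out) const override {
        out.push_back(&manifold);
    }
};

// A compound algorithm gathers one manifold per active child pair.
struct CompoundAlgorithm : Algorithm {
    std::vector<ConvexConvexAlgorithm> children;
    void getAllManifolds(std::vector<const Manifold*>& out) const override {
        for (std::size_t i = 0; i < children.size(); ++i) {
            children[i].getAllManifolds(out);
        }
    }
};
```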

Have you tried to use Bullet/MultiThreaded/SpuGatheringCollisionDispatcher.cpp on PC/XBox360?
It provides a parallel collision dispatcher and it works for all platforms. Some games use it for XBox 360 and PC. Despite its confusing name, SpuGatheringCollisionDispatcher.cpp supports multithreading using either Win32 threads or pthreads, in addition to SPURS tasks.
Thanks,
Erwin

S.Lundmark · Post by **S.Lundmark** » Wed Nov 04, 2009 5:09 pm

Hi Erwin,

I'll try to separate the quantized bvh-implementation of the btCompoundShape, it shouldn't be very hard.

Thanks for giving the heads up on 3.x, I'll definitely try to give as much feedback and as many patches as I can, given the amount of time I can spend on it of course. I'll post it on Google Code as soon as I have the changelists cleaned up and working.

I tried the MultiThreaded version that comes with Bullet, but unfortunately there were issues with it that led me to choose my own implementation:

The btCompound algorithm is O(n^2) in the multithreaded lib. This could be optimized fairly easily if it weren't for the fact that the prefetching of the btCompound child shapes is limited to a certain number of shapes. Not only that, the optimized box-box collision check ran before the actual 'processCollision' run-through that selects different algorithms based on what type of shapes collide. This caused the collision to be resolved as compound vs x followed by gjk for convex-convex if you were trying to collide a box vs a compound of boxes. This is the general case for us.

This was in Bullet 2.74 though, so maybe you changed some of these things in 2.75? Another issue was that the spu-intrinsics for main-memory fetches were translated into shared-memory versions that seemed to include memcopies. I'm not 100% certain that I'm correct on these points, but that is how I interpreted the code. I decided that, due to our very specific use of btCompoundShapes (large number of child shapes + the quantizedBvh addition to it), it would take me more time to fix the issues than to just implement my own. Fixing up the multithreading issues in the "normal" collision pipeline wasn't a very difficult task, and I am much more familiar with that code. The only real issue was implementing a multithreaded version of the btPoolAllocator. I can send you the code if you want it upstreamed?
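One simple way to make a fixed-size pool usable from several threads is to guard its free list with a mutex. The sketch below is a hypothetical `LockedPool`, not btPoolAllocator and not the implementation mentioned above, which is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical mutex-guarded fixed-size pool. Slots are stored as runs of
// pointer-sized words, and the first word of a free slot links to the next.
class LockedPool {
public:
    LockedPool(std::size_t elemSize, std::size_t maxElems)
        : slotWords_((elemSize + sizeof(void*) - 1) / sizeof(void*)),
          freeHead_(nullptr),
          freeCount_(maxElems) {
        if (slotWords_ == 0) slotWords_ = 1;
        storage_.assign(slotWords_ * maxElems, nullptr);
        // Thread the free list through the slots, last to first.
        for (std::size_t i = maxElems; i > 0; --i) {
            void** slot = &storage_[(i - 1) * slotWords_];
            *slot = freeHead_;
            freeHead_ = slot;
        }
    }

    // Returns a free slot, or nullptr when the pool is exhausted.
    void* allocate() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!freeHead_) return nullptr;
        void** slot = freeHead_;
        freeHead_ = static_cast<void**>(*slot);
        --freeCount_;
        return slot;
    }

    // Returns a slot previously obtained from allocate() to the pool.
    void release(void* p) {
        assert(p != nullptr);
        std::lock_guard<std::mutex> lock(mutex_);
        void** slot = static_cast<void**>(p);
        *slot = freeHead_;
        freeHead_ = slot;
        ++freeCount_;
    }

    std::size_t freeCount() {
        std::lock_guard<std::mutex> lock(mutex_);
        return freeCount_;
    }

private:
    std::size_t slotWords_;
    std::vector<void*> storage_;
    void** freeHead_;
    std::size_t freeCount_;
    std::mutex mutex_;
};
```

A single lock is the simplest option and may contend under load; per-thread pools or a lock-free free list are common alternatives.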

I will be trying to look into more optimizations of the btCompoundCollisionAlgorithm, perhaps trying to eliminate the aabb-tests in the end of the algorithm. If you have any tips regarding that, I would be most grateful.

Erwin Coumans · Post by **Erwin Coumans** » Wed Nov 04, 2009 5:34 pm

S.Lundmark wrote: I tried the MultiThreaded version that comes with Bullet, but unfortunately there were issues with it that led me to choose my own implementation:

The btCompound algorithm is O(n^2) in the multithreaded lib. This could be optimized fairly easily if it weren't for the fact that the prefetching of the btCompound child shapes is limited to a certain number of shapes. Not only that, the optimized box-box collision check ran before the actual 'processCollision' run-through that selects different algorithms based on what type of shapes collide. This caused the collision to be resolved as compound vs x followed by gjk for convex-convex if you were trying to collide a box vs a compound of boxes. This is the general case for us.

This was in Bullet 2.74 though, so maybe you changed some of these things in 2.75? Another issue was that the spu-intrinsics for main-memory fetches were translated into shared-memory versions that seemed to include memcopies. I'm not 100% certain that I'm correct on these points, but that is how I interpreted the code. I decided that, due to our very specific use of btCompoundShapes (large number of child shapes + the quantizedBvh addition to it), it would take me more time to fix the issues than to just implement my own. Fixing up the multithreading issues in the "normal" collision pipeline wasn't a very difficult task, and I am much more familiar with that code. The only real issue was implementing a multithreaded version of the btPoolAllocator. I can send you the code if you want it upstreamed?

Those 3 issues could be resolved. btCompoundShape wasn't designed for a huge number of child shapes, hence the n^2 test. Most of the memcopy fake-dma fetches have been eliminated already. Are there remaining copies that show up in any profiling timings? The box-box check could be enabled within the compound dispatch.

Does this mean you are running the narrowphase collision detection on PPU for PS3?

S.Lundmark wrote: I will be trying to look into more optimizations of the btCompoundCollisionAlgorithm, perhaps trying to eliminate the aabb-tests in the end of the algorithm. If you have any tips regarding that, I would be most grateful.

Of course, Google Code contributions are welcome. Have you also implemented using the acceleration structures for the ray test / convex cast against a btCompoundShape?
Thanks,
Erwin

S.Lundmark · Post by **S.Lundmark** » Thu Nov 05, 2009 6:42 am

Hi Erwin,

Yes, currently we are still running Bullet on the PPU on the PS3. We are going to use the Sony-supported version of spu-bullet there though. We've had the previous spu-bullet running, so it shouldn't be a problem.

I'll take a look at the multithreading library in 2.75, to see if that's something we can use in future releases to avoid the work of patching local code.

I didn't do much profiling on the multithreaded version, since the first issue would be to resolve the O(n^2) behavior by using the quantized bvh together with allowing big compounds. We looked at how much time it would take to do either that or multithread the normal version in-house. We decided that the in-house version would take less time to implement, since I am more familiar with that code and we already had the quantized bvh implemented in that cd pipeline.

The quantized bvh has not yet been ported to convexcast / raycasting. I will be working on that in January, unless I find some spare time to fix it sooner.

Do you want me to put that patch on hold until it is fully integrated? Perhaps that is better.

Cheers,
Simon

Real-Time Physics Simulation Forum
