Caching Raycast structure

pico · Post by **pico** » Thu Dec 09, 2010 4:18 pm

Hi,

I have a special project where I cast many rays against a specific btBvhTriangleMeshShape. I call btBvhTriangleMeshShape::performRaycast directly. Unfortunately the CPU usage of the raycasts is too high; I need something around 3 times faster.

Does anyone have an idea how to speed up raycasts against a specific mesh?
The raycasts are quite short and differ only a little each frame, so they would often hit the same nodes in the tree.

My first idea was to use a cache structure for each ray. However, I don't know how this could actually be implemented for btQuantizedBvh::walkStacklessTreeAgainstRay while still being conservative and not missing any hits.

Any ideas?
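One conservative way to cache per ray, sketched below, is to keep a list of candidate triangles gathered for a box somewhat larger than the ray, and reuse that list for as long as the new ray stays inside the box. Any triangle the new ray can hit must overlap the ray's own bounds, which lie inside the cached box, so no hit is missed. `RayCache`, `BoxQuery` and `candidates` are hypothetical names, not Bullet API. In Bullet the box query could presumably be served by `btBvhTriangleMeshShape::processAllTriangles`, but that wiring is an assumption and is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min, max; };

// Axis-aligned bounds of the segment from a to b.
inline Aabb rayBounds(const Vec3& a, const Vec3& b) {
    return {{std::min(a.x, b.x), std::min(a.y, b.y), std::min(a.z, b.z)},
            {std::max(a.x, b.x), std::max(a.y, b.y), std::max(a.z, b.z)}};
}

// True if inner lies completely inside outer.
inline bool contains(const Aabb& outer, const Aabb& inner) {
    return outer.min.x <= inner.min.x && outer.min.y <= inner.min.y &&
           outer.min.z <= inner.min.z && outer.max.x >= inner.max.x &&
           outer.max.y >= inner.max.y && outer.max.z >= inner.max.z;
}

// Hypothetical per-ray cache (not part of Bullet).
struct RayCache {
    Aabb region{};               // expanded box the triangle list was built for
    std::vector<int> triangles;  // indices of triangles overlapping region
    bool valid = false;
    int rebuilds = 0;            // counts how often the tree had to be walked
};

// Assumed callback: appends every triangle whose AABB overlaps the given box.
using BoxQuery = std::function<void(const Aabb&, std::vector<int>&)>;

// Returns the triangles the ray must be tested against. The tree is only
// queried again when the ray leaves the cached, expanded region.
inline const std::vector<int>& candidates(RayCache& cache, const Vec3& from,
                                          const Vec3& to, float margin,
                                          const BoxQuery& query) {
    assert(margin >= 0.0f);
    const Aabb b = rayBounds(from, to);
    if (!cache.valid || !contains(cache.region, b)) {
        cache.region = {{b.min.x - margin, b.min.y - margin, b.min.z - margin},
                        {b.max.x + margin, b.max.y + margin, b.max.z + margin}};
        cache.triangles.clear();
        query(cache.region, cache.triangles);
        cache.valid = true;
        ++cache.rebuilds;
    }
    return cache.triangles;
}
```

The margin trades rebuild frequency against the number of triangles tested per ray, so it would need tuning against the actual frame-to-frame ray movement.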

Erwin Coumans · Post by **Erwin Coumans** » Thu Dec 09, 2010 7:58 pm

What CPU are you targeting?

One option is using multiple threads or another option is using a batch raycast. The batch raycast implementation is currently disabled, but with some effort you could re-enable it using an older Bullet SDK.
See also Bullet\Demos\ConcaveRaycastDemo\ConcaveRaycastDemo.cpp

Thanks,
Erwin

pico · Post by **pico** » Fri Dec 10, 2010 8:57 am

Erwin Coumans wrote:What CPU are you targeting?

One option is using multiple threads or another option is using a batch raycast. The batch raycast implementation is currently disabled, but with some effort you could re-enable it using an older Bullet SDK.
See also Bullet\Demos\ConcaveRaycastDemo\ConcaveRaycastDemo.cpp

Thanks,
Erwin

Hi Erwin,

Thanks for the reply. I am targeting a relatively slow single-core CPU with a small cache.
After profiling I noticed that 50% of the time went into the un-quantize functions. Disabling quantization alone doubled performance.
The rest of the time goes into walking the tree.
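To illustrate where the un-quantize cost comes from, here is a simplified sketch of 16-bit AABB quantization. It is not Bullet's actual btQuantizedBvh code; `AabbQuantizer` and its members are hypothetical. The point is that every visited node has to decode both corners back to floats (three divides and three adds per corner here) before the ray-box test can run. In Bullet itself, quantization is selected through the `useQuantizedAabbCompression` argument of the btBvhTriangleMeshShape constructor.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };
using QPoint = std::array<std::uint16_t, 3>;

// Hypothetical helper (not Bullet's class): maps points inside the mesh
// bounds to 16-bit grid cells and back.
struct AabbQuantizer {
    Vec3 origin;  // minimum corner of the whole mesh AABB
    Vec3 scale;   // grid cells per world unit on each axis

    AabbQuantizer(Vec3 meshMin, Vec3 meshMax)
        : origin(meshMin),
          scale{65535.0f / (meshMax.x - meshMin.x),
                65535.0f / (meshMax.y - meshMin.y),
                65535.0f / (meshMax.z - meshMin.z)} {
        assert(meshMax.x > meshMin.x && meshMax.y > meshMin.y &&
               meshMax.z > meshMin.z);
    }

    // Clamp to the representable range before narrowing to 16 bits.
    static std::uint16_t toCell(float v) {
        return static_cast<std::uint16_t>(std::min(65535.0f, std::max(0.0f, v)));
    }

    // Minimum corners round down so the stored box never shrinks.
    QPoint quantizeMin(Vec3 p) const {
        return {toCell(std::floor((p.x - origin.x) * scale.x)),
                toCell(std::floor((p.y - origin.y) * scale.y)),
                toCell(std::floor((p.z - origin.z) * scale.z))};
    }

    // Maximum corners round up for the same reason.
    QPoint quantizeMax(Vec3 p) const {
        return {toCell(std::ceil((p.x - origin.x) * scale.x)),
                toCell(std::ceil((p.y - origin.y) * scale.y)),
                toCell(std::ceil((p.z - origin.z) * scale.z))};
    }

    // The per-node decode step that a tree walk pays for each corner.
    Vec3 unquantize(const QPoint& q) const {
        return {q[0] / scale.x + origin.x,
                q[1] / scale.y + origin.y,
                q[2] / scale.z + origin.z};
    }
};
```

Storing float boxes avoids the decode but makes each node larger, which may matter on a CPU with a small cache.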

I had already forgotten about the batch raycast. I will take a look at an old SDK. Was there a specific reason to disable it?

I think an axis-sorted approach (like a 'static' sweep and prune) for the triangle mesh could improve performance. Each raycast would first determine the AABB overlaps in a broadphase. The overlaps are then sorted by the closest point on the ray to each AABB center. This is not 100% correct, but with semi-evenly distributed triangles it should always give proper results in real-world use. Though I'm not sure how a caching scheme could be added to the raycasts to improve performance further...
Do you think this could actually be faster than the current BVH walk?

Thanks
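A rough sketch of the axis-sorted idea from the post above, under some assumptions: triangle AABBs are sorted once by their minimum x, a query discards every box that starts beyond the ray's bounds on x, the remaining boxes are filtered on all three axes, and the survivors are ordered by how far along the ray the point closest to each box center lies. That reading of "sorted by closest point on ray" is an interpretation, and `SortedAxisBroadphase` and the other names are hypothetical. Only one end of the sorted list is pruned here, so whether it beats the BVH walk would have to be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
struct Box { Vec3 min, max; int triangle; };

inline Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Parameter t in [0, 1] of the point on segment from-to closest to p.
inline float closestParam(Vec3 p, Vec3 from, Vec3 to) {
    const Vec3 d = sub(to, from);
    const float len2 = dot(d, d);
    const float t = len2 > 0.0f ? dot(sub(p, from), d) / len2 : 0.0f;
    return std::max(0.0f, std::min(1.0f, t));
}

// Hypothetical static broadphase: boxes are sorted once by min.x.
struct SortedAxisBroadphase {
    std::vector<Box> boxes;

    explicit SortedAxisBroadphase(std::vector<Box> in) : boxes(std::move(in)) {
        std::sort(boxes.begin(), boxes.end(),
                  [](const Box& a, const Box& b) { return a.min.x < b.min.x; });
    }

    // Triangle indices whose boxes overlap the ray's bounds, ordered
    // approximately front to back along the ray.
    std::vector<int> query(Vec3 from, Vec3 to) const {
        const Vec3 lo{std::min(from.x, to.x), std::min(from.y, to.y), std::min(from.z, to.z)};
        const Vec3 hi{std::max(from.x, to.x), std::max(from.y, to.y), std::max(from.z, to.z)};
        // Boxes starting beyond hi.x cannot overlap; skip them via binary search.
        const auto end = std::upper_bound(
            boxes.begin(), boxes.end(), hi.x,
            [](float v, const Box& b) { return v < b.min.x; });
        std::vector<std::pair<float, int>> hits;  // (t of box center, triangle)
        for (auto it = boxes.begin(); it != end; ++it) {
            if (it->max.x < lo.x || it->max.y < lo.y || it->min.y > hi.y ||
                it->max.z < lo.z || it->min.z > hi.z) continue;
            const Vec3 c{(it->min.x + it->max.x) * 0.5f,
                         (it->min.y + it->max.y) * 0.5f,
                         (it->min.z + it->max.z) * 0.5f};
            hits.push_back({closestParam(c, from, to), it->triangle});
        }
        std::sort(hits.begin(), hits.end());
        std::vector<int> out;
        for (const auto& h : hits) out.push_back(h.second);
        return out;
    }
};
```

Because the ordering is based on box centers rather than exact hit distances, stopping at the first triangle hit is only approximately correct, as the post already notes.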

pico · Post by **pico** » Mon Dec 13, 2010 2:12 pm

Erwin Coumans wrote:One option is using multiple threads or another option is using a batch raycast. The batch raycast implementation is currently disabled, but with some effort you could re-enable it using an older Bullet SDK.
See also Bullet\Demos\ConcaveRaycastDemo\ConcaveRaycastDemo.cpp

Hi Erwin,

I've searched ConcaveRaycastDemo.cpp in older Bullet sources, but the only non-SPU implementations I found were just using the standard raycast.
Was there a specially optimized non-SPU batch raycast in an older version that I missed? I searched back as far as the 2008 sources...
