Multi SAP
Posted: Tue Jul 31, 2007 2:31 pm
Hi,
I implemented the "multi SAP" optimization we talked about in this forum a while back. I also benchmarked the new code against available versions, and while doing this I found that Bullet doesn't report the same number of overlapping shapes as the others. Below are my preliminary notes for this stuff.
Any idea about this Bullet issue? Is this normal? If yes, is there a way to "fix" this?
Thanks,
- Pierre
-----------------------------------------
Multi-SAP notes:
================
The sweep-and-prune (SAP) algorithm doesn't scale well. As the number of objects
increases in the SAP structure, updating it for a single object takes longer and
longer. To solve this, a natural idea is to use multiple broad-phases instead of
a single one. For example, and that's what we tried here, one could use a 2D grid
of SAP structures.
The implementation is not trivial, but not super complicated either, so the
details are left out for another time. Basically we rasterize each object's 2D
bounding box (discarding the "up" axis) into a 2D grid covering the game world,
and insert the object in the SAPs of all covered cells. A given object can
therefore be inserted in multiple SAPs.
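To make the rasterization step a bit more concrete, here is a minimal sketch of how a 2D box could be mapped to the grid cells it covers. This is not the ICE code: the function name, the (x, z) convention, and the clamping policy are all assumptions made for illustration. Only the 8*8 grid size comes from the test setup.

```python
import math

def covered_cells(box_min, box_max, world_min, world_max, grid_size=8):
    """Return the (ix, iz) grid cells overlapped by a 2D box.

    box_min/box_max and world_min/world_max are (x, z) pairs: the "up" axis
    has already been dropped. grid_size=8 matches the 8*8 grid used in the
    test; every other name here is hypothetical.
    """
    def cell_index(value, lo, hi):
        # Map a coordinate to a cell index, clamped to the grid so that
        # boxes touching or leaving the world bounds stay in a valid cell.
        t = (value - lo) / (hi - lo)
        return min(grid_size - 1, max(0, int(math.floor(t * grid_size))))

    x0 = cell_index(box_min[0], world_min[0], world_max[0])
    x1 = cell_index(box_max[0], world_min[0], world_max[0])
    z0 = cell_index(box_min[1], world_min[1], world_max[1])
    z1 = cell_index(box_max[1], world_min[1], world_max[1])
    # The object would be inserted into the SAP of every cell in this range.
    return [(ix, iz) for ix in range(x0, x1 + 1) for iz in range(z0, z1 + 1)]
```

With a 1000 x 1000 world, each cell is 125 units wide, so a box spanning x = 120 to 130 straddles two cells and would end up in two SAPs.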
Then we profiled 2 things:
- insertion of new objects (a new object is created in the game world)
- updates of objects (an existing object is moved around)
We profiled 4 different implementations:
- the original ICE sweep-and-prune (based on the one released in Opcode 1.3, but
more optimized)
- the multi SAP which has been built on the ICE version
- the PhysX SAP implementation
- the Bullet SAP implementation (from version 2.55)
In all honesty, only the ICE SAP and its multi-SAP version should be compared,
as they share the same algorithmic details. Other SAP implementations did not
make the same design choices (e.g. linked-lists or arrays, FPU or CPU compares,
etc) so any speed difference between them and the Multi-SAP might come from
those design choices, more than from the nature of the SAP (single or multi).
However it is useful to compare the PhysX and Bullet versions to the ICE SAP,
to see the impact of those design choices on runtime speed. For example it is
often heard that "using X is faster than using Y" when it comes to SAPs. We
found that very often, it actually depends on how X or Y has been implemented,
more than anything else. (For example a linked list of pool-allocated elements
is not as bad as a linked list of elements randomly located in memory.)
In any case, here are the features for the different SAP implementations:
ICE single SAP:
- linked lists (objects allocated from pools)
- FPU or CPU comparisons (CPU in this test)
- unlimited game world
ICE multi SAP:
- 2D array of "ICE single SAPs" (for this test, grid is 8*8 = 64 SAPs)
- limited game world (world bounds needed at creation time)
PhysX:
- arrays
- CPU comparisons
- unlimited game world
Bullet:
- arrays
- CPU comparisons
- quantized boxes
- limited game world (world bounds needed at creation time)
- limited to 32767 objects (*)
(*) using the BP_USE_FIXEDPOINT_INT_32 define to remove this constraint has a
significant performance impact, as we will show in one test.
================================================================================
I - Insertions:
---------------
The test is like this: we create N randomly located objects in a (1000 x 1000 x
1000) game world. Objects have a homogeneous size. This is somewhat artificial
but all implementations are tested against the same scenario so it should be fair.
Then we create object N+1 and profile how long it takes to update the structure.
Some implementations (e.g. PhysX) do not update the structure immediately, only
when the broadphase (BP) is later "updated", so we included the BP update in the
profile. Note however that an implementation (e.g. PhysX) can be optimized for
multiple insertions at the same time, and this test doesn't reflect that feature.
But the focus should be on the MultiSAP here. At the time of writing the MultiSAP
version doesn't take advantage of multiple insertions, so we didn't test this case
(but we might come back to this later).
Anyway, current results are like this:
--------------------------------------------------------------------------------
N = 400
Insertion time PhysX: 142208
Insertion time Bullet: 66808
Insertion time ICE single: 169456
Insertion time ICE multi: 8056
N = 4000
Insertion time PhysX: 1528408
Insertion time Bullet: 1159072
Insertion time ICE single: 2725808
Insertion time ICE multi: 35712
N = 10000
Insertion time PhysX: 4417480
Insertion time Bullet: 4024632
Insertion time ICE single: 6730048
Insertion time ICE multi: 174712
--------------------------------------------------------------------------------
And the comments:
- insertion has never been optimized in the ICE single SAP. Each object is just
added to the 3 linked lists in linear time. It does not try to use "stabbing
numbers" or "markers" to optimize this process. So it is not a surprise that
this version is the slowest.
- a much more interesting result comes from the Multi SAP, which really shines
here. It is based on the same ICE code, so it also doesn't use any special
optimization in individual SAPs. Nonetheless, it easily beats all the other
implementations for insertion times. This is good because that was the whole
point of the multi SAP in the first place: to optimize creation of new objects.
- the multi SAP is between 21 and 76 times faster than its "single" counterpart.
It is also between 8 and 32 times faster than Bullet, which is usually slightly
faster than PhysX for insertions.
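For anyone who wants to check the speedup figures, the ratios can be recomputed directly from the insertion table. The numbers below are copied from the results above; the dictionary layout and the function name are just illustrative.

```python
# Insertion times copied from the results table above (time units as reported).
insertion_times = {
    400:   {"PhysX": 142208,  "Bullet": 66808,   "ICE single": 169456,  "ICE multi": 8056},
    4000:  {"PhysX": 1528408, "Bullet": 1159072, "ICE single": 2725808, "ICE multi": 35712},
    10000: {"PhysX": 4417480, "Bullet": 4024632, "ICE single": 6730048, "ICE multi": 174712},
}

def speedup_over(baseline, times=insertion_times):
    """Ratio baseline / ICE multi for each N (higher means multi SAP is faster)."""
    return {n: row[baseline] / row["ICE multi"] for n, row in times.items()}

for name in ("ICE single", "Bullet", "PhysX"):
    ratios = speedup_over(name)
    print(name, {n: round(r, 1) for n, r in ratios.items()})
```

Note that the largest speedup shows up at N = 4000 rather than N = 10000, so the gain does not grow monotonically with N in this test.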
================================================================================
II - Updates:
-------------
In this test we use the same setup as before, and objects are randomly moved within
the game world, following sine/cosine curves (Lissajous). Again, this is very
artificial, but everybody's tested against the same scenario.
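As a rough idea of what such a motion looks like, here is a small sketch of a 3D Lissajous path. The exact amplitudes, frequencies and phases used in the benchmark are not given here, so every parameter value below is a made-up assumption.

```python
import math

def lissajous_position(t, center, amplitude, freq, phase):
    """Position at time t on a 3D Lissajous curve: one sinusoid per axis.

    All four parameters are 3-tuples. The values an actual benchmark would
    use are unknown; anything passed in here is a hypothetical choice.
    """
    return tuple(
        c + a * math.sin(f * t + p)
        for c, a, f, p in zip(center, amplitude, freq, phase)
    )

# Hypothetical object orbiting the middle of a 1000^3 world. Different
# frequencies per axis give the characteristic Lissajous figure.
pos = lissajous_position(
    t=0.5,
    center=(500.0, 500.0, 500.0),
    amplitude=(400.0, 400.0, 400.0),
    freq=(1.0, 2.0, 3.0),
    phase=(0.0, 0.5, 1.0),
)
```

Keeping center plus amplitude inside the world bounds guarantees the object never leaves the world, which matters for the implementations that need fixed world bounds.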
We have 3 variables to play with here:
- the total number of objects in the world (N)
- the number of objects moving at any given time (M)
- the speed of moving objects
We profiled two main scenarios:
- all objects are moving
- 1% of objects are moving
The first scenario is rather unlikely in a game, but it is a good stress test.
The speed is the same for all objects. We need to do more tests with varying
speeds.
Results so far:
The 8 columns are:
PhysX time | ICE single time | Bullet time | ICE Multi time ||
PhysX nb pairs | ICE single nb pairs | Bullet nb pairs | ICE Multi nb pairs
We report results for several consecutive frames (one line = one frame), as the
numbers might slightly evolve and diverge from one frame to the next.
PX = PhysX
IS = ICE Single
IM = ICE Multi
BT = Bullet
--------------------------------------------------------------------------------
1) All objects are moving:
--------------------------
N = M = 400 objects
PX | IS | BT | IM || PX | IS | BT | IM
646 | 526 | 773 | 548 || 2 | 2 | 4 | 2
647 | 543 | 754 | 537 || 2 | 2 | 3 | 2
646 | 523 | 753 | 561 || 2 | 2 | 3 | 2
645 | 529 | 766 | 539 || 2 | 2 | 3 | 2
642 | 537 | 761 | 557 || 2 | 2 | 3 | 2
644 | 526 | 743 | 522 || 2 | 2 | 2 | 2
650 | 537 | 759 | 582 || 2 | 2 | 2 | 2
653 | 541 | 749 | 535 || 2 | 2 | 2 | 2
N = M = 4000 objects
PX | IS | BT | IM || PX | IS | BT | IM
14504 | 20545 | 16579 | 16540 || 275 | 275 | 342 | 275
14987 | 21770 | 17464 | 17963 || 277 | 277 | 349 | 277
15938 | 21043 | 16890 | 16955 || 274 | 274 | 344 | 274
14454 | 21336 | 17075 | 17371 || 276 | 276 | 346 | 276
14410 | 20406 | 16347 | 16555 || 277 | 277 | 347 | 277
14468 | 20304 | 16497 | 15906 || 278 | 278 | 350 | 278
14638 | 22655 | 17377 | 18334 || 278 | 278 | 351 | 278
14982 | 21792 | 17242 | 17777 || 279 | 279 | 354 | 279
14810 | 22142 | 17450 | 17627 || 279 | 279 | 354 | 279
14843 | 22018 | 16640 | 16976 || 281 | 281 | 357 | 281
14982 | 22247 | 17565 | 17419 || 281 | 281 | 355 | 281
Comments:
- for some reason Bullet does not report the same number of pairs as the others!
We don't know if it's a bug in Bullet, but probably not. We suspect it is
because Bullet uses quantized bounds, or maybe some kind of loose bounds, i.e.
it only reports conservative results (to be confirmed). In any case we kept
the results, as the timings seem to indicate that Bullet does "the right job"
anyway (i.e. performance is consistent with other SAP implementations).
- when all objects are moving and the SAP contains "few" objects (400), the Multi
SAP doesn't show any benefit, and is actually a bit slower than the original code.
This is expected for two reasons: a) if the number of objects is limited, the
effect of cache misses during updates is not too big, and b) the Multi SAP has
some fixed overhead to rasterize object bounds into the grid. This overhead
becomes significant when the update itself is very cheap.
- in the same scenario (400 objects moving) the ICE SAP happens to be faster than
both PhysX and Bullet. Linked lists seem faster than arrays here.
- however things change when the number of objects is bumped to 4000: here the ICE
single SAP becomes slower than the Bullet & PhysX implementations. At the same
time the Multi SAP becomes useful and always runs faster than the single version,
giving it roughly the same speed as Bullet (but still slower than PhysX). On the
other hand the number of pairs reported by Bullet seems really excessive here,
and if further box-box tests are needed to get back the real set of colliding
pairs, it may lose some performance.
- Bullet is always slower than PhysX here.
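To illustrate the "quantized bounds" hypothesis from the comments above, here is a small sketch showing how conservative quantization can turn two disjoint intervals into an overlapping pair. This is NOT Bullet's code: the rounding rule (min endpoints rounded down, max endpoints rounded up) and all names are assumptions chosen to demonstrate the effect, and it remains to be confirmed that this is what actually happens in Bullet.

```python
import math

def quantize(value, world_min, world_max, is_max, bits=16):
    """Map a float coordinate to an integer grid, rounding conservatively.

    Assumed rule (hypothetical): min endpoints round down and max endpoints
    round up, so the quantized interval always contains the original one.
    """
    levels = (1 << bits) - 1
    t = (value - world_min) / (world_max - world_min)
    t = min(1.0, max(0.0, t))  # clamp to the world bounds
    scaled = t * levels
    return math.ceil(scaled) if is_max else math.floor(scaled)

def overlap_1d(a_min, a_max, b_min, b_max):
    """Closed-interval overlap test on one axis."""
    return a_min <= b_max and b_min <= a_max

# Two intervals separated by a gap smaller than one quantization step
# (about 0.015 units for 16 bits over a 1000-unit world).
a = (10.0, 20.0)
b = (20.005, 30.0)

exact = overlap_1d(a[0], a[1], b[0], b[1])
quantized = overlap_1d(
    quantize(a[0], 0.0, 1000.0, is_max=False),
    quantize(a[1], 0.0, 1000.0, is_max=True),
    quantize(b[0], 0.0, 1000.0, is_max=False),
    quantize(b[1], 0.0, 1000.0, is_max=True),
)
```

Here the exact test finds no overlap while the quantized test does, which is the kind of conservative extra pair suspected above. Whether this alone accounts for the size of the difference in the tables is an open question.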
--------------------------------------------------------------------------------
- note that if all objects are moving, no SAP implementation will beat the "radix
based" pruning. The "box pruning" code in Opcode 1.3 needs ~4000 time units to
find the correct number of pairs in the second test, which is 3.5 times faster
than the fastest SAP implementation in this benchmark.
--------------------------------------------------------------------------------
For the sake of completeness we repeat the last test without the Bullet limit of 32767
objects, i.e. with BP_USE_FIXEDPOINT_INT_32 defined:
N = M = 4000 objects, BP_USE_FIXEDPOINT_INT_32
PX | IS | BT | IM || PX | IS | BT | IM
14630 | 22643 | 20218 | 18519 || 276 | 276 | 346 | 276
14624 | 22348 | 19567 | 17763 || 277 | 277 | 347 | 277
14566 | 21406 | 19024 | 16587 || 278 | 278 | 349 | 278
14579 | 21655 | 19252 | 17724 || 278 | 278 | 351 | 278
15615 | 22640 | 20584 | 18405 || 279 | 279 | 353 | 279
14605 | 23848 | 20065 | 18554 || 279 | 279 | 351 | 279
14925 | 22386 | 20701 | 17334 || 281 | 281 | 356 | 281
15073 | 22578 | 20634 | 17603 || 281 | 281 | 356 | 281
We see that there is a significant performance loss here, from ~17000 time units
to ~20000. This makes Bullet comparable in speed to the ICE single SAP, which is
interesting because one of them uses arrays while the other uses linked lists.
We are in the case where the data structure used should have a significant impact
(all objects moving), yet both implementations roughly have the same speed.
Regardless, BP_USE_FIXEDPOINT_INT_32 is not used for other tests.
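The "~17000 to ~20000" estimate can be checked by averaging the Bullet column of the two N = M = 4000 tables. The timings below are copied from those tables; the variable names are just illustrative.

```python
from statistics import mean

# Bullet timings for N = M = 4000, default build (first table).
bullet_default = [16579, 17464, 16890, 17075, 16347, 16497,
                  17377, 17242, 17450, 16640, 17565]
# Bullet timings with BP_USE_FIXEDPOINT_INT_32 defined (second table).
bullet_int32 = [20218, 19567, 19024, 19252, 20584, 20065, 20701, 20634]
# ICE single timings from the same run as the second table.
ice_single = [22643, 22348, 21406, 21655, 22640, 23848, 22386, 22578]

slowdown = mean(bullet_int32) / mean(bullet_default)
print(f"Bullet default: {mean(bullet_default):.0f}")
print(f"Bullet int32:   {mean(bullet_int32):.0f}")
print(f"ICE single:     {mean(ice_single):.0f}")
print(f"int32 slowdown: {slowdown:.2f}x")
```

The averages are over a handful of frames from two separate runs, so they should be taken as a rough indication only.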
--------------------------------------------------------------------------------
2) 1% of objects are moving:
----------------------------
N = 4000, M = 40
PX | IS | BT | IM || PX | IS | BT | IM
186 | 187 | 187 | 85 || 89 | 89 | 89 | 89
168 | 190 | 193 | 79 || 89 | 89 | 89 | 89
169 | 194 | 192 | 156 || 89 | 89 | 89 | 89
166 | 206 | 190 | 114 || 89 | 89 | 89 | 89
159 | 179 | 187 | 119 || 89 | 89 | 89 | 89
159 | 179 | 187 | 92 || 89 | 89 | 89 | 89
150 | 167 | 186 | 87 || 89 | 89 | 89 | 89
N = 10000, M = 100
PX | IS | BT | IM || PX | IS | BT | IM
1067 | 1838 | 1227 | 701 || 521 | 521 | 525 | 521
1083 | 1880 | 1233 | 880 || 521 | 521 | 525 | 521
1143 | 2084 | 1284 | 876 || 521 | 521 | 525 | 521
1161 | 2141 | 1294 | 659 || 521 | 521 | 525 | 521
1165 | 2134 | 1321 | 802 || 521 | 521 | 525 | 521
1142 | 2055 | 1304 | 892 || 521 | 521 | 525 | 521
1453 | 1970 | 1256 | 731 || 521 | 521 | 525 | 521
1129 | 1917 | 1244 | 800 || 522 | 522 | 526 | 522
1236 | 2173 | 1347 | 686 || 522 | 522 | 526 | 522
Comments:
- in this more realistic test where only a few objects are moving, the multi SAP
becomes really useful. It is often twice as fast as its single SAP counterpart.
It is also faster than all other SAP implementations.
- with a limited number of objects, performance for all single SAP implementations
is roughly the same. When that number is increased however, the array-based SAPs
(PhysX/Bullet) perform better.
- here again, the Bullet implementation is a bit slower than the PhysX one.
================================================================================
Conclusion so far:
- the MultiSAP is very efficient for insertions, and quite efficient for updates.
It's always a win except in trivial situations, where its overhead is just as
costly as the deeper SAP operations.
- more tests are needed to check the impact of:
- bigger SAP grids (16*16, etc)
- object speeds (slower speed, faster speed, etc)
- heterogeneous objects
- it would be interesting to apply the "multiple SAPs" strategy to both Bullet &
PhysX implementations...