Performance loss when ghost object overlaps a lot of bodies

Posted: Fri Sep 01, 2017 10:10 am
by B1TZ3R0
I'm implementing a wrapper that integrates the Bullet physics engine into the Godot engine (https://github.com/godotengine/godot/pull/10013), and I've noticed that if I create a ghost object that overlaps a lot of rigid bodies, the simulation starts to lag.

In the test project I created for this, each static rigid body has a compound shape that contains a concave shape.
If I move the ghost object away so that it stops overlapping them, the lag disappears. It is as if the area keeps the static rigid bodies always active.

However, I found a way to fix it: in the broadphase collision check, I test whether the area is about to be paired with a static object, and in that case I return false.

I don't understand why this is required, since the documentation says:

Code: Select all

///The btGhostObject can keep track of all objects that are overlapping
///By default, this overlap is based on the AABB
The area should only use the broadphase, so basically nothing should happen after the broadphase check. Why do I get this performance loss?
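For readers following along, the workaround described in this post (rejecting ghost-versus-static pairs in the broadphase check) could look roughly like the sketch below. It is only an illustration of the idea: `BodyKind`, `ProxyInfo` and `needsBroadphasePair` are hypothetical stand-ins, not Bullet classes and not the code from the Godot pull request. In Bullet itself, the usual hook for such a test is a custom `btOverlapFilterCallback`.

```cpp
// Stand-in types for illustration only; these are NOT Bullet classes.
enum class BodyKind { Ghost, Static, Dynamic };

struct ProxyInfo {
    BodyKind kind;
};

// Returns true when the broadphase should keep the pair.
// Ghost-vs-static pairs are rejected, so no overlapping pair is created
// for them and no narrowphase work is scheduled on their behalf.
bool needsBroadphasePair(const ProxyInfo& a, const ProxyInfo& b) {
    const bool ghostVsStatic =
        (a.kind == BodyKind::Ghost && b.kind == BodyKind::Static) ||
        (a.kind == BodyKind::Static && b.kind == BodyKind::Ghost);
    return !ghostVsStatic;
}
```

The check is written symmetrically because the broadphase does not guarantee which object of a pair comes first.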

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Sun Sep 03, 2017 1:51 pm
by drleviathan
Bullet has a default setting where it updates the axis aligned bounding boxes (AABB) of all static objects every frame. Perhaps this causes the ghost's overlap to be constantly rebuilt? I'm guessing here, but you can disable it as follows:

Code: Select all

dynamicsWorld->setForceUpdateAllAabbs(false);
Try that and see if performance recovers. When that option is set to false you must manually update the AABB of any static object that moves; however, a truly "static" object shouldn't be moving. Any static object that must move around could be set kinematic and then activated when moved.

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Sun Sep 03, 2017 6:36 pm
by B1TZ3R0
First of all, thank you for your response! I've fixed the link in the main post, in case you need to check the code.

However, "setForceUpdateAllAabbs" doesn't solve it. It improves performance a bit, but not significantly.

I have the impression that the ghost forces the creation of overlapping pairs and that the narrowphase is executed anyway.

Do you have another idea?

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Tue Sep 05, 2017 4:02 pm
by drleviathan
The only other idea I have is for you to identify where the time is being spent. Bullet has a nice high-resolution timing system built in that will measure time spent in various contexts and can print the results. One way to do this would be something like this:

Code: Select all

    // set wantTimingStats true when someCondition is met:
    static bool wantTimingStats = false;
    wantTimingStats = someCondition;

    // ... where you call stepSimulation()...
    CProfileManager::Reset();
    dynamicsWorld->stepSimulation(dt);

    if (wantTimingStats) {
        wantTimingStats = false;
        CProfileManager::dumpAll();
    }
I assume you wouldn't want to print the stats every frame so you would set wantTimingStats to true on a particular key press event, or every 1000th frame, or something. The "dump" output is verbose but easy to understand.
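As a rough sketch of the "key press or every 1000th frame" idea, the small helper below decides on which frames a dump should happen. `StatsDumpGate` and its methods are hypothetical names made up for this example; the helper only does the bookkeeping and does not call into Bullet.

```cpp
#include <cstdint>

// Hypothetical helper (not part of Bullet): decides on which frames
// profiling output should be printed.
class StatsDumpGate {
public:
    // interval == 0 disables the periodic dump.
    explicit StatsDumpGate(std::uint64_t interval) : interval_(interval) {}

    // Ask for a single dump on the next frame, e.g. from a key press handler.
    void requestOnce() { requested_ = true; }

    // Call exactly once per frame. Returns true on every interval-th frame
    // or when a one-off dump was requested since the previous call.
    bool shouldDump() {
        ++frame_;
        const bool periodic = interval_ != 0 && frame_ % interval_ == 0;
        const bool dump = periodic || requested_;
        requested_ = false;
        return dump;
    }

private:
    std::uint64_t interval_;
    std::uint64_t frame_ = 0;
    bool requested_ = false;
};
```

In the loop from the post, `if (gate.shouldDump()) CProfileManager::dumpAll();` after `stepSimulation()` would then take the place of the `wantTimingStats` flag.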

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Tue Sep 05, 2017 4:53 pm
by B1TZ3R0
This is a very good idea!

As you suggested, I've inserted a CProfileManager::dumpAll(); call that runs every 10 seconds.
I ran two tests and recorded the first tick of each.

In the first test I kept the broadphase check that rejects collisions between ghosts and static bodies. The result:
No lag

Code: Select all

----------------------------------
Profiling: Root (total running time: 1.817 ms) ---
0 -- btConvexConcaveCollisionAlgorithm::processCollision (0.00 %) :: 0.000 ms / frame (0 calls)
1 -- internalSingleStepSimulation (99.23 %) :: 1.803 ms / frame (1 calls)
2 -- btHashedOverlappingPairCache::processAllOverlappingPairs (0.00 %) :: 0.000 ms / frame (0 calls)
3 -- convexSweepTest (0.00 %) :: 0.000 ms / frame (0 calls)
Unaccounted: (0.771 %) :: 0.014 ms
...----------------------------------
...Profiling: btConvexConcaveCollisionAlgorithm::processCollision (total running time: 0.000 ms) ---
...0 -- btConvexTriangleCallback::processTriangle (0.00 %) :: 0.000 ms / frame (0 calls)
...Unaccounted: (0.000 %) :: 0.000 ms
...----------------------------------
...Profiling: internalSingleStepSimulation (total running time: 1.803 ms) ---
...0 -- updateActivationState (0.06 %) :: 0.001 ms / frame (1 calls)
...1 -- updateActions (0.00 %) :: 0.000 ms / frame (1 calls)
...2 -- integrateTransforms (0.22 %) :: 0.004 ms / frame (1 calls)
...3 -- solveConstraints (3.38 %) :: 0.061 ms / frame (1 calls)
...4 -- calculateSimulationIslands (0.39 %) :: 0.007 ms / frame (1 calls)
...5 -- performDiscreteCollisionDetection (91.18 %) :: 1.644 ms / frame (1 calls)
...6 -- createPredictiveContacts (0.11 %) :: 0.002 ms / frame (1 calls)
...7 -- predictUnconstraintMotion (0.22 %) :: 0.004 ms / frame (1 calls)
...Unaccounted: (4.437 %) :: 0.080 ms
......----------------------------------
......Profiling: solveConstraints (total running time: 0.061 ms) ---
......0 -- solveGroup (81.97 %) :: 0.050 ms / frame (1 calls)
......1 -- processIslands (4.92 %) :: 0.003 ms / frame (1 calls)
......2 -- islandUnionFindAndQuickSort (9.84 %) :: 0.006 ms / frame (1 calls)
......Unaccounted: (3.279 %) :: 0.002 ms
.........----------------------------------
.........Profiling: solveGroup (total running time: 0.050 ms) ---
.........0 -- solveGroupCacheFriendlyIterations (8.00 %) :: 0.004 ms / frame (1 calls)
.........1 -- solveGroupCacheFriendlySetup (88.00 %) :: 0.044 ms / frame (1 calls)
.........Unaccounted: (4.000 %) :: 0.002 ms
......----------------------------------
......Profiling: performDiscreteCollisionDetection (total running time: 1.644 ms) ---
......0 -- dispatchAllCollisionPairs (95.56 %) :: 1.571 ms / frame (1 calls)
......1 -- calculateOverlappingPairs (0.12 %) :: 0.002 ms / frame (1 calls)
......2 -- updateAabbs (4.32 %) :: 0.071 ms / frame (1 calls)
......Unaccounted: (0.000 %) :: 0.000 ms
.........----------------------------------
.........Profiling: dispatchAllCollisionPairs (total running time: 1.571 ms) ---
.........0 -- btHashedOverlappingPairCache::processAllOverlappingPairs (99.94 %) :: 1.570 ms / frame (1 calls)
.........Unaccounted: (0.064 %) :: 0.001 ms
............----------------------------------
............Profiling: btHashedOverlappingPairCache::processAllOverlappingPairs (total running time: 1.570 ms) ---
............0 -- btConvexConcaveCollisionAlgorithm::processCollision (0.06 %) :: 0.001 ms / frame (1 calls)
............1 -- btCompoundCompoundLeafCallback::Process (74.14 %) :: 1.164 ms / frame (82 calls)
............Unaccounted: (25.796 %) :: 0.405 ms
...............----------------------------------
...............Profiling: btCompoundCompoundLeafCallback::Process (total running time: 1.164 ms) ---
...............0 -- btConvexConcaveCollisionAlgorithm::processCollision (5.67 %) :: 0.066 ms / frame (14 calls)
...............Unaccounted: (94.330 %) :: 1.098 ms
..................----------------------------------
..................Profiling: btConvexConcaveCollisionAlgorithm::processCollision (total running time: 0.066 ms) ---
..................0 -- btConvexTriangleCallback::processTriangle (59.09 %) :: 0.039 ms / frame (20 calls)
..................Unaccounted: (40.909 %) :: 0.027 ms
......----------------------------------
......Profiling: createPredictiveContacts (total running time: 0.002 ms) ---
......0 -- release predictive contact manifolds (0.00 %) :: 0.000 ms / frame (1 calls)
......Unaccounted: (100.000 %) :: 0.002 ms
...----------------------------------
...Profiling: convexSweepTest (total running time: 0.000 ms) ---
...0 -- convexSweepCompound (0.00 %) :: 0.000 ms / frame (0 calls)
...Unaccounted: (0.000 %) :: 0.000 ms
In the second test I removed that check, so that static rigid bodies can be overlapped by a ghost. The result:
Lag

Code: Select all

----------------------------------
Profiling: Root (total running time: 316787.438 ms) ---
0 -- btConvexConcaveCollisionAlgorithm::processCollision (0.00 %) :: 0.000 ms / frame (0 calls)
1 -- internalSingleStepSimulation (0.07 %) :: 209.717 ms / frame (1 calls)
2 -- btHashedOverlappingPairCache::processAllOverlappingPairs (0.00 %) :: 0.000 ms / frame (0 calls)
3 -- convexSweepTest (0.00 %) :: 0.000 ms / frame (0 calls)
Unaccounted: (99.934 %) :: 316577.719 ms
...----------------------------------
...Profiling: btConvexConcaveCollisionAlgorithm::processCollision (total running time: 0.000 ms) ---
...0 -- btConvexTriangleCallback::processTriangle (0.00 %) :: 0.000 ms / frame (0 calls)
...Unaccounted: (0.000 %) :: 0.000 ms
...----------------------------------
...Profiling: internalSingleStepSimulation (total running time: 209.717 ms) ---
...0 -- updateActivationState (0.00 %) :: 0.000 ms / frame (1 calls)
...1 -- updateActions (0.00 %) :: 0.000 ms / frame (1 calls)
...2 -- integrateTransforms (0.00 %) :: 0.006 ms / frame (1 calls)
...3 -- solveConstraints (0.05 %) :: 0.105 ms / frame (1 calls)
...4 -- calculateSimulationIslands (0.01 %) :: 0.016 ms / frame (1 calls)
...5 -- performDiscreteCollisionDetection (99.86 %) :: 209.425 ms / frame (1 calls)
...6 -- createPredictiveContacts (0.00 %) :: 0.002 ms / frame (1 calls)
...7 -- predictUnconstraintMotion (0.00 %) :: 0.003 ms / frame (1 calls)
...Unaccounted: (0.076 %) :: 0.160 ms
......----------------------------------
......Profiling: solveConstraints (total running time: 0.105 ms) ---
......0 -- solveGroup (27.62 %) :: 0.029 ms / frame (1 calls)
......1 -- processIslands (3.81 %) :: 0.004 ms / frame (1 calls)
......2 -- islandUnionFindAndQuickSort (66.67 %) :: 0.070 ms / frame (1 calls)
......Unaccounted: (1.905 %) :: 0.002 ms
.........----------------------------------
.........Profiling: solveGroup (total running time: 0.029 ms) ---
.........0 -- solveGroupCacheFriendlyIterations (17.24 %) :: 0.005 ms / frame (1 calls)
.........1 -- solveGroupCacheFriendlySetup (79.31 %) :: 0.023 ms / frame (1 calls)
.........Unaccounted: (3.448 %) :: 0.001 ms
......----------------------------------
......Profiling: performDiscreteCollisionDetection (total running time: 209.425 ms) ---
......0 -- dispatchAllCollisionPairs (99.96 %) :: 209.351 ms / frame (1 calls)
......1 -- calculateOverlappingPairs (0.00 %) :: 0.002 ms / frame (1 calls)
......2 -- updateAabbs (0.03 %) :: 0.071 ms / frame (1 calls)
......Unaccounted: (0.000 %) :: 0.001 ms
.........----------------------------------
.........Profiling: dispatchAllCollisionPairs (total running time: 209.351 ms) ---
.........0 -- btHashedOverlappingPairCache::processAllOverlappingPairs (100.00 %) :: 209.350 ms / frame (1 calls)
.........Unaccounted: (0.000 %) :: 0.001 ms
............----------------------------------
............Profiling: btHashedOverlappingPairCache::processAllOverlappingPairs (total running time: 209.350 ms) ---
............0 -- btConvexConcaveCollisionAlgorithm::processCollision (0.00 %) :: 0.001 ms / frame (1 calls)
............1 -- btCompoundCompoundLeafCallback::Process (97.37 %) :: 203.853 ms / frame (1500 calls)
............Unaccounted: (2.625 %) :: 5.496 ms
...............----------------------------------
...............Profiling: btCompoundCompoundLeafCallback::Process (total running time: 203.853 ms) ---
...............0 -- btConvexConcaveCollisionAlgorithm::processCollision (98.09 %) :: 199.961 ms / frame (1431 calls)
...............Unaccounted: (1.909 %) :: 3.892 ms
..................----------------------------------
..................Profiling: btConvexConcaveCollisionAlgorithm::processCollision (total running time: 199.961 ms) ---
..................0 -- btConvexTriangleCallback::processTriangle (96.58 %) :: 193.128 ms / frame (10065 calls)
..................Unaccounted: (3.417 %) :: 6.833 ms
......----------------------------------
......Profiling: createPredictiveContacts (total running time: 0.002 ms) ---
......0 -- release predictive contact manifolds (0.00 %) :: 0.000 ms / frame (1 calls)
......Unaccounted: (100.000 %) :: 0.002 ms
...----------------------------------
...Profiling: convexSweepTest (total running time: 0.000 ms) ---
...0 -- convexSweepCompound (0.00 %) :: 0.000 ms / frame (0 calls)
...Unaccounted: (0.000 %) :: 0.000 ms
As you can see, the time spent increases all the way from btHashedOverlappingPairCache::processAllOverlappingPairs down to btConvexConcaveCollisionAlgorithm::processCollision.

But this doesn't tell me much more. What do you think about these stats?

If you want, you can join the "developer Godot IRC" so it's easier to talk. Just ask for me in the chat.
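To put numbers on the two dumps in this post, the per-frame times and call counts can be compared directly. The sketch below copies four entries from the output above; the struct, the function and the constant names are made up for the example.

```cpp
// One line of profiler output: time per frame and number of calls.
struct ProfileEntry {
    double msPerFrame;
    long calls;
};

// How many times slower the second entry is than the first.
double slowdown(const ProfileEntry& fast, const ProfileEntry& slow) {
    return slow.msPerFrame / fast.msPerFrame;
}

// Values copied from the two dumps above (names are hypothetical).
// btHashedOverlappingPairCache::processAllOverlappingPairs
const ProfileEntry kPairsNoLag{1.570, 1};
const ProfileEntry kPairsLag{209.350, 1};
// btConvexTriangleCallback::processTriangle
const ProfileEntry kTrianglesNoLag{0.039, 20};
const ProfileEntry kTrianglesLag{193.128, 10065};

// Share of the pair-processing time that goes to triangle processing in the lag run.
double triangleShareInLagRun() {
    return kTrianglesLag.msPerFrame / kPairsLag.msPerFrame;
}
```

With these values, pair processing is roughly 133 times slower in the lag run, processTriangle goes from 20 to 10065 calls per frame, and it accounts for about 92% of the pair-processing time.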

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Tue Sep 05, 2017 8:06 pm
by drleviathan
Actually, the time is being spent in btConvexTriangleCallback::processTriangle(), which means you're using one of the concave triangle-soup shapes: probably btBvhTriangleMeshShape. When you have a big shape that overlaps all or most of the triangles of an overly complex triangle mesh shape then yeah: Bullet performance goes to hell. This is fundamental to Bullet.

Perhaps you don't want to be stress-testing Bullet itself, but just verifying that the Godot engine does what you want when using Bullet? In other words, Bullet gives you rope to hang your simulation if you do the Wrong Thing (make big active bodies that overlap complex mesh geometry), but that is ok... it just means you should actually do Right Things (use convex geometry, optimize your mesh geometry, don't make your active bodies relatively big). If Bullet is properly wrapped by the Godot engine you would still have enough rope to hang your simulation.

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Wed Sep 06, 2017 7:22 am
by B1TZ3R0
In the scene, the ghost object is used to track the players that overlap it. When a player enters the ghost, the Godot audio engine changes the propagation effect.

As you can see, the ghost could be very big depending on the effect that the artist wants to create, and it could overlap the entire map.

I think there is a bug in Bullet (or a misconfiguration), since the broadphase alone is enough to know whether a rigid body overlaps the ghost.
Check this: https://github.com/bulletphysics/bullet ... ject.h#L31
As the stats show, the narrowphase algorithms are executed here anyway.

Is there a way to avoid the narrowphase, or is this a bug that should be addressed?

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Thu Sep 07, 2017 3:46 pm
by drleviathan
If you're using the ghost only to trigger events when a player enters the space then it probably doesn't need to know about the static objects. I think you can use collisionGroups & collisionMask to configure your ghost to only "overlap" with players. This should prune the overlapping pairs in the broadphase to the bare minimum. Compare this to how a ghost is used to follow a character around to track a reduced set of nearby objects to help optimize any ray-traces that the character needs, rather than ray-tracing against the entire world -- such a ghost would need to "overlap" against static objects.
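The group and mask idea can be sketched in isolation as follows. The pair test mirrors the symmetric group-and-mask check that Bullet's default broadphase filter applies, but the `Filter` struct, `pairsWith` and the group bits below are hypothetical and chosen only for illustration. In Bullet, the group and mask are normally supplied when the object is added to the world, for example through `addCollisionObject(object, group, mask)`.

```cpp
#include <cstdint>

// Hypothetical group bits; these are not Bullet's predefined filter groups.
constexpr std::uint32_t kGroupStatic  = 1u << 0;
constexpr std::uint32_t kGroupPlayer  = 1u << 1;
constexpr std::uint32_t kGroupTrigger = 1u << 2;

struct Filter {
    std::uint32_t group;  // what this object is
    std::uint32_t mask;   // which groups it wants to be paired with
};

// A pair is kept only if each side's group appears in the other side's mask.
constexpr bool pairsWith(Filter a, Filter b) {
    return (a.group & b.mask) != 0 && (b.group & a.mask) != 0;
}

// The trigger ghost only asks for players.
constexpr Filter kTriggerFilter{kGroupTrigger, kGroupPlayer};
constexpr Filter kPlayerFilter{kGroupPlayer, kGroupStatic | kGroupTrigger};
// Static geometry would accept triggers, but the trigger's mask excludes it.
constexpr Filter kStaticFilter{kGroupStatic, kGroupPlayer | kGroupTrigger};
```

Because the test is symmetric, restricting the mask on the ghost side alone is enough to drop every ghost-versus-static pair, whatever mask the static bodies use.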

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Fri Sep 08, 2017 1:51 am
by B1TZ3R0
The problem is that I'm implementing a wrapper for a game engine, so I can't restrict how ghosts are used. Can you help me understand how to fix it, so that I can send a PR to Bullet?

Re: Performance loss when ghost object overlaps a lot of bodies

Posted: Fri Sep 15, 2017 12:26 pm
by B1TZ3R0
I've fixed this problem, check this link: https://github.com/bulletphysics/bullet ... -329767624