Performance related help

codetiger
Posts: 19
Joined: Sat Aug 18, 2012 2:20 am
Location: Chennai, India

Performance related help

Post by codetiger »

We are building a game that uses a lot of moving objects that are built as convexhull shapes.

This is how we build the world.

Code:

		btDefaultCollisionConfiguration* collisionConfiguration;
		btDispatcher* dispatcher;
		btBroadphaseInterface* pairCache;
		btConstraintSolver*	constraintSolver;
		
		collisionConfiguration = new  btDefaultCollisionConfiguration();
		pairCache = new btDbvtBroadphase();
		dispatcher = new btCollisionDispatcher(collisionConfiguration);
		constraintSolver = new btSequentialImpulseConstraintSolver();
		_bulletWorld = new btDiscreteDynamicsWorld(dispatcher, pairCache, constraintSolver, collisionConfiguration);
		
		btContactSolverInfo& info = _bulletWorld->getSolverInfo();
		info.m_numIterations = 10;
		info.m_minimumSolverBatchSize = 64;
		info.m_splitImpulse = false;
		info.m_solverMode = SOLVER_ENABLE_FRICTION_DIRECTION_CACHING | SOLVER_USE_WARMSTARTING | SOLVER_SIMD;

We wrote a wrapper that can switch between different physics engines (Bullet, PhysX and Tokamak). We would prefer to use Bullet because it is open source. However, with Bullet our game drops to 80 FPS in a simple level, whereas the same settings yield 140 FPS with PhysX.

I know we are going wrong somewhere; we have not yet tuned Bullet for our needs.

I am posting a profile dump for a few frames below. The numbers change a lot from frame to frame.

Code:


----------------------------------
Profiling: Root (total running time: 18.982 ms) ---
0 -- stepSimulation (99.97 %) :: 18.977 ms / frame (1 calls)
Unaccounted: (0.026 %) :: 0.005 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 18.977 ms) ---
...0 -- internalSingleStepSimulation (98.39 %) :: 18.671 ms / frame (1 calls)
...1 -- synchronizeMotionStates (1.42 %) :: 0.270 ms / frame (1 calls)
...Unaccounted: (0.190 %) :: 0.036 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 18.671 ms) ---
......0 -- updateActivationState (0.04 %) :: 0.008 ms / frame (1 calls)
......1 -- updateActions (0.01 %) :: 0.002 ms / frame (1 calls)
......2 -- integrateTransforms (0.46 %) :: 0.085 ms / frame (1 calls)
......3 -- solveConstraints (10.20 %) :: 1.905 ms / frame (1 calls)
......4 -- calculateSimulationIslands (0.17 %) :: 0.032 ms / frame (1 calls)
......5 -- performDiscreteCollisionDetection (88.36 %) :: 16.498 ms / frame (1 calls)
......6 -- predictUnconstraintMotion (0.66 %) :: 0.123 ms / frame (1 calls)
......Unaccounted: (0.096 %) :: 0.018 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 1.905 ms) ---
.........0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
.........1 -- processIslands (97.11 %) :: 1.850 ms / frame (1 calls)
.........2 -- islandUnionFindAndQuickSort (2.57 %) :: 0.049 ms / frame (1 calls)
.........Unaccounted: (0.315 %) :: 0.006 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.000 ms) ---
............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (0.000 %) :: 0.000 ms
............----------------------------------
............Profiling: processIslands (total running time: 1.850 ms) ---
............0 -- solveGroup (96.22 %) :: 1.780 ms / frame (1 calls)
............Unaccounted: (3.784 %) :: 0.070 ms
...............----------------------------------
...............Profiling: solveGroup (total running time: 1.780 ms) ---
...............0 -- solveGroupCacheFriendlyIterations (53.31 %) :: 0.949 ms / frame (1 calls)
...............1 -- solveGroupCacheFriendlySetup (45.45 %) :: 0.809 ms / frame (1 calls)
...............Unaccounted: (1.236 %) :: 0.022 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 16.498 ms) ---
.........0 -- dispatchAllCollisionPairs (98.72 %) :: 16.287 ms / frame (1 calls)
.........1 -- calculateOverlappingPairs (0.07 %) :: 0.011 ms / frame (1 calls)
.........2 -- updateAabbs (1.17 %) :: 0.193 ms / frame (1 calls)
.........Unaccounted: (0.042 %) :: 0.007 ms



----------------------------------
Profiling: Root (total running time: 0.308 ms) ---
0 -- stepSimulation (98.70 %) :: 0.304 ms / frame (1 calls)
Unaccounted: (1.299 %) :: 0.004 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 0.304 ms) ---
...0 -- internalSingleStepSimulation (0.00 %) :: 0.000 ms / frame (0 calls)
...1 -- synchronizeMotionStates (97.37 %) :: 0.296 ms / frame (1 calls)
...Unaccounted: (2.632 %) :: 0.008 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 0.000 ms) ---
......0 -- updateActivationState (0.00 %) :: 0.000 ms / frame (0 calls)
......1 -- updateActions (0.00 %) :: 0.000 ms / frame (0 calls)
......2 -- integrateTransforms (0.00 %) :: 0.000 ms / frame (0 calls)
......3 -- solveConstraints (0.00 %) :: 0.000 ms / frame (0 calls)
......4 -- calculateSimulationIslands (0.00 %) :: 0.000 ms / frame (0 calls)
......5 -- performDiscreteCollisionDetection (0.00 %) :: 0.000 ms / frame (0 calls)
......6 -- predictUnconstraintMotion (0.00 %) :: 0.000 ms / frame (0 calls)
......Unaccounted: (0.000 %) :: 0.000 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 0.000 ms) ---
.........0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
.........1 -- processIslands (0.00 %) :: 0.000 ms / frame (0 calls)
.........2 -- islandUnionFindAndQuickSort (0.00 %) :: 0.000 ms / frame (0 calls)
.........Unaccounted: (0.000 %) :: 0.000 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.000 ms) ---
............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (0.000 %) :: 0.000 ms
............----------------------------------
............Profiling: processIslands (total running time: 0.000 ms) ---
............0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (0.000 %) :: 0.000 ms
...............----------------------------------
...............Profiling: solveGroup (total running time: 0.000 ms) ---
...............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
...............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
...............Unaccounted: (0.000 %) :: 0.000 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 0.000 ms) ---
.........0 -- dispatchAllCollisionPairs (0.00 %) :: 0.000 ms / frame (0 calls)
.........1 -- calculateOverlappingPairs (0.00 %) :: 0.000 ms / frame (0 calls)
.........2 -- updateAabbs (0.00 %) :: 0.000 ms / frame (0 calls)
.........Unaccounted: (0.000 %) :: 0.000 ms



----------------------------------
Profiling: Root (total running time: 18.746 ms) ---
0 -- stepSimulation (99.97 %) :: 18.741 ms / frame (1 calls)
Unaccounted: (0.027 %) :: 0.005 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 18.741 ms) ---
...0 -- internalSingleStepSimulation (98.38 %) :: 18.438 ms / frame (1 calls)
...1 -- synchronizeMotionStates (1.41 %) :: 0.264 ms / frame (1 calls)
...Unaccounted: (0.208 %) :: 0.039 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 18.438 ms) ---
......0 -- updateActivationState (0.05 %) :: 0.009 ms / frame (1 calls)
......1 -- updateActions (0.01 %) :: 0.001 ms / frame (1 calls)
......2 -- integrateTransforms (0.46 %) :: 0.085 ms / frame (1 calls)
......3 -- solveConstraints (9.89 %) :: 1.823 ms / frame (1 calls)
......4 -- calculateSimulationIslands (0.17 %) :: 0.031 ms / frame (1 calls)
......5 -- performDiscreteCollisionDetection (88.57 %) :: 16.330 ms / frame (1 calls)
......6 -- predictUnconstraintMotion (0.77 %) :: 0.142 ms / frame (1 calls)
......Unaccounted: (0.092 %) :: 0.017 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 1.823 ms) ---
.........0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
.........1 -- processIslands (97.64 %) :: 1.780 ms / frame (1 calls)
.........2 -- islandUnionFindAndQuickSort (1.97 %) :: 0.036 ms / frame (1 calls)
.........Unaccounted: (0.384 %) :: 0.007 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.000 ms) ---
............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (0.000 %) :: 0.000 ms
............----------------------------------
............Profiling: processIslands (total running time: 1.780 ms) ---
............0 -- solveGroup (96.01 %) :: 1.709 ms / frame (1 calls)
............Unaccounted: (3.989 %) :: 0.071 ms
...............----------------------------------
...............Profiling: solveGroup (total running time: 1.709 ms) ---
...............0 -- solveGroupCacheFriendlyIterations (54.48 %) :: 0.931 ms / frame (1 calls)
...............1 -- solveGroupCacheFriendlySetup (44.24 %) :: 0.756 ms / frame (1 calls)
...............Unaccounted: (1.287 %) :: 0.022 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 16.330 ms) ---
.........0 -- dispatchAllCollisionPairs (98.74 %) :: 16.125 ms / frame (1 calls)
.........1 -- calculateOverlappingPairs (0.04 %) :: 0.007 ms / frame (1 calls)
.........2 -- updateAabbs (1.16 %) :: 0.190 ms / frame (1 calls)
.........Unaccounted: (0.049 %) :: 0.008 ms



----------------------------------
Profiling: Root (total running time: 0.287 ms) ---
0 -- stepSimulation (98.61 %) :: 0.283 ms / frame (1 calls)
Unaccounted: (1.394 %) :: 0.004 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 0.283 ms) ---
...0 -- internalSingleStepSimulation (0.00 %) :: 0.000 ms / frame (0 calls)
...1 -- synchronizeMotionStates (97.17 %) :: 0.275 ms / frame (1 calls)
...Unaccounted: (2.827 %) :: 0.008 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 0.000 ms) ---
......0 -- updateActivationState (0.00 %) :: 0.000 ms / frame (0 calls)
......1 -- updateActions (0.00 %) :: 0.000 ms / frame (0 calls)
......2 -- integrateTransforms (0.00 %) :: 0.000 ms / frame (0 calls)
......3 -- solveConstraints (0.00 %) :: 0.000 ms / frame (0 calls)
......4 -- calculateSimulationIslands (0.00 %) :: 0.000 ms / frame (0 calls)
......5 -- performDiscreteCollisionDetection (0.00 %) :: 0.000 ms / frame (0 calls)
......6 -- predictUnconstraintMotion (0.00 %) :: 0.000 ms / frame (0 calls)
......Unaccounted: (0.000 %) :: 0.000 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 0.000 ms) ---
.........0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
.........1 -- processIslands (0.00 %) :: 0.000 ms / frame (0 calls)
.........2 -- islandUnionFindAndQuickSort (0.00 %) :: 0.000 ms / frame (0 calls)
.........Unaccounted: (0.000 %) :: 0.000 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.000 ms) ---
............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (0.000 %) :: 0.000 ms
............----------------------------------
............Profiling: processIslands (total running time: 0.000 ms) ---
............0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (0.000 %) :: 0.000 ms
...............----------------------------------
...............Profiling: solveGroup (total running time: 0.000 ms) ---
...............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
...............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
...............Unaccounted: (0.000 %) :: 0.000 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 0.000 ms) ---
.........0 -- dispatchAllCollisionPairs (0.00 %) :: 0.000 ms / frame (0 calls)
.........1 -- calculateOverlappingPairs (0.00 %) :: 0.000 ms / frame (0 calls)
.........2 -- updateAabbs (0.00 %) :: 0.000 ms / frame (0 calls)
.........Unaccounted: (0.000 %) :: 0.000 ms



----------------------------------
Profiling: Root (total running time: 19.703 ms) ---
0 -- stepSimulation (99.97 %) :: 19.698 ms / frame (1 calls)
Unaccounted: (0.025 %) :: 0.005 ms
...----------------------------------
...Profiling: stepSimulation (total running time: 19.698 ms) ---
...0 -- internalSingleStepSimulation (98.41 %) :: 19.385 ms / frame (1 calls)
...1 -- synchronizeMotionStates (1.40 %) :: 0.276 ms / frame (1 calls)
...Unaccounted: (0.188 %) :: 0.037 ms
......----------------------------------
......Profiling: internalSingleStepSimulation (total running time: 19.385 ms) ---
......0 -- updateActivationState (0.04 %) :: 0.008 ms / frame (1 calls)
......1 -- updateActions (0.01 %) :: 0.002 ms / frame (1 calls)
......2 -- integrateTransforms (0.45 %) :: 0.088 ms / frame (1 calls)
......3 -- solveConstraints (10.09 %) :: 1.955 ms / frame (1 calls)
......4 -- calculateSimulationIslands (0.20 %) :: 0.039 ms / frame (1 calls)
......5 -- performDiscreteCollisionDetection (88.49 %) :: 17.153 ms / frame (1 calls)
......6 -- predictUnconstraintMotion (0.63 %) :: 0.123 ms / frame (1 calls)
......Unaccounted: (0.088 %) :: 0.017 ms
.........----------------------------------
.........Profiling: solveConstraints (total running time: 1.955 ms) ---
.........0 -- solveGroup (0.00 %) :: 0.000 ms / frame (0 calls)
.........1 -- processIslands (95.91 %) :: 1.875 ms / frame (1 calls)
.........2 -- islandUnionFindAndQuickSort (3.73 %) :: 0.073 ms / frame (1 calls)
.........Unaccounted: (0.358 %) :: 0.007 ms
............----------------------------------
............Profiling: solveGroup (total running time: 0.000 ms) ---
............0 -- solveGroupCacheFriendlyIterations (0.00 %) :: 0.000 ms / frame (0 calls)
............1 -- solveGroupCacheFriendlySetup (0.00 %) :: 0.000 ms / frame (0 calls)
............Unaccounted: (0.000 %) :: 0.000 ms
............----------------------------------
............Profiling: processIslands (total running time: 1.875 ms) ---
............0 -- solveGroup (96.21 %) :: 1.804 ms / frame (1 calls)
............Unaccounted: (3.787 %) :: 0.071 ms
...............----------------------------------
...............Profiling: solveGroup (total running time: 1.804 ms) ---
...............0 -- solveGroupCacheFriendlyIterations (52.16 %) :: 0.941 ms / frame (1 calls)
...............1 -- solveGroupCacheFriendlySetup (46.62 %) :: 0.841 ms / frame (1 calls)
...............Unaccounted: (1.220 %) :: 0.022 ms
.........----------------------------------
.........Profiling: performDiscreteCollisionDetection (total running time: 17.153 ms) ---
.........0 -- dispatchAllCollisionPairs (98.58 %) :: 16.910 ms / frame (1 calls)
.........1 -- calculateOverlappingPairs (0.07 %) :: 0.012 ms / frame (1 calls)
.........2 -- updateAabbs (1.29 %) :: 0.222 ms / frame (1 calls)
.........Unaccounted: (0.052 %) :: 0.009 ms
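
To put rough numbers on the dump above, here is a small sketch that averages the five root timings and computes how much of a stepping frame one child timer accounts for. The helper names (`kFrameTotalsMs`, `meanMs`, `share`) are made up for illustration; the values are copied from the dump.

```cpp
#include <cassert>
#include <cstddef>

// Root "total running time" values (ms) copied from the five dumped frames.
const double kFrameTotalsMs[] = {18.982, 0.308, 18.746, 0.287, 19.703};

// Mean of n timing samples.
double meanMs(const double* samples, std::size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += samples[i];
    return sum / static_cast<double>(n);
}

// Fraction of a parent timer that was spent in one of its children.
double share(double childMs, double parentMs) {
    assert(parentMs > 0.0);
    return childMs / parentMs;
}
```

meanMs(kFrameTotalsMs, 5) comes out at about 11.6 ms, because frames with a full step (about 19 ms) alternate with frames that run no internal step at all (about 0.3 ms). share(16.287, 18.982) is about 0.86, so dispatchAllCollisionPairs is where almost all of the first frame goes.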

Here is my step simulation code. The accumulator approach was suggested in the PhysX examples, so we tried the same with Bullet, but it did not help much.

Code:

	static float mAccumulator;
	static float mStepSize = 1.0f / 120.0f;
		
	mAccumulator  += deltaTime;
	if(mAccumulator < mStepSize)
		return;
	mAccumulator -= mStepSize;

	if(engineChoice == BULLET) 
		_bulletWorld->stepSimulation(mStepSize, 10);
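
A possible explanation for the alternating frames in the dump above: when maxSubSteps is greater than 0, stepSimulation keeps its own time accumulator and only runs an internal step once a full fixedTimeStep (the third argument, which defaults to 1/60 s) has built up. The sketch below is a simplified, Bullet-free model of that bookkeeping, not Bullet's actual source; `StepModel` and its members are hypothetical names.

```cpp
#include <cassert>

// Simplified model of the sub-step bookkeeping in stepSimulation().
// StepModel is a hypothetical stand-in; it does not call into Bullet.
struct StepModel {
    double localTime = 0.0;  // time handed in but not yet simulated

    // Returns how many internal steps one call would run.
    int step(double timeStep, int maxSubSteps, double fixedTimeStep = 1.0 / 60.0) {
        assert(fixedTimeStep > 0.0);
        if (maxSubSteps == 0) {
            // Variable-timestep mode: one internal step of length timeStep per call.
            localTime = timeStep;
            return timeStep > 0.0 ? 1 : 0;
        }
        localTime += timeStep;
        int subSteps = 0;
        if (localTime >= fixedTimeStep) {
            subSteps = static_cast<int>(localTime / fixedTimeStep);
            localTime -= subSteps * fixedTimeStep;
        }
        // Steps beyond maxSubSteps are dropped, so simulated time is lost.
        return subSteps > maxSubSteps ? maxSubSteps : subSteps;
    }
};
```

Calling step(1.0 / 120.0, 10) four times on one StepModel yields 0, 1, 0, 1 internal steps, while step(1.0 / 120.0, 0) yields 1 on every call. If the real world behaves the same way, that would match the dump, where every second frame shows 0 calls to internalSingleStepSimulation. Passing the desired internal step explicitly as the third argument, for example stepSimulation(deltaTime, 10, mStepSize), should give 120 Hz internal steps without a hand-written accumulator.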

Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: Performance related help

Post by Erwin Coumans »

You can call stepSimulation(dt, 0) instead if you want to force your own internal timestep.

Using a multithreaded collision dispatcher might help. How many vertices do the convex hulls have?
Also, don't call convexShape->initializePolyhedralFeatures(); it makes things much slower and it is not reliable.

Can you share an example .bullet file, so we can profile it?
Thanks,
Erwin

Re: Performance related help

Post by codetiger »

Each object has 14 to 24 vertices, and we have around 150 to 400 objects. I'll try to create a .bullet file on Monday. I'll also try the multithreaded setup, but does it help enough to be worth it?

Re: Performance related help

Post by codetiger »

I thought "convexShape->initializePolyhedralFeatures()" helped to optimize the vertex count, so we used it on all shapes. I'll try removing it tomorrow.

Re: Performance related help

Post by codetiger »

Sorry about the bumps, but I thought the experiment results would help others.

No optimization: 84 FPS on average.

Experiment #1: Removed convexShape->initializePolyhedralFeatures()
The FPS jumped to 110, but we could see objects jumping a bit while stacked. We now have to work on fixing that.

Experiment #2: stepSimulation(dt,0)
The FPS dropped to 45. So stepSimulation(dt,10) is better, I guess, but I was not able to understand this effect.

Experiment #3: Multithreading with 2 threads (on a dual-core CPU)
The FPS went up to 170.
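
The FPS figures above are easier to compare as frame times. A minimal sketch of the conversion follows; `frameTimeMs` and `frameTimeSavedMs` are hypothetical helper names.

```cpp
#include <cassert>

// Convert a frame rate to the time spent on one frame, in milliseconds.
double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Frame time gained (positive) or lost (negative) when the frame rate
// changes from fpsBefore to fpsAfter.
double frameTimeSavedMs(double fpsBefore, double fpsAfter) {
    return frameTimeMs(fpsBefore) - frameTimeMs(fpsAfter);
}
```

With these, 84 FPS is about 11.9 ms per frame, 110 FPS about 9.1 ms, 45 FPS about 22.2 ms and 170 FPS about 5.9 ms. Experiment #1 therefore saved roughly 2.8 ms per frame, and Experiment #2 cost roughly 10.3 ms per frame. As a rough guess only, that extra cost is in the same range as running the full step of about 19 ms from the profile on every frame instead of on every other frame.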

Multithreaded Code:

Code:

		btDefaultCollisionConfiguration* collisionConfiguration;
		btCollisionDispatcher* dispatcher;
		btBroadphaseInterface* pairCache;
		btConstraintSolver*	constraintSolver;
		int maxNumOutstandingTasks = 4;
		
		btDefaultCollisionConstructionInfo cci;
		cci.m_defaultMaxPersistentManifoldPoolSize = 32768;
		collisionConfiguration = new btDefaultCollisionConfiguration(cci);

		btThreadSupportInterface* threadSupportCollision = new Win32ThreadSupport(Win32ThreadSupport::Win32ThreadConstructionInfo(
							"collision", processCollisionTask, createCollisionLocalStoreMemory, maxNumOutstandingTasks));
		dispatcher = new SpuGatheringCollisionDispatcher(threadSupportCollision, maxNumOutstandingTasks, collisionConfiguration);

		btThreadSupportInterface* threadSupportSolver = createSolverThreadSupport(maxNumOutstandingTasks);
		constraintSolver = new btParallelConstraintSolver(threadSupportSolver);
		dispatcher->setDispatcherFlags(btCollisionDispatcher::CD_DISABLE_CONTACTPOOL_DYNAMIC_ALLOCATION);

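		pairCache = new btDbvtBroadphase();
		// Pass the multithreaded dispatcher and solver created above to the world.
		// Re-assigning them here to btCollisionDispatcher and
		// btSequentialImpulseConstraintSolver would silently fall back to the
		// single-threaded setup.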
		_bulletWorld = new btDiscreteDynamicsWorld(dispatcher, pairCache, constraintSolver, collisionConfiguration);
		
		btContactSolverInfo& info = _bulletWorld->getSolverInfo();
		info.m_numIterations = 10;
		info.m_minimumSolverBatchSize = 64;
		info.m_splitImpulse = false;
		info.m_solverMode = SOLVER_ENABLE_FRICTION_DIRECTION_CACHING | SOLVER_USE_WARMSTARTING | SOLVER_SIMD;

EDIT:
Finally, we managed to surpass PhysX performance after fixing a problem in the multithreaded initialization. Now no other physics engine can beat Bullet's performance (at least for our game).

EDIT 2: The problem with stacked objects jumping is fixed; see http://bulletphysics.org/Bullet/phpBB3/ ... f=9&t=8488