Improving performances of transported objects [SOLVED]

teolazza
Posts: 21
Joined: Fri Aug 11, 2017 1:16 pm

Post by teolazza »

Hi,

I have a few hundred kinematic rigid bodies (btCompoundShape). They are my Carriers.
I have a few hundred rigid bodies (btBoxShape). Call them my Loads.

While the program runs, most of my Carriers end up carrying a Load. Usually there is one Load per Carrier.

Once a Load is loaded it stays on its Carrier for a long time before being unloaded. During this movement the relative position between Carrier and Load does not change.

The more Loads are being carried, the slower the simulation runs.

I'm looking for suggestions to improve performance and get a higher frame rate. I already run Bullet multithreaded.
Last edited by teolazza on Mon Apr 12, 2021 5:36 am, edited 1 time in total.
drleviathan
Posts: 849
Joined: Tue Sep 30, 2014 6:03 pm
Location: San Francisco

Re: Improving performances of transported objects

Post by drleviathan »

If you want to first measure where the time is being spent inside the Bullet step before taking optimization measures you could use the CProfileManager utility in Bullet to get a nice printout of time spent in each context. Rather than outline its usage here I suggest you research previous threads about it.

If you want to first try theoretical optimizations then I will speculate:

(1) Are you using shape-sharing? That is, are your Carriers all the same shape? And if so, are you using the same btCompoundShape instance for all of their RigidBodies? Or do you instantiate unique instances of btCompoundShape for each? (this would be necessary if the shape of each Carrier is distinct).

(2) Similarly for the Loads... are they the same shape and are you using shape sharing for them?

(3) If the Loads never change their Carrier-relative local transform during transport then you could remove each Load from the world, give each LoadedCarrier the shape of combined Carrier+Load, and move that around until it is time to split back into Carrier and Load. If all LoadedCarriers are the same shape then you could do shape sharing for those as well. There is some trickery required to drastically change the shape of a RigidBody that is already in the World... you may need to update the body's AABB in the broadphase: this is particularly true when the new shape escapes the bounds of the old AABB. The easy way to do it is to remove the RigidBody from the World and re-add it; however, when you have tens of thousands of objects in the World the remove/add cycle can get long enough to actually be noticed, and you would be incentivized to optimize it.

(4) If some groups of objects should never collide as part of the game-play/simulation then you could try disabling collisions between them using the collision groups API. This will reduce potential overlaps in the broadphase.

(5) The shape of the ground could matter since you have so many objects. If you're using a triangle mesh then its triangles should NOT be much smaller than the dynamic objects tumbling about on it. In other words: it isn't so much about the number of triangles in the whole mesh but the number of triangles that overlap each dynamic body's AABB: the fewer the triangles, the less work for the narrowphase.
teolazza
Posts: 21
Joined: Fri Aug 11, 2017 1:16 pm

Re: Improving performances of transported objects

Post by teolazza »

Thanks

1) Yes, I am shape-sharing all carriers (they have a single common shape).

2) Yes, I am also shape-sharing Loads (there are about 30 different shared shapes).

3) Merging into a single compound is indeed a good idea, but I don't think it fits this case: my Loads have motion states that I use for graphics, so I would have to manage the Loads' motion states manually, as if they were kinematic objects.
I would also need a compound for each Carrier/Load pair, which would defeat the shape sharing we discussed in 1) and 2).

But you gave me the idea of transforming my Loads into kinematic objects, so that the collision/force with the carriers won't be computed.

4) No need to filter: Carriers are kinematic objects and "touch" only other Carriers and Loads. About 95% of the Loads are in contact only with their Carrier. There are no other types of objects in the world.

5) Nothing touches the ground. All objects are at least one unit above it.
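
On point 3), keeping a Load's graphics in sync by hand comes down to one transform composition per Load. Below is a minimal sketch that does not link against Bullet: the `Pose2D` type and the `compose`, `inverse`, `captureLocalPose` and `loadWorldPose` helpers are made up for illustration and work in 2D only. With Bullet, the same two steps would presumably use btTransform's `operator*` and `inverse()`.

```cpp
#include <cmath>

// Hypothetical 2D rigid pose (rotation angle + translation), for illustration only.
struct Pose2D {
    double angle;  // radians
    double x;
    double y;
};

// compose(a, b): express pose b, given in the frame of a, in a's parent frame.
Pose2D compose(const Pose2D& a, const Pose2D& b) {
    const double c = std::cos(a.angle);
    const double s = std::sin(a.angle);
    return {a.angle + b.angle, a.x + c * b.x - s * b.y, a.y + s * b.x + c * b.y};
}

// inverse(p): the pose that undoes p, so compose(inverse(p), p) is the identity.
Pose2D inverse(const Pose2D& p) {
    const double c = std::cos(p.angle);
    const double s = std::sin(p.angle);
    return {-p.angle, -(c * p.x + s * p.y), s * p.x - c * p.y};
}

// At loading time: remember where the Load sits relative to its Carrier.
Pose2D captureLocalPose(const Pose2D& carrierWorld, const Pose2D& loadWorld) {
    return compose(inverse(carrierWorld), loadWorld);
}

// Every frame while loaded: the Load's world pose follows the Carrier's.
Pose2D loadWorldPose(const Pose2D& carrierWorld, const Pose2D& loadLocal) {
    return compose(carrierWorld, loadLocal);
}
```

The local pose is captured once when the Load settles, so the per-frame cost is a single composition per carried Load.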
drleviathan
Posts: 849
Joined: Tue Sep 30, 2014 6:03 pm
Location: San Francisco

Re: Improving performances of transported objects

Post by drleviathan »

It occurs to me:

I don't know if the CProfileManager stuff works in the multi-thread paradigm. So if you tried it... dunno if it would actually work.

I've never played with multi-threaded Bullet; however, I would expect single-threaded Bullet to be able to push several hundred dynamic objects, even into the low thousands, while still maintaining 60fps as long as the shapes weren't too complicated. This may even be possible on medium-strength modern hardware rather than heavy iron.

When it is time to get serious about optimization, cache coherency must be considered. The MotionState API, while convenient, is probably non-optimal when it comes to cache coherency, and you would be able to push many more updates by blasting through large contiguous memory buffers instead of hopping all around. Here is an outline of how to do it:

(1) Pre-allocate all the RigidBodies you might ever want in one big contiguous array. We will call the size of this array N.

(2) Each Body is identified by index (you could store it with btCollisionObject::setUserIndex() if you ever need to look it up by Body) and this index matches the index of the renderable object instance in your render pipeline. As you add Bodies to the world you keep track of MaxAllocatedIndex (which must always be lower than N) and also of any empty indices left by Bodies that were removed from the world. You recycle unused indices (sorted so you can always recycle the lowest unused index first) when possible, else increment MaxAllocatedIndex and use that. (BTW, if you decide to go this route I have a unit-tested IndexAllocator class (under Apache2.0 license IIRC) designed to track MaxAllocatedIndex and freed indices).

(3) You maintain a sorted list of active indices. In your case you probably would want to maintain two sorted index lists: all kinematic bodies, and all dynamic bodies.

(4) When it comes time to update the transforms of kinematic bodies you do them all at once, in order by index. This is where the cache coherency savings kick in. Ideally the incoming transforms have been computed in another thread and already lined up in a contiguous buffer sorted by index. The striding copy from contiguous memory to relatively contiguous memory will avoid many cache misses and will be FAST.

(5) When it comes time to harvest the transforms of dynamic bodies you do them all at once, in order by index. Again, much savings. Again ideally the new transforms are copied into a contiguous array. In fact, best if you would copy into a full array that has transforms for ALL objects up to MaxAllocatedIndex, even when some indices are empty and non-empty indices correspond to kinematic or dynamic objects. The kinematic transforms could have been updated in step (4) and the interleaved dynamic transforms here in step (5).

(6) Once all of the transforms are computed you "throw" the fresh transforms at the render pipeline which would be running on another thread and is frame locked with the video card. That is, double-buffer the transforms and swap pointers old for new.
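
The index bookkeeping from step (2) could look roughly like the sketch below. This is a hypothetical illustration, not the IndexAllocator class mentioned above: the `IndexPool` name, its methods, the -1 "pool is full" return value, and the use of `std::set` for the free list are all assumptions.

```cpp
#include <set>

// Hypothetical index pool: hands out indices in [0, capacity) and
// always recycles the lowest freed index first.
class IndexPool {
public:
    explicit IndexPool(int capacity) : m_capacity(capacity), m_next(0) {}

    // Returns the lowest free index, or -1 when all slots are in use.
    int allocate() {
        if (!m_free.empty()) {
            const int index = *m_free.begin();  // std::set iterates in ascending order
            m_free.erase(m_free.begin());
            return index;
        }
        if (m_next >= m_capacity) {
            return -1;
        }
        return m_next++;
    }

    // Marks an index as reusable. Returns false if it was never handed out
    // or has already been released.
    bool release(int index) {
        if (index < 0 || index >= m_next || m_free.count(index) != 0) {
            return false;
        }
        m_free.insert(index);
        return true;
    }

    // One past the highest index ever handed out: the upper bound for array sweeps.
    int highWaterMark() const { return m_next; }

private:
    int m_capacity;
    int m_next;
    std::set<int> m_free;
};
```

Recycling the lowest index first keeps live bodies packed toward the front of the array, so the sweeps in steps (4) and (5) touch as little memory as possible.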

TL;DR Update all kinematic transforms once per substep with a single CustomAction rather than many MotionStates. Harvest all dynamic transforms once per step with a single CustomAction rather than many MotionStates. Throw the transforms at the render pipeline which runs on a devoted thread. Wash, rinse, repeat.
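
The "swap pointers old for new" hand-off to the render thread can be sketched as follows. The `Transform` struct and the `TransformMailbox` class are hypothetical, and guarding the swap with a mutex is an assumption (a lock-free exchange would also work). Note that this is a variation on the double buffering described above: because the render thread swaps its own buffer in, three buffers end up in rotation.

```cpp
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical plain transform record: position plus quaternion.
struct Transform {
    float position[3];
    float rotation[4];
};

class TransformMailbox {
public:
    explicit TransformMailbox(std::size_t count)
        : m_back(count), m_front(count), m_fresh(false) {}

    // Physics thread: fill this buffer in index order during the step.
    // After publish() it holds stale data, so every live entry must be rewritten.
    std::vector<Transform>& back() { return m_back; }

    // Physics thread: publish the freshly written buffer. Swapping two
    // vectors exchanges their internal pointers; no element is copied.
    void publish() {
        std::lock_guard<std::mutex> lock(m_mutex);
        std::swap(m_back, m_front);
        m_fresh = true;
    }

    // Render thread: trade its own buffer for the latest published one.
    // Returns false (and leaves renderBuffer untouched) if nothing new arrived.
    bool fetch(std::vector<Transform>& renderBuffer) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_fresh) {
            return false;
        }
        std::swap(m_front, renderBuffer);
        m_fresh = false;
        return true;
    }

private:
    std::mutex m_mutex;
    std::vector<Transform> m_back;
    std::vector<Transform> m_front;
    bool m_fresh;
};
```

All buffers must be created with the same size, since they keep changing owners.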
teolazza
Posts: 21
Joined: Fri Aug 11, 2017 1:16 pm

Re: Improving performances of transported objects

Post by teolazza »

Thanks again.

I'll test your hint asap.

Since I already iterate over all the manifolds before the simulation step, I now also check whether a Load is moving at the same speed as its Carrier.
From that moment on I turn the Load into a kinematic object.

That has greatly improved performance. I think the problem was that, before this trick, the engine had to compute the friction between Carrier and Load on every step in order to move the Load, even though I already knew the two speeds had to be the same.

Now I have another problem: I had some sensors, defined just as empty collision objects. Now that my Loads are kinematic objects, the sensors are no longer triggered...
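
The "same speed" check could be sketched like this. It is an assumption about how such a test might look, not the code actually used here: `Vec3`, `nearlySameVelocity`, `VelocityLock`, the tolerance and the number of consecutive steps are all hypothetical. Requiring several matching steps in a row is an extra precaution that goes beyond what is described above, meant to avoid locking a Load that only momentarily matches its Carrier.

```cpp
// Hypothetical minimal vector type, standing in for the engine's own.
struct Vec3 {
    double x, y, z;
};

// True when the two velocities differ by less than `tolerance`
// (same units as the velocities themselves).
bool nearlySameVelocity(const Vec3& a, const Vec3& b, double tolerance) {
    const double dx = a.x - b.x;
    const double dy = a.y - b.y;
    const double dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz < tolerance * tolerance;
}

// Tracks one Carrier/Load pair across simulation steps.
class VelocityLock {
public:
    VelocityLock(double tolerance, int requiredSteps)
        : m_tolerance(tolerance), m_requiredSteps(requiredSteps), m_matches(0) {}

    // Call once per step. Returns true once the velocities have matched
    // for `requiredSteps` consecutive steps; any mismatch resets the count.
    bool update(const Vec3& carrierVelocity, const Vec3& loadVelocity) {
        if (nearlySameVelocity(carrierVelocity, loadVelocity, m_tolerance)) {
            ++m_matches;
        } else {
            m_matches = 0;
        }
        return m_matches >= m_requiredSteps;
    }

private:
    double m_tolerance;
    int m_requiredSteps;
    int m_matches;
};
```

When `update` first returns true, the Load would be switched to kinematic.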
drleviathan
Posts: 849
Joined: Tue Sep 30, 2014 6:03 pm
Location: San Francisco

Re: Improving performances of transported objects

Post by drleviathan »

To ensure sensor detection: make sure your kinematic objects are kept "active". This should always be true when using MotionStates to move them around because the MotionState.getWorldTransform() is only called on active kinematic objects.

I believe kinematic-vs-static objects do not "collide" by default (and also kinematic-vs-kinematic), but am not 100% sure about this. You might need to use collision groups to enable overlap detection.
teolazza
Posts: 21
Joined: Fri Aug 11, 2017 1:16 pm

Re: Improving performances of transported objects

Post by teolazza »

I think you are right that kinematic and static objects don't collide by default.
As a matter of fact, this is how I turn my Load into a kinematic body:

Code:

// Take the body out of the world while its mass and flags change.
m_world.removeRigidBody(&LoadBody);

// transform the body: zero mass and zero inertia
LoadBody.setMassProps(0, btVector3(0, 0, 0));
LoadBody.setCollisionFlags(LoadBody.getCollisionFlags() |
					btCollisionObject::CF_KINEMATIC_OBJECT |
					btCollisionObject::CF_STATIC_OBJECT);	// NOTE: I've tried with and without this flag, with no changes in the result
// Keep the body permanently active so it is never put to sleep.
LoadBody.forceActivationState(DISABLE_DEACTIVATION);
m_world.addRigidBody(&LoadBody);

so my kinematic body should already be active.

This is the creation of the sensor

Code:

	// build the object
	btCollisionObject* obj = new btCollisionObject();
	// report overlaps without producing a physical response
	obj->setCollisionFlags(obj->getCollisionFlags() | btCollisionObject::CF_NO_CONTACT_RESPONSE);
	obj->setUserPointer(...);
	assert(shape != nullptr);
	obj->setCollisionShape(shape);
	m_pDynamicsWorld->addCollisionObject(obj);

Note that it's not a rigid body (other objects should move through it).
Without the CF_NO_CONTACT_RESPONSE flag, non-kinematic objects would be physically stopped by the sensor.

I check the sensors by iterating over all manifolds:

Code:

		// any contact point on the manifold counts as a hit
		if (manifold.getNumContacts() > 0) {
			manifold.clearManifold();
			sensor.hit() = true;
		}

From what I remember, collision groups only serve to prevent collisions between objects that would collide by default, so I am not optimistic, but I will try.
drleviathan
Posts: 849
Joined: Tue Sep 30, 2014 6:03 pm
Location: San Francisco

Re: Improving performances of transported objects

Post by drleviathan »

Maybe use btGhostObject for your sensors instead of just btCollisionObject. If you need your sensors to have a fancy shape instead of just a box then there is extra work involved.
teolazza
Posts: 21
Joined: Fri Aug 11, 2017 1:16 pm

Re: Improving performances of transported objects

Post by teolazza »

What I remembered was wrong:
you can apply filters so that a kinematic object can collide with a btCollisionObject (and thus also with a btGhostObject).
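
A sketch of why explicit filters make the difference, kept free of Bullet headers so it compiles on its own. The constants copy the values of btBroadphaseProxy::CollisionFilterGroups, and `passesFilter` mirrors the symmetric group/mask test applied by Bullet's default broadphase filter. The `Filter` struct, the helper names and the choice of SensorTrigger for the sensor are illustrative assumptions. It also assumes that both the kinematic Load and the sensor end up in StaticFilter with a mask that excludes StaticFilter, which appears to be the default in btDiscreteDynamicsWorld when no group and mask are passed.

```cpp
// Values copied from btBroadphaseProxy::CollisionFilterGroups.
enum FilterBits {
    DefaultFilter = 1,
    StaticFilter = 2,
    KinematicFilter = 4,
    DebrisFilter = 8,
    SensorTrigger = 16,
    CharacterFilter = 32,
    AllFilter = -1
};

// Hypothetical pair of filter values, as stored per broadphase proxy.
struct Filter {
    int group;
    int mask;
};

// Symmetric test: each object's group must be accepted by the other's mask.
bool passesFilter(const Filter& a, const Filter& b) {
    return (a.group & b.mask) != 0 && (b.group & a.mask) != 0;
}

// Assumed default for static/kinematic bodies and for plain collision
// objects added to a btDiscreteDynamicsWorld without explicit arguments.
Filter assumedStaticDefault() {
    return {StaticFilter, AllFilter ^ StaticFilter};
}

// Explicit choice for the sensor: its own group, overlapping everything.
Filter explicitSensorFilter() {
    return {SensorTrigger, AllFilter};
}
```

With these values, a kinematic Load paired with a default sensor is rejected, while the same Load paired with the explicit sensor filter passes. In Bullet the explicit values would be supplied through the three-argument form `addCollisionObject(obj, group, mask)`.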