threshold in GJK causing wrong penetrating contact

fastflo
Posts: 14
Joined: Mon Oct 10, 2011 10:49 am

Post by fastflo »

hui, this one took me many hours...

in btGjkEpa2.cpp around line 219 (in bullet-2.79) there is a check whether the length of a vector is "zero".

this is implemented like this:

Code:

	
const btScalar	rl=m_ray.length();
if(rl< GJK_MIN_DISTANCE)
with

Code:

#define GJK_MIN_DISTANCE	((btScalar)0.0001)
with my relatively small convex hulls this constant seems to be too large.

i have a body->setMargin(0.001) -> that means one millimeter for me.

the effect i'm observing is like this:
- one bodyA is slowly approaching another bodyB
- they are so close that the NarrowPhase collision detection is running in each step.

- bodies are checked for penetration in btGjkPairDetector with m_penetrationDepthSolver->calcPenDepth()
- gjkepa2_impl::GJK::Evaluate behind this calcPenDepth() exits with the if-statement above and rl (m_ray.length()) being 0.000073...
- gjk_status is then set to GJK::eStatus::Inside but EPA::Evaluate() returns an epa.m_depth of EXACTLY 0 !!

-- i think this should not happen! the bodies are not really touching each other yet

- epa.m_depth being exactly 0 also causes the two reported world contact points on each body to be EXACTLY the same:

Code:

				results.witnesses[0]	=	wtrs0*w0;
				results.witnesses[1]	=	wtrs0*(w0-epa.m_normal*epa.m_depth);
- btGjkPairDetector::getClosestPointsNonVirtual() reports this penetration with

Code:

						btScalar distance2 = -(tmpPointOnA-tmpPointOnB).length();
						//only replace valid penetrations when the result is deeper (check)
						if (!isValid || (distance2 < distance))
						{
							distance = distance2;
							pointOnA = tmpPointOnA;
							pointOnB = tmpPointOnB;
							normalInB = tmpNormalInB;
							isValid = true;
							m_lastUsedMethod = 3;
- remember: distance2 and then distance is exactly -0 because tmpPointOnA and tmpPointOnB are exactly the same because of epa.m_depth being exactly 0
- at the end of btGjkPairDetector::getClosestPointsNonVirtual() this penetrating contact point is reported:

Code:

		m_cachedSeparatingAxis = normalInB;
		m_cachedSeparatingDistance = distance;

		output.addContactPoint(
			normalInB,
			pointOnB+positionOffset,
			distance);
let's continue in btConvexConvexAlgorithm::processCollision() in BulletCollision/CollisionDispatch/btConvexConvexAlgorithm.cpp:442 (ZERO_MARGIN is not defined for my build, so i left those code lines out)
gjkPairDetector.getClosestPoints() returns with this 0-contact from above.

Code:

				gjkPairDetector.getClosestPoints(input,dummy,dispatchInfo.m_debugDraw);
				btScalar l2 = gjkPairDetector.getCachedSeparatingAxis().length2();
				if (l2>SIMD_EPSILON)
				{
					sepNormalWorldSpace = gjkPairDetector.getCachedSeparatingAxis()*(1.f/l2);
					//minDist = -1e30f;//gjkPairDetector.getCachedSeparatingDistance();
					minDist = gjkPairDetector.getCachedSeparatingDistance()-min0->getMargin()-min1->getMargin();
	
					foundSepAxis = gjkPairDetector.getCachedSeparatingDistance()<(min0->getMargin()+min1->getMargin());
gjkPairDetector.getCachedSeparatingDistance() is our beloved -0 (ieee 32bit float big endian 0x80000000) so minDist is set to

Code:

minDist = -0 - min0->getMargin() - min1->getMargin();
both margins are 0.001 for me so i get a

Code:

minDist = -0.002
-- here it's getting really dangerous ... ;)
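
The chain from identical witness points to minDist = -0.002 can be replayed in isolation. This is only a minimal sketch, assuming plain float arithmetic; Vec3, SepResult and separationFromWitnesses are hypothetical stand-ins and not Bullet types:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };   // hypothetical stand-in for btVector3

// Euclidean distance between two points.
inline float distanceBetween(const Vec3& a, const Vec3& b)
{
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

struct SepResult
{
    float distance;      // plays the role of the cached separating distance
    float minDist;       // distance with both margins subtracted
    bool  foundSepAxis;  // distance < margin0 + margin1
};

// Mirrors the arithmetic quoted above:
//   distance2 = -(pointOnA - pointOnB).length()
//   minDist   = distance - margin0 - margin1
inline SepResult separationFromWitnesses(const Vec3& onA, const Vec3& onB,
                                         float margin0, float margin1)
{
    assert(margin0 >= 0.0f && margin1 >= 0.0f);
    SepResult r;
    r.distance     = -distanceBetween(onA, onB);   // identical points give -0.0f
    r.minDist      = r.distance - margin0 - margin1;
    r.foundSepAxis = r.distance < (margin0 + margin1);
    return r;
}
```

With both witness points equal and both margins at 0.001, distance carries the sign bit (negative zero), minDist comes out as -0.002 and foundSepAxis is true, although nothing was actually measured as penetrating.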

a few lines below we have

Code:

				btPolyhedralContactClipping::clipHullAgainstHull(sepNormalWorldSpace, *polyhedronA->getConvexPolyhedron(), *polyhedronB->getConvexPolyhedron(),
					body0->getWorldTransform(), 
					body1->getWorldTransform(), minDist-threshold, threshold, *resultOut);
btPolyhedralContactClipping::clipHullAgainstHull() gets called with its minDist parameter set to
(-0.002 - threshold), with threshold NOT being a constant this time -- it's a relatively small computed value of 0.000678... in this case. so minDist is -0.0026782261835028051

btPolyhedralContactClipping::clipFaceAgainstHull() also gets this parameter. in there this contact gets added to btDiscreteCollisionDetectorInterface::Result& resultOut here:

Code:

	// only keep points that are behind the witness face
	{
		btVector3 localPlaneNormal (polyA.m_plane[0],polyA.m_plane[1],polyA.m_plane[2]);
		btScalar localPlaneEq = polyA.m_plane[3];
		btVector3 planeNormalWS = transA.getBasis()*localPlaneNormal;
		btScalar planeEqWS=localPlaneEq-planeNormalWS.dot(transA.getOrigin());
		for (int i=0;i<pVtxIn->size();i++)
		{
			
			btScalar depth = planeNormalWS.dot(pVtxIn->at(i))+planeEqWS;
			if (depth <=minDist)
			{
//				printf("clamped: depth=%f to minDist=%f\n",depth,minDist);
				depth = minDist;
			}

			if (depth <=maxDist)
			{
				btVector3 point = pVtxIn->at(i);
#ifdef ONLY_REPORT_DEEPEST_POINT
				curMaxDist = depth;
#else
#if 0
				if (depth<-3)
				{
					printf("error in btPolyhedralContactClipping depth = %f\n", depth);
					printf("likely wrong separatingNormal passed in\n");
				} 
#endif				
				resultOut.addContactPoint(separatingNormal,point,depth);
#endif
			}
		}
	}
depth ends up as -0.00267822621: the computed value planeNormalWS.dot(pVtxIn->at(i))+planeEqWS is -0.0714240894, which is smaller than minDist, so it is "clamped" to depth = minDist -- our famous -0.00267822621.

voilà: i get a sudden penetration of 2.6 millimeters which causes the bodies to be thrown apart in a really ugly manner.
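
The clamp-and-filter step can be replayed with the numbers from this post. This is a minimal sketch only; ClipResult and clampAndFilter are hypothetical names, and 0.000678 is the truncated threshold value quoted above:

```cpp
#include <cassert>

struct ClipResult
{
    bool  reported;  // true if the point would be passed to addContactPoint()
    float depth;     // depth after clamping
};

// Same two comparisons as in the quoted loop body:
//   1. a depth at or below minDist is clamped up to minDist
//   2. a depth at or below maxDist is reported as a contact
inline ClipResult clampAndFilter(float depth, float minDist, float maxDist)
{
    assert(minDist <= maxDist);
    if (depth <= minDist)
        depth = minDist;          // the "clamped" branch
    if (depth <= maxDist)
        return {true, depth};     // contact is reported with this depth
    return {false, depth};        // point is in front of the face, dropped
}
```

Called with depth = -0.0714240894, minDist = -0.002 - 0.000678 and maxDist = 0.000678, it reports a contact whose depth equals minDist, so the reported penetration consists of margins and threshold only.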

attached is a small example program which reproduces this very deterministically.... (start it with -high as commandline argument or set start_high = true)
and here you can see these sudden "deep" penetrating contacts:
[attached image: sudden_penetration.png]

ok, so i think the root of this problem is EPA::Evaluate() exiting in its fallback-case:

Code:

				/* Fallback		*/ 
				m_status	=	eStatus::FallBack;
				m_normal	=	-guess;
				const btScalar	nl=m_normal.length();
				if(nl>0)
					m_normal	=	m_normal/nl;
				else
					m_normal	=	btVector3(1,0,0);
				m_depth	=	0;
				m_result.rank=1;
				m_result.c[0]=simplex.c[0];
				m_result.p[0]=1;	
				return(m_status);
with m_depth = 0, and then bullet subtracts margins and threshold from this, which produces the "deep" penetration.
probably EPA::Evaluate() goes through this fallback because GJK::Evaluate() reports a "strange"/not real contact because of that too coarse zero-checking constant.

another funny side note to this: if i choose the start position of the bodies to be very close, then GJK will not run into this problem and no wrong penetrations are generated. you can test this by running the test program with start_high = false.
fastflo
Posts: 14
Joined: Mon Oct 10, 2011 10:49 am

Re: threshold in GJK causing wrong penetrating contact

Post by fastflo »

what works for me is to introduce a new define in btGjkEpa2.cpp which is used to test whether the m_ray.length() is zero:

Code:

#define GJK_TEST_ZERO_EPS 1e-6
...
					/* Check zero							*/ 
					const btScalar	rl=m_ray.length();
					if(rl < GJK_TEST_ZERO_EPS)
					{/* Touching or inside				*/ 
						m_status=eStatus::Inside;
###
i'm not saying that this is a good option for all users -- maybe the loop makes a few more iterations with that constant.... but currently i don't care and prefer plausible behaviour.
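
The effect of the two constants on the ray length seen in this case (0.000073) can be checked directly. A minimal sketch, assuming float scalars; the constant names and treatedAsInside are hypothetical and only mirror the `rl < eps` comparison:

```cpp
#include <cassert>

// Hypothetical stand-ins for the two constants discussed in this thread.
constexpr float kGjkMinDistance = 1e-4f;  // value of GJK_MIN_DISTANCE in bullet-2.79
constexpr float kGjkTestZeroEps = 1e-6f;  // value of GJK_TEST_ZERO_EPS proposed here

// Returns true when a ray of length rayLength counts as "zero", i.e. the
// quoted check would set eStatus::Inside and EPA would run next.
inline bool treatedAsInside(float rayLength, float eps)
{
    assert(rayLength >= 0.0f);
    return rayLength < eps;
}
```

For rayLength = 0.000073 the first constant yields "inside" and the second does not, which matches the behaviour described above. Whether 1e-6 is a safe value for other shape sizes is not established here.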
Last edited by fastflo on Fri Oct 14, 2011 2:59 pm, edited 2 times in total.
fastflo
Posts: 14
Joined: Mon Oct 10, 2011 10:49 am

Re: threshold in GJK causing wrong penetrating contact

Post by fastflo »

besides this constant GJK_MIN_DISTANCE, i think that it is still erroneous behaviour when EPA::Evaluate() exits in its fallback case, returning m_depth = 0 which then gets "increased" by the collision margins and threshold to a penetration...

i'd say that if EPA::Evaluate() exits in that fallback case, no contact should be generated?!
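
The suggestion could look roughly like the following. This is a sketch of the idea only, not a patch against Bullet: EpaStatus, EpaOutcome and shouldReportPenetration are hypothetical, and the enum is a simplified stand-in for the status values in btGjkEpa2.cpp:

```cpp
#include <cassert>

// Simplified, hypothetical status enum; the real EPA status enum in
// btGjkEpa2.cpp has more values than the three shown here.
enum class EpaStatus { Valid, FallBack, Failed };

struct EpaOutcome
{
    EpaStatus status;
    float     depth;   // 0 is only a placeholder when status == FallBack
};

// Suggested policy: only a depth that EPA actually computed becomes a
// contact. A FallBack result is treated like a failure, so its placeholder
// depth never has margins and threshold subtracted from it.
inline bool shouldReportPenetration(const EpaOutcome& epa)
{
    return epa.status == EpaStatus::Valid;
}
```

The caller would then skip the penetration branch for a fallback result and keep whatever the non-penetrating GJK distance query produced.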
fastflo
Posts: 14
Joined: Mon Oct 10, 2011 10:49 am

Re: threshold in GJK causing wrong penetrating contact

Post by fastflo »

Erwin Coumans
Site Admin
Posts: 4221
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: threshold in GJK causing wrong penetrating contact

Post by Erwin Coumans »

There are indeed issues with the GJK/EPA implementation of Bullet if the collision margin and/or the features (points etc) of a convex shape are very small/close to zero. That is why Bullet uses a rather large collision margin by default (0.04) and recommends keeping shapes larger than, say, 0.2 units (meters).
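
As a rough aid, the two numbers above can be turned into a sanity check on a setup. A minimal sketch under the assumption that one unit is one meter; ScaleReport and checkScale are hypothetical helpers and not part of Bullet:

```cpp
#include <cassert>

constexpr float kDefaultCollisionMargin = 0.04f;  // default margin quoted above
constexpr float kRecommendedMinExtent   = 0.2f;   // "larger than say 0.2 units"

struct ScaleReport
{
    bool marginBelowDefault;       // margin is smaller than the default
    bool extentBelowRecommended;   // shape is smaller than the recommended size
};

// Flags setups that fall into the range where the issues described above
// are more likely to show up.
inline ScaleReport checkScale(float smallestExtent, float margin)
{
    assert(smallestExtent > 0.0f && margin >= 0.0f);
    return {margin < kDefaultCollisionMargin,
            smallestExtent < kRecommendedMinExtent};
}
```

A margin of 0.001, as used earlier in this thread, is flagged as below the default.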

As you are working with very small shapes you are likely running into those issues, and I appreciate that you are taking a closer look at it. Perhaps you can make the GJK/EPA implementation more robust.
I'll have a look at your issue in the tracker,
Thanks!
Erwin
fastflo
Posts: 14
Joined: Mon Oct 10, 2011 10:49 am

Re: threshold in GJK causing wrong penetrating contact

Post by fastflo »

sorry, i posted to the wrong topic.
that issue 557 was about the coplanar face merging...