Hi,
This question is mainly directed to Gino van den Bergen, since it concerns a small point from his book. However, anyone able to help is very welcome.
So, from page 59 I don't understand what Gino is trying to explain in the sentence that starts with "Assume". It's located right at the bottom of the page, just before he derives the upper bound for the relative error in the approximation of gamma. I don't want to break any copyright law; otherwise I would have copied it here. There's no need to explain anything below the upper-bound inequality.
Thank you!
Gino's Book - Robustness section
-
- Posts: 23
- Joined: Mon Jun 27, 2005 4:30 pm
- Location: Los Angeles
Re: Gino's Book - Robustness section
omicron wrote: So, from page 59 I'm not understanding what Gino is trying to explain in the phrase that starts with "Assume".
I'm not Gino, but perhaps I can explain anyway. Let's start by noting that a typical floating-point operation 'op' introduces an error along the lines of:
fl(a op b) = (a op b)(1 + e), where |e| <= u and u is the machine epsilon.
Now consider the computation of e.g. a*b*c in floating point. This is equivalent to [((a*b)*(1+e1))*c]*(1+e2) = a*b*c*(1+e1)*(1+e2) = a*b*c*(1+e1+e2+e1*e2). This is just straightforward manipulation so far.
At this point, with |ei| <= u, we can simply say that 1+e1+e2+e1*e2 = 1+e1+e2+O(u^2) and we then promptly ignore the O(u^2) term because it is so small that it (typically) won't affect the result. That leaves us with:
a*b*c*(1+e1+e2)
as the value computed for the floating-point expression a*b*c. Gino's sentence on page 59 is in reference to dropping this O(u^2) term that I talked about above.
Hope that helps to clarify what's meant.
-
- Physics Researcher
- Posts: 22
- Joined: Mon Jun 27, 2005 9:28 am
- Location: Helmond, Netherlands
Re: Gino's Book - Robustness section
Christer Ericson wrote: Hope that helps to clarify what's meant.
Well put. Thanks, Christer.
The only thing I'd like to add is that computations would still classify as 'fairly accurate' if |ei| <= C * u for some C greater than 1. For example, with errors ei of 100 * u, the term e1 * e2 would still be nowhere near the machine epsilon of single-precision floats and can safely be ignored in the analysis.
-
- Posts: 8
- Joined: Tue Jan 17, 2006 6:27 pm
Well, first off, thank you both for your help!
So, I wrote this little program to test cancellation when subtracting two nearly equal values.
Code:
#include <float.h>   // FLT_EPSILON
#include <math.h>    // fabs
#include <stdio.h>

int main(void)
{
    // Both delta0 and delta1 suffer from rounding errors
    // because they're not exactly representable in IEEE-754
    float delta0 = 0.432343547394f;
    float delta1 = 0.432343587394f;
    // Relative errors of delta0 and delta1 are bounded by machine epsilon
    float delta0_rel_error = FLT_EPSILON;
    float delta1_rel_error = FLT_EPSILON;
    float add = delta1 + delta0;
    float sub = delta1 - delta0;
    float mul = delta1 * delta0;
    // First-order propagated relative error bounds
    float add_rel_error = ( fabs( delta0 * delta0_rel_error ) + fabs( delta1 * delta1_rel_error ) ) / fabs( add ) + FLT_EPSILON;
    float sub_rel_error = ( fabs( delta0 * delta0_rel_error ) + fabs( delta1 * delta1_rel_error ) ) / fabs( sub ) + FLT_EPSILON;
    float mul_rel_error = fabs( delta0_rel_error ) + fabs( delta1_rel_error ) + FLT_EPSILON;
    printf( "add = %.17g, rel error bound = %.17g\n", add, add_rel_error );
    printf( "sub = %.17g, rel error bound = %.17g\n", sub, sub_rel_error );
    printf( "mul = %.17g, rel error bound = %.17g\n", mul, mul_rel_error );
    return 0;
}
The results are:
add = 0.86468714475631714
sub = 5.96046644775390625e-8
mul = 0.18692097067832947
add_rel_error = 2.384185791015625e-7
sub_rel_error = 1.7293744087219238
mul_rel_error = 3.5763786865234375e-7
Using exact arithmetic, sub would be 4e-8, but with floating-point arithmetic it is quite far from that (see above). So I guess this turned out to be a case of catastrophic cancellation: the two results (floating-point and exact) have no digits in common, and therefore the relative error is huge (see above). Am I right?
Thanks!
-
- Physics Researcher
- Posts: 22
- Joined: Mon Jun 27, 2005 9:28 am
- Location: Helmond, Netherlands
omicron wrote: Using exact arithmetic sub would be 4e-8 but with floating-point arithmetic it is quite far from it. So, I guess that this turned out to be a catastrophic cancellation situation because both results (from floating-point and exact arithmetic) don't have any digits in common, therefore, the relative error is huge, am I right?
Correct. Note that IEEE 754 single-precision numbers have only 24 significand bits (roughly 7 decimal digits of accuracy). In reality things may not be that bad (on Intel at least), since temporary values are often evaluated and stored at higher precision (the IA-32 x87 floating-point registers have 64-bit mantissas).