Background:
Some SSE instructions (e.g. movaps) require their memory operands to be 16-byte aligned, so every structure they operate on must be allocated and aligned on a 16-byte boundary. Internally Bullet handles this with __declspec(align(16)) and custom allocators. In Bullet's API functions, however, the SSE code paths are disabled so that your application is not required to have everything aligned.
Problem:
When I built my application on Windows with MSVC 2010, it kept crashing in one of my classes that had a btQuaternion member variable. The crash turned out to be in the btQuadWord operator= function: the compiler generated a movaps SSE instruction, and because my structure was not aligned to a 16-byte boundary, it crashed. On a side note: when I built it on Linux it did not crash. I did not take the time to investigate whether gcc was aligning everything to 16-byte boundaries by default, whether gcc generates different instructions that don't require alignment (e.g. movups), or whether the generation of the implicit operators in btQuaternion differed in some other way.
Analysis:
btQuaternion compiles its explicit SSE versions of operator= and the constructors only when BT_USE_SSE_IN_API is defined. Its parent class, btQuadWord, has no such check: its SSE versions of operator= and the constructors are compiled whenever BT_USE_SSE is defined, regardless of BT_USE_SSE_IN_API. So a btQuaternion stored in one of your own structures or classes that is not aligned to a 16-byte boundary (which the movaps SSE instruction requires) may crash.
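To illustrate the failure mode, here is a minimal sketch that does not use Bullet at all. Quad and Holder are hypothetical stand-ins for btQuadWord and a user class, and isAligned16 / requireAligned16 are helper names made up for this example. It shows that whether the member ends up on a 16-byte boundary depends on where the enclosing object starts, and how a debug assertion could catch a misaligned object before an aligned SSE move faults on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 16-byte-aligned type such as btQuadWord.
struct alignas(16) Quad {
    float v[4];
};

// Hypothetical user class that stores such a member.
struct Holder {
    int id;
    Quad rotation;
};

// The aligned member forces the whole class to 16-byte alignment,
// but only if whoever allocates a Holder honors alignof(Holder).
static_assert(alignof(Holder) == 16, "Holder inherits Quad's alignment");

// True when p sits on a 16-byte boundary.
inline bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Address the 'rotation' member would have if a Holder started at 'base'.
inline const void* rotationAddress(const unsigned char* base) {
    return base + offsetof(Holder, rotation);
}

// Debug-only check that could be called before handing an object
// to code that uses aligned SSE moves.
inline void requireAligned16(const void* p) {
    assert(isAligned16(p) && "object is not 16-byte aligned");
}
```

If a Holder starts on a 16-byte boundary, rotation is aligned as well; if the allocation is only 4- or 8-byte aligned, rotation is not, and that is the situation in which an aligned move such as movaps would fault.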
Workaround:
In Bullet's source file btQuadWord.h, on line 76 (the #if around the btQuadWord constructor, copy constructor, and assignment operator), change:
#if defined(BT_USE_SSE) || defined(BT_USE_NEON)
to:
#if (defined(BT_USE_SSE_IN_API) && defined(BT_USE_SSE)) || defined(BT_USE_NEON)
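As a quick sanity check of the two conditions, the sketch below evaluates both guards in isolation, without including any Bullet headers. The macro setup at the top is an assumed configuration (SSE enabled, BT_USE_SSE_IN_API and BT_USE_NEON not defined), and the two constant names are made up for the illustration.

```cpp
// Assumed build configuration for this illustration:
// SSE is enabled, but not requested for the public API, and NEON is off.
#undef BT_USE_SSE
#define BT_USE_SSE
#undef BT_USE_SSE_IN_API
#undef BT_USE_NEON

// Original guard: true whenever SSE is enabled at all.
#if defined(BT_USE_SSE) || defined(BT_USE_NEON)
constexpr bool kOriginalGuardEnablesSimd = true;
#else
constexpr bool kOriginalGuardEnablesSimd = false;
#endif

// Patched guard: SSE must also be requested for the API.
#if (defined(BT_USE_SSE_IN_API) && defined(BT_USE_SSE)) || defined(BT_USE_NEON)
constexpr bool kPatchedGuardEnablesSimd = true;
#else
constexpr bool kPatchedGuardEnablesSimd = false;
#endif

static_assert(kOriginalGuardEnablesSimd,
              "original guard compiles the SSE versions in this configuration");
static_assert(!kPatchedGuardEnablesSimd,
              "patched guard skips the SSE versions in this configuration");
```

Under that assumed configuration the original guard compiles the SSE versions and the patched guard does not, which matches the behavior btQuaternion already has for its own operator= and constructors.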