Bullet on GPU

SteveBaker
Posts: 127
Joined: Sun Aug 13, 2006 4:41 pm
Location: Cedar Hill, Texas

Re: Something wrong with the GPU demo..

Post by SteveBaker » Tue Nov 14, 2006 3:25 pm

tasora wrote: Hello,
I am testing your GPU demo since I am going to implement some
GP-GPU stuff for my multibody simulator.
The demo is very interesting, but I found some problems on my
system.
There are a lot of strange things about the way this demo does or does not run on different systems. Unfortunately, I only have access to a handful of Linux systems (with nVidia 6800 hardware) to do intensive testing on. We really need other open-source developers to do some diagnosis on other systems so that we have a chance of fixing this.
GPUphysics -s:
Oops: the program crashes!
Can you get us a stack backtrace from the debugger so we can understand where it crashed? That would tell me a lot.
GPUphysics -p:
OK, static green cubes displayed.
Are the cubes really solidly green or are they shades of green/blue/cyan?
GPUphysics -v:
OK, cubes are moving & colliding.

GPUphysics :
OK, cubes are moving & colliding.
Exactly as in GPUphysics -v mode (in fact there is
always the message about the vertex shader being disabled
by default on Windows systems.)
We don't understand why the vertex shader code doesn't work under Windows - that's a very bizarre thing that certainly slows the demo down somewhat.
The performance is a bit too low... I suspect that there's something
wrong. Just to give you an idea, the 'average' result is the following:
'Performance: 2 passes 256 cubes: other=0.501740ms collisions=16.285310ms'
Is this normal on an nVidia 7900GS board?
On my nVidia 6800-Ultra (on a deskside machine with a 2.8GHz CPU, I get:

Performance: 2 passes 256 cubes: other=0.296000ms collisions=1.746000ms

...so no, what you are seeing is definitely not good.

One possibility is that you have set the nVidia control panel (I'm not entirely sure how this works under Windows) so that the system's buffer swap is locked to the video vertical retrace signal. That would force the system to run no faster than 60Hz - or 16.667ms per frame - and if that delay somehow ended up being counted inside the collision code, it might explain the problem. The additional 'other' time is due to the vertex shader being disabled under Windows.
After I close the window by mouse clicking, the background DOS
window starts printing the following line THOUSANDS of times:
'GLSL_ShaderPair::getUniformLocation: OpenGL Error - operazione non valida' (Italian for 'invalid operation')
until I stop this by Ctrl+C.
That's weird too. Once again, this does not happen under Linux - so I'm not able to diagnose it. But it seems that one or more of the shader programs has been killed off while the main C++ renderer is still running - so whenever the C++ code tries to send data to the shader, that's an invalid operation... but that's a guess. I'm not familiar with Windows.
I suspect this should not happen - also, I guess this may be related
to note 1), i.e. the poor performance.
If the message was coming out all the time then that would indeed be a problem. Can you start the program from within a DOS shell so you can see the messages coming out as the program runs? It seems hard to believe that this particular error could be happening while the program was running normally (albeit slowly).
PS: have you tested the GP-GPU code in
http://www.mathematik.uni-dortmund.de/~ ... orial.html
There are some hints about supported OpenGL modes
on ATI-Nvidia etc.
No, but I'll take a look.

tasora
Posts: 27
Joined: Mon Aug 15, 2005 8:57 am
Location: Italy

Debugging the GPU demo..

Post by tasora » Tue Nov 14, 2006 4:00 pm

Hi Steve,
Are the cubes really solidly green or are they shades of green/blue/cyan?
They are shaded - no problems here.
We don't understand why the vertex shader code doesn't work under Windows - that's a very bizarre thing that certainly slows the demo down somewhat.
This is not a big issue, in my opinion, for the moment - maybe
it will be useful later when tuning the code for faster execution.
One possibility is that you have set the nVidia control panel (I'm not entirely sure how this works under Windows) such that the system's buffer swap is locked to the video vertical retrace signal.
Right! You are a genius of remote debugging! :) In fact I forced
vertical retrace = off (it was in 'Controlled by application' mode), and
now the performance is 30x faster!
Well, the result now is 0.5ms for 'other' stuff and less than 0.5ms for
collisions, on average.
BTW: Funny how the performance is still slow for a few tenths of a
second at the beginning, then it runs at full speed. Is this normal?
I'd like to install updated nVidia drivers to see whether these glitches
change, but I see that the latest releases of the mobile (GO series)
drivers are not very up-to-date.
RE: 'GLSL_ShaderPair::getUniformLocation
That's weird too. Once again, this does not happen under Linux - so I'm not able to diagnose it. But it seems that one or more of the shader programs has been killed off - yet the main C++ renderer is still running - so whenever C++ tries to send data to the shader that's an invalid operation...but that's a guess.
Ok, sorry, this was my fault! It looks like the process does not
close just by pressing the 'close' button in the top-right corner
of the window, so it simply kept running without the window and that's
why all these errors came up :) Quitting the process with Ctrl+C
in the console ends it correctly.
(I had thought the console was outputting all these errors because
they were queued in a buffer, since I once saw an app do that.)

Ok, I look forward to testing a new release of your GPU demos!

Alessandro

SteveBaker
Posts: 127
Joined: Sun Aug 13, 2006 4:41 pm
Location: Cedar Hill, Texas

Re: Debugging the GPU demo..

Post by SteveBaker » Tue Nov 14, 2006 4:45 pm

We don't understand why the vertex shader code doesn't work under Windows - that's a very bizarre thing that certainly slows the demo down somewhat.
This is not a big issue, in my opinion, for the moment - maybe
it will be useful later when tuning the code for faster execution.
What seems to be the problem is that Windows doesn't let me use floating point vertex shader texture maps - I have no idea why because the Windows and the Linux nVidia drivers share 95% of their code so you'd expect them both to support the same kinds of stuff in that regard. That's not really serious for the physics side of things - we'll be using fragment shaders most of the time. But in general, it's a problem with the libraries I've written - and I want to use those for other things (Doing AI on the GPU is interesting!).
One possibility is that you have set the nVidia control panel (I'm not entirely sure how this works under Windows) such that the system's buffer swap is locked to the video vertical retrace signal.
Right! you are a genius of remote debugging! :)
Well, the 16.xx milliseconds was a big clue. 16.667ms is 60Hz - and if that's your video frame rate (which it typically is on laptops) then all else is obvious.
Well, the now result is 0.5ms for 'other' stuff and less than 0.5 ms for
collision, on average.
Right - so the 'other stuff' is slow because of the problem with vertex shader textures in Windows. If we can figure out what's broken, that'll jump to a tenth of a millisecond or so. 0.5ms for collisions is three times faster than I get with my 6800 Ultra - and that's showing the benefit of the 7900's faster paths for floating point textures.
BTW: Funny how the performance is still slow for few tenth of
seconds at the beginning, then it runs at full speed. Is this normal?
Not on my Linux box. If forced to guess, I would suspect this is something to do with memory organisation and caching. Maybe the graphics card has its memory full of other junk that needs to be shuffled out of GPU memory - though that seems unlikely because you have half a gig of video RAM. But I don't know what Windows does with this stuff. Are you running other 3D applications at the same time (even applications that are iconified or paused)?
Ok, sorry, this was my fault! It looks like the process does not
close just by pressing the 'close' button in the top-right corner
of the window, so it simply kept running without the window and that's
why all these errors came up :) Quitting the process with Ctrl+C
in the console ends it correctly.
(I had thought the console was outputting all these errors because
they were queued in a buffer, since I once saw an app do that.)
So the 'close window' widget doesn't kill the application?!? Which version of GLUT are you using?

tasora
Posts: 27
Joined: Mon Aug 15, 2005 8:57 am
Location: Italy

Post by tasora » Tue Nov 28, 2006 10:22 am

Hello Steve,
by the way, have you ever tried using _multiple_ graphics boards for GPGPU processing? I mean, one could buy four graphics adapters and use them in parallel on the same PCI bus - a cheap way to build a supercomputer at home, if bus bandwidth doesn't become a serious bottleneck..
For example, as in this thread:

http://www.gpgpu.org/forums/viewtopic.p ... highlight=

(I found no code samples, so I wonder how the application can send the 'draw rectangle for GPU stuff' command to N boards without waiting for each one.)

regards

Alessandro

endif
Posts: 26
Joined: Wed Feb 07, 2007 5:24 pm

Post by endif » Thu Feb 22, 2007 11:04 pm

Whenever I run any of the precompiled versions from SourceForge, it says my card doesn't support vertex shaders, so the results won't be as impressive.

I have a Radeon 9600 XT, which I'm pretty sure supports vertex shaders.

I'm running windoze, so that may be the problem.

Or is the demo only for nVidia?

(Pardon my lack of technical knowledge on these points.)

Arseny Kapoulkine
Posts: 2
Joined: Tue Feb 27, 2007 2:40 am

Post by Arseny Kapoulkine » Tue Feb 27, 2007 2:49 am

Windows XP, MSVC 8.0. NV 6600 GT, drivers 84.60.

Everything works when vertex textures are turned on. :)

other ~= 0.5 ms, collisions ~= 0.9 ms, though numbers deviate too much to be certain.

edit: this version was compiled from source - I just grabbed it from the SVN repository.

endif
Posts: 26
Joined: Wed Feb 07, 2007 5:24 pm

Post by endif » Tue Feb 27, 2007 8:22 pm

I don't think I turned off vertex shaders...

I'm not even sure you can turn them off on my card.

Arseny Kapoulkine
Posts: 2
Joined: Tue Feb 27, 2007 2:40 am

Post by Arseny Kapoulkine » Thu Mar 01, 2007 4:28 pm

endif, it complains about the lack of vertex texturing support (ability to sample textures in vertex shaders). It's not supported by any released ATi card.

DevO
Posts: 95
Joined: Fri Mar 31, 2006 7:13 pm

Re: Bullet on GPU

Post by DevO » Tue Sep 18, 2007 10:47 pm

Hello!

What is the status of Bullet on GPU?
Has this development stopped now?

Another slightly off-topic question:
does freeglut work well on Win64?
I was able to compile it for Win64 after only one change in the code.
http://freeglut.sourceforge.net/

Why does Bullet not use freeglut?

bsanders
Posts: 6
Joined: Wed Sep 19, 2007 10:09 pm
Location: California, USA

Re: Bullet on GPU

Post by bsanders » Mon Sep 24, 2007 11:59 pm

Would it be possible to have Bullet work on the PhysX card? Just curious.

DevO
Posts: 95
Joined: Fri Mar 31, 2006 7:13 pm

Re: Bullet on GPU

Post by DevO » Sat Jul 12, 2008 12:27 pm

Hi,

Well, the PhysX card is now obsolete.
PhysX now runs on GeForce GPUs via CUDA.
The question now is how hard it would be to port Bullet to CUDA and/or AMD Stream.
If I wanted to try to port Bullet to the GPU using CUDA or Stream, where would I start?
What are the most time-consuming parts of Bullet?
Which parts can easily be calculated on the GPU?

sparkprime
Posts: 508
Joined: Fri May 30, 2008 2:51 am
Location: Ossining, New York
Contact:

Re: Bullet on GPU

Post by sparkprime » Mon Jul 14, 2008 6:50 pm

Probably just convert the SPU solver stuff to use CUDA instead of the SPU libs. NV GPUs and the Cell are similar from the programmer's point of view, though you have to work harder to get full performance out of an NV GPU.

Hamstray
Posts: 15
Joined: Thu Jan 11, 2007 7:45 pm

Re: Bullet on GPU

Post by Hamstray » Mon Sep 01, 2008 10:38 pm

Just a quick question: why bother doing physics on the GPU? The GPU is already the bottleneck, doing all the fancy drawing and shading of far fewer objects than the physics can handle. The average GPU is a GMA anyway...

Erwin Coumans
Site Admin
Posts: 4184
Joined: Sun Jun 26, 2005 6:43 pm
Location: California, USA

Re: Bullet on GPU

Post by Erwin Coumans » Tue Sep 02, 2008 9:00 pm

Optimizing and parallelizing hotspots for multi-core, Cell SPU, CUDA and Larrabee is an interesting and important topic for future Bullet development.

Several stages of the physics pipeline should be considered. For SPU we focussed on narrowphase collision detection, aabb tree traversal, constraint solver and ray casting.

For Bullet 2.71, we will share some r&d work in using CUDA for the collision detection broadphase, the btCudaBroadphase.
Thanks,
Erwin

DIMEDROLL
Posts: 5
Joined: Fri May 21, 2010 12:11 pm

Re: Bullet on GPU

Post by DIMEDROLL » Fri May 21, 2010 5:07 pm

Hi,
Can you tell me the current status of the CUDA work? Is it possible to simulate cloth (soft body) collisions on the GPU right now?
