[LLVMdev] Help adding the Bullet physics sdk benchmark to the LLVM test suite?

Tue Jan 5 13:38:53 PST 2010

On Tuesday 05 January 2010 14:53, Erwin Coumans wrote:
> How do other benchmarks deal with unstable algorithms or differences in
> floating point results?
>
> >> haven't been following this thread, but this sounds like a typical
> >> unstable algorithm problem.  Are you always operating that close to
> >> the tolerance level of the algorithm or are there some sets of inputs
> >> that will behave reasonably?
>
> What do you mean by "reasonably" or "affect codes so horribly"?

"Reasonably" means the numerics won't blow up due to small changes in 
floating-point results caused by compiler transformations like reassociation.
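
To make the reassociation point concrete, here is a minimal sketch.  It is
plain Python (whose floats are IEEE doubles), not LLVM output, and the
values are picked by me purely for illustration.  Both lines compute the
same real-number sum; only the grouping differs, which is exactly the
freedom a reassociating optimizer takes.

```python
# Reassociation sketch: the same three terms summed in two groupings.
# The values are illustrative assumptions, not taken from Bullet.
a, b, c = 1e16, -1e16, 1.0

left_to_right = (a + b) + c   # source order: a and b cancel first, then c is added
reassociated = a + (b + c)    # regrouped: c is absorbed by rounding when added to b

print(left_to_right, reassociated)
```

The first grouping gives 1.0 and the second gives 0.0.  A stable algorithm
shrugs off a difference like that; an unstable one amplifies it.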

"Affects code so horribly" means that the compiler is causing an unstable
algorithm to blow up, generating useless results.  This shouldn't happen 
unless the user allows it with an explicit compiler flag.  AFAIK LLVM has
no such flag yet.  It has some flags to control changes in precision, which
help, but I don't think there's a flag that says "don't do anything risky,
ever."

For example, a gfortran-fronted LLVM should have a way to always respect
ordering indicated by parentheses.  I don't know if gfortran even has that,
let alone LLVM proper.

> The accumulation of algorithms in a physics pipeline is unstable and unless
> the compiler/platform guarantees 100% identical floating point results, the 
> outcome will diverge. 

Yep.  100% reproducibility is really important.  LLVM should have a flag to
guarantee it.
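
A rough sketch of why the divergence is unavoidable without bit-identical
results.  This is a toy stand-in, not Bullet: I'm assuming a chaotic
iteration (the logistic map) plays the role of the frame-to-frame physics
update, and the function name and the 1e-15 perturbation are made up for
the example.  The perturbation stands in for a last-bit difference between
two builds.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r*x*(1-x), returning every state including the start."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

steps = 100  # think of each step as one simulated frame
reference = logistic_trajectory(0.3, steps)
perturbed = logistic_trajectory(0.3 + 1e-15, steps)  # rounding-sized difference

# Per-frame distance between the two runs.
gaps = [abs(p - q) for p, q in zip(reference, perturbed)]
first_visible = next((i for i, g in enumerate(gaps) if g > 1e-3), None)

print("initial gap:", gaps[0])
print("largest gap:", max(gaps))
print("first frame with gap > 1e-3:", first_visible)
```

The two runs start about 1e-15 apart and end up completely unrelated well
within the 100 frames, so comparing accumulated output against a reference
only works if every intermediate result is bit-identical.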

> Do you think LLVM can be forced to produce identical floating point
> results? Even when using different optimization levels or even different
> CPUs?

Not right now, but the support can certainly be added.  It really *should*
be added.  It will take a bit of work, however.

> Some CPUs use 80bit FPU precision for intermediate results (on-chip in
> registers), while variables in-memory only use 32-bit or 64bit precision.
> In combination with cancellation and other re-ordering this can
> give slightly different results.

Yep, which is why good compilers have ways to control this.  llc, for example,
has the -disable-excess-fp-precision and -enable-unsafe-fp-math options.  I
don't know if there's a way to control usage of the x87 stack, however.
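
Here is a small sketch of the excess-precision effect, scaled down so it
runs anywhere: I'm using 64-bit doubles as the "wide register" format and
32-bit floats as the "in-memory" format, as an analogy for 80-bit x87
registers versus 64-bit memory.  The helper to_f32 and the values are my
own choices for the example.

```python
import struct

def to_f32(x):
    """Round a Python float (IEEE double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

big, one = 16777216.0, 1.0   # 2**24; the next 32-bit float above it is 2**24 + 2

# "Spilled" path: every intermediate is rounded to the storage format,
# as if the value were written to memory between operations.
spilled = to_f32(to_f32(big + one) - big)

# "In-register" path: intermediates keep the wider format and only the
# final result is rounded to the storage format.
in_register = to_f32((big + one) - big)

print(spilled, in_register)
```

The spilled path gives 0.0 and the in-register path gives 1.0.  Whether a
given intermediate stays in a register or gets spilled is a code
generation decision, so it can change with optimization level.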

> >> If not, the code doesn't seem very useful to me.  How could anyone rely
> >> on the results, ever?
>
> The code has proven to be useful for games and special effects in film,
> but this particular benchmark might not suit LLVM testing indeed.

We can make it suit it.  If it works for real world situations it must
work when compiled with LLVM.  Otherwise it's an LLVM bug (assuming the
code is not doing undefined things).

> I suggest working on a better benchmark that tests independent parts of the
> pipeline,

That's useful in itself.

> so we don't accumulate results (several frames) but we test a single
> algorithm at a time,

No, we should be testing this accumulated stuff as well.  As LLVM gets used in 
more arenas, this type of problem will crop up, guaranteed.  In fact the only
way we (Cray) get away with it is that we don't use very many LLVM passes and
we strictly target SSE only.

                              -Dave