[llvm-dev] a proposed script to help with test-suite programs that output _lots_ of FP numbers

Renato Golin via llvm-dev llvm-dev at lists.llvm.org
Thu Sep 29 14:25:00 PDT 2016


On 29 September 2016 at 18:59, Abe Skolnik <a.skolnik at samsung.com> wrote:
> As part of working on making test-suite less demanding of exact FP results
> so my FP-contraction patch can go back into trunk and stay there, today I
> analyzed "MultiSource/Benchmarks/VersaBench/beamformer".  I found that the
> raw output from that program is 2789780 bytes [i.e. ~2.7 _megabytes_] of
> floating-point text, which IMO is too much to put into a patch -- or at
> least a _civilized_ patch.  ;-)

Not to mention having to debug the whole thing every time it breaks. :S

How I "fixed" this in the past was to apply heuristics like you did,
while still trying to keep the meaning.

I think the idea of having a "number of results" is good, but I also
think you can separate the 300k values into logical groups, maybe
adding them up.

Of course, the more you do to the results, the larger the rounding
errors will be, and the less meaning the results will have.

I don't know much about this specific benchmark, but if it has some
kind of internal aggregation values, you can dump those instead?
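If the values do have to stay external, the grouping idea above could be sketched roughly like this (a hypothetical example, assuming the benchmark emits one FP value per line; the group size is arbitrary):

```python
# Hypothetical sketch: reduce a long stream of FP values (one per line)
# into a count plus per-group partial sums, so the reference output
# stays small while still localizing where a divergence happens.

def summarize(lines, group_size=4096):
    values = [float(s) for s in lines if s.strip()]
    # One partial sum per group of group_size consecutive values.
    groups = [sum(values[i:i + group_size])
              for i in range(0, len(values), group_size)]
    return len(values), groups

count, groups = summarize(["1.5", "2.5", "3.0"], group_size=2)
print(count)   # 3
print(groups)  # [4.0, 3.0]
```

A per-group sum keeps more structure than a single global sum, at the cost of a few more lines of reference output per group.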


> As a result, I wrote the below Python program, which I think should deal
> with the problem fairly well, or at least is a good first attempt at doing
> so and can be improved later.

The python script is a good prototype for what you want, but I'd
rather change the source code of the benchmark to print less, but
more valuable, output.

The more you print, the more your run will be tied to stdio and less
to what you wanted to benchmark in the first place.


> Vanilla compiler, i.e. without FP-contraction patch
> ---------------------------------------------------
> 286720
> 9178782.5878
>
> Compiler WITH FP-contraction patch
> ----------------------------------
> 286720
> 9178782.58444

This looks like a small enough change to me, given the amount of
precision you're losing. But it'd be better to make sure the result
has at least some meaning related to the benchmark.
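For reference, here is a minimal sketch of the kind of relative-tolerance check a tool like fpcmp performs (the real tool's semantics may well differ; the tolerance value here is just an assumption), applied to the two sums quoted above:

```python
# Hypothetical relative-tolerance comparison: two values match if their
# relative difference is within rel_tol. fpcmp's actual rules may differ.

def fp_close(a, b, rel_tol=1e-5):
    if a == b:
        return True
    return abs(a - b) <= rel_tol * max(abs(a), abs(b))

print(fp_close(9178782.5878, 9178782.58444))  # True
print(fp_close(1.0, 2.0))                     # False
```

The relative difference between the two sums is on the order of 1e-10, far below any tolerance one would plausibly configure.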



> If "fpcmp" currently cannot ignore tolerances for integers and cannot easily
> be extended to be able to do so, then I propose that the two calculated
> results go to separate outputs [2 files?] to be tested separately.

That'd be fine, too.

cheers,
--renato
