[cfe-dev] [llvm-dev] fixing overly-demanding-of-exact-equality FP tests: MultiSource/Benchmarks/MiBench/telecomm-FFT

Wed Sep 28 15:35:10 PDT 2016

> On Sep 28, 2016, at 2:49 PM, Abe Skolnik via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Dear all,
> 
> As a result of my proposed patch at <https://reviews.llvm.org/D24481> being reverted because it breaks test-suite tests that either expect exact equality [as this one, i.e. "telecomm-FFT", currently does] or use too strict a tolerance, I am investigating 20 tests and working on solutions. This is the first publicly-proposed solution, or at least the beginnings of one.
> 
> The first problem with "telecomm-FFT" is that the test is currently configured to capture the output [688111 bytes of floating-point text, including lots of spaces and just a few newlines], hash it, throw away the original text, then compare the hashes.  Unless somebody wants to rewrite this test in integer code [good luck ;-)], this obviously won't "do".
> 
> Does anybody know of a way to have test-suite automatically uncompress a compressed reference file, run the test and compare [with "fpcmp" and a tolerance, in this case], then delete the temporarily-uncompressed reference file and compress the raw output [for post-run analysis]?  I don't yet know enough about the inner workings of test-suite to write that by myself.

This is not possible, and I would rather not encourage adding or using such a feature, for the following reason:

A benchmark that produces a lot of text output is a bad compiler/CPU test. The system spends a lot (sometimes most) of its time in the kernel, in libc dealing with I/O, and in the md5sum utility we pipe the data to, rather than in the actual program. I therefore consider use of the HASH_PROGRAM_OUTPUT flag a bad smell; in the long term we should instead change the benchmarks that produce so much output that we would want to hash and/or compress it.

In your case you should probably just add -ffp-contract=off for now. (I'd be happy to see the benchmarks improved but this seems out of scope for the problem at hand).
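
For readers wondering why contraction changes the output at all: a fused multiply-add rounds once, while a separate multiply and add round twice. The sketch below is only an illustration, not code from the benchmark. Python has no contraction switch, so the fused result is emulated with exact rational arithmetic and a single final rounding; the function names and input values are made up for the example.

```python
from fractions import Fraction

def mul_add_unfused(a: float, b: float, c: float) -> float:
    """Multiply, round to double, then add and round again (two roundings)."""
    product = a * b          # first rounding happens here
    return product + c       # second rounding happens here

def mul_add_fused(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: exact a*b+c, then one rounding."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no rounding at all
    return float(exact)                              # the only rounding

# Hypothetical inputs chosen so that the exact product 1 - 2**-54
# is not representable as a double and rounds to 1.0.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print("unfused:", mul_add_unfused(a, b, c))
print("fused:  ", mul_add_fused(a, b, c))
```

The two results differ only in the last bits, but printed floating-point text that is hashed or compared for exact equality will no longer match. That is the effect -ffp-contract=off avoids.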

- Matthias

> 
> As for the tolerance, I experimentally derived the absolute tolerance 0.000002 using [after minimization] the following:
> 
>  > ../../../../tools/fpcmp -a 0.000001 new_reference_output /work/Abe/Sept._27_2016/LLVM_test-suite_build_004___AArch64___Clang_2016-09-07-20-03-19-b0768e8-master/MultiSource/Benchmarks/MiBench/telecomm-FFT/new_reference_output
>  ../../../../tools/fpcmp: Compared: 3.610700e-02 and 3.610600e-02
>  abs. diff = 1.000000e-06 rel.diff = 2.769623e-05
>  Out of tolerance: rel/abs: 0.000000e+00/1.000000e-06
> 
>  > ../../../../tools/fpcmp -a 0.000002 new_reference_output /work/Abe/Sept._27_2016/LLVM_test-suite_build_004___AArch64___Clang_2016-09-07-20-03-19-b0768e8-master/MultiSource/Benchmarks/MiBench/telecomm-FFT/new_reference_output
>  > echo $?
>  0
> 
> 
> ... where the two runs were both on the same AArch64 device, once built with a compiler from this commit:
> 
>  commit b0768e805d1d33d730e5bd44ba578df043dfbc66
>  Author: George Burgess IV <george.burgess.iv at gmail.com>
>  Date:   Wed Sep 7 20:03:19 2016 +0000
> 
> ... and the other with my fusion-enabling patch on top of that.
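
To illustrate the difference between the hash comparison and a tolerance comparison discussed above, here is a minimal sketch. It is not fpcmp's actual implementation: the helper names are hypothetical, only an absolute tolerance is modeled, and the inputs are assumed to be whitespace-separated numbers. The first value in each sample string is the pair reported in the fpcmp log above; the second value is invented for the example.

```python
import hashlib

def hashes_match(out_a: str, out_b: str) -> bool:
    """Exact comparison via MD5 digests: any textual difference fails."""
    digest_a = hashlib.md5(out_a.encode()).hexdigest()
    digest_b = hashlib.md5(out_b.encode()).hexdigest()
    return digest_a == digest_b

def within_abs_tolerance(out_a: str, out_b: str, abs_tol: float) -> bool:
    """Compare numbers token by token, allowing an absolute difference."""
    tokens_a = out_a.split()
    tokens_b = out_b.split()
    if len(tokens_a) != len(tokens_b):
        return False  # a different number of values is always a mismatch
    return all(abs(float(x) - float(y)) <= abs_tol
               for x, y in zip(tokens_a, tokens_b))

reference = "3.610700e-02 1.000000e+00"
candidate = "3.610600e-02 1.000000e+00"

print("hashes match:     ", hashes_match(reference, candidate))
print("within 0.000002:  ", within_abs_tolerance(reference, candidate, 2e-6))
print("within 0.0000005: ", within_abs_tolerance(reference, candidate, 5e-7))
```

Once the output has been reduced to a hash, the second kind of check is no longer possible, because the original numbers are gone.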
> 
> Regards,
> 
> Abe
> 
> 