[cfe-dev] [llvm-dev] fixing overly-demanding-of-exact-equality FP tests: MultiSource/Benchmarks/MiBench/telecomm-FFT
Matthias Braun via cfe-dev
cfe-dev at lists.llvm.org
Wed Sep 28 15:35:10 PDT 2016
> On Sep 28, 2016, at 2:49 PM, Abe Skolnik via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Dear all,
> As a result of my proposed patch at <https://reviews.llvm.org/D24481> being reverted b/c it breaks test-suite tests that either expect exact equality [as this one, i.e. "telecomm-FFT", currently does] or too-strict tolerance, I am investigating 20 tests and working on solutions. This is the first publicly-proposed solution, or at least the beginnings of one.
> The first problem with "telecomm-FFT" is that the test is currently configured to capture the output [688111 bytes of floating-point text, including lots of spaces and just a few newlines], hash it, throw away the original text, then compare the hashes. Unless somebody wants to rewrite this test in integer code [good luck ;-)], this obviously won't "do".
> Does anybody know of a way to have test-suite automatically uncompress a compressed reference file, run the test and compare [with "fpcmp" and a tolerance, in this case], then delete the temporarily-uncompressed reference file and compress the raw output [for post-run analysis]? I don't yet know enough about the inner workings of test-suite to write that by myself.
This is not currently possible, and I would rather not add it or encourage its use, for the following reason:
A benchmark that produces a lot of text output is a bad compiler/CPU test. The system spends a lot (sometimes most) of its time in the kernel and libc dealing with I/O, and in the md5sum utility we pipe the data to, rather than in the actual program. I therefore consider use of the HASH_PROGRAM_OUTPUT flag a bad smell; in the long term we should instead change the benchmarks that produce so much output that we would want to hash and/or compress it.
In your case you should probably just add -ffp-contract=off for now. (I'd be happy to see the benchmarks improved, but this seems out of scope for the problem at hand.)
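For readers wondering why contraction changes the output at all, here is a minimal Python sketch. It is only an illustration, not anything taken from the test-suite: the inputs a, b and c are made up for the purpose, and the fused result is emulated with exact rational arithmetic (one rounding at the end) rather than with a hardware FMA instruction.

```python
from fractions import Fraction

# Hypothetical inputs, chosen so that the exact product a*b lies
# halfway between two adjacent doubles.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Unfused: the product is rounded to a double, then the sum is rounded again.
unfused = a * b + c

# Fused (emulated): compute a*b + c exactly, then round once to a double.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print("unfused:", unfused)  # the rounded product is exactly 1.0, so this is 0.0
print("fused:  ", fused)    # the exact result -2**-54 survives the single rounding
```

The two results differ only in the last bits, but once such values go through thousands of butterfly operations and are printed as text, a hash of the output no longer matches.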
> As for the tolerance, I experimentally derived the absolute tolerance 0.000002 using [after minimization] the following:
> > ../../../../tools/fpcmp -a 0.000001 new_reference_output /work/Abe/Sept._27_2016/LLVM_test-suite_build_004___AArch64___Clang_2016-09-07-20-03-19-b0768e8-master/MultiSource/Benchmarks/MiBench/telecomm-FFT/new_reference_output
> ../../../../tools/fpcmp: Compared: 3.610700e-02 and 3.610600e-02
> abs. diff = 1.000000e-06 rel.diff = 2.769623e-05
> Out of tolerance: rel/abs: 0.000000e+00/1.000000e-06
> > ../../../../tools/fpcmp -a 0.000002 new_reference_output /work/Abe/Sept._27_2016/LLVM_test-suite_build_004___AArch64___Clang_2016-09-07-20-03-19-b0768e8-master/MultiSource/Benchmarks/MiBench/telecomm-FFT/new_reference_output
> > echo $?
> ... where the two runs were both on the same AArch64 device, once built with a compiler from this commit:
> commit b0768e805d1d33d730e5bd44ba578df043dfbc66
> Author: George Burgess IV <george.burgess.iv at gmail.com>
> Date: Wed Sep 7 20:03:19 2016 +0000
> ... and the other with my fusion-enabling patch on top of that.
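To make the tolerance numbers quoted above concrete, here is a small Python sketch of an absolute/relative tolerance check. It is an assumption about the general technique and is not fpcmp's actual source; the names diffs and within_tolerance are hypothetical. The relative difference is taken against the second value, which reproduces the rel.diff figure in the quoted log.

```python
def diffs(v1, v2):
    """Return (absolute difference, relative difference) of two values.

    The relative difference is measured against v2 (an assumption that
    happens to match the numbers in the quoted log).
    """
    abs_diff = abs(v1 - v2)
    if v2 != 0.0:
        rel_diff = abs(v1 / v2 - 1.0)
    elif v1 != 0.0:
        rel_diff = float("inf")
    else:
        rel_diff = 0.0
    return abs_diff, rel_diff


def within_tolerance(v1, v2, abs_tol=0.0, rel_tol=0.0):
    """Accept the pair if either the absolute or the relative bound holds."""
    abs_diff, rel_diff = diffs(v1, v2)
    return abs_diff <= abs_tol or rel_diff <= rel_tol


# The pair reported in the log above.
new, ref = 3.610700e-02, 3.610600e-02
abs_diff, rel_diff = diffs(new, ref)
print(f"abs. diff = {abs_diff:e}  rel. diff = {rel_diff:e}")
print(within_tolerance(new, ref, abs_tol=0.000002))  # accepted with -a 0.000002
```

Note that the printed values only carry six decimal places, so a difference of one unit in the last printed digit sits right at 0.000001; a tolerance of exactly that size is at the mercy of binary rounding, which is presumably why the next step up was needed.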