[llvm-dev] [test-suite] making the test-suite succeed with "-Ofast" and "-ffp-contract=on"

Sebastian Pop via llvm-dev llvm-dev at lists.llvm.org
Fri Oct 7 17:34:40 PDT 2016


Hi,

I would like to summarize the different proposals for fixing the
test-suite so that it succeeds when the extra CFLAGS "-Ofast" and
"-ffp-contract=on" are specified.  I would also like to expose the
issue and the proposed fixes to other potential reviewers who could
provide extra feedback.  We also need to decide which proposal (or
combination of proposals) to implement and commit.

Proposal 1: https://reviews.llvm.org/D25277
Modify the CMake files to compile and run each of these benchmarks
twice: once with the user-specified CFLAGS and once with
-ffp-contract=off added.  Record on disk the full output of both runs
and compare them with FP_TOLERANCE.  Hash the output of the run with
-ffp-contract=off and match it exactly against the reference output.
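
To make the comparison step concrete, here is a minimal sketch of a
tolerance check over two recorded outputs, in the spirit of what fpcmp
does.  It is not the fpcmp implementation: the function name
outputs_match, the relative-tolerance formula, and the assumption that
each output is just whitespace-separated numbers are all mine, for
illustration only.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the two text outputs contain the same
   count of numbers and each pair agrees within rel_tol, 0 otherwise.
   Assumes the outputs hold only whitespace-separated numbers; anything
   that strtod cannot parse is treated as the end of the output. */
static int outputs_match(const char *got, const char *ref, double rel_tol) {
  assert(got != NULL && ref != NULL);
  for (;;) {
    char *got_end, *ref_end;
    double g = strtod(got, &got_end);
    double r = strtod(ref, &ref_end);
    int got_done = (got_end == got); /* no number parsed */
    int ref_done = (ref_end == ref);
    if (got_done || ref_done)
      return got_done && ref_done;   /* both outputs must end together */
    /* Relative tolerance, falling back to absolute for small values. */
    double scale = fabs(r) > 1.0 ? fabs(r) : 1.0;
    if (fabs(g - r) > rel_tol * scale)
      return 0;
    got = got_end;
    ref = ref_end;
  }
}
```

With this sketch, outputs_match("1.0 2.0\n", "1.0000001 2.0\n", 1e-5)
accepts the pair, while a missing value or a difference larger than the
tolerance is rejected.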

The good for Proposal 1:
- changes contained in the build system: no change to the code of the benchmarks
- runs benchmarks under an extra configuration with CFLAGS += -ffp-contract=off

The bad for Proposal 1:
- compilation time will double
- running time on the device will double
- build system is more complex
- the build directory grows from 300M to 1.2G due to the extra
reference outputs recorded under -ffp-contract=off
- when running the test-suite on small devices, it will cost 1G more
transfer over the network

Proposal 2: https://reviews.llvm.org/D25346
Like Proposal 1, except that no extra files are written to disk (and
transferred over the network from the device to the host that does
the fpcmp and hashing).  Instead, the outputs of both the normal
compilation and the kernel compiled under "#pragma STDC FP_CONTRACT OFF"
are computed and compared on the device running the benchmark.  Only
the output of the -ffp-contract=off kernel is written to disk and, as
currently done in the test-suite, that output is hashed and matched
exactly against the reference output.
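
As a rough sketch of what a benchmark modified along these lines could
look like (this is not the code from D25346): the kernel, the input
values, the FP_TOLERANCE value and the helper names below are all
hypothetical.  Note that a compiler that does not implement the pragma
may ignore it, in which case both kernels are compiled identically.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdio.h>

#define FP_TOLERANCE 1e-5 /* hypothetical value, for illustration only */

/* Kernel built with whatever CFLAGS the user passed (may be contracted). */
static void kernel(const double *a, const double *b, double *out, size_t n) {
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] * b[i] + a[i];
}

/* Same kernel with contraction disabled for this function body. */
static void kernel_strict(const double *a, const double *b, double *out,
                          size_t n) {
#pragma STDC FP_CONTRACT OFF
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] * b[i] + a[i];
}

/* Returns 1 if every element of x is within tol (relative) of ref. */
static int within_tolerance(const double *x, const double *ref, size_t n,
                            double tol) {
  assert(tol >= 0.0);
  for (size_t i = 0; i < n; i++) {
    double scale = fabs(ref[i]) > 1.0 ? fabs(ref[i]) : 1.0;
    if (fabs(x[i] - ref[i]) > tol * scale)
      return 0;
  }
  return 1;
}

/* Runs both kernels on the same inputs, compares them on the device, and
   prints only the strict output, which the harness would hash and match
   against the reference.  Returns 0 on success, 1 on mismatch. */
static int run_benchmark(void) {
  enum { N = 8 };
  double a[N], b[N], out[N], out_strict[N];
  for (size_t i = 0; i < N; i++) {
    a[i] = 1.0 + (double)i / 3.0;
    b[i] = 2.0 - (double)i / 7.0;
  }
  kernel(a, b, out, N);               /* inputs are read-only: shared */
  kernel_strict(a, b, out_strict, N); /* one extra output array */
  if (!within_tolerance(out, out_strict, N, FP_TOLERANCE)) {
    fprintf(stderr, "outputs differ by more than FP_TOLERANCE\n");
    return 1;
  }
  for (size_t i = 0; i < N; i++)
    printf("%0.6f\n", out_strict[i]);
  return 0;
}
```

A benchmark's main() would simply return run_benchmark().  The sketch
also shows where the extra costs come from: a second kernel to compile
and run, one extra output array, and an extra loop for the comparison.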

The good for Proposal 2:
- no modifications to CMake and Makefiles
- no extra space to store the extra reference output
- tests both the user-specified CFLAGS mode (e.g., fast-math) and -ffp-contract=off

The bad for Proposal 2:
- compilation time will double: e.g., Polly will optimize both kernels
- memory requirements on the device will almost double: one extra
output array is added; input arrays are not modified, so there is no
need to duplicate them
- compute time on the device will more than double: the kernel runs
twice, plus an extra loop over both outputs to compare them with
FP_TOLERANCE
- requires modifications to the code of the benchmarks: some
benchmarks may not be easily modified and will need to be run only
under -ffp-contract=off (as in Proposal 3)

Proposal 3: https://reviews.llvm.org/D25351
Modify the Makefiles and CMake files to explicitly specify the flags
under which the results will match the recorded reference output.

The good for Proposal 3:
- no modifications to the benchmarks
- minimal modifications to the build system

The bad for Proposal 3:
- these benchmarks will not be tested with -ffp-contract=on: exact
matching of the reference output requires -ffp-contract=off
- no tests are added, whereas adding more tests (as in Proposals 1 and
2) is actually a good thing for the test-suite

I would like to invite other people to review the above proposals and
suggest a way forward on fixing the current state of the test-suite
when running under CFLAGS="-Ofast" and "-ffp-contract=on".  Once
consensus is reached, I am willing to implement the chosen proposal
and to address all the review comments necessary to commit the change
to the test-suite.

Thank you,
Sebastian

