[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction
Andy Kaylor via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Mar 12 15:12:13 PDT 2020
andrew.w.kaylor added a comment.
In D72675#1920309 <https://reviews.llvm.org/D72675#1920309>, @lebedev.ri wrote:
> I may be wrong, but I suspect those failures aren't actually due to the fact
> that we pessimize optimizations with this change, but that the whole execution
> just fails. Can you try running test-suite locally? Do tests themselves actually pass,
> ignoring the question of their performance?
I find the LNT output very hard to decipher, but my understanding was that all of the failures on the Broadwell (x86) LNT bot were performance regressions rather than execution failures. There were many perf improvements as well as many regressions. I investigated the top regression and found that the loop unroller made a different decision when the llvm.fmuladd intrinsic was used than it did for separate mul and add operations: the central loop of the test was unrolled eight times rather than four. Broadwell gets less benefit from FMA than either Haswell or Skylake, so on that target fusing the operations is less likely to offset any other factor that costs performance. More generally, I don't see any reason why allowing FMA would cause performance to drop, apart from secondary changes in compiler behavior like this one.
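For illustration only, here is a minimal sketch of the kind of loop where this difference can show up. It is not the test-suite benchmark discussed above; the function name dot_product is hypothetical. The multiply and add sit in the same expression, so with contraction allowed (for example -ffp-contract=on) Clang may represent them as a single llvm.fmuladd call, while with -ffp-contract=off they should stay as separate fmul and fadd instructions. Later passes such as the loop unroller then see differently shaped IR.

```c
#include <stddef.h>

/* Hypothetical kernel, not taken from the LLVM test-suite.
 * The expression acc + a[i] * b[i] is a multiply feeding an add within
 * one expression, which is the pattern eligible for FP contraction. */
double dot_product(const double *a, const double *b, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i) {
        acc = acc + a[i] * b[i];
    }
    return acc;
}
```

One way to compare the two forms, assuming a reasonably recent Clang, is to emit IR with each setting (clang -O2 -S -emit-llvm -ffp-contract=on versus -ffp-contract=off) and look at the loop body and the unroll factor chosen. The exact unrolling decision depends on the target and compiler version.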
At least one other target had execution failures caused by Melanie's change, but I understood it to be simply exposing a latent problem in the target-specific code.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D72675/new/
https://reviews.llvm.org/D72675