[PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.
Artem Belevich via cfe-commits
cfe-commits at lists.llvm.org
Tue May 17 16:21:24 PDT 2016
tra added a comment.
In http://reviews.llvm.org/D20341#432494, @hfinkel wrote:
> That having been said, is this change the equivalent of -ffp-contract=fast or -ffp-contract=on? I think it is the latter and we want the former (i.e. where we let the backend be as aggressive as possible *after* inlining).
It is -ffp-contract=on. As it happens, it appears to produce better code than -ffp-contract=fast, at least on some benchmarks. Apparently the smaller IR (a smaller number of intrinsic call instructions vs. multiple separate mul+add instructions) makes the job easier for the straight-line strength reduction pass, which is then able to remove more redundant calculations in unrolled loops.
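For readers less familiar with FP contraction, here is a minimal sketch of what fusing changes numerically. It is not taken from the patch. The helper names (mul_then_add, fused_mul_add, sample_inputs) are hypothetical, and the volatile temporary is only an assumption-level trick to keep the compiler from contracting the unfused version on its own. As far as I understand the flag, =on allows fusing only within a single expression such as `a * b + c`, while =fast also lets the backend fuse a mul and an add that come from separate statements.

```cpp
#include <cmath>

// Hypothetical helper: multiply, round the product to double, then add.
// The volatile temporary forces the product to be rounded and stored,
// so the compiler cannot contract the two operations into one FMA.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Hypothetical helper: fused multiply-add. The exact product a * b is
// added to c and the result is rounded only once, at the end.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Inputs chosen so that a * b is exactly 1 - 2^-54, which is not
// representable in double and rounds to 1.0 when computed on its own.
struct Inputs {
    double a;
    double b;
    double c;
};

Inputs sample_inputs() {
    const double eps = std::ldexp(1.0, -27);  // 2^-27
    return {1.0 + eps, 1.0 - eps, -1.0};
}
```

With these inputs the unfused version returns 0, because the product has already been rounded to 1.0 before the add, while the fused version returns -2^-54. So the choice of contraction mode can change results as well as the shape of the IR.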