[PATCH] D20341: [CUDA] Enable fusing FP ops for CUDA by default.

Artem Belevich via cfe-commits cfe-commits at lists.llvm.org
Tue May 17 16:21:24 PDT 2016


tra added a comment.

In http://reviews.llvm.org/D20341#432494, @hfinkel wrote:

> That having been said, is this change the equivalent of -ffp-contract=fast or -ffp-contract=on? I think it is the latter and we want the former (i.e. where we let the backend be as aggressive as possible *after* inlining).


It is -ffp-contract=on. As it happens, it appears to produce better code than -ffp-contract=fast, at least on some benchmarks. Apparently the smaller IR (fewer intrinsic call instructions instead of multiple separate mul+add pairs) makes the job easier for the straight-line strength reduction pass, which is then able to remove more redundant calculations in unrolled loops.
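For readers less familiar with what "fusing" changes numerically, here is a minimal host-side C++ sketch (not taken from the patch) contrasting a separate multiply and add with a single fused multiply-add. The function names and the input values are hypothetical, chosen only so that the one extra rounding step becomes visible. The `volatile` is there just to keep the compiler from contracting the unfused version on its own.

```cpp
#include <cmath>

// Unfused: the product is rounded to float, then the sum is rounded again.
// `volatile` forces the intermediate to be materialized as a float, so the
// compiler cannot contract the two operations into one FMA by itself.
float mul_then_add(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Fused: std::fma computes a * b + c with a single rounding at the end,
// which is what a contracted mul+add pair turns into.
float fused_mul_add(float a, float b, float c) {
    return std::fma(a, b, c);
}

// Hypothetical inputs picked so the exact product, 1 + 2^-11 + 2^-24,
// falls halfway between two adjacent floats.
float demo_input_a() { return 1.0f + std::ldexp(1.0f, -12); }   // 1 + 2^-12
float demo_input_c() { return -(1.0f + std::ldexp(1.0f, -11)); } // -(1 + 2^-11)
```

With a = b = demo_input_a() and c = demo_input_c(), mul_then_add returns 0 because the 2^-24 term is lost when the product is rounded, while fused_mul_add returns 2^-24. As far as the flags go, -ffp-contract=on only allows this fusion within a single source expression, whereas -ffp-contract=fast also lets the backend fuse multiplies and adds that came from separate statements.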


http://reviews.llvm.org/D20341




