[cfe-dev] fp-contract at -O0
Kaylor, Andrew via cfe-dev
cfe-dev at lists.llvm.org
Fri Feb 14 10:00:00 PST 2020
> 3 Make -ffp-contract=fast always imply =on as well (so the frontend would form fmuladd nodes in both modes, but =fast would additionally license forming fma out of mul+add pairs).
This option could potentially impede optimizations that are currently performed. Having the contract flag set on FP operations instead of using the fmuladd intrinsic gives the backend freedom to mix and match operations from different source expressions. I’ve come across a case recently where this is beneficial.
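To make the distinction concrete, here is a minimal sketch in C. The function names are made up for illustration, and the comments describe my understanding of the two modes rather than guaranteed behavior: whether an FMA is actually emitted depends on the target and on the optimization level.

```c
/* Illustrative only: the function names are hypothetical. */

/* a * b + c is a single source expression. Under -ffp-contract=on the
   front end may emit an fmuladd intrinsic for it, independent of the
   optimization level. */
double within_expression(double a, double b, double c) {
    return a * b + c;
}

/* The multiply and the add come from different statements. =on does not
   license fusing them. Under =fast both operations carry the contract
   flag, so the backend may fuse them when it optimizes. */
double across_statements(double a, double b, double c) {
    double product = a * b;
    return product + c;
}
```

With small integer-valued inputs such as (2.0, 3.0, 4.0), both functions return 10.0 whether or not the operations are fused, because no rounding occurs. The modes only become observable when the intermediate product needs rounding.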
Perhaps the problem is with my expectation. The option isn’t very well documented in clang (or gcc).
“Form fused FP ops (e.g. FMAs): fast (everywhere) | on (according to FP_CONTRACT pragma) | off (never fuse). Default is ‘fast’ for CUDA/HIP and ‘on’ otherwise.”
Obviously, we don’t form fused FP ops “everywhere.” What this probably should say is that we form fused ops potentially anywhere, at the discretion of the compiler. A more verbose explanation would be good. With the right wording this would reasonably explain why such ops aren’t fused at -O0.
Having given it more thought, I’d be OK with option 0 -- leave things as they are (or recently have been/soon will be) with =on as the default and the front end forming fmuladd or setting the contract flag without regard to the optimization level.
BTW, I also noticed some time ago that the front end will form fmuladd with =fast if the code in question is subject to a pragma STDC FP_CONTRACT ON. That seemed wrong to me at the time but now seems reasonable and consistent.
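A small sketch of the pragma case described above. The function name is hypothetical, and the comment reflects the behavior I observed rather than a documented guarantee.

```c
/* Illustrative only: the function name is hypothetical. */
double contracted_by_pragma(double a, double b, double c) {
    /* The pragma applies to the rest of this compound statement. When
       compiling with -ffp-contract=fast, the front end appeared to form
       fmuladd for the expression below, as it would under =on. */
    #pragma STDC FP_CONTRACT ON
    return a * b + c;
}
```

Compilers that do not implement this pragma may ignore it, possibly with a warning.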
From: scanon at apple.com <scanon at apple.com>
Sent: Friday, February 14, 2020 6:00 AM
To: Kaylor, Andrew <andrew.kaylor at intel.com>
Cc: cfe-dev at lists.llvm.org
Subject: Re: [cfe-dev] fp-contract at -O0
On Feb 13, 2020, at 9:17 PM, Kaylor, Andrew <andrew.kaylor at intel.com> wrote:
> I don’t see a problem with the godbolt link; is your concern simply that you think -ffp-contract=fast should fuse a super-set of what is done by =on, or is there something else?
Yes, that is my concern. I think =fast should always produce at least as many FMAs as =on.
I can imagine a few ways to handle this, if we really want to do something about it:
1 A diagnostic when combining -ffp-contract=fast with -O0 that you aren’t going to get FMA formation.
2 Make -ffp-contract=fast decay to =on under -O0.
3 Make -ffp-contract=fast always imply =on as well (so the frontend would form fmuladd nodes in both modes, but =fast would additionally license forming fma out of mul+add pairs).
Option 1 is easy but silly. Option 2 is only slightly more invasive and definitely fixes the “problem”, but is maybe a little too clever. Option 3 may be the best, but I haven’t thought through all the details, and it would require some experimentation.