[PATCH] D66092: [CodeGen] Generate constrained fp intrinsics depending on FPOptions
Serge Pavlov via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Aug 15 01:39:08 PDT 2019
sepavloff added a comment.
In D66092#1629460 <https://reviews.llvm.org/D66092#1629460>, @andrew.w.kaylor wrote:
> In D66092#1627339 <https://reviews.llvm.org/D66092#1627339>, @sepavloff wrote:
> > In D66092#1625380 <https://reviews.llvm.org/D66092#1625380>, @kpn wrote:
> > > Also, if any constrained intrinsics are used in a function then the entire function needs to be constrained. Is this handled anywhere?
> > If we decided to make the entire function constrained, it should be done somewhere in IR transformations, because inlining may mix function bodies with different fp options.
> Kevin is right. We have decided that if constrained intrinsics are used anywhere in a function they must be used throughout the function. Otherwise, there would be nothing to prevent the non-constrained FP operations from migrating across constrained operations and the handling could get botched. The "relaxed" arguments ("round.tonearest" and "fpexcept.ignore") should be used where the default settings would apply. The front end should also be setting the "strictfp" attribute on calls within a constrained scope and, I think, functions that contain constrained intrinsics.
> We will need to teach the inliner to enforce this rule if it isn't already doing so, but if things aren't correct coming out of the front end an incorrect optimization could already happen before we get to the inliner. We always rely on the front end producing IR with fully correct semantics.
Replacing floating-point operations with constrained intrinsics seems more an optimization helper than a semantic requirement. IR in which constrained operations are mixed with unconstrained ones is still valid in the sense of the IR specification. Tools that use IR for something other than code generation may not need such a replacement. If the replacement is made by a separate pass, such a tool can turn it off; if it is part of clang codegen, there is no simple solution and the tool must be reworked.
Another issue is non-standard rounding. It can be represented only by constrained intrinsics. Rounding by itself does not require restrictions on code motion, so a mixture of constrained and unconstrained operations is OK. Replacing all operations with constrained intrinsics would give poorly optimized code, because the compiler does not optimize them. It would be a bad thing if a user added the pragma to execute a single statement with a specific rounding mode and lost optimization for the rest of the function.
Using a dedicated pass to shape fp operations seems a flexible solution. It allows implementing things like `#pragma STDC FENV_ROUND` without teaching all passes to work with constrained intrinsics.