[PATCH] D53157: Teach the IRBuilder about constrained fadd and friends

Mon Nov 19 13:14:09 PST 2018

cameron.mcinally added a comment.

In https://reviews.llvm.org/D53157#1302724, @uweigand wrote:

> A couple of comments on the previous discussion:
>
> 1. Instead of defining a new command line option, I'd prefer to use the existing options -frounding-math and -ftrapping-math to set the default behavior of math operations w.r.t. rounding modes and exception status.  (For compatibility with GCC if nothing else.)

I like this proposal.

> 1. I also read the C standard to imply that it is a requirement of **user code** to reset the status flag to default before switching back to FENV_ACCESS OFF.  The fundamental characterization of the pragma says "The FENV_ACCESS pragma provides a means **to inform the implementation** when a program might access the floating-point environment to test floating-point status flags or run under non-default floating-point control modes."  There is no mention anywhere that using the pragma, on its own, will ever **change** those control modes.   The last sentence about "... the floating-point control modes have their default setting", while indeed a bit ambiguous, is still consistent with an interpretation that it is the responsibility of user code to ensure that state, there is no explicit statement that the implementation will do so.

That's a fair interpretation. Andy mentioned:

In https://reviews.llvm.org/D53157#1303193, @andrew.w.kaylor wrote:

> I definitely agree with this interpretation of the standard. My understanding is that behavior is undefined if the user has not left the FP environment in the default state when transitioning to an FENV_ACCESS OFF region.

If we all agree on that, then we simply have to treat the functions that modify the FPEnv, e.g. fesetexcept(...), as barriers. That way it does not matter whether an FENV_ACCESS OFF function is translated with constrained intrinsics or not, since nothing can be scheduled across these barriers.

> 1. I agree that we need to be careful about intermixing "normal" floating-point operations with strict ones.  However, I'm still not convinced that the pragma itself must be the scheduling barrier.  It seems to me that the compiler already knows where FP control flags are ever modified directly (this can only happen with intrinsics or the like), so the main issue is whether function calls need to be considered.  This is where the pragma comes in: in my mind, the primary difference between FENV_ACCESS ON and FENV_ACCESS OFF regions is that where the pragma is ON, function calls need to be considered (unless otherwise known for sure) to access FP control flags, while where the pragma is OFF, function calls can be considered to never touch FP control flags.  So the real scheduling barrier would be any **function call within a FENV_ACCESS ON region**.  Those would have to be marked by the front-end in the IR, presumably using a function attribute.  The common LLVM optimizers would then need to respect that scheduling barrier (here is where we likely still have an open issue, there doesn't appear to be any way to express that at the IR level for regular floating-point operations ...), and likewise the back-ends (but that looks straightforward: a back-end typically will model FP status as residing in a register or in a pseudo-memory slot, and those can simply be considered used/clobbered by function calls marked as within FENV_ACCESS ON regions).

I'm not sure if I fully understand this, but it seems to be an acceptable solution to the problem.

As mentioned above, couldn't we make the helper functions that read/write the FPEnv the barriers? That seems like a simpler solution.

https://reviews.llvm.org/D53157