[PATCH] D53157: Teach the IRBuilder about constrained fadd and friends

Mon Nov 19 04:08:04 PST 2018

uweigand added a comment.

A couple of comments on the previous discussion:

1. Instead of defining a new command line option, I'd prefer to use the existing options -frounding-math and -ftrapping-math to set the default behavior of math operations w.r.t. rounding modes and exception status.  (For compatibility with GCC if nothing else.)
2. I also read the C standard to imply that it is a requirement of **user code** to reset the status flag to default before switching back to FENV_ACCESS OFF.  The fundamental characterization of the pragma says "The FENV_ACCESS pragma provides a means **to inform the implementation** when a program might access the floating-point environment to test floating-point status flags or run under non-default floating-point control modes."  There is no mention anywhere that using the pragma, on its own, will ever **change** those control modes.  The last sentence about "... the floating-point control modes have their default setting", while indeed a bit ambiguous, is still consistent with an interpretation that it is the responsibility of user code to ensure that state; there is no explicit statement that the implementation will do so.
3. I agree that we need to be careful about intermixing "normal" floating-point operations with strict ones.  However, I'm still not convinced that the pragma itself must be the scheduling barrier.  It seems to me that the compiler already knows every place where FP control flags are modified directly (this can only happen with intrinsics or the like), so the main issue is whether function calls need to be considered.  This is where the pragma comes in: in my mind, the primary difference between FENV_ACCESS ON and FENV_ACCESS OFF regions is that where the pragma is ON, function calls need to be considered (unless otherwise known for sure) to access FP control flags, while where the pragma is OFF, function calls can be considered to never touch FP control flags.  So the real scheduling barrier would be any **function call within a FENV_ACCESS ON region**.  Those calls would have to be marked by the front-end in the IR, presumably using a function attribute.  The common LLVM optimizers would then need to respect that scheduling barrier (here is where we likely still have an open issue: there doesn't appear to be any way to express that at the IR level for regular floating-point operations ...), and likewise the back-ends (but that looks straightforward: a back-end typically will model FP status as residing in a register or in a pseudo-memory slot, and those can simply be considered used/clobbered by function calls marked as within FENV_ACCESS ON regions).

https://reviews.llvm.org/D53157