[PATCH] D63782: [FPEnv] Add fptosi and fptoui constrained intrinsics

Thu Aug 1 11:37:10 PDT 2019

efriedma added inline comments.

================
Comment at: include/llvm/CodeGen/ISDOpcodes.h:307
+    /// unsigned integer. These have the same semantics as fptosi and fptoui 
+    /// in IR. If the FP value cannot fit in the integer type, the results are
+    /// undefined.
----------------
cameron.mcinally wrote:
> craig.topper wrote:
> > pengfei wrote:
> > > kpn wrote:
> > > > craig.topper wrote:
> > > > > Is this what we want from a strict implementation. "icc -fp-model=strict" goes out of its way to generate an invalid exception when the input type doesn't fit and the hardware can't generate an exception on its own. https://godbolt.org/z/ABU80i
> > > > No, I don't think this is what we want. I'll remove that sentence.
> > > I don't understand the behavior of icc here. SDM says this instruction can raise floating-point invalid exception when the result out of range. Why icc add the code to raise another divz exception?
> > The instructions raises an exception based on a 32-bit result size. But the C code has a 16-bit result size. We need to generate an exception if it doesn't fit in 16-bits to match the C code. But the available instructions can't do that.
> That seems like a hardware problem.
> 
> Does LLVM have a policy on fixing up hardware deficiencies? It's not obvious that we should care about the hardware doing the wrong thing.
> 
> @eli.friedman 
In general, IR instructions have set semantics, and we generate whatever native code is necessary to make the generated code match the specified semantics.  (This influences the way we specify instructions; for example, one of the reasons shl produces poison for large shift amounts is to allow it to be directly mapped to a hardware instruction on common targets.)

In this case, I think it's pretty clear; if we're going to have a "strict" fp-to-int conversion that promises to raise an overflow exception if and only if the conversion overflows, that should work as documented regardless of what the underlying hardware supports.  Every architecture has some fp-to-int conversions that need to be emulated, and the exact set can vary depending on the specific CPU target.  We don't want to produce unpredictable exceptions for emulated conversions.

If the performance penalty for this is too large, we could allow frontends to request a non-strict conversion that's allowed to generate arbitrary exceptions... but I'm not sure that would be useful in code that's cares about exceptions.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63782/new/

https://reviews.llvm.org/D63782