[PATCH] D55897: Add constrained fptrunc and fpext intrinsics

Wed Jan 30 11:09:53 PST 2019

andrew.w.kaylor added inline comments.

================
Comment at: docs/LangRef.rst:14529
+      @llvm.experimental.constrained.fptrunc(<type> <value>,
+                                          metadata <exception behavior>)
+
----------------
cwabbott wrote:
> cwabbott wrote:
> > kpn wrote:
> > > cwabbott wrote:
> > > > Hi,
> > > > 
> > > > I've been working on implementing a Vulkan extension which allows the user to specifiy different rounding modes for the AMDGPU backend. I'm not sure how this works in C/C++, but we're required to support floating-point truncation with non-standard rounding modes. Is there a reason the rounding mode isn't an argument here?
> > > Going back and rereading D43515, I don't see an explicit reason given back then. And I can't find anything in the C99 or IEEE 754 standards, or in the LLVM documentation, that would mandate any particular rounding mode. So I'm open to adding a rounding mode argument to the constrained fptrunc.
> > > 
> > > Andrew? What do you think of making constrained fptrunc go back to taking a rounding mode argument? 
> > > 
> > > Having said that, the constrained FP intrinsics are to avoid optimizations that change program behavior taking traps into account. Is this the behavior you need for Vulcan?
> > Vulkan doesn't support trapping floating-point exceptions, so we don't have to worry about that. However, my understanding is that we still need to communicate the rounding mode to LLVM, to prevent it from constant folding floating-point operations with the wrong rounding mode, so we still need to use the intrinsics.
> > 
> > There's also the complication that we may need to emit some "internal" floating-point operations which need to have some defined rounding mode different from what the user specified. The backend has total control over the control register that specifies the rounding mode, and in some cases (e.g. fp32 -> fp16 truncation) the rounding mode is actually specified statically by the instruction itself rather than the control register, so my thought was that we could make AMDGPU just always use the rounding specified by the argument to the constrained intrinsic, emitting changes to the control register if necessary. I'm not sure if the CodeGen infrastructure is set up to do that.
> Oh, and I forgot another thing: the extension also adds support for letting the user either flush or preserve denormalized values. However, this is per-source-module, and sometimes we need to stitch together multiple source modules which have different rounding needs, emitting an instruction in between them to change the denorm flushing and/or rounding mode. So it seems we really do need to use the constrained intrinsics, to prevent code motion of floating-point operations around that register setting.
We should definitely have a rounding mode argument for fptrunc. I think the reason we missed that the first time around is probably due to the unfortunate naming of this operation (i.e. it isn't actually truncating) and the confusion with ISD::ROUND.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55897/new/

https://reviews.llvm.org/D55897