[PATCH] D54749: Saturating float to int casts.

Thu Aug 27 02:32:52 PDT 2020

ebevhan added a comment.

In D54749#2239801 <https://reviews.llvm.org/D54749#2239801>, @rjmccall wrote:

> I can't competently review LLVM code-gen patches, but I can review intrinsic design.  If the intended use case here is fixed point, should there be a scale parameter rather than expecting the caller to scale the input?  Or is scaling the input basically the expected implementation on all hardware?  Scaling the input can be lossy if we overflow to infinity, so that would be making a tacit assumption here that if that happens, the original value must've been outside the representation range of the integer.  For `float` (and `bfloat`), the exponent goes up to 127, so you would need to be converting to a pretty large fixed-point format to have problems with lossy overflow.  For `half`, the exponent only goes up to 16, so converting to a 32-bit fixed-point format with a scale could overflow, and you probably won't be able to use these intrinsics.

This is a good point that I hadn't considered. I left a comment in D86632 <https://reviews.llvm.org/D86632>.

I was originally planning on adding such an intrinsic (with a scaling factor) but dropped the idea when I found this patch. You are right in that the small exponent of half precision is problematic, though. Not sure what to do about that.

In D54749#2239911 <https://reviews.llvm.org/D54749#2239911>, @efriedma wrote:

> Please take care of the clang-format lints.

I presume it's fine to only fix the ones that don't conform to some existing unconventional code layout?

> There are some missing optimizations here.  It's probably worth teaching LegalizeVectorOps to expand the conversion into vector ops.  And for conversions that correspond to a native instruction, there's probably some more efficient code sequence.  But those can be dealt with in followups.

For improved native support, I've pulled out D86079 <https://reviews.llvm.org/D86079> and D86078 <https://reviews.llvm.org/D86078> from the original patch.

By expand into vector ops, you mean to avoid breaking up the vector operation into scalar operations and then legalizing those, and instead expanding the vector operation directly to the corresponding operations? What would be the criteria for doing this rather than splitting it up first?

> Making the saturation type an argument seemed a little strange at first glance, but it seems to work out reasonably well.  Maybe could change the argument to be a type, like ISD::SIGN_EXTEND_INREG.

It was originally a type operand similar to SIGN_EXTEND_INREG, but the suggestion was to replace it with a constant operand instead to make things simpler. The only thing we care about is the saturation bit width.

It might have been fine to make it a scalar type operand even for vector types, though.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp:2683
+    if (TLI.isOperationLegalOrCustom(Opcode, NewOutTy))
+      break;
+  }
----------------
efriedma wrote:
> This can't be the only place that uses this loop; can it be refactored?
There are similar loops in some of the other PromoteLegal* functions. The loops in those functions could probably be refactored - they look compatible - but this loop isn't quite the same since the signed nodes can't be used to implement the unsigned ones.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:7987
+    SrcVT = Src.getValueType();
+  }
+
----------------
efriedma wrote:
> ebevhan wrote:
> > I also added this to solve issues in expanding f16 FP_TO_XINT nodes to libcalls. Libcalls with a source of f16 don't seem to exist, so we can't emit such nodes if we will end up making libcalls of them.
> > 
> > Maybe this should check operation and type legality, though.
> In theory, legalization should ensure we generate appropriate libcalls.
> 
> Maybe that doesn't actually happen at the moment, though.
There's some conditions in the libcall emission which do cause promotion of the source before emission, but it doesn't happen in every case (even though no libcalls from half->int seem to exist in the first place).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54749/new/

https://reviews.llvm.org/D54749