[PATCH] D86078: [AArch64] Improved lowering for saturating float to int.

Fri Sep 11 02:19:40 PDT 2020

ebevhan added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:3050
+        DAG.getNode(IsSigned ? ISD::FP_TO_SINT : ISD::FP_TO_UINT, dl, DstVT,
+                    Clamped);
+    return FpToInt;
----------------
efriedma wrote:
> ebevhan wrote:
> > Do you think it's still wrong to use FP_TO_INT in these lowerings?
> My concern is that you're making assumptions that doesn't match the specified semantics of FP_TO_SINT.  This is confusing, and some code could see this operation and perform some illegal transform on it.  (This sort of usage has caused miscompiles in the past.)
> 
> int_aarch64_neon_fcvtzs has the right semantics, or you could introduce a new target-specific node, or you could generate another FP_TO_SINT_SAT.  I guess I'd slightly prefer a new node.
Hm, I agree with what you're saying, so I opted for emitting FP_TO_INT_SAT nodes with DstWidth == SatWidth. 

However, while implementing I'm finding that I need to reimplement all of the custom vector lowering for FP_TO_INT for these as well, which seems a bit superfluous.

Would it be wrong to construct an FP_TO_INT node and then just immediately manually call LowerFP_TO_INT?

================
Comment at: llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll:175
+; CHECK-NEXT:    fcvtzs v0.2d, v0.2d
+; CHECK-NEXT:    xtn v0.2s, v0.2d
 ; CHECK-NEXT:    ret
----------------
efriedma wrote:
> ebevhan wrote:
> > efriedma wrote:
> > > This should probably just be "fcvtzs v0.2d, v0.2d; sqxtn v0.2s, v0.2d".
> > Is sqxtn ever selected without intrinsics? I would assume that a simple minmax and trunc pattern would catch it, but that doesn't happen.
> Not sure off the top of my head, but I wouldn't be surprised if the answer is no.  Nobody has spent much time on saturation patterns.
> 
> The current patch generates xtn?  That isn't saturating, I think?
xtn doesn't saturate, no... It just seems to truncate.

But... Doesn't this actually mean that it's wrong? This is supposed to saturate to i32, but it's actually converting to i64 and then truncating. That can't be right.

You weren't wrong about assumptions on node behavior, lowering this to FP_TO_INT clearly didn't do the right thing. So this means that vector FP_TO_INT only saturates if the scalar types are the same bit width? I wonder if the same applies to f16.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86078/new/

https://reviews.llvm.org/D86078