[PATCH] D86078: [AArch64] Improved lowering for saturating float to int.

Fri Sep 11 15:28:05 PDT 2020

efriedma added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:3050
+        DAG.getNode(IsSigned ? ISD::FP_TO_SINT : ISD::FP_TO_UINT, dl, DstVT,
+                    Clamped);
+    return FpToInt;
----------------
ebevhan wrote:
> efriedma wrote:
> > ebevhan wrote:
> > > Do you think it's still wrong to use FP_TO_INT in these lowerings?
> > My concern is that you're making assumptions that doesn't match the specified semantics of FP_TO_SINT.  This is confusing, and some code could see this operation and perform some illegal transform on it.  (This sort of usage has caused miscompiles in the past.)
> > 
> > int_aarch64_neon_fcvtzs has the right semantics, or you could introduce a new target-specific node, or you could generate another FP_TO_SINT_SAT.  I guess I'd slightly prefer a new node.
> Hm, I agree with what you're saying, so I opted for emitting FP_TO_INT_SAT nodes with DstWidth == SatWidth. 
> 
> However, while implementing I'm finding that I need to reimplement all of the custom vector lowering for FP_TO_INT for these as well, which seems a bit superfluous.
> 
> Would it be wrong to construct an FP_TO_INT node and then just immediately manually call LowerFP_TO_INT?
> Would it be wrong to construct an FP_TO_INT node and then just immediately manually call LowerFP_TO_INT?

I think I'd prefer to just make LowerVectorFP_TO_INT work on both FP_TO_INT and FP_TO_INT_SAT nodes.

================
Comment at: llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll:175
+; CHECK-NEXT:    fcvtzs v0.2d, v0.2d
+; CHECK-NEXT:    xtn v0.2s, v0.2d
 ; CHECK-NEXT:    ret
----------------
ebevhan wrote:
> efriedma wrote:
> > ebevhan wrote:
> > > efriedma wrote:
> > > > This should probably just be "fcvtzs v0.2d, v0.2d; sqxtn v0.2s, v0.2d".
> > > Is sqxtn ever selected without intrinsics? I would assume that a simple minmax and trunc pattern would catch it, but that doesn't happen.
> > Not sure off the top of my head, but I wouldn't be surprised if the answer is no.  Nobody has spent much time on saturation patterns.
> > 
> > The current patch generates xtn?  That isn't saturating, I think?
> xtn doesn't saturate, no... It just seems to truncate.
> 
> But... Doesn't this actually mean that it's wrong? This is supposed to saturate to i32, but it's actually converting to i64 and then truncating. That can't be right.
> 
> You weren't wrong about assumptions on node behavior, lowering this to FP_TO_INT clearly didn't do the right thing. So this means that vector FP_TO_INT only saturates if the scalar types are the same bit width? I wonder if the same applies to f16.
ISD::FP_TO_SINT always has the same semantics as the IR fptosi, regardless of what the actual underlying instructions do.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86078/new/

https://reviews.llvm.org/D86078