[PATCH] D83216: [Intrinsic] Add sshl.sat/ushl.sat, saturated shift intrinsics.

Tue Jul 7 09:27:31 PDT 2020

lebedev.ri added a comment.

Thanks. This looks good to me in principle.

Alive2 support for these intrinsics: https://github.com/AliveToolkit/alive2/pull/448

================
Comment at: llvm/include/llvm/CodeGen/ISDOpcodes.h:313-314

+  /// RESULT = [US]SHLSAT(LHS, RHS) - Perform saturation left shift on 2
+  /// integers with the same bit width (W). If the true value of LHS << RHS
+  /// exceeds the largest value that can be represented by W bits, the
----------------
I'm not sure what `left shift on 2 integers` means.
Perhaps this needs some rewording.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:810
+    Op1Promoted = SExtPromotedInteger(Op1);
+    Op2Promoted = SExtPromotedInteger(Op2);
+    ShiftOp = ISD::SRA;
----------------
`Op2Promoted = ZExtPromotedInteger(Op2);`

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:7360-7365
+  // For signed shifts, we can check for overflow by checking if we would have
+  // shifted out any bits that disagree with the sign bit. For unsigned shifts,
+  // we can just check if we would have shifted out any ones.
+  // TODO: On targets that don't support CTLZ, it may be more efficient to pull
+  // down the bits to be shifted out and compare those to the signmask/zero
+  // instead.
----------------
Have you checked if naive `x != ((x << y) u/s>> y)` results in worse lowering?

================
Comment at: llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp:7377-7378
+    SDValue LSignBits = DAG.getNode(ISD::CTLZ, dl, VT, XORLHS);
+    Threshold = DAG.getNode(ISD::SUB, dl, VT, LSignBits,
+                            DAG.getConstant(1, dl, VT));
+  } else {
----------------
Why not just change predicate to `ISD::SETUGE`?

================
Comment at: llvm/test/CodeGen/X86/sshl_sat.ll:9
+declare  i18 @llvm.sshl.sat.i18  (i18, i18)
+declare  i64 @llvm.sshl.sat.i64  (i64, i64)
+declare  <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32>, <4 x i32>)
----------------
Add `i32` test while at it?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83216/new/

https://reviews.llvm.org/D83216