[PATCH] D46179: [X86] Lowering adds/addus/subs/subus intrinsics to native IR (LLVM part)

Tue May 15 22:16:47 PDT 2018

sroland added a comment.

FWIW as said I think could live with autoupgrade of at least the unsigned sat ones. These have similar complexity to pabs (pabs is cmp/sel/neg whereas these are add(or sub)/cmp/sel, although the add one also requires a constant), so if there were no concerns about the possibility of optimizations breaking pabs recognition, there might not be any here neither, if it actually works (which doesn't always seem to be the case according to the test cases?). (Although personally I'd have liked to keep pabs intrinsics too...)
Though disappearing intrinsics causing a null call in the generated code rather than some error when compiling remains a concern - but certainly I know about this one now so easily debugged :-).
Other than that, there's a bug in the logic of the addus, otherwise looks reasonable to me, though I'm not an expert on llvm internals.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:32275-32277
+      // x < x+y ? x+y : ~0 --> addus x, y
+      // x <= x+y ? x+y : ~0 --> addus x, y
+      if ((CC == ISD::SETULE || CC == ISD::SETULT) &&
----------------
This isn't quite right.
For  x < x+y this is false for y == 0 but it wouldn't be an overflow.
So it must be x <= x+y ? x+y : ~0 (or some reversed form like x > x+y ? ~0 : x+y - I suppose it doesn't make a difference in the end)

================
Comment at: test/CodeGen/X86/avx2-intrinsics-fast-isel.ll:2585-2587
+; CHECK-NEXT:    vpmaxub %ymm1, %ymm0, %ymm0
+; CHECK-NEXT:    vpsubb %ymm1, %ymm0, %ymm0
 ; CHECK-NEXT:    ret{{[l|q]}}
----------------
Err, so it doesn't actually recognize the pattern?

Repository:
  rL LLVM

https://reviews.llvm.org/D46179