[PATCH] D58881: [Transform] Improve fold of sadd.with.overflow

Dan Robertson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 3 16:49:53 PST 2019


dlrobertson marked 12 inline comments as done.
dlrobertson added inline comments.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:2044
+    // - Opposite signs with |C2|>|C1| can be folded to a single overflow call
+    // - Opposite signs with |C2|<=|C1| can be folded to a single nsw call
+    //
----------------
nagisa wrote:
> This last case can also be folded into an `overflow` call. While not optimal, it "feels" to me that we better do that, rather than skipping this case entirely, if just for now.
Per the comment below I'll update this in a follow up PR if that is okay.


================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:2046
+    //
+    // FIXME: Add the last case.
+    Value *X = nullptr;
----------------
nikic wrote:
> dlrobertson wrote:
> > I was not entirely sure how to solve the last case. `sadd.with.overflow` returns an aggregate, while `add nsw` does not. I was not sure how to construct the aggregate, so I did not address the last case in my first attempt.
> You can construct the aggregate with the `CreateOverflowTuple` helper, with overflow set to a false constant.
> 
> It's probably better to leave it out in this patch though, as this is quite a different optimization. Generally I'd drop all the sign/range checks here, as the fold is valid in either case -- there's just an even better one possible that's not implemented yet.
Ah! That makes perfect sense. Thanks!

> It's probably better to leave it out in this patch though, as this is quite a different optimization. 

Will do.


================
Comment at: llvm/test/Transforms/InstCombine/call-add-with-overflow.ll:1
+; RUN: opt < %s -instcombine -S | FileCheck %s
+
----------------
nikic wrote:
> To make future diffs easier, it's preferred to automatically generate checks using the `utils/update_test_checks.py` script.
Thanks!


================
Comment at: llvm/test/Transforms/InstCombine/call-add-with-overflow.ll:22
+  ret { i32, i1 } %3
+}
----------------
nikic wrote:
> As we recently added vector support for these intrinsics, we should also have some vector tests. Good cases to cover are:
> 
>  * Splat constant `<2 x i32> <i32 42, i32 42>` (will work out of the box once you switch to m_APInt)
>  * Splat constant with undef `<2 x i32> <i32 42, i32 undef>` (will also work out of the box)
>  * Non-splat constant `<2 x i32> <i32 42, i32 24>` (will not work, and that's okay)
> 
> Additionally, the overflowing case mentioned above should be checked, as a negative example. The inverted case here with `-7` and `13` rather than the other way around, would also be good.
> Splat constant with undef <2 x i32> <i32 42, i32 undef> (will also work out of the box)

The test I wrote for this did not fold, so I may have done something wrong.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58881/new/

https://reviews.llvm.org/D58881





More information about the llvm-commits mailing list