[PATCH] D58881: [Transform] Improve fold of sadd.with.overflow
Dan Robertson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Mar 3 16:49:53 PST 2019
dlrobertson marked 12 inline comments as done.
dlrobertson added inline comments.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:2044
+ // - Opposite signs with |C2|>|C1| can be folded to a single overflow call
+ // - Opposite signs with |C2|<=|C1| can be folded to a single nsw call
+ //
----------------
nagisa wrote:
> This last case can also be folded into an `overflow` call. While not optimal, it "feels" to me that we better do that, rather than skipping this case entirely, if just for now.
Per the comment below I'll update this in a follow up PR if that is okay.
================
Comment at: llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp:2046
+ //
+ // FIXME: Add the last case.
+ Value *X = nullptr;
----------------
nikic wrote:
> dlrobertson wrote:
> > I was not entirely sure how to solve the last case. `sadd.with.overflow` returns an aggregate, while `add nsw` does not. I was not sure how to construct the aggregate, so I did not address the last case in my first attempt.
> You can construct the aggregate with the `CreateOverflowTuple` helper, with overflow set to a false constant.
>
> It's probably better to leave it out in this patch though, as this is quite a different optimization. Generally I'd drop all the sign/range checks here, as the fold is valid in either case -- there's just an even better one possible that's not implemented yet.
Ah! That makes perfect sense. Thanks!
> It's probably better to leave it out in this patch though, as this is quite a different optimization.
Will do.
================
Comment at: llvm/test/Transforms/InstCombine/call-add-with-overflow.ll:1
+; RUN: opt < %s -instcombine -S | FileCheck %s
+
----------------
nikic wrote:
> To make future diffs easier, it's preferred to automatically generate checks using the `utils/update_test_checks.py` script.
Thanks!
================
Comment at: llvm/test/Transforms/InstCombine/call-add-with-overflow.ll:22
+ ret { i32, i1 } %3
+}
----------------
nikic wrote:
> As we recently added vector support for these intrinsics, we should also have some vector tests. Good cases to cover are:
>
> * Splat constant `<2 x i32> <i32 42, i32 42>` (will work out of the box once you switch to m_APInt)
> * Splat constant with undef `<2 x i32> <i32 42, i32 undef>` (will also work out of the box)
> * Non-splat constant `<2 x i32> <i32 42, i32 24>` (will not work, and that's okay)
>
> Additionally, the overflowing case mentioned above should be checked, as a negative example. The inverted case here with `-7` and `13` rather than the other way around, would also be good.
> Splat constant with undef <2 x i32> <i32 42, i32 undef> (will also work out of the box)
The test I wrote for this did not fold, so I may have done something wrong.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D58881/new/
https://reviews.llvm.org/D58881
More information about the llvm-commits
mailing list