[PATCH] D107692: [DAGCombine] Prevent the transform of combine for multi-use operand
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 9 06:50:00 PDT 2021
spatel added reviewers: dmgreen, lebedev.ri, craig.topper, spatel, RKSimon.
spatel added a comment.
The transform that we want to make can be shown with this example (and so this test should be added somewhere as a preliminary commit):
define i32 @src(i32 %x, i32 %y) {
%add = add i32 %x, 65535 ; --> turn this into -1 because that can be encoded as an immediate operand
%srl = lshr i32 %y, 16
%r = and i32 %srl, %add
ret i32 %r
}
https://alive2.llvm.org/ce/z/Ehadpq
For AArch and PowerPC and possibly all targets (x86 would benefit too), we want this to use a -1 (or `sub 1`), not an `add 65535`:
sub w8, w0, #1
and w0, w8, w1, lsr #16
vs.
mov w8, #65535
add w8, w0, w8
and w0, w8, w1, lsr #16
But if you try this test with main today, it doesn't work:
https://godbolt.org/z/fe7h1zYv6
...because this transform has another bug: it doesn't match the commuted variant.
Over in PR51321, I mentioned that I could delete the whole transform and not see any test changes. That's because SimplifyDemandedBits does the same transform. But it has a similar problem - it only catches the case where it processes the shift before it sees the add. So it misses the same commuted pattern. I'm not sure how we'd fix that in DemandedBits; it seems like a generic ordering problem for commutative ops.
So this transform has at least 3 problems (assuming we want to keep it instead of enhancing DemandedBits):
1. It miscompiles the multi-use case because it is coded weirdly and missed a constraint.
2. It misses the commuted case.
3. It misses what is likely profitable for all targets by overspecifying the legality constraints.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D107692/new/
https://reviews.llvm.org/D107692
More information about the llvm-commits
mailing list