[PATCH] D36498: [InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt trunc X, 0) and (icmp sgt trunc X, -1) as equivalent to an and with the sign bit of the truncated type
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 15 07:27:01 PDT 2017
spatel added a comment.
In https://reviews.llvm.org/D36498#841022, @craig.topper wrote:
> This patch is really just making InstCombine self consistent. We currently optimize this case differently depending on whether i8 is legal in datalayout.
>
> define i32 @test71(i32 %x) {
> ; CHECK-LABEL: @test71(
> ; CHECK-NEXT: [[TMP1:%.*]] = lshr i32 [[X:%.*]], 6
> ; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 2
> ; CHECK-NEXT: [[TMP3:%.*]] = or i32 [[TMP2]], 40
> ; CHECK-NEXT: ret i32 [[TMP3]]
> ;
>   %1 = and i32 %x, 128
>   %2 = icmp eq i32 %1, 0
>   %3 = select i1 %2, i32 40, i32 42
>   ret i32 %3
> }
>
> If we want to remove foldSelectICmpAnd that's a different question.
Ah, I didn't recognize what was going on. This is a sibling to https://reviews.llvm.org/D22537. Can you include a test that has a trunc in it from the start, so we are not dependent on the other combine? A code comment to show the complete transform would also make it a bit clearer for me.
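For reference, the equivalence named in the patch title can be sanity-checked outside of LLVM. This is only a Python sketch and not code from the patch; the helper names (trunc_signed, slt_zero_after_trunc, sgt_minus_one_after_trunc, sign_bit_set) are made up for illustration. It models a trunc to an N-bit signed value and compares the signed test against a mask with the sign bit of the truncated type:

```python
def trunc_signed(x, bits):
    """Model 'trunc X to iN': keep the low `bits` bits, read them as signed."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def slt_zero_after_trunc(x, bits):
    """Model 'icmp slt (trunc X), 0'."""
    return trunc_signed(x, bits) < 0


def sgt_minus_one_after_trunc(x, bits):
    """Model 'icmp sgt (trunc X), -1'."""
    return trunc_signed(x, bits) > -1


def sign_bit_set(x, bits):
    """Model 'icmp ne (and X, signbit-of-iN), 0' on the wide value."""
    return (x & (1 << (bits - 1))) != 0


# Exhaustive check over all 16-bit inputs with an i8 truncated type.
for x in range(1 << 16):
    assert slt_zero_after_trunc(x, 8) == sign_bit_set(x, 8)
    assert sgt_minus_one_after_trunc(x, 8) == (not sign_bit_set(x, 8))
```

With bits = 8 the mask is 128, which is the constant used in test71 above.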
FWIW, the x86 backend converts test71 to math for all 3 forms, but this doesn't happen for AArch64 or PPC, where it's also likely a win. Even on x86, the asm is different in each of the 3 cases:
With mask+cmp+sel:
%1 = and i32 %x, 128
%2 = icmp eq i32 %1, 0
%3 = select i1 %2, i32 40, i32 42
ret i32 %3
-->
andl $128, %edi
shrl $6, %edi
leal 40(%rdi), %eax
With shift+mask+or:
%1 = lshr i32 %x, 6
%2 = and i32 %1, 2
%3 = or i32 %2, 40
-->
shrl $6, %edi
andl $2, %edi
leal 40(%rdi), %eax
With trunc+cmp+sel:
%1 = trunc i32 %x to i8
%2 = icmp sgt i8 %1, -1
%3 = select i1 %2, i32 40, i32 42
-->
xorl %eax, %eax
testb %dil, %dil
sets %al
leal 40(%rax,%rax), %eax
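
To double-check that the three IR forms above compute the same value, here is a small Python model of each. It is a sketch under the assumption that %x is an unsigned 32-bit pattern; the function names are hypothetical and nothing here comes from the patch or the backend:

```python
MASK32 = 0xFFFFFFFF  # model an i32 as an unsigned 32-bit pattern


def mask_cmp_sel(x):
    """and i32 %x, 128 ; icmp eq ..., 0 ; select 40, 42"""
    return 40 if (x & 128) == 0 else 42


def shift_mask_or(x):
    """lshr i32 %x, 6 ; and ..., 2 ; or ..., 40"""
    return (((x & MASK32) >> 6) & 2) | 40


def trunc_cmp_sel(x):
    """trunc i32 %x to i8 ; icmp sgt ..., -1 ; select 40, 42"""
    low = x & 0xFF
    signed = low - 256 if low >= 128 else low
    return 40 if signed > -1 else 42


# Only bit 7 of the input matters, so the low 16 bits are enough to
# exercise every distinct behaviour.
for x in range(1 << 16):
    assert mask_cmp_sel(x) == shift_mask_or(x) == trunc_cmp_sel(x)
```

The or in the second form acts as an add because 40 has bit 1 clear, which is presumably why all three x86 sequences can finish with an leal.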
https://reviews.llvm.org/D36498