[PATCH] D136672: [ExpandMemCmp][AArch64] Add a new option PreferCmpToExpand in MemCmpExpansionOptions and enable on AArch64
chenglin.bi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 27 19:34:32 PDT 2022
bcl5980 added a comment.
In D136672#3890483 <https://reviews.llvm.org/D136672#3890483>, @Allen wrote:
>> I have rebased onto code that already includes D136244 <https://reviews.llvm.org/D136244>. The two tickets look similar, but it seems D136244 <https://reviews.llvm.org/D136244> can't fix the original case, the 3-byte bcmp.
>> This change might also help other platforms, not only AArch64.
>
> - For the bcmp3 case, the combine pattern does not match because we restrict it to **ISD::XOR**, so maybe **ISD::AND** could be handled too?
>
> (gdb) p LHS.getOperand(0)->getOpcode() == ISD::XOR && LHS.getOperand(1)->getOpcode() == ISD::XOR
> $16 = false
> (gdb) p LHS.dump()
> t20: i32 = or t46, t50
> $17 = void
> (gdb) p LHS.getOperand(0).dump()
> t46: i32 = and t44, Constant:i32<65535>
> $18 = void
> (gdb) p LHS.getOperand(1).dump()
> t50: i32 = and t48, Constant:i32<255>
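For reference, a rough scalar picture of the DAG dumped above. This is only a sketch under assumptions: it assumes t44 and t48 are the xor nodes of a 2-byte and a 1-byte load pair, and that the `and` with 65535 and 255 comes from zero-extending them to i32. `bcmp3_equal` is a hypothetical name, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical scalar model of the expanded 3-byte equality check.
// Returns true when the first three bytes of p and q are equal.
bool bcmp3_equal(const unsigned char *p, const unsigned char *q) {
  assert(p && q);
  std::uint16_t a16, b16;
  std::memcpy(&a16, p, sizeof a16); // 2-byte load from each buffer
  std::memcpy(&b16, q, sizeof b16);
  // t46: and (xor a, b), 65535 (assumed origin of the first operand)
  std::uint32_t t46 = (std::uint32_t(a16) ^ std::uint32_t(b16)) & 65535u;
  // t50: and (xor c, d), 255 (assumed origin of the second operand)
  std::uint32_t t50 = (std::uint32_t(p[2]) ^ std::uint32_t(q[2])) & 255u;
  // t20: or t46, t50; the buffers are equal iff no differing bit survives
  std::uint32_t t20 = t46 | t50;
  return t20 == 0;
}
```

Both operands of the `or` are `and` nodes here, which is why a check for ISD::XOR on the direct operands returns false.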
Of course, we can do it on AArch64. But I guess the pattern-matching code will become longer and more complicated, since you need to detect three more patterns:
or (and (xor a, b), C1), (xor c, d)
or (xor a, b), (and (xor c, d), C2)
or (and (xor a, b), C1), (and (xor c, d), C2)
Also, you should only consider the `eq 0` case when an `and` is involved, and the values of C1 and C2 can lead to different results as well.
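To give an idea of the extra matching, here is a minimal standalone sketch. It uses a toy `Node` type as a stand-in for SDValue; it is not the LLVM API, and `matchesOrOfXors` and `isXorOrMaskedXor` are hypothetical names. It accepts the plain xor/xor shape plus the three shapes listed above, and accepts the masked forms only for `eq 0`. It does not inspect the values of C1 and C2, which a real combine would have to do.

```cpp
#include <cassert>

// Toy expression node, a stand-in for SDValue (NOT the LLVM API).
enum class Op { Leaf, Const, Xor, And, Or };

struct Node {
  Op op;
  const Node *lhs;
  const Node *rhs;
  explicit Node(Op o, const Node *l = nullptr, const Node *r = nullptr)
      : op(o), lhs(l), rhs(r) {}
};

// Matches (xor a, b) or (and (xor a, b), C) with the constant on the right.
// Sets sawAnd when the masked form was matched.
static bool isXorOrMaskedXor(const Node *n, bool &sawAnd) {
  if (!n)
    return false;
  if (n->op == Op::Xor)
    return true;
  if (n->op == Op::And && n->lhs && n->rhs &&
      n->lhs->op == Op::Xor && n->rhs->op == Op::Const) {
    sawAnd = true;
    return true;
  }
  return false;
}

// Hypothetical predicate: is n an `or` whose operands fit one of the shapes?
// isEqZero says whether the surrounding compare is `eq 0`.
bool matchesOrOfXors(const Node *n, bool isEqZero) {
  if (!n || n->op != Op::Or)
    return false;
  bool sawAnd = false;
  if (!isXorOrMaskedXor(n->lhs, sawAnd) || !isXorOrMaskedXor(n->rhs, sawAnd))
    return false;
  // Assumption taken from the remark above: masked forms only for eq 0.
  return !sawAnd || isEqZero;
}
```

Even in this toy form the matcher needs a helper and an extra flag, and the real one would also have to reason about the mask constants.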
I guess there would be quite a bit of work if you want to start on this.
For now, what I want to discuss is: Do we need this patch?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136672/new/
https://reviews.llvm.org/D136672