[PATCH] D136672: [ExpandMemCmp][AArch64] Add a new option PreferCmpToExpand in MemCmpExpansionOptions and enable on AArch64

chenglin.bi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 27 19:34:32 PDT 2022


bcl5980 added a comment.

In D136672#3890483 <https://reviews.llvm.org/D136672#3890483>, @Allen wrote:

>> I have rebased onto code that already includes D136244 <https://reviews.llvm.org/D136244>. These two tickets look similar, but it looks like D136244 <https://reviews.llvm.org/D136244> can't fix the original case, a 3-byte bcmp.
>> And this change may help not only AArch64 but also other potential platforms.
>
>
>
> - For the bcmp3 case, the combine pattern does not match because we restrict it to **ISD::XOR**, so perhaps **ISD::AND** could be handled too?
>
>   (gdb) p LHS.getOperand(0)->getOpcode() == ISD::XOR && LHS.getOperand(1)->getOpcode() == ISD::XOR
>   $16 = false
>   (gdb) p LHS.dump()
>   t20: i32 = or t46, t50
>   $17 = void
>   (gdb) p LHS.getOperand(0).dump()
>   t46: i32 = and t44, Constant:i32<65535>
>   $18 = void
>   (gdb) p LHS.getOperand(1).dump()
>   t50: i32 = and t48, Constant:i32<255>

Of course, we can do it on AArch64. I guess the pattern-matching code will become longer and more complicated, since you need to detect three more patterns:

  or (and (xor a, b), C1), (xor c, d)
  or (xor a, b), (and (xor c, d), C2)
  or (and (xor a, b), C1), (and (xor c, d), C2)

Also, you should only consider `eq 0` when an `and` is involved. The C1 and C2 values can also lead to different results.
I guess there will be a fair amount of work if you want to start on it.
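
To make the third shape concrete, here is a rough source-level sketch of a 3-byte equality check written as one 16-bit piece plus one 8-bit piece. It is only an illustration and is not taken from the patch or from ExpandMemCmp: the names `bcmp3_equal` and `check_bcmp3_examples` are hypothetical, and the masks 0xFFFF and 0xFF are assumed to play the roles of C1 and C2, matching the constants 65535 and 255 in the gdb dump above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true when the first 3 bytes of lhs and rhs
// are equal. The result has the shape
//   or (and (xor a, b), C1), (and (xor c, d), C2)  ==  0
// with C1 = 0xFFFF and C2 = 0xFF (assumed values for illustration).
static bool bcmp3_equal(const void *lhs, const void *rhs) {
  // Bytes 0..1 are read as one 16-bit value; memcpy avoids unaligned access.
  std::uint16_t l16, r16;
  std::memcpy(&l16, lhs, sizeof l16);
  std::memcpy(&r16, rhs, sizeof r16);

  // Byte 2 is read on its own.
  const auto *l = static_cast<const unsigned char *>(lhs);
  const auto *r = static_cast<const unsigned char *>(rhs);

  // Each xor is zero only if the corresponding piece matches.
  std::uint32_t low = (std::uint32_t(l16) ^ std::uint32_t(r16)) & 0xFFFFu;
  std::uint32_t high = (std::uint32_t(l[2]) ^ std::uint32_t(r[2])) & 0xFFu;

  // Only the comparison against zero is meaningful here.
  return (low | high) == 0;
}

// Small self-check exercising the helper above.
void check_bcmp3_examples() {
  assert(bcmp3_equal("abc", "abc"));
  assert(!bcmp3_equal("abc", "abd"));  // differs in the 8-bit piece
  assert(!bcmp3_equal("abc", "xbc"));  // differs in the 16-bit piece
}
```

Dropping either mask from this sketch gives the other two shapes listed above.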

For now, what I want to discuss is: Do we need this patch?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136672/new/

https://reviews.llvm.org/D136672


