[PATCH] D137721: [AArch64] Optimize more memcmp when the result is tested for [in]equality with 0
Allen zhong via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 9 19:18:57 PST 2022
Allen marked 5 inline comments as done.
Allen added inline comments.
================
Comment at: llvm/test/CodeGen/AArch64/bcmp.ll:43
+; or (and (xor a, b), C1), (and (xor c, d), C2)
define i1 @bcmp3(ptr %a, ptr %b) {
; CHECK-LABEL: bcmp3:
----------------
dmgreen wrote:
> I think this might be fixed if performMemcmpCombine was called from lowerSETCC as well as from the combine. It looks like the issue is that we do not call performMemcmpCombine after the operands further up the tree have been simplified.
Thanks, applied with your comment.
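As a rough illustration of the shape this combine aims for (a sketch, not the patch's actual code: `load16` and `bcmp3_is_equal` are hypothetical names), a 3-byte `bcmp(a, b, 3) == 0` test can be lowered to two loads per side, XORed and ORed together, with a single flag-setting compare at the end -- the `or (and (xor a, b), C1), (and (xor c, d), C2)` pattern quoted in the test comment:

```c
#include <stdint.h>
#include <string.h>

/* Unaligned 2-byte load via memcpy; compilers fold this to a plain
 * 16-bit load (ldrh on AArch64). */
static uint16_t load16(const unsigned char *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Branchless equality test for 3 bytes: XOR the halves, OR the
 * results, and compare the whole thing with 0 once. */
int bcmp3_is_equal(const void *pa, const void *pb) {
    const unsigned char *a = pa, *b = pb;
    uint32_t lo = (uint32_t)(load16(a) ^ load16(b)); /* bytes 0..1 */
    uint32_t hi = (uint32_t)(a[2] ^ b[2]);           /* byte 2 */
    return (lo | hi) == 0;
}
```

Because only the [in]equality of the result with 0 is tested, no byte-order-sensitive subtraction is needed, which is what makes this combine legal.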
================
Comment at: llvm/test/CodeGen/AArch64/bcmp.ll:150
; CHECK-NEXT: eor w9, w9, w10
; CHECK-NEXT: and x9, x9, #0xff
; CHECK-NEXT: eor x8, x8, x11
----------------
dmgreen wrote:
> This And shouldn't be here - it should be pushed higher into the ldrb's. There is some code that already tried to push ands up into loads, but I've not looked into whether it could be extended to handle this too.
OK, I'll try to figure that out in a follow-up patch; I'll add a TODO for this case.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D137721/new/
https://reviews.llvm.org/D137721