[llvm] [PowerPC] fold i128 equality/inequality compares of two loads into a vectorized compare using vcmpequb.p when Altivec is available (PR #158657)
zhijian lin via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 14 09:06:05 PDT 2025
================
@@ -35,18 +35,13 @@ define signext i32 @zeroEqualityTest02(ptr %x, ptr %y) {
define signext i32 @zeroEqualityTest01(ptr %x, ptr %y) {
; CHECK-LABEL: zeroEqualityTest01:
; CHECK: # %bb.0:
-; CHECK-NEXT: ld 5, 0(3)
-; CHECK-NEXT: ld 6, 0(4)
-; CHECK-NEXT: cmpld 5, 6
-; CHECK-NEXT: bne 0, .LBB1_2
-; CHECK-NEXT: # %bb.1: # %loadbb1
-; CHECK-NEXT: ld 5, 8(3)
-; CHECK-NEXT: ld 4, 8(4)
-; CHECK-NEXT: li 3, 0
-; CHECK-NEXT: cmpld 5, 4
-; CHECK-NEXT: beqlr 0
-; CHECK-NEXT: .LBB1_2: # %res_block
-; CHECK-NEXT: li 3, 1
+; CHECK-NEXT: lxvd2x 34, 0, 4
+; CHECK-NEXT: lxvd2x 35, 0, 3
+; CHECK-NEXT: vcmpequb. 2, 3, 2
+; CHECK-NEXT: mfocrf 3, 2
+; CHECK-NEXT: rlwinm 3, 3, 25, 31, 31
+; CHECK-NEXT: cntlzw 3, 3
+; CHECK-NEXT: srwi 3, 3, 5
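
The new sequence reads more easily with a scalar model beside it. The C sketch below is illustrative only and is not part of the patch: the function names are hypothetical, and it assumes that `vcmpequb.` sets the first bit of CR6 when all sixteen bytes match and the third bit when none match. `mfocrf` leaves the other CR fields undefined; the model simply zeroes them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the CR image after `vcmpequb.` and `mfocrf 3, 2`.
 * CR6 sits in IBM bits 24..27 of the word, i.e. mask 0xF0:
 * 0x80 = all bytes equal, 0x20 = no bytes equal. */
static uint32_t cr_after_vcmpequb(const unsigned char *a, const unsigned char *b)
{
    int all_eq = 1, none_eq = 1;
    for (int i = 0; i < 16; ++i) {
        if (a[i] == b[i])
            none_eq = 0;
        else
            all_eq = 0;
    }
    return (all_eq ? 0x80u : 0u) | (none_eq ? 0x20u : 0u);
}

/* 32-bit rotate left; n must be in 1..31 to avoid an undefined shift. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32 - n));
}

/* Count leading zeros of a 32-bit word; returns 32 for zero, like cntlzw. */
static uint32_t cntlzw(uint32_t x)
{
    uint32_t n = 0;
    while (n < 32 && !(x & (0x80000000u >> n)))
        ++n;
    return n;
}

/* Returns 0 when the two 16-byte buffers are equal and 1 otherwise. */
int zero_equality_model(const void *x, const void *y)
{
    uint32_t r3 = cr_after_vcmpequb(x, y); /* vcmpequb. ; mfocrf 3, 2   */
    r3 = rotl32(r3, 25) & 1u;              /* rlwinm 3, 3, 25, 31, 31   */
    r3 = cntlzw(r3);                       /* cntlzw 3, 3               */
    return (int)(r3 >> 5);                 /* srwi 3, 3, 5              */
}
```

The `cntlzw`/`srwi` pair is a branch-free logical NOT of the extracted bit: a word equal to 1 has 31 leading zeros (31 >> 5 is 0), while a zero word has 32 (32 >> 5 is 1). That reproduces the result of the removed scalar sequence, which returned 0 on equality and 1 otherwise.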
----------------
diggerlin wrote:
This patch simply makes the following code
```
#include <string.h> /* memcmp */

int cmp16(const void *a, const void *b)
{
    return memcmp(a, b, 16) == 0;
}
```
equivalent to
```
#include <altivec.h>

bool cmpeq16_2(const void *a, const void *b)
{
    const vector unsigned char va = vec_xl(0, (unsigned char *)a);
    const vector unsigned char vb = vec_xl(0, (unsigned char *)b);
    return vec_all_eq(va, vb);
}
```
That is, the patch transforms the DAG
```
t0: ch,glue = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %0
t3: i128,ch = load<(load (s128) from %ir.a, align 1)> t0, t2, undef:i64
t4: i64,ch = CopyFromReg t0, Register:i64 %1
t5: i128,ch = load<(load (s128) from %ir.b, align 1)> t0, t4, undef:i64
t6: i1 = setcc t3, t5, setne:ch
```
---->
```
t0: ch,glue = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %0
t3: v16i8,ch = load<(load (s128) from %ir.a, align 1)> t0, t2, undef:i64
t4: i64,ch = CopyFromReg t0, Register:i64 %1
t5: v16i8,ch = load<(load (s128) from %ir.b, align 1)> t0, t4, undef:i64
t6: i32 = llvm.ppc.altivec.vcmpequb.p TargetConstant:i32<10505>, Constant:i32<2>, t3, t5
t7: i1 = setcc t6, Constant:i32<0>, seteq:ch
```
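
As a rough illustration of why the two DAGs agree, here is a portable C model. It is a sketch, not the patch's code: the function names are hypothetical, and it assumes that predicate selector `2` makes `vcmpequb.p` return 1 exactly when all sixteen byte lanes are equal, which is what the `vec_all_eq` form above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical scalar stand-in for vcmpequb.p with selector 2:
 * returns 1 if all sixteen bytes are equal, otherwise 0. */
static int vcmpequb_p_all_eq(const unsigned char *a, const unsigned char *b)
{
    for (int i = 0; i < 16; ++i) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}

/* Original node: setcc t3, t5, setne on two i128 loads. */
bool setne_i128(const void *a, const void *b)
{
    return memcmp(a, b, 16) != 0;
}

/* Rewritten nodes: setcc (vcmpequb.p 2, t3, t5), 0, seteq. */
bool setne_v16i8(const void *a, const void *b)
{
    return vcmpequb_p_all_eq(a, b) == 0;
}

/* Both forms must give the same answer for any pair of 16-byte buffers. */
void check_equivalence(const void *a, const void *b)
{
    assert(setne_i128(a, b) == setne_v16i8(a, b));
}
```

Comparing the predicate result against zero with `seteq` inverts "all lanes equal", which is why the rewritten DAG still computes the original `setne`.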
I think a follow-up patch could convert
```
t6: i32 = llvm.ppc.altivec.vcmpequb.p TargetConstant:i32<10505>, Constant:i32<2>, t3, t5
t7: i1 = setcc t6, Constant:i32<0>, seteq:ch
```
into the instructions you suggested.
https://github.com/llvm/llvm-project/pull/158657