[llvm] [PowerPC] Implement a more efficient memcmp in cases where the length is known. (PR #158657)

Mon Oct 6 11:23:20 PDT 2025

================
@@ -15556,6 +15556,89 @@ SDValue PPCTargetLowering::combineSetCC(SDNode *N,
       SDValue Add = DAG.getNode(ISD::ADD, DL, OpVT, LHS, RHS.getOperand(1));
       return DAG.getSetCC(DL, VT, Add, DAG.getConstant(0, DL, OpVT), CC);
     }
+
+    // Optimization: Fold i128 equality/inequality compares of two loads into a
+    // vectorized compare using vcmpequb.p when VSX is available.
+    //
+    // Rationale:
+    //   A scalar i128 SETCC (eq/ne) normally lowers to multiple scalar ops.
+    //   On VSX-capable subtargets, we can instead reinterpret the i128 loads
+    //   as v16i8 vectors and use the Altivec/VSX vcmpequb.p instruction to
+    //   perform a full 128-bit equality check in a single vector compare.
+
+    if (Subtarget.hasVSX()) {
----------------
RolandF77 wrote:

vcmpequb is an Altivec vector instruction, not VSX.
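
For illustration only, here is a portable scalar model of the 128-bit equality check that the quoted comment describes. It is not the PR's code and does not use any LLVM API. The names `V16i8`, `load16`, `cmpEqBytes`, `allLanesSet` and `equal128` are hypothetical. `cmpEqBytes` models the per-byte result of a byte-wise vector compare, and `allLanesSet` models the "all elements equal" predicate that the record form of the compare reports in CR6.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 16-byte vector register (v16i8).
using V16i8 = std::array<std::uint8_t, 16>;

// Load 16 bytes with no alignment assumption.
inline V16i8 load16(const void *p) {
  V16i8 v;
  std::memcpy(v.data(), p, v.size());
  return v;
}

// Per-lane byte equality: 0xFF where the bytes match, 0x00 where they differ.
inline V16i8 cmpEqBytes(const V16i8 &a, const V16i8 &b) {
  V16i8 mask{};
  for (std::size_t i = 0; i < mask.size(); ++i)
    mask[i] = (a[i] == b[i]) ? 0xFF : 0x00;
  return mask;
}

// "All lanes true" predicate over the compare mask.
inline bool allLanesSet(const V16i8 &mask) {
  for (std::uint8_t lane : mask)
    if (lane != 0xFF)
      return false;
  return true;
}

// i128 equality of two memory operands, expressed as one 16-lane compare
// followed by a single predicate test.
inline bool equal128(const void *lhs, const void *rhs) {
  return allLanesSet(cmpEqBytes(load16(lhs), load16(rhs)));
}
```

In this model the `ne` case is simply `!equal128(lhs, rhs)`: one compare plus one predicate test covers both condition codes.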

https://github.com/llvm/llvm-project/pull/158657