[llvm] [WASM] Add support for memcmp expansion (PR #148298)

Sun Jul 13 06:47:58 PDT 2025

================
@@ -141,6 +141,16 @@ InstructionCost WebAssemblyTTIImpl::getCastInstrCost(
   return BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I);
 }
 
+WebAssemblyTTIImpl::TTI::MemCmpExpansionOptions
+WebAssemblyTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // INFO: I'm not sure what determines this, setting 2 conservatively
+  Options.NumLoadsPerBlock = 2;
+  Options.LoadSizes.append({8, 4, 2, 1});
----------------
badumbatish wrote:

I can add them, but wasm won't use a 128-bit load for now, since type legalization doesn't understand that a load of i128 can be a load of i64x2, and so on. Should I leave this for another PR? In the dump below, type legalization splits each i128 load into two i64 loads:
```
Optimized lowered selection DAG: %bb.0 'memcmp_expand_32:'
SelectionDAG has 21 nodes:
  t0: ch,glue = EntryToken
  t2: i32 = WebAssemblyISD::ARGUMENT TargetConstant:i32<0>
  t4: i32 = WebAssemblyISD::ARGUMENT TargetConstant:i32<1>
            t7: i128,ch = load<(load (s128) from %ir.a, align 1)> t0, t2, undef:i32
            t8: i128,ch = load<(load (s128) from %ir.b, align 1)> t0, t4, undef:i32
          t9: i128 = xor t7, t8
              t11: i32 = add t2, Constant:i32<16>
            t13: i128,ch = load<(load (s128) from %ir.4, align 1)> t0, t11, undef:i32
              t12: i32 = add t4, Constant:i32<16>
            t14: i128,ch = load<(load (s128) from %ir.5, align 1)> t0, t12, undef:i32
          t15: i128 = xor t13, t14
        t16: i128 = or t9, t15
      t26: i1 = setcc t16, Constant:i128<0>, seteq:ch
    t23: i32 = any_extend t26
  t24: ch = WebAssemblyISD::RETURN t0, t23



Type-legalized selection DAG: %bb.0 'memcmp_expand_32:'
SelectionDAG has 33 nodes:
  t0: ch,glue = EntryToken
  t2: i32 = WebAssemblyISD::ARGUMENT TargetConstant:i32<0>
  t4: i32 = WebAssemblyISD::ARGUMENT TargetConstant:i32<1>
  t11: i32 = add t2, Constant:i32<16>
  t12: i32 = add t4, Constant:i32<16>
            t30: i64,ch = load<(load (s64) from %ir.a, align 1)> t0, t2, undef:i32
            t35: i64,ch = load<(load (s64) from %ir.b, align 1)> t0, t4, undef:i32
          t39: i64 = xor t30, t35
            t41: i64,ch = load<(load (s64) from %ir.4, align 1)> t0, t11, undef:i32
            t45: i64,ch = load<(load (s64) from %ir.5, align 1)> t0, t12, undef:i32
          t49: i64 = xor t41, t45
        t51: i64 = or t39, t49
              t32: i32 = add nuw t2, Constant:i32<8>
            t33: i64,ch = load<(load (s64) from %ir.a + 8, align 1)> t0, t32, undef:i32
              t36: i32 = add nuw t4, Constant:i32<8>
            t37: i64,ch = load<(load (s64) from %ir.b + 8, align 1)> t0, t36, undef:i32
          t40: i64 = xor t33, t37
              t42: i32 = add nuw t11, Constant:i32<8>
            t43: i64,ch = load<(load (s64) from %ir.4 + 8, align 1)> t0, t42, undef:i32
              t46: i32 = add nuw t12, Constant:i32<8>
            t47: i64,ch = load<(load (s64) from %ir.5 + 8, align 1)> t0, t46, undef:i32
          t50: i64 = xor t43, t47
        t52: i64 = or t40, t50
      t54: i64 = or t51, t52
    t53: i32 = setcc t54, Constant:i64<0>, seteq:ch
  t24: ch = WebAssemblyISD::RETURN t0, t53
```
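
For readers who find the DAG hard to follow, here is a rough C++ sketch of what the type-legalized form computes for a 32-byte equality compare. This is only an illustration, not code from the PR: `load64` and `memcmp_eq_32` are hypothetical names, and the comments map each value to the node it is assumed to correspond to in the dump above.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: unaligned 8-byte load (the DAG loads are "align 1").
static inline std::uint64_t load64(const unsigned char *p) {
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical name. Returns true when the two 32-byte buffers are equal,
// i.e. the value of `memcmp(a, b, 32) == 0`.
bool memcmp_eq_32(const void *a, const void *b) {
  const auto *pa = static_cast<const unsigned char *>(a);
  const auto *pb = static_cast<const unsigned char *>(b);

  // Each i128 xor becomes two i64 xors after type legalization.
  std::uint64_t x0 = load64(pa) ^ load64(pb);            // t39
  std::uint64_t x1 = load64(pa + 8) ^ load64(pb + 8);    // t40
  std::uint64_t x2 = load64(pa + 16) ^ load64(pb + 16);  // t49
  std::uint64_t x3 = load64(pa + 24) ^ load64(pb + 24);  // t50

  // Any differing bit survives the ORs, so one compare against zero suffices.
  std::uint64_t diff = (x0 | x2) | (x1 | x3);            // t51, t52, t54
  return diff == 0;                                      // t53 (seteq)
}
```

The point of the sketch is that the eight i64 loads do the same work as the four i128 loads; a v128 load would only reduce the load count, not change the result.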

https://github.com/llvm/llvm-project/pull/148298