[llvm] [Intrinsics][AArch64] Add intrinsic to mask off aliasing vector lanes (PR #117007)

Wed Aug 6 03:58:41 PDT 2025

================
@@ -5201,6 +5215,51 @@ SDValue AArch64TargetLowering::LowerFSINCOS(SDValue Op,
 
 static MVT getSVEContainerType(EVT ContentTy);
 
+SDValue
+AArch64TargetLowering::LowerLOOP_DEPENDENCE_MASK(SDValue Op,
+                                                 SelectionDAG &DAG) const {
+  SDLoc DL(Op);
+  uint64_t EltSize = Op.getConstantOperandVal(2);
+  EVT VT = Op.getValueType();
+  // Make sure that the promoted mask size and element size match
----------------
sdesmalen-arm wrote:

This case should actually be supported rather than rejected, e.g.
```
define <vscale x 16 x i1> @whilewr_nxv16i1_halfwords(ptr %a, ptr %b) {
entry:
  %0 = call <vscale x 16 x i1> @llvm.loop.dependence.war.mask(ptr %a, ptr %b, i64 2)
  ret <vscale x 16 x i1> %0
}
```
should not result in a compile-time failure, but should instead produce code that is correct (even if suboptimal).
You can let this default to `Expand` by returning `SDValue()` here. The same thing should happen for an `EltSize` that's not in `{1, 2, 4, 8}`.
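
As a rough standalone sketch of the intended decision (not the actual LLVM code): `pickWhileSuffix` is a hypothetical helper, `std::nullopt` stands in for returning `SDValue()`, and the `MinLanes * EltSize == 16` check is an assumption that the promoted mask must fill exactly one 128-bit SVE granule.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of the lowering decision; none of this is LLVM API.
// MinLanes is the minimum lane count of the scalable mask type
// (16 for <vscale x 16 x i1>); EltSize is the intrinsic's third operand.
// Returns the element suffix of the instruction that would be selected,
// or std::nullopt to model `return SDValue();` (fall back to Expand).
std::optional<char> pickWhileSuffix(unsigned MinLanes, uint64_t EltSize) {
  char Suffix;
  switch (EltSize) {
  case 1: Suffix = 'b'; break;
  case 2: Suffix = 'h'; break;
  case 4: Suffix = 's'; break;
  case 8: Suffix = 'd'; break;
  default:
    // EltSize not in {1, 2, 4, 8}: no matching instruction form.
    return std::nullopt;
  }
  // Assumption: mask size and element size "match" when the mask covers
  // exactly one 128-bit granule, i.e. MinLanes * EltSize == 16.
  if (MinLanes * EltSize != 16)
    return std::nullopt;
  return Suffix;
}
```

Under these assumptions, `pickWhileSuffix(16, 2)` (the example above) and `pickWhileSuffix(16, 3)` both take the fallback path instead of asserting.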

That means the operation on scalable types should be made `Custom` again (reverting the change from the previous revision).

Can you also add a test for both cases?

https://github.com/llvm/llvm-project/pull/117007