[llvm-branch-commits] [llvm] [AArch64] Split large loop dependence masks (PR #153187)

Fri Aug 29 06:06:02 PDT 2025

================
@@ -5286,41 +5285,44 @@ AArch64TargetLowering::LowerLOOP_DEPENDENCE_MASK(SDValue Op,
       PtrA = DAG.getNode(ISD::ADD, DL, MVT::i64, PtrA, Addend);
     }
 
-    if (VT.isScalableVT())
-      return DAG.getNode(Op.getOpcode(), DL, VT, PtrA, PtrB, Op.getOperand(2));
-
-    // We can use the SVE whilewr/whilerw instruction to lower this
-    // intrinsic by creating the appropriate sequence of scalable vector
-    // operations and then extracting a fixed-width subvector from the scalable
-    // vector. Scalable vector variants are already legal.
-    EVT ContainerVT =
-        EVT::getVectorVT(*DAG.getContext(), VT.getVectorElementType(),
-                         VT.getVectorNumElements(), true);
-    EVT WhileVT = ContainerVT.changeElementType(MVT::i1);
-
-    SDValue Mask =
-        DAG.getNode(Op.getOpcode(), DL, WhileVT, PtrA, PtrB, Op.getOperand(2));
-    SDValue MaskAsInt = DAG.getNode(ISD::SIGN_EXTEND, DL, ContainerVT, Mask);
-    return DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, MaskAsInt,
-                       DAG.getVectorIdxConstant(0, DL));
+    return DAG.getNode(Op.getOpcode(), DL, VT, PtrA, PtrB, Op.getOperand(2));
   };
 
   SDValue Result;
-  if (!Split) {
-    Result = LowerToWhile(FullVT, 0);
-  } else {
-
+  if (Split) {
----------------
MacDue wrote:

The big difference with this revision is that you keep splitting masks, then only once you can't split anymore do you attempt to containerize fixed-with vectors into scalable vectors. Your previous approach would containerize fixed-width vectors as part of the first split, which leads to a better lowering. 

https://github.com/llvm/llvm-project/pull/153187