[llvm-branch-commits] [llvm] [AArch64] Split large loop dependence masks (PR #153187)

Sun Aug 31 04:03:05 PDT 2025

================
@@ -5286,41 +5285,44 @@ AArch64TargetLowering::LowerLOOP_DEPENDENCE_MASK(SDValue Op,
       PtrA = DAG.getNode(ISD::ADD, DL, MVT::i64, PtrA, Addend);
     }
 
-    if (VT.isScalableVT())
-      return DAG.getNode(Op.getOpcode(), DL, VT, PtrA, PtrB, Op.getOperand(2));
-
-    // We can use the SVE whilewr/whilerw instruction to lower this
-    // intrinsic by creating the appropriate sequence of scalable vector
-    // operations and then extracting a fixed-width subvector from the scalable
-    // vector. Scalable vector variants are already legal.
-    EVT ContainerVT =
-        EVT::getVectorVT(*DAG.getContext(), VT.getVectorElementType(),
-                         VT.getVectorNumElements(), true);
-    EVT WhileVT = ContainerVT.changeElementType(MVT::i1);
-
-    SDValue Mask =
-        DAG.getNode(Op.getOpcode(), DL, WhileVT, PtrA, PtrB, Op.getOperand(2));
-    SDValue MaskAsInt = DAG.getNode(ISD::SIGN_EXTEND, DL, ContainerVT, Mask);
-    return DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, VT, MaskAsInt,
-                       DAG.getVectorIdxConstant(0, DL));
+    return DAG.getNode(Op.getOpcode(), DL, VT, PtrA, PtrB, Op.getOperand(2));
   };
 
   SDValue Result;
-  if (!Split) {
-    Result = LowerToWhile(FullVT, 0);
-  } else {
-
+  if (Split) {
----------------
SamTebbs33 wrote:

Yeah, I separated the splitting and containerisation logic as per the request [here](https://github.com/llvm/llvm-project/pull/153187/commits/85b52942d61712aba884c12a60b98cbdaee2b233#r2304356700). I've experimented with adding another lambda that containerises without re-entering, and that seems to have fixed the codegen.

https://github.com/llvm/llvm-project/pull/153187