[llvm] [LAA] Add initial support for non-power-of-2 store-load forwarding distance (PR #137873)

Alexey Bataev via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 14 10:46:29 PDT 2025


================
@@ -1731,24 +1732,61 @@ bool MemoryDepChecker::couldPreventStoreLoadForward(uint64_t Distance,
       break;
     }
   }
+  // RISCV VLA supports non-power-2 vector factor. So, we iterate in a
+  // backward order to find largest VF, which allows aligned stores-loads or
+  // the number of iterations between conflicting memory addresses is not less
+  // than 8 (NumItersForStoreLoadThroughMemory).
+  if (AllowNonPow2Deps) {
+    MaxVFWithoutSLForwardIssuesNonPowerOf2 =
+        std::min(8 * VectorizerParams::MaxVectorWidth / TypeByteSize,
+                 MaxNonPowerOf2StoreLoadForwardSafeDistanceInBits);
+
+    for (uint64_t VF = MaxVFWithoutSLForwardIssuesNonPowerOf2;
----------------
alexey-bataev wrote:

> Ok I see. But then it behaves differently from `MaxVFWithoutSLForwardIssuesPowerOf2`. `MaxVFWithoutSLForwardIssuesPowerOf2` limits the max VF we can use, and we can also use any VF between 1 and MaxVF.
> 

Not quite. Any power-of-2 VF up to that maximum, but not any VF.

> Is `MaxVFWithoutSLForwardIssuesNonPowerOf2` a single non-power-of-2 VF we can use, but other VFs between 1 .. MaxVFWithoutSLForwardIssuesNonPowerOf2 may not be used?
> 

Not necessarily. We can use any whole divisor of `MaxVFWithoutSLForwardIssuesNonPowerOf2`. Say, if `MaxVFWithoutSLForwardIssuesNonPowerOf2` is 9, then we can use 3 and 9. If it is 6, we can use 2, 3, and 6. All of these are safe.
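To illustrate the divisor rule, here is a minimal standalone sketch (not code from the patch). It assumes distances and VFs are counted in elements rather than bytes, and both helper names, `mayPreventForwarding` and `usableVFs`, are hypothetical. The predicate follows the condition described in the patch comment: a VF is fine if the distance is a multiple of it, or if at least `NumItersForStoreLoadThroughMemory` vector iterations separate the conflicting accesses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumption: same value as the constant named in the patch comment.
constexpr uint64_t NumItersForStoreLoadThroughMemory = 8;

// Hypothetical helper: returns true if a VF of this size may hit a
// store-load forwarding problem for the given dependence distance.
// Both arguments are in elements here (an assumption made for simplicity).
bool mayPreventForwarding(uint64_t DistanceInElts, uint64_t VF) {
  assert(VF >= 1 && "VF must be positive");
  return DistanceInElts % VF != 0 &&
         DistanceInElts / VF < NumItersForStoreLoadThroughMemory;
}

// Hypothetical helper: all whole divisors of MaxVF greater than 1, i.e. the
// vector factors that remain usable under the divisor rule.
std::vector<uint64_t> usableVFs(uint64_t MaxVF) {
  assert(MaxVF >= 1 && "MaxVF must be positive");
  std::vector<uint64_t> Result;
  for (uint64_t VF = 2; VF <= MaxVF; ++VF)
    if (MaxVF % VF == 0)
      Result.push_back(VF);
  return Result;
}
```

With a distance of 9 elements, `usableVFs(9)` yields 3 and 9, and neither trips the predicate, while a VF of 4 does. Any divisor of a safe VF stays safe under this predicate, because both the divisibility and the iteration-count conditions carry over to it.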

> Am I understanding correctly that, for example, with Max pow2 VF = 2 and MaxNonPowOf2 VF = 9, LV can choose either 2 or 16 (with the VF limited to 9)?
> 

The vector factor can be 2, 4, 8, or 16. But in the non-power-of-2 case we need an extra check (or a special instruction) ensuring that the number of processed elements is limited to 9 or 3 only.
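As a rough scalar model of that limit (an illustration only, not how LV emits code), the sketch below computes how many elements each vector iteration would process when a wider VF is capped at the safe element count. The helper name `elementsPerIteration` and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: element count handled by each vector iteration when
// the vector factor VF is capped at Limit elements per iteration.
std::vector<uint64_t> elementsPerIteration(uint64_t TripCount, uint64_t VF,
                                           uint64_t Limit) {
  std::vector<uint64_t> Counts;
  const uint64_t Step = std::min(VF, Limit);
  if (Step == 0)
    return Counts; // Nothing can be processed with a zero-sized step.
  for (uint64_t Done = 0; Done < TripCount;) {
    // The last iteration may process fewer elements than the cap.
    const uint64_t Active = std::min(Step, TripCount - Done);
    Counts.push_back(Active);
    Done += Active;
  }
  return Counts;
}
```

For a trip count of 20 with VF = 16 and a limit of 9, this gives iterations of 9, 9, and 2 elements, so no single iteration spans more than the safe distance.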

> If that's the case, I think it would be good to update the comment for `MaxNonPowerOf2StoreLoadForwardSafeDistanceInBits` to make this difference from the power-of-2 variant clear, as they would behave quite differently.

Suggestions? Any preferences here?

https://github.com/llvm/llvm-project/pull/137873

