[llvm] [VPlan] Support scalable VFs in narrowInterleaveGroups. (PR #154842)

Thu Aug 28 03:14:04 PDT 2025

================
@@ -16,18 +16,15 @@ define void @load_store_interleave_group(ptr noalias %data) {
 ; CHECK-NEXT:    [[TMP3:%.*]] = mul nuw i64 [[TMP2]], 2
 ; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 100, [[TMP3]]
 ; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 100, [[N_MOD_VF]]
+; CHECK-NEXT:    [[TMP6:%.*]] = call i64 @llvm.vscale.i64()
----------------
david-arm wrote:

Ah, I think I see. We only perform the transform if the VF is divisible by the interleave factor, which currently excludes interleave factors that aren't powers of 2. So this doesn't have to be done in this PR, but I do think N_VEC should be recalculated, because we're making the scalar tail longer than it needs to be. Suppose the original trip count is 19, the interleave factor is 4 and the VF is 4. N_VEC will be 19 - (19 % 4) = 16, which means we're only processing 16 iterations in the vector loop when in reality we can process all 19 and delete the tail completely. For scalable VFs we can't delete the tail, but we can still process more iterations in the vector loop, if that makes sense?
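
To make the arithmetic concrete, here is a rough sketch of the two trip-count calculations. It is not code from the PR: `vectorTripCount` and `scalarTail` are hypothetical helpers, and the assumption that the narrowed loop advances by VF / interleave factor original iterations per vector iteration (so 1 for the fixed case, vscale for a `vscale x 4` VF with factor 4) is my reading of the transform, ignoring unrolling.

```cpp
#include <cstdint>

// Hypothetical helper, not an LLVM API: number of original iterations the
// vector loop covers when it advances by `step` iterations at a time.
// This mirrors N_VEC = TC - (TC % step) from the CHECK lines.
constexpr std::uint64_t vectorTripCount(std::uint64_t tripCount,
                                        std::uint64_t step) {
  return tripCount - (tripCount % step);
}

// Hypothetical helper: iterations left over for the scalar tail.
constexpr std::uint64_t scalarTail(std::uint64_t tripCount,
                                   std::uint64_t step) {
  return tripCount % step;
}

// Fixed VF = 4, interleave factor = 4, trip count = 19.
// N_VEC computed from the original VF leaves a 3-iteration tail.
static_assert(vectorTripCount(19, 4) == 16, "N_VEC based on the original VF");
static_assert(scalarTail(19, 4) == 3, "tail based on the original VF");

// Assumed step after narrowing: VF / factor = 1, so no tail remains.
static_assert(vectorTripCount(19, 1) == 19, "recalculated N_VEC");
static_assert(scalarTail(19, 1) == 0, "tail can be deleted");

// Scalable case, assuming vscale = 2 at runtime and VF = vscale x 4:
// the original step is 8, the assumed narrowed step is vscale = 2.
static_assert(vectorTripCount(19, 8) == 16, "N_VEC based on vscale x 4");
static_assert(vectorTripCount(19, 2) == 18, "recalculated N_VEC");
static_assert(scalarTail(19, 2) == 1, "tail shrinks but does not vanish");
```

In the scalable case vscale is only known at runtime, so the tail cannot be removed statically, but under these assumptions the vector loop would still cover 18 iterations rather than 16.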

https://github.com/llvm/llvm-project/pull/154842