[llvm] [LoopVectorize] Generate wide active lane masks (PR #147535)

Tue Jul 8 07:33:57 PDT 2025

================
@@ -0,0 +1,121 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mattr=+sve    < %s | FileCheck %s -check-prefix CHECK-SVE
+; RUN: llc -mattr=+sve2p1 < %s | FileCheck %s -check-prefix CHECK-SVE2p1
+
+target triple = "aarch64-unknown-linux"
+
+define void @scalable_wide_active_lane_mask(ptr %dst, ptr %src, i64 %n) #0 {
----------------
david-arm wrote:

I think it's unusual for codegen tests to contain loops as large as this one. Are you specifically trying to test something that requires a loop? If not, perhaps you could pick out the parts you really care about and extract them into smaller tests. For example, if you're interested in the codegen for this:

```
  %active.lane.mask.next = tail call <vscale x 32 x i1> @llvm.get.active.lane.mask.nxv32i1.i64(i64 %index, i64 %4)
  %17 = tail call <vscale x 16 x i1> @llvm.vector.extract.nxv16i1.nxv32i1(<vscale x 32 x i1> %active.lane.mask.next, i64 0)
  %18 = tail call <vscale x 16 x i1> @llvm.vector.extract.nxv16i1.nxv32i1(<vscale x 32 x i1> %active.lane.mask.next, i64 16)
  %19 = extractelement <vscale x 16 x i1> %17, i64 0
```

you could pull that out into its own test that returns the extracted element?
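
A reduced test along those lines might look roughly like the sketch below. The function name `@extract_first_lane_of_wide_mask` and the parameter names are hypothetical, the target features are assumed to come from the existing `-mattr` RUN lines, and the CHECK lines are left out because they would be regenerated with `utils/update_llc_test_checks.py`.

```llvm
target triple = "aarch64-unknown-linux"

; Hypothetical reduced test: no loop, just the wide mask, one half of it,
; and the first lane of that half returned to the caller.
define i1 @extract_first_lane_of_wide_mask(i64 %index, i64 %n) {
  ; Build the wide (nxv32i1) active lane mask.
  %mask = call <vscale x 32 x i1> @llvm.get.active.lane.mask.nxv32i1.i64(i64 %index, i64 %n)
  ; Take the low nxv16i1 half of the wide mask.
  %lo = call <vscale x 16 x i1> @llvm.vector.extract.nxv16i1.nxv32i1(<vscale x 32 x i1> %mask, i64 0)
  ; Return lane 0 so the computation is not dead code.
  %elt = extractelement <vscale x 16 x i1> %lo, i64 0
  ret i1 %elt
}

declare <vscale x 32 x i1> @llvm.get.active.lane.mask.nxv32i1.i64(i64, i64)
declare <vscale x 16 x i1> @llvm.vector.extract.nxv16i1.nxv32i1(<vscale x 32 x i1>, i64)
```

Returning the element keeps the mask computation live without needing the surrounding loads, stores, or branch.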

https://github.com/llvm/llvm-project/pull/147535