[llvm] [SVE] Wide active lane mask (PR #76514)

Wed Jan 10 05:48:06 PST 2024

================
@@ -1772,15 +1772,17 @@ void AArch64TargetLowering::addTypeForNEON(MVT VT) {
 
 bool AArch64TargetLowering::shouldExpandGetActiveLaneMask(EVT ResVT,
                                                           EVT OpVT) const {
-  // Only SVE has a 1:1 mapping from intrinsic -> instruction (whilelo).
-  if (!Subtarget->hasSVE())
+  // Only SVE/SME has a 1:1 mapping from intrinsic -> instruction (whilelo).
+  if (!Subtarget->hasSVEorSME())
----------------
david-arm wrote:

I agree with @CarolineConcatto. There are actually two changes happening here:

1. We are permitting get.active.lane.mask intrinsic calls to be lowered directly to while instructions when SME is available, even if SVE is not present. That makes sense, but we should have an extra RUN line for SME in active_lane_mask.ll (i.e. -mattr=+sme), because the nxv2i1, nxv4i1, nxv8i1 and nxv16i1 types should all work for SME.
2. We are introducing support for the nxv32i1 type when either SVE2p1 or SME2 is available. So we need two RUN lines in get-active-lane-mask-32x1.ll - one for SVE2p1 and one for SME2.
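
As a rough sketch, the RUN lines requested in the two points above might look something like the following. The triple, the use of stdin redirection, and the bare FileCheck invocation (no check prefixes) are assumptions for illustration and are not copied from the existing tests; only the -mattr feature names come from the comment itself.

```
; active_lane_mask.ll: an extra SME run alongside the existing SVE one
; (assumed triple and FileCheck usage)
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sme < %s | FileCheck %s

; get-active-lane-mask-32x1.ll: one run per feature that enables nxv32i1
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve2p1 < %s | FileCheck %s
; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sme2 < %s | FileCheck %s
```

If the generated code differs between configurations, separate --check-prefixes would probably be needed for each run; otherwise a shared CHECK prefix keeps the tests short.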

https://github.com/llvm/llvm-project/pull/76514