[PATCH] D136862: [AArch64][SME2] Add CodeGen support for target("aarch64.svcount").

Thu Feb 16 05:51:44 PST 2023

paulwalker-arm added a comment.

My `class PPRClass` related comment is possibly problematic.  My  preference is to have a clean split between the two predicate types but I realise this is the start of the journey so if making the change results is too much churn then I'm happy to defer it to a refactoring exercise once the initial support has settled.  That said, I'd still prefer to see how custom lowering works out rather than duplicating isel patterns.

I'm just going to throw this out there without giving it much thought but what about enabling the ability to `ISD::BITCAST` between `svcount` and `nxv16i1`?  I think this better reflects what's going on compared to using reinterpret_cast.  It also means we might get "free" operation legalisation by using `setOperationPromotedToType`.

================
Comment at: llvm/include/llvm/Support/MachineValueType.h:401-406
+      switch (SimpleTy) {
+      case MVT::aarch64svcount:
+        return true;
+      default:
+        return false;
+      }
----------------
This is more extensible but would `return SimpleTy == MVT::aarch64svcount;` be so bad?

================
Comment at: llvm/lib/CodeGen/CodeGenPrepare.cpp:7699
   // low/high parts. Bailing out for now.
-  if (isa<ScalableVectorType>(StoreType))
+  if (isa<ScalableVectorType>(StoreType) || StoreType->isScalableTargetExtTy())
     return false;
----------------
Do you think it's worth `Type` having an `isScalable()` function? if this will be a common idiom.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:17556-17557

+  if (N->getValueType(0).isScalableTargetExtVT())
+    return SDValue();
+
----------------
What is it about this combine that's problematic? Can you add a code comment for the rational.

Is it specially related to scalable target types or just target types in general?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:6448-6449
       case CCValAssign::Indirect:
         assert((VA.getValVT().isScalableVector() ||
+                VA.getValVT() == MVT::aarch64svcount ||
                 Subtarget->isWindowsArm64EC()) &&
----------------
Is this type specific or is it worth adding `VT.isScalable()`?

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:9281-9293
+    TVal = DAG.getNode(AArch64ISD::REINTERPRET_CAST, DL, MVT::nxv16i1, TVal);
+    FVal = DAG.getNode(AArch64ISD::REINTERPRET_CAST, DL, MVT::nxv16i1, FVal);
+    EVT CCVT = CCVal.getValueType();
+    SDValue ID =
+        DAG.getTargetConstant(Intrinsic::aarch64_sve_whilelo, DL, CCVT);
+    SDValue Zero = DAG.getConstant(0, DL, CCVT);
+    SDValue SplatVal = DAG.getNode(ISD::SIGN_EXTEND_INREG, DL, CCVT, CCVal,
----------------
Can this be simplified to:
```
REINTERPRET_CAST(SELECT(CCVal, REINTERPRET_CAST(TVAL), REINTERPRET_CAST(FVAL)))
```
Then we'll renter this function and reuse the existing predicate lowering code. Perhaps you tried this and it didn't work out?

================
Comment at: llvm/lib/Target/AArch64/AArch64RegisterInfo.td:894
                                   "AArch64",
-                                  [ nxv16i1, nxv8i1, nxv4i1, nxv2i1, nxv1i1 ], 16,
+                                  [ nxv16i1, nxv8i1, nxv4i1, nxv2i1, nxv1i1, aarch64svcount ], 16,
                                   (sequence "P%u", firstreg, lastreg)> {
----------------
Would it be overkill for `aarch64svcount` to have its own register class? I'm thinking it'll be nice to have isel protect us from incorrectly mixing predicate types and instructions. I'm also not sure if things like `SDTCisSameNumEltsAs` will cause is trouble, although I guess you've not hit anything so far.

The sequence below is using the non-count naming. Does this matter?

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:2800
   defm Pat_Store_P16 : unpred_store_predicate<nxv16i1, STR_PXI>;
+  defm Pat_Store_PredAsCount : unpred_store_predicate<aarch64svcount, STR_PXI>;

----------------
Can the loads and stores (well anything that isn't a native svcount instruction) be done during lower?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136862/new/

https://reviews.llvm.org/D136862