[clang] [llvm] [mlir] [AArch64][SME] Remove immediate argument restriction for svldr and svstr (PR #68565)

Sam Tebbs via cfe-commits cfe-commits at lists.llvm.org
Tue Nov 14 03:33:02 PST 2023


================
@@ -4825,6 +4827,113 @@ SDValue AArch64TargetLowering::getPStateSM(SelectionDAG &DAG, SDValue Chain,
                      Mask);
 }
 
+// Lower an SME LDR/STR ZA intrinsic to LDR_ZA_PSEUDO or STR_ZA.
+// Case 1: If the vector number (vecnum) is an immediate in range, it gets
+// folded into the instruction
+//    ldr(%tileslice, %ptr, 11) -> ldr [%tileslice, 11], [%ptr, 11]
+// Case 2: If the vecnum is not an immediate, then it is used to modify the base
+// and tile slice registers
+//    ldr(%tileslice, %ptr, %vecnum)
+//    ->
+//    %svl = rdsvl
+//    %ptr2 = %ptr + %svl * %vecnum
+//    %tileslice2 = %tileslice + %vecnum
+//    ldr [%tileslice2, 0], [%ptr2, 0]
+// Case 3: If the vecnum is an immediate out of range, then the same is done as
+// case 2, but the base and slice registers are modified by the greatest
+// multiple of 15 lower than the vecnum and the remainder is folded into the
+// instruction. This means that successive loads and stores that are offset from
+// each other can share the same base and slice register updates.
+//    ldr(%tileslice, %ptr, 22)
+//    ldr(%tileslice, %ptr, 23)
+//    ->
+//    %svl = rdsvl
+//    %ptr2 = %ptr + %svl * 15
+//    %tileslice2 = %tileslice + 15
+//    ldr [%tileslice2, 7], [%ptr2, 7]
+//    ldr [%tileslice2, 8], [%ptr2, 8]
+// Case 4: If the vecnum is an add of an immediate, then the non-immediate
+// operand and the immediate can be folded into the instruction, like case 2.
+//    ldr(%tileslice, %ptr, %vecnum + 7)
+//    ldr(%tileslice, %ptr, %vecnum + 8)
+//    ->
+//    %svl = rdsvl
+//    %ptr2 = %ptr + %svl * %vecnum
+//    %tileslice2 = %tileslice + %vecnum
+//    ldr [%tileslice2, 7], [%ptr2, 7]
+//    ldr [%tileslice2, 8], [%ptr2, 8]
+// Case 5: The vecnum being an add of an immediate out of range is also handled,
+// in which case the same remainder logic as case 3 is used.
+SDValue LowerSMELdrStr(SDValue N, SelectionDAG &DAG, bool IsLoad) {
+  SDLoc DL(N);
+
+  SDValue TileSlice = N->getOperand(2);
+  SDValue Base = N->getOperand(3);
+  SDValue VecNum = N->getOperand(4);
+  int Addend = 0;
+
+  // If the vnum is an add, we can fold that add into the instruction if the
+  // operand is an immediate. The range check is performed below.
+  if (VecNum.getOpcode() == ISD::ADD) {
+    if (auto ImmNode = dyn_cast<ConstantSDNode>(VecNum.getOperand(1))) {
+      Addend = ImmNode->getSExtValue();
+      VecNum = VecNum.getOperand(0);
+    }
+  }
+
+  SDValue Remainder = DAG.getTargetConstant(Addend, DL, MVT::i32);
+
+  // true if the base and slice registers need to be modified
+  bool NeedsAdd = true;
+  auto ImmNode = dyn_cast<ConstantSDNode>(VecNum);
+  if (ImmNode || Addend != 0) {
+    int Imm = ImmNode ? ImmNode->getSExtValue() + Addend : Addend;
+    Remainder = DAG.getTargetConstant(Imm % 16, DL, MVT::i32);
+    if (Imm >= 0 && Imm <= 15) {
+      // If vnum is an immediate in range then we don't need to modify the tile
+      // slice and base register. We could also get here because Addend != 0 but
+      // vecnum is not an immediate, in which case we still want the base and
+      // slice register to be modified
+      NeedsAdd = !ImmNode;
----------------
SamTebbs33 wrote:

I actually didn't realise that `SDValue()` is a falsey value. That certainly does eliminate the need for the `NeedsAdd` boolean. Thank you!

https://github.com/llvm/llvm-project/pull/68565


More information about the cfe-commits mailing list