[llvm] [TableGen] List the indices of sub-operands (PR #163723)

Thu Oct 16 01:16:36 PDT 2025

https://github.com/statham-arm created https://github.com/llvm/llvm-project/pull/163723

Some instances of the `Operand` class used in Tablegen instruction definitions expand to a cluster of multiple operands at the MC layer, such as complex addressing modes involving base + offset + shift, or clusters of operands describing conditional Arm instructions or predicated MVE instructions. There's currently no convenient way for C++ code to know the offset of one of those sub-operands from the start of the cluster: instead it just hard-codes magic numbers like `index+2`, which is hard to read and fragile.

This patch adds an extra piece of output to `InstrInfoEmitter` to define those instruction offsets, based on the name of the `Operand` class instance in Tablegen, and the names assigned to the sub-operands in the `MIOperandInfo` field. For example, if target Foo were to define

  def Bar : Operand {
    let MIOperandInfo = (ops GPR:$first, i32imm:$second);
    // ...
  }

then the new constants would be `Foo::SUBOP_Bar_first` and `Foo::SUBOP_Bar_second`, defined as 0 and 1 respectively.

As an example, I've converted some magic numbers related to the MVE predication operand types (`vpred_n` and its superset `vpred_r`) to use the new named constants in place of the integer literals they previously used. This is more verbose, but also clearer, because it explains why the integer is chosen instead of what its value is.

>From d940ad2062a04e1294277c459b585db990f77d9f Mon Sep 17 00:00:00 2001
From: Simon Tatham <simon.tatham at arm.com>
Date: Wed, 15 Oct 2025 17:00:21 +0100
Subject: [PATCH] [TableGen] List the indices of sub-operands

Some instances of the `Operand` class used in Tablegen instruction
definitions expand to a cluster of multiple operands at the MC layer,
such as complex addressing modes involving base + offset + shift, or
clusters of operands describing conditional Arm instructions or
predicated MVE instructions. There's currently no convenient way for
C++ code to know the offset of one of those sub-operands from the
start of the cluster: instead it just hard-codes magic numbers like
`index+2`, which is hard to read and fragile.

This patch adds an extra piece of output to `InstrInfoEmitter` to
define those instruction offsets, based on the name of the `Operand`
class instance in Tablegen, and the names assigned to the sub-operands
in the `MIOperandInfo` field. For example, if target Foo were to
define

  def Bar : Operand {
    let MIOperandInfo = (ops GPR:$first, i32imm:$second);
    // ...
  }

then the new constants would be `Foo::SUBOP_Bar_first` and
`Foo::SUBOP_Bar_second`, defined as 0 and 1 respectively.

As an example, I've converted some magic numbers related to the MVE
predication operand types (`vpred_n` and its superset `vpred_r`) to
use the new named constants in place of the integer literals they
previously used. This is more verbose, but also clearer, because it
explains why the integer is chosen instead of what its value is.
---
 llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp        |  3 ++-
 .../Target/ARM/MVETPAndVPTOptimisationsPass.cpp    |  2 +-
 llvm/utils/TableGen/InstrInfoEmitter.cpp           | 14 ++++++++++++++
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp b/llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
index 96ee69cf3f4ce..eda84602d310c 100644
--- a/llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
+++ b/llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp
@@ -882,7 +882,8 @@ static bool producesFalseLanesZero(MachineInstr &MI,
       continue;
     // Skip the lr predicate reg
     int PIdx = llvm::findFirstVPTPredOperandIdx(MI);
-    if (PIdx != -1 && (int)MO.getOperandNo() == PIdx + 2)
+    if (PIdx != -1 &&
+        (int)MO.getOperandNo() == PIdx + ARM::SUBOP_vpred_n_tp_reg)
       continue;
 
     // Check that this instruction will produce zeros in its false lanes:
diff --git a/llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp b/llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp
index 5eeb4fe995485..413e8442419fd 100644
--- a/llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp
+++ b/llvm/lib/Target/ARM/MVETPAndVPTOptimisationsPass.cpp
@@ -534,7 +534,7 @@ bool MVETPAndVPTOptimisations::ConvertTailPredLoop(MachineLoop *ML,
     Register LR = LoopPhi->getOperand(0).getReg();
     for (MachineInstr *MI : MVEInstrs) {
       int Idx = findFirstVPTPredOperandIdx(*MI);
-      MI->getOperand(Idx + 2).setReg(LR);
+      MI->getOperand(Idx + ARM::SUBOP_vpred_n_tp_reg).setReg(LR);
     }
   }
 
diff --git a/llvm/utils/TableGen/InstrInfoEmitter.cpp b/llvm/utils/TableGen/InstrInfoEmitter.cpp
index 176e4b250b82a..b783303af529c 100644
--- a/llvm/utils/TableGen/InstrInfoEmitter.cpp
+++ b/llvm/utils/TableGen/InstrInfoEmitter.cpp
@@ -1135,6 +1135,20 @@ void InstrInfoEmitter::run(raw_ostream &OS) {
 
   OS << "\n};\n} // end namespace llvm\n";
 
+  OS << "namespace llvm::" << Target.getInstNamespace() << " {\n";
+  for (const Record *R : Records.getAllDerivedDefinitions("Operand")) {
+    if (R->isAnonymous())
+      continue;
+    if (const DagInit *D = R->getValueAsDag("MIOperandInfo")) {
+      for (unsigned i = 0, e = D->getNumArgs(); i < e; ++i) {
+        if (const StringInit *Name = D->getArgName(i))
+          OS << "  constexpr unsigned SUBOP_" << R->getName() << "_"
+             << Name->getValue() << " = " << i << ";\n";
+      }
+    }
+  }
+  OS << "} // end namespace llvm::" << Target.getInstNamespace() << "\n";
+
   OS << "#endif // GET_INSTRINFO_HEADER\n\n";
 
   OS << "#ifdef GET_INSTRINFO_HELPER_DECLS\n";