[llvm-branch-commits] [llvm] [MIR2Vec] Handle Operands (PR #163281)

S. VenkataKeerthy via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Oct 15 11:17:22 PDT 2025


================
@@ -74,31 +76,114 @@ class MIRVocabulary {
   friend class llvm::MIR2VecVocabLegacyAnalysis;
   using VocabMap = std::map<std::string, ir2vec::Embedding>;
 
-private:
-  // Define vocabulary layout - adapted for MIR
+  // MIRVocabulary Layout:
+  // +-------------------+-----------------------------------------------------+
+  // | Entity Type       | Description                                         |
+  // +-------------------+-----------------------------------------------------+
+  // | 1. Opcodes        | Target specific opcodes derived from TII, grouped   |
+  // |                   | by instruction semantics.                           |
+  // | 2. Common Operands| All common operand types, except register operands, |
+  // |                   | defined by MachineOperand::MachineOperandType enum. |
+  // | 3. Physical       | Register classes defined by the target, specialized |
+  // |    Registers      | by physical registers.                              |
+  // | 4. Virtual        | Register classes defined by the target, specialized |
+  // |    Registers      | by virtual and physical registers.                  |
+  // +-------------------+-----------------------------------------------------+
+
+  /// Layout information for the MIR vocabulary. Defines the starting index
+  /// and size of each section in the vocabulary.
   struct {
     size_t OpcodeBase = 0;
-    size_t OperandBase = 0;
+    size_t CommonOperandBase = 0;
+    size_t PhyRegBase = 0;
+    size_t VirtRegBase = 0;
     size_t TotalEntries = 0;
   } Layout;
 
-  enum class Section : unsigned { Opcodes = 0, MaxSections };
+  enum class Section : unsigned {
+    Opcodes = 0,
+    CommonOperands = 1,
+    PhyRegisters = 2,
+    VirtRegisters = 3,
+    MaxSections
+  };
 
   ir2vec::VocabStorage Storage;
   mutable std::set<std::string> UniqueBaseOpcodeNames;
+  mutable SmallVector<std::string, 24> RegisterOperandNames;
+
+  // Some instructions have optional register operands that may be NoRegister.
+  // We return a zero vector in such cases.
+  mutable Embedding ZeroEmbedding;
----------------
svkeerthy wrote:

Problem with making it as `const` is that `Dimension` is not known till `Storage` is created in the end of the constructor. 

https://github.com/llvm/llvm-project/pull/163281


More information about the llvm-branch-commits mailing list