[PATCH 1/2] TableGen: Generate a function for getting operand indices based on their defined names

Wed Jun 19 11:18:32 PDT 2013

Hi Sean,

Attached is an updated patch based on your suggestions.  Let me know
what you think.

-Tom

On Mon, Jun 17, 2013 at 06:59:03PM -0700, Sean Silva wrote:
> Hi Tom,
> 
> This patch could really use some documentation.
> 
> I recommend adding a description of this new TableGen capability to the
> appropriate part of <http://llvm.org/docs/WritingAnLLVMBackend.html>
> (docs/WritingAnLLVMBackend.rst), or possibly adding a new page like <
> http://llvm.org/docs/HowToUseInstrMappings.html> (although that was a
> larger patch; most of my comments in the review for that feature are likely
> relevant for this patch <
> http://thread.gmane.org/gmane.comp.compilers.llvm.cvs/123031>) and linking
> to it from WritingAnLLVMBackend . At the very least, please describe the
> interface of the generated code (something like "this results in the
> generation of a namespace that looks like ..., and a function with the
> following signature and name: ..."), and a before/after of how to modify
> existing TableGen patterns to take advantage of this functionality. If you
> decide to make a separate page, you might find <
> http://llvm.org/docs/SphinxQuickstartTemplate.html> helpful.
> 
> Your second patch seems to introduce some tricky magic #define's to change
> the content of the .inc file. I think those ought to be documented
> alongside what I mentioned above.
> 
> but may have a different operand index depending on the instruction type.
> > It is useful to have a convenient way to look up the operand indices,
> > so these bits can be generically set on any instruction.
> > v2:
> >   - Don't uppercase enum values
> >   - Use table compresion to reduce function size
> > v3:
> >   - Only generate table for instructions with the UseNamedOperandTable
> >     bit set.
> >
> 
> These "patch versioning" remarks in the commit message don't seem to be
> really adding much. The review thread is sufficient record of the iteration
> on the patch.
> 
> +
> +  // Operand name -> index mapping
> +
> 
> Could you maybe give some description of what this does and what data
> structures it creates in the .inc file (or cross-reference to such
> documentation in docs/)? Also, possibly move the new code into separate
> functions/methods (unless it needs to access a ton of local variables).
> 
> +  std::map<std::string, unsigned> Operands;
> +  OpNameMapTy OperandMap;
> +  for (unsigned i = 0, e = NumberedInstructions.size(), NumOperands = 0;
> +                                                                  i != e;
> ++i) {
> +    const CodeGenInstruction *Inst = NumberedInstructions[i];
> +    if (!Inst->TheDef->getValueAsBit("UseNamedOperandTable")) {
> +      continue;
> +    }
> +    std::map<unsigned, unsigned> OpList;
> +    for (unsigned j = 0, je = Inst->Operands.size(); j != je; ++j) {
> +      CGIOperandList::OperandInfo Info = Inst->Operands[j];
> +      std::string Name =  Info.Name;
> +      if (Operands.count(Name) == 0) {
> +        Operands[Name] = NumOperands++;
> +      }
> +      unsigned OperandId = Operands[Name];
> +      OpList[OperandId] = Info.MIOperandNo;
> +    }
> +    OperandMap[OpList].push_back(Namespace + "::" +
> Inst->TheDef->getName());
> +  }
> 
> I really feel like this computation (which seems to just be initializing
> `Operands` and `OperandMap`) ought to be factored out into a method.
> 
> +  typedef std::map<std::map<unsigned, unsigned>,
> +                   std::vector<std::string> > OpNameMapTy;
> 
> This is a pretty complex data type. It deserves a bit of documentation
> about what each part means (what does each of the two `unsigned`'s
> represent? what does each std::string represent? what is being modeled by
> having a vector of strings?)
> 
> +    for (std::vector<std::string>::iterator ii = OpcodeList.begin(),
> +                                            ie = OpcodeList.end();
> +                                            ii != ie; ++ii) {
> +      std::string OpName = *ii;
> +      OS << "  case " << OpName << ":\n";
> +    }
> 
> This is really verbose. You should be able to reduce this to two lines by
> using a counted loop:
> 
> for (unsigned ii = 0, ie = OpcodeList.size() ; ii != ie; ++ii)
>   OS << "  case " << OpcodeList[ii] << ":\n";
> 
> +    for (unsigned Idx = 0; Idx < Operands.size(); ++Idx) {
> +      if (OpList.count(Idx) == 0) {
> +        OS << "-1";
> +      } else {
> +        OS << OpList[Idx];
> +      }
> +      OS << ", ";
> +    }
> 
> Don't evaluate `.size()` every time through the loop. Also, although it is
> a bit terse, the following loop seems more compact and readable to me:
> 
> for (unsigned i = 0, e = Operands.size(); i != e; ++i)
>   OS << (OpList.count(Idx) == 0 ? "-1" : OpList[Idx]) << ", ";
> 
> Target::getNamedOperandIdx(Target::ADD, Target::OpName::DST)  => 0
> > Target::getNamedOperandIdx(Target::ADD, Target::OpName::SRC0) => 1
> > Target::getNamedOperandIdx(Target::ADD, Target::OpName::SRC1) => 2
> >
> 
> 
> > The operand names are case insensitive, so $dst is equivalent to $DST.
> 
> 
> In the second patch, it seems like all the values in the namespace
> Target::OpName are purely lowercase, such as `src0_sel` in:
> 
> -    TII->getOperandIdx(Opcode, R600Operands::SRC0_SEL),
> -    TII->getOperandIdx(Opcode, R600Operands::SRC1_SEL),
> -    TII->getOperandIdx(Opcode, R600Operands::SRC2_SEL)
> +    TII->getOperandIdx(Opcode, AMDGPU::OpName::src0_sel),
> +    TII->getOperandIdx(Opcode, AMDGPU::OpName::src1_sel),
> +    TII->getOperandIdx(Opcode, AMDGPU::OpName::src2_sel)
> 
> I think the commit message (or better yet, documentation in docs/) of the
> first patch should indicate the case of the resulting constants (do they
> all get lowercased?).
> 
> Also, as a side note, the AMDGPU backend doesn't seem to have any content
> in <http://llvm.org/docs/CompilerWriterInfo.html>
> (docs/CompilerWriterInfo.rst). Would you mind adding some links to the
> appropriate manuals?
> 
> Thanks,
> 
> -- Sean Silva
-------------- next part --------------
>From f9d39dcf6aeafada398c62104a2d69ce113f1211 Mon Sep 17 00:00:00 2001
From: Tom Stellard <thomas.stellard at amd.com>
Date: Thu, 30 May 2013 10:10:50 -0700
Subject: [PATCH] TableGen: Generate a function for getting operand indices
 based on their defined names v4

This patch modifies TableGen to generate a function in
${TARGET}GenInstrInfo.inc called getNamedOperandIdx(), which can be used
to look up indices for operands based on their names.

In order to activate this feature for an instruction, you must set the
UseNamedOperandTable bit.

For example, if you have an instruction like:

def ADD : TargetInstr <(outs GPR:$dst), (ins GPR:$src0, GPR:$src1)>;

You can look up the operand indices using the new function, like this:

Target::getNamedOperandIdx(Target::ADD, Target::OpName::dst)  => 0
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src0) => 1
Target::getNamedOperandIdx(Target::ADD, Target::OpName::src1) => 2

The operand names are case sensitive, so $dst and $DST are considered
different operands.

This change is useful for R600 which has instructions with a large number
of operands, many of which model single bit instruction configuration
values.  These configuration bits are common across most instructions,
but may have a different operand index depending on the instruction type.
It is useful to have a convenient way to look up the operand indices,
so these bits can be generically set on any instruction.

v2:
  - Don't uppercase enum values
  - Use table compresion to reduce function size

v3:
  - Only generate table for instructions with the UseNamedOperandTable
    bit set.

v4:
  - Clean up code and add documentation
---
 docs/WritingAnLLVMBackend.rst       |  41 ++++++++++++++
 include/llvm/Target/Target.td       |   5 ++
 utils/TableGen/InstrInfoEmitter.cpp | 109 ++++++++++++++++++++++++++++++++++++
 3 files changed, 155 insertions(+)

diff --git a/docs/WritingAnLLVMBackend.rst b/docs/WritingAnLLVMBackend.rst
index a03a5e4..f128bb1 100644
--- a/docs/WritingAnLLVMBackend.rst
+++ b/docs/WritingAnLLVMBackend.rst
@@ -911,6 +911,47 @@ format instructions will bind the operands to the ``rd``, ``rs1``, and ``rs2``
 fields.  This results in the ``XNORrr`` instruction binding ``$dst``, ``$b``,
 and ``$c`` operands to the ``rd``, ``rs1``, and ``rs2`` fields respectively.
 
+TableGen will also generate a function called getNamedOperandIdx() which
+can be used to look up an operand's index in a MachineInstr based on its
+TableGen name.  Setting the UseNamedOperandTable bit in an instruction's
+TableGen definition will add all of its operands to an enumeration in the
+llvm::XXX:OpName namespace and also add an entry for it into the OperandMap
+table, which can be queried using getNamedOperandIdx()
+
+.. code-block:: llvm
+
+  int DstIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::dst); // => 0
+  int BIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::b);     // => 1
+  int CIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::c);     // => 2
+  int DIndex = SP::getNamedOperandIdx(SP::XNORrr, SP::OpName::d);     // => -1
+
+  ...
+
+The entries in the OpName enum are taken verbatim from the TableGen definitions,
+so operands with lowercase names will have lower case entries in the enum.
+
+To include the getNamedOperandIdx() function in your backend, you will need
+to define a few preprocessor macros in XXXInstrInfo.cpp and XXXInstrInfo.h.
+For example:
+
+XXXInstrInfo.cpp:
+
+.. code-block:: llvm
+
+  #define GET_INSTRINFO_NAMED_OPS // For getNamedOperandIdx() function
+  #include "XXXGenInstrInfo.inc"
+
+XXXInstrInfo.h:
+
+.. code-block:: llvm
+
+  #define GET_INSTRINFO_OPERAND_ENUM // For OpName enum
+  #include "XXXGenInstrInfo.inc"
+
+  namespace XXX {
+    int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIndex);
+  } // End namespace XXX
+
 Instruction Relation Mapping
 ----------------------------
 
diff --git a/include/llvm/Target/Target.td b/include/llvm/Target/Target.td
index a9644d4..97df64c 100644
--- a/include/llvm/Target/Target.td
+++ b/include/llvm/Target/Target.td
@@ -445,6 +445,11 @@ class Instruction {
   string TwoOperandAliasConstraint = "";
 
   ///@}
+
+  /// UseNamedOperandTable - If set, the operand indices of this instruction
+  /// can be queried via the getNamedOperandIdx() function which is generated
+  /// by TableGen.
+  bit UseNamedOperandTable = 0;
 }
 
 /// PseudoInstExpansion - Expansion information for a pseudo-instruction.
diff --git a/utils/TableGen/InstrInfoEmitter.cpp b/utils/TableGen/InstrInfoEmitter.cpp
index d6020a8..8e6d375 100644
--- a/utils/TableGen/InstrInfoEmitter.cpp
+++ b/utils/TableGen/InstrInfoEmitter.cpp
@@ -45,11 +45,19 @@ private:
   void emitEnums(raw_ostream &OS);
 
   typedef std::map<std::vector<std::string>, unsigned> OperandInfoMapTy;
+
+  /// The keys of this map are maps which have OpName enum values as their keys
+  /// and instruction operand indices as their values.  The values of this map
+  /// are lists of instruction names.
+  typedef std::map<std::map<unsigned, unsigned>,
+                   std::vector<std::string> > OpNameMapTy;
   void emitRecord(const CodeGenInstruction &Inst, unsigned Num,
                   Record *InstrInfo,
                   std::map<std::vector<Record*>, unsigned> &EL,
                   const OperandInfoMapTy &OpInfo,
                   raw_ostream &OS);
+   void emitOperandNameMappings(raw_ostream &OS, const CodeGenTarget &Target,
+            const std::vector<const CodeGenInstruction*> &NumberedInstructions);
 
   // Operand information.
   void EmitOperandInfo(raw_ostream &OS, OperandInfoMapTy &OperandInfoIDs);
@@ -176,6 +184,105 @@ void InstrInfoEmitter::EmitOperandInfo(raw_ostream &OS,
   }
 }
 
+/// Generate a table and function for looking up the indices of operands by
+/// name.
+///
+/// This code generates:
+/// - An enum in the llvm::TargetNamespace::OpName namespace, with one entry
+///   for each operand name.
+/// - A 2-dimensional table called OperandMap for mapping OpName enum values to
+///   operand indices.
+/// - A function called getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIdx)
+///   for looking up the operand index for an instruction, given a vlue from
+///   OpName enum
+void InstrInfoEmitter::emitOperandNameMappings(raw_ostream &OS,
+           const CodeGenTarget &Target,
+           const std::vector<const CodeGenInstruction*> &NumberedInstructions) {
+
+  std::string Namespace = Target.getInstNamespace();
+  std::string OpNameNS = "OpName";
+  // Map of operand names to their enumeration value.  This will be used to
+  // generate the OpName enum.
+  std::map<std::string, unsigned> Operands;
+  OpNameMapTy OperandMap;
+
+  for (unsigned i = 0, e = NumberedInstructions.size(), NumOperands = 0;
+                                                                  i != e; ++i) {
+    const CodeGenInstruction *Inst = NumberedInstructions[i];
+    if (!Inst->TheDef->getValueAsBit("UseNamedOperandTable")) {
+      continue;
+    }
+    std::map<unsigned, unsigned> OpList;
+    for (unsigned j = 0, je = Inst->Operands.size(); j != je; ++j) {
+      CGIOperandList::OperandInfo Info = Inst->Operands[j];
+      std::string Name =  Info.Name;
+      if (Operands.count(Name) == 0) {
+        Operands[Name] = NumOperands++;
+      }
+      unsigned OperandId = Operands[Name];
+      OpList[OperandId] = Info.MIOperandNo;
+    }
+    OperandMap[OpList].push_back(Namespace + "::" + Inst->TheDef->getName());
+  }
+
+  OS << "#ifdef GET_INSTRINFO_OPERAND_ENUM\n";
+  OS << "#undef GET_INSTRINFO_OPERAND_ENUM\n";
+  OS << "namespace llvm {";
+  OS << "namespace " << Namespace << " {\n";
+  OS << "namespace " << OpNameNS << " { \n";
+  OS << "enum {\n";
+  for (std::map<std::string, unsigned>::iterator i = Operands.begin(),
+                                             e = Operands.end();
+                                             i != e; ++i) {
+    OS << "  " << i->first << " = " << i->second << ",\n";
+  }
+  OS << "OPERAND_LAST";
+  OS << "\n};\n";
+  OS << "} // End namespace OpName\n";
+  OS << "} // End namespace " << Namespace << "\n";
+  OS << "} // End namespace llvm\n";
+  OS << "#endif //GET_INSTRINFO_OPERAND_ENUM\n";
+
+  OS << "#ifdef GET_INSTRINFO_NAMED_OPS\n";
+  OS << "#undef GET_INSTRINFO_NAMED_OPS\n";
+  OS << "namespace llvm {";
+  OS << "namespace " << Namespace << " {\n";
+  OS << "int16_t getNamedOperandIdx(uint16_t Opcode, uint16_t NamedIdx) {\n";
+  OS << "  static const int16_t OperandMap []["<< Operands.size() << "] = {\n";
+  for (OpNameMapTy::iterator i = OperandMap.begin(), e = OperandMap.end();
+                                                     i != e; ++i) {
+    std::map<unsigned, unsigned> OpList = i->first;
+    OS << "{";
+
+    // Emit a row of the OperandMap table
+    for (unsigned i = 0, e = Operands.size(); i != e; ++i)
+      OS << (OpList.count(i) == 0 ? -1 : (int)OpList[i]) << ", ";
+
+    OS << "},\n";
+  }
+  OS << "};\n";
+
+  OS << "  switch(Opcode) {\n";
+  unsigned TableIndex = 0;
+  for (OpNameMapTy::iterator i = OperandMap.begin(), e = OperandMap.end();
+                                                     i != e; ++i, ++TableIndex) {
+    std::map<unsigned, unsigned> OpList = i->first;
+    std::vector<std::string> OpcodeList = i->second;
+
+    for (unsigned ii = 0, ie = OpcodeList.size(); ii != ie; ++ii)
+      OS << "  case " << OpcodeList[ii] << ":\n";
+
+    OS << "    return OperandMap[" << TableIndex << "][NamedIdx];\n";
+  }
+  OS << "    default: return -1;\n";
+  OS << "  }\n";
+  OS << "}\n";
+  OS << "} // End namespace " << Namespace << "\n";
+  OS << "} // End namespace llvm\n";
+  OS << "#endif //GET_INSTRINFO_NAMED_OPS\n";
+
+}
+
 //===----------------------------------------------------------------------===//
 // Main Output.
 //===----------------------------------------------------------------------===//
@@ -293,6 +400,8 @@ void InstrInfoEmitter::run(raw_ostream &OS) {
   OS << "} // End llvm namespace \n";
 
   OS << "#endif // GET_INSTRINFO_CTOR\n\n";
+
+  emitOperandNameMappings(OS, Target, NumberedInstructions);
 }
 
 void InstrInfoEmitter::emitRecord(const CodeGenInstruction &Inst, unsigned Num,
-- 
1.7.11.4