[llvm] 5351878 - [TableGen] Add useDeprecatedPositionallyEncodedOperands option.

Sat Sep 24 06:41:33 PDT 2022

Author: James Y Knight
Date: 2022-09-24T09:40:45-04:00
New Revision: 5351878ba1963a84600df3a9e907b458b0529851

URL: https://github.com/llvm/llvm-project/commit/5351878ba1963a84600df3a9e907b458b0529851
DIFF: https://github.com/llvm/llvm-project/commit/5351878ba1963a84600df3a9e907b458b0529851.diff

LOG: [TableGen] Add useDeprecatedPositionallyEncodedOperands option.

Summary:
The existing undefined-bitfield-to-operand matching behavior is very
hard to understand, due to the combination of positional and named
matching. This can make it difficult to track down a bug in a target's
instruction definitions.

Over the last decade, folks have tried to work-around this in various
ways, but it's time to finally ditch the positional matching. With
https://reviews.llvm.org/D131003, there are no longer cases that
_require_ positional matching, and it's time to start removing usage
and support for it.

Therefore: add a (default-false) option, and set it to true only in
those targets that require positional matching today. Subsequent
changes will start cleaning up additional in-tree targets.

NOTE TO OUT OF TREE TARGET MAINTAINERS:

If this change breaks your build, you may restore the previous
behavior simply by adding:
  let useDeprecatedPositionallyEncodedOperands = 1;
to your target's InstrInfo tablegen definition. However, this is
temporary -- the option will be removed in the future.

If your target does not set 'decodePositionallyEncodedOperands', you
may thus start migrating to named operands. However, if you _do_
currently set that option, I recommend waiting until a subsequent
change lands, which adds decoder support for named sub-operands.

Differential Revision: https://reviews.llvm.org/D134073

Added: 
    llvm/test/TableGen/MissingOperandField.td

Modified: 
    llvm/include/llvm/Target/Target.td
    llvm/lib/Target/AMDGPU/AMDGPU.td
    llvm/lib/Target/AMDGPU/R600.td
    llvm/lib/Target/AVR/AVR.td
    llvm/lib/Target/Lanai/Lanai.td
    llvm/lib/Target/Mips/Mips.td
    llvm/lib/Target/PowerPC/PPC.td
    llvm/lib/Target/Sparc/Sparc.td
    llvm/lib/Target/VE/VE.td
    llvm/test/TableGen/InsufficientPositionalOperands.td
    llvm/utils/TableGen/CodeEmitterGen.cpp
    llvm/utils/TableGen/DecoderEmitter.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/include/llvm/Target/Target.td b/llvm/include/llvm/Target/Target.td
index 55d9ff7b858c0..234bf8dad08f7 100644

--- a/llvm/include/llvm/Target/Target.td
+++ b/llvm/include/llvm/Target/Target.td
@@ -1048,24 +1048,42 @@ class InstrInfo {
   bit guessInstructionProperties = true;
 
   // TableGen's instruction encoder generator has support for matching operands
-  // to bit-field variables both by name and by position. While matching by
-  // name is preferred, this is currently not possible for complex operands,
-  // and some targets still reply on the positional encoding rules. When
-  // generating a decoder for such targets, the positional encoding rules must
-  // be used by the decoder generator as well.
+  // to bit-field variables both by name and by position. Support for matching
+  // by position is DEPRECATED, and WILL BE REMOVED. Positional matching is
+  // confusing to use, and makes it very easy to accidentally write buggy
+  // instruction definitions.
   //
-  // This option is temporary; it will go away once the TableGen decoder
-  // generator has better support for complex operands and targets have
-  // migrated away from using positionally encoded operands.
+  // In previous versions of LLVM, the ability to match operands by position was
+  // enabled unconditionally. It is now controllable by this option -- and
+  // disabled by default. The previous behavior can be restored by setting this
+  // option to true.
+  //
+  // This option is temporary, and will go away once all in-tree targets have
+  // migrated.
+  //
+  // TODO: clean up and remove these options.
+  bit useDeprecatedPositionallyEncodedOperands = false;
+
+  // If positional encoding rules are used for the encoder generator, they may
+  // also need to be used by the decoder generator -- if so, enable this
+  // variable.
+  //
+  // This option is a no-op unless useDeprecatedPositionallyEncodedOperands is
+  // true.
+  //
+  // This option is temporary, and will go away once all in-tree targets have
+  // migrated.
   bit decodePositionallyEncodedOperands = false;
 
   // When set, this indicates that there will be no overlap between those
   // operands that are matched by ordering (positional operands) and those
   // matched by name.
   //
-  // This option is temporary; it will go away once the TableGen decoder
-  // generator has better support for complex operands and targets have
-  // migrated away from using positionally encoded operands.
+  // This is a no-op unless useDeprecatedPositionallyEncodedOperands is true
+  // (though it does modify the "would've used positional operand XXX" error.)
+  //
+  // This option is temporary, and will go away once all in-tree targets have
+  // migrated.
   bit noNamedPositionallyEncodedOperands = false;
 }
 

diff  --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 2f3213aaea1f3..d85b79fc78be7 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -1323,6 +1323,7 @@ def FeatureISAVersion11_0_3 : FeatureSet<
 def AMDGPUInstrInfo : InstrInfo {
   let guessInstructionProperties = 1;
   let noNamedPositionallyEncodedOperands = 1;
+  let useDeprecatedPositionallyEncodedOperands = 1;
 }
 
 def AMDGPUAsmParser : AsmParser {

diff  --git a/llvm/lib/Target/AMDGPU/R600.td b/llvm/lib/Target/AMDGPU/R600.td
index 45bc955d4f4c2..80ad2706620a9 100644
--- a/llvm/lib/Target/AMDGPU/R600.td
+++ b/llvm/lib/Target/AMDGPU/R600.td
@@ -11,6 +11,7 @@ include "llvm/Target/Target.td"
 def R600InstrInfo : InstrInfo {
   let guessInstructionProperties = 1;
   let noNamedPositionallyEncodedOperands = 1;
+  let useDeprecatedPositionallyEncodedOperands = 1;
 }
 
 def R600 : Target {

diff  --git a/llvm/lib/Target/AVR/AVR.td b/llvm/lib/Target/AVR/AVR.td
index 22ffc4a368ad6..0e4714951ff41 100644
--- a/llvm/lib/Target/AVR/AVR.td
+++ b/llvm/lib/Target/AVR/AVR.td
@@ -32,7 +32,9 @@ include "AVRRegisterInfo.td"
 
 include "AVRInstrInfo.td"
 
-def AVRInstrInfo : InstrInfo;
+def AVRInstrInfo : InstrInfo {
+  let useDeprecatedPositionallyEncodedOperands = true;
+}
 
 //===---------------------------------------------------------------------===//
 // Calling Conventions

diff  --git a/llvm/lib/Target/Lanai/Lanai.td b/llvm/lib/Target/Lanai/Lanai.td
index c6d949f42047e..d2780177a05d2 100644
--- a/llvm/lib/Target/Lanai/Lanai.td
+++ b/llvm/lib/Target/Lanai/Lanai.td
@@ -21,7 +21,9 @@ include "LanaiRegisterInfo.td"
 include "LanaiCallingConv.td"
 include "LanaiInstrInfo.td"
 
-def LanaiInstrInfo : InstrInfo;
+def LanaiInstrInfo : InstrInfo {
+  let useDeprecatedPositionallyEncodedOperands = 1;
+}
 
 //===----------------------------------------------------------------------===//
 // Lanai processors supported.

diff  --git a/llvm/lib/Target/Mips/Mips.td b/llvm/lib/Target/Mips/Mips.td
index 398c38e678ba3..2d47e6194de5a 100644
--- a/llvm/lib/Target/Mips/Mips.td
+++ b/llvm/lib/Target/Mips/Mips.td
@@ -223,7 +223,9 @@ include "MipsCombine.td"
 include "MipsScheduleP5600.td"
 include "MipsScheduleGeneric.td"
 
-def MipsInstrInfo : InstrInfo;
+def MipsInstrInfo : InstrInfo {
+ let useDeprecatedPositionallyEncodedOperands = 1;
+}
 
 //===----------------------------------------------------------------------===//
 // Mips processors supported.

diff  --git a/llvm/lib/Target/PowerPC/PPC.td b/llvm/lib/Target/PowerPC/PPC.td
index 310bf8125f1c3..c231bb4069d23 100644
--- a/llvm/lib/Target/PowerPC/PPC.td
+++ b/llvm/lib/Target/PowerPC/PPC.td
@@ -664,6 +664,7 @@ def PPCInstrInfo : InstrInfo {
   let decodePositionallyEncodedOperands = 1;
 
   let noNamedPositionallyEncodedOperands = 1;
+  let useDeprecatedPositionallyEncodedOperands = 1;
 }
 
 def PPCAsmWriter : AsmWriter {

diff  --git a/llvm/lib/Target/Sparc/Sparc.td b/llvm/lib/Target/Sparc/Sparc.td
index da95602309a1e..b1c349519c633 100644
--- a/llvm/lib/Target/Sparc/Sparc.td
+++ b/llvm/lib/Target/Sparc/Sparc.td
@@ -74,7 +74,9 @@ include "SparcCallingConv.td"
 include "SparcSchedule.td"
 include "SparcInstrInfo.td"
 
-def SparcInstrInfo : InstrInfo;
+def SparcInstrInfo : InstrInfo {
+  let useDeprecatedPositionallyEncodedOperands = 1;
+}
 
 def SparcAsmParser : AsmParser {
   bit ShouldEmitMatchRegisterName = 0;

diff  --git a/llvm/lib/Target/VE/VE.td b/llvm/lib/Target/VE/VE.td
index 9e8adcd42077d..16d6c36ee4ab6 100644
--- a/llvm/lib/Target/VE/VE.td
+++ b/llvm/lib/Target/VE/VE.td
@@ -30,7 +30,9 @@ include "VERegisterInfo.td"
 include "VECallingConv.td"
 include "VEInstrInfo.td"
 
-def VEInstrInfo : InstrInfo;
+def VEInstrInfo : InstrInfo {
+  let useDeprecatedPositionallyEncodedOperands = 1;
+}
 
 def VEAsmParser : AsmParser {
   // Use both VE register name matcher to accept "S0~S63" register names

diff  --git a/llvm/test/TableGen/InsufficientPositionalOperands.td b/llvm/test/TableGen/InsufficientPositionalOperands.td
index 5927bf9ce4404..131c0861ad3c5 100644
--- a/llvm/test/TableGen/InsufficientPositionalOperands.td
+++ b/llvm/test/TableGen/InsufficientPositionalOperands.td
@@ -1,11 +1,15 @@
-// RUN: not llvm-tblgen -gen-emitter -I %p/../../include %s 2>&1 | FileCheck %s
+// RUN: not llvm-tblgen -gen-emitter -I %p/../../include %s 2>&1 | FileCheck %s --implicit-check-not=error:
 
 // Check that TableGen doesn't crash on insufficient positional
 // instruction operands.
 
+// TODO: remove this test when removing useDeprecatedPositionallyEncodedOperands
+
 include "llvm/Target/Target.td"
 
-def ArchInstrInfo : InstrInfo { }
+def ArchInstrInfo : InstrInfo {
+  let useDeprecatedPositionallyEncodedOperands = 1;
+}
 
 def Arch : Target {
   let InstructionSet = ArchInstrInfo;
@@ -15,6 +19,8 @@ def Reg : Register<"reg">;
 
 def Regs : RegisterClass<"foo", [i32], 0, (add Reg)>;
 
+// CHECK: error: Too few operands in record foo (no match for variable rs)
+// CHECK: note: Dumping record for previous error:
 def foo : Instruction {
   bits<3> rd;
   bits<3> rs;
@@ -24,7 +30,6 @@ def foo : Instruction {
   let Inst{4-2} = rd;
   let Inst{7-5} = rs;
 
-// CHECK: Too few operands in record foo (no match for variable rs)
   let OutOperandList = (outs Regs:$xd);
   let InOperandList = (ins);
 }

diff  --git a/llvm/test/TableGen/MissingOperandField.td b/llvm/test/TableGen/MissingOperandField.td
new file mode 100644
index 0000000000000..3b550813540a6
--- /dev/null
+++ b/llvm/test/TableGen/MissingOperandField.td
@@ -0,0 +1,32 @@
+// RUN: not llvm-tblgen -gen-emitter -I %p/../../include %s 2>&1 | FileCheck %s --implicit-check-not=error:
+
+// Check that we emit reasonable diagnostics when fields do not have
+// corresponding operands.
+
+include "llvm/Target/Target.td"
+
+def ArchInstrInfo : InstrInfo { }
+
+def Arch : Target {
+  let InstructionSet = ArchInstrInfo;
+}
+
+def Reg : Register<"reg">;
+
+def Regs : RegisterClass<"foo", [i32], 0, (add Reg)>;
+
+// CHECK: error: No operand named rd in record foo (would've used positional operand #0 ('xd') sub-op #0 with useDeprecatedPositionallyEncodedOperands=true)
+// CHECK: error: No operand named rs in record foo (would've given 'too few operands' error with useDeprecatedPositionallyEncodedOperands=true)
+// CHECK: note: Dumping record for previous error:
+def foo : Instruction {
+  bits<3> rd;
+  bits<3> rs;
+
+  bits<8> Inst;
+  let Inst{1-0} = 0;
+  let Inst{4-2} = rd;
+  let Inst{7-5} = rs;
+
+  let OutOperandList = (outs Regs:$xd);
+  let InOperandList = (ins);
+}

diff  --git a/llvm/utils/TableGen/CodeEmitterGen.cpp b/llvm/utils/TableGen/CodeEmitterGen.cpp
index 31b2eb95f58ee..f999e14aa76bd 100644
--- a/llvm/utils/TableGen/CodeEmitterGen.cpp
+++ b/llvm/utils/TableGen/CodeEmitterGen.cpp
@@ -50,9 +50,8 @@ class CodeEmitterGen {
   std::string getInstructionCase(Record *R, CodeGenTarget &Target);
   std::string getInstructionCaseForEncoding(Record *R, Record *EncodingDef,
                                             CodeGenTarget &Target);
-  void AddCodeToMergeInOperand(Record *R, BitsInit *BI,
-                               const std::string &VarName,
-                               unsigned &NumberedOp,
+  bool addCodeToMergeInOperand(Record *R, BitsInit *BI,
+                               const std::string &VarName, unsigned &NumberedOp,
                                std::set<unsigned> &NamedOpIndices,
                                std::string &Case, CodeGenTarget &Target);
 
@@ -79,11 +78,13 @@ int CodeEmitterGen::getVariableBit(const std::string &VarName,
   return -1;
 }
 
-void CodeEmitterGen::
-AddCodeToMergeInOperand(Record *R, BitsInit *BI, const std::string &VarName,
-                        unsigned &NumberedOp,
-                        std::set<unsigned> &NamedOpIndices,
-                        std::string &Case, CodeGenTarget &Target) {
+// Returns true if it succeeds, false if an error.
+bool CodeEmitterGen::addCodeToMergeInOperand(Record *R, BitsInit *BI,
+                                             const std::string &VarName,
+                                             unsigned &NumberedOp,
+                                             std::set<unsigned> &NamedOpIndices,
+                                             std::string &Case,
+                                             CodeGenTarget &Target) {
   CodeGenInstruction &CGI = Target.getInstruction(R);
 
   // Determine if VarName actually contributes to the Inst encoding.
@@ -99,8 +100,9 @@ AddCodeToMergeInOperand(Record *R, BitsInit *BI, const std::string &VarName,
   
   // If we found no bits, ignore this value, otherwise emit the call to get the
   // operand encoding.
-  if (bit < 0) return;
-  
+  if (bit < 0)
+    return true;
+
   // If the operand matches by name, reference according to that
   // operand number. Non-matching operands are assumed to be in
   // order.
@@ -114,6 +116,13 @@ AddCodeToMergeInOperand(Record *R, BitsInit *BI, const std::string &VarName,
     assert(!CGI.Operands.isFlatOperandNotEmitted(OpIdx) &&
            "Explicitly used operand also marked as not emitted!");
   } else {
+    // Fall back to positional lookup. By default, we now disable positional
+    // lookup (and print an error, below), but even so, we'll do the lookup to
+    // help print a helpful diagnostic message.
+    //
+    // TODO: When we remove useDeprecatedPositionallyEncodedOperands, delete all
+    // this code, just leaving a "no operand named X in record Y" error.
+
     unsigned NumberOps = CGI.Operands.size();
     /// If this operand is not supposed to be emitted by the
     /// generated emitter, skip it.
@@ -126,15 +135,33 @@ AddCodeToMergeInOperand(Record *R, BitsInit *BI, const std::string &VarName,
 
     if (NumberedOp >=
         CGI.Operands.back().MIOperandNo + CGI.Operands.back().MINumOperands) {
-      std::string E;
-      raw_string_ostream S(E);
-      S << "Too few operands in record " << R->getName()
-        << " (no match for variable " << VarName << "):\n";
-      S << *R;
-      PrintFatalError(R, E);
+      if (!Target.getInstructionSet()->getValueAsBit(
+              "useDeprecatedPositionallyEncodedOperands")) {
+        PrintError(R, Twine("No operand named ") + VarName + " in record " +
+                          R->getName() +
+                          " (would've given 'too few operands' error with "
+                          "useDeprecatedPositionallyEncodedOperands=true)");
+      } else {
+        PrintError(R, "Too few operands in record " + R->getName() +
+                          " (no match for variable " + VarName + ")");
+      }
+      return false;
     }
 
     OpIdx = NumberedOp++;
+
+    if (!Target.getInstructionSet()->getValueAsBit(
+            "useDeprecatedPositionallyEncodedOperands")) {
+      std::pair<unsigned, unsigned> SO =
+          CGI.Operands.getSubOperandNumber(OpIdx);
+      std::string OpName = CGI.Operands[SO.first].Name;
+      PrintError(R, Twine("No operand named ") + VarName + " in record " +
+                        R->getName() + " (would've used positional operand #" +
+                        Twine(SO.first) + " ('" + OpName + "') sub-op #" +
+                        Twine(SO.second) +
+                        " with useDeprecatedPositionallyEncodedOperands=true)");
+      return false;
+    }
   }
 
   std::pair<unsigned, unsigned> SO = CGI.Operands.getSubOperandNumber(OpIdx);
@@ -266,6 +293,7 @@ AddCodeToMergeInOperand(Record *R, BitsInit *BI, const std::string &VarName,
       }
     }
   }
+  return true;
 }
 
 std::string CodeEmitterGen::getInstructionCase(Record *R,
@@ -313,14 +341,25 @@ std::string CodeEmitterGen::getInstructionCaseForEncoding(Record *R, Record *Enc
 
   // Loop over all of the fields in the instruction, determining which are the
   // operands to the instruction.
+  bool Success = true;
   for (const RecordVal &RV : EncodingDef->getValues()) {
     // Ignore fixed fields in the record, we're looking for values like:
     //    bits<5> RST = { ?, ?, ?, ?, ? };
     if (RV.isNonconcreteOK() || RV.getValue()->isComplete())
       continue;
 
-    AddCodeToMergeInOperand(R, BI, std::string(RV.getName()), NumberedOp,
-                            NamedOpIndices, Case, Target);
+    Success &=
+        addCodeToMergeInOperand(R, BI, std::string(RV.getName()), NumberedOp,
+                                NamedOpIndices, Case, Target);
+  }
+
+  if (!Success) {
+    // Dump the record, so we can see what's going on...
+    std::string E;
+    raw_string_ostream S(E);
+    S << "Dumping record for previous error:\n";
+    S << *R;
+    PrintNote(E);
   }
 
   StringRef PostEmitter = R->getValueAsString("PostEncoderMethod");

diff  --git a/llvm/utils/TableGen/DecoderEmitter.cpp b/llvm/utils/TableGen/DecoderEmitter.cpp
index d75ee3492539e..0e2b106279997 100644
--- a/llvm/utils/TableGen/DecoderEmitter.cpp
+++ b/llvm/utils/TableGen/DecoderEmitter.cpp
@@ -1982,8 +1982,12 @@ populateInstruction(CodeGenTarget &Target, const Record &EncodingDef,
   } else {
     std::map<std::string, std::vector<OperandInfo>> NumberedInsnOperands;
     std::set<std::string> NumberedInsnOperandsNoTie;
-    if (Target.getInstructionSet()->getValueAsBit(
-            "decodePositionallyEncodedOperands")) {
+    bool SupportPositionalDecoding =
+        Target.getInstructionSet()->getValueAsBit(
+            "useDeprecatedPositionallyEncodedOperands") &&
+        Target.getInstructionSet()->getValueAsBit(
+            "decodePositionallyEncodedOperands");
+    if (SupportPositionalDecoding) {
       const std::vector<RecordVal> &Vals = Def.getValues();
       unsigned NumberedOp = 0;
 
@@ -2127,33 +2131,35 @@ populateInstruction(CodeGenTarget &Target, const Record &EncodingDef,
 
     // For each operand, see if we can figure out where it is encoded.
     for (const auto &Op : InOutOperands) {
-      if (!NumberedInsnOperands[std::string(Op.second)].empty()) {
-        llvm::append_range(InsnOperands,
-                           NumberedInsnOperands[std::string(Op.second)]);
-        continue;
-      }
-      if (!NumberedInsnOperands[TiedNames[std::string(Op.second)]].empty()) {
-        if (!NumberedInsnOperandsNoTie.count(
-                TiedNames[std::string(Op.second)])) {
-          // Figure out to which (sub)operand we're tied.
-          unsigned i =
-              CGI.Operands.getOperandNamed(TiedNames[std::string(Op.second)]);
-          int tiedTo = CGI.Operands[i].getTiedRegister();
-          if (tiedTo == -1) {
-            i = CGI.Operands.getOperandNamed(Op.second);
-            tiedTo = CGI.Operands[i].getTiedRegister();
-          }
-
-          if (tiedTo != -1) {
-            std::pair<unsigned, unsigned> SO =
-                CGI.Operands.getSubOperandNumber(tiedTo);
-
-            InsnOperands.push_back(
-                NumberedInsnOperands[TiedNames[std::string(Op.second)]]
-                                    [SO.second]);
+      if (SupportPositionalDecoding) {
+        if (!NumberedInsnOperands[std::string(Op.second)].empty()) {
+          llvm::append_range(InsnOperands,
+                             NumberedInsnOperands[std::string(Op.second)]);
+          continue;
+        }
+        if (!NumberedInsnOperands[TiedNames[std::string(Op.second)]].empty()) {
+          if (!NumberedInsnOperandsNoTie.count(
+                  TiedNames[std::string(Op.second)])) {
+            // Figure out to which (sub)operand we're tied.
+            unsigned i =
+                CGI.Operands.getOperandNamed(TiedNames[std::string(Op.second)]);
+            int tiedTo = CGI.Operands[i].getTiedRegister();
+            if (tiedTo == -1) {
+              i = CGI.Operands.getOperandNamed(Op.second);
+              tiedTo = CGI.Operands[i].getTiedRegister();
+            }
+
+            if (tiedTo != -1) {
+              std::pair<unsigned, unsigned> SO =
+                  CGI.Operands.getSubOperandNumber(tiedTo);
+
+              InsnOperands.push_back(
+                  NumberedInsnOperands[TiedNames[std::string(Op.second)]]
+                                      [SO.second]);
+            }
           }
+          continue;
         }
-        continue;
       }
 
       // At this point, we can locate the decoder field, but we need to know how