[PATCH] D43962: [GlobalISel][utils] Adding the init version of Instruction Select Testgen

Daniel Sanders via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Apr 18 15:27:37 PDT 2018


dsanders added a comment.

Aside from being a large patch with few function-level comments, some of the non-functional changes also make this difficult to review. It would be helpful to move things like the indentation correction on testImm*(), the introduction of buildTable and getMatchTable(), the changes to coverage, moving the emission of selectImpl() down, etc. into a separate patch(es).

I think the overall approach of parsing the match table, emitting something that matches, and constructing some scaffolding around it is a good plan. We'll need to find a good way of handling immediates, C++ predicates, ComplexPattern and similar but that should definitely be left for later patches.

Some targets have a rather large number of rules (e.g. X86 is around 17k IIRC). Do we have a mechanism for keeping the number of generated tests to a reasonable level?



================
Comment at: include/llvm/CodeGen/GlobalISel/InstructionSelector.h:225
   /// - TempRegID - The temporary register ID to add
+  /// - TempRegFlags - The register flags to set
   GIR_AddTempRegister,
----------------
Is this comment accurate? GIR_AddRegister is listed as having 2 operands and I only see 2 in my build


================
Comment at: include/llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h:26-28
+namespace {
+class LiveInRA;
+} // end anonymous namespace
----------------
This one probably isn't harmful since InstructionSelectorTestgen.h isn't going to be widely included but we ought to avoid anonymous namespaces in headers since each compilation unit that includes the header will get it's own version of the declaration. It's probably best to put it in the llvm namespace if we can't push it into the cpp entirely.

InstructionSelectorTestgen doesn't seem to have any state so it looks like generateTestCase() could be a static function outside the class


================
Comment at: include/llvm/CodeGen/GlobalISel/InstructionSelectorTestgen.h:32-37
+extern cl::opt<unsigned> TestgenFromRule;
+extern cl::opt<unsigned> TestgenUntilRule;
+extern cl::list<unsigned> TestgenExcludeRules;
+extern cl::list<unsigned> TestgenIncludeOnly;
+extern cl::opt<bool> TestgenSetAllFeatures;
+extern cl::opt<bool> TestgenNoABI;
----------------
Are all of these really needed in the header? Most seem to only be used from InstructionSelectorTestgen.cpp


================
Comment at: include/llvm/Support/CodeGenCoverage.h:26
 public:
+  using covered_iterator = BitVector::const_set_bits_iterator;
+
----------------
Just a small nit: we should probably have 'const' somewhere in the 'covered_iterator' name.


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectTestgen.cpp:45-47
+  for (const Function &F : M.functions())
+    if (const MachineFunction *MF = MMI.getMachineFunction(F))
+      MF->verify(this, "Pre-existing function", /*AbortOnErrors*/ true);
----------------
Could you add a comment indicating that verification is the only thing we do with these functions and why?


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:58
+MachineIRBuilder
+InstructionSelectorTestgen::createEmptyTestCase(StringRef Name, Module &M,
+                                                MachineModuleInfo &MMI) {
----------------
If the function already exists somehow then this might not return an empty test case due to the getOrInsertFunction() we should probably fail in that case


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:102
+    return PointerType::get(Type::getInt32Ty(Context), LLTy.getAddressSpace());
+  return Type::getIntNTy(Context, LLTy.getSizeInBits());
+}
----------------
LLT's scalars aren't always integers. So long as we inject bitcasts this will be fine though


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:123
+namespace {
+class LiveInRA {
+public:
----------------
If I understand correctly, this is a register allocator that is used to generate the live-ins for the test function. This needs some explaining in the comments and possibly also renaming (I don't have a good spelling for it, maybe InputRegAllocator)


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:273-308
+  auto selectRegBank = [&MF, &Sizes, &Size2RegBanks, &Hist](unsigned Reg) {
+    MachineRegisterInfo &MRI = MF.getRegInfo();
+    if (MRI.getRegBankOrNull(Reg))
+      return;
+    const LLT LLTy = MRI.getType(Reg);
+    unsigned Size = LLTy.getSizeInBits();
+    const auto SizeI = std::lower_bound(Sizes.begin(), Sizes.end(), Size);
----------------
This lambda is getting pretty big. I'd be inclined to make it a static function in its own right


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:282
+      dbgs() << "- Didn't find a register bank containing allocatable\n"
+                "  registers of the required by LLT " << LLTy << " size of\n"
+                "  " << Size << " bits or larger for an unconstrained vreg %"
----------------
I think a word is missing from "of the required by"


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:306
+    // out of all banks, but only out of a subset of them, update the stats so
+    // we could greadily select the locally best result for the next vreg:
+    ++Hist[RB];
----------------
greadily -> greedily


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:343-345
+        dbgs() << "- Undefined virtual register %"
+               << TargetRegisterInfo::virtReg2Index(Reg)
+               << " doesn't have a reg bank\n";
----------------
As a general thing: Writes to dbgs() should generally be wrapped in DEBUG(...) or DEBUG_ONLY(...) so that we don't format strings we're not going to emit.


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:353
+               << TargetRegisterInfo::virtReg2Index(Reg) << "\n";
+        MIRBuilder.buildInstr(TargetOpcode::IMPLICIT_DEF).addDef(Reg);
+      } else if (TestgenNoABI)
----------------
I'm not sure IMPLICIT_DEF is the right thing to use for these if-statements. IMPLICIT_DEF is a definition with an unknown value (much like UNDEF) so it wouldn't be wrong to propagate it in something like
   %0 = IMPLICIT_DEF
   %2 = G_ADD %0, %1
to:
  %2 = IMPLICIT_DEF
a COPY of a live-in phys-reg would be safer but it's probably ok since constant propagation isn't ISel's job. I think this would only become an issue if we started porting this to combiners.


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:366
+                                uint64_t CurrentIdx) {
+  static constexpr unsigned NumberOfOperands[GIU_NumberOfOpcodes] = {
+      1 /*GIM_Try*/,
----------------
This table is going to be quite fragile. We should at put this somewhere near the GIM_*/GIR_* declarations or at least cross-reference them in the comments. Tablegen-erating them might be sensible if we get additional metadata.


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:421
+
+static MachineInstrBuilder &ensureNumOperands(MachineInstrBuilder &InsnB,
+                                              int64_t NumOperands,
----------------
If we have a rule with variadic instructions, what do we do about the number of defs?


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:468
+
+  if (InsnB->getNumOperands() == OpIdx)
+    OperandAdder(InsnB);
----------------
Is this ever false? ensureNumOperands() looks like it adds operands until this is true


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:484
+
+static const LLT DummyLLT = LLT::scalar(32);
+// Assuming that all the parts of the MatchTable that don't affect the selection
----------------
What is this for?


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:485-488
+// Assuming that all the parts of the MatchTable that don't affect the selection
+// process but only identify a rule, like GIR_Coverage opcodes, come within a
+// rule as a single continuous block, so the meaningful parts of the rule could
+// be represented as some prefix and suffix:
----------------
This comment explains the reasons behind something but doesn't really explain what that something is


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:497
+  int64_t FirstRuleID = -1;
+  bool CoverageBlockPassed = false;
+  unsigned NestingLevel = 0;
----------------
I don't think I understand this variable. What does it represent?


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:502
+    case GIM_Try: {
+      CurrentIdx++;
+      if (++NestingLevel > 1) {
----------------
This should probably indicate what is being skipped


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:503
+      CurrentIdx++;
+      if (++NestingLevel > 1) {
+#ifdef NDEBUG
----------------
In some ways it would be nice to support this (e.g. to check the tests are the same) but I agree it's way too big a task for a first patch.


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:504-510
+#ifdef NDEBUG
+        dbgs()
+            << "Testgen doesn't support non-linear or optimized Match Tables\n";
+        exit(1);
+#endif
+        assert(NestingLevel == 1 &&
+               "Testgen doesn't support non-linear or optimized Match Tables");
----------------
Doesn't NDEBUG also disable the dbgs() stream?

That assert can't succeed (NestingLevel > 1 vs NestingLevel == 1). It looks like this ought to be report_fatal_error() or similar


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:596
+          ensureNumOperands(InsnBuilders[InsnID], OpIdx + 1)->getOperand(OpIdx);
+      unsigned Reg = Use.isReg() && Use.getReg() ? Use.getReg() : Def.getReg();
+      Reg = Reg ? Reg : MRI.createGenericVirtualRegister(DummyLLT);
----------------
I don't think I understand the Def.getReg() part of this. Both the Use and the Def must be the same register so I'd expect to always use one or the other for all cases.


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:680-681
+      if (!Testgen.checkFeatures(ExpectedBitsetID)) {
+        dbgs() << "! Testgen deficiency: Don't know how to satisfy\n"
+               << "  Module & Function Target Features yet!\n";
+        // This test definitely won't match the rule being tested and might not
----------------
Function-level features can be handled by listing them in the function attributes:
  attributes #0 = { "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" }
The tricky bit will be mapping the enum back to the feature name.

Module-level features are harder since they're only read once per module. You'd have to separate them into multiple files



================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:714
+      setOperand(InsnBuilders[InsnID], 1,
+                 [](MachineInstrBuilder &InsnB) { InsnB.addImm(1); });
+      break;
----------------
There's no guarantee that 1 will match. There's a few predicates that check they're multiples of N


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:760-783
+    case GIM_CheckIsSameOperand: {
+      int64_t InsnID = MatchTable[CurrentIdx++];
+      int64_t OpIdx = MatchTable[CurrentIdx++];
+      int64_t OtherInsnID = MatchTable[CurrentIdx++];
+      int64_t OtherOpIdx = MatchTable[CurrentIdx++];
+      dbgs() << "GIM_CheckIsSameOperand " << InsnID << ", " << OpIdx << ", "
+             << ", " << OtherInsnID << ", " << OtherOpIdx << "\n";
----------------
One thing to mention here is that if something changes the register number of OtherInsnB.getOperand(OtherOpIdx) after this opcode is processed then these might diverge. I think we're ok on that since setReg() doesn't seem to be called from latePass()


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:795
+            InsnB.addCImm(ConstantInt::get(MF.getFunction().getContext(),
+                                           APInt(LLTy.getSizeInBits(), 1)));
+          });
----------------
As with the other immediate case, '1' might not match all predicates


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:848
+  dbgs() << "** Selecting RegBanks for VRegs unconstrained by the rule\n";
+  selectUnconstrainedRegBanks(RA.getSize2RegBanks(), MF);
+  // selectUnconstrainedRegBanks relies on the input vregs being undefined,
----------------
I was confused by this function name at first. It sounded like the regbanks were unconstrained and that didn't make sense. I see it's about assigning regbanks to vregs that don't already check one. I haven't thought of a good spelling ('assignRegBanksToUnconstrainedVregs()' or 'ensureRegistersHaveBanks()' are a couple ideas) but we may be able to find a clearer name.


================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:883
+      const auto &Name = buildName(RuleNum, CurrentIdx, RuleID);
+      SmallVectorImpl<RuleSig> &Dups = RuleDuplicates[RuleBody];
+      Dups.emplace_back(RuleNum, CurrentIdx, RuleID);
----------------
We should have a couple examples of how duplicates can happen (e.g. i32 and f32 both mapping to s32)


================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:3983-3995
+  OS << "bool " << Target.getName()
+     << "InstructionSelector::selectImpl(MachineInstr &I, CodeGenCoverage "
+     << "&CoverageInfo) const {\n"
+     << "  MachineFunction &MF = *I.getParent()->getParent();\n"
+     << "  MachineRegisterInfo &MRI = MF.getRegInfo();\n"
+     << "  // FIXME: This should be computed on a per-function basis rather "
+     << "than per-insn.\n"
----------------
It seems that this block has been moved down from the other side of the sort. Non-functional changes like this ought to be in separate patches


================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:4005
+     << "namespace {\n"
+     << "class " << Target.getName() << "InstructionSelectorTestgen\n"
+     << "    : public llvm::InstructionSelectorTestgen {\n"
----------------
This definition should probably be wrapped in a #ifdef


================
Comment at: utils/update_instruction_select_testgen_tests.sh:39-42
+    | perl -pe 's/(test_rule\d+)(_id\d+)?_at_idx\d+/\1/' \
+    | perl -pe 's/^(name:\s*test_rule\d+)/# TESTGEND-LABEL: \1/' \
+    | perl -pe 's/^([^#\n])/# TESTGEND:       \1/' \
+    | perl -pe 's/^\n$/#\n/' >> "$testgend"
----------------
The 'perl' commands might be an issue for the windows bots.


Repository:
  rL LLVM

https://reviews.llvm.org/D43962





More information about the llvm-commits mailing list