[PATCH] D43962: [GlobalISel][utils] Adding the init version of Instruction Select Testgen

Thu Apr 26 18:47:12 PDT 2018

rtereshin marked 31 inline comments as done.
rtereshin added inline comments.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:343-345
+        dbgs() << "- Undefined virtual register %"
+               << TargetRegisterInfo::virtReg2Index(Reg)
+               << " doesn't have a reg bank\n";
----------------
dsanders wrote:
> As a general thing: Writes to dbgs() should generally be wrapped in DEBUG(...) or DEBUG_ONLY(...) so that we don't format strings we're not going to emit.
Done.

> so that we don't format strings we're not going to emit.

w/o DEBUG macro we actually emit all of this in assert and release builds likewise.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:421
+
+static MachineInstrBuilder &ensureNumOperands(MachineInstrBuilder &InsnB,
+                                              int64_t NumOperands,
----------------
dsanders wrote:
> If we have a rule with variadic instructions, what do we do about the number of defs?
That's a very good question, thanks!

I suppose, I will have to replace `NumDefs = InsnB->getDesc().NumDefs;` line below with `NumDefs = std::max(InsnB->getDesc().NumDefs, InsnB->getNumExplicitDefs());` as soon as we get `MachineInstr::getNumExplicitDefs()` merged in ;-) (https://reviews.llvm.org/D45640).

It will help, but won't solve the problem.

It's not a problem for now, though, as `InstructionSelector` can't really handle those either. For instance, record instruction opcode clearly assumes that the definition is always the operand 0.

When it does support the case, however, most likely it won't be doing that by checking how many definitions an instruction has, it will most likely just rely on MIR being valid.

That naturally assumes machine verifier can check this stuff. So I guess at some point machine verifier will special case instructions like `G_UNMERGE_VALUES`. We can implement that check in the machine verifier as `unsigned getNumExplicitDefsExpected(const MachineInstr &I)` refactored out and then checking if the actual `I` has the number of defs expected.

And then we can reuse `getNumExplicitDefsExpected` right here in Testgen.

If the generic opcode has a very flexible number of defs, as in, it's not derivable from the number of operands and their types (do we even have these?), it will probably be still all right, we just see the highest operand index that the pattern explicitly requires to be a definition (via record instruction opcodes, for instance), and we say that that operand is the last def (and of course every operand with a lower index is also a def). And that should produce a) valid MIR, we started by saying this mysterious opcode is very flexible with defs b) MIR that could be matched by the pattern, and it's all we care about.

Not to mention, if we end up having generic opcodes like this - with non-derivable number of defs and the number of defs having an impact on the instructions' semantics - we will end up having a match table opcode checking the number of defs explicitly.

But again, it's not a problem for now.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:485-488
+// Assuming that all the parts of the MatchTable that don't affect the selection
+// process but only identify a rule, like GIR_Coverage opcodes, come within a
+// rule as a single continuous block, so the meaningful parts of the rule could
+// be represented as some prefix and suffix:
----------------
dsanders wrote:
> This comment explains the reasons behind something but doesn't really explain what that something is
I'm replacing the comment with the following:

```
// Raw representation of a single match table rule as an ordered union of
// several continuous regions of the match table. The representation tries its
// best to ignore parts of the table that don't affect semantics of a single
// rule in isolation, like labels and rule IDs.
//
// Assuming that all the parts of the MatchTable that don't affect the
// selection process but only identify a rule, like GIR_Coverage opcodes, come
// within a rule as a single continuous block, the meaningful parts of the
// rule could be represented as some prefix (starting from the first
// non-control flow opcode, in other words, skipping GIM_Try and its
// label-operand) and suffix:
using RuleBodyTy = std::pair<ArrayRef<int64_t>, ArrayRef<int64_t>>;
```

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:497
+  int64_t FirstRuleID = -1;
+  bool CoverageBlockPassed = false;
+  unsigned NestingLevel = 0;
----------------
dsanders wrote:
> I don't think I understand this variable. What does it represent?
I'm renaming the variable from `CoverageBlockPassed ` to `ExcludedRegionPassed` and adding a bunch of comments that should make it clearer, like this:
```
// Get a rule descriptor, containing the index of its GIR_Done opcode, RuleID,
// and a raw representation of the entire body.
//
// \pre MatchTable is a non-optimized linear match table.
// \pre CurrentIdx points to the first (and only) GIM_Try opcode of the rule
//   that has all its semantically meaningless opcodes that to be excluded from
//   the body as a single continuous subregion somewhere.
// \post the prefix of the body is a range from the first opcode after the
//   initial GIM_Try until GIR_Coverage (or, more generally, the first
//   semantically meaningless opcode), the suffix is a range from the first
//   opcode after the last semantically meaningless opcode until GIR_Done
//   (exclusive).
static std::tuple<uint64_t, int64_t, RuleBodyTy>
getDoneIdxRuleIDAndBody(const int64_t *MatchTable, uint64_t CurrentIdx) {
  // skipping GIM_Try
  const uint64_t BodyIdx = nextGIOpcodeIdx(MatchTable, CurrentIdx);
  uint64_t ExcludedFirst = BodyIdx;
  uint64_t ExcludedLast = ExcludedFirst;
  // RuleID we discovered so far. Or the first one if we have many O_o
  int64_t FirstRuleID = -1;
  // Did we already iterated over that continuous region of unimportant opcodes
  // we are going to exclude from the body?
  bool ExcludedRegionPassed = false;
  unsigned NestingLevel = 0;
  do {
```

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:502
+    case GIM_Try: {
+      CurrentIdx++;
+      if (++NestingLevel > 1) {
----------------
dsanders wrote:
> This should probably indicate what is being skipped
I'm adding the following comment:
```
// Skipping the OnFail label operand
```

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:503
+      CurrentIdx++;
+      if (++NestingLevel > 1) {
+#ifdef NDEBUG
----------------
dsanders wrote:
> In some ways it would be nice to support this (e.g. to check the tests are the same) but I agree it's way too big a task for a first patch.
It will be hard to make sure that the tests are the same.

For instance, let's suppose that non-optimized table does some meaningless checks, for instance, checks register banks on internal vregs (the registers defined and used inside the pattern and not being the pattern's overall inputs or outputs), while optimized table doesn't. Semantically they are the same, but in the case of the latter Testgen will have to guess more regbanks, and it might guess it differently (from what is explicitly checked by the non-optimized table). It makes no difference in the selected code, and the *selected test will pass, however, the *testgend test (and that one only tests the Testgen itself, not the selector) will be technically different.

Also, if optimization reorders opcodes, the Testgen might easily end up with different virtual register names, while keeping the actual def-use chain the same, or schedule the def-use chain differently.

Not to mention, it will noticeably increase the maintenance burden, and I think it's best to keep that at a minimum.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:504-510
+#ifdef NDEBUG
+        dbgs()
+            << "Testgen doesn't support non-linear or optimized Match Tables\n";
+        exit(1);
+#endif
+        assert(NestingLevel == 1 &&
+               "Testgen doesn't support non-linear or optimized Match Tables");
----------------
dsanders wrote:
> Doesn't NDEBUG also disable the dbgs() stream?
> 
> That assert can't succeed (NestingLevel > 1 vs NestingLevel == 1). It looks like this ought to be report_fatal_error() or similar
> Doesn't NDEBUG also disable the dbgs() stream?

No, the only difference is that `NDEBUG` resolves `dbgs()` directly to `errs()` and therefore sends the output straight to `stderr` while in assert builds there is a circular buffer in the middle (that smart enough to flush if killed).

> That assert can't succeed (NestingLevel > 1 vs NestingLevel == 1). It looks like this ought to be report_fatal_error() or similar

True, good catch, I'm tidying this up.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:596
+          ensureNumOperands(InsnBuilders[InsnID], OpIdx + 1)->getOperand(OpIdx);
+      unsigned Reg = Use.isReg() && Use.getReg() ? Use.getReg() : Def.getReg();
+      Reg = Reg ? Reg : MRI.createGenericVirtualRegister(DummyLLT);
----------------
dsanders wrote:
> I don't think I understand the Def.getReg() part of this. Both the Use and the Def must be the same register so I'd expect to always use one or the other for all cases.
> Both the Use and the Def must be the same register so I'd expect to always use one or the other for all cases.

They must be the same register by the end of this `case`, but we know little in the beginning. We know that `Def` is a register and that's about it. Either (and we don't know which) or both `Def` and `Use` could be `%noreg`s (`0`). If one of them is an actual vreg (not `%noreg`) it might have a meaningful type or / and a bank assigned, and we can't loose that assignment. Technically, to make it even more robust we need to intersect both of their constraints here (what if both of them are valid vregs already, one has a bank checked, but not a type yet, and the other has the type checked, but not the bank?), but so far it was working pretty well as is.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:680-681
+      if (!Testgen.checkFeatures(ExpectedBitsetID)) {
+        dbgs() << "! Testgen deficiency: Don't know how to satisfy\n"
+               << "  Module & Function Target Features yet!\n";
+        // This test definitely won't match the rule being tested and might not
----------------
dsanders wrote:
> Function-level features can be handled by listing them in the function attributes:
>   attributes #0 = { "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" }
> The tricky bit will be mapping the enum back to the feature name.
> 
> Module-level features are harder since they're only read once per module. You'd have to separate them into multiple files
> 
Thanks, I didn't dig into this deep enough yet to know that for myself!

And yes, the mapping is the hardest part. So this is why that ugly `TestgenSetAllFeatures` command line option for now, to overcome this the cheap and dirty way. Of course, that strips us off testing the checking features part of every pattern and the match table as a whole.

This is for future patches and improvements, though.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:714
+      setOperand(InsnBuilders[InsnID], 1,
+                 [](MachineInstrBuilder &InsnB) { InsnB.addImm(1); });
+      break;
----------------
dsanders wrote:
> There's no guarantee that 1 will match. There's a few predicates that check they're multiples of N
Absolutely no guarantee. However, I actually took a quick look and it appeared to me that 1 would match more often then any other fixed constant, did't measure it though.

Full support for immediate predicates is for future patches.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:760-783
+    case GIM_CheckIsSameOperand: {
+      int64_t InsnID = MatchTable[CurrentIdx++];
+      int64_t OpIdx = MatchTable[CurrentIdx++];
+      int64_t OtherInsnID = MatchTable[CurrentIdx++];
+      int64_t OtherOpIdx = MatchTable[CurrentIdx++];
+      dbgs() << "GIM_CheckIsSameOperand " << InsnID << ", " << OpIdx << ", "
+             << ", " << OtherInsnID << ", " << OtherOpIdx << "\n";
----------------
dsanders wrote:
> One thing to mention here is that if something changes the register number of OtherInsnB.getOperand(OtherOpIdx) after this opcode is processed then these might diverge. I think we're ok on that since setReg() doesn't seem to be called from latePass()
True, and that's the whole point of having more than one pass: satisfying dependencies.

If the number and complexity of the dependencies were much greater, I would probably process the table once, "box" (as in incapsulate) every opcode (with its operands) in an object, and then sort the objects in a way that satisfies the dependencies.

However, the dependencies we've got seem to be simple enough to get away with just a few passes over the table, so I find the "an object per opcode" solution greatly over-engineered and not needed.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:795
+            InsnB.addCImm(ConstantInt::get(MF.getFunction().getContext(),
+                                           APInt(LLTy.getSizeInBits(), 1)));
+          });
----------------
dsanders wrote:
> As with the other immediate case, '1' might not match all predicates
For future patches.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:848
+  dbgs() << "** Selecting RegBanks for VRegs unconstrained by the rule\n";
+  selectUnconstrainedRegBanks(RA.getSize2RegBanks(), MF);
+  // selectUnconstrainedRegBanks relies on the input vregs being undefined,
----------------
dsanders wrote:
> I was confused by this function name at first. It sounded like the regbanks were unconstrained and that didn't make sense. I see it's about assigning regbanks to vregs that don't already check one. I haven't thought of a good spelling ('assignRegBanksToUnconstrainedVregs()' or 'ensureRegistersHaveBanks()' are a couple ideas) but we may be able to find a clearer name.
I'm renaming `selectUnconstrainedRegBanks` to `assignRegBanksToUnconstrainedVRegs` and `selectRegBank` helper function (former lambda) to `assignRegBank` for consistency.

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:883
+      const auto &Name = buildName(RuleNum, CurrentIdx, RuleID);
+      SmallVectorImpl<RuleSig> &Dups = RuleDuplicates[RuleBody];
+      Dups.emplace_back(RuleNum, CurrentIdx, RuleID);
----------------
dsanders wrote:
> We should have a couple examples of how duplicates can happen (e.g. i32 and f32 both mapping to s32)
I'm adding the following comment:
```
      // The major source of literal duplicates is the fact that we map MVTs
      // like i32 and f32 to the same s32 LLT, therefore 2 or more patterns
      // originally written for SelectionDAG ISel get imported as the exact same
      // sequence of semantically meaningful match table opcodes, matching and
      // rendering opcodes both:
```

================
Comment at: lib/CodeGen/GlobalISel/InstructionSelectorTestgen.cpp:781
+      assert(MO.isReg() && OtherMO.isReg() && "Expected regs as same operands");
+      if (MRI.def_empty(OtherMO.getReg()) && MO.getReg())
+        setOperand(OtherInsnB, OtherOpIdx,
----------------
@dsanders This situation, btw, is very similar to the one with record instruction opcode: we don't know which register is defined and which is not, and we can not afford loosing the definition. With record instruction we don't know which register might already have LLT and / or regbank "checked", and we can't afford loosing that info.

================
Comment at: utils/TableGen/GlobalISelEmitter.cpp:4005
+     << "namespace {\n"
+     << "class " << Target.getName() << "InstructionSelectorTestgen\n"
+     << "    : public llvm::InstructionSelectorTestgen {\n"
----------------
dsanders wrote:
> This definition should probably be wrapped in a #ifdef
It is wrapped in `GET_GLOBALISEL_IMPL` along with the InstructionSelector implementation. Due to `InstructionSelector::getTestgen` method we need the class definition even if we aren't planning to use testgen, but why would we not?

================
Comment at: utils/update_instruction_select_testgen_tests.sh:39-42
+    | perl -pe 's/(test_rule\d+)(_id\d+)?_at_idx\d+/\1/' \
+    | perl -pe 's/^(name:\s*test_rule\d+)/# TESTGEND-LABEL: \1/' \
+    | perl -pe 's/^([^#\n])/# TESTGEND:       \1/' \
+    | perl -pe 's/^\n$/#\n/' >> "$testgend"
----------------
dsanders wrote:
> The 'perl' commands might be an issue for the windows bots.
`sed` is horrible with multiline patterns, hm... This script isn't required to run tests, only to generate / update them, so maybe we could deal with it a bit later. Maybe with a little help from somebody running windows?

Repository:
  rL LLVM

https://reviews.llvm.org/D43962