[PATCH] D39034: [WIP][GlobalISel][TableGen] Optimize MatchTable for faster instruction selection

Tue Oct 31 09:35:55 PDT 2017

dsanders added a comment.

In https://reviews.llvm.org/D39034#905807, @qcolombet wrote:

> > For this, I was thinking there would be two kinds of partitioner. Those that deal with structure (number of operands, opcodes?*, nested instructions, etc.) and those that deal with predicates.
>
> I would rather the partitioners not be "semantic-aware". If at all possible, I would rather stick to a simple concept like putting identical "actions" together. On a different topic, the same applies to the matcher. Right now, we distinguish between capturing, matching, and so on, but all in all, these are just actions.
>
> What I am saying is the design is more complicated than I would have thought, and I wonder if in the long run it won't get in the way of modifying it... Unless it is well documented, in particular the intent :).

I'm also trying to avoid semantic awareness, but a certain amount of it is necessary on the structure side since we need to walk the tree (occasionally it's a DAG, but predicates deal with that). For example, you need to record an instruction with GIM_RecordInsn before you can test predicates on that instruction or its operands. The current separation between capturing and everything else is intended to keep the table optimization simple: the artificial barrier means the optimizer never has to worry about placing an opcode check before the instruction it tests has been recorded.

On the structure side, there's limited scope for re-ordering in any case due to the need to walk the tree. However, at this level it's also easy to group rules that share the same structure but have different predicates. For example:

  (G_FADD (G_FMUL a, b), c) // 1
  (G_FSUB (G_FMUL a, b), c)  // 2
  (G_ADD a, imm16) // 3
  (G_SUB a, imm16) // 4
  (G_MUL a, imm16) // 5
  (G_ADD a, b) // 6
  (G_SUB a, b) // 7
  (G_MUL a, b) // 8
  (G_FADD a, b) // 9
  ...

only has three distinct structures:

  (X (Y a, b), c) // 1 and 2
  (X c, (Y a, b)) // 1 (commutative G_FADD)
  (X a, b) // 3 (including commutative), 4, 5 (including commutative), 6 (including commutative), 7, 8 (including commutative), and 9 (including commutative)

I'd expect the main structure partitioner to simply group rules with the same shape together. This should be a big win optimization-wise since the ruleset contains only a small number of distinct shapes. Once you're past structure matching and have captured all the relevant instructions, there are no relevant semantics left to preserve and you're free to re-order the predicates as much as you like.
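
To make the grouping idea concrete, here is a minimal sketch of a shape-based partitioner. It is not code from the patch: `Pattern`, `leaf`, `insn`, `shapeOf` and `groupByShape` are hypothetical names invented for illustration, and the sketch does not expand commutative variants, so the nine rules above would fall into two groups rather than three.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pattern node (not an LLVM type): either a leaf operand such as
// `a` or `imm16`, or an instruction with an opcode and operands.
struct Pattern {
  bool IsInsn;
  std::string Name;              // opcode for instructions, operand name for leaves
  std::vector<Pattern> Operands; // always empty for leaves
};

Pattern leaf(std::string Name) { return Pattern{false, std::move(Name), {}}; }

Pattern insn(std::string Opcode, std::vector<Pattern> Ops) {
  return Pattern{true, std::move(Opcode), std::move(Ops)};
}

// Build a key that keeps only the tree structure: every instruction becomes
// "X" and every leaf becomes "_", so opcodes and predicates are ignored.
std::string shapeOf(const Pattern &P) {
  assert((P.IsInsn || P.Operands.empty()) && "leaves have no operands");
  if (!P.IsInsn)
    return "_";
  std::string Shape = "(X";
  for (const Pattern &Op : P.Operands)
    Shape += " " + shapeOf(Op);
  return Shape + ")";
}

// Map each distinct shape to the indices of the rules that have it.
std::map<std::string, std::vector<std::size_t>>
groupByShape(const std::vector<Pattern> &Rules) {
  std::map<std::string, std::vector<std::size_t>> Groups;
  for (std::size_t I = 0; I < Rules.size(); ++I)
    Groups[shapeOf(Rules[I])].push_back(I);
  return Groups;
}
```

Calling `groupByShape` on the rule list would yield one group keyed `(X (X _ _) _)` and one keyed `(X _ _)`. Predicates inside each group could then be re-ordered without touching the structure walk.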

The artificial boundary between the semantic-aware structure and the semantic-ignorant predicates does come at a cost though. We limit the potential optimization slightly since we can't rule particular structures out based on predicates. I suspect we may eventually want to allow some predicates (e.g. a check for whether an opcode is in a particular set) to cross that boundary but doing so requires more semantic awareness.

As a side note, I'd eventually like to have a GIR_BuildMI/GIR_MutateOpcode variant that can directly map input opcodes to output opcodes so that multiple rules can share the same actions. Maybe something like:

  GIM_CheckOpcodeInSet, ..., G_ADD, G_SUB, 0,
  GIR_BuildMIWithMap, ..., G_ADD, ADDWrr, G_SUB, SUBWrr, 0,
  ... other actions ...

but I haven't given that much thought yet. The reason I'd like it is that there are an awful lot of near-identical binary rules (e.g. (G_ADD a, b)) that differ only by the opcodes involved.
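
As a rough sketch of how such a map operand might be consumed, the lookup could walk zero-terminated (input, output) pairs. Everything here is an assumption for illustration: `lookupOpcodeMap` is a hypothetical helper, and the opcode values are made up rather than taken from any target.

```cpp
#include <cassert>
#include <cstdint>

// Made-up opcode numbers for illustration only; real values come from the
// generic and target opcode enums.
enum : int64_t { End = 0, G_ADD = 1, G_SUB = 2, ADDWrr = 100, SUBWrr = 101 };

// Walk a flat table of (input opcode, output opcode) pairs terminated by End.
// Returns the mapped output opcode, or End if the input is not in the map.
int64_t lookupOpcodeMap(const int64_t *Map, int64_t InOpcode) {
  assert(Map && "map table must not be null");
  for (; Map[0] != End; Map += 2)
    if (Map[0] == InOpcode)
      return Map[1];
  return End;
}
```

With the table `{G_ADD, ADDWrr, G_SUB, SUBWrr, End}`, looking up `G_SUB` returns `SUBWrr`, and an opcode outside the set returns `End`, which a matcher could treat like a failed predicate.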

In https://reviews.llvm.org/D39034#905811, @qcolombet wrote:

> One thing to note: When you try a group, there is no way out. I.e., the assumption is that groups are mutually independent. This is actually why I haven't pushed for more nesting levels, because the patch doesn't provide a way to check that this is true.

That was true when I first introduced GIM_Try, but it's not the case anymore. GIM_Reject is equivalent to a failed predicate and resumes processing from the label specified by the enclosing GIM_Try. If there is no active GIM_Try, it gives up matching entirely.

It's handled by these lines in InstructionSelectorImpl.h:

  if (handleReject() == RejectAndGiveUp)
    return false;
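
The resume-or-give-up behaviour can be illustrated with a toy table interpreter. This is a simplified sketch, not the code in InstructionSelectorImpl.h: the opcodes `Try`, `CheckOpcode`, `Reject` and `Accept`, their encoding, and `runTable` are all hypothetical stand-ins, and the reject helper returns a bool rather than an enum.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy opcodes loosely modelled on GIM_Try, GIM_CheckOpcode and GIM_Reject,
// plus a hypothetical Accept that carries a rule ID.
enum Op : int64_t { Try, CheckOpcode, Reject, Accept };

// Interpret the table for one instruction opcode. Returns the ID of the rule
// that matched, or -1 if matching was abandoned.
int64_t runTable(const std::vector<int64_t> &Table, int64_t InsnOpcode) {
  std::vector<std::size_t> OnFailResumeAt; // labels of the active Try scopes
  std::size_t Idx = 0;

  // On failure, jump to the innermost active Try's label. Returns false when
  // no Try is active, meaning the matcher should give up.
  auto handleReject = [&]() {
    if (OnFailResumeAt.empty())
      return false;
    Idx = OnFailResumeAt.back();
    OnFailResumeAt.pop_back();
    return true;
  };

  while (true) {
    assert(Idx < Table.size() && "ran off the end of the table");
    switch (Table[Idx++]) {
    case Try: // operand: index to resume at if the body fails
      OnFailResumeAt.push_back(static_cast<std::size_t>(Table[Idx++]));
      break;
    case CheckOpcode: // operand: expected opcode; a mismatch acts as a reject
      if (Table[Idx++] != InsnOpcode && !handleReject())
        return -1;
      break;
    case Reject: // explicit failure, same path as a failed predicate
      if (!handleReject())
        return -1;
      break;
    case Accept: // operand: ID of the matched rule
      return Table[Idx];
    default:
      assert(false && "unknown table opcode");
      return -1;
    }
  }
}
```

For a table with two Try groups followed by a bare Reject, a failed check in the first group resumes at the second group, and the bare Reject ends matching because no Try is active by then.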

Repository:
  rL LLVM

https://reviews.llvm.org/D39034