[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

Tue Nov 13 08:01:55 PST 2018

Daniel Sanders via llvm-dev <llvm-dev at lists.llvm.org> writes:

> That's an interesting idea. Certainly tablegenerating InstCombine
> ought to be possible and sharing code sounds like it ought to be
> doable. MIR and IR are pretty similar especially after IRTranslator
> (which is a direct translation) through to the Legalizer (which is the
> first point target instructions can appear until targets make custom
> passes). From the Legalizer to ISel, there's still likely to be a fair
> amount of overlap between the two as a lot of the G_* opcodes directly
> correspond to LLVM-IR instructions. The tricky bit will be the escape
> hatches into C++ would need to either have Instruction/MachineInstr
> versions or would need to accept both.

Yes.  I wonder if templates can help with this.  I'm thinking of
LoopInfo, which is parameterized with BasicBlock/Loop or
MachineBasicBlock/MachineLoop, as a guide.  I don't know if Instruction
and MachineInstr have enough APIs in common, though.  Perhaps a
longer-term effort could be made to develop a common API with the same
spellings for both, at least enough for dagcombine purposes.
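
To make the template idea a bit more concrete, here is a rough,
self-contained sketch.  Everything in it is hypothetical: ToyInstruction
and ToyMachineInstr merely stand in for Instruction and MachineInstr,
and CombineTraits and isAddOfZero are names I made up for illustration.
The point is only that an escape hatch could be written once against a
traits layer that hides the differing spellings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for Instruction and MachineInstr. The accessor
// spellings differ on purpose, to mirror the problem being discussed.
struct ToyInstruction {
  std::string OpcodeName;    // e.g. "add"
  std::vector<int> Operands; // constant operands only, for brevity
  const std::string &getOpcodeName() const { return OpcodeName; }
  int getOperand(unsigned I) const {
    assert(I < Operands.size() && "operand index out of range");
    return Operands[I];
  }
};

struct ToyMachineInstr {
  std::string Opc;           // e.g. "G_ADD"
  std::vector<int> Ops;
  const std::string &opcode() const { return Opc; }
  int operandImm(unsigned I) const {
    assert(I < Ops.size() && "operand index out of range");
    return Ops[I];
  }
};

// The common API with the same spellings for both (an assumption of this
// sketch, not an existing LLVM interface).
template <typename InstT> struct CombineTraits;

template <> struct CombineTraits<ToyInstruction> {
  static bool isAdd(const ToyInstruction &I) {
    return I.getOpcodeName() == "add";
  }
  static int operand(const ToyInstruction &I, unsigned N) {
    return I.getOperand(N);
  }
};

template <> struct CombineTraits<ToyMachineInstr> {
  static bool isAdd(const ToyMachineInstr &MI) {
    return MI.opcode() == "G_ADD";
  }
  static int operand(const ToyMachineInstr &MI, unsigned N) {
    return MI.operandImm(N);
  }
};

// An escape hatch written once and usable on either representation:
// "is this an add whose right-hand operand is zero?"
template <typename InstT> bool isAddOfZero(const InstT &I) {
  using Traits = CombineTraits<InstT>;
  return Traits::isAdd(I) && Traits::operand(I, 1) == 0;
}
```

With something like this, isAddOfZero(ToyInstruction{"add", {7, 0}}) and
isAddOfZero(ToyMachineInstr{"G_ADD", {7, 0}}) go through the same
predicate.  Whether the real classes share enough surface for a traits
layer to stay this thin is exactly the open question.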

Your initial response to Nicolai's suggestion to use TableGen's dag type
triggered something else in my mind.  The TableGen SelectionDAG stuff is
really restrictive in a lot of ways because the TableGen dag type is
used to express actual DAGs and you rightly pointed out that that's too
unwieldy for dagcombine.  It's unwieldy for isel too, which has many of
the same issues (multiple outputs, use of things like EXTRACT_SUBREG and
so on).  It's never handled selecting to multiple instructions very well
either.  I wonder if it would be possible to express isel patterns using
TableGen's dag type in the way that Nicolai suggests for dagcombine.  In
a lot of ways, isel is "just another kind of dagcombine."

>> The use of '%' vs. '$' here really threw me for a loop.  I would have
>> expected '$D' and '$S' everywhere.  What's the significance of '%'
>> vs. '$'?  I know MIR uses '%' for names but must that be the case in
>> these match blocks?
>
> In MIR, '%' and '$' have a semantic difference when used on operands.
> '%foo' is a virtual register named foo but '$foo' is the physical register foo.

Ok, thanks for the explanation.  I have not worked extensively with MIR
yet.
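
Just to check my understanding of the distinction, here it is as a toy
classifier.  This is not the MIR parser; RegKind, classifyOperand and
registerName are hypothetical names, and real MIR uses '%' for other
things as well (block references such as %bb.0, for instance), which
this sketch ignores.

```cpp
#include <cassert>
#include <string>

enum class RegKind { Virtual, Physical, NotARegister };

// Classify an operand token purely by its leading sigil:
// '%foo' is a virtual register, '$foo' is a physical register.
inline RegKind classifyOperand(const std::string &Token) {
  if (Token.size() < 2)
    return RegKind::NotARegister; // a bare sigil names nothing
  switch (Token[0]) {
  case '%':
    return RegKind::Virtual;
  case '$':
    return RegKind::Physical;
  default:
    return RegKind::NotARegister;
  }
}

// Drop the sigil from a register operand. Callers must check the kind
// first; the assert documents that precondition.
inline std::string registerName(const std::string &Token) {
  assert(classifyOperand(Token) != RegKind::NotARegister &&
         "not a register operand");
  return Token.substr(1);
}
```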

> The main reason I didn't pick something distinct from either
> (e.g. '${foo}') is that I'd like to minimize the need to modify the
> MIR parser to support pattern-specific syntax.

Sure, that makes sense.

>> My understanding is that "root" means the sink here, but later in the
>> "upside-down" example, "root" means the source.  It seems to me the use
>> of "root" here is a misnomer/confusing.  I liked Nicolai's ideas about
>> specifying the insert point.  Why do users need to know about "root" at
>> all?  Is there some other special meaning attached to it?
>
> It doesn't correspond with any property of the DAG being matched. It's
> the entry point that the combine algorithm uses to begin an attempt to
> match the rule. In DAGCombine terms, it's the first node checked for
> the rule inside the call to DAGCombine::visit(SDNode *).

Got it.  "root" has a specific meaning for a tree/dag and I think I was
getting hung up on that, but I see how it has meaning as the starting
point of a match.  I was going to say maybe we can find another name for
it, but now I can't think of a better one.  :)
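
For anyone else who gets hung up on the same thing, here is how I now
picture it, as a self-contained toy.  None of this is the proposed
implementation: ToyInst, ToyRule, matchAll and the trunc-of-zext rule
are all invented for illustration.  The "root" is just the instruction
at which the combiner starts an attempt to match a rule; the match
function then walks outward from there.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// A toy instruction: one defined value and the values it uses.
struct ToyInst {
  std::string Opcode;
  int Def;               // id of the value this instruction defines
  std::vector<int> Uses; // ids of the values it reads
};

using Block = std::vector<ToyInst>;

// Find the instruction defining a value, or nullptr if there is none.
const ToyInst *findDef(const Block &B, int Id) {
  for (const ToyInst &I : B)
    if (I.Def == Id)
      return &I;
  return nullptr;
}

// A rule names the opcode of its root (its entry point) and supplies a
// matcher that is handed the root and explores the rest from there.
struct ToyRule {
  std::string Name;
  std::string RootOpcode;
  std::function<bool(const Block &, const ToyInst &)> Match;
};

// Visit every instruction; only those with the root opcode start a match
// attempt. Returns the ids defined by the roots that matched.
std::vector<int> matchAll(const Block &B, const ToyRule &R) {
  assert(!R.RootOpcode.empty() && "a rule needs a root to start from");
  std::vector<int> Roots;
  for (const ToyInst &I : B) {
    if (I.Opcode != R.RootOpcode)
      continue; // not an entry point for this rule
    if (R.Match(B, I))
      Roots.push_back(I.Def);
  }
  return Roots;
}

// Hypothetical rule: a G_TRUNC whose operand is defined by a G_ZEXT.
// The root here is the G_TRUNC; the matcher walks to the operand's def.
ToyRule makeTruncOfZextRule() {
  return {"trunc_of_zext", "G_TRUNC",
          [](const Block &B, const ToyInst &Root) {
            if (Root.Uses.size() != 1)
              return false;
            const ToyInst *Src = findDef(B, Root.Uses[0]);
            return Src && Src->Opcode == "G_ZEXT";
          }};
}
```

Nothing stops a rule from picking the G_ZEXT as its root instead and
walking to its users, which is the "upside-down" case: the root is a
property of how matching starts, not of the DAG's shape.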

                               -David