[llvm-dev] GlobalISel and commutative binops in Pat?

Mon Apr 26 16:00:27 PDT 2021

Hi Björn,

TL;DR In GISel the idea is that you own the canonical representation for your target.

> On Apr 26, 2021, at 4:23 AM, Björn Pettersson A <bjorn.a.pettersson at ericsson.com> wrote:
> 
>> -----Original Message-----
>> From: Daniel Sanders <daniel_l_sanders at apple.com <mailto:daniel_l_sanders at apple.com>>
>> Sent: den 22 april 2021 02:01
>> To: Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>>
>> Cc: Björn Pettersson A <bjorn.a.pettersson at ericsson.com <mailto:bjorn.a.pettersson at ericsson.com>>; llvm-dev <llvm-
>> dev at lists.llvm.org <mailto:dev at lists.llvm.org>>
>> Subject: Re: [llvm-dev] GlobalISel and commutative binops in Pat?
>> 
>> 
>> 
>>> On Apr 21, 2021, at 09:48, Quentin Colombet <qcolombet at apple.com> wrote:
>>> 
>>> + Daniel
>>> 
>>> Hi Björn,
>>> 
>>>> On Apr 21, 2021, at 3:08 AM, Björn Pettersson A via llvm-dev <llvm-
>> dev at lists.llvm.org> wrote:
>>>> 
>>>> Hi!
>>>> 
>>>> I'm working on porting an OOT-target to use GlobalISel.
>>>> 
>>>> One thing that I've noticed is that if I for example have a Pat like
>> this
>>>> 
>>>> def : Pat<(add i16:$reg, MyImmLeaf:$imm), (add16 i16:$src, imm:$imm)>;
>>>> 
>>>> it only matches if the G_ADD has the immediate operand as the second
>> operand.
>>>> 
>>>> 
>>>> Here are some questions related to that:
>>>> 
>>>> 1) I've been trying to use the generic pre/post legalize combiner, but I
>> can't see that any canonicalization related to putting the immediate as
>> second operand happens (I think it is something that the DAGCombiner would
>> do).
>>>> Is this something not-yet-implemented that we want to do?
>>>> 
>>>> 2) Isn't the old SelectionDAG ISel automatically handling matching
>> during selection for commutative SDNodes? So even without any prior
>> canonicalization the matcher should handle this. I mean, in general a
>> commutative operation could take two predicated non-leaf operands, so we
>> can't solve all kinds combinations using canonicalization.
>>>> Is this something not-yet-implemented in the GISel matcher that we want
>> to do?
>>> 
>>> IIRC, at one point Daniel and I talked about supporting this in the
>> matcher tables. I think we end up postponing that because we didn’t have
>> any use case for this and the IR coming in was canonical (plus it would
>> have increase the compile time for no good reasons at that point).
>> 
>> I think it was mostly that it wasn't doing much harm in practice. IIRC
>> CodeGenDAGPatterns duplicates patterns for commutative operations but then
>> goes on to filter a few cases out that DAGISel traditionally handled
>> itself. I remember I was trying to figure out how to change the filter for
>> GlobalISel without breaking DAGISel but I don't think I found a good
>> solution
>> 
>>> On the canonicalization part, we don’t do a whole lot because the
>> incoming IR is already supposed to be canonical, thus any non canonical
>> representation comes from GISel transformations themself and can be fixed.
>>> By default we have only limited canonicalization, but we can add more
>> with the combiners and IIRC the machine IR builder does some of it as well
>> (and I believe you can plug a custom machine IR builder that does more).
> 
> 
> My main concerns here are in the line of this:
> 
> - Being able to write MIR-tests for the GlobalISel steps is neat,
>  but if relying on some sort of canonical input I need to be careful
>  when writing such test cases to actually use the canonical form.
>  And I really need to track if the canonical form changes, or else I end
>  up with lots of irrelevant test cases verifying that my in practice
>  "dead" ISel patterns works, while I'm actually lacking patterns that
>  match the new canonical form.

GISel doesn’t do any canonicalization steps by default by design.
The rationale is we learnt from SDISel that some canonical patterns were actually harmful to some targets and in SDISel there was an art around undoing the canonicalization or working around it.

So instead, by default GISel doesn’t do any canonicalization.
However, GISel supports canonicalization through the mirbuilder and the combiners.

The idea is you should add the canonicalization steps that make sense for your target in your combiners, by reusing the existing combiner helpers.

Then, for your tests, you can run your combiners on the input IR/MIR before feeding it to the pass you want to test (e.g., by using —run-pass=myCombiner,thePassIWantToTest).

Another option for writing tests is to keep the LLVM IR as input and just make several stop points to check that the intermediate IR looks the way you expect.
E.g., in just one file, have several run lines with different stop-after options, like —stop-after irtranslator, —stop-after legalizer and so on.

> 
>  It is at least a litte bit scary if my most basic ISel patterns suddenly
>  depend on how opt is canonicalizing IR.

Agree, and you should really try to have something that works with whatever O0 inputs you get.
If you require some canonicalization to happen, then you need to put them into your mandatory combiners.

> 
> - If I want to implement regressions tests that verify that I get the
>  same codegen regardless of the input IR (or if I already have lots of
>  such tests for my target), then I basically need to run opt before llc
>  in such lit tests.
>  I'm not sure if instcombine, or an O0 pipeline, is enough as a
>  "canonicalizer" in such cases, or if the GlobalISel pattern matcher is
>  fine-tuned for the canonical form produced by an O<n> pipeline?

The O0 pipeline is not enough for sure :).
Again, you need to explicitly add any combiner rules/patterns that makes sense on what you want to support.

> 
> - How do I figure out if my old ISel patterns will work correct for
>  GlobalISel. They might have been written using the non-canonical form.

It’s the other way around: GISel supports any MIR, but your target may only support a subset of that. You are responsible of the canonicalization, so you should know if/when it changes.

What I am saying is the generic GISel passes shouldn’t change the canonical representation, since there is none :).
I’ll add a caveat to that: sometimes we change which nodes are produced, create new ones when we find this is generally beneficial.
Ideally, you should be able to have the generic passes generate whatever canonical representation you want with the custom MIRBuilder (I don’t remember if we end up completely opening that for target customization and if we didn’t I think we should!)

>  I guess I either need to rely on benchmarks suites, or write lots of
>  lit test cases based on "the current canonical form". So maybe the real
>  questions is "How do I figure out the current canonical form for a
>  certain baseline commit?".
> 
> Additional short-comings IMHO:
> 
> - The canonical form isn't really documented anywhere afaik. This makes
>  all of the above extra complicated.

That’s because there is none.

> 
> - There is no verifier that checks that the IR has canonical form
>  before ISel.

Same answer. The verifier will only verify that the IR is well formed with respect to types and registers usages. At some point, I pushed for having some target customization in the verifier to be able to check that, but I don’t think we ended up having time to do this.

> 
> - For simple things like binop:s, isn't it stupid that the IR allows
>  the non-canonical form in the first place if passes should expect
>  that the non-canonical form isn't used.

Again there is no concept of canonical in the IR. Either you can select it or you can’t. What’s canonical for you, may not be for us.
For instance, I wish we represent all `G_SUB` with `G_ADD lhs, (G_NEG rhs)` so following this, G_SUB shouldn’t exist. On the other hand, you may find G_SUB to be the canonical representation for that, so what do we do?

I think fundamentally there are two things to distinguish:
- What is required for correctness?
- What is required to generate good code?

When you look at canonicalization in SDISel terms, you’ll find a little bit of both (e.g., always putting the immediate field on the RHS of a binop may not be required for correctness, but if we don’t you take may generate less efficient code (with ldimm an rr variant (two registers as input))).

In GISel we tried to rethink that so that as a target owner, you’ll be able to classify what is required for correctness (and thus run at O0) and what is required to generate good code (runs only for optimized pipeline).

Cheers,
-Quentin

> 
> 
>>> 
>>> Take all this with a grain of salt, it’s been a while since I looked into
>> GISel actual implementation.
>>> 
>>> Cheers,
>>> -Quentin
>>> 
>>> 
>>> 
>>>> 
>>>> 
>>>> Maybe I've simply missed something, or is doing something wrong as this
>> currently fails for me. Adding all the commutative patterns as new
>> definitions might be a lot of work for any target, so I suppose this should
>> work out-of-the-box somehow.
>>>> 
>>>> /Björn
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20210426/a45ff666/attachment.html>