[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

Fri Nov 30 06:43:10 PST 2018

On 30.11.18 01:04, Daniel Sanders wrote:
>> I was wondering whether it also makes sense to allow eliding the 
>> (match) if it only has one child. It seems convenient, although having 
>> it always be required may reduce the learning curve.
> 
> There's no technical reason we can't for the case where there's a single 
> child and the child is a DagInit but I'm not sure there's a good reason 
> to make it optional. Backend writers can always define a class that 
> tacks it on using !cons if they want.

Yup.

>> The other point is that I have been wondering in the past whether it 
>> wouldn't make sense to add an 'any' type to TableGen, so that 
>> heterogeneous lists of type list<any> are possible. TableGen already 
>> internally has any-of-list-of-classes types -- there's no syntax to 
>> write these types down, but they were needed to consistently describe 
>> the type of a def that is derived from multiple classes (these types 
>> are printed as {Class1,Class2,...} in debug output).
>>
>> I haven't *really* needed an 'any' type yet, but there were times when 
>> I thought it might be convenient, so I'd be open to that.
>>
>> Then again, explicitly writing (match) also has the advantage that 
>> it'd be obvious which part of the rule one is looking at.
> 
> It wasn't something I consciously set out to do (I was really trying to 
> achieve something akin to the list<any> you described above and needed 
> it to conform to the dag type) but I did like that effect when I saw it 
> written down :-).

:)

>> [snip]
>>> _Passing arbitrary data from match to apply_
>>> The main change in this section that hasn't already been discussed is 
>>> that the result of extending_load_predicate has been moved to the new 
>>> 'outs' section of GIMatchPredicate and the code expansion refers to a 
>>> particular output of the predicate using 'matchinfo.B' similar to a 
>>> struct member or multi-operand ComplexPatterns.
>>>     def extending_load_matchdata : GIDefMatchData<"PreferredTuple">;
>>>     def extending_load_predicate : GIMatchPredicate<
>>>          bool, (ins reg:$A), (outs extending_load_matchdata:$B), [{
>>>       return Helper.matchCombineExtendingLoads(${A}, ${B});
>>>     }]>;
>>>     def extending_loads : GICombineRule<
>>>       (defs operand:$D, reg:$A, extending_load_matchdata:$matchinfo),
>>>       (match (G_LOAD $D, $A),
>>>              (extending_load_predicate operand:$A,
>>>                                         extending_load_matchdata:$matchinfo)),
>>>       (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, ${matchinfo.B}); }],
>>>                    reg:$D, extending_load_matchdata:$matchinfo))>;
>>> The main problem with this is that this prevents access to the 
>>> contents of matchinfo from outside of C++. To overcome this, I'd 
>>> suggest allowing the '${foo.bar}' syntax in tablegen's dag type. 
>>> It should be a fairly simple change and there's no tblgen-internal 
>>> reason to not allow dots in the names. Existing backends would have 
>>> to check for it though which could be a tblgen performance issue.
>>
>> What does the ${matchinfo.B} mean, exactly? Why is $matchinfo alone 
>> not good enough? It appears syntactically in the place that I would 
>> expect corresponds to the out argument $B of extending_load_predicate.
> 
> ${matchinfo} is the return value of the predicate which will always be 
> true in this particular case. In other cases, the predicate could return 
> a MachineOperand& or an unsigned (register number) and use that value.
> 
> ${matchinfo.B} is the $B argument from the outs list of the predicate 
> referred to by $matchinfo.

This doesn't make sense to me, I have to admit.

In

        (extending_load_predicate operand:$A,
                                  extending_load_matchdata:$matchinfo)

the $matchinfo is syntactically in the position where I would expect $B. 
At least that's how I always read that fragment.

In general, I would expect a fragment `t:$b` to consistently mean "$b is 
of 'type' t", where 'type' can potentially be interpreted somewhat 
loosely, e.g. it could be a unary predicate.

But that kind of interpretation clashes with $matchinfo being the return 
value of extending_load_predicate in the example fragment.

Also, how does this all work with multiple outs?

To me it would feel quite natural for the general pattern to be:

   def XYZpred : GIMatchPredicate<
         ret_type, (ins type:$in0, type:$in1),
                   (outs type:$out0, type:$out1), [{
     ... C++ code here ...
   }]>;
   def : GICombineRule<
     (defs ...),
     (match ...,
            (XYZpred $a, $b, $c, $d):$r),
     (apply ...)>;

... where $a and $b are passed into the C++ code as $in0 and $in1, 
respectively, $c and $d are assigned to whatever the C++ code outputs in 
$out0 and $out1, and $r is assigned to what's returned by the C++ return 
statement.

(This is also somewhat similar to how SDISel's complex patterns work.)
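
To make the binding I have in mind concrete, here is a rough C++ model of 
what the matcher could do for `(XYZpred $a, $b, $c, $d):$r`. This is only a 
sketch: Reg, PreferredTuple's fields, MatchState and matchRule are 
hypothetical stand-ins invented for illustration, not LLVM types or 
anything TableGen would actually emit. The ins are passed by value, the 
outs by reference, and $r receives the C++ return value.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
struct Reg { unsigned Id; };
struct PreferredTuple { unsigned ExtendOpcode; unsigned NumBits; };

// Models a predicate declared with
//   (ins type:$in0, type:$in1), (outs type:$out0, type:$out1)
// and a bool return type. The body is an arbitrary placeholder.
bool XYZpred(Reg In0, Reg In1, PreferredTuple &Out0, unsigned &Out1) {
  if (In0.Id == In1.Id)
    return false;                 // predicate fails, outs are left untouched
  Out0 = PreferredTuple{/*ExtendOpcode=*/1, /*NumBits=*/32};
  Out1 = In0.Id + In1.Id;
  return true;
}

// One slot per name bound in the rule: $a, $b (ins), $c, $d (outs), $r (return).
struct MatchState {
  Reg A, B;
  PreferredTuple C;
  unsigned D;
  bool R;
};

// Sketch of the code for (XYZpred $a, $b, $c, $d):$r.
// Operands bind positionally: first the ins, then the outs.
bool matchRule(MatchState &S) {
  S.R = XYZpred(S.A, S.B, S.C, S.D);
  return S.R;
}
```

With this reading, every name in the predicate's operand list lines up 
with exactly one ins or outs entry, and multiple outs need no extra 
syntax such as ${r.out0}.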

>>> _Macros_
>>> Aside from changing to the DAG style, the main change here is that 
>>> the children of 'oneof' all have 'match' at the top-level. This is 
>>> for the same reason 'match' was kept on the simple example: the 
>>> relaxed type checking on 'dag' compared to 'list<X>' allows us to 
>>> freely mix DAG-style matches and predicates.
>>>     def ANYLOAD : GIMacro<(defs def:$R, use:$S, uint64_t:$IDX),
>>>                           (match (oneof (match (G_LOAD $R, $S)),
>>>                                         (match (G_SEXTLOAD $R, $S)),
>>>                                         (match (G_ZEXTLOAD $R, $S))):$IDX)>;
>>>     def extending_loads : GICombineRule<
>>>       (defs reg:$D, reg:$A, extending_load_matchdata:$matchinfo, ANYLOAD:$ANYLOAD),
>>>       (match (ANYLOAD $D, $A),
>>>              (extending_load_predicate operand:$A,
>>>                                         extending_load_matchdata:$matchinfo)),
>>>       (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, ${matchinfo.B}); }],
>>>                    reg:$D, extending_load_matchdata:$matchinfo))>;
>>
>> $IDX doesn't seem to be used in the (apply) part.
> 
> The rule I based this example on doesn't actually need it (the opcode is 
> inside the ${matchinfo} data) but if it did, the apply section would be:
> 
>        (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, ${matchinfo.B}, ${IDX}); }],
>                     reg:$D, extending_load_matchdata:$matchinfo, uint64_t:$IDX))>;
> 
>>> [snip]
>>
>>
>>> _Subregisters_
>>> This is a new example based on our discussion about MIR having direct 
>>> support for subregisters. We use subregister indexes to specify that 
>>> it's a subregister that should be emitted by the apply step.
>>> def : GICombineRule<
>>>   (defs reg:$D, reg:$A, reg:$B),
>>>   (match (G_TRUNC s32:$t1, s64:$A),
>>>          (G_TRUNC s32:$t2, s64:$B),
>>>          (G_ADD $D, $t1, $t2)),
>>>   (apply (ADD32 $D, (sub_lo $A), (sub_lo $B)))>;
>>> _Matching MachineMemOperands_
>>> While re-writing these examples, I also realized I didn't have any 
>>> examples for testing properties of the MachineMemOperand, so here's one:
>>>     def mmo_is_load_8 : GIMatchPredicate<
>>>          (ins instr:$A), (outs), [{
>>>       if (!${A}.hasOneMemOperand())
>>>         return false;
>>>       const auto &MMO = *${A}.memoperands_begin();
>>>       return MMO.isLoad() && MMO.getSize() == 1;
>>>     }]>;
>>>     def : GICombineRule<
>>>       (defs operand:$D, operand:$A),
>>>       (match (G_LOAD $D, $A):$MI,
>>>              (mmo_is_load_8 instr:$MI)),
>>>       (apply ...)>;
>>> I've named the instruction here since that approach worked better in 
>>> the debug info section. If we choose the alternative in that section 
>>> then we can change it to $D or $A instead (either way the generated 
>>> code would call getParent()).
>>
>> Looks good to me. How would getParent() work if $A also appears as a 
>> def of an address calculation?
> 
> It depends on what data needs to get to the apply. If it just needs to 
> recognize the address calculation but doesn't need to use the value in 
> the apply then it would be something like:
>      def : GICombineRule<
>        (defs operand:$D, operand:$A),
>        (match (G_LOAD $D, $A):$MI,
>               (mmo_is_load_8 instr:$MI),
>               (is_addr_plus_1 operand:$A)),
>        (apply (TGT_LOAD $D, $A, 1))>;
> or if a value like an offset needs passing, it would be something like:
>      def : GICombineRule<
>        (defs operand:$D, addr_plus_simm16:$A),
>        (match (G_LOAD $D, $A):$MI,
>               (mmo_is_load_8 instr:$MI),
>               (is_addr_plus_simm16 operand:$A)),
>        (apply (create_imm [{ ${A}.imm }], addr_plus_simm16:$A):$IMM,
>               (TGT_LOAD $D, $A, $IMM))>;

Okay, that makes sense. Though I was wondering more about stuff like:

   (match (G_ADD $A, ...),
          (G_LOAD $D, $A), ...)

In that case, the parent of $A becomes ambiguous.

I guess in that case the shorthand of using operands would just be 
forbidden and you would have to explicitly name the instructions.
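
A check along these lines could reject the shorthand. This is only a 
sketch of the idea, assuming the match is available as a flat list of 
instructions with named operands; PatternInstr and 
operandsWithAmbiguousParent are hypothetical names, not part of TableGen 
or of any existing backend.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical flattened view of one instruction in a (match ...) list.
struct PatternInstr {
  std::string Opcode;
  std::vector<std::string> Operands; // operand names without the leading '$'
};

// Returns, in sorted order, every operand name that appears in more than
// one instruction of the match, i.e. whose getParent() would be ambiguous.
std::vector<std::string>
operandsWithAmbiguousParent(const std::vector<PatternInstr> &Match) {
  std::map<std::string, unsigned> NumParents;
  for (const PatternInstr &I : Match) {
    // Count each name once per instruction, even if it is repeated there.
    std::set<std::string> Seen(I.Operands.begin(), I.Operands.end());
    for (const std::string &Name : Seen)
      ++NumParents[Name];
  }
  std::vector<std::string> Ambiguous;
  for (const auto &Entry : NumParents)
    if (Entry.second > 1)
      Ambiguous.push_back(Entry.first);
  return Ambiguous;
}
```

For the fragment above, $A is reported because it is both the def of the 
G_ADD and a use of the G_LOAD, so a predicate taking `instr:$A` would 
have to be rewritten to name the intended instruction explicitly.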

Cheers,
Nicolai

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.