[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

Daniel Sanders via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 30 09:40:59 PST 2018



> On Nov 30, 2018, at 06:43, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> 
> On 30.11.18 01:04, Daniel Sanders wrote:
>>> I was wondering whether it also makes sense to allow eliding the (match) if it only has one child. It seems convenient, although having it always be required may reduce the learning curve
>> There's no technical reason we can't for the case where there's a single child and the child is a DagInit but I'm not sure there's a good reason to make it optional. Backend writers can always define a class that tacks it on using !cons if they want.
> 
> Yup.
> 
> 
>>> The other point is that I have been wondering in the past whether it wouldn't make sense to add an 'any' type to TableGen, so that heterogenous lists of type list<any> are possible. TableGen already internally has any-of-list-of-classes types -- there's no syntax to write these types down, but they were needed to consistently describe the type of a def that is derived from multiple classes (these types are printed as {Class1,Class2,...} in debug output).
>>> 
>>> I haven't *really* needed an 'any' type yet, but there were times when I thought it might be convenient, so I'd be open to that.
>>> 
>>> Then again, explicitly writing (match) also has the advantage that it'd be obvious which part of the rule one is looking at.
>> It wasn't something I consciously set out to do (I was really trying to achieve something akin to the list<any> you described above and needed it to conform to the dag type) but I did like that effect when I saw it written down :-).
> 
> :)
> 
> 
>>> [snip]
>>>> _Passing arbitrary data from match to apply_
>>>> The main change in this section that hasn't already been discussed is that the result of extending_load_predicate has been moved to the new 'outs' section of GIMatchPredicate and the code expansion refers to a particular output of the predicate using 'matchinfo.B' similar to a struct member or multi-operand ComplexPatterns.
>>>>     def extending_load_matchdata : GIDefMatchData<"PreferredTuple">;
>>>>     def extending_load_predicate : GIMatchPredicate<
>>>>          bool, (ins reg:$A), (outs extending_load_matchdata:$B), [{
>>>>       return Helper.matchCombineExtendingLoads(${A}, ${B});
>>>>     }]>;
>>>>     def extending_loads : GICombineRule<
>>>>       (defs operand:$D, reg:$A, extending_load_matchdata:$matchinfo),
>>>>       (match (G_LOAD $D, $A),
>>>>              (extending_load_predicate operand:$A,
>>>>                                         extending_load_matchdata:$matchinfo)),
>>>>       (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, ${matchinfo.B}); }],
>>>>                    reg:$D, extending_load_matchdata:$matchinfo)>;
>>>> The main problem with this is that this prevents access to the contents of matchinfo from outside of C++. To overcome this, I'd suggest allowing the '${foo.bar}' syntax in tablegen's dag type. It should be a fairly simple change and there's no tblgen-internal reason to not allow dots in the names. Existing backends would have to check for it though which could be a tblgen performance issue.
>>> 
>>> What does the ${matchinfo.B} mean, exactly? Why is $matchinfo alone not good enough? It appears syntactically in the place that I would expect corresponds to the out argument $B of extending_load_predicate.
>> ${matchinfo} is the return value of the predicate which will always be true in this particular case. In other cases, the predicate could return a MachineOperand& or an unsigned (register number) and use that value.
>> ${matchinfo.B) is the $B argument from the outs list of the predicate referred to by $matchinfo.
> 
> This doesn't make sense to me, I have to admit.
> 
> In
> 
>       (extending_load_predicate operand:$A,
>                                 extending_load_matchdata:$matchinfo)
> 
> the $matchinfo is syntactically in the position where I would expect $B. At least that's how I always read that fragment.
> 
> In general, I would expect a fragment `t:$b` to consistently mean "$b is of 'type' t", where 'type' can potentially be interpreted somewhat loosely, e.g. it could be a unary predicate.
> 
> But that kind of interpretation clashes with $matchinfo being the return value of extending_load_predicate in the example fragment.
> 
> Also, how does this all work with multiple outs?
> 
> To me it would feel quite natural for the general pattern to be:
> 
>  def XYZpred : GIMatchPredicate<
>        ret_type, (ins type:$in0, type:$in1),
>                  (outs type:$out0, type:$out1), [{
>    ... C++ code here ...
>  }]>;
>  def : GICombineRule<
>    (defs ...),
>    (match ...,
>           (XYZpred $a, $b, $c, $d):$r)),
>    (apply ...)>;
> 
> ... where $a and $b are passed into the C++ code as $in0 and $in1, respectively, $c and $d are assigned to whatever the C++ code outputs in $out0 and $out1, and $r is assigned to what's returned by the C++ return statement.
> 
> (This is also somewhat similar to how SDISel's complex patterns work.)

You're absolutely right. I actually have the (pred ...):$name in one of the examples in my original notes I was using to write the initial email, although none of them go on to actually use $name even where it would have made sense. The closest one puts the data in the result _and_ the outs list but only uses the outs list for some reason.

>>>> _Macros_
>>>> Aside from changing to the DAG style, the main change here is that the children of 'oneof' all have 'match' at the top-level. This is for the same reason 'match' was kept on the simple example: the relaxed type checking on 'dag' compared to 'list<X>' allows us to freely mix DAG-style matches and predicates.
>>>>     def ANYLOAD : GIMacro<(defs def:$R, use:$S, uint64_t:$IDX),
>>>>                           (match (oneof (match (G_LOAD $R, $S)),
>>>>                                         (match (G_SEXTLOAD $R, $S)),
>>>>                                         (match (G_ZEXTLOAD $R, $S))):$IDX>;
>>>>     def extending_loads : GICombineRule<
>>>>       (defs reg:$D, reg:$A, extending_load_matchdata:$matchinfo, ANYLOAD:$ANYLOAD),
>>>>       (match (ANYLOAD $D, $A),
>>>>              (extending_load_predicate operand:$A,
>>>>                                         extending_load_matchdata:$matchinfo)),
>>>>       (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, ${matchinfo.B}); }],
>>>>                    reg:$D, extending_load_matchdata:$matchinfo)>;
>>> 
>>> $IDX doesn't seem to be used in the (apply) part.
>> The rule I based this example on doesn't actually need it (the opcode is inside the ${matchinfo} data) but if it did, the apply section would be:
>>       (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, ${matchinfo.B}, ${IDX}); }],
>>                    reg:$D, extending_load_matchdata:$matchinfo, uint64_t:$IDX)>;
>>>> [snip]
>>> 
>>> 
>>>> _Subregisters_
>>>> This is a new example based on our discussion about MIR having direct support for subregisters. We use subregister indexes to specify that it's a subregister that should be emitted by the apply step.
>>>> def : GICombineRule<
>>>>   (defs reg:$D, reg:$A, reg:$B),
>>>>   (match (G_TRUNC s32:$t1, s64:$A),
>>>>          (G_TRUNC s32:$t2, s64:$B),
>>>>          (G_ADD $D, $t1, $t2),
>>>>   (apply (ADD32 $D, (sub_lo $A), (sub_lo $B)))>;
>>>> _Matching MachineMemOperands_
>>>> While re-writing these examples, I also realized I didn't have any examples for testing properties of the MachineMemOperand, so here's one:
>>>>     def mmo_is_load_8 : GIMatchPredicate<
>>>>          (ins instr:$A), (outs), [{
>>>>       if (!${A}.hasOneMemOperand())
>>>>         return false;
>>>>       const auto &MMO = *${A}.memoperands_begin();
>>>>       return MMO.isLoad() && MMO.getSize() == 1;
>>>>     }]>;
>>>>     def : GICombineRule<
>>>>       (defs operand:$D, operand:$A),
>>>>       (match (G_LOAD $D, $A):$MI,
>>>>              (mmo_is_load8 instr:$MI)),
>>>>       (apply ...)>;
>>>> I've made use of the naming instructions here since this was better in the debug info section. If we choose the alternative in that section then we can change it to $D or $A instead (either way the generated code would call getParent()).
>>> 
>>> Looks good to me. How would getParent() work if $A also appears as a def of an address calculation?
>> It depends on what data needs to get to the apply. It it just needs to recognize the address calculation but doesn't need to use the value in the apply then it would be something like:
>>     def : GICombineRule<
>>       (defs operand:$D, operand:$A),
>>       (match (G_LOAD $D, $A):$MI,
>>              (mmo_is_load8 instr:$MI)
>>              (is_addr_plus_1 operand:$A),
>>       (apply (TGT_LOAD $D, $A, 1)>;
>> or if a value like an offset needs passing, it would be something like:
>>     def : GICombineRule<
>>       (defs operand:$D, addr_plus_simm16:$A),
>>       (match (G_LOAD $D, $A):$MI,
>>              (mmo_is_load8 instr:$MI)
>>              (is_addr_plus_simm16 operand:$A),
>>       (apply (create_imm [{ ${A}.imm }], addr_plus_simm16:$A):$IMM),
>>              (TGT_LOAD $D, $A, $IMM))>;
> 
> Okay, that makes sense. Though I was wondering more about stuff like:
> 
>  (match (G_ADD $A, ...),
>         (G_LOAD $D, $A), ...)
> 
> In that case, the parent of $A becomes ambiguous.
> 
> I guess in that case the shorthand of using operands would just be forbidden and you have to explicitly name the instructions.

That's a good point. I was thinking it would be the def but that doesn't quite work because not everything has a def. Let's go with the naming instructions solution for debug info.

The same problem is there for using an operand-typed value in the apply section. When used in an apply section (FOO ..., $A, ...) it can potentially be disambiguated by whether it's a def context or a use context (using the MCDesc). However, this doesn't work when variadic operands are involved or when it's pasting into code and it becomes ambiguous again if you're matching multiple uses of $A. I suppose referring to specific instances with ${A[0]} would fix it but that's not a good way to solve it. The reg type is a better solution for that since MachineOperand::getReg() is the same for both. I think we'd have to restrict operand to only allow a single use in the match pattern and direct people to reg when it has multiple uses.

> Cheers,
> Nicolai
> 
> 
>>> Cheers,
>>> Nicolai
>>> -- 
>>> Lerne, wie die Welt wirklich ist,
>>> Aber vergiss niemals, wie sie sein sollte.
> 
> -- 
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20181130/55debf80/attachment-0001.html>


More information about the llvm-dev mailing list