[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules
Nicolai Hähnle via llvm-dev
llvm-dev at lists.llvm.org
Fri Nov 30 06:43:10 PST 2018
On 30.11.18 01:04, Daniel Sanders wrote:
>> I was wondering whether it also makes sense to allow eliding the
>> (match) if it only has one child. It seems convenient, although having
>> it always be required may reduce the learning curve
>
> There's no technical reason we can't for the case where there's a single
> child and the child is a DagInit but I'm not sure there's a good reason
> to make it optional. Backend writers can always define a class that
> tacks it on using !cons if they want.
Yup.
>> The other point is that I have been wondering in the past whether it
>> wouldn't make sense to add an 'any' type to TableGen, so that
>> heterogenous lists of type list<any> are possible. TableGen already
>> internally has any-of-list-of-classes types -- there's no syntax to
>> write these types down, but they were needed to consistently describe
>> the type of a def that is derived from multiple classes (these types
>> are printed as {Class1,Class2,...} in debug output).
>>
>> I haven't *really* needed an 'any' type yet, but there were times when
>> I thought it might be convenient, so I'd be open to that.
>>
>> Then again, explicitly writing (match) also has the advantage that
>> it'd be obvious which part of the rule one is looking at.
>
> It wasn't something I consciously set out to do (I was really trying to
> achieve something akin to the list<any> you described above and needed
> it to conform to the dag type) but I did like that effect when I saw it
> written down :-).
:)
>> [snip]
>>> _Passing arbitrary data from match to apply_
>>> The main change in this section that hasn't already been discussed is
>>> that the result of extending_load_predicate has been moved to the new
>>> 'outs' section of GIMatchPredicate and the code expansion refers to a
>>> particular output of the predicate using 'matchinfo.B' similar to a
>>> struct member or multi-operand ComplexPatterns.
>>> def extending_load_matchdata : GIDefMatchData<"PreferredTuple">;
>>> def extending_load_predicate : GIMatchPredicate<
>>> bool, (ins reg:$A), (outs extending_load_matchdata:$B), [{
>>> return Helper.matchCombineExtendingLoads(${A}, ${B});
>>> }]>;
>>> def extending_loads : GICombineRule<
>>> (defs operand:$D, reg:$A, extending_load_matchdata:$matchinfo),
>>> (match (G_LOAD $D, $A),
>>> (extending_load_predicate operand:$A,
>>> extending_load_matchdata:$matchinfo)),
>>> (apply (exec [{ Helper.applyCombineExtendingLoads(${D},
>>> ${matchinfo.B}); }],
>>> reg:$D, extending_load_matchdata:$matchinfo)>;
>>> The main problem with this is that this prevents access to the
>>> contents of matchinfo from outside of C++. To overcome this, I'd
>>> suggest allowing the '${foo.bar}' syntax in tablegen's dag type.
>>> It should be a fairly simple change and there's no tblgen-internal
>>> reason to not allow dots in the names. Existing backends would have
>>> to check for it though which could be a tblgen performance issue.
>>
>> What does the ${matchinfo.B} mean, exactly? Why is $matchinfo alone
>> not good enough? It appears syntactically in the place that I would
>> expect corresponds to the out argument $B of extending_load_predicate.
>
> ${matchinfo} is the return value of the predicate which will always be
> true in this particular case. In other cases, the predicate could return
> a MachineOperand& or an unsigned (register number) and use that value.
>
> ${matchinfo.B) is the $B argument from the outs list of the predicate
> referred to by $matchinfo.
This doesn't make sense to me, I have to admit.
In
(extending_load_predicate operand:$A,
extending_load_matchdata:$matchinfo)
the $matchinfo is syntactically in the position where I would expect $B.
At least that's how I always read that fragment.
In general, I would expect a fragment `t:$b` to consistently mean "$b is
of 'type' t", where 'type' can potentially be interpreted somewhat
loosely, e.g. it could be a unary predicate.
But that kind of interpretation clashes with $matchinfo being the return
value of extending_load_predicate in the example fragment.
Also, how does this all work with multiple outs?
To me it would feel quite natural for the general pattern to be:
def XYZpred : GIMatchPredicate<
ret_type, (ins type:$in0, type:$in1),
(outs type:$out0, type:$out1), [{
... C++ code here ...
}]>;
def : GICombineRule<
(defs ...),
(match ...,
(XYZpred $a, $b, $c, $d):$r)),
(apply ...)>;
... where $a and $b are passed into the C++ code as $in0 and $in1,
respectively, $c and $d are assigned to whatever the C++ code outputs in
$out0 and $out1, and $r is assigned to what's returned by the C++ return
statement.
(This is also somewhat similar to how SDISel's complex patterns work.)
>>> _Macros_
>>> Aside from changing to the DAG style, the main change here is that
>>> the children of 'oneof' all have 'match' at the top-level. This is
>>> for the same reason 'match' was kept on the simple example: the
>>> relaxed type checking on 'dag' compared to 'list<X>' allows us to
>>> freely mix DAG-style matches and predicates.
>>> def ANYLOAD : GIMacro<(defs def:$R, use:$S, uint64_t:$IDX),
>>> (match (oneof (match (G_LOAD $R, $S)),
>>> (match (G_SEXTLOAD $R, $S)),
>>> (match (G_ZEXTLOAD $R,
>>> $S))):$IDX>;
>>> def extending_loads : GICombineRule<
>>> (defs reg:$D, reg:$A, extending_load_matchdata:$matchinfo,
>>> ANYLOAD:$ANYLOAD),
>>> (match (ANYLOAD $D, $A),
>>> (extending_load_predicate operand:$A,
>>> extending_load_matchdata:$matchinfo)),
>>> (apply (exec [{ Helper.applyCombineExtendingLoads(${D},
>>> ${matchinfo.B}); }],
>>> reg:$D, extending_load_matchdata:$matchinfo)>;
>>
>> $IDX doesn't seem to be used in the (apply) part.
>
> The rule I based this example on doesn't actually need it (the opcode is
> inside the ${matchinfo} data) but if it did, the apply section would be:
>
> (apply (exec [{ Helper.applyCombineExtendingLoads(${D},
> ${matchinfo.B}, ${IDX}); }],
> reg:$D, extending_load_matchdata:$matchinfo,
> uint64_t:$IDX)>;
>
>>> [snip]
>>
>>
>>> _Subregisters_
>>> This is a new example based on our discussion about MIR having direct
>>> support for subregisters. We use subregister indexes to specify that
>>> it's a subregister that should be emitted by the apply step.
>>> def : GICombineRule<
>>> (defs reg:$D, reg:$A, reg:$B),
>>> (match (G_TRUNC s32:$t1, s64:$A),
>>> (G_TRUNC s32:$t2, s64:$B),
>>> (G_ADD $D, $t1, $t2),
>>> (apply (ADD32 $D, (sub_lo $A), (sub_lo $B)))>;
>>> _Matching MachineMemOperands_
>>> While re-writing these examples, I also realized I didn't have any
>>> examples for testing properties of the MachineMemOperand, so here's one:
>>> def mmo_is_load_8 : GIMatchPredicate<
>>> (ins instr:$A), (outs), [{
>>> if (!${A}.hasOneMemOperand())
>>> return false;
>>> const auto &MMO = *${A}.memoperands_begin();
>>> return MMO.isLoad() && MMO.getSize() == 1;
>>> }]>;
>>> def : GICombineRule<
>>> (defs operand:$D, operand:$A),
>>> (match (G_LOAD $D, $A):$MI,
>>> (mmo_is_load8 instr:$MI)),
>>> (apply ...)>;
>>> I've made use of the naming instructions here since this was better
>>> in the debug info section. If we choose the alternative in
>>> that section then we can change it to $D or $A instead (either way
>>> the generated code would call getParent()).
>>
>> Looks good to me. How would getParent() work if $A also appears as a
>> def of an address calculation?
>
> It depends on what data needs to get to the apply. It it just needs to
> recognize the address calculation but doesn't need to use the value in
> the apply then it would be something like:
> def : GICombineRule<
> (defs operand:$D, operand:$A),
> (match (G_LOAD $D, $A):$MI,
> (mmo_is_load8 instr:$MI)
> (is_addr_plus_1 operand:$A),
> (apply (TGT_LOAD $D, $A, 1)>;
> or if a value like an offset needs passing, it would be something like:
> def : GICombineRule<
> (defs operand:$D, addr_plus_simm16:$A),
> (match (G_LOAD $D, $A):$MI,
> (mmo_is_load8 instr:$MI)
> (is_addr_plus_simm16 operand:$A),
> (apply (create_imm [{ ${A}.imm }], addr_plus_simm16:$A):$IMM),
> (TGT_LOAD $D, $A, $IMM))>;
Okay, that makes sense. Though I was wondering more about stuff like:
(match (G_ADD $A, ...),
(G_LOAD $D, $A), ...)
In that case, the parent of $A becomes ambiguous.
I guess in that case the shorthand of using operands would just be
forbidden and you have to explicitly name the instructions.
Cheers,
Nicolai
>
>> Cheers,
>> Nicolai
>> --
>> Lerne, wie die Welt wirklich ist,
>> Aber vergiss niemals, wie sie sein sollte.
>
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
More information about the llvm-dev
mailing list