[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules

Nicolai Hähnle via llvm-dev llvm-dev at lists.llvm.org
Fri Nov 30 06:28:51 PST 2018


On 30.11.18 01:25, Daniel Sanders wrote:
> 
> 
>> On Nov 29, 2018, at 02:02, Nicolai Hähnle <nhaehnle at gmail.com 
>> <mailto:nhaehnle at gmail.com>> wrote:
>>
>> On 27.11.18 19:01, Daniel Sanders wrote:
>>> ...Continued from the other email
>>> _Removing the defs section_
>>> We can potentially infer quite a lot of the defs section but it 
>>> requires both a complicated ruleset and that tblgen spends processing 
>>> time doing the inferencing. That processing time is 
>>> potentially significant for large combiners and for that reason we 
>>> need to be careful not to let inferencing become a burden on tblgen. 
>>> My main worry here is that it may require analyzing both the match 
>>> and apply section to fully inference the types and I'd like the 
>>> tablegen pass to be able to ignore the apply section until it's about 
>>> to write it out to the output file.
>>
>> Thank you for looking into this!
>>
>>
>>> There are however, some simple rules we can use to minimize the defs 
>>> section with just a scan of the defs section (trivial) and match 
>>> section (which we have to do anyway):
>>> 1. Match roots are implicitly declared in the defs section and have
>>>    type 'reg'.
>>> 2. Names used as argument to an operator that subclasses Instruction
>>>    are implicitly declared in the defs section with type 'reg'. This
>>>    allows leaf-operands to be used in the apply section
>>>      * This does however come at additional run-time memory cost
>>
>> Can you explain this? You mean run-time as in when the compiler runs, 
>> right? I guess I'm not familiar enough with the details of the 
>> matching algorithm.
> 
> That's right. The impact of this rule is to include all the operands 
> covered by the match just in case they need to be delivered to apply. 
> This makes the struct above larger than it really needs to be. For example:
> (defs $d, $a, $b)
>          (match (MUL $t, $a, $b),
>                 (ADD $d, $c, $t))
> would record and pass three operands, but with this rule it would 
> effectively be:
> (defs $d, $t, $a, $b)
> which is slightly larger.
> 
> The run-time memory cost comes from the implicit declaration in the defs 
> section. The defs sections primary purpose is to describe the 
> information that needs to be delivered from the match step to the apply 
> step. So it can be seen as defining*:
> struct {
> MachineOperand &d;
> MachineOperand &t; // due to the implicit declaration.
> MachineOperand &a;
> MachineOperand &b;
> } Foo;
> Foo matchRule(...);
> void applyRule(Foo &, ...);

Okay, this is interesting.

Can't TableGen whittle down the members of that struct automatically to 
the intersection of operands that are referenced by both match and apply?

Cheers,
Nicolai



> *That's not quite what the tablegenerated code will do in practice but 
> it's the gist of what it will do. The real implementation will be 
> managing a buffer that's shared between the whole ruleset and will tell 
> the apply step which bits it should read.
> 
>> Cheers,
>> Nicolai
>>
>>
>>> 3. Names used as argument to an operator that subclasses Instruction
>>>    are assumed to be 'reg'. This allows us to omit the
>>>    formerly-anonymous names such as $t[0-3] in the above examples.
>>>      * Using an 'operand' typed variable as the def of an instruction
>>>        does not change it to 'reg' but does emit the isReg() check that
>>>        'reg' would normally emit.
>>> 4. Names used in an 'outs' position on a GIMatchPredicate are
>>>    implicitly declared in the defs section using the type given by the
>>>    predicate.
>>> 5. Any names not given in the defs section are assumed to be 'instr' if
>>>    they are used as a name for an operator that subclasses Instruction.
>>>    This allows us to omit the defs for MI's involved in debug info.
>>> 6. Any names used solely in the apply section and defined by a
>>>    value-constructing operator (e.g. create*) can be omitted
>>> With these rules, the defs section isn't completely gone but is often 
>>> empty and can be omitted with a variant of GICombineRule. The 
>>> examples would be as below. I've indicated which rule eliminates 
>>> which def with color coding and strike-through:
>>> // Omitted due to rule 1
>>> // Omitted due to rule 2
>>> // We haven't been listing any defs related to rule 3 since these are 
>>> the formerly anonymous
>>> // operands that now need names to fit into the tblgen dag type
>>> // Omitted due to rule 4
>>> // Omitted due to rule 5
>>> //Omitted due to rule 6
>>> def : GICombineRule<(defs reg:$D, reg:$S),
>>> (match (G_ZEXT s32:$t1, s8:$S),
>>> (G_TRUNC s16:$D, s32:$t1)),
>>> (apply (G_ZEXT s16:$D, s8:$S))>;
>>> def : GICombineRule<(defs reg:$D, reg:$S),
>>> (match (G_ZEXT $t1, $S),
>>> (G_TRUNC $D, $t1),
>>> (isScalarType type:$D),
>>> (isLargerType type:$D, type:$S)),
>>> (apply (G_ZEXT $D, $S))>;
>>> def : GICombineRule<(defs reg:$D, reg:$S, instr:$MI0, instr:$MI1),
>>> (match (G_ZEXT $t0, $S):$MI0,
>>> (G_TRUNC $D, $t0):$MI1),
>>> (isScalarType type:$D),
>>> (isLargerType type:$D, type:$S)),
>>> (apply (G_ZEXT $D, $S, (debug_locations $MI0, $MI1)))>;
>>> def : GICombineRule<(defs reg:$D, reg:$S, instr:$MI0, instr:$MI1, 
>>> instr:$MI2, instr:$MI3, debug_expr:$DNewExpr),
>>>                         (match (G_ZEXT $t0, $S):$MI0,
>>>                                (G_TRUNC $D, $t0):$MI1,
>>>                                (DBG_VALUE $t0):$MI2,
>>>                                (DBG_VALUE $D):$MI3,
>>>                                (isScalarType type:$D),
>>>                                (isLargerType type:$D, type:$S)),
>>>                         (apply (createDIExprLLVMFragment 
>>> debug_expr:$MI2, type:$D):$DNewExpr,
>>>                                (G_ZEXT $D, $S, (debug_location $MI0, 
>>> $MI1)),
>>>                                (DBG_VALUE $D, debug_local:$MI2, 
>>> debug_expr:$DNewExpr, (debug_location $MI2)),
>>>                                (DBG_VALUE $D, debug_local:$MI3, 
>>> debug_expr:$MI3, (debug_location $MI3))))>;
>>> // $VAL is needed to indicate it's an immediate and provide the predicate
>>> def : GICombineRule<(defs reg:$D, reg:$A, imm_2:$VAL),
>>> (match (G_MUL $D, $A, $VAL)),
>>> (apply (MYTGT_DOUBLE $D, $A))>;
>>> // $B/$C are needed to indicate they're immediates and provide $C's 
>>> predicate
>>> def : GICombineRule<
>>> (defs reg:$D, reg:$A, imm:$B, imm_2:$C),
>>> (match (G_ADD $t1, $A, $B),
>>> (G_MUL $D, $t1, $C)),
>>> (apply (create_imm [{ 2 * ${B}->getZExtValue() }], apint_value:$B):$NB,
>>> (G_ADD $t1, $A, $A),
>>> (G_ADD $D, $t1, $NB))>;
>>> // $D is needed because we wanted operand instead of reg. We could 
>>> rewrite the predicate to take a reg though.
>>> def extending_loads : GICombineRule<
>>> (defs operand:$D, reg:$A, extending_load_matchdata:$matchinfo),
>>> (match (G_LOAD $D, $A),
>>> (extending_load_predicate operand:$A,
>>> extending_load_matchdata:$matchinfo)),
>>> (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, 
>>> ${matchinfo.B}); }],
>>> reg:$D, extending_load_matchdata:$matchinfo)>;
>>> // I haven't included any rules for omitting defs from GIMacro but I 
>>> can look into this if we want.
>>> def ANYLOAD : GIMacro<(defs def:$R, use:$S, uint64_t:$IDX),
>>> (match (oneof (match (G_LOAD $R, $S)),
>>> (match (G_SEXTLOAD $R, $S)),
>>> (match (G_ZEXTLOAD $R, $S))):$IDX>;
>>> // $ANYLOAD is needed to import the macro and name its instance
>>> def extending_loads : GICombineRule<
>>>       (defs reg:$D, reg:$A, extending_load_matchdata:$matchinfo, 
>>> ANYLOAD:$ANYLOAD),
>>>       (match (ANYLOAD $D, $A),
>>>              (extending_load_predicate operand:$A,
>>>                                         extending_load_matchdata:$matchinfo)),
>>>       (apply (exec [{ Helper.applyCombineExtendingLoads(${D}, 
>>> ${matchinfo.B}); }],
>>>                    reg:$D, extending_load_matchdata:$matchinfo)>;
>>> def : GICombineRule<
>>>   (defs reg:$D, reg:$A),
>>>   (match (G_LOAD $t1, $D),
>>>          (G_SEXT $A, $t1)),
>>>   (apply (G_SEXTLOAD $A, $D))> {
>>>   let MatchStartsFrom = (roots $D);
>>> };
>>> def : GICombineRule<
>>>   (defs reg:$D, reg:$A, reg:$B, reg:$C),
>>>   (match (G_TRUNC s32:$t1, s64:$A),
>>>          (G_TRUNC s32:$t2, s64:$B),
>>>          (G_ADD $D, $t1, $t2)
>>>          (G_SEXT s64:$C, $D)),
>>>   (apply (G_ADD $D, $A, $B),
>>>          (G_SEXT_INREG $C, $D))> {
>>> let MatchStartsFrom = (roots $D);
>>> };
>>> def : GICombineRule<
>>> (defs reg:$D1, reg:$D2, reg:$A, reg:$B),
>>> (match (G_ADD $D1, $A, $B),
>>> (G_SUB $D2, $A, $B)),
>>> (apply (BUTTERFLY $D1, $D2, $A, $B))>;
>>> def : GICombineRule<
>>> (defs reg:$D, reg:$A, reg:$B),
>>> (match (G_TRUNC s32:$t1, s64:$A),
>>> (G_TRUNC s32:$t2, s64:$B),
>>> (G_ADD $D, $t1, $t2),
>>> (apply (ADD32 $D, (sub_lo $A), (sub_lo $B)))>;
>>> // This one keeps the def of $D/$A because we wanted operand instead 
>>> of reg. We didn't have
>>> // anything requiring that though and could trivially change them to reg.
>>> def : GICombineRule<
>>> (defs operand:$D, operand:$A),
>>> (match (G_LOAD $D, $A):$MI,
>>> (mmo_is_load8 instr:$MI)),
>>> (apply ...)>;
>>
>> --
>> Lerne, wie die Welt wirklich ist,
>> Aber vergiss niemals, wie sie sein sollte.
> 

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


More information about the llvm-dev mailing list