[llvm-dev] [RFC] Tablegen-erated GlobalISel Combine Rules
Nicolai Hähnle via llvm-dev
llvm-dev at lists.llvm.org
Fri Nov 30 06:28:51 PST 2018
On 30.11.18 01:25, Daniel Sanders wrote:
>
>
>> On Nov 29, 2018, at 02:02, Nicolai Hähnle <nhaehnle at gmail.com
>> <mailto:nhaehnle at gmail.com>> wrote:
>>
>> On 27.11.18 19:01, Daniel Sanders wrote:
>>> ...Continued from the other email
>>> _Removing the defs section_
>>> We can potentially infer quite a lot of the defs section but it
>>> requires both a complicated ruleset and that tblgen spends processing
>>> time doing the inferencing. That processing time is
>>> potentially significant for large combiners and for that reason we
>>> need to be careful not to let inferencing become a burden on tblgen.
>>> My main worry here is that it may require analyzing both the match
>>> and apply section to fully inference the types and I'd like the
>>> tablegen pass to be able to ignore the apply section until it's about
>>> to write it out to the output file.
>>
>> Thank you for looking into this!
>>
>>
>>> There are however, some simple rules we can use to minimize the defs
>>> section with just a scan of the defs section (trivial) and match
>>> section (which we have to do anyway):
>>> 1. Match roots are implicitly declared in the defs section and have
>>> type 'reg'.
>>> 2. Names used as argument to an operator that subclasses Instruction
>>> are implicitly declared in the defs section with type 'reg'. This
>>> allows leaf-operands to be used in the apply section
>>> * This does however come at additional run-time memory cost
>>
>> Can you explain this? You mean run-time as in when the compiler runs,
>> right? I guess I'm not familiar enough with the details of the
>> matching algorithm.
>
> That's right. The impact of this rule is to include all the operands
> covered by the match just in case they need to be delivered to apply.
> This makes the struct above larger than it really needs to be. For example:
> (defs $d, $a, $b)
> (match (MUL $t, $a, $b),
> (ADD $d, $c, $t))
> would record and pass three operands, but with this rule it would
> effectively be:
> (defs $d, $t, $a, $b)
> which is slightly larger.
>
> The run-time memory cost comes from the implicit declaration in the defs
> section. The defs sections primary purpose is to describe the
> information that needs to be delivered from the match step to the apply
> step. So it can be seen as defining*:
> struct {
> MachineOperand &d;
> MachineOperand &t; // due to the implicit declaration.
> MachineOperand &a;
> MachineOperand &b;
> } Foo;
> Foo matchRule(...);
> void applyRule(Foo &, ...);
Okay, this is interesting.
Can't TableGen whittle down the members of that struct automatically to
the intersection of operands that are referenced by both match and apply?
Cheers,
Nicolai
> *That's not quite what the tablegenerated code will do in practice but
> it's the gist of what it will do. The real implementation will be
> managing a buffer that's shared between the whole ruleset and will tell
> the apply step which bits it should read.
>
>> Cheers,
>> Nicolai
>>
>>
>>> 3. Names used as argument to an operator that subclasses Instruction
>>> are assumed to be 'reg'. This allows us to omit the
>>> formerly-anonymous names such as $t[0-3] in the above examples.
>>> * Using an 'operand' typed variable as the def of an instruction
>>> does not change it to 'reg' but does emit the isReg() check that
>>> 'reg' would normally emit.
>>> 4. Names used in an 'outs' position on a GIMatchPredicate are
>>> implicitly declared in the defs section using the type given by the
>>> predicate.
>>> 5. Any names not given in the defs section are assumed to be 'instr' if
>>> they are used as a name for an operator that subclasses Instruction.
>>> This allows us to omit the defs for MI's involved in debug info.
>>> 6. Any names used solely in the apply section and defined by a
>>> value-constructing operator (e.g. create*) can be omitted
>>> With these rules, the defs section isn't completely gone but is often
>>> empty and can be omitted with a variant of GICombineRule. The
>>> examples would be as below. I've indicated which rule eliminates
>>> which def with color coding and strike-through:
>>> // Omitted due to rule 1
>>> // Omitted due to rule 2
>>> // We haven't been listing any defs related to rule 3 since these are
>>> the formerly anonymous
>>> // operands that now need names to fit into the tblgen dag type
>>> // Omitted due to rule 4
>>> // Omitted due to rule 5
>>> //Omitted due to rule 6
>>> def : GICombineRule<(defs reg:$D, reg:$S),
>>> (match (G_ZEXT s32:$t1, s8:$S),
>>> (G_TRUNC s16:$D, s32:$t1)),
>>> (apply (G_ZEXT s16:$D, s8:$S))>;
>>> def : GICombineRule<(defs reg:$D, reg:$S),
>>> (match (G_ZEXT $t1, $S),
>>> (G_TRUNC $D, $t1),
>>> (isScalarType type:$D),
>>> (isLargerType type:$D, type:$S)),
>>> (apply (G_ZEXT $D, $S))>;
>>> def : GICombineRule<(defs reg:$D, reg:$S, instr:$MI0, instr:$MI1),
>>> (match (G_ZEXT $t0, $S):$MI0,
>>> (G_TRUNC $D, $t0):$MI1),
>>> (isScalarType type:$D),
>>> (isLargerType type:$D, type:$S)),
>>> (apply (G_ZEXT $D, $S, (debug_locations $MI0, $MI1)))>;
>>> def : GICombineRule<(defs reg:$D, reg:$S, instr:$MI0, instr:$MI1,
>>> instr:$MI2, instr:$MI3, debug_expr:$DNewExpr),
>>> (match (G_ZEXT $t0, $S):$MI0,
>>> (G_TRUNC $D, $t0):$MI1,
>>> (DBG_VALUE $t0):$MI2,
>>> (DBG_VALUE $D):$MI3,
>>> (isScalarType type:$D),
>>> (isLargerType type:$D, type:$S)),
>>> (apply (createDIExprLLVMFragment
>>> debug_expr:$MI2, type:$D):$DNewExpr,
>>> (G_ZEXT $D, $S, (debug_location $MI0,
>>> $MI1)),
>>> (DBG_VALUE $D, debug_local:$MI2,
>>> debug_expr:$DNewExpr, (debug_location $MI2)),
>>> (DBG_VALUE $D, debug_local:$MI3,
>>> debug_expr:$MI3, (debug_location $MI3))))>;
>>> // $VAL is needed to indicate it's an immediate and provide the predicate
>>> def : GICombineRule<(defs reg:$D, reg:$A, imm_2:$VAL),
>>> (match (G_MUL $D, $A, $VAL)),
>>> (apply (MYTGT_DOUBLE $D, $A))>;
>>> // $B/$C are needed to indicate they're immediates and provide $C's
>>> predicate
>>> def : GICombineRule<
>>> (defs reg:$D, reg:$A, imm:$B, imm_2:$C),
>>> (match (G_ADD $t1, $A, $B),
>>> (G_MUL $D, $t1, $C)),
>>> (apply (create_imm [{ 2 * ${B}->getZExtValue() }], apint_value:$B):$NB,
>>> (G_ADD $t1, $A, $A),
>>> (G_ADD $D, $t1, $NB))>;
>>> // $D is needed because we wanted operand instead of reg. We could
>>> rewrite the predicate to take a reg though.
>>> def extending_loads : GICombineRule<
>>> (defs operand:$D, reg:$A, extending_load_matchdata:$matchinfo),
>>> (match (G_LOAD $D, $A),
>>> (extending_load_predicate operand:$A,
>>> extending_load_matchdata:$matchinfo)),
>>> (apply (exec [{ Helper.applyCombineExtendingLoads(${D},
>>> ${matchinfo.B}); }],
>>> reg:$D, extending_load_matchdata:$matchinfo)>;
>>> // I haven't included any rules for omitting defs from GIMacro but I
>>> can look into this if we want.
>>> def ANYLOAD : GIMacro<(defs def:$R, use:$S, uint64_t:$IDX),
>>> (match (oneof (match (G_LOAD $R, $S)),
>>> (match (G_SEXTLOAD $R, $S)),
>>> (match (G_ZEXTLOAD $R, $S))):$IDX>;
>>> // $ANYLOAD is needed to import the macro and name its instance
>>> def extending_loads : GICombineRule<
>>> (defs reg:$D, reg:$A, extending_load_matchdata:$matchinfo,
>>> ANYLOAD:$ANYLOAD),
>>> (match (ANYLOAD $D, $A),
>>> (extending_load_predicate operand:$A,
>>> extending_load_matchdata:$matchinfo)),
>>> (apply (exec [{ Helper.applyCombineExtendingLoads(${D},
>>> ${matchinfo.B}); }],
>>> reg:$D, extending_load_matchdata:$matchinfo)>;
>>> def : GICombineRule<
>>> (defs reg:$D, reg:$A),
>>> (match (G_LOAD $t1, $D),
>>> (G_SEXT $A, $t1)),
>>> (apply (G_SEXTLOAD $A, $D))> {
>>> let MatchStartsFrom = (roots $D);
>>> };
>>> def : GICombineRule<
>>> (defs reg:$D, reg:$A, reg:$B, reg:$C),
>>> (match (G_TRUNC s32:$t1, s64:$A),
>>> (G_TRUNC s32:$t2, s64:$B),
>>> (G_ADD $D, $t1, $t2)
>>> (G_SEXT s64:$C, $D)),
>>> (apply (G_ADD $D, $A, $B),
>>> (G_SEXT_INREG $C, $D))> {
>>> let MatchStartsFrom = (roots $D);
>>> };
>>> def : GICombineRule<
>>> (defs reg:$D1, reg:$D2, reg:$A, reg:$B),
>>> (match (G_ADD $D1, $A, $B),
>>> (G_SUB $D2, $A, $B)),
>>> (apply (BUTTERFLY $D1, $D2, $A, $B))>;
>>> def : GICombineRule<
>>> (defs reg:$D, reg:$A, reg:$B),
>>> (match (G_TRUNC s32:$t1, s64:$A),
>>> (G_TRUNC s32:$t2, s64:$B),
>>> (G_ADD $D, $t1, $t2),
>>> (apply (ADD32 $D, (sub_lo $A), (sub_lo $B)))>;
>>> // This one keeps the def of $D/$A because we wanted operand instead
>>> of reg. We didn't have
>>> // anything requiring that though and could trivially change them to reg.
>>> def : GICombineRule<
>>> (defs operand:$D, operand:$A),
>>> (match (G_LOAD $D, $A):$MI,
>>> (mmo_is_load8 instr:$MI)),
>>> (apply ...)>;
>>
>> --
>> Lerne, wie die Welt wirklich ist,
>> Aber vergiss niemals, wie sie sein sollte.
>
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
More information about the llvm-dev
mailing list