[llvm-dev] Scheduler: modelling long register reservations?

Wed Apr 12 00:25:26 PDT 2017

Hi Nick,

ScheduleDAGInstrs::addPhysRegDeps(SUnit *SU, unsigned OperIdx) is the 
method that adds the edges, with their latencies, for output dependencies 
(def -> def). Unfortunately, there currently doesn't seem to be a way to 
specify the latency of output deps with computeOperandLatency() or similar.

One option, then, might be to add a DAG mutation (ScheduleDAGMutation) 
in which you manually set the latency of the output edge to 25 after the 
DAG has been built.
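
To illustrate the idea, here is a minimal standalone C++ sketch. It is a 
toy model, not LLVM code: ToyDAG, Edge, DepKind, stretchOutputDeps and 
earliestIssue are all hypothetical names made up for this example, and 
the model ignores issue width and resources. A real implementation would 
presumably subclass ScheduleDAGMutation and adjust the SDep edges in its 
apply() method instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy stand-ins for scheduler DAG concepts. These are NOT LLVM's classes.
enum class DepKind { Data, Anti, Output };

struct Edge {
  std::size_t From;  // index of the earlier node
  std::size_t To;    // index of the later node (From < To is assumed)
  DepKind Kind;
  unsigned Latency;  // minimum cycles between issuing From and issuing To
};

struct ToyDAG {
  std::vector<bool> IsLongLatencyDef;  // true for FXLV-like nodes
  std::vector<Edge> Edges;
};

// The "mutation": after the DAG is built, raise the latency of every
// output (def -> def) edge that leaves a long-latency def.
void stretchOutputDeps(ToyDAG &DAG, unsigned Latency) {
  for (Edge &E : DAG.Edges)
    if (E.Kind == DepKind::Output && DAG.IsLongLatencyDef[E.From])
      E.Latency = std::max(E.Latency, Latency);
}

// Earliest issue cycle of each node, honouring only edge latencies.
// Nodes are visited in index order, which is a topological order
// because every edge satisfies From < To.
std::vector<unsigned> earliestIssue(const ToyDAG &DAG) {
  std::vector<unsigned> Cycle(DAG.IsLongLatencyDef.size(), 0);
  for (std::size_t N = 0; N < Cycle.size(); ++N)
    for (const Edge &E : DAG.Edges)
      if (E.To == N)
        Cycle[N] = std::max(Cycle[N], Cycle[E.From] + E.Latency);
  return Cycle;
}
```

With two nodes (an FXLV-like def followed by an overwrite of the same 
register) joined by an output edge, calling stretchOutputDeps(DAG, 25) 
moves the earliest issue cycle of the overwrite out to the point where 
the long-latency def has completed.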

If you have a problem with subregs, did you try modelling the stalling 
subreg def as defining the whole vector reg, and then adjusting the 
register operand text when printing the output, or similar?

/Jonas

On 2017-04-10 19:50, Johnson, Nicholas Paul via llvm-dev wrote:
> (Thank you Alex Bradbury for publicizing this thread in the weekly)
>
> I'll update the thread with my partial solution. I have introduced a pseudo-instruction 'DontOverwriteFlexResults' as in Snippet 1 (below). That instruction has no effect. Then, I updated some instruction selection patterns so that they wrap every occurrence of FXLV within a DontOverwriteFlexResults pseudo-instruction (Snippet 2, below). The scheduler will attempt to schedule the pseudo-instruction to satisfy the long latency. This extends the live interval of the FXLV's result vector register, and prevents the register allocator from prematurely overwriting subvectors of the result register.
>
> This solution works in some cases, but doesn't yet support the case in which the FXLV result is completely unused, since the 'DontOverwriteFlexResults' pseudo will get dead-code-eliminated. I'm planning on marking the pseudo as side-effecting to inhibit dead code elimination, but still need a plan to prevent that from pessimizing the scheduler.
>
> Nick Johnson
> D. E. Shaw Research
>
>
> // Snippet 1
> // Here is a fancy fake instruction which prevents the compiler
> // from clobbering all or part of a flex api instruction's result.
> let hasNoSchedulingInfo = 1, mayLoad=0, mayStore=0, hasSideEffects=0, isAsCheapAsAMove=1  in
> {
>    def DontOverwriteFlexResults :
>      DesGCv3PseudoInst<
>        (outs VecRegs:$rd),
>        (ins  VecRegs:$rs),
>        "# DontOverwriteFlexResults_v4i32\t$rd",
>        []>
>    {
>      let Constraints = "$rd = $rs";
>    }
> }
>
> // Snippet 2
> def : Pat<
>    (v4i32 (Aligned16LoadFromFlex (i32 DesGCv3RegPlusInt26:$ptr) )),
>    (DontOverwriteFlexResults (v4i32 (FXLV_UNCOUNTED (i32 DesGCv3RegPlusInt26:$ptr) )))>;
>
>
>> -----Original Message-----
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>> Johnson, Nicholas Paul via llvm-dev
>> Sent: Monday, April 03, 2017 3:38 PM
>> To: llvm-dev at lists.llvm.org
>> Subject: [llvm-dev] Scheduler: modelling long register reservations?
>>
>> Hello,
>>
>> My out-of-tree target features some high latency instructions (let's call them
>> FXLV).  When an FXLV issues, it reserves its destination register and
>> execution continues; if a subsequent instruction attempts to read or write
>> that register, the pipeline will stall until the FXLV completes.  I have
>> attempted to encode this constraint in the machine scheduler (excerpt at
>> bottom of email).  This solves half of the problem: the scheduler moves any
>> instruction that reads the FXLV result register to a much later position.
>>
>> However, this doesn't solve all of the problem.  In particular, the scheduler
>> seems indifferent to an instruction which overwrites the FXLV's result
>> register---including instructions which overwrite only one lane of the vector
>> result.  Am I specifying the scheduling constraints incorrectly?  Can llvm
>> support this kind of constraint?
>>
>> Thank you,
>> Nick Johnson
>> D. E. Shaw Research
>>
>>
>> // Excerpted from lib/Target/MyTarget/MyTargetSchedule.td:
>> //
>> def DesGCv3GenericModel : SchedMachineModel
>> {
>>   let IssueWidth = 1;
>>   let MicroOpBufferSize = 0;
>>
>>   let CompleteModel = 1;
>> }
>> // ...
>> def FlexU        : ProcResource<64> { let BufferSize = 1; }
>> def : WriteRes<IIFlexRead,   [FlexU]>          { let Latency = 25; let ResourceCycles = [25]; }
>> // I apply this to the definition of the FXLV instruction
>> class SchedFlexRead    : Sched< [IIFlexRead] >;
>> // ...
>>
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev