[llvm-dev] Scheduler: modelling long register reservations?
Jonas Paulsson via llvm-dev
llvm-dev at lists.llvm.org
Wed Apr 12 00:25:26 PDT 2017
Hi Nick,
ScheduleDAGInstrs::addPhysRegDeps(SUnit *SU, unsigned OperIdx) is the
method that adds the edges, with their latencies, for output dependencies
(def -> def). Unfortunately, there currently doesn't seem to be a way to
specify the latency of output deps with computeOperandLatency() or similar.
One option might then be to add a DAG mutation (ScheduleDAGMutation) in
which you manually set the latency of the output edge to 25 after the DAG
has been built.
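To make the post-build fix-up concrete, here is a toy, self-contained
sketch of the idea. The types and the function name (Node, Edge,
raiseOutputLatency) are made up for illustration and are not the LLVM
classes. In a real target I would expect the same loop to live in a
ScheduleDAGMutation subclass, in its apply(ScheduleDAGInstrs *) override,
testing SDep::getKind() against SDep::Output and calling
SDep::setLatency(). As far as I know the real DAG keeps a separate copy of
each edge on the predecessor and successor side, so both copies would
likely need updating.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for the scheduler's DAG types; these are NOT the LLVM classes.
enum class DepKind { Data, Anti, Output };

struct Edge {
  std::size_t Pred;  // index of the predecessor node in the DAG vector
  DepKind Kind;      // kind of dependence carried by this edge
  unsigned Latency;  // cycles the successor must wait after the predecessor
};

struct Node {
  bool IsLongLatencyDef = false;  // e.g. an FXLV-like instruction
  std::vector<Edge> Preds;        // incoming dependence edges
};

// Post-build pass: raise every output (def -> def) edge whose predecessor is
// a long-latency def to at least MinLatency. Returns the number of edges
// that were changed.
unsigned raiseOutputLatency(std::vector<Node> &DAG, unsigned MinLatency) {
  unsigned Changed = 0;
  for (Node &N : DAG) {
    for (Edge &E : N.Preds) {
      if (E.Kind != DepKind::Output)
        continue;  // only def -> def edges are of interest here
      if (!DAG[E.Pred].IsLongLatencyDef)
        continue;  // ordinary defs keep their default latency
      if (E.Latency >= MinLatency)
        continue;  // already long enough
      E.Latency = MinLatency;
      ++Changed;
    }
  }
  return Changed;
}
```

With MinLatency = 25, a later write to the FXLV destination gets the same
25-cycle distance that the data edges to its readers already have, while
output edges between ordinary instructions are left alone.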
If subregs are the problem, did you try modelling the stalling subreg def
as a def of the whole vector reg, and then adjusting the register operand
text when the instruction is printed, or similar?
/Jonas
On 2017-04-10 19:50, Johnson, Nicholas Paul via llvm-dev wrote:
> (Thank you Alex Bradbury for publicizing this thread in the weekly)
>
> I'll update the thread with my partial solution. I have introduced a pseudo-instruction 'DontOverwriteFlexResults', as in Snippet 1 (below). That instruction has no effect. Then, I updated some instruction selection patterns so that they wrap every occurrence of FXLV within a DontOverwriteFlexResults pseudo-instruction (Snippet 2, below). The scheduler will attempt to schedule the pseudo-instruction to satisfy the long latency. This extends the live interval of the FXLV's result vector register, and prevents the register allocator from prematurely overwriting subvectors of the result register.
>
> This solution works in some cases, but doesn't yet support the case in which the FXLV result is completely unused, since the 'DontOverwriteFlexResults' pseudo will get dead-code-eliminated. I'm planning on marking the pseudo as side-effecting to inhibit dead code elimination, but I still need a plan to prevent that from pessimizing the scheduler.
>
> Nick Johnson
> D. E. Shaw Research
>
>
> // Snippet 1
> // Here is a fancy fake instruction which prevents the compiler
> // from clobbering all or part of a flex api instruction's result.
> let hasNoSchedulingInfo = 1, mayLoad=0, mayStore=0, hasSideEffects=0, isAsCheapAsAMove=1 in
> {
> def DontOverwriteFlexResults :
> DesGCv3PseudoInst<
> (outs VecRegs:$rd),
> (ins VecRegs:$rs),
> "# DontOverwriteFlexResults_v4i32\t$rd",
> []>
> {
> let Constraints = "$rd = $rs";
> }
> }
>
> // Snippet 2
> def : Pat<
> (v4i32 (Aligned16LoadFromFlex (i32 DesGCv3RegPlusInt26:$ptr) )),
> (DontOverwriteFlexResults (v4i32 (FXLV_UNCOUNTED (i32 DesGCv3RegPlusInt26:$ptr) )))>;
>
>
>> -----Original Message-----
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>> Johnson, Nicholas Paul via llvm-dev
>> Sent: Monday, April 03, 2017 3:38 PM
>> To: llvm-dev at lists.llvm.org
>> Subject: [llvm-dev] Scheduler: modelling long register reservations?
>>
>> Hello,
>>
>> My out-of-tree target features some high latency instructions (let's call them
>> FXLV). When an FXLV issues, it reserves its destination register and
>> execution continues; if a subsequent instruction attempts to read or write
>> that register, the pipeline will stall until the FXLV completes. I have
>> attempted to encode this constraint in the machine scheduler (excerpt at
>> bottom of email). This solves half of the problem: the scheduler moves any
>> instruction that reads the FXLV result register to a much later position.
>>
>> However, this doesn't solve all of the problem. In particular, the scheduler
>> seems indifferent to an instruction which overwrites the FXLV's result
>> register---including instructions which overwrite only one lane of the vector
>> result. Am I specifying the scheduling constraints incorrectly? Can llvm
>> support this kind of constraint?
>>
>> Thank you,
>> Nick Johnson
>> D. E. Shaw Research
>>
>>
>> // Excerpted from lib/Target/MyTarget/MyTargetSchedule.td:
>> //
>> def DesGCv3GenericModel : SchedMachineModel
>> {
>> let IssueWidth = 1;
>> let MicroOpBufferSize = 0;
>>
>> let CompleteModel = 1;
>> }
>> // ...
>> def FlexU : ProcResource<64> { let BufferSize = 1; }
>> def : WriteRes<IIFlexRead, [FlexU]> { let Latency = 25; let ResourceCycles = [25]; }
>> class SchedFlexRead : Sched< [IIFlexRead] >; // I apply this to the definition of FXLV instruction
>> // ...
>>
>>
>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev