[PATCH] D104730: [MCA] [AMDGPU] Adding CustomBehaviour implementation for AMDGPU.
Patrick Holland via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 24 10:31:12 PDT 2021
holland11 added a comment.
> ... could you change this to let SchedModel = SIFullSpeedModel, RetireOOO = 1 and the same for all the other let SchedModel = ... lines? Would that still work? I think that would make it slightly easier for the next person who cut'n'pastes one of these schedmodels to create a new one, to get it right.
Unfortunately, this does not build. The `RetireOOO` flag can only be applied to scheduling classes, so the `let` block that sets it can't contain any `InstRW` expressions.
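To illustrate the constraint, here is a minimal standalone sketch. It assumes (as far as I understand TableGen) that an outer `let` is rejected when a record inside it has no field with that name. `WriteResStandIn` and `InstRWStandIn` are hypothetical stand-ins I made up for this example, not the real classes from the scheduling infrastructure:

```
// Hypothetical stand-in for a write-resource class that carries the flag.
class WriteResStandIn<int latency> {
  int Latency = latency;
  bit RetireOOO = 0;     // the field exists, so an outer `let` can override it
}

// Hypothetical stand-in for InstRW: note that it has no RetireOOO field.
class InstRWStandIn<string instr> {
  string Instr = instr;
}

let RetireOOO = 1 in {
  def ExampleWrite : WriteResStandIn<5>;   // accepted: RetireOOO becomes 1
}

// Kept outside the `let RetireOOO` block. Moving this def inside the block
// is the pattern that fails to build in the real schedule model.
def ExampleCopy : InstRWStandIn<"COPY">;
```

Running this through `llvm-tblgen` should print both records, with `RetireOOO = 1` on `ExampleWrite` only.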
I understand and agree with your point though. Do you have any other ideas to achieve something similar? I'm a rookie when it comes to TableGen, so I can't think of a better way to do it than what I'm doing right now.
I could maybe format it a bit better. Something like (moving the flag to the top of the block so it's a bit more obvious):
  let SchedModel = GFX10SpeedModel in {

  let RetireOOO = 1 in { // llvm-mca specific flag
  // The latency values are 1 / (operations / cycle).
  // Add 1 stall cycle for VGPR read.
  def : HWWriteRes<Write32Bit,         [HWVALU, HWRC],   5>;
  def : HWWriteRes<WriteFloatCvt,      [HWVALU, HWRC],   5>;
  def : HWWriteRes<Write64Bit,         [HWVALU, HWRC],   6>;
  def : HWWriteRes<WriteTrans32,       [HWTransVALU, HWRC], 10>;
  def : HWWriteRes<WriteQuarterRate32, [HWVALU, HWRC],   8>;
  def : HWWriteRes<WriteFloatFMA,      [HWVALU, HWRC],   5>;
  def : HWWriteRes<WriteDouble,        [HWVALU, HWRC],   22>;
  def : HWWriteRes<WriteDoubleAdd,     [HWVALU, HWRC],   22>;
  def : HWWriteRes<WriteDoubleCvt,     [HWVALU, HWRC],   22>;
  def : HWWriteRes<WriteIntMul,        [HWVALU, HWRC],   8>;
  def : HWWriteRes<WriteTrans64,       [HWVALU, HWTransVALU, HWRC], 24>;

  def : HWWriteRes<WriteBranch,        [HWBranch],       32>;
  def : HWWriteRes<WriteExport,        [HWExport, HWRC], 16>;
  def : HWWriteRes<WriteLDS,           [HWLGKM,   HWRC], 20>;
  def : HWWriteRes<WriteSALU,          [HWSALU,   HWRC], 2>;
  def : HWWriteRes<WriteSMEM,          [HWLGKM,   HWRC], 20>;
  def : HWWriteRes<WriteVMEM,          [HWVMEM,   HWRC], 320>;
  def : HWWriteRes<WriteBarrier,       [HWBranch],       2000>;
  } // End RetireOOO = 1 (Can't be applied to InstRW expressions)

  def : InstRW<[WriteCopy], (instrs COPY)>;

  } // End SchedModel = GFX10SpeedModel
Let me know if you want me to make this change or have any other ideas.
> Thanks. The RetireOOO output looks much better. Independent VALU instructions can definitely execute while there are outstanding loads. It would be great to add a new mca test case that shows this effect directly, if you wouldn't mind.
I will update the current tests with your new `timeline-max-cycles` change and I will also add 1-2 new tests.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D104730/new/
https://reviews.llvm.org/D104730