[PATCH] D88291: [AMDGPU] Insert waterfall loops for divergent calls
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 29 02:48:28 PDT 2020
Flakebi added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:4897
+ MI.dump();
+ Begin->dump();
+ MachineBasicBlock::iterator I(&MI);
----------------
madhur13490 wrote:
> Dump() is not required.
Thanks, I forgot to remove them.
================
Comment at: llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:5127
+ MachineOperand *Dest = &MI.getOperand(0);
+ if (!RI.isSGPRClass(MRI.getRegClass(Dest->getReg()))) {
+ // Also move the copies to physical registers into the loop block
----------------
madhur13490 wrote:
> Should this block be executed for AGPRs too? If this is meant only for VGPRs then !SGPR is not correct.
I don’t know how AGPRs work and documentation seems to be scarce. The check for calls is the same as the one used for image operations above.
It seems that AGPRs can be copied to VGPRs but not to SGPRs (https://reviews.llvm.org/D68358#change-cnuV0NuhUhzL), so I think that calling a function pointer stored in AGPRs should copy it to VGPRs and insert a waterfall loop.
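For readers unfamiliar with the term, here is a minimal sketch of what a waterfall loop does. It is a scalar simulation in plain C++, not code from the patch. The `Wave` and `Iteration` types and the `waterfall` function are hypothetical names made up for illustration. The idea shown: while lanes remain, take the value from the first active lane as the uniform value, run the operation for all lanes that hold the same value, then drop those lanes and repeat.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a wavefront (not an LLVM type): one call target per
// lane, plus a bitmask of active lanes (bit i set means lane i is active).
struct Wave {
  std::vector<std::uint64_t> Target; // per-lane, possibly divergent, value
  std::uint64_t Exec;                // active-lane mask
};

// One loop iteration: the uniform target that would be called and the lanes
// enabled while calling it.
struct Iteration {
  std::uint64_t Target;
  std::uint64_t Lanes;
};

// Simulates a waterfall loop and returns the sequence of iterations.
std::vector<Iteration> waterfall(const Wave &W) {
  const unsigned NumLanes = static_cast<unsigned>(W.Target.size());
  assert(NumLanes <= 64 && "the mask in this sketch only has 64 bits");

  // Ignore mask bits that do not correspond to a lane.
  const std::uint64_t LaneMask =
      NumLanes == 64 ? ~std::uint64_t{0}
                     : ((std::uint64_t{1} << NumLanes) - 1);
  std::uint64_t Remaining = W.Exec & LaneMask;

  std::vector<Iteration> Trace;
  while (Remaining != 0) {
    // Analogous to a read-first-lane step: pick the lowest active lane.
    unsigned First = 0;
    while (((Remaining >> First) & 1) == 0)
      ++First;
    const std::uint64_t Uniform = W.Target[First];

    // Collect every remaining lane that holds the same value.
    std::uint64_t Match = 0;
    for (unsigned L = 0; L < NumLanes; ++L)
      if (((Remaining >> L) & 1) != 0 && W.Target[L] == Uniform)
        Match |= std::uint64_t{1} << L;

    // The real loop would perform the call here with only Match enabled.
    Trace.push_back({Uniform, Match});

    // Retire the handled lanes; the first lane always matches itself, so
    // the loop makes progress on every iteration.
    Remaining &= ~Match;
  }
  return Trace;
}
```

The loop runs once per distinct value among the active lanes, so a uniform function pointer costs a single iteration.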
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D88291/new/
https://reviews.llvm.org/D88291