[PATCH] D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32
Alexander via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 24 09:20:32 PDT 2022
alex-t marked 2 inline comments as done.
alex-t added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2390
BuildMI(*MBB, MI, DL, TII->get(AMDGPU::S_ADD_I32), ScaledReg)
- .addReg(ScaledReg, RegState::Kill)
- .addImm(-Offset);
+ .addReg(ScaledReg, RegState::Kill)
+ .addImm(-Offset);
----------------
foad wrote:
> Remove these changes from the patch.
This is obviously a mistake. "-Offset" cannot work here.
Given that we always have Reg and Offset aligned, hence even, we should just use S_SUBB_U32 here.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.h:96
+ unsigned getSCCLivenessScanMaxLength() const {
+ return 100;
+ }
----------------
rampitec wrote:
> 100 seems to large number to me. Compiling with spills is already sufficiently and painfully slow. I would not go more than 10 here. In a worst case you are now burning just 2 instructions and probably 8 cycles.
>
> Besides I do not think it deserves a special function, just a constant threshold right inside the code. Just because it is only used once.
> 100 seems to large number to me. Compiling with spills is already sufficiently and painfully slow. I would not go more than 10 here. In a worst case you are now burning just 2 instructions and probably 8 cycles.
>
> Besides I do not think it deserves a special function, just a constant threshold right inside the code. Just because it is only used once.
I would not agree. Really there may be tens of the odd SCC preserving code sequences in 100 lines length BB if it has frequent stack access. So, this is not just 2 instructions but 2*N. I can show you the example of assembly if necessary.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136169/new/
https://reviews.llvm.org/D136169
More information about the llvm-commits
mailing list