[PATCH] D29700: [AMDGPU] Implement register pressure callbacks
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 7 19:48:22 PST 2017
rampitec added inline comments.
================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.cpp:1509-1510
+ switch (RC->getID()) {
+ default:
+ return AMDGPURegisterInfo::getRegPressureLimit(RC, MF);
+ case AMDGPU::VGPR_32RegClassID:
----------------
arsenm wrote:
> Why doesn't this need to handle the tuples of registers?
Since we are tracking subregs, all superregs are counted as their parts, i.e. only 32-bit registers are relevant. If you check GCNScheduler, it does the same. That is in fact correct, since you want to count them in the same bucket.
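
To illustrate the bucket argument, here is a minimal toy model. It is not LLVM's pressure tracker: `LiveReg`, `Pressure` and `countPressure` are hypothetical names made up for this sketch. It only assumes that each live register, tuple or not, contributes one unit per 32-bit subregister to the bucket of its 32-bit class.

```cpp
#include <vector>

// Hypothetical toy model, not LLVM code: a live register described only by
// its bank and its width in bits.
struct LiveReg {
  bool IsVector;       // true: VGPR-based, false: SGPR-based
  unsigned SizeInBits; // 32 for a single register, 64/128/... for tuples
};

struct Pressure {
  unsigned VGPR32 = 0; // units charged to the 32-bit VGPR bucket
  unsigned SGPR32 = 0; // units charged to the 32-bit SGPR bucket
};

// With subregister tracking, a tuple is charged as its 32-bit parts, so no
// separate bucket per tuple class is needed.
inline Pressure countPressure(const std::vector<LiveReg> &Live) {
  Pressure P;
  for (const LiveReg &R : Live) {
    unsigned Units = R.SizeInBits / 32;
    if (R.IsVector)
      P.VGPR32 += Units;
    else
      P.SGPR32 += Units;
  }
  return P;
}
```

In this model a 128-bit vector tuple and a single 32-bit VGPR land in the same bucket, which is why a limit on the 32-bit class alone is enough.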
================
Comment at: lib/Target/AMDGPU/SIRegisterInfo.cpp:1513
+ return std::min(getMaxNumVGPRs(Occupancy), getMaxNumVGPRs(MF));
+ case AMDGPU::SGPR_32RegClassID:
+ return std::min(getMaxNumSGPRs(ST, Occupancy, true), getMaxNumSGPRs(MF));
----------------
arsenm wrote:
> SGPR register classes are also more complex because of the variants that exclude m0 and add vcc etc. Why don't those need to be handled?
Again, what matters are the real 32-bit SGPRs. Everything else is either counted through subregs or returns the RC size in the default case. What we need is to limit SGPRs and VGPRs more tightly than their RC size to handle occupancy.
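
The shape of the callback described above can be sketched in isolation as follows. This is a standalone approximation, not the patch itself: `RegClass`, `Limits` and `regPressureLimit` are hypothetical stand-ins, and the numbers a caller passes in are assumed to come from the occupancy and per-function queries shown in the diff.

```cpp
#include <algorithm>

// Hypothetical stand-in for the register class IDs the switch cares about.
enum class RegClass { VGPR_32, SGPR_32, Other };

// Hypothetical inputs: budgets implied by the target occupancy and by the
// function's own constraints.
struct Limits {
  unsigned OccupancyVGPRs;
  unsigned FunctionVGPRs;
  unsigned OccupancySGPRs;
  unsigned FunctionSGPRs;
};

// Only the two 32-bit classes get a limit tighter than the class size;
// every other class falls through to the default and reports ClassSize.
inline unsigned regPressureLimit(RegClass RC, unsigned ClassSize,
                                 const Limits &L) {
  switch (RC) {
  case RegClass::VGPR_32:
    return std::min(L.OccupancyVGPRs, L.FunctionVGPRs);
  case RegClass::SGPR_32:
    return std::min(L.OccupancySGPRs, L.FunctionSGPRs);
  default:
    return ClassSize;
  }
}
```

Taking the minimum of the two budgets means whichever constraint is stricter, occupancy or the function's own limit, decides the reported pressure limit.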
Repository:
rL LLVM
https://reviews.llvm.org/D29700