[PATCH] D29473: [AMDGPU] Unroll preferences improvements
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 2 16:31:16 PST 2017
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:93-96
+ if (ST->getGeneration() >= AMDGPUSubtarget::VOLCANIC_ISLANDS)
return 256;
+ else if (ST->getGeneration() >= AMDGPUSubtarget::SOUTHERN_ISLANDS)
+ return 128;
----------------
rampitec wrote:
> arsenm wrote:
> > I don't think we should change this away from the hardware sizes. I think this hook is only used by the vectorizers, which we don't use. We should define a separate constant for the alloca heuristic.
> The numbers here were just incorrect. SI to CI have 128 registers. It then makes sense to take the real register file size, which is target dependent, into consideration. As a TODO, we need to limit it further if we have occupancy attributes.
No, there have always been 256 VGPRs.
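
To make the suggestion above concrete, here is a minimal sketch of keeping the hardware register count separate from a heuristic budget. It is not the patch's code: the `Generation` enum, the function names, and the value 128 for the budget are all hypothetical stand-ins chosen for illustration, and pre-SI targets are deliberately not modeled.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget generation; not the LLVM enum.
enum class Generation { R600, SouthernIslands, SeaIslands, VolcanicIslands };

// Hardware size: SI and later are stated in this thread to have 256 VGPRs.
// Pre-SI targets are not modeled in this sketch, so 0 is returned for them.
constexpr unsigned hardwareVGPRCount(Generation Gen) {
  return Gen >= Generation::SouthernIslands ? 256u : 0u;
}

// Assumed tuning value for the alloca heuristic. It is kept apart from the
// hardware size so that retuning it never changes what the hook reports.
constexpr unsigned AllocaHeuristicRegBudget = 128;

// Budget used by the heuristic: the tuning value, capped at the hardware size.
constexpr unsigned allocaHeuristicBudget(Generation Gen) {
  return hardwareVGPRCount(Gen) < AllocaHeuristicRegBudget
             ? hardwareVGPRCount(Gen)
             : AllocaHeuristicRegBudget;
}
```

With this split, a register-count hook could keep returning the hardware value while the alloca heuristic reads only its own constant.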
Repository:
rL LLVM
https://reviews.llvm.org/D29473