[llvm] [AMDGPU] Enable kernarg preloading by default on gfx940 (PR #110691)
Austin Kerbow via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 2 13:35:00 PDT 2024
================
@@ -1014,12 +1014,49 @@ struct AAAMDGPUNoAGPR
const char AAAMDGPUNoAGPR::ID = 0;
+static unsigned getMaxNumPreloadArgs(const Function &F, const DataLayout &DL,
+ const TargetMachine &TM) {
+ const GCNSubtarget &ST = TM.getSubtarget<GCNSubtarget>(F);
+ unsigned Offset = 0;
+ unsigned ArgsToPreload = 0;
+ for (const auto &Arg : F.args()) {
+ if (Arg.hasByRefAttr())
+ break;
+
+ Type *Ty = Arg.getType();
+ Align ArgAlign = DL.getABITypeAlign(Ty);
+ auto Size = DL.getTypeAllocSize(Ty);
----------------
kerbowa wrote:
I think the alloc size is correct for computing the number of user SGPRs that are initially allocated for preloading: it maps directly onto how the data is laid out in the kernarg segment, and the number of registers used for preloading is derived from that layout.
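To illustrate the reasoning, here is a minimal standalone sketch of how the alloc size and ABI alignment could translate into an SGPR count. It is a simplified model, not the code in this PR: `KernArg`, `alignUp`, and `countPreloadableArgs` are hypothetical names, the sizes stand in for what `DL.getTypeAllocSize` and `DL.getABITypeAlign` would return, and it assumes each SGPR holds 4 bytes and that an argument is only preloaded if it fits entirely within the budget.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one kernel argument; not an LLVM type.
struct KernArg {
  uint64_t AllocSize; // bytes, as an alloc-size query would report
  uint64_t ABIAlign;  // bytes, must be a power of two
};

// Round Value up to the next multiple of Align (a power of two).
uint64_t alignUp(uint64_t Value, uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "alignment must be a power of two");
  return (Value + Align - 1) & ~(Align - 1);
}

// Count how many leading arguments fit within MaxSGPRs registers,
// assuming (for this sketch) 4 bytes per SGPR and that the registers
// mirror the kernarg segment layout byte for byte, padding included.
unsigned countPreloadableArgs(const std::vector<KernArg> &Args,
                              unsigned MaxSGPRs) {
  uint64_t Offset = 0;
  unsigned Count = 0;
  for (const KernArg &A : Args) {
    // Place the argument at its aligned offset in the segment.
    uint64_t Start = alignUp(Offset, A.ABIAlign);
    uint64_t End = Start + A.AllocSize;
    // Registers needed to cover everything up to the end of this argument.
    uint64_t SGPRsNeeded = (End + 3) / 4;
    if (SGPRsNeeded > MaxSGPRs)
      break; // this argument and all later ones are not preloaded
    Offset = End;
    ++Count;
  }
  return Count;
}
```

For example, an `i32` followed by an `i64` occupies bytes 0-3 and 8-15 in this model, so the pair needs four SGPRs even though only 12 bytes carry data; the alignment padding consumes a register too.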
> I think it would be better to be more precise (and maybe even make the inreg a hard requirement to respect)
I sort of agree, but we have to decide what to do when frontends like Triton just add inreg to every argument. Should we remove it in cases where we cannot preload the argument? Or print a warning if we cannot preload?
I'm leaning towards the first option, where we remove inreg from arguments that won't actually be preloaded, somewhere like AMDGPULowerKernelArguments, after all the attributes are finalized, etc.
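A rough standalone sketch of that first option, under stated assumptions: `ArgAttrs` and `dropUnusedInReg` are hypothetical names that model the attribute as a plain flag rather than using the LLVM attribute API, and the preloaded arguments are assumed to form a prefix of the argument list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of per-argument attributes; not an LLVM type.
struct ArgAttrs {
  bool InReg;
};

// Clear InReg on every argument past the first NumPreloaded ones and
// return how many flags were removed. Assumes preloaded arguments are
// a prefix of the argument list.
unsigned dropUnusedInReg(std::vector<ArgAttrs> &Args,
                         std::size_t NumPreloaded) {
  assert(NumPreloaded <= Args.size() &&
         "cannot preload more arguments than exist");
  unsigned Removed = 0;
  for (std::size_t I = NumPreloaded; I < Args.size(); ++I) {
    if (Args[I].InReg) {
      Args[I].InReg = false;
      ++Removed;
    }
  }
  return Removed;
}
```

Returning the count of removed flags would also make it easy to emit a diagnostic instead, if the warning option turns out to be preferable.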
https://github.com/llvm/llvm-project/pull/110691