[PATCH] D94648: [amdgpu] Implement lower function LDS pass
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 28 18:46:49 PST 2021
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp:218
+ LLVMContext &Ctx = M.getContext();
+ DataLayout const &DL = M.getDataLayout();
+
----------------
const first
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp:291
+ StructType *LDSTy = StructType::create(
+ Ctx, LocalVarTypes, llvm::StringRef("llvm.amdgcn.function.lds.t"));
+
----------------
Doesn't really have to do with functions anymore. llvm.amdgcn.module.lds.t?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp:299
+ M, LDSTy, false, GlobalValue::InternalLinkage, UndefValue::get(LDSTy),
+ "llvm.amdgcn.function.lds", nullptr, GlobalValue::NotThreadLocal,
+ AMDGPUAS::LOCAL_ADDRESS, false);
----------------
.module?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp:331-332
+ Function *Func = &I;
+ if (isKernelCC(Func)) {
+ if (!Kernels.contains(Func)) {
+ markUsedByKernel(Builder, Func, SGV);
----------------
Merge to one if
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp:102
+
+ // All users can be lowered by llc or removed from the used list
+ return false;
----------------
No reason to mention llc
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp:142
+
+ static void removeFromUsedList(Module &M, StringRef Name,
+ SmallPtrSetImpl<Constant *> &ToRemove) {
----------------
I feel like this should be a utility function somewhere else if it really doesn't already exist
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp:188
+ InlineAsm *IA =
+ InlineAsm::get(FTy, "// Alloc function lds block", "r", true);
+ Builder.CreateCall(
----------------
arsenm wrote:
> JonChesterfield wrote:
> > arsenm wrote:
> > > Definitely shouldn't be introducing inline asm, not sure why you are doing this. Also "r" is bad and we shouldn't support it
> > This would be a hack. I wanted a construct that looks like a use of the instance (and won't be deleted by IR passes and generates minimal code), so that other passes will accurately account for the amount of LDS used by a kernel. Specifically promoteAlloca but I may have missed some.
> >
> > An intrinsic that evaporates later would work. I haven't thought of an alternative, will see if a cleaner answer comes to mind.
> >
> > (aside: what's bad about r in particular? I'm unfamiliar with our inline assembler, perhaps there's an immediate option instead)
> Well since the allocation point isn't really fixed yet, whether this size is really correct is questionable. AMDGPUPromoteAlloca currently assumes a worst case placement for padding to avoid going over.
>
> r is "pick any register of any class". We have a hard split between VGPRs and SGPRs, so "r" is unpredictable and not very helpful.n
This is a tricky one. I think checking for a specific intrinsic global variable when allocating the kernel's LDS wouldn't be too bad. However, I did come up with this alternative hack:
```
@arst = addrspace(3) global [256 x i32] zeroinitializer
declare void @llvm.donothing()
define void @foo() {
call void @llvm.donothing() [ "dummyuse"([256 x i32] addrspace(3)* @arst)]
ret void
}
```
This at least solves the PromoteAlloca case. The use disappears for the actual codegen amount so that doesn't quite solve everything. I guess an intrinsic that takes a pointer and returns the same value would be slightly better without changing the existing LDS lowering
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94648/new/
https://reviews.llvm.org/D94648
More information about the llvm-commits
mailing list