[PATCH] D91516: [AMDGPU][WIP] Lower LDS Global Variables.
Mahesha S via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 12 20:38:52 PST 2021
hsmhsm marked 7 inline comments as done.
hsmhsm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerLDSGlobal.cpp:311
+ // FIXME: Looks like `Instruction` is not mutatable? hence requiring to clone.
+ //
+ // Create clone of `I`, say, it is `NewI`. Within `NewI`, replace the use(s)
----------------
arsenm wrote:
> You should only need to do the use replacement, you aren't changing the types of the instructions so cloning/hacking on them shouldn't be needed
Added FIXME comment, will see how to fix it.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerLDSGlobal.cpp:608
+ LDSToOffSet[LDS] = CurOffset;
+ CurOffset += DL.getTypeAllocSize(LDS->getValueType()).getFixedValue();
+ }
----------------
arsenm wrote:
> This still needs to add in alignment padding
Added FIXME comment, will see how to fix it.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerLDSGlobal.cpp:742
+ "null");
+ for (auto GI = CGNode->begin(), GE = CGNode->end(); GI != GE; ++GI) {
+ auto *CGN = GI->second;
----------------
arsenm wrote:
> I don't see why you need to build your own stack. The call graph already found the reachable functions for you
Added FIXME comment, will see how to fix it.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerLDSGlobal.cpp:776
+ SmallPtrSetImpl<Function *> &CalleeSet) {
+ if (auto *MD = CB->getMetadata(LLVMContext::MD_callees)) {
+ // The metadata "!callee" is available at the indirect call site `CB`, which
----------------
arsenm wrote:
> I think trying to handle callees is left for a later patch. Additionally, I think this should be the CallGraph analysis's responsibility to deal with
Added FIXME comment, will see how to fix it.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerLDSGlobal.cpp:854-857
+// Traverse `CallGraph` starting from the `CallGraphNode` associated with each
+// kernel `K` in DFS manner and collect all the callees which are reachable from
+// K (including indirectly called callees).
+void LowerLDSGlobalImpl::createKernelToCalleesMap() {
----------------
arsenm wrote:
> The callgraph should already give this to you. Iterating the call graph should give you all of the functions you care about. You don't actually need to worry about which functions call which, since you need to touch every function in the SCC
Added FIXME comment, will see how to fix it.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D91516/new/
https://reviews.llvm.org/D91516
More information about the llvm-commits
mailing list