[PATCH] D103225: [AMDGPU] Replace non-kernel function uses of LDS globals by pointers.

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 27 05:45:12 PDT 2021


hsmhsm added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPULDSUtils.cpp:44
+
+  // Collect all reachable call graph nodes from given call graph node.
+  SmallPtrSet<CallGraphNode *, 8> collectCGNodes(CallGraphNode *CGN) {
----------------
foad wrote:
> hsmhsm wrote:
> > foad wrote:
> > > You say "all reachable ... nodes" but then you only iterate over an SCC, which is different, and will miss some reachable nodes.
> > > 
> > > If you really want all reachable nodes then use depth_first_ext which will do it all for you. (See EliminateUnreachableBlocks in BasicBlockUtils.cpp for an example of how to use it.)
> > The comment is bit vague. I actually meant this - "In a call graph,  for a given (caller) node, collect all reachable *callee* nodes".  
> > 
> > Here, when I say reachable callee nodes, it includes those which are called within callees of caller, and so on. For example, say, f1 calls f2, f2 calls f3, and f3 calls f4. Then reachable callee set for f1 is {f2, f3, f4}
> > 
> > So, given f1, is not that SCC iterator will collect {f2, f3, f4} for me?
> > 
> > What are the cases that it cannot handle?
> See https://en.wikipedia.org/wiki/Strongly_connected_component
> 
> f4 will only be in the same SCC as f1 if there is a path from f1 to f4 **and back to f1 again**.
But, that is fine, it does not matter. What we are fundamentally doing here is - We collect all reachable SCCs from given kernel, then we collect all the nodes within those collected SCCs. 

Actually, I am not sure, if I get you here. Let me further think about it, in case if I am missing any obvious point here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103225/new/

https://reviews.llvm.org/D103225



More information about the llvm-commits mailing list