[PATCH] D117364: AMDGPU: Use module level register maximums for unknown callees

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 17 08:33:21 PST 2022


arsenm added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/amdpal-callable.ll:184
 ; GCN-NEXT:        .stack_frame_size_in_bytes: 0x10{{$}}
-; GCN-NEXT:        .vgpr_count:     0x29{{$}}
+; GCN-NEXT:        .vgpr_count:     0x40{{$}}
 ; GCN-NEXT:      no_stack_extern_call_many_args:
----------------
sebastian-ne wrote:
> This is over-approximating the vgpr_count whenever an indirect call is involved, which is quite a performance hit.
> 
> Can we switch AMDGPUResourceUsageAnalysis to a ModulePass and run `propagateIndirectCallRegisterUsage` at the end, so that all functions with indirect calls will get the maximum VGPR count of all functions in the module?
> (As opposed to max VGPR count of the SCC that is used currently, which I did not intend.)
I'd like to move the switch to a module pass into a follow-up patch. I'm a bit afraid of unintended side effects from making that switch. We're already paying a compile-time cost by using SCC codegen, and module passes will be worse.
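
For reference, a minimal sketch of the module-level propagation suggested in the quoted comment: every function containing an indirect call is raised to the maximum VGPR count of any function in the module. The `FuncInfo` struct and `propagateModuleMax` are hypothetical stand-ins for illustration only; they are not the types or API of AMDGPUResourceUsageAnalysis, and the real pass tracks more than VGPRs.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-function record; not the LLVM data structure.
struct FuncInfo {
  std::string Name;
  uint32_t NumVGPR;     // VGPRs used by the function itself
  bool HasIndirectCall; // true if the callee set is unknown
};

// Assumed behavior: an indirect call may reach any function in the module,
// so such callers are raised to the module-wide maximum VGPR count.
void propagateModuleMax(std::vector<FuncInfo> &Funcs) {
  uint32_t MaxVGPR = 0;
  for (const FuncInfo &F : Funcs)
    MaxVGPR = std::max(MaxVGPR, F.NumVGPR);

  for (FuncInfo &F : Funcs)
    if (F.HasIndirectCall)
      F.NumVGPR = std::max(F.NumVGPR, MaxVGPR);
}
```

Functions without indirect calls keep their own counts, so only callers with unknown callees are over-approximated.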


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117364/new/

https://reviews.llvm.org/D117364


