[PATCH] D117364: AMDGPU: Use module level register maximums for unknown callees

Sebastian Neubauer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 17 02:31:49 PST 2022


sebastian-ne requested changes to this revision.
sebastian-ne added inline comments.
This revision now requires changes to proceed.


================
Comment at: llvm/test/CodeGen/AMDGPU/amdpal-callable.ll:184
 ; GCN-NEXT:        .stack_frame_size_in_bytes: 0x10{{$}}
-; GCN-NEXT:        .vgpr_count:     0x29{{$}}
+; GCN-NEXT:        .vgpr_count:     0x40{{$}}
 ; GCN-NEXT:      no_stack_extern_call_many_args:
----------------
This is over-approximating the vgpr_count whenever an indirect call is involved, which is quite a performance hit.

Can we switch AMDGPUResourceUsageAnalysis to a ModulePass and run `propagateIndirectCallRegisterUsage` at the end, so that all functions with indirect calls will get the maximum VGPR count of all functions in the module?
(As opposed to the max VGPR count of the SCC, which is what is used currently and which I did not intend.)
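
To illustrate the suggestion, here is a minimal standalone sketch of the module-level propagation. It does not use the LLVM API: `FunctionResourceInfo`, `propagateModuleMaxVGPRs`, and the flat vector standing in for a module are hypothetical names made up for this example, and the real `propagateIndirectCallRegisterUsage` may look quite different.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-function data the analysis records.
struct FunctionResourceInfo {
  std::string Name;
  uint32_t NumVGPR;
  bool HasIndirectCall;
};

// Assumed shape of the end-of-module step: compute the maximum VGPR count
// over all functions in the module, then raise every function that contains
// an indirect call to that maximum. Functions without indirect calls keep
// their own count. Returns the module-wide maximum.
uint32_t propagateModuleMaxVGPRs(std::vector<FunctionResourceInfo> &Module) {
  uint32_t MaxVGPR = 0;
  for (const FunctionResourceInfo &F : Module)
    MaxVGPR = std::max(MaxVGPR, F.NumVGPR);

  for (FunctionResourceInfo &F : Module)
    if (F.HasIndirectCall)
      F.NumVGPR = std::max(F.NumVGPR, MaxVGPR);

  return MaxVGPR;
}
```

The point of running this once over the whole module, rather than per SCC, is that an indirect callee can be any function in the module, so the bound has to cover all of them while staying tighter than a fixed worst-case value.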


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117364/new/

https://reviews.llvm.org/D117364


