[PATCH] D117364: AMDGPU: Use module level register maximums for unknown callees
Sebastian Neubauer via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jan 17 02:31:49 PST 2022
sebastian-ne requested changes to this revision.
sebastian-ne added inline comments.
This revision now requires changes to proceed.
================
Comment at: llvm/test/CodeGen/AMDGPU/amdpal-callable.ll:184
; GCN-NEXT: .stack_frame_size_in_bytes: 0x10{{$}}
-; GCN-NEXT: .vgpr_count: 0x29{{$}}
+; GCN-NEXT: .vgpr_count: 0x40{{$}}
; GCN-NEXT: no_stack_extern_call_many_args:
----------------
This over-approximates the vgpr_count whenever an indirect call is involved, which is quite a performance hit.
Can we switch AMDGPUResourceUsageAnalysis to a ModulePass and run `propagateIndirectCallRegisterUsage` at the end, so that all functions with indirect calls get the maximum VGPR count of all functions in the module?
(As opposed to the max VGPR count of the SCC, which is what is used currently and which I did not intend.)
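
To make the suggestion concrete, here is a minimal standalone sketch of the propagation step, not the actual LLVM implementation. `FunctionUsage` and `propagateModuleMaxVGPR` are hypothetical names made up for illustration; the real pass works on its own per-function resource info and also tracks other registers and stack size. The sketch assumes the per-function counts have already been computed for the whole module.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-function record (an assumption for this sketch, not the
// data structure used by AMDGPUResourceUsageAnalysis).
struct FunctionUsage {
  std::string Name;
  int NumVGPR;          // VGPRs needed by the function and its known callees
  bool HasIndirectCall; // true if the function may call an unknown callee
};

// Hypothetical helper: raise every function that contains an indirect call
// to the maximum VGPR count found anywhere in the module. Functions without
// indirect calls keep their exact count. Returns the module-wide maximum.
int propagateModuleMaxVGPR(std::vector<FunctionUsage> &Module) {
  int MaxVGPR = 0;
  // First sweep: find the module-wide maximum.
  for (const FunctionUsage &F : Module) {
    assert(F.NumVGPR >= 0 && "register counts are non-negative");
    MaxVGPR = std::max(MaxVGPR, F.NumVGPR);
  }
  // Second sweep: only functions with unknown callees are bumped.
  for (FunctionUsage &F : Module)
    if (F.HasIndirectCall)
      F.NumVGPR = MaxVGPR;
  return MaxVGPR;
}
```

The two sweeps are why this needs to run once all functions in the module have been visited: the maximum is only known after every function's own usage is available.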
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D117364/new/
https://reviews.llvm.org/D117364