[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)
Janek van Oirschot via cfe-commits
cfe-commits at lists.llvm.org
Fri Aug 16 06:23:37 PDT 2024
================
@@ -3025,8 +3025,8 @@ define amdgpu_kernel void @dyn_extract_v5f64_s_s(ptr addrspace(1) %out, i32 %sel
; GPRIDX-NEXT: amd_machine_version_stepping = 0
; GPRIDX-NEXT: kernel_code_entry_byte_offset = 256
; GPRIDX-NEXT: kernel_code_prefetch_byte_size = 0
-; GPRIDX-NEXT: granulated_workitem_vgpr_count = 0
-; GPRIDX-NEXT: granulated_wavefront_sgpr_count = 1
+; GPRIDX-NEXT: granulated_workitem_vgpr_count = (11468800|(((((alignto(max(max(totalnumvgprs(dyn_extract_v5f64_s_s.num_agpr, dyn_extract_v5f64_s_s.num_vgpr), 1, 0), 1), 4))/4)-1)&63)|(((((alignto(max(max(dyn_extract_v5f64_s_s.num_sgpr+(extrasgprs(dyn_extract_v5f64_s_s.uses_vcc, dyn_extract_v5f64_s_s.uses_flat_scratch, 1)), 1, 0), 1), 8))/8)-1)&15)<<6)))&63
----------------
JanekvO wrote:
If any sub-expression of an MCExpr is unknown/unresolvable at this point in time (i.e., asm printing for a particular MachineFunction), the MCExpr is printed in its most verbose form. It doesn't help that both `granulated_workitem_vgpr_count` and `granulated_wavefront_sgpr_count` are basically the same MCExpr, masked for only the relevant bits (i.e., `compute_pgm_resource1_registers` masked for whatever we want to retrieve).
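To illustrate the "same expression, different mask" point, here is a minimal sketch in plain Python rather than actual MCExpr code. The helper names (`align_to`, `granulated`, `pack_rsrc1`, and the two field extractors) are hypothetical, and the register counts are made-up inputs. The VGPR field layout (`& 63` at bit 0) and the constant `11468800` are taken from the printed expression above; the SGPR extraction (`>> 6`, `& 15`) is an assumption inferred from the `<<6` in that same expression. The sketch also simplifies the `max(..., 1, 0)` sub-expression to a plain clamp to 1.

```python
def align_to(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def granulated(count: int, granule: int) -> int:
    """Encode a register count as (number of granules - 1), using at least 1 register."""
    return align_to(max(count, 1), granule) // granule - 1


def pack_rsrc1(num_vgpr: int, num_sgpr: int, other_bits: int = 11468800) -> int:
    """Compose one register word from both granulated counts.

    other_bits stands in for the remaining fields; 11468800 is the constant
    that appears in the printed expression.
    """
    vgpr_field = granulated(num_vgpr, 4) & 63   # bits 0..5
    sgpr_field = granulated(num_sgpr, 8) & 15   # bits 6..9
    return other_bits | vgpr_field | (sgpr_field << 6)


def vgpr_count_field(rsrc1: int) -> int:
    """Mask out the VGPR field, mirroring the trailing '&63' in the printed expression."""
    return rsrc1 & 63


def sgpr_count_field(rsrc1: int) -> int:
    """Assumed counterpart for the SGPR field: shift down by 6, keep 4 bits."""
    return (rsrc1 >> 6) & 15


# Hypothetical inputs: 3 VGPRs and 16 SGPRs (including any extra SGPRs).
rsrc1 = pack_rsrc1(num_vgpr=3, num_sgpr=16)
print(vgpr_count_field(rsrc1), sgpr_count_field(rsrc1))
```

Both fields are read back from the same composed word, which is why each one drags the full register expression along when any of the symbols in it is still unresolved at print time.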
I was thinking of explicitly splitting all of the components that compose any of the `compute_pgm_resourceX` registers into their own MCExprs and deferring computation of the `compute_pgm_resourceX` register until it is actually used. However, this wouldn't help resolve the unknowns/unresolvables at the time of printing the `amd_kernel_code_t` metadata. Do let me know if splitting the composed registers is still desired, though.
https://github.com/llvm/llvm-project/pull/102913