[llvm] [AMDGPU] SIWholeQuadMode: Ensure earliest WQM entry point for PS (PR #123266)
Nicolai Hähnle via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 17 10:06:10 PST 2025
================
@@ -5,12 +5,16 @@
define amdgpu_ps float @_amdgpu_ps_main() #0 {
; GFX10-LABEL: _amdgpu_ps_main:
; GFX10: ; %bb.0: ; %.entry
+; GFX10-NEXT: s_mov_b32 s0, exec_lo
+; GFX10-NEXT: s_wqm_b32 exec_lo, exec_lo
; GFX10-NEXT: image_sample v[0:1], v[0:1], s[0:7], s[0:3] dmask:0x3 dim:SQ_RSRC_IMG_2D
----------------
nhaehnle wrote:
The image_sample itself doesn't have to run in WQM because all helper lanes' values happen to get loaded anyway. So only the instructions that *produce* the inputs need to run in WQM.
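
For readers unfamiliar with the `s_wqm_b32` line added in the hunk above, here is a minimal Python sketch of what that instruction computes on a wave32 exec mask. It assumes the usual definition: each aligned group of four lanes (a quad) becomes fully enabled if any lane in it is enabled. The function name and the sample mask are illustrative only and are not taken from the PR.

```python
def s_wqm_b32(mask: int) -> int:
    """Model of s_wqm_b32 on a 32-bit lane mask (illustrative helper).

    For every aligned 4-lane quad, enable all four lanes if at least
    one lane of that quad is enabled in the input mask.
    """
    result = 0
    for quad in range(8):              # 32 lanes / 4 lanes per quad
        quad_bits = 0xF << (4 * quad)  # the four lanes of this quad
        if mask & quad_bits:
            result |= quad_bits
    return result


# Hypothetical exec mask: lanes 0, 2 and 8 are live pixels.
live = 0x00000105
wqm = s_wqm_b32(live)    # exec while in whole quad mode
helpers = wqm & ~live    # lanes enabled only as helper lanes
print(hex(wqm), hex(helpers))
```

The lanes in `helpers` exist only so that the instructions producing the sample's inputs compute values for the whole quad; the original mask (saved to `s0` in the hunk) identifies the live lanes.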
https://github.com/llvm/llvm-project/pull/123266