[llvm] [AMDGPU] SIWholeQuadMode: Ensure earliest WQM entry point for PS (PR #123266)

Nicolai Hähnle via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 17 10:06:10 PST 2025


================
@@ -5,12 +5,16 @@
 define amdgpu_ps float @_amdgpu_ps_main() #0 {
 ; GFX10-LABEL: _amdgpu_ps_main:
 ; GFX10:       ; %bb.0: ; %.entry
+; GFX10-NEXT:    s_mov_b32 s0, exec_lo
+; GFX10-NEXT:    s_wqm_b32 exec_lo, exec_lo
 ; GFX10-NEXT:    image_sample v[0:1], v[0:1], s[0:7], s[0:3] dmask:0x3 dim:SQ_RSRC_IMG_2D
----------------
nhaehnle wrote:

The image_sample itself doesn't have to run in WQM because all helper lanes' values happen to get loaded anyway. So only the instructions that *produce* the inputs need to run in WQM.
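
For reference, a minimal sketch of what the added `s_wqm_b32 exec_lo, exec_lo` does to the mask, assuming the usual quad layout of four consecutive lanes per quad. The function name `wqm_b32` is made up for illustration. It models only the mask computation, not the SIWholeQuadMode pass or any LLVM API.

```python
def wqm_b32(exec_mask: int) -> int:
    """Rough model of s_wqm_b32: if any lane of a quad is active,
    enable all four lanes of that quad (the extra ones are helper lanes)."""
    result = 0
    for quad in range(8):  # 32 lanes / 4 lanes per quad (assumed layout)
        quad_bits = 0xF << (4 * quad)
        if exec_mask & quad_bits:
            result |= quad_bits
    return result


# Hypothetical exact-mode mask: only lanes 5 and 8 are live.
live = 0x00000120
whole_quad = wqm_b32(live)
helpers = whole_quad & ~live  # lanes enabled only to feed derivatives

print(hex(whole_quad))  # 0xff0
print(hex(helpers))     # 0xed0
```

The helper lanes exist so that the instructions computing the sample coordinates produce valid values in every lane of the quad. That is the sense in which only the producers of the inputs need WQM.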

https://github.com/llvm/llvm-project/pull/123266

