[llvm] [AMDGPU] Create an AMDGPUIfConverter pass (PR #106415)
Juan Manuel Martinez Caamaño via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 18 09:04:50 PDT 2024
jmmartinez wrote:
Something like this:
```asm
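s_mov_b64 exec_backup, exec                 ; save EXEC with a scalar move (wave64 assumed; exec_backup is a placeholder SGPR pair)
v_cmpx_lt_i32 exec, s0, 1                   ; narrow EXEC to the lanes that take the branch
s_load_dwordx2 s[6:7], s[4:5], 0x0          ; scalar loads are not predicated by EXEC
s_load_dwordx4 s[0:3], s[4:5], 0x10
s_waitcnt lgkmcnt(0)
s_load_dword s6, s[6:7], 0x0
s_load_dword s4, s[4:5], 0x20
s_waitcnt lgkmcnt(0)
v_mov_b32_e32 v0, s6
v_mov_b32_e32 v1, s4
buffer_store_dword v0, v1, s[0:3], 0 offen  ; vector store, masked by EXEC
s_mov_b64 exec, exec_backup                 ; restore the saved EXEC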
```
> Is that really viable, given that the s_load instructions can't be predicated with EXEC?
This depends on whether the loaded pointers are marked as dereferenceable. If they are, we can safely execute the loads regardless of EXEC.
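To illustrate what "marked as dereferenceable" could look like at the IR level, here is a minimal sketch. It is not taken from the PR: the kernel name, argument names, and byte counts are made up for illustration. The `dereferenceable(4)` parameter attribute tells LLVM that 4 bytes behind `%p` can always be read, so the load in `%then` may be executed even on paths where the branch is not taken. The store carries no such guarantee and still has to stay under the EXEC mask.

```llvm
; Hypothetical kernel; names and sizes are illustrative assumptions only.
define amdgpu_kernel void @example(ptr addrspace(4) dereferenceable(4) align 4 %p,
                                   i32 %c,
                                   ptr addrspace(1) %out) {
entry:
  %cond = icmp slt i32 %c, 1
  br i1 %cond, label %then, label %end

then:
  ; Safe to speculate: %p is known dereferenceable for 4 bytes.
  %v = load i32, ptr addrspace(4) %p, align 4
  ; Has side effects: must only happen for lanes where %cond is true.
  store i32 %v, ptr addrspace(1) %out, align 4
  br label %end

end:
  ret void
}
```

Without the attribute, the same load could fault if it ran when `%cond` is false, so it could not be hoisted out of the conditional region.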
https://github.com/llvm/llvm-project/pull/106415