[llvm] [AMDGPU] Create an AMDGPUIfConverter pass (PR #106415)

Juan Manuel Martinez Caamaño via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 18 09:04:50 PDT 2024


jmmartinez wrote:

Something like this:

```asm
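        s_mov_b64 exec_backup, exec             ; save EXEC (exec_backup stands for a free SGPR pair, wave64)
        v_cmpx_lt_i32 exec, s0, 1               ; narrow EXEC to the lanes that take the branch
        s_load_dwordx2 s[6:7], s[4:5], 0x0      ; scalar loads run regardless of EXEC
        s_load_dwordx4 s[0:3], s[4:5], 0x10
        s_waitcnt lgkmcnt(0)
        s_load_dword s6, s[6:7], 0x0            ; load through the pointer fetched above
        s_load_dword s4, s[4:5], 0x20
        s_waitcnt lgkmcnt(0)
        v_mov_b32_e32 v0, s6
        v_mov_b32_e32 v1, s4
        buffer_store_dword v0, v1, s[0:3], 0 offen ; vector store, masked by EXEC
        s_mov_b64 exec, exec_backup             ; restore EXEC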
```

> Is that really viable, given that the s_load instructions can't be predicated with EXEC?

That depends on whether the loaded pointers are marked as dereferenceable. If they are, the scalar loads can safely be executed regardless of EXEC.
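As a toy illustration of that rule, the legality check could be modeled roughly as below. This is only a sketch, not LLVM code: `Inst`, `is_scalar_load`, `dereferenceable`, and `can_if_convert` are hypothetical names made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Hypothetical stand-in for a machine instruction."""
    opcode: str
    is_scalar_load: bool = False   # s_load_*: executes regardless of EXEC
    dereferenceable: bool = False  # address known valid even if the branch is not taken


def can_if_convert(block):
    """Return True if masking EXEC is enough to run `block` unconditionally."""
    for inst in block:
        # Vector instructions are predicated by EXEC, so masking suffices.
        # Scalar loads are not, so each one must be safe to execute speculatively.
        if inst.is_scalar_load and not inst.dereferenceable:
            return False
    return True


# One load goes through a pointer that is not known to be dereferenceable.
then_block = [
    Inst("s_load_dwordx2", is_scalar_load=True, dereferenceable=True),
    Inst("s_load_dword", is_scalar_load=True, dereferenceable=False),
    Inst("buffer_store_dword"),
]
```

Under this model, `can_if_convert(then_block)` rejects the block because of the second load; marking that load's pointer as dereferenceable would make the block acceptable.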

https://github.com/llvm/llvm-project/pull/106415

