[llvm] [amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic (PR #105822)

Sebastian Neubauer via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 26 04:42:43 PDT 2024


Flakebi wrote:

> Can this replace llvm.amdgcn.init.exec? (Sorry if that's a dumb question, I have not grokked your new intrinsic yet.)

When used at the start of a function, their use-cases are rather different. `llvm.amdgcn.init.exec` **adds** an instruction that sets exec, whereas `llvm.amdgcn.init.whole.wave` **removes** an instruction that sets exec (the goal being that the frontend can move the exec-setting to the end of the caller, so that its latency overlaps with the latency of the call).
Maybe there are cases where `llvm.amdgcn.init.exec` is currently used inside a function? (That could then be a candidate to use the new intrinsic, but it seems adventurous to use `llvm.amdgcn.init.exec` that way.)
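
For contrast, a minimal sketch of the usual `llvm.amdgcn.init.exec` pattern in an entry function, where the intrinsic adds an exec write at the start. The function name and body here are hypothetical and only meant to illustrate the "adds an instruction" side of the comparison:

```llvm
declare void @llvm.amdgcn.init.exec(i64 immarg)

; Hypothetical pixel shader: the intrinsic requests that exec be set to the
; given constant mask (all lanes enabled here) on entry.
define amdgpu_ps float @init_exec_example(float %a, float %b) {
main_body:
  call void @llvm.amdgcn.init.exec(i64 -1)
  %sum = fadd float %a, %b
  ret float %sum
}
```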


Is there a test-case that does not use whole-wave-mode?
Due to whole-wave-mode, there’s a lot of exec-setting and prolog/epilog stuff going on.
I imagine the code that uses this new intrinsic does not use wwm as much, as the `tail` is effectively running in whole-wave-mode :)

```llvm
define amdgpu_cs_chain void @basic(<3 x i32> inreg %sgpr, ptr inreg %callee, i32 inreg %exec, { i32, ptr addrspace(5), i32, i32 } %vgpr, i32 %x, i32 %y) {
entry:
  %entry_exec = call i1 @llvm.amdgcn.init.whole.wave()
  br i1 %entry_exec, label %shader, label %tail

shader:
  %newx = add i32 %x, 42
  %oldval = extractvalue { i32, ptr addrspace(5), i32, i32 } %vgpr, 0
  %newval = add i32 %oldval, 5
  %newvgpr = insertvalue { i32, ptr addrspace(5), i32, i32 } %vgpr, i32 %newval, 0

  br label %tail

tail:
  %full.x = phi i32 [%x, %entry], [%newx, %shader]
  %full.vgpr = phi { i32, ptr addrspace(5), i32, i32 } [%vgpr, %entry], [%newvgpr, %shader]
  call void(ptr, i32, <3 x i32>, { i32, ptr addrspace(5), i32, i32 }, i32, ...) @llvm.amdgcn.cs.chain(ptr %callee, i32 %exec, <3 x i32> inreg %sgpr, { i32, ptr addrspace(5), i32, i32 } %full.vgpr, i32 %full.x)
  unreachable
}
```

https://github.com/llvm/llvm-project/pull/105822

