[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

Fri May 31 03:05:22 PDT 2024

jayfoad wrote:

There is a latent problem with convergence control here. If you add a new test case like this:
```diff

diff --git a/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll b/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
index 238f6ab39e83..22995083293d 100644
--- a/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
+++ b/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
@@ -55,6 +55,21 @@ else:
   ret i32 %p
 }
 
+define i64 @basic_branch_i64(i64 %src, i1 %cond) #0 {
+entry:
+  %t = call token @llvm.experimental.convergence.anchor()
+  %x = add i64 %src, 1
+  br i1 %cond, label %then, label %else
+
+then:
+  %r = call i64 @llvm.amdgcn.readfirstlane.i64(i64 %x) [ "convergencectrl"(token %t) ]
+  br label %else
+
+else:
+  %p = phi i64 [%r, %then], [%x, %entry]
+  ret i64 %p
+}
+
 ; CHECK-LABEL: name:            basic_loop
 ;       CHECK:    [[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL_ANCHOR
 ;       CHECK:  bb.[[#]].loop:
```
Then it will fail machine verification with:
```
*** Bad machine code: Cannot mix controlled and uncontrolled convergence in the same function. ***
```
This is related to #87509. Since the `readlane`/`readfirstlane`/`writelane` intrinsics are `IntrConvergent`, the corresponding ISD nodes should be marked with `SDNPInGlue` or `SDNPOptInGlue`. @ssahasra FYI

https://github.com/llvm/llvm-project/pull/89217