[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)
Jay Foad via cfe-commits
cfe-commits at lists.llvm.org
Fri May 31 03:05:22 PDT 2024
jayfoad wrote:
There is a latent problem to do with convergence. If you add a new test case like this:
```diff
diff --git a/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll b/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
index 238f6ab39e83..22995083293d 100644
--- a/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
+++ b/llvm/test/CodeGen/AMDGPU/convergence-tokens.ll
@@ -55,6 +55,21 @@ else:
   ret i32 %p
 }
 
+define i64 @basic_branch_i64(i64 %src, i1 %cond) #0 {
+entry:
+  %t = call token @llvm.experimental.convergence.anchor()
+  %x = add i64 %src, 1
+  br i1 %cond, label %then, label %else
+
+then:
+  %r = call i64 @llvm.amdgcn.readfirstlane.i64(i64 %x) [ "convergencectrl"(token %t) ]
+  br label %else
+
+else:
+  %p = phi i64 [%r, %then], [%x, %entry]
+  ret i64 %p
+}
+
 ; CHECK-LABEL: name: basic_loop
 ; CHECK: [[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL_ANCHOR
 ; CHECK: bb.[[#]].loop:
```
Then it will fail with:
```
*** Bad machine code: Cannot mix controlled and uncontrolled convergence in the same function. ***
```
This is related to #87509. Since the readlane/readfirstlane/writelane intrinsics are IntrConvergent, the corresponding ISD nodes should be marked with SDNPInGlue or SDNPOptInGlue. @ssahasra FYI
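
To illustrate the rule behind that verifier message, here is a toy model in Python. It is only a sketch and not LLVM's verifier code: the `Op` type, `check_convergence`, and the op names are hypothetical. It also assumes that the i64 readfirstlane is lowered into two 32-bit pieces that no longer carry the token, which is my reading of the failure rather than something stated above.

```python
from dataclasses import dataclass
from typing import List, Optional

MIX_ERROR = ("Cannot mix controlled and uncontrolled convergence "
             "in the same function.")


@dataclass
class Op:
    """Hypothetical stand-in for one operation in a function."""
    name: str
    convergent: bool
    # Convergence token this op defines or uses; None means no token.
    token: Optional[str] = None


def check_convergence(ops: List[Op]) -> Optional[str]:
    """Return MIX_ERROR if convergent ops with and without tokens coexist."""
    kinds = set()
    for op in ops:
        if not op.convergent:
            continue  # non-convergent ops are irrelevant to the rule
        kinds.add("controlled" if op.token is not None else "uncontrolled")
    return MIX_ERROR if len(kinds) == 2 else None


anchor = Op("CONVERGENCECTRL_ANCHOR", convergent=True, token="t")

# At the IR level the readfirstlane call still carries the token.
ir_level = [
    anchor,
    Op("add", convergent=False),
    Op("readfirstlane.i64", convergent=True, token="t"),
]

# Assumed shape after lowering: two 32-bit halves, token dropped.
after_lowering = [
    anchor,
    Op("add", convergent=False),
    Op("readfirstlane (lo half)", convergent=True),
    Op("readfirstlane (hi half)", convergent=True),
]

print(check_convergence(ir_level))
print(check_convergence(after_lowering))
```

In this model the first function is accepted and the second is rejected, which is the situation the glue flags on the ISD nodes are meant to avoid by letting the token stay attached through instruction selection.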
https://github.com/llvm/llvm-project/pull/89217