[PATCH] D74400: AMDGPU: llvm.amdgcn.writelane is a source of divergence
Nicolai Hähnle via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 11 05:43:26 PST 2020
nhaehnle created this revision.
nhaehnle added reviewers: arsenm, foad, mareko.
Herald added subscribers: kerbowa, hiraditya, t-tye, tpr, dstuttard, yaxunl, wdng, jvesely, kzhuravl.
Herald added a project: LLVM.
Consider:
%r = call i32 @llvm.amdgcn.writelane(i32 0, i32 1, i32 2)
This produces a value that is 0 on lane 1 and 2 on every other lane;
i.e., the result is divergent even though all three operands are uniform.
Reported-by: Marek Olsak <Marek.Olsak at amd.com>
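
The lane-by-lane result described above can be sketched with a small
Python model. This is only an illustration under assumptions, not how LLVM
or the hardware implements it: the function names `writelane` and
`is_divergent` and the wave size of 64 are hypothetical choices made for
the sketch, and the operands are taken to be uniform across the wave.

```python
# Hypothetical model of the per-lane result of llvm.amdgcn.writelane.
# Assumptions: a wave of 64 lanes, and all three operands are uniform.

def writelane(src, lane_select, old, wave_size=64):
    """Return the value each lane observes: `src` on the selected lane,
    `old` on every other lane."""
    return [src if lane == lane_select else old for lane in range(wave_size)]


def is_divergent(per_lane_values):
    """A value is divergent if at least two lanes observe different values."""
    return len(set(per_lane_values)) > 1


# The call from the commit message: writelane(i32 0, i32 1, i32 2)
result = writelane(0, 1, 2)
print(result[:4])            # values seen by lanes 0..3
print(is_divergent(result))  # uniform inputs, yet a non-uniform result
```

In this model the result is uniform only in the degenerate case where the
written value equals the old value, which is why the intrinsic has to be
treated conservatively as a source of divergence.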
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D74400
Files:
llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
llvm/test/Analysis/DivergenceAnalysis/AMDGPU/intrinsics.ll
Index: llvm/test/Analysis/DivergenceAnalysis/AMDGPU/intrinsics.ll
===================================================================
--- llvm/test/Analysis/DivergenceAnalysis/AMDGPU/intrinsics.ll
+++ llvm/test/Analysis/DivergenceAnalysis/AMDGPU/intrinsics.ll
@@ -42,12 +42,20 @@
ret void
}
+; CHECK: DIVERGENT: %tmp0 = call i32 @llvm.amdgcn.writelane(i32 0, i32 1, i32 2)
+define amdgpu_kernel void @writelane(i32 addrspace(1)* %out) #0 {
+ %tmp0 = call i32 @llvm.amdgcn.writelane(i32 0, i32 1, i32 2)
+ store i32 %tmp0, i32 addrspace(1)* %out
+ ret void
+}
+
declare i32 @llvm.amdgcn.ds.swizzle(i32, i32) #1
declare i32 @llvm.amdgcn.permlane16(i32, i32, i32, i32, i1, i1) #1
declare i32 @llvm.amdgcn.permlanex16(i32, i32, i32, i32, i1, i1) #1
declare i32 @llvm.amdgcn.mov.dpp.i32(i32, i32, i32, i32, i1) #1
declare i32 @llvm.amdgcn.mov.dpp8.i32(i32, i32) #1
declare i32 @llvm.amdgcn.update.dpp.i32(i32, i32, i32, i32, i32, i1) #1
+declare i32 @llvm.amdgcn.writelane(i32, i32, i32) #1
attributes #0 = { nounwind convergent }
attributes #1 = { nounwind readnone convergent }
Index: llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
===================================================================
--- llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
+++ llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
@@ -247,6 +247,7 @@
def : SourceOfDivergence<int_amdgcn_mov_dpp>;
def : SourceOfDivergence<int_amdgcn_mov_dpp8>;
def : SourceOfDivergence<int_amdgcn_update_dpp>;
+def : SourceOfDivergence<int_amdgcn_writelane>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_4x4x1f32>;
def : SourceOfDivergence<int_amdgcn_mfma_f32_4x4x1f32>;