[llvm-bugs] [Bug 34846] New: Divergence analysis: overly conservative
via llvm-bugs
llvm-bugs at lists.llvm.org
Thu Oct 5 07:36:45 PDT 2017
https://bugs.llvm.org/show_bug.cgi?id=34846
Bug ID: 34846
Summary: Divergence analysis: overly conservative
Product: libraries
Version: 5.0
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Backend: PTX
Assignee: unassignedbugs at nondot.org
Reporter: dnsampaio at gmail.com
CC: llvm-bugs at lists.llvm.org
Created attachment 19240
--> https://bugs.llvm.org/attachment.cgi?id=19240&action=edit
Test case
The divergence analysis should only classify variables as divergent if they are
different within a single warp (SIMD threads block). All these intrinsic values
are uniform and detected as divergent:
%ntidx = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
%ntidy = call i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
%ntidz = call i32 @llvm.nvvm.read.ptx.sreg.ntid.z()
%warpid = call i32 @llvm.nvvm.read.ptx.sreg.warpid()
%nwarpid = call i32 @llvm.nvvm.read.ptx.sreg.nwarpid()
%ctaidx = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
%ctaidy = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.y()
%ctaidz = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.z()
%nctaidx = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.x()
%nctaidy = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.y()
%nctaidz = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.z()
%smid = call i32 @llvm.nvvm.read.ptx.sreg.smid()
%nsmid = call i32 @llvm.nvvm.read.ptx.sreg.nsmid()
%gridid = call i32 @llvm.nvvm.read.ptx.sreg.gridid()
====
Also, values loaded from shared memory are also uniform. For the CUDA / C code:
__global__ void no_div(int *A)
{
__shared__ int As;
As = threadIdx.x;
A[0] = As;
}
The PTX ISA Memory consistency model (see
http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#memory-consistency-model)
does not define the value stored into variable As = threadIdx.x; but at least
the value of one thread is stored in there. Thus the loaded value A[0] = As is
undefined, but uniform, and an existing value of threadIdx.x for any thread in
the block.
=== Reproducing: attached file.
opt -analyze -divergence bla-cuda-nvptx64-nvidia-cuda-sm_62.ll
Printing analysis 'Divergence Analysis' for function '_Z6no_divPi':
DIVERGENT: %tidx = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
DIVERGENT: %tidy = call i32 @llvm.nvvm.read.ptx.sreg.tid.y()
DIVERGENT: %tidz = call i32 @llvm.nvvm.read.ptx.sreg.tid.z()
DIVERGENT: %laneid = call i32 @llvm.nvvm.read.ptx.sreg.laneid()
DIVERGENT: store i32 %tidx, i32* addrspacecast (i32 addrspace(3)*
@_ZZ6no_divPiE2As to i32*), align 4
DIVERGENT: %0 = load i32, i32* addrspacecast (i32 addrspace(3)*
@_ZZ6no_divPiE2As to i32*), align 4
DIVERGENT: %1 = load i32*, i32** %A.addr, align 8
DIVERGENT: %arrayidx = getelementptr inbounds i32, i32* %1, i64 0
DIVERGENT: store i32 %0, i32* %A, align 4
DIVERGENT: %ntidx = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
DIVERGENT: %ntidy = call i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
DIVERGENT: %ntidz = call i32 @llvm.nvvm.read.ptx.sreg.ntid.z()
DIVERGENT: %warpid = call i32 @llvm.nvvm.read.ptx.sreg.warpid()
DIVERGENT: %nwarpid = call i32 @llvm.nvvm.read.ptx.sreg.nwarpid()
DIVERGENT: %ctaidx = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
DIVERGENT: %ctaidy = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.y()
DIVERGENT: %ctaidz = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.z()
DIVERGENT: %nctaidx = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.x()
DIVERGENT: %nctaidy = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.y()
DIVERGENT: %nctaidz = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.z()
DIVERGENT: %smid = call i32 @llvm.nvvm.read.ptx.sreg.smid()
DIVERGENT: %nsmid = call i32 @llvm.nvvm.read.ptx.sreg.nsmid()
DIVERGENT: %gridid = call i32 @llvm.nvvm.read.ptx.sreg.gridid()
All values after the "store i32 %tidx" are not divergent.
--
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20171005/55370d07/attachment-0001.html>
More information about the llvm-bugs
mailing list