<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Divergence analysis: overly conservative"
href="https://bugs.llvm.org/show_bug.cgi?id=34846">34846</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Divergence analysis: overly conservative
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>5.0
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: PTX
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>dnsampaio@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=19240" name="attach_19240" title="Test case">attachment 19240</a> <a href="attachment.cgi?id=19240&action=edit" title="Test case">[details]</a></span>
Test case
The divergence analysis should only classify variables as divergent if they are
different within a single warp (SIMD threads block). All these intrinsic values
are uniform and detected as divergent:
%ntidx = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
%ntidy = call i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
%ntidz = call i32 @llvm.nvvm.read.ptx.sreg.ntid.z()
%warpid = call i32 @llvm.nvvm.read.ptx.sreg.warpid()
%nwarpid = call i32 @llvm.nvvm.read.ptx.sreg.nwarpid()
%ctaidx = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
%ctaidy = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.y()
%ctaidz = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.z()
%nctaidx = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.x()
%nctaidy = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.y()
%nctaidz = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.z()
%smid = call i32 @llvm.nvvm.read.ptx.sreg.smid()
%nsmid = call i32 @llvm.nvvm.read.ptx.sreg.nsmid()
%gridid = call i32 @llvm.nvvm.read.ptx.sreg.gridid()
====
Also, values loaded from shared memory are also uniform. For the CUDA / C code:
__global__ void no_div(int *A)
{
__shared__ int As;
As = threadIdx.x;
A[0] = As;
}
The PTX ISA Memory consistency model (see
<a href="http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#memory-consistency-model">http://docs.nvidia.com/cuda/parallel-thread-execution/index.html#memory-consistency-model</a>)
does not define the value stored into variable As = threadIdx.x; but at least
the value of one thread is stored in there. Thus the loaded value A[0] = As is
undefined, but uniform, and an existing value of threadIdx.x for any thread in
the block.
=== Reproducing: attached file.
opt -analyze -divergence bla-cuda-nvptx64-nvidia-cuda-sm_62.ll
Printing analysis 'Divergence Analysis' for function '_Z6no_divPi':
DIVERGENT: %tidx = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
DIVERGENT: %tidy = call i32 @llvm.nvvm.read.ptx.sreg.tid.y()
DIVERGENT: %tidz = call i32 @llvm.nvvm.read.ptx.sreg.tid.z()
DIVERGENT: %laneid = call i32 @llvm.nvvm.read.ptx.sreg.laneid()
DIVERGENT: store i32 %tidx, i32* addrspacecast (i32 addrspace(3)*
@_ZZ6no_divPiE2As to i32*), align 4
DIVERGENT: %0 = load i32, i32* addrspacecast (i32 addrspace(3)*
@_ZZ6no_divPiE2As to i32*), align 4
DIVERGENT: %1 = load i32*, i32** %A.addr, align 8
DIVERGENT: %arrayidx = getelementptr inbounds i32, i32* %1, i64 0
DIVERGENT: store i32 %0, i32* %A, align 4
DIVERGENT: %ntidx = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
DIVERGENT: %ntidy = call i32 @llvm.nvvm.read.ptx.sreg.ntid.y()
DIVERGENT: %ntidz = call i32 @llvm.nvvm.read.ptx.sreg.ntid.z()
DIVERGENT: %warpid = call i32 @llvm.nvvm.read.ptx.sreg.warpid()
DIVERGENT: %nwarpid = call i32 @llvm.nvvm.read.ptx.sreg.nwarpid()
DIVERGENT: %ctaidx = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.x()
DIVERGENT: %ctaidy = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.y()
DIVERGENT: %ctaidz = call i32 @llvm.nvvm.read.ptx.sreg.ctaid.z()
DIVERGENT: %nctaidx = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.x()
DIVERGENT: %nctaidy = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.y()
DIVERGENT: %nctaidz = call i32 @llvm.nvvm.read.ptx.sreg.nctaid.z()
DIVERGENT: %smid = call i32 @llvm.nvvm.read.ptx.sreg.smid()
DIVERGENT: %nsmid = call i32 @llvm.nvvm.read.ptx.sreg.nsmid()
DIVERGENT: %gridid = call i32 @llvm.nvvm.read.ptx.sreg.gridid()
All values after the "store i32 %tidx" are not divergent.</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>