<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Regression with r279460 getelementptr argument goes missing"
href="https://bugs.llvm.org/show_bug.cgi?id=32001">32001</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Regression with r279460 getelementptr argument goes missing
</td>
</tr>
<tr>
<th>Product</th>
<td>tools
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>opt
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>tarceri@itsqueeze.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=18005" name="attach_18005" title="Unoptimised shader">attachment 18005</a> <a href="attachment.cgi?id=18005&amp;action=edit" title="Unoptimised shader">[details]</a></span>
Unoptimised shader

commit f991e38d156c4c10c609ca8425a7c31b951ecbed
Author: James Molloy <<a href="mailto:james.molloy@arm.com">james.molloy@arm.com</a>>
Date:   Thu Sep 1 10:44:35 2016 +0000

    [SimplifyCFG] Change the algorithm in SinkThenElseCodeToEnd

    r279460 rewrote this function to be able to handle more than two incoming
    edges and took pains to ensure this didn't regress anything.

On AMDGPU at least, this caused a regression (possibly indirectly). I've
included a before and after below; you can see that the !amdgpu.uniform !0
metadata goes missing from the getelementptr instructions.

I've attached the unoptimised version. I tried to debug this with 'llc
-march=amdgcn -mcpu=polaris10 llvm_broken_preopt.ll', but that didn't seem to
hit the SimplifyCFG path.

BEFORE:

  br i1 %27, label %else5, label %if1

if1: ; preds = %main_body
  %30 = getelementptr [32 x <8 x i32>], [32 x <8 x i32>] addrspace(2)* %2, i64 0, i64 0, !amdgpu.uniform !0
  %31 = load <8 x i32>, <8 x i32> addrspace(2)* %30, align 32, !invariant.load !0
  %32 = bitcast [32 x <8 x i32>] addrspace(2)* %2 to [0 x <4 x i32>] addrspace(2)*
  %33 = getelementptr [0 x <4 x i32>], [0 x <4 x i32>] addrspace(2)* %32, i64 0, i64 3, !amdgpu.uniform !0
  %34 = load <4 x i32>, <4 x i32> addrspace(2)* %33, align 16, !invariant.load !0
  %35 = bitcast float %28 to i32
  %36 = bitcast float %29 to i32
  %37 = insertelement <2 x i32> undef, i32 %35, i32 0
  %38 = insertelement <2 x i32> %37, i32 %36, i32 1
  %39 = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %38, <8 x i32> %31, <4 x i32> %34, i32 15, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0)
  br label %endif9

else5: ; preds = %main_body
  %40 = getelementptr [32 x <8 x i32>], [32 x <8 x i32>] addrspace(2)* %2, i64 0, i64 2, !amdgpu.uniform !0
  %41 = load <8 x i32>, <8 x i32> addrspace(2)* %40, align 32, !invariant.load !0
  %42 = bitcast [32 x <8 x i32>] addrspace(2)* %2 to [0 x <4 x i32>] addrspace(2)*
  %43 = getelementptr [0 x <4 x i32>], [0 x <4 x i32>] addrspace(2)* %42, i64 0, i64 7, !amdgpu.uniform !0
  %44 = load <4 x i32>, <4 x i32> addrspace(2)* %43, align 16, !invariant.load !0
  %45 = bitcast float %28 to i32
  %46 = bitcast float %29 to i32
  %47 = insertelement <2 x i32> undef, i32 %45, i32 0
  %48 = insertelement <2 x i32> %47, i32 %46, i32 1
  %49 = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %48, <8 x i32> %41, <4 x i32> %44, i32 15, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0)
  br label %endif9

endif9:

AFTER:

  %30 = select i1 %27, i64 0, i64 2
  %31 = getelementptr [32 x <8 x i32>], [32 x <8 x i32>] addrspace(2)* %2, i64 0, i64 %30
  %32 = load <8 x i32>, <8 x i32> addrspace(2)* %31, align 32, !invariant.load !0
  %33 = bitcast [32 x <8 x i32>] addrspace(2)* %2 to [0 x <4 x i32>] addrspace(2)*
  %34 = select i1 %27, i64 3, i64 7
  %35 = getelementptr [0 x <4 x i32>], [0 x <4 x i32>] addrspace(2)* %33, i64 0, i64 %34
  %36 = load <4 x i32>, <4 x i32> addrspace(2)* %35, align 16, !invariant.load !0
  %37 = bitcast float %28 to i32
  %38 = bitcast float %29 to i32
  %39 = insertelement <2 x i32> undef, i32 %37, i32 0
  %40 = insertelement <2 x i32> %39, i32 %38, i32 1
  %41 = call <4 x float> @llvm.SI.image.sample.v2i32(<2 x i32> %40, <8 x i32> %32, <4 x i32> %36, i32 15, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0)</pre>
</div>
</p>
</body>
</html>