<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - [ARM] use splat load to break false dependence"
href="https://bugs.llvm.org/show_bug.cgi?id=43410">43410</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[ARM] use splat load to break false dependence
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: ARM
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>spatel+llvm@rotateright.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org, peter.smith@linaro.org, Ties.Stuij@arm.com
</td>
</tr></table>
<p>
<div>
<pre>Copied from regression test in llvm/test/CodeGen/ARM/a15-partial-update.ll and
as discussed in <a href="https://reviews.llvm.org/D67363">https://reviews.llvm.org/D67363</a>:
define void @t2(<4 x i8> *%in, <4 x i8> *%out, i32 %n) {
; CHECK-LABEL: t2:
; CHECK: @ %bb.0: @ %entry
; CHECK-NEXT: add r0, r0, #4
; CHECK-NEXT: add r1, r1, #4
; CHECK-NEXT: .LBB1_1: @ %loop
; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
; CHECK-NEXT: vmov.f64 d16, #5.000000e-01
; CHECK-NEXT: vld1.32 {d16[0]}, [r0:32]
; CHECK-NEXT: vmovl.u8 q8, d16
; CHECK-NEXT: vuzp.8 d16, d18
; CHECK-NEXT: vst1.32 {d16[0]}, [r1:32]!
; CHECK-NEXT: add r0, r0, #4
; CHECK-NEXT: subs r2, r2, #1
; CHECK-NEXT: beq .LBB1_1
; CHECK-NEXT: @ %bb.2: @ %ret
; CHECK-NEXT: bx lr
entry:
  br label %loop
loop:
  %oldcount = phi i32 [0, %entry], [%newcount, %loop]
  %newcount = add i32 %oldcount, 1
  %p1 = getelementptr <4 x i8>, <4 x i8> *%in, i32 %newcount
  %p2 = getelementptr <4 x i8>, <4 x i8> *%out, i32 %newcount
  %tmp1 = load <4 x i8> , <4 x i8> *%p1, align 4
  store <4 x i8> %tmp1, <4 x i8> *%p2
  %cmp = icmp eq i32 %newcount, %n
  br i1 %cmp, label %loop, label %ret
ret:
  ret void
}
--------------------------------------------------------------------------
We currently insert "vmov.f64 d16, #5.000000e-01" (the 0.5 is a semi-arbitrarily
chosen constant value) ahead of the lane load to break a false dependence on
d16. We could break that false dependence without any extra instruction by
using a splat load, which overwrites the entire d16.</pre>
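<pre>A minimal sketch of what the loop body might look like with a splat load,
assuming the all-lanes form of VLD1 ("vld1.32 {d16[]}, ...") is selected in
place of the lane load. This is hand-written for illustration, not actual llc
output:

.LBB1_1:                                @ %loop
        vld1.32 {d16[]}, [r0:32]        @ splat load writes every lane of d16,
                                        @ so it does not depend on the old value
        vmovl.u8 q8, d16
        vuzp.8  d16, d18
        vst1.32 {d16[0]}, [r1:32]!
        add     r0, r0, #4
        subs    r2, r2, #1
        beq     .LBB1_1

Only lane 0 of d16 is stored afterwards, so the duplicated upper lane should
not change the result, and the vmov.f64 would no longer be needed.</pre>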
</div>
</p>
</body>
</html>