<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Incorrect code generated for <3 x half> store."
href="https://llvm.org/bugs/show_bug.cgi?id=25492">25492</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Incorrect code generated for <3 x half> store.
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: ARM
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>pirama@google.com
</td>
</tr>
<tr>
<th>CC</th>
<td>ahmed.bougacha@gmail.com, james.molloy@arm.com, llvm-bugs@lists.llvm.org, srhines@google.com
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Consider the following IR:
target datalayout =
"e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
target triple = "armv7---eabihf"
define void @f1(<3 x half>* %arr, i32 %X) #0 {
  %XH = sitofp i32 %X to half
  %S = fadd half %XH, 0xH4A00
  %1 = insertelement <3 x half> undef, half %S, i32 0
  %2 = insertelement <3 x half> %1, half %S, i32 1
  %3 = insertelement <3 x half> %2, half %S, i32 2
  store <3 x half> %3, <3 x half>* %arr, align 8
  ret void
}
When compiled using "llc -mtriple=armv7-none-linux-gnueabi -O3 -o half.s
-mattr=+vfp3,+fp16 < half.ll", the code generated is as follows:
f1:                                     @ @f1
        .fnstart
@ BB#0:
        mov     r2, #18944
        vmov    s2, r1
        vmov    s0, r2
        vcvtb.f32.f16   s0, s0
        vcvt.f32.s32    s2, s2
        vadd.f32        s0, s2, s0
        vcvtb.f16.f32   s0, s0
        vmov    r1, s0
        orr     r2, r1, r1, lsl #16
        strh    r1, [r0, #4]
        vmov    d16, r2, r1
        vst1.32 {d16[0]}, [r0:32]
        bx      lr
.Lfunc_end0:
        .size   f1, .Lfunc_end0-f1
        .fnend
The 'orr' instruction ORs r1 with a copy of r1 shifted left by 16 bits, packing
the first two lanes of the vector into r2. This is only correct if the top 16
bits of r1 are zero, and nothing guarantees that. The information that only the
lower 16 bits of r1 are valid, and that the upper 16 bits have not been zeroed,
doesn't seem to be propagated properly.</pre>
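<pre>To illustrate the arithmetic, here is a rough Python model of the 'orr' step.
It is only a sketch: the function names and the "dirty" register value
0xBEEF4A00 are made up for illustration, and pack_masked is one hypothetical
way to get the intended result, not necessarily what a backend fix would emit.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def pack_as_generated(r1):
    """Model `orr r2, r1, r1, lsl #16` on a 32-bit register."""
    return (r1 | (r1 << 16)) & MASK32


def pack_masked(r1):
    """Hypothetical alternative: clear the upper 16 bits before duplicating."""
    h = r1 & 0xFFFF
    return h | (h << 16)


def lanes(word):
    """Split a 32-bit word into (lane 0, lane 1) halfwords."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF


clean = 0x00004A00  # upper 16 bits happen to be zero
dirty = 0xBEEF4A00  # same half value, arbitrary upper bits (assumed)

print([hex(x) for x in lanes(pack_as_generated(clean))])
print([hex(x) for x in lanes(pack_as_generated(dirty))])
print([hex(x) for x in lanes(pack_masked(dirty))])
```

With the clean input both lanes hold 0x4a00. With the dirty input lane 1 picks
up the stale upper bits (0xfeef in this example) unless they are masked off
first.</pre>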
</div>
</p>
</body>
</html>