<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Incorrect code generated for <3 x half> store."
   href="https://llvm.org/bugs/show_bug.cgi?id=25492">25492</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Incorrect code generated for <3 x half> store.
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: ARM
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>pirama@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>ahmed.bougacha@gmail.com, james.molloy@arm.com, llvm-bugs@lists.llvm.org, srhines@google.com
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Consider the following IR:

target datalayout =
"e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
target triple = "armv7---eabihf"

define void @f1(<3 x half>* %arr, i32 %X) #0 {
    %XH = sitofp i32 %X to half
    %S = fadd half %XH, 0xH4A00
    %1 = insertelement <3 x half> undef, half %S, i32 0
    %2 = insertelement <3 x half> %1, half %S, i32 1
    %3 = insertelement <3 x half> %2, half %S, i32 2
    store <3 x half> %3, <3 x half>* %arr, align 8
    ret void
}

When compiled using "llc -mtriple=armv7-none-linux-gnueabi -O3 -o half.s
-mattr=+vfp3,+fp16 < half.ll", the code generated is as follows:

f1:                                     @ @f1
        .fnstart
@ BB#0:
        mov     r2, #18944
        vmov    s2, r1
        vmov    s0, r2
        vcvtb.f32.f16   s0, s0
        vcvt.f32.s32    s2, s2
        vadd.f32        s0, s2, s0
        vcvtb.f16.f32   s0, s0
        vmov    r1, s0
        orr     r2, r1, r1, lsl #16
        strh    r1, [r0, #4]
        vmov    d16, r2, r1
        vst1.32 {d16[0]}, [r0:32]
        bx      lr
.Lfunc_end0:
        .size   f1, .Lfunc_end0-f1
        .fnend

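The 'orr' instruction ORs r1 with a copy of r1 shifted left by 16 bits, to
build the 32-bit word that holds elements 0 and 1.  This is correct only if the
top 16 bits of r1 are zero, but they need not be: vcvtb.f16.f32 writes only the
bottom 16 bits of s0 and preserves the top 16 bits, which here still hold part
of the f32 sum.  The information that only the lower 16 bits of r1 are valid,
and that the top 16 bits are not zeroed, doesn't seem to be propagated
properly.</pre>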
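        <pre>The effect of the 'orr' can be sketched in Python by modelling r1 as a 32-bit
integer.  This is only an illustration of the arithmetic, not a proposed fix:
the function names are hypothetical, and the 0x4150 upper half is an arbitrary
stand-in for whatever stale bits r1 happens to carry.

```python
MASK32 = 0xFFFFFFFF


def pack_as_generated(r1):
    """Model 'orr r2, r1, r1, lsl #16' on a 32-bit register."""
    return (r1 | (r1 << 16)) & MASK32


def pack_masked(r1):
    """Same packing, but clear the top 16 bits of r1 first."""
    lo = r1 & 0xFFFF
    return lo | (lo << 16)


half_bits = 0x4A00                  # the half constant used in the IR
dirty_r1 = 0x41500000 | half_bits   # hypothetical stale upper 16 bits

# With a clean r1, both lanes of the packed word hold the half value.
print(hex(pack_as_generated(half_bits)))  # 0x4a004a00

# With stale upper bits, element 1 (the upper lane) is corrupted.
print(hex(pack_as_generated(dirty_r1)))   # 0x4b504a00

# Masking before the shift gives the intended word.
print(hex(pack_masked(dirty_r1)))         # 0x4a004a00
```

Element 2 is unaffected in this model, since 'strh' stores only the low 16 bits
of r1.</pre>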
        </div>
      </p>
    </body>
</html>