[llvm-bugs] [Bug 25492] New: Incorrect code generated for <3 x half> store.

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Nov 11 09:49:36 PST 2015


https://llvm.org/bugs/show_bug.cgi?id=25492

            Bug ID: 25492
           Summary: Incorrect code generated for <3 x half> store.
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: ARM
          Assignee: unassignedbugs at nondot.org
          Reporter: pirama at google.com
                CC: ahmed.bougacha at gmail.com, james.molloy at arm.com,
                    llvm-bugs at lists.llvm.org, srhines at google.com
    Classification: Unclassified

Consider the following IR:

target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n32"
target triple = "armv7---eabihf"

define void @f1(<3 x half>* %arr, i32 %X) #0 {
    %XH = sitofp i32 %X to half
    %S = fadd half %XH, 0xH4A00
    %1 = insertelement <3 x half> undef, half %S, i32 0
    %2 = insertelement <3 x half> %1, half %S, i32 1
    %3 = insertelement <3 x half> %2, half %S, i32 2
    store <3 x half> %3, <3 x half>* %arr, align 8
    ret void
}

When compiled using

llc -mtriple=armv7-none-linux-gnueabi -O3 -o half.s -mattr=+vfp3,+fp16 < half.ll

the code generated is as follows:

f1:                                     @ @f1
        .fnstart
@ BB#0:
        mov     r2, #18944
        vmov    s2, r1
        vmov    s0, r2
        vcvtb.f32.f16   s0, s0
        vcvt.f32.s32    s2, s2
        vadd.f32        s0, s2, s0
        vcvtb.f16.f32   s0, s0
        vmov    r1, s0
        orr     r2, r1, r1, lsl #16
        strh    r1, [r0, #4]
        vmov    d16, r2, r1
        vst1.32 {d16[0]}, [r0:32]
        bx      lr
.Lfunc_end0:
        .size   f1, .Lfunc_end0-f1
        .fnend
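
As a reading aid for the listing above, the following sketch checks that the immediate #18944 loaded into r2 is the same bit pattern as the IR constant 0xH4A00, and decodes it as an IEEE half-precision value. It uses only Python's standard struct module. The helper name decode_half is made up for this illustration and does not come from LLVM.

```python
import struct

def decode_half(bits):
    """Interpret a 16-bit integer as an IEEE 754 half-precision float."""
    # 'e' is the struct format code for a 16-bit (binary16) float.
    return struct.unpack("<e", bits.to_bytes(2, "little"))[0]

# The immediate in `mov r2, #18944` is the IR constant 0xH4A00.
print(hex(18944))           # 0x4a00
print(decode_half(0x4A00))  # 12.0
```

So the function computes (half)X + 12.0 and stores that value into all three lanes.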

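The 'orr' instruction ORs r1 with a copy of r1 shifted left by 16 bits, in
order to pack elements 0 and 1 into r2.  This is only correct if the top half
of r1 is zero, but it need not be: vcvtb.f16.f32 writes only the bottom 16 bits
of s0 and leaves the top 16 bits unchanged, and 'vmov r1, s0' copies all 32
bits.  The information that only the lower 16 bits of r1 are valid, and that
the top 16 bits are not zeroed, doesn't seem to be propagated properly.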
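
The effect can be modelled with plain integer arithmetic. The sketch below is only an illustration, not LLVM code: orr_lsl16 mimics 'orr r2, r1, r1, lsl #16' on a 32-bit register, and pack_two_halves is a hypothetical reference that masks the low 16 bits first. The "dirty" register value is an assumed example of stale upper bits, not a value taken from a real run.

```python
MASK32 = 0xFFFFFFFF

def orr_lsl16(r1):
    """Model `orr r2, r1, r1, lsl #16` on a 32-bit register."""
    return (r1 | (r1 << 16)) & MASK32

def pack_two_halves(r1):
    """Intended result: the low 16 bits of r1 duplicated into both lanes."""
    h = r1 & 0xFFFF
    return h | (h << 16)

clean = 0x00004A00  # top half of r1 happens to be zero
dirty = 0x41A04A00  # assumed stale bits left in the top half of r1

print(hex(orr_lsl16(clean)), hex(pack_two_halves(clean)))  # 0x4a004a00 0x4a004a00
print(hex(orr_lsl16(dirty)), hex(pack_two_halves(dirty)))  # 0x4ba04a00 0x4a004a00
```

With a clean top half both functions agree. With stale upper bits, the second lane (bits 31:16) becomes 0x4BA0 instead of 0x4A00, so element 1 of the stored vector is wrong. Element 2 is not affected, because 'strh' stores only the low halfword of r1.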
