[llvm-bugs] [Bug 38527] New: x86 backend widens v2f32 vectors only to do more work in software operations later for `sin`, `cos` etc.

via llvm-bugs llvm-bugs at lists.llvm.org
Sat Aug 11 02:07:41 PDT 2018


https://bugs.llvm.org/show_bug.cgi?id=38527

            Bug ID: 38527
           Summary: x86 backend widens v2f32 vectors only to do more work
                    in software operations later for `sin`, `cos` etc.
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: simonas+llvm.org at kazlauskas.me
                CC: llvm-bugs at lists.llvm.org

Created attachment 20678
  --> https://bugs.llvm.org/attachment.cgi?id=20678&action=edit
Simple fsin call on v2f32

Given the attached LLVM IR, running llc on it as follows

> llc winsimd.ll -O3 -filetype=asm -o -

results in the following assembly being generated:


        movq    %rdi, %rbx
        movsd   (%rdi), %xmm0           # xmm0 = mem[0],zero
        movaps  %xmm0, (%rsp)           # 16-byte Spill
        callq   sinf
        movaps  %xmm0, 16(%rsp)         # 16-byte Spill
        movaps  (%rsp), %xmm0           # 16-byte Reload
        callq   sinf
        movaps  16(%rsp), %xmm1         # 16-byte Reload
        unpcklps        %xmm0, %xmm1    # xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
        movaps  %xmm1, 16(%rsp)         # 16-byte Spill
        movaps  (%rsp), %xmm0           # 16-byte Reload
        callq   sinf
        movaps  %xmm0, 32(%rsp)         # 16-byte Spill
        pshufd  $229, (%rsp), %xmm0     # 16-byte Folded Reload
                                        # xmm0 = mem[1,1,2,3]
        callq   sinf
        movdqa  32(%rsp), %xmm1         # 16-byte Reload
        punpckldq       %xmm0, %xmm1    # xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
        <snip>

Note that this generates four distinct calls to `sinf`, although two should be
sufficient. This happens because the backend legalises the operation on v2f32
by widening it to v4f32, as the `-debug` output from `llc` indicates. However,
the backend then does not avoid calculating `sinf` on the `undef` components of
the widened vector, resulting in extra work.
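
To make the cost difference concrete, here is a rough C model of the two
strategies. It is only a sketch of the behaviour described above, not the
backend's actual code: the names `counted_sinf`, `sin_widened` and
`sin_defined_lanes_only` are hypothetical, the padding lanes are filled with
0.0f as a stand-in for `undef`, and `counted_sinf` is a stub that counts calls
instead of calling the real `sinf`, since only the number of libcalls matters
here.

```c
/* Number of (modelled) sinf libcalls issued so far. */
int sinf_calls = 0;

/* Hypothetical stand-in for the sinf libcall. It only counts invocations
 * and returns its argument unchanged; the computed value is irrelevant to
 * the point being illustrated. */
float counted_sinf(float x) {
    ++sinf_calls;
    return x;
}

/* Models the reported behaviour: the 2-lane input is widened to 4 lanes
 * and the call is issued for every lane, including the two padding lanes
 * whose results are thrown away. */
void sin_widened(const float in[2], float out[2]) {
    /* Lanes 2 and 3 are undef in the real DAG; 0.0f is an assumption. */
    float wide[4] = { in[0], in[1], 0.0f, 0.0f };
    float res[4];
    for (int i = 0; i < 4; ++i)
        res[i] = counted_sinf(wide[i]);
    out[0] = res[0];
    out[1] = res[1];
}

/* Models the desired behaviour: only the two defined lanes are computed. */
void sin_defined_lanes_only(const float in[2], float out[2]) {
    for (int i = 0; i < 2; ++i)
        out[i] = counted_sinf(in[i]);
}
```

Resetting `sinf_calls` to zero and calling `sin_widened` once leaves the
counter at 4, whereas `sin_defined_lanes_only` leaves it at 2. Both produce
the same two output lanes, so the two extra calls are pure overhead.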
