[llvm-bugs] [Bug 24936] New: Inefficient loop unrolling on Silvermont (20% performance penalty vs gcc trunk)

Fri Sep 25 05:55:53 PDT 2015

https://llvm.org/bugs/show_bug.cgi?id=24936

            Bug ID: 24936
           Summary: Inefficient loop unrolling on Silvermont (20%
                    performance penalty vs gcc trunk)
           Product: clang
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: LLVM Codegen
          Assignee: unassignedclangbugs at nondot.org
          Reporter: egor.kochetov at intel.com
                CC: llvm-bugs at lists.llvm.org
    Classification: Unclassified

Created attachment 14933
  --> https://llvm.org/bugs/attachment.cgi?id=14933&action=edit
c++ reproducer and Makefile

While compiling a program, I came across source code that runs about 20%
faster when compiled with gcc than when compiled with clang. The compilation
target is Linux on an Intel Silvermont CPU. The affected option sets are
(-m32, -m64) × (-Ofast, -O2).

The cause turned out to be that gcc unrolls a loop with a constant trip count
(whether the bound is a #define in C or a constexpr in C++ does not matter),
while clang does not.
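
For readers without the attachment, the kind of loop in question can be
sketched roughly as follows. This is not the attached reproducer: the function
name slow_function_sketch, the constant kTripCount and its value of 8 are
assumptions made purely for illustration. The point is only that the inner
loop bound is known at compile time, which makes the loop a candidate for full
unrolling.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names and values; the real reproducer is in the attachment.
constexpr std::size_t kTripCount = 8;  // compile-time loop bound (assumed)

// Multiplies and accumulates over a fixed-size window, repeated `outer` times.
// Because kTripCount is a constant, a compiler may fully unroll the inner loop.
float slow_function_sketch(const float* a, const float* b, std::size_t outer) {
    assert(a != nullptr && b != nullptr);
    float acc = 0.0f;
    for (std::size_t i = 0; i < outer; ++i) {
        for (std::size_t j = 0; j < kTripCount; ++j) {
            acc += a[j] * b[j];  // one multiply and one add per iteration
        }
    }
    return acc;
}
```

Compiling such a function with -S under both compilers and comparing the
inner loop in the generated assembly is one way to check whether it was
unrolled.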

Attached is a source file that adds and multiplies floating-point zeros and
reproduces this performance difference. A Makefile for it is also attached;
fix the compiler paths in it and run the sample like this:

    make clean run; make clean run COMPILER=gcc

Make produces the binary main.g++ or main.clang++ and the corresponding .s
file with the disassembly of slow_function.

The compilers under consideration were gcc 6.0 trunk and clang 3.8 trunk.
