[llvm-bugs] [Bug 34839] New: Missed constant propagation of aggregate return values of non-inlined static functions

via llvm-bugs llvm-bugs at lists.llvm.org
Wed Oct 4 18:31:28 PDT 2017


            Bug ID: 34839
           Summary: Missed constant propagation of aggregate return values
                    of non-inlined static functions
           Product: new-bugs
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Keywords: performance
          Severity: enhancement
          Priority: P
         Component: new bugs
          Assignee: unassignedbugs at nondot.org
          Reporter: peter at cordes.ca
                CC: llvm-bugs at lists.llvm.org

clang is able to propagate constant return values from non-inlined functions in
simple cases (where the return type is a scalar int), but not in more complex
cases (a std::pair or std::optional).

// simple case:
int globalvar;
static    // with static, clang omits the mov $42,%eax
int get_constant() {
    globalvar = 22;
    return 42;
}

    # mov $42,%eax   # omitted with static
    movl    $22, globalvar(%rip)

int call_constant() { return 10 - get_constant(); }
    pushq   %rax
    callq   get_constant()
    movl    $-32, %eax    # whether or not we optimize out mov $42,%eax in the
    popq    %rcx

(related: gcc doesn't even do the simple case:
But a more complex case defeats us:

#include <optional>
using std_optional_int = std::optional<int>;

auto get_std_optional_int() noexcept -> std_optional_int {
    return {42};
}
        movabsq $4294967338, %rax       # imm = 0x10000002A

int a, b;
int main() {
    a = get_std_optional_int().value();
//    b = get_pair().second;
}
  // clang6.0 -std=c++17 -stdlib=libc++  -Ofast -march=skylake
  // https://godbolt.org/g/jkn741
main:                                   # @main
        pushq   %rax
        callq   get_std_optional_int()
        movabsq $1095216660480, %rcx    # imm = 0xFF00000000
        testq   %rcx, %rax
        je      .LBB2_2
        movl    %eax, a(%rip)
        xorl    %eax, %eax
        popq    %rcx
        retq
.LBB2_2:
        callq   abort
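
To make the two 64-bit immediates in the asm above easier to read, here is a minimal sketch of how the packed return value decomposes. It assumes the layout visible in this output: the int in the low 32 bits of RAX and the engaged flag in the byte directly above it. The names pack_optional_int, kEngagedByteMask and check_immediates are made up for illustration and are not part of libc++ or clang.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: models the register image of std::optional<int>
// as seen in the asm above (value in bits 0..31, flag byte in bits 32..39).
constexpr std::uint64_t pack_optional_int(std::int32_t value, bool engaged) {
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(value))
         | (static_cast<std::uint64_t>(engaged) << 32);
}

// Mask covering the whole flag byte, as used by the testq in main.
constexpr std::uint64_t kEngagedByteMask = std::uint64_t{0xFF} << 32;

// The callee's movabsq immediate is exactly {42, engaged}.
static_assert(pack_optional_int(42, true) == 0x10000002AULL, "callee immediate");
// The caller's movabsq immediate is the flag-byte mask.
static_assert(kEngagedByteMask == 0xFF00000000ULL, "caller mask");

// Runtime restatement of the same facts, in decimal as printed by clang.
inline void check_immediates() {
    assert(pack_optional_int(42, true) == 4294967338ULL);
    assert(kEngagedByteMask == 1095216660480ULL);
    // The masked test in main can never see zero for this callee.
    assert((pack_optional_int(42, true) & kEngagedByteMask) != 0);
}
```

Since the callee always returns the same image, the je in main is never taken, which is the propagation this report asks for.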

Obviously we'd like to omit the branch, and maybe just emit movl $42, a(%rip),
although we might as well use the value in %eax if the function's going to put
it there.  (As long as we know it's not the result of a long dependency chain.)

BTW, if we do branch:  bt $32, %rax / jnc  would be much smaller code size, and
avoiding a 64-bit constant is good for the uop cache.  This would be a win on
everything: K10/bdver/Ryzen, Jaguar/Atom/Silvermont, P6 and 64-bit-capable-P4,
and Sandybridge-family.  A couple of those are slow for bt reg,reg, but not bt
reg,imm.  It doesn't macro-fuse, though.
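
As a rough sketch, the two tests can be compared in C++. This assumes the flag byte only ever holds 0 or 1 (a valid bool object representation on x86-64), which is what makes checking bit 32 alone equivalent to masking the whole byte. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// What clang emits today: test the whole flag byte (bits 32..39).
constexpr bool engaged_via_byte_mask(std::uint64_t rax) {
    return (rax & 0xFF00000000ULL) != 0;
}

// What bt $32, %rax / jnc would check: bit 32 only.
constexpr bool engaged_via_bit32(std::uint64_t rax) {
    return ((rax >> 32) & 1) != 0;
}

// The two agree whenever the flag byte is 0 or 1, for any payload bits.
inline void check_equivalence() {
    const std::uint64_t payloads[] = {0, 42, 0xFFFFFFFFULL};
    for (std::uint64_t low : payloads) {
        for (std::uint64_t flag = 0; flag <= 1; ++flag) {
            const std::uint64_t rax = (flag << 32) | low;
            assert(engaged_via_byte_mask(rax) == engaged_via_bit32(rax));
        }
    }
}
```

The two predicates differ for a flag byte such as 2, so the single-bit form is only valid when the compiler knows the byte is a bool.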

There's more to be gained by further customizing the return-value passing
convention for static functions (or with LTO), e.g. passing the bool member in
a separate register even for 32-bit T, where x86-64 SysV says to pack it into
RAX.  (Working on a separate bug report about that, and maybe having
std::optional store its members in the opposite order.)
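
The separate-register idea can be approximated by hand today, as a sketch only. Under the x86-64 SysV classification, a 16-byte struct made of two INTEGER eightbytes is returned in RAX and RDX, so widening the value member pushes the flag into its own register. SplitOptionalInt and the functions below are hypothetical stand-ins, not a proposed std::optional layout, and the noinline attribute is the GCC/clang spelling.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for std::optional<int>. The value is widened to
// 64 bits so the flag falls into the second eightbyte; on x86-64 SysV the
// struct is then returned with value in RAX and the flag in RDX.
struct SplitOptionalInt {
    std::int64_t value;
    bool engaged;
};

// noinline keeps the call visible in the generated asm, like the
// non-inlined callee in the report.
__attribute__((noinline)) static SplitOptionalInt get_split_optional_int() noexcept {
    return {42, true};
}

// The caller can test the flag register directly instead of masking
// a byte out of the middle of RAX with a 64-bit immediate.
static int value_or(int fallback) {
    const SplitOptionalInt r = get_split_optional_int();
    return r.engaged ? static_cast<int>(r.value) : fallback;
}

static void check_split() {
    assert(get_split_optional_int().engaged);
    assert(value_or(-1) == 42);
}
```

This costs a second return register, so it only illustrates the calling-convention point; it says nothing about whether the constant itself gets propagated into the caller.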

Related: libstdc++'s std::optional<int> is not trivially copyable, but libc++'s
is.  https://stackoverflow.com/q/46544019/224132.  Another optimization would
be to return in registers even when the C++ ABI would normally return in
memory.