[llvm-bugs] [Bug 34839] New: Missed constant propagation of aggregate return values of non-inlined static functions
via llvm-bugs
llvm-bugs at lists.llvm.org
Wed Oct 4 18:31:28 PDT 2017
https://bugs.llvm.org/show_bug.cgi?id=34839
Bug ID: 34839
Summary: Missed constant propagation of aggregate return values
of non-inlined static functions
Product: new-bugs
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Keywords: performance
Severity: enhancement
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: peter at cordes.ca
CC: llvm-bugs at lists.llvm.org
clang can propagate constant return values from non-inlined functions in
simple cases (a scalar int), but not in more complex cases (a std::pair or
std::optional).
// simple case:
int globalvar;
static // with static, clang omits the mov $42,%eax
__attribute__((noinline))
int get_constant() {
globalvar = 22;
return 42; }
# mov $42,%eax # omitted with static
movl $22, globalvar(%rip)
retq
int call_constant() { return 10 - get_constant(); }
pushq %rax
callq get_constant()
movl $-32, %eax # whether or not we optimize out mov $42,%eax in the callee
popq %rcx
retq
(related: gcc doesn't even do the simple case:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82432)
But a more complex case defeats us:
#include <optional>
using std_optional_int = std::optional<int>;
static
__attribute__((noinline))
auto get_std_optional_int() noexcept -> std_optional_int {
return {42};
}
movabsq $4294967338, %rax # imm = 0x10000002A
ret
int a, b;
int main() {
a = get_std_optional_int().value();
// b = get_pair().second;
}
// clang6.0 -std=c++17 -stdlib=libc++ -Ofast -march=skylake
// https://godbolt.org/g/jkn741
main: # @main
pushq %rax
callq get_std_optional_int()
movabsq $1095216660480, %rcx # imm = 0xFF00000000
testq %rcx, %rax
je .LBB2_2
movl %eax, a(%rip)
xorl %eax, %eax
popq %rcx
retq
.LBB2_2:
callq abort
Obviously we'd like to omit the branch, since the callee always returns an
engaged optional, and maybe just emit movl $42, a(%rip), although we might as
well use the value in %eax if the function's going to put it there. (As long
as we know it's not the result of a long dependency chain...)
BTW, if we do branch: bt $32, %rax / jnc would be much smaller code size, and
avoiding a 64-bit constant is good for the uop cache. This would be a win on
everything: K10/bdver/Ryzen, Jaguar/Atom/Silvermont, P6 and 64-bit-capable-P4,
and Sandybridge-family. A couple of those are slow for bt reg,reg, but not bt
reg,imm. It doesn't macro-fuse, though.
There's more to be gained by customizing the return-value convention for
static functions (or with LTO), e.g. returning the bool member in a separate
register even for 32-bit T, where x86-64 SysV says to pack it into RAX.
(I'm working on a separate bug report about that, and about maybe having
std::optional store its members in the opposite order.)
Related: libstdc++'s std::optional<int> is not trivially copyable, but
libc++'s is (https://stackoverflow.com/q/46544019/224132). Another
optimization would be to return in registers even when the C++ ABI would
normally return in memory.