[llvm-dev] Optimizing pass-by-value structs for le64 target

Tue Jul 2 07:55:30 PDT 2019

Consider the following small example:

struct wrapper {
    long value;
};
long read_wrapper(wrapper w) { return w.value; }
long read_primitive(long x) { return x; }

When compiling for x86 at -O1, both functions reduce to a single IR
instruction. The -sroa pass appears to perform this transformation, but even
at -O0 Clang has already lowered the struct argument to a plain i64.

Before -sroa:
define dso_local i64 @_Z12read_wrapper7wrapper(i64) #0 {
  %2 = alloca %struct.wrapper, align 8
  %3 = getelementptr inbounds %struct.wrapper, %struct.wrapper* %2, i32 0, i32 0
  store i64 %0, i64* %3, align 8
  %4 = getelementptr inbounds %struct.wrapper, %struct.wrapper* %2, i32 0, i32 0
  %5 = load i64, i64* %4, align 8
  ret i64 %5
}

After -sroa:
define dso_local i64 @_Z12read_wrapper7wrapper(i64 returned) local_unnamed_addr #0 {
  ret i64 %0
}

But when I add -target le64, the read_wrapper function instead accepts
a %struct.wrapper* byval, a pointer to a copy on the caller's stack. No
optimization level makes this function look as simple as read_primitive.

define dso_local i64 @_Z12read_wrapper7wrapper(%struct.wrapper* byval nocapture readonly align 8) local_unnamed_addr #0 {
  %2 = getelementptr inbounds %struct.wrapper, %struct.wrapper* %0, i64 0, i32 0
  %3 = load i64, i64* %2, align 8, !tbaa !2
  ret i64 %3
}
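
To make the difference concrete, here is a rough source-level picture of the
two lowerings. This is only a sketch: the names read_wrapper_direct,
read_wrapper_byval and call_both are hypothetical and exist just for
illustration, and the code models the calling conventions by hand rather
than reproducing what Clang actually emits.

```cpp
#include <cassert>
#include <cstring>

struct wrapper {
    long value;
};

// The struct has the same size as its single member, which is what makes
// passing it as one integer possible in the first place.
static_assert(sizeof(wrapper) == sizeof(long), "wrapper should be one long wide");

// Models the x86 signature: the struct arrives as a bare integer and is
// rebuilt locally (the alloca/store/load that -sroa later removes).
long read_wrapper_direct(long coerced) {
    wrapper w;
    std::memcpy(&w, &coerced, sizeof w);
    return w.value;
}

// Models the le64 signature: the callee receives a pointer to a copy that
// the caller placed in its own frame, so a load through the pointer remains.
long read_wrapper_byval(const wrapper* caller_copy) {
    return caller_copy->value;
}

// Hypothetical caller that performs both conventions by hand.
long call_both(long v) {
    wrapper w{v};

    long coerced;
    std::memcpy(&coerced, &w, sizeof coerced);  // struct to integer
    long direct = read_wrapper_direct(coerced);

    wrapper copy = w;                           // the byval copy
    long indirect = read_wrapper_byval(&copy);

    assert(direct == indirect);
    return direct;
}
```

Both paths return the same value; the difference is only in how the argument
travels, which is a property of the function signature and therefore not
something a function-local pass such as -sroa can change.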

We're writing our own LLVM backend for a new architecture. We started with
the generic little-endian 64-bit target (le64) and made customizations from
there. What needs to be done to re-enable this optimization for our target?