[cfe-dev] About NRVO (named return value optimization)

David Blaikie dblaikie at gmail.com
Tue Jun 24 23:14:16 PDT 2014


This isn't related to NRVO - as the name suggests, NRVO is about named
return values. The example you gave has no return values and no named
values.
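
To make the terminology concrete, here is a minimal sketch of the kind of
code NRVO is about. The names Tracker and make_named are made up for
illustration and do not come from Clang's tests; the copy counter is only
there so the effect can be observed.

```cpp
// Hypothetical type that counts how often it is copied or moved.
struct Tracker {
    static int copies;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++copies; }
};
int Tracker::copies = 0;

// NRVO candidate: a *named* local object is returned by value.
// The compiler is allowed (not required) to construct `t` directly
// in the caller's storage, so that no copy or move happens at all.
Tracker make_named() {
    Tracker t;   // the "named" value
    return t;    // the "return value"
}
```

With NRVO applied, `Tracker r = make_named();` leaves Tracker::copies at 0.
Since the elision is optional, the count may be higher depending on flags
and language mode. The f(X()) calls in your example involve neither a named
local nor a return value, so none of this applies to them.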

The optimization necessary here is stack reuse, which LLVM/Clang have
historically not done a great job of. I'm not sure of the precise
details of the current state, but there have been some efforts to make
it better.

One part of that is the lifetime intrinsics (
http://llvm.org/docs/LangRef.html#memory-use-markers ), which would
allow the backend to know that the stack memory used by the first
temporary is dead before the first use of the stack memory for the
second temporary, and thus reuse the stack slot. I don't know what the
current state of the lifetime markers is (I guess we don't turn them
on by default? Not sure whether they're broken/inefficient/slow/not
valuable enough yet) or whether they're a viable way forward, but
someone thought so at some point.
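
As a rough source-level illustration of why reusing the slot would be
legal, the sketch below traces the two temporaries. TracedX and its
counters are hypothetical stand-ins for your class X; the logging
constructor and destructor are my additions (your X has no user-declared
destructor), but the temporaries' lifetimes end at the same points either
way.

```cpp
#include <string>
#include <vector>

// Hypothetical tracing stand-in for the thread's class X.
struct TracedX {
    static std::vector<std::string> log;
    static int live;      // temporaries currently alive
    static int max_live;  // highest number alive at the same time
    TracedX() {
        log.push_back("ctor");
        if (++live > max_live) max_live = live;
    }
    ~TracedX() {
        log.push_back("dtor");
        --live;
    }
};
std::vector<std::string> TracedX::log;
int TracedX::live = 0;
int TracedX::max_live = 0;

void f(const TracedX&) { TracedX::log.push_back("f"); }

void test10() {
    f(TracedX());  // first temporary dies at the end of this full-expression
    f(TracedX());  // second temporary is only created here
}
```

After a call to test10(), the log reads ctor, f, dtor, ctor, f, dtor and
max_live stays at 1: the two temporaries are never alive at the same time.
That non-overlap is the fact the lifetime markers are meant to communicate
to the backend.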

- David

On Tue, Jun 24, 2014 at 10:57 PM, Jiangning Liu <liujiangning1 at gmail.com> wrote:
> Hi,
>
> For the following small test case,
>
> // RUN: %clang_cc1 -triple i386-unknown-unknown -emit-llvm -O1 -o - %s | FileCheck %s
>
> // Test code generation for the named return value optimization.
> class X {
> public:
>   X();
> };
>
> void f(const X& x);
> void test10(bool b) {
>   f(X());
>   f(X());
> }
>
> we are generating the following LLVM IR with "
>
> %class.X = type { i8 }
>
> ; Function Attrs: nounwind
> define void @_Z6test10b(i1 zeroext %b) #0 {
> entry:
>   %ref.tmp = alloca %class.X, align 1
>   %ref.tmp1 = alloca %class.X, align 1
>   call void @_ZN1XC1Ev(%class.X* %ref.tmp) #2
>   call void @_Z1fRK1X(%class.X* nonnull %ref.tmp) #2
>   call void @_ZN1XC1Ev(%class.X* %ref.tmp1) #2
>   call void @_Z1fRK1X(%class.X* nonnull %ref.tmp1) #2
>   ret void
> }
>
> declare void @_Z1fRK1X(%class.X* nonnull) #1
> declare void @_ZN1XC1Ev(%class.X*) #1
>
> So my question is: should NRVO be able to know that ref.tmp and ref.tmp1
> can be merged into a single one? That is, I'm expecting the following LLVM
> IR code to be generated,
>
> define void @_Z6test10b(i1 zeroext %b) #0 {
> entry:
>   %ref.tmp = alloca %class.X, align 1
>   call void @_ZN1XC1Ev(%class.X* %ref.tmp) #2
>   call void @_Z1fRK1X(%class.X* nonnull %ref.tmp) #2
>   call void @_ZN1XC1Ev(%class.X* %ref.tmp) #2
>   call void @_Z1fRK1X(%class.X* nonnull %ref.tmp) #2
>   ret void
> }
>
> If we leave both ref.tmp and ref.tmp1 to LLVM IR, it seems to be hard for
> the middle-end to combine them unless we demangle the function name
> _ZN1XC1Ev to know it is a C++ constructor and do more alias analysis.
>
> Any ideas?
>
> Thanks,
> -Jiangning
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>


