[PATCH] D47067: Update NRVO logic to support early return
Taiju Tsuiki via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu May 24 08:09:15 PDT 2018
tzik added inline comments.
================
Comment at: lib/Sema/SemaDecl.cpp:12760
// to deduce an implicit return type.
- if (FD->getReturnType()->isRecordType() &&
- (!getLangOpts().CPlusPlus || !FD->isDependentContext()))
+ if (!FD->getReturnType()->isScalarType())
computeNRVO(Body, getCurFunction());
----------------
Quuxplusone wrote:
> rsmith wrote:
> > tzik wrote:
> > > Quuxplusone wrote:
> > > > What is the purpose of this change?
> > > > If it's "no functional change" it should be done separately IMHO; if it is supposed to change codegen, then it needs some codegen tests. (From looking at the code: maybe this affects codegen on functions that return member function pointers by value?)
> > > I think the previous implementation was incorrect.
> > > Though computeNRVO clears ReturnStmt::NRVOCandidate when the corresponding variable is not NRVO variable, CGStmt checks both of ReturnStmt::NRVOCandidate and VarDecl::NRVOVariable anyway.
> > > So, computeNRVO took no effect in the previous code, and the absence of computeNRVO here on function templates did not matter.
> > > Note that there was no chance to call computeNRVO on function template elsewhere too.
> > >
> > > OTOH in the new implementation, computeNRVO is necessary, and its absence on function templates matters.
> > >
> > > We can remove computeNRVO here separately as a NFC patch and readd the new impl to the same place, but it's probably less readable, IMO.
> > What happens if we can't tell whether we have an NRVO candidate or not until instantiation:
> >
> > ```
> > template<typename T> X f() {
> > T v;
> > return v; // is an NRVO candidate if T is X, otherwise is not
> > }
> > ```
> >
> > (I think this is not hard to handle: the dependent construct here can only affect whether you get NRVO at all, not which variable you perform NRVO on, but I want to check that it is handled properly.)
> Ah, so if I understand correctly — which I probably don't —...
> the parts of this diff related to `isRecordType()` make NRVO more reliable on template functions? Because the existing code has lots of checks for `isRecordType()` which means that they don't run for dependent (as-yet-unknown) types, even if those types are going to turn out to be record types occasionally?
>
> I see that line 12838 runs `computeNRVO` unconditionally for *all* ObjCMethodDecls, //even// ones with scalar return types. So maybe running `computeNRVO` unconditionally right here would also be correct?
> Datapoint: I made a patch to do nothing but replace this condition with `if (Body)` and to remove the similar condition guarding `computeNRVO` in `Sema::ActOnBlockStmtExpr`. It passed the Clang test suite with flying colors. I don't know if this means that the test suite is missing some tests, or if it means that the conditions are unnecessary.
>
> IMHO if you're changing this anyway, it would be nice to put the remaining condition/optimization `if (getReturnType().isScalarType()) return;` inside `computeNRVO` itself, instead of repeating that condition at every call site.
`v` here is marked as an NRVO variable in its function template AST, and that propagates to the instantiation if the return type is compatible.
https://github.com/llvm-mirror/clang/blob/9c82d4ff6015e4477e924c8a495a834c2fced29e/lib/Sema/SemaTemplateInstantiateDecl.cpp#L743
================
Comment at: lib/Sema/SemaDecl.cpp:12760
// to deduce an implicit return type.
- if (FD->getReturnType()->isRecordType() &&
- (!getLangOpts().CPlusPlus || !FD->isDependentContext()))
+ if (!FD->getReturnType()->isScalarType())
computeNRVO(Body, getCurFunction());
----------------
tzik wrote:
> Quuxplusone wrote:
> > rsmith wrote:
> > > tzik wrote:
> > > > Quuxplusone wrote:
> > > > > What is the purpose of this change?
> > > > > If it's "no functional change" it should be done separately IMHO; if it is supposed to change codegen, then it needs some codegen tests. (From looking at the code: maybe this affects codegen on functions that return member function pointers by value?)
> > > > I think the previous implementation was incorrect.
> > > > Though computeNRVO clears ReturnStmt::NRVOCandidate when the corresponding variable is not NRVO variable, CGStmt checks both of ReturnStmt::NRVOCandidate and VarDecl::NRVOVariable anyway.
> > > > So, computeNRVO took no effect in the previous code, and the absence of computeNRVO here on function templates did not matter.
> > > > Note that there was no chance to call computeNRVO on function template elsewhere too.
> > > >
> > > > OTOH in the new implementation, computeNRVO is necessary, and its absence on function templates matters.
> > > >
> > > > We can remove computeNRVO here separately as a NFC patch and readd the new impl to the same place, but it's probably less readable, IMO.
> > > What happens if we can't tell whether we have an NRVO candidate or not until instantiation:
> > >
> > > ```
> > > template<typename T> X f() {
> > > T v;
> > > return v; // is an NRVO candidate if T is X, otherwise is not
> > > }
> > > ```
> > >
> > > (I think this is not hard to handle: the dependent construct here can only affect whether you get NRVO at all, not which variable you perform NRVO on, but I want to check that it is handled properly.)
> > Ah, so if I understand correctly — which I probably don't —...
> > the parts of this diff related to `isRecordType()` make NRVO more reliable on template functions? Because the existing code has lots of checks for `isRecordType()` which means that they don't run for dependent (as-yet-unknown) types, even if those types are going to turn out to be record types occasionally?
> >
> > I see that line 12838 runs `computeNRVO` unconditionally for *all* ObjCMethodDecls, //even// ones with scalar return types. So maybe running `computeNRVO` unconditionally right here would also be correct?
> > Datapoint: I made a patch to do nothing but replace this condition with `if (Body)` and to remove the similar condition guarding `computeNRVO` in `Sema::ActOnBlockStmtExpr`. It passed the Clang test suite with flying colors. I don't know if this means that the test suite is missing some tests, or if it means that the conditions are unnecessary.
> >
> > IMHO if you're changing this anyway, it would be nice to put the remaining condition/optimization `if (getReturnType().isScalarType()) return;` inside `computeNRVO` itself, instead of repeating that condition at every call site.
> `v` here is marked as an NRVO variable in its function template AST, and that propagates to the instantiation if the return type is compatible.
> https://github.com/llvm-mirror/clang/blob/9c82d4ff6015e4477e924c8a495a834c2fced29e/lib/Sema/SemaTemplateInstantiateDecl.cpp#L743
> the parts of this diff related to `isRecordType()` [...]
Right, they look not RecordTypes in the template context.
> Because the existing code has lots of checks for `isRecordType()` [...]
Hm, maybe. Some of `isRecordType` don't have dependent type handling around them...
> running `computeNRVO` unconditionally right here would also be correct?
Updated. Agree, scalar types should work, IIUC.
================
Comment at: test/CodeGenCXX/nrvo.cpp:139
}
- // FIXME: we should NRVO this variable too.
- X x;
+ // CHECK: tail call {{.*}} @_ZN1XC1ERKS_
return x;
----------------
rsmith wrote:
> tzik wrote:
> > Quuxplusone wrote:
> > > You've changed this function from testing one thing with a FIXME, to testing a completely different thing. Could you add your new code as a new function, and leave the old FIXME alone until it's fixed?
> > > Alternatively, if your patch actually fixes the FIXME, could you just replace the FIXME comment with a CHECK and leave the rest of this function alone?
> > My patch fixes the FIXME.
> > However, on the resulting code of NRVOing,
> > ```
> > X y;
> > return y;
> > ```
> > and
> > ```
> > X x;
> > return x;
> > ```
> > get to the same code and unified. And, the function is simplified as
> > ```
> > X test3(bool B) {
> > X x;
> > return x;
> > }
> > ```
> > without the if statement.
> > So, there will be nothing to CHECK left here if I leave the code as-is. I think that does not fit to the test scenario.
> ... and this is one reason why we don't generally like clang's IR generation tests to enable optimizations. Please consider starting a new test file to test the output from clang's frontend rather than the output from -O1.
OK, I added one.
Repository:
rC Clang
https://reviews.llvm.org/D47067
More information about the cfe-commits
mailing list