[cfe-dev] [analyzer] Temporaries.

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Fri Mar 23 12:01:08 PDT 2018


== Errata ==

In the previous message i wrote:

 > copy elision makes paths longer

I meant lack of copy elision><


== C++17, Functional casts, Liveness ==

I wanted to share this example to document a certain bug we seem to 
currently have with our liveness analysis, in order to get back to it 
later and also to justify how careful we should be.

A naive attempt to support C++17 mandatory copy elision for variable and 
constructor initializers (return values are a bit more tricky) would be 
to model C++17-specific construction contexts added in 
https://reviews.llvm.org/D44597 and https://reviews.llvm.org/D44763 
similarly to their "simple" counterparts.

This approach fails, however, in a fairly funny manner. Consider an example:

```
class C1 {
   int x;
public:
   C1(int x): x(x) {}
   int getX() const { return x; }
   ~C1();
};

class C2 {
   int x;
   int y;
public:
   C2(int x, int y): x(x), y(y) {}
   int getX() const { return x; }
   int getY() const { return y; }
   ~C2();
};

void foo(int coin) {
   C1 c1 = coin ? C1(1) : C1(2);
   c1.getX();
   C2 c2 = coin ? C2(3, 4) : C2(5, 6);
   c2.getX();
}
```

In this example C1 is a wrapper around a single integer and C2 is a 
wrapper around a pair of integers; there are no other differences 
between these two classes. However, if we attempt to treat 
CXX17ElidedCopyConstructionContext and SimpleVariableConstructionContext 
similarly in ExprEngine::getRegionForConstructedObject(), we suddenly 
get the following warning:

$ clang -cc1 -analyze -analyzer-checker core test.cpp -std=c++17
test.cpp:14:22: warning: Undefined or garbage value returned to caller
   int getX() const { return x; }
                      ^~~~~~~~
1 warning generated.

The warning appears for C2 but not for C1. The warning is a false 
positive that appears because construction results were removed from the 
Store because variable 'c2' is not yet live when the construction 
happens. However, variable 'c1' is live when its construction happens.

It is, emm, uhm, moderately easy to notice that the AST for constructors 
of C1 and C2 on ternary operator branches are completely different: 
C1(1) is a functional cast from integer 1 to class C1, while C2 is a 
normal temporary object expression.

Indeed, here's the AST for c1:

```
     |-DeclStmt 0x7ff92c002570 <line:20:3, col:31>
     | `-VarDecl 0x7ff92c002100 <col:3, col:30> col:6 used c1 'C1' cinit
     |   `-ExprWithCleanups 0x7ff92c002558 <col:11, col:30> 'C1'
     |     `-ConditionalOperator 0x7ff92c002528 <col:11, col:30> 'C1'
     |       |-ImplicitCastExpr 0x7ff92c002510 <col:11> 'bool' 
<IntegralToBoolean>
     |       | `-ImplicitCastExpr 0x7ff92c0024f8 <col:11> 'int' 
<LValueToRValue>
     |       |   `-DeclRefExpr 0x7ff92c002160 <col:11> 'int' lvalue 
ParmVar 0x7ff92c001f80 'coin' 'int'
     |       |-CXXFunctionalCastExpr 0x7ff92c002418 <col:18, col:22> 
'C1' functional cast to class C1 <ConstructorConversion>
     |       | `-CXXBindTemporaryExpr 0x7ff92c0023f8 <col:18, col:22> 
'C1' (CXXTemporary 0x7ff92c0023f0)
     |       |   `-CXXConstructExpr 0x7ff92c002388 <col:18, col:22> 'C1' 
'void (int)'
     |       |     `-IntegerLiteral 0x7ff92c002198 <col:21> 'int' 1
     |       `-CXXFunctionalCastExpr 0x7ff92c0024d0 <col:26, col:30> 
'C1' functional cast to class C1 <ConstructorConversion>
     |         `-CXXBindTemporaryExpr 0x7ff92c0024b0 <col:26, col:30> 
'C1' (CXXTemporary 0x7ff92c0024a8)
     |           `-CXXConstructExpr 0x7ff92c002470 <col:26, col:30> 'C1' 
'void (int)'
     |             `-IntegerLiteral 0x7ff92c002450 <col:29> 'int' 2
```

... and here's the AST for c2:

```
     |-DeclStmt 0x7ff92c002b10 <line:22:3, col:37>
     | `-VarDecl 0x7ff92c0026c8 <col:3, col:36> col:6 used c2 'C2' cinit
     |   `-ExprWithCleanups 0x7ff92c002af8 <col:11, col:36> 'C2'
     |     `-ConditionalOperator 0x7ff92c002ac8 <col:11, col:36> 'C2'
     |       |-ImplicitCastExpr 0x7ff92c002ab0 <col:11> 'bool' 
<IntegralToBoolean>
     |       | `-ImplicitCastExpr 0x7ff92c002a98 <col:11> 'int' 
<LValueToRValue>
     |       |   `-DeclRefExpr 0x7ff92c002728 <col:11> 'int' lvalue 
ParmVar 0x7ff92c001f80 'coin' 'int'
     |       |-CXXBindTemporaryExpr 0x7ff92c0029b8 <col:18, col:25> 'C2' 
(CXXTemporary 0x7ff92c0029b0)
     |       | `-CXXTemporaryObjectExpr 0x7ff92c002968 <col:18, col:25> 
'C2' 'void (int, int)'
     |       |   |-IntegerLiteral 0x7ff92c002760 <col:21> 'int' 3
     |       |   `-IntegerLiteral 0x7ff92c002780 <col:24> 'int' 4
     |       `-CXXBindTemporaryExpr 0x7ff92c002a78 <col:29, col:36> 'C2' 
(CXXTemporary 0x7ff92c002a70)
     |         `-CXXTemporaryObjectExpr 0x7ff92c002a28 <col:29, col:36> 
'C2' 'void (int, int)'
     |           |-IntegerLiteral 0x7ff92c0029e8 <col:32> 'int' 5
     |           `-IntegerLiteral 0x7ff92c002a08 <col:35> 'int' 6
```

Note that the AST for c1 has MORE expressions (which presumably means 
more chances to garbage-collect), but in fact it is c2 that is getting 
garbage-collected.

This looks to me like a bug - or at least an inconsistency - in liveness 
analysis, and the long-term fix that will allow us to finally support 
C++17 would probably be to fix this inconsistency.

The fact that liveness analysis has never given us any problems so far - 
apart from not supporting some new features we requested from it - is 
very surprising. Our liveness analysis must be very conservative if the 
first problem we've seen with it in, like, years is in a C++17 feature 
that was not even planned back when liveness analysis was written. If 
liveness analysis is so conservative, maybe we could speed up the 
analyzer dramatically by making it eliminate more expressions and 
variables during removeDead()?


On 3/21/18 8:45 PM, Artem Dergachev wrote:
> == Copy elision ==
>
> = Why? =
>
> I've underestimated the importance of implementing copy elision at first.
>
> Eliding copies changes observable behavior of the program if the 
> copy/move constructor has side effects. Whether to elide creation of a 
> temporary object in certain operations is up to implementation, but 
> most implementations currently support and prefer copy elision, so if 
> we don't do the same in the analyzer we'd be analyzing a different 
> program, and users who don't care about portability may become grumpy 
> because they are sure that copy elision is happening.
>
> Additionally, copy elision makes paths longer - including the number 
> of diagnostic pieces - which makes reports harder to understand and 
> exhausts the analyzer's budget faster.
(cut)


More information about the cfe-dev mailing list