[cfe-dev] Extra join points in CFG?

Tue Jan 10 07:45:20 PST 2012

On 23-12-11 2:47, Ted Kremenek wrote:
> On Dec 22, 2011, at 6:02 AM, Xu Zhongxing wrote:
>
>> Hi Ted,
>>
>> What do you mean by "does not properly model C++ destructors at all"?
>> Could you give me a test case? Thanks.
>
> Certainly. Consider:
>
> $ cat /tmp/test.cpp
> int foo();
> int bar();
>
> class A {
> public:
> A(int x);
> ~A();
> operator bool();
> };
>
> class B {
> public:
> B(int x);
> ~B();
> operator bool();
> };
>
> class C {
> public:
> C(int x);
> ~C();
> operator bool();
> };
>
> int test(int x, int y) {
> if (C b = A(x) || B(y))
> return foo();
> return bar();
> }
>
>
> Let's look at the CFG without destructors:
>
> $ clang -cc1 -analyze -analyzer-checker=debug.DumpCFG /tmp/test.cpp
> [B6 (ENTRY)]
> Succs (1): B4
>
> [B1]
> 1: bar
> 2: [B1.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
> 3: [B1.2]()
> 4: return [B1.3];
> Preds (1): B3
> Succs (1): B0
>
> [B2]
> 1: foo
> 2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
> 3: [B2.2]()
> 4: return [B2.3];
> Preds (1): B3
> Succs (1): B0
>
> [B3]
> 1: [B4.8] || [B5.8]
> 2: [B3.1] (ImplicitCastExpr, IntegralCast, int)
> 3: [B3.2] (CXXConstructExpr, class C)
> 4: [B3.3] (ImplicitCastExpr, ConstructorConversion, class C)
> 5: [B3.4] (BindTemporary)
> 6: [B3.5] (ImplicitCastExpr, NoOp, const class C)
> 7: [B3.6]
> 8: C b = A(x).operator _Bool() || B(y).operator _Bool();
> 9: b
> 10: [B3.9].operator _Bool
> 11: [B3.10]()
> 12: [B3.11] (ImplicitCastExpr, UserDefinedConversion, _Bool)
> T: if [B3.12]
> Preds (2): B5 B4
> Succs (2): B2 B1
>
> [B4]
> 1: x
> 2: [B4.1] (ImplicitCastExpr, LValueToRValue, int)
> 3: [B4.2] (CXXConstructExpr, class A)
> 4: [B4.3] (BindTemporary)
> 5: A([B4.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
> 6: [B4.5].operator _Bool
> 7: [B4.6]()
> 8: [B4.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
> T: [B4.8] || ...
> Preds (1): B6
> Succs (2): B3 B5
>
> [B5]
> 1: y
> 2: [B5.1] (ImplicitCastExpr, LValueToRValue, int)
> 3: [B5.2] (CXXConstructExpr, class B)
> 4: [B5.3] (BindTemporary)
> 5: B([B5.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
> 6: [B5.5].operator _Bool
> 7: [B5.6]()
> 8: [B5.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
> Preds (1): B4
> Succs (1): B3
>
> [B0 (EXIT)]
> Preds (2): B1 B2
>
>
> This CFG behaves as expected. We have a single block with the '||'
> operation as a terminator (block 4), and a single block that merges the
> values from the LHS and RHS expressions (block B3) which is then
> terminated with the 'if' branch terminator. My motivation for looking at
> this CFG was to further optimize this so that block B5 was seen as
> dominating block B2. This would improve the quality of some
> flow-sensitive (but path-insensitive) analyses in the frontend and also
> simplify the CFG.
>
> The CFG gets significantly more complicated (and wrong) when we add
> implicit destructors:
>
> $ clang -cc1 -analyze -cfg-add-implicit-dtors
> -analyzer-checker=debug.DumpCFG /tmp/test.cpp
>
> [B8 (ENTRY)]
> Succs (1): B6
>
> [B1]
> 1: [B3.2].~C() (Implicit destructor)
> 2: bar
> 3: [B1.2] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
> 4: [B1.3]()
> 5: return [B1.4];
> Preds (1): B3
> Succs (1): B0
>
> [B2]
> 1: foo
> 2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
> 3: [B2.2]()
> 4: return [B2.3];
> 5: [B3.2].~C() (Implicit destructor)
> Preds (1): B3
> Succs (1): B0
>
> [B3]
> 1: ~A() (Temporary object destructor)
> 2: C b = A(x).operator _Bool() || B(y).operator _Bool();
> 3: b
> 4: [B3.3].operator _Bool
> 5: [B3.4]()
> 6: [B3.5] (ImplicitCastExpr, UserDefinedConversion, _Bool)
> T: if [B3.6]
> Preds (2): B4 B5
> Succs (2): B2 B1
>
> [B4]
> 1: ~B() (Temporary object destructor)
> Preds (1): B5
> Succs (1): B3
>
> [B5]
> 1: [B6.8] || [B7.8]
> 2: [B5.1] (ImplicitCastExpr, IntegralCast, int)
> 3: [B5.2] (CXXConstructExpr, class C)
> 4: [B5.3] (ImplicitCastExpr, ConstructorConversion, class C)
> 5: [B5.4] (BindTemporary)
> 6: [B5.5] (ImplicitCastExpr, NoOp, const class C)
> 7: [B5.6]
> 8: ~C() (Temporary object destructor)
> T: [B6.8] || ...
> Preds (2): B7 B6
> Succs (2): B3 B4
>
> [B6]
> 1: x
> 2: [B6.1] (ImplicitCastExpr, LValueToRValue, int)
> 3: [B6.2] (CXXConstructExpr, class A)
> 4: [B6.3] (BindTemporary)
> 5: A([B6.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
> 6: [B6.5].operator _Bool
> 7: [B6.6]()
> 8: [B6.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
> T: [B6.8] || ...
> Preds (1): B8
> Succs (2): B5 B7
>
> [B7]
> 1: y
> 2: [B7.1] (ImplicitCastExpr, LValueToRValue, int)
> 3: [B7.2] (CXXConstructExpr, class B)
> 4: [B7.3] (BindTemporary)
> 5: B([B7.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
> 6: [B7.5].operator _Bool
> 7: [B7.6]()
> 8: [B7.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
> Preds (1): B6
> Succs (1): B5
>
> [B0 (EXIT)]
> Preds (2): B1 B2
>
>
> Notice that blocks B6 and B5 have the same terminator, which itself is
> suspect. The rationale is that the temporary for 'B(y)' should only be
> destructed when we evaluated the RHS of the logical operation. This is
> good enough for the static analyzer, but the CFG itself doesn't
> explicitly capture this dependency. This can be seen by observing that
> there is a path in the CFG from the entry block to the exit block where
> destructor ~B is called but the constructor B() is NOT called. While the
> static analyzer will likely get this right and recover that information
> (since it does enough value flow analysis to understand
> control-dependent branches), this unnecessarily drops a bunch of
> control-dependencies from the CFG which will degrade the quality of
> CFG-based analyses in the frontend.

Well, B6 and B5 have the same conditions, and the CFG is correct in that 
respect. So the flow is that if B7 is executed, B4 will also always be 
executed (and the other way around). So it boils down to: is it allowed 
for one block to access the condition calculated in a preceeding block?

> The second, and more glaring problem, is that ~C is called twice along a
> path, both as an implicit destructor (in B1 and B2 respectively) and a
> destructor for a temporary (in B5). I may be missing something here, but
> this doesn't show up in the LLVM IR (which clearly shows a single
> destructor call to ~C).
>
> Also, look at where the destructors are called. In block B1, ~C is
> correctly called before the call to bar(), but in block B2, the
> destructor appears after the return statement. Instead, it should appear
> immediately after the call to foo().

It gets worse when you write:

int test(int x, int y) {
   if (C b = A(x) || B(y))
     return foo();
   else
     return bar();
}

Not only will both destructor calls appear after the return statement, 
but an extra block will be generated:

  [B1]
    1: [B4.2].~C() (Implicit destructor)
    Succs (1): B0

Also, should the destructors A and/or B be called after the condition 
has been calculated, or only after calling ~C?

-- Erik.