[cfe-dev] Extra join points in CFG?

Thu Dec 22 17:51:53 PST 2011

I've filed this as PR 11645.

On Dec 22, 2011, at 5:47 PM, Ted Kremenek wrote:

> On Dec 22, 2011, at 6:02 AM, Xu Zhongxing wrote:
> 
>> Hi Ted,
>> 
>> What do you mean by "does not properly model C++ destructors at all"?
>> Could you give me a test case? Thanks.
> 
> Certainly.  Consider:
> 
> $ cat /tmp/test.cpp
> int foo();
> int bar();
> 
> class A {
> public:
>   A(int x);
>   ~A();
>   operator bool();
> };
> 
> class B {
> public:
>   B(int x);
>   ~B();
>   operator bool();
> };
> 
> class C {
> public:
>   C(int x);
>   ~C();
>   operator bool();
> };
> 
> int test(int x, int y) {
>   if (C b = A(x) || B(y))
>    return foo();
>   return bar();
> }
> 
> 
> Let's look at the CFG without destructors:
> 
> $ clang -cc1 -analyze -analyzer-checker=debug.DumpCFG /tmp/test.cpp
>  [B6 (ENTRY)]
>    Succs (1): B4
> 
>  [B1]
>    1: bar
>    2: [B1.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
>    3: [B1.2]()
>    4: return [B1.3];
>    Preds (1): B3
>    Succs (1): B0
> 
>  [B2]
>    1: foo
>    2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
>    3: [B2.2]()
>    4: return [B2.3];
>    Preds (1): B3
>    Succs (1): B0
> 
>  [B3]
>    1: [B4.8] || [B5.8]
>    2: [B3.1] (ImplicitCastExpr, IntegralCast, int)
>    3: [B3.2] (CXXConstructExpr, class C)
>    4: [B3.3] (ImplicitCastExpr, ConstructorConversion, class C)
>    5: [B3.4] (BindTemporary)
>    6: [B3.5] (ImplicitCastExpr, NoOp, const class C)
>    7: [B3.6]
>    8: C b = A(x).operator _Bool() || B(y).operator _Bool();
>    9: b
>   10: [B3.9].operator _Bool
>   11: [B3.10]()
>   12: [B3.11] (ImplicitCastExpr, UserDefinedConversion, _Bool)
>    T: if [B3.12]
>    Preds (2): B5 B4
>    Succs (2): B2 B1
> 
>  [B4]
>    1: x
>    2: [B4.1] (ImplicitCastExpr, LValueToRValue, int)
>    3: [B4.2] (CXXConstructExpr, class A)
>    4: [B4.3] (BindTemporary)
>    5: A([B4.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
>    6: [B4.5].operator _Bool
>    7: [B4.6]()
>    8: [B4.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
>    T: [B4.8] || ...
>    Preds (1): B6
>    Succs (2): B3 B5
> 
>  [B5]
>    1: y
>    2: [B5.1] (ImplicitCastExpr, LValueToRValue, int)
>    3: [B5.2] (CXXConstructExpr, class B)
>    4: [B5.3] (BindTemporary)
>    5: B([B5.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
>    6: [B5.5].operator _Bool
>    7: [B5.6]()
>    8: [B5.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
>    Preds (1): B4
>    Succs (1): B3
> 
>  [B0 (EXIT)]
>    Preds (2): B1 B2
> 
> 
> This CFG behaves as expected.  We have a single block with the '||' operation as a terminator (block 4), and a single block that merges the values from the LHS and RHS expressions (block B3) which is then terminated with the 'if' branch terminator.  My motivation for looking at this CFG was to further optimize this so that block B5 was seen as dominating block B2.   This would improve the quality of some flow-sensitive (but path-insensitive) analyses in the frontend and also simplify the CFG.
> 
> The CFG gets significantly more complicated (and wrong) when we add implicit destructors:
> 
> $ clang -cc1 -analyze -cfg-add-implicit-dtors -analyzer-checker=debug.DumpCFG /tmp/test.cpp
> 
>  [B8 (ENTRY)]
>    Succs (1): B6
> 
>  [B1]
>    1: [B3.2].~C() (Implicit destructor)
>    2: bar
>    3: [B1.2] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
>    4: [B1.3]()
>    5: return [B1.4];
>    Preds (1): B3
>    Succs (1): B0
> 
>  [B2]
>    1: foo
>    2: [B2.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
>    3: [B2.2]()
>    4: return [B2.3];
>    5: [B3.2].~C() (Implicit destructor)
>    Preds (1): B3
>    Succs (1): B0
> 
>  [B3]
>    1: ~A() (Temporary object destructor)
>    2: C b = A(x).operator _Bool() || B(y).operator _Bool();
>    3: b
>    4: [B3.3].operator _Bool
>    5: [B3.4]()
>    6: [B3.5] (ImplicitCastExpr, UserDefinedConversion, _Bool)
>    T: if [B3.6]
>    Preds (2): B4 B5
>    Succs (2): B2 B1
> 
>  [B4]
>    1: ~B() (Temporary object destructor)
>    Preds (1): B5
>    Succs (1): B3
> 
>  [B5]
>    1: [B6.8] || [B7.8]
>    2: [B5.1] (ImplicitCastExpr, IntegralCast, int)
>    3: [B5.2] (CXXConstructExpr, class C)
>    4: [B5.3] (ImplicitCastExpr, ConstructorConversion, class C)
>    5: [B5.4] (BindTemporary)
>    6: [B5.5] (ImplicitCastExpr, NoOp, const class C)
>    7: [B5.6]
>    8: ~C() (Temporary object destructor)
>    T: [B6.8] || ...
>    Preds (2): B7 B6
>    Succs (2): B3 B4
> 
>  [B6]
>    1: x
>    2: [B6.1] (ImplicitCastExpr, LValueToRValue, int)
>    3: [B6.2] (CXXConstructExpr, class A)
>    4: [B6.3] (BindTemporary)
>    5: A([B6.4]) (CXXFunctionalCastExpr, ConstructorConversion, class A)
>    6: [B6.5].operator _Bool
>    7: [B6.6]()
>    8: [B6.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
>    T: [B6.8] || ...
>    Preds (1): B8
>    Succs (2): B5 B7
> 
>  [B7]
>    1: y
>    2: [B7.1] (ImplicitCastExpr, LValueToRValue, int)
>    3: [B7.2] (CXXConstructExpr, class B)
>    4: [B7.3] (BindTemporary)
>    5: B([B7.4]) (CXXFunctionalCastExpr, ConstructorConversion, class B)
>    6: [B7.5].operator _Bool
>    7: [B7.6]()
>    8: [B7.7] (ImplicitCastExpr, UserDefinedConversion, _Bool)
>    Preds (1): B6
>    Succs (1): B5
> 
>  [B0 (EXIT)]
>    Preds (2): B1 B2
> 
> 
> Notice that blocks B6 and B5 have the same terminator, which itself is suspect.  The rationale is that the temporary for 'B(y)' should only be destructed when we evaluated the RHS of the logical operation.  This is good enough for the static analyzer, but the CFG itself doesn't explicitly capture this dependency.  This can be seen by observing that there is a path in the CFG from the entry block to the exit block where destructor ~B is called but the constructor B() is NOT called.  While the static analyzer will likely get this right and recover that information (since it does enough value flow analysis to understand control-dependent branches), this unnecessarily drops a bunch of control-dependencies from the CFG which will degrade the quality of CFG-based analyses in the frontend.
> 
> The second, and more glaring problem, is that ~C is called twice along a path, both as an implicit destructor (in B1 and B2 respectively) and a destructor for a temporary (in B5).  I may be missing something here, but this doesn't show up in the LLVM IR (which clearly shows a single destructor call to ~C).
> 
> Also, look at where the destructors are called.  In block B1, ~C is correctly called before the call to bar(), but in block B2, the destructor appears after the return statement.  Instead, it should appear immediately after the call to foo().
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev