[cfe-dev] [PATCH] Clang Static Analyzer support for temporary destructors

Alex McCarthy alexmc at google.com
Mon Mar 24 19:57:01 PDT 2014


And one more question: what do you think that the impact would be of
submitting this change without fixing the logical operator liveliness
problem? Would that only cause noreturn destructor-based results to be
sometimes broken (better than the current always broken behavior), or do
you think this change is likely to introduce regressions in other areas?

-Alex


On Mon, Mar 24, 2014 at 11:29 AM, Alex McCarthy <alexmc at google.com> wrote:

> That's very helpful, thanks Jordan!
>
> Pavel's attempt to fix the liveliness problems was
> http://llvm-reviews.chandlerc.com/rL189090, correct? I'll give that a
> more thorough read.
>
> I take it that from the "sometimes-y-ness" of the bug, we don't have any
> better regression test coverage than what's already in temporaries.cpp? In
> particular, we don't have any test cases covering the looping misbehavior
> you described?
>
> Would you mind speeding up my search by pointing me at 1) the IRGen code
> that does this sort of variable synthesis 2) similar code in the clang
> analyzer that does any sort of variable synthesis? (I think Ted said we do
> something similar to track which objects have been constructed so we know
> which destructors need to be called)
>
> Thanks again for your help!
>
> -Alex
>
>
> On Mon, Mar 24, 2014 at 10:15 AM, Manuel Klimek <klimek at google.com> wrote:
>
>> On Mon, Mar 24, 2014 at 6:09 PM, Jordan Rose <jordan_rose at apple.com>wrote:
>>
>>>
>>> On Mar 24, 2014, at 9:29 , Alex McCarthy <alexmc at google.com> wrote:
>>>
>>> I tried to debug this a bit more last week, and I'm still stumped. Ted,
>>> Jordan or Anna, do you have time to take a deeper look at this? Anna, would
>>> you mind clarifying http://llvm.org/bugs/show_bug.cgi?id=18159 (details
>>> below)?
>>>
>>> Here are two test cases similar to the ones I added to
>>> temp-obj-dtors-cfg-output.cpp and temporaries.cpp:
>>>
>>>   int testConsistencyNestedSimple2(bool value) {
>>>     if (value) {
>>>       if (!value || check(NoReturnDtor())) {
>>>         return 1; // impossible
>>>       }
>>>     }
>>>     return 0;
>>>   }
>>>
>>>   int testConsistencyNestedComplex2(bool value) {
>>>     if (value) {
>>>       if (!value || !value || check(NoReturnDtor())) {
>>>         return 1; // impossible
>>>       }
>>>     }
>>>     return 0;
>>>   }
>>>
>>> Since NoReturnDtor is an object with a temporary that never returns,
>>> it's impossible for either function to return 1. Before this patch, I think
>>> both test cases would fail or crash with destructor analysis turned on.
>>> This patch makes the simple case pass, but the compex case fails. I don't
>>> understand why.
>>>
>>>
>>> Take a look at the CFG for testConsistencyNestedComplex2:
>>>
>>> int testConsistencyNestedComplex2(bool value)
>>> * [B10 (ENTRY)]*
>>>    Succs (1): B9
>>>
>>> * [B1]*
>>>    1: 0
>>>    2: return [B1.1];
>>>    Preds (2): B3 B9
>>>    Succs (1): B0
>>>
>>> * [B2]*
>>>    1: 1
>>>    2: return [B2.1];
>>>    Preds (1): B3
>>>    Succs (1): B0
>>>
>>> * [B3]*
>>>    T: if [B5.1]
>>>    Preds (1): B5
>>>    Succs (2): B2 B1
>>>
>>> * [B4]*
>>>    1: ~NoReturnDtor() (Temporary object destructor)
>>>    Preds (1): B5
>>>    Succs (1): B0
>>>
>>> * [B5]*
>>>    1: [B8.3] || [B7.3] || [B6.6]
>>>    T: (Temp Dtor) [B8.3] || [B7.3] || ...
>>>    Preds (3): B6 B7 B8
>>>    Succs (2): B3 B4
>>>
>>> * [B6]*
>>>    1: check
>>>    2: [B6.1] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(struct
>>> NoReturnDtor &&))
>>>    3: NoReturnDtor() (CXXConstructExpr, struct NoReturnDtor)
>>>    4: [B6.3] (BindTemporary)
>>>    5: [B6.4]
>>>    6: [B6.2]([B6.5])
>>>    Preds (1): B7
>>>    Succs (1): B5
>>>
>>> * [B7]*
>>>    1: value
>>>    2: [B7.1] (ImplicitCastExpr, LValueToRValue, _Bool)
>>>    3: ![B7.2]
>>>    T: [B8.3] || [B7.3] || ...
>>>    Preds (1): B8
>>>    Succs (2): B5 B6
>>>
>>> * [B8]*
>>>    1: value
>>>    2: [B8.1] (ImplicitCastExpr, LValueToRValue, _Bool)
>>>    3: ![B8.2]
>>>    T: [B8.3] || ...
>>>    Preds (1): B9
>>>    Succs (2): B5 B7
>>>
>>> * [B9]*
>>>    1: value
>>>    2: [B9.1] (ImplicitCastExpr, LValueToRValue, _Bool)
>>>    T: if [B9.2]
>>>    Preds (1): B10
>>>    Succs (2): B8 B1
>>>
>>> * [B0 (EXIT)]*
>>>    Preds (3): B1 B2 B4
>>>
>>> Notice block B4, where the temporary destructor is run. It's called when
>>> we get to B5 and see that the second-to-last || condition was false.
>>>
>>> What gets us to B5? Every single part of the || chain, because it's the
>>> only way to get to the actual if in B3. But by the time we go there, we've
>>> lost the values of the source expressions, just like you said. IIRC,
>>> without temporary destructors this isn't important, because we can figure
>>> out which branch we came from based on the previous block edge. But now
>>> that's not going to work.
>>>
>>> What Pavel tried is making all the values in the || chain stay live, so
>>> that we could decide whether or not to branch based on that, instead of
>>> based on how we got there. But that created a different problem: if we go
>>> directly from the first || condition to the final one, *we don't have
>>> the second one yet.* Worse, if we're in a loop, the liveness analysis
>>> (sometimes?) gets totally confused, and considers the value of the second
>>> expression from the *last* time through the loop to still be live.
>>> Which then does the wrong thing.
>>>
>>> That's PR18159 <http://llvm.org/bugs/show_bug.cgi?id=18159>.
>>>
>>> (And that "sometimes" is killer. We had a terrible time trying to reduce
>>> the test case—various things would perturb it one way or another to make
>>> the problem appear or not.)
>>>
>>> A while ago Ted mentioned that we should just synthesize shadow bool
>>> variables of some kind to track whether or not something's been
>>> initialized. I'm starting to think more and more that that's the right
>>> idea. (It's basically what IRGen does.) We don't exactly have an
>>> implementation for it, though.
>>>
>>
>> If I remember correctly that was also Chandler's favorite opinion from
>> the start (just do whatever IRGen does).
>>
>>
>>>  Anyway, I think that's what's not working in your test cases. I'll
>>> look through the patch to see if there are more pieces we can commit in
>>> advance. (The extraction of getRegionForConstructedObject seems like good
>>> general cleanup anyway.)
>>>
>>> Jordan
>>>
>>>
>>>
>>> I've attached dumps of clang analyzer's exploded graph state while
>>> processing these functions with my patch with destructor analysis turned on
>>> and off. (the diagrams in the pdfs are very tiny, you'll need something
>>> with serious zoom like the mac Preview app to read them). I think the bug
>>> *might* be caused by constraints being dropped when we analyze new blocks:
>>> in the Simple cases, we maintain a constraint on value being [1, 255] until
>>> we've completely finished processing the if statement and we've executed
>>> its terminator. In the complex case, we forget the constraints on value
>>> when transitioning between blocks before we've finished the if statement. I
>>> *think* this is when we enter the new block created to contain the
>>> temporary. But I'm still not sure how to read this, or whether this is a
>>> red herring.
>>>
>>> If this is a problem where we're forgetting constraints on variables,
>>> this might be http://llvm.org/bugs/show_bug.cgi?id=18159 which Anna
>>> filed in December. The bug says that this problem can be reproduced by
>>> dumping live variables and looking at the output: I've run these test cases
>>> with debug.DumpLiveVars but I'm not sure what to look for. Anna, could
>>> you please add more detail to the bug, or potentially add some test cases
>>> that pass or fail if this bug is fixed or still present?
>>>
>>> Thanks for your help, all. Let me know if you want to do some physical
>>> or virtual in-person debugging. I'm moving pretty slowly due to
>>> unfamiliarity with the tools, codebase, and compilers in general, and I'm
>>> sure a half hour of your experienced eyes on the problem would go a long
>>> way :)
>>>
>>>
>>>
>>> -Alex
>>>
>>>
>>> On Sat, Mar 22, 2014 at 5:33 PM, Alexander Kornienko <alexfh at google.com>wrote:
>>>
>>>> I'm waiting impatiently for this patch to land ;)
>>>> Any progress here? Jordan, AFAIU the ball is on your side?
>>>>
>>>>
>>>>
>>>> On Thu, Mar 6, 2014 at 8:47 PM, Alex McCarthy <alexmc at google.com>wrote:
>>>>
>>>>> Thanks for taking a look, Jordan, I really appreciate the extra eyes.
>>>>> Please find an updated patch attached.
>>>>>
>>>>> I misread the test in temporaries.cpp, thanks. I've restored the test
>>>>> so it's testing the right thing: it still fails and I'm currently stumped.
>>>>> I've produced some simpler test cases that are much easier to analyze, but
>>>>> I still don't understand how this incorrect behavior is triggered. I've
>>>>> also added more CFG dumps to temp-obj-dtors-cfg-output.cpp.test to see if
>>>>> anything stands out. I've also added a lot of debug output which clearly
>>>>> has to be removed before submission. I'd really appreciate any more advice
>>>>> you have here, and I'll keep (slowly) debugging in the mean time.
>>>>>
>>>>> The array processing while loop I added to ExprEngineCxx.cpp's
>>>>> ::VisitCXXDestructor was an attempt to fix the crash that I found when
>>>>> processing initializer_lists of temporary objects. I added a new test to
>>>>> dtor-xx11.cpp to cover this case:
>>>>>
>>>>> namespace Cxx11BraceInit {
>>>>>   struct Foo {
>>>>>     ~Foo() {}
>>>>>   };
>>>>>
>>>>>   void testInitializerList() {
>>>>>     for (Foo foo : {Foo(), Foo()}) {}
>>>>>   }
>>>>> }
>>>>>
>>>>> Without the array special casing, clang crashes while parsing this:
>>>>> clang:
>>>>> third_party/llvm/llvm/tools/clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp:322:
>>>>> void clang::ento::ExprEngine::VisitCXXDestructor(clang::QualType, const
>>>>> clang::ento::MemRegion *, const clang::Stmt *, bool,
>>>>> clang::ento::ExplodedNode *, clang::ento::ExplodedNodeSet &): Assertion
>>>>> `RecordDecl && "Only CXXRecordDecls destructors can be processed"' failed.
>>>>>
>>>>> I've updated the special casing to be a bit simpler, but I think it
>>>>> (or something like it) is necessary to avoid crashing on initializer_lists.
>>>>> What do you think?
>>>>>
>>>>> Your recommendation of skipping destructors in VisitCXXConstructExpr
>>>>> worked perfectly: I've updated getRegionForConstructedObject in
>>>>> ExprEngineCXX.cpp to recurse when it finds a destructor, and that fixed the
>>>>> spurious garbage return value warnings, which now means that this patch
>>>>> eliminates all large scale false positives on this project's codebase.
>>>>> Thanks for your help!
>>>>>
>>>>> If there are any parts of this change that you'd like to split out and
>>>>> individually commit (like the dtor handling in PathDiagnostic.cpp) that
>>>>> sounds fine to me. I'm happy with whatever patching combination you'd
>>>>> prefer if you help me through the splitting and committing process.
>>>>>
>>>>> Thanks again for your help with this!
>>>>>
>>>>> -Alex
>>>>>
>>>>>
>>>>> On Tue, Mar 4, 2014 at 7:45 PM, Jordan Rose <jordan_rose at apple.com>wrote:
>>>>>
>>>>>> Wow, thanks for working on this, Alex. Unfortunately I think there
>>>>>> may be a few more problems than simply turning things back on. In
>>>>>> particular, from test/Analysis/temporaries.cpp:
>>>>>>
>>>>>>      if (compute(i == 5 &&
>>>>>>                  (i == 4 || i == 4 ||
>>>>>>                   compute(i == 5 && (i == 4 ||
>>>>>> check(NoReturnDtor()))))) ||
>>>>>>          i != 4) {
>>>>>> -      clang_analyzer_eval(true); // no warning, unreachable code
>>>>>> +      clang_analyzer_eval(true); // expected-warning{{TRUE}}
>>>>>> (possible when i=6)
>>>>>>      }
>>>>>>
>>>>>> i cannot equal 6 at this point in the code; testConsistencyNested has
>>>>>> a line earlier that says "if (i != 5) return". So we're not getting the
>>>>>> correct behavior here—either the destructor isn't ending up in the right
>>>>>> place in the CFG, or something is invalidating the value of 'i' that
>>>>>> shouldn't be. I would guess it's the former, since this series of tests
>>>>>> were designed to check Pavel's reworking of the CFG.
>>>>>>
>>>>>> This part confuses me:
>>>>>>
>>>>>> +    while (const ArrayType *ArrayType =
>>>>>> Ctx.getAsArrayType(ObjectType)) {
>>>>>> +      ObjectType = ArrayType->getElementType();
>>>>>> +      VisitCXXDestructor(ObjectType, Dest, S, IsBaseDtor, Pred, Dst);
>>>>>> +    }
>>>>>>
>>>>>> For a multidimensional array of, say, Foo, this visits "array",
>>>>>> "array[0]", "array[0][0]", etc, down to the single element case. In
>>>>>> addition, visiting "array[0]" will *also* trigger a destruction of
>>>>>> "array[0][0]", etc., since this loop happens as the very first thing in
>>>>>> VisitCXXDestructor.
>>>>>>
>>>>>> I would just leave the FIXME as is here, and not worry about this.
>>>>>>
>>>>>>    // FIXME: We need to run the same destructor on every element of
>>>>>> the array.
>>>>>>    // This workaround will just run the first destructor (which will
>>>>>> still
>>>>>>    // invalidate the entire array).
>>>>>>
>>>>>> And then we have your new test case:
>>>>>>
>>>>>> +  //TODO: figure out why this case causes an unexpected "Undefined
>>>>>> or garbage value returned to caller" warning
>>>>>> +  bool testNamedCustomDestructor() {
>>>>>> +    if (CheckCustomDestructor c = CheckCustomDestructor())
>>>>>> +      return true;
>>>>>> +    return false;
>>>>>> +  }
>>>>>>
>>>>>> First of all, nicely discovered. I'm not immediately sure what's
>>>>>> wrong here, but let's take a look at the CFG:
>>>>>>
>>>>>> * [B3]*
>>>>>>    1: CheckCustomDestructor() (CXXConstructExpr, struct
>>>>>> CheckCustomDestructor)
>>>>>>    2: [B3.1] (BindTemporary)
>>>>>>    3: [B3.2] (ImplicitCastExpr, NoOp, const struct
>>>>>> CheckCustomDestructor)
>>>>>>    4: [B3.3]
>>>>>>    5: [B3.4] (CXXConstructExpr, struct CheckCustomDestructor)
>>>>>>    6: ~CheckCustomDestructor() (Temporary object destructor)
>>>>>>    7: CheckCustomDestructor c = CheckCustomDestructor();
>>>>>>
>>>>>> B3.1 is the actual creation of the temporary. B3.5 is the copy
>>>>>> constructor required by the C++ standard to copy the temporary into 'c'.
>>>>>> B3.6 is the destruction of the temporary, and B3.7 is the actual VarDecl
>>>>>> for 'c'. (The block then goes on to call 'operator bool' on 'c' and then do
>>>>>> the if-branch.)
>>>>>>
>>>>>> The current handling of constructors for VarDecls is a bit hacky. If
>>>>>> you look in ExprEngine::VisitCXXConstructExpr, you'll notice it tries to
>>>>>> look ahead to the next CFG element to see if it's constructing a particular
>>>>>> variable. If so, it sets the target region to that variable. The trouble
>>>>>> is, there's now a destructor between B3.5 and B3.7, so my guess is that the
>>>>>> analyzer has decided it's *not* constructing a variable. I am okay
>>>>>> with cheating right now by skipping over destructor CFG elements in
>>>>>> VisitCXXConstructExpr, but I haven't thought about if there's a better way
>>>>>> to tell that a particular CXXConstructExpr goes with a particular VarDecl.
>>>>>>
>>>>>> As far as this part goes...
>>>>>>
>>>>>> -  case CFGElement::TemporaryDtor:
>>>>>> +  case CFGElement::TemporaryDtor: {
>>>>>> +    const CFGTemporaryDtor &Dtor = Source.castAs<CFGTemporaryDtor>();
>>>>>> +    return PathDiagnosticLocation(Dtor.getBindTemporaryExpr(), SM,
>>>>>> CallerCtx);
>>>>>> +  }
>>>>>>
>>>>>> ...that seems like a good obvious change, and I'm happy to commit
>>>>>> that (or for you to commit that) without turning anything else on. :-)
>>>>>>
>>>>>> I know this is a lot to throw at you at once, but please continue to
>>>>>> e-mail me with questions and progress. I'm very happy that someone is able
>>>>>> to put time into this.
>>>>>>
>>>>>> Jordan
>>>>>>
>>>>>>
>>>>>> On Mar 2, 2014, at 0:41 , Alex McCarthy <alexmc at google.com> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm running clang's static analyzer on a C++ codebase at Google. I
>>>>>> saw a roughly a 50% false positive rate which stemmed from the analyzer not
>>>>>> recognizing temporary object destructors: this issue is discussed in some
>>>>>> length in another thread, which mentions a similar error rate on the
>>>>>> Chromium codebase:
>>>>>> http://comments.gmane.org/gmane.comp.compilers.clang.devel/33901
>>>>>>
>>>>>> Starting from Pavel's work which was reverted in
>>>>>> http://llvm-reviews.chandlerc.com/rL186925 , I've put together a new
>>>>>> patch (see attachment) that attempts to fix temporary object destructor
>>>>>> handling.
>>>>>>
>>>>>> This new patch fixes all of the new regression tests added after
>>>>>> Pavel's change was reverted, notably including
>>>>>> http://llvm-reviews.chandlerc.com/rL187133 . I've also fixed some
>>>>>> other crashes, including a crash when processing an array of temporary
>>>>>> objects use in a C++11 initializer_list, covered by a new regression test
>>>>>> in cfe/test/Analysis/dtor-cxx11.cpp . And most importantly, running
>>>>>> clang with this patch eliminates the 50% false positive rate I saw
>>>>>> previously (from ~800 warnings to ~400 across the ~1700 file codebase).
>>>>>>
>>>>>>
>>>>>> Now for the bad news:
>>>>>>
>>>>>> This patch introduces a new regression which wasn't covered by
>>>>>> existing tests: named temporaries declared and used within if statements
>>>>>> are considered dead while they're still being used, which results in "Undefined
>>>>>> or garbage value returned to caller" errors. I've added regression tests
>>>>>> to test/Analysis/dtor.cpp to cover this case, which currently fail. I've
>>>>>> also updated test/Analysis/temp-obj-dtors-cfg-output.cpp with relevant CFG
>>>>>> dumps to try to debug the problem. This new false positive is much nosier
>>>>>> than the ones this patch fixes: the only advantage to the current patch
>>>>>> as-is is that the garbage return value warnings are emitted in a small
>>>>>> collection of header files, making them much easier to ignore en masse.
>>>>>>
>>>>>> I don't have any compiler experience, so I'm moving slowly in the
>>>>>> clang codebase and could use some help understanding where to look next.
>>>>>> I've mostly been handling each crash or error as I find it, but I don't
>>>>>> have a high level understanding of the impact or context of my change. In
>>>>>> particular, I don't know how to read the CFG dumps I've generated, so I'm
>>>>>> not sure where things are going wrong. Ted, Jordan, and Anna: Manuel
>>>>>> Klimek mentioned that you've looked into this issue at length. Do have any
>>>>>> advice on what I'm doing wrong, or could you suggest other approaches I
>>>>>> might be able to try? Anything you can think of that can speed up my search
>>>>>> for a fix would be greatly appreciated.
>>>>>>
>>>>>> If we can get this patch working, it should address the following
>>>>>> issues:
>>>>>> http://llvm.org/bugs/show_bug.cgi?id=15599
>>>>>> http://llvm.org/bugs/show_bug.cgi?id=16664
>>>>>> http://llvm.org/bugs/show_bug.cgi?id=18159 (not sure, this bug is
>>>>>> referenced by a newly fixed test in test/Analysis/temporaries.cpp)
>>>>>>
>>>>>> Thanks for your help,
>>>>>> -Alex McCarthy
>>>>>>  <temporary-destructors.patch>
>>>>>>
>>>>>>
>>>>
>>> <ClangSimpleNoDtors.pdf><ClangComplexNoDtors.pdf><ClangSimple.pdf>
>>> <ClangComplex.pdf>
>>>
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140324/1ce47025/attachment.html>


More information about the cfe-dev mailing list