[cfe-dev] [RFC] Adding lifetime analysis to clang

Thu Mar 7 15:04:41 PST 2019

Hi!

I started to collect some more evidence on why type category-based local
warnings are a worthy addition. I worked used Clang as a target of this
experiment because we are all familiar with this project. The method was to
grep for some keywords like temporary, dangling, StringRef, and ArrayRef. I
filtered the results since some of the commits were not related to lifetime
issues at all (of course, there is no guarantee that I did not miss a lot
of interesting commits). After that, I compiled each commit with the
modified clang and categorized the results:

Errors that would have been caught by the warning (so less noise on the
build bots):
*
https://github.com/llvm-mirror/clang/commit/be0345e47273a01065874e568ad673f89b75317a
(StringRef RetTyName = RetTy.getAsString();)
*
https://github.com/llvm-mirror/clang/commit/7140ad92debc63af78a9c50dcad2a0b6d4ebc680
*
https://github.com/llvm-mirror/clang/commit/51ff8d7aa82cee89c4770832084ef5052692090a
*
https://github.com/llvm-mirror/clang/commit/4fe9644dc3533e4517f980645957c6fc9408c6f6
*
https://github.com/llvm-mirror/clang/commit/20f89390c7f9eaf7c62bcc9145fbc5ba7725557b
*
https://github.com/llvm-mirror/clang/commit/a638ce2afc68a2f67142ca8c7842172d49be61c0
*
https://github.com/llvm-mirror/clang/commit/77cd074b2f44d342085066e3a2046e76a8d0a655
*
https://github.com/llvm-mirror/clang/commit/dfa25cb2d16f812bb28bdf60b6b891f4e9140072

Old revision did not compile with modern Clang but the problem is very
likely to be caught:
*
https://github.com/llvm-mirror/clang/commit/3846ca29a8cc1d376a4b695194c29952dbbfb544
*
https://github.com/llvm-mirror/clang/commit/3791814b2c5efb3986ec4dd40c75c9d28f325222

Errors that are not caught because assignments are not checked, only
initializations (could be added as a separate flag?):
*
https://github.com/llvm-mirror/clang/commit/77c4e2380a572bfab344ca1bfc7b59beb9cf8fff
*
https://github.com/llvm-mirror/clang/commit/9670762c0d7fb6f3c13c61f044e335cf2f9523f9
*
https://github.com/llvm-mirror/clang/commit/a6fb4e07ed2e9145596490240faa1eebff5a53c0
*
https://github.com/llvm-mirror/clang/commit/61ed9caca5dd6a9b9ccee7fb51296234e6cc68ad

Errors that are not caught because they would need some semantic
information about APIs that are not covered by type categories or CFG based
analysis:
*
https://github.com/llvm-mirror/clang/commit/3c57879f0dd98cf30cdc0672ad727f150d304c1f
*
https://github.com/llvm-mirror/clang/commit/431fb79151297df73530ca45b57f7481fc5a9bd3
*
https://github.com/llvm-mirror/clang/commit/9fec733e8e62b1e2ccf0d39916899f2c2d92c764
(The problem is related to const PrintingPolicy &Policy; member of
FailedBooleanConditionPrinterHelper)
*
https://github.com/llvm-mirror/clang/commit/1b48633d308460049e1132960f3c33c850de5bd2

Errors not caught, probably bug in the prototype (and should be feasible to
catch), need to investigate:
*
https://github.com/llvm-mirror/clang/commit/97973546416c9d04b518c1b9572815b42872400f
*
https://github.com/llvm-mirror/clang/commit/1d22fa209d194c8ab676498b13ff5e25cb07d17d
*
https://github.com/llvm-mirror/clang/commit/563fb903fab15972e2d9c66b0ae046a94be86a71

Skipped, did not see any code that could trigger the behavior while
examining the diff and did not bother looking deeper:
*
https://github.com/llvm-mirror/clang/commit/35276390608adec9840c7ed64f2ee211162a97cb

All of these commits slipped through the review process. I feel like this
data could justify these checks (even more so since we can restrict it to
annotated and STL types by default reducing the number of false positives
to virtually 0).
What do you think?

Regards,
Gábor

On Fri, 1 Mar 2019 at 14:03, Manuel Klimek <klimek at google.com> wrote:

>
>
> On Fri, Mar 1, 2019 at 1:49 PM Gábor Horváth <xazax.hun at gmail.com> wrote:
>
>>
>>
>> On Fri, 1 Mar 2019 at 12:54, Manuel Klimek <klimek at google.com> wrote:
>>
>>> (correcting dmitri's email address)
>>>
>>> On Fri, Mar 1, 2019 at 12:24 PM Gábor Horváth <xazax.hun at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, 1 Mar 2019 at 12:13, Manuel Klimek <klimek at google.com> wrote:
>>>>
>>>>> On Fri, Mar 1, 2019 at 12:10 PM Manuel Klimek <klimek at google.com>
>>>>> wrote:
>>>>>
>>>>>> -apple George + google George to prevent bounces :)
>>>>>>
>>>>>> On Fri, Mar 1, 2019 at 12:06 PM Manuel Klimek <klimek at google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Mar 1, 2019 at 11:51 AM Gábor Horváth <xazax.hun at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Manuel,
>>>>>>>>
>>>>>>>> On Fri, 30 Nov 2018 at 12:04, Manuel Klimek <klimek at google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Nov 30, 2018 at 12:00 PM Gábor Horváth <
>>>>>>>>> xazax.hun at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Unfortunately, we do not have real world experience on a large
>>>>>>>>>> codebase for the following reasons:
>>>>>>>>>> * The implementation we have does not support annotations yet,
>>>>>>>>>> and large projects are likely to contain exceptions to the default rules
>>>>>>>>>> (even if only a few)
>>>>>>>>>> * The analysis has an expected coding style, e.g. it does not
>>>>>>>>>> reason about arithmetic on raw pointers
>>>>>>>>>>
>>>>>>>>>> We did run it on some projects like LLVM and there were a lot of
>>>>>>>>>> noise mainly due to the unsupported language features like pointer
>>>>>>>>>> arithmetic.
>>>>>>>>>> The rest of the false positives were mostly due to the basic
>>>>>>>>>> assumption of the analysis that every path is feasible.
>>>>>>>>>> We did find some true positives with a similar analysis looking
>>>>>>>>>> for use after moves in LLVM. Those true positives were redundant
>>>>>>>>>> std::forward calls that are not going to cause any runtime error but still
>>>>>>>>>> removing them would make the code cleaner and less likely to misbehave in
>>>>>>>>>> the future. See:
>>>>>>>>>> https://github.com/llvm-mirror/llvm/blob/master/include/llvm/Support/Error.h#L894
>>>>>>>>>> (The same arguments forwarded multiple times in a loop. Since the callee
>>>>>>>>>> never actually moves the arguments we will not end up using a moved from
>>>>>>>>>> object. But if we never move the argument, why would we use std::forward in
>>>>>>>>>> the first place?)
>>>>>>>>>>
>>>>>>>>>> But even if the cfg-based lifetime analysis end up being a coding
>>>>>>>>>> style specific check, the rest of the (cfg-less) clang warnings would
>>>>>>>>>> become much more powerful after type categories are upstreamed. They could
>>>>>>>>>> catch a lot of errors without having any false positives by default. Thus,
>>>>>>>>>> I do see the value going forward, even if we do not have much real world
>>>>>>>>>> experience yet.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Could you run the restricted checks on llvm?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry for the delay, but finally I found some time to actually
>>>>>>>> implement the CFG-less checks and run them on LLVM. The quick prototype is
>>>>>>>> available here:
>>>>>>>> https://github.com/mgehre/clang/commits/lifetime-warning-typecategories
>>>>>>>> Literally, all of the false positives I encountered after running
>>>>>>>> on LLVM and Clang were due to llvm::ValueHandleBase being miscategorized as
>>>>>>>> an owner (because of its user-defined destructor).
>>>>>>>> So annotating this class would render the LLVM codebase clean of
>>>>>>>> these new lifetime errors. Do you think this is a strong enough evidence to
>>>>>>>> start upstreaming the type categories or do you think we should do
>>>>>>>> additional measurements?
>>>>>>>>
>>>>>>>
>>>>>>> I'd like to get Richard's opinion on this.
>>>>>>>
>>>>>>
>>>>> So just to make sure I understand correctly:
>>>>> The result is that in LLVM we have:
>>>>> - 1 false positive root cause (causing X false positives)
>>>>> - 0 true positives
>>>>> ?
>>>>>
>>>>>
>>>>
>>>> Correct.
>>>>
>>>
>>> Unfortunately that doesn't sound like a really strong argument in favor
>>> - if you have something that works cleanly against clang trunk I can try to
>>> find somebody willing to try it on a bit of our internal codebase, but I
>>> can't promise anything.
>>>
>>> Basically what I'd be looking for is some evidence that the false
>>> positive ratio is not too high, and for that we'd need at least a codebase
>>> containing true positives.
>>>
>>
>> I do not agree.
>>
>
> Sorry if what I said came over stronger than I intended it to - with "not
> a strong argument in favor" I didn't mean that it's an argument against,
> just that it doesn't seem like overwhelming data for.
>
>
>> LLVM has good coverage and tests are regularly run with sanitizers on.
>> Thus, I would except most of the lifetime related issues being caught by
>> dynamic analysis in our case. I do remember, however, such problems to slip
>> through the reviews and break the build bots occasionally.
>>
>
> Would be interesting to have some evidence / examples for this that this
> check would have caught - I definitely agree that lifetime issues fall
> through the cracks, the question is whether those actually lead to true
> positives, or slip through the simple check.
>
>
>> I think, the value of those warnings are to catch those errors early,
>> saving time for the developers. So even though if it is unlikely to find
>> true positives in sanitizer clean builds for certain projects I still see
>> the value. Note that, this is true for most of the similar warnings in
>> clang. LLVM (and probably most other projects) is clean of
>> -Wreturn-stack-address, still we do not question whether that warning is
>> useful.
>>
>
> Agreed. I'm really mainly looking for evidence for the decision. I'm not
> saying that evidence doesn't exist, but that we have to make sure to find
> it.
>
>
>>
>>
>>
>>>
>>>
>>>>
>>>>
>>>>>
>>>>>>>
>>>>>>>> I also plan to look up some lifetime fixes/reverts in the commit
>>>>>>>> history to check if some of those problems would have been caught by these
>>>>>>>> warnings. I will share the results in a EuroLLVM technical talk.
>>>>>>>>
>>>>>>>> P.S.:
>>>>>>>> Does anybody have a good idea how to get a list of buildbot
>>>>>>>> breaking commits where the reason was lifetime related (other than grepping
>>>>>>>> git history)?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Gábor
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Gabor
>>>>>>>>>>
>>>>>>>>>> On Thu, 29 Nov 2018 at 20:58, George Karpenkov <
>>>>>>>>>> ekarpenkov at apple.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks, the idea looks great!
>>>>>>>>>>>
>>>>>>>>>>> The heuristic described in a talk on matching returned and
>>>>>>>>>>> accepted references looked fragile to me initially,
>>>>>>>>>>> but after a bit of time I could not find obvious and common
>>>>>>>>>>> counterexamples which would confuse it.
>>>>>>>>>>>
>>>>>>>>>>> Has the evaluation been done on any large project? E.g. LLVM
>>>>>>>>>>> itself, Chrome, Webkit, or any other large C++ codebase?
>>>>>>>>>>> What is the false positive rate? What is the annotation burden
>>>>>>>>>>> to suppress those?
>>>>>>>>>>> How many useful bugs were found?
>>>>>>>>>>>
>>>>>>>>>>> I’m also very happy that more dataflow analyses are being
>>>>>>>>>>> implemented in Clang,
>>>>>>>>>>> and I could help with reviews (though obviously I’m not the code
>>>>>>>>>>> owner for most of the components).
>>>>>>>>>>>
>>>>>>>>>>> On Nov 29, 2018, at 8:02 AM, Gábor Horváth via cfe-dev <
>>>>>>>>>>> cfe-dev at lists.llvm.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi!
>>>>>>>>>>>
>>>>>>>>>>> This is a proposal to implement Lifetime Analysis [1] defined by
>>>>>>>>>>> Herb Sutter in Clang.
>>>>>>>>>>> Summary from the paper:
>>>>>>>>>>> “This analysis shows how to efficiently diagnose many common
>>>>>>>>>>> cases of dangling (use-after-free) in C++ code, using only local analysis
>>>>>>>>>>> to report them as deterministic readable errors at compile time. The
>>>>>>>>>>> approach is to identify variables that are of generalized “Owner” types
>>>>>>>>>>> (e.g., smart pointers, containers, string) and “Pointer” types (e.g., int*,
>>>>>>>>>>> string_view, span, iterators, and ranges), and then use a local simple
>>>>>>>>>>> acyclic control flow graph (ACFG) analysis to track what each Pointer
>>>>>>>>>>> points to and identify when modifying an Owner invalidates a Pointer. The
>>>>>>>>>>> analysis leverages C++’s existing strong notions of scopes, object
>>>>>>>>>>> lifetimes, and const that carry rich information already available in
>>>>>>>>>>> reasonably modern C++ source code. Interestingly, it appears that with
>>>>>>>>>>> minor extension this analysis can also detect uses of local moved-from
>>>>>>>>>>> variables (use-after-move), which are a form of dangling.”
>>>>>>>>>>> More details can be found in the paper [1] or in the CppCon
>>>>>>>>>>> keynote [3].
>>>>>>>>>>>
>>>>>>>>>>> Matthias Gehre and myself had been working on a prototype in
>>>>>>>>>>> Clang [2]. The changes are rather large, so we are planning to take an
>>>>>>>>>>> incremental approach to upstreaming the features should the community want
>>>>>>>>>>> to see this upstream.
>>>>>>>>>>>
>>>>>>>>>>> *Plans for upstreaming*
>>>>>>>>>>>
>>>>>>>>>>> 1. Upstream Type Categorization
>>>>>>>>>>>
>>>>>>>>>>> Clang already performs statement-local lifetime analyses that
>>>>>>>>>>> would benefit from type categorization even before adding any other
>>>>>>>>>>> analysis.
>>>>>>>>>>>
>>>>>>>>>>> This includes annotating types as Owners and Pointers, and
>>>>>>>>>>> automatically inferring Owner or Point without annotation to minimize
>>>>>>>>>>> annotation burden.
>>>>>>>>>>>
>>>>>>>>>>> Consider the following code example:
>>>>>>>>>>>
>>>>>>>>>>> std::reference_wrapper<const int> get_data() {
>>>>>>>>>>>    const int i = 3;
>>>>>>>>>>>    return {i};
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately, today compilers do not warn on this case of
>>>>>>>>>>> returning a dangling reference. They do warn if we return a raw pointer or
>>>>>>>>>>> reference, but the compiler does not know that std::reference_wrapper also
>>>>>>>>>>> is a non-owning indirection. In the Lifetime analysis, this is diagnosed
>>>>>>>>>>> because std::reference_wrapper is recognized as a Pointer type.
>>>>>>>>>>>
>>>>>>>>>>> As a first step we would upstream the type categorization part
>>>>>>>>>>> of the analysis and make some clang warnings optionally use it. We would
>>>>>>>>>>> also upstream a set of annotations to give the users a way to fix potential
>>>>>>>>>>> false positives due to miscategorization. (This should be very rare
>>>>>>>>>>> according to our experience so far). By default, we could constrain the
>>>>>>>>>>> categorization for std types, whose semantics are known.
>>>>>>>>>>>
>>>>>>>>>>> 2. Extensions of existing CFG-less analyses
>>>>>>>>>>>
>>>>>>>>>>> 2a. Initialization from temporaries
>>>>>>>>>>> The goal is to detect Pointers that dangle on initialization,
>>>>>>>>>>> such as
>>>>>>>>>>> std::string_view sv = “test”s;
>>>>>>>>>>> By restricting the analysis to single statements, it has a low
>>>>>>>>>>> false-positive rate and can be done without building a CFG (i.e. faster).
>>>>>>>>>>>
>>>>>>>>>>> 2b. Return of locals
>>>>>>>>>>> The goal is to detect returning Pointers to local variables, e.g.
>>>>>>>>>>> std::reference_wrapper<const int> get_data() {
>>>>>>>>>>>    const int i = 3;
>>>>>>>>>>>    return {i};
>>>>>>>>>>> }
>>>>>>>>>>> Similar to 2a also restricted to single statement.
>>>>>>>>>>>
>>>>>>>>>>> 2c. Member pointer that dangles once construction is complete
>>>>>>>>>>> struct X {
>>>>>>>>>>>    std::string_view sv;
>>>>>>>>>>>    X() : sv("test"s) {} // warning: string_view member bound to
>>>>>>>>>>> string temporary whose lifetime ends within the constructor
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> 2d. New of a Pointer that dangles after the end of the
>>>>>>>>>>> full-expression
>>>>>>>>>>> new string_view("test"s) // warning: dynamically-allocated
>>>>>>>>>>> string_view refers to string whose lifetime ends at the end of the
>>>>>>>>>>> full-expression
>>>>>>>>>>>
>>>>>>>>>>> 3. Intra-function analysis across basic blocks, excluding
>>>>>>>>>>> function call expressions
>>>>>>>>>>> Propagate point-to sets of Pointers across branches/loops
>>>>>>>>>>> intra-function, e.g. analysing
>>>>>>>>>>>
>>>>>>>>>>> int* p = &i;
>>>>>>>>>>> if(condition)
>>>>>>>>>>>  p = nullptr;
>>>>>>>>>>> *p; // ERROR: p is possibly null
>>>>>>>>>>>
>>>>>>>>>>> We have some CFG patches and some code traversing the CFG and
>>>>>>>>>>> propagating the analysis state. With the type categories already in place,
>>>>>>>>>>> this patch should be smaller. We could split these patches further by
>>>>>>>>>>> implementing null tracking in a separate patch.
>>>>>>>>>>>
>>>>>>>>>>> 4. Function calls
>>>>>>>>>>>
>>>>>>>>>>> auto find(const string& needle, const string& haystack) ->
>>>>>>>>>>> string_view [[gsl::lifetime(haystack)]];
>>>>>>>>>>>
>>>>>>>>>>> string_view sv = find(“needle”, haystack);
>>>>>>>>>>> sv[0]; // OK
>>>>>>>>>>> string_view sv = find(needle, “temporaryhaystack”);
>>>>>>>>>>> sv[0]; // ERROR: sv is dangling
>>>>>>>>>>>
>>>>>>>>>>> This includes the following subparts.
>>>>>>>>>>>
>>>>>>>>>>> 4a. Precondition checks
>>>>>>>>>>> Check that the psets of the arguments are valid at call site
>>>>>>>>>>> according to the lifetime annotations of the callee.
>>>>>>>>>>>
>>>>>>>>>>> 4b. Postcondition checks
>>>>>>>>>>> Check that the psets returned from a function adhere to its
>>>>>>>>>>> advertised return/output psets.
>>>>>>>>>>> Rigorous checking of not just the function arguments but also
>>>>>>>>>>> the returned values is crucial part of the analysis.
>>>>>>>>>>>
>>>>>>>>>>> 4c. Lifetimes annotations
>>>>>>>>>>>
>>>>>>>>>>> The analysis gets pretty usable at this point. Most of the time
>>>>>>>>>>> the user does not need any annotations, but it is crucial to have them
>>>>>>>>>>> before a project can adapt it. For example, the user will occasionally want
>>>>>>>>>>> to explicitly state that a member function is “const as far as Lifetime is
>>>>>>>>>>> concerned” even though the function itself is not actually declared const
>>>>>>>>>>> (e.g., vector::operator[] does not invalidate any Pointers, such as
>>>>>>>>>>> iterators or raw pointers).
>>>>>>>>>>>
>>>>>>>>>>> 5. Implementing use after move analysis and exception support
>>>>>>>>>>>
>>>>>>>>>>> These parts are not implemented yet in our prototype, but they
>>>>>>>>>>> will be useful additions for the analysis.
>>>>>>>>>>>
>>>>>>>>>>> *Questions*
>>>>>>>>>>>
>>>>>>>>>>> Does that make sense? What is the criteria for this work to be
>>>>>>>>>>> upstreamed? Who is willing to participate in reviewing the patches?
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance,
>>>>>>>>>>> Gabor, Matthias, and Herb
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>> https://github.com/isocpp/CppCoreGuidelines/blob/master/docs/Lifetime.pdf
>>>>>>>>>>> [2] https://github.com/mgehre/clang
>>>>>>>>>>> [3] https://www.youtube.com/watch?v=80BZxujhY38
>>>>>>>>>>> [4] https://godbolt.org/z/90puuu
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> cfe-dev mailing list
>>>>>>>>>>> cfe-dev at lists.llvm.org
>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190308/27ff5211/attachment.html>