[cfe-dev] [Analyzer] Pointer escape vs. pointer invalidation on function call

Tue May 30 12:45:01 PDT 2017

30/05/2017 10:12 PM, Anna Zaks wrote:
>
>
> On Tue, May 30, 2017 at 6:47 AM, Artem Dergachev <noqnoqneo at gmail.com 
> <mailto:noqnoqneo at gmail.com>> wrote:
>
>     +Devin.
>
>     Yep, this behavior can be improved quite a bit. It's indeed overly
>     conservative to believe that a single escaping pointer argument
>     would cause other const pointer arguments' pointed-to regions
>     invalidated.
>
>     argumentsMayEscape() should still be relied upon, it seems,
>     because it contains a nice blacklist of weird functions, eg.
>     https://stackoverflow.com/questions/7726329/what-is-the-purpose-of-const-pointer-to-void/7726359
>     <https://stackoverflow.com/questions/7726329/what-is-the-purpose-of-const-pointer-to-void/7726359>
>
>     However, making argumentsMayEscape() more fine-grained, eg.
>     letting it provide a list of arguments that may escape, should be
>     quite an improvement.
>
>     Additionally, when structures are passed into functions, we could
>     consider const-qualifiers on pointer fields in these structures
>     and avoid invalidating through pointers stored in const-pointer
>     fields - might be an improvement as well. Direct parameters are
>     not special.
>
>     Possible aliasing between arguments should still be taken into
>     account, eg. tests with testMixedConstNonConstCalls() added in
>     https://reviews.llvm.org/D19057 <https://reviews.llvm.org/D19057>
>     should remain "UNKNOWN". So i'd imagine a two-pass procedure,
>     where on the first pass we mark base regions of all const pointers
>     passed into the function with TK_PreserveContents, and on the
>     second pass we remove TK_PreserveContents from base regions of
>     non-const pointers.
>
>     Finally, i agree that argumentsMayEscape() makes sense for plain C
>     system library functions only. It should ideally check that the
>     function is within the standard library, and is a plain-C
>     function, and avoid jumping to conclusions if this is not the case.
>
> We do rely on argumentsMayEscape() to support other languages, 
> specifically, there is ObjC specific code. I suspect, we would benefit 
> from reasoning about C++ as well, but currently, C++ method decls 
> overloads are not implemented. I suspect we need to do more 
> investigation on libc++.

Yep. My point was mostly about the fact that in 
AnyFunctionCall::argumentsMayEscape() (which i incorrectly referred to 
as argumentsMayEscape()) we check the identifier of the function, but we 
don't check if it's a plain C function (a C++ method may have the same 
identifier, and we're not checking that it's not a method; overrides are 
not necessary to break the contract of this method, just add some 
classes with methods of given names; ObjC messages cannot make it into 
AnyFunctionCall::argumentsMayEscape() because ObjC message isn't any 
function) or if it comes from a system library (probably an ObjC library 
in case of NS*Insert*, but it's still a C-style function, not a method, 
not a message), so the code in this function doesn't make much sense 
unless it is called in the context in which we have already checked all 
this stuff.

>
> As Alexey pointed out, argumentsMayEscape() is used/called mainly 
> after we know that the function comes from the system header. I do not 
> recall if there are cases where it is needed for non-system calls, but 
> we could envision having heuristics and treating some non-system calls 
> with specific names the same way we treat system calls.
>
> Note, that the implementation itself does include generic heuristics, 
> applicable to any functions, not just system functions. For example, 
> it call:
>  /// \brief Returns true if any of the arguments appear to represent 
> callbacks.
>   bool hasNonZeroCallbackArg() const;
>
> Further investigation would be great here as that code is somewhat 
> convoluted!
>
>
>     25/05/2017 8:07 PM, Aleksei Sidorin wrote:
>
>         Hi all,
>
>         I have some questions about CSA invalidation behaviour for the
>         case where some arguments can escape after call.
>
>         1. There is a condition in CallEvent::invalidateRegions():
>
>           if (!argumentsMayEscape())
>             findPtrToConstParams(PreserveArgs, *this);
>
>         The contents of PreserveArgs changed by findPtrToConstParams()
>         is used later for setting a special invalidation trait for its
>         items: TK_PreserveContents. But, as I understand, if some
>         pointer passed to function can escape, all the pointers passed
>         to function get invalidated independently on can they escape
>         or not. Why we don't just filter the escaping regions and
>         invalidate them but invalidate all the pointers instead?
>
>         2. For AnyFunctionCall, we think that void* arguments of can
>         escape:
>
>           if (CallEvent::argumentsMayEscape() ||
>         hasVoidPointerToNonConstArg())
>             return true;
>
>         But because of (1), this means that all other pointers passed
>         to such function (including pointers to const) are
>         invalidated. Checkers that use argumentsMayEscape() method
>         explicitly check that the call is located in system header.
>         So, should we move the check for system header into
>         argumentsMayEscape()? It looks like the commit that introduced
>         this behaviour was targeting system header functions only. And
>         should we avoid the invalidation of pointers to constant
>         memory if some pointer argument can escape?
>
>
>