[cfe-dev] [analyzer] MallocChecker & checker dependencies
Alexey Knyshev via cfe-dev
cfe-dev at lists.llvm.org
Wed Mar 28 03:57:17 PDT 2018
Hi all,
Any ideas on it?
Regards, Alexey K
2018-03-26 12:01 GMT+03:00 Alexey Knyshev <alexey.knyshev at gmail.com>:
> Hi all,
>
> During last few days I was working on std::shared_ptr constructor modeling
> and trying to understand how path sensitive analysis works. After some time
> I faced unexpected modelling of Call arguments SVals. Here is the portion
> of code that caused analyzer behavior that I don't understand:
>
> auto del = std::default_delete<S>();
> return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);
>
> The case I observed: first argument SVal is modeled properly and I can get
> RegionState info from program state (provided by MallocChecker). The second
> one (del symbol) is unknown (DelSVal.dump() shows me that). On the other
> hand, the following code doesn't cause such problem and prints
> '&code{free}' as expected:
>
> const auto &del = free;
> return std::shared_ptr<S>((S*)malloc(sizeof(S)), del);
>
> So, the question is: what can cause such imprecise modeling of SVal
> (Deleter - second constructor arg)?
>
> P.S. I've changed shared_ptr constructor declaration in
> 'test/Analysis/Inputs/system-header-simulator-cxx-std-suppression.h':
>
> shared_ptr(_Tp* __p, void(*deleter)(void*))
>
> to
>
> template<class Deleter>
> shared_ptr(_Tp* __p, Deleter d)
>
> and I don't understand why it was declared with function pointer argument
> instead template. But looks like it isn't the source of my problem at
> first sight.
>
> Regards, Alexey K
>
> 2018-03-20 0:37 GMT+03:00 Artem Dergachev <noqnoqneo at gmail.com>:
>
>>
>>
>> On 17/03/2018 7:25 AM, Alexey Knyshev wrote:
>>
>>>
>>> This should already be supported by MallocChecker - i.e. it warns
>>> when you allocate memory with, say, new() and release it with
>>> free() or delete[]().
>>>
>>> If methods of a smart pointer are "inlined" during analysis, these
>>> bugs will already be caught automatically.
>>>
>>>
>>> I took a look at your recent work & status report posted there in
>>> mailing lists. And I have a question about how it would work in particular
>>> cases, e.g.:
>>> 1. Returning smart_ptr by value from function that has it's
>>> implementation in analyzed translation unit but it isn't called from any
>>> other function in the same TU.
>>>
>>> std::shared_ptr<S> createInstance() {
>>> return std::shared_ptr<S>(new S, free);
>>> }
>>>
>>> If I understand correctly, compiler is free to apply copy-elison
>>> optimization on shared_ptr which is returned by value. So destructor call
>>> won't be modeled properly because it's not inlined during analysis. And bug
>>> won't be caught.
>>>
>>
>> Indeed, the tempting - but risky - thing to do here is to declare
>> construction of "std::shared_ptr<S>(new S, free);" to be a bug on its own.
>> It is risky because it leaves us with a checker that finds non-bugs; there
>> is a chance that the destructor will never be called while the pointer is
>> still being owned by the shared_ptr. Such code would be ridiculous but at
>> the same time we wouldn't like to drive away users who had to write this to
>> work around an equally ridiculous proprietary API that they have no way of
>> changing.
>>
>> In this particular case i'd have tried to take the risk, keeping in mind
>> that i might end up rewriting the checker back to the conservative behavior
>> of only warning at the destructor.
>>
>> Note that copy elision is irrelevant, because in any case we won't free()
>> the pointer at the return site, because reference count will never reach
>> zero within the function. That's the whole point of the shared pointer.
>>
>> 2. As you mentioned before, report would be fired only if constructor and
>>> destructor have been inlined (which is not guaranteed to be done). As the
>>> result such check won't work in general case.
>>>
>>
>> Yeah, that's pretty much the primary motivation for making an
>> API-specific checker. It's not super strong in this case because this would
>> likely give a very slight increase over the existing coverage.
>>
>> In this case, however, the much more important motivation is to avoid
>> false positives that occur when we don't know the original reference count
>> of a shared pointer that was constructed before analysis has started - and
>> we start "assuming..." that any destructor of a shared pointer could
>> release memory. This is currently planned to be suppressed in a fairly
>> gross manner (https://reviews.llvm.org/D44281). For this reason we
>> believe that having explicit modeling for various smart pointers is a must.
>>
>> Actually my point is about modeling smart pointers behavior specifically
>>> and don't inline any calls to constructor / destructor & methods those
>>> modeled by checker. And the question immediately arises: is there way to
>>> control inline policy?
>>>
>>
>> It is indeed possible to make the checker "take over" function call
>> modeling by subscribing on the `eval::Call` callback. In this case both
>> inlining and conservative evaluation by the core are avoided, and the
>> checker who evaluates the call is responsible for modeling *all* effects of
>> the call (because two checkers cannot evalCall the same call, even though
>> they can do the usual checkPreCall/checkPostCall thingy as much as they
>> want).
>>
>> There is also the BodyFarm mechanism (which allows you to mock simplified
>> ASTs for standard functions) but you DO NOT WANT to use it for C++ because
>> it'd take you forever to construct the correct C++ AST with templates
>> manually.
>>
>> And sorry for any possible misunderstanding from my side as I'm not quite
>>> familiar with CSA internals.
>>>
>>> Regards, Alexey K
>>>
>>> 2018-03-17 5:21 GMT+03:00 Artem Dergachev <noqnoqneo at gmail.com <mailto:
>>> noqnoqneo at gmail.com>>:
>>>
>>>
>>>
>>> On 16/03/2018 12:30 PM, Alexey Knyshev wrote:
>>>
>>> Sorry, accidentally removed part of message before sending.
>>>
>>> Aleksei,
>>>
>>> Could you share your ideas of checker design? It is
>>> possible that
>>> problems you met can be solved in different (maybe even
>>> better)
>>> way if we know the whole picture.
>>>
>>>
>>> I'm currently have no clear idea how to implement it in the
>>> "right manner". But first of all I would aim to track the
>>> origin (malloc, new, new [], etc) of SVal and check if Deleter
>>> is the expected corresponding way to deallocate such SVal.
>>>
>>>
>>> This should already be supported by MallocChecker - i.e. it warns
>>> when you allocate memory with, say, new() and release it with
>>> free() or delete[]().
>>>
>>> If methods of a smart pointer are "inlined" during analysis, these
>>> bugs will already be caught automatically.
>>>
>>>
>>> Regards, Alexey K
>>>
>>> 2018-03-16 22:22 GMT+03:00 Alexey Knyshev
>>> <alexey.knyshev at gmail.com <mailto:alexey.knyshev at gmail.com>
>>> <mailto:alexey.knyshev at gmail.com
>>>
>>> <mailto:alexey.knyshev at gmail.com>>>:
>>>
>>> Hello guys,
>>>
>>> And thanks for your attention
>>>
>>> Aleksei,
>>>
>>> Could you share your ideas of checker design? It is
>>> possible
>>> that problems you met can be solved in different
>>> (maybe even
>>> better) way if we know the whole picture.
>>>
>>>
>>> As I said, I'm interested in improving current state of
>>> dynamic
>>> memory modeling. Especially stuff related to smart
>>> pointers (e.g.
>>> shared, unique) which also mentioned in list of potential
>>> checkers
>>> as smartptr.SmartPtrInit
>>> <https://clang-analyzer.llvm.org/potential_checkers.html
>>> <https://clang-analyzer.llvm.org/potential_checkers.html>>*
>>>
>>> *
>>>
>>> **You can see an example of how state trait is exported
>>> in
>>> GenericTaintChecker or MPIChecker. Generally, you just
>>> create
>>> a named ProgramStateTrait in the header.
>>>
>>>
>>> Briefly looked at MPITypes.h. Does it mean we should move
>>> RegionState to separate file and register it via Traits in
>>> the
>>> same manner to make it avaliable from other checkers (not
>>> other TU
>>> as mentioned in ento::mpi::RequestMap)?
>>>
>>>
>>> GenericTaintChecker is made in a fairly intrusive manner by
>>> putting stuff into the program state class directly, which isn't
>>> very sane. I'd definitely prefer something similar to dynamic type
>>> propagation (see DynamicTypeMap.cpp, DynamicTypeMap.h). You can
>>> use the usual REGISTER_MAP_WITH_PROGRAMSTATE (etc.) macros in the
>>> .cpp file as long as you provide accessor methods declared in the
>>> .h file that other checkers could include.
>>>
>>>
>>> Artem,
>>>
>>> you need to keep all of this in mind when writing a
>>> checker.
>>>
>>>
>>> Sure, but on the other hand I think it's possible to
>>> implement and
>>> improve modeling of various API calls' effects
>>> step-by-step. Let's
>>> say, it case of SmartPtrInit checker mentioned before the
>>> bare
>>> minimum would be modeling of construction (and destruction to
>>> avoid leak report from existing related checkers).
>>>
>>> which is why the few experimental C++ checkers that we
>>> currently have required heavy hacks to become
>>> possible. But
>>> for now you'd rather keep an eye on these problems.
>>>
>>>
>>> Does it mean it's better to wait a bit for Core
>>> improvements from
>>> main contributors? Does it make sense to make an efforts to
>>> implement SmartPtrItnit checker in current state of things?
>>>
>>>
>>> You should feel free to start working on the checker in an
>>> incremental manner (everything is better in an incremental
>>> manner!), but in some cases i may prefer improving the checker API
>>> or fixing core modeling bugs instead of writing large portions of
>>> checker code to work around inconvenient APIs or bugs, and this is
>>> something i might not be immediately capable of doing myself (due
>>> to being busy with other things), which may potentially delay your
>>> progress.
>>>
>>> Thanks in advance and regards,
>>> Alexey K
>>>
>>> 2018-03-16 21:47 GMT+03:00 Artem Dergachev
>>> <noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>
>>> <mailto:noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>>>:
>>>
>>>
>>>
>>> Dynamic memory management is pretty much done at this
>>> point -
>>> we have very good support for malloc()-like stuff and
>>> reasonably good support for operator new() (support for
>>> operator new[]() is currently missing because it requires
>>> calling an indefinite number of constructors which is
>>> not yet
>>> implemented). There is also a known issue with custom
>>> operator
>>> new(...) returning null pointers - in this case we
>>> should not
>>> be calling the constructor but for now we evaluate the
>>> constructor conservatively.
>>>
>>> Your real problem will be managing smart pointers
>>> themselves
>>> because they are C++ objects that have a myriad of
>>> methods,
>>> they can be copied, moved, assigned, move-assigned, they
>>> destroyed, they lifetime-extended, you need to keep all
>>> of
>>> this in mind when writing a checker. This is slowly
>>> getting
>>> easier because i'm currently working on that, but until
>>> recently it wasn't working correctly in the core, let
>>> alone
>>> checkers, which is why the few experimental C++
>>> checkers that
>>> we currently have required heavy hacks to become
>>> possible. But
>>> for now you'd rather keep an eye on these problems.
>>>
>>> On 16/03/2018 3:57 AM, Aleksei Sidorin via cfe-dev wrote:
>>>
>>> Hi Alexey!
>>>
>>> Could you share your ideas of checker design? It is
>>> possible that problems you met can be solved in
>>> different
>>> (maybe even better) way if we know the whole picture.
>>> Regarding your questions, you can find some
>>> answers below.
>>>
>>> 1. a) The first way for checker communication is
>>> to share
>>> their program state trait. You can see an example
>>> of how
>>> state trait is exported in GenericTaintChecker or
>>> MPIChecker. Generally, you just create a named
>>> ProgramStateTrait in the header. You can take a
>>> look at
>>> TaintManager.h and MPITypes.h and how they are used.
>>> b) To set a dependency from another checker, you
>>> can just
>>> register it while registering your checker. An
>>> example can
>>> be found in MallocChecker where register$Checker also
>>> calls registerCStringCheckerBasic to register a
>>> checker it
>>> depends on.
>>> As you pointed, inter-checker communication can
>>> become a
>>> source of some problems. Most of them are discussed
>>> in
>>> this conversation:
>>> http://clang-developers.42468.n3.nabble.com/analyzer-RFC-Des
>>> ign-idea-separate-modelling-from-checking-td4059122.html
>>> <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-De
>>> sign-idea-separate-modelling-from-checking-td4059122.html>
>>>
>>> <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-De
>>> sign-idea-separate-modelling-from-checking-td4059122.html
>>> <http://clang-developers.42468.n3.nabble.com/analyzer-RFC-De
>>> sign-idea-separate-modelling-from-checking-td4059122.html>>
>>>
>>> 2. I think there is nothing bad in sharing
>>> RegionState
>>> across checkers in the way shown in 1a.
>>>
>>> 3. Artem Dergachev has done some excellent work on
>>> improvement of operator 'new' processing in CSA
>>> engine.
>>> Regarding checkers, I can see some on
>>> https://clang-analyzer.llvm.org/potential_checkers.html
>>> <https://clang-analyzer.llvm.org/potential_checkers.html>
>>>
>>> <https://clang-analyzer.llvm.org/potential_checkers.html
>>> <https://clang-analyzer.llvm.org/potential_checkers.html>>:
>>> for example, undefbehavior.AutoptrsOwnSameObj. You
>>> can
>>> search this list to find more.
>>>
>>>
>>> 15.03.2018 23:54, Alexey Knyshev via cfe-dev пишет:
>>>
>>> Hi there!
>>>
>>> While thinking about how it would be possible to
>>> implement various smart ptr related checkers I
>>> tried
>>> to review current state of MallocChecker and
>>> came up
>>> with that it would be great to have
>>> RegionState info
>>> available to other checkers. Could you please
>>> share
>>> your points of view and comments on the following
>>> statements / questions:
>>>
>>> 1. Is there any right way for chaining
>>> checkers? How
>>> they are expected to communicate between each
>>> other
>>> (excluding generation of nodes /
>>> ProgramStates). I've
>>> heard that there are couple of problems caused by
>>> inlining functions, constructors / descructors.
>>> 2. What do you think about moving RegionState
>>> to the
>>> Core of CSA or providing optional extended info
>>> in
>>> MemRegion about the source of such region (new
>>> opearator / array new, malloc, alloca, etc). So
>>> it
>>> would be available to all checkers.
>>> 3. Is there any roadmap for CSA and especially
>>> for
>>> dynamic memory management modeling & related
>>> checkers?
>>>
>>> Regards, Alexey K
>>>
>>> -- linkedin.com/profile
>>> <http://linkedin.com/profile> <http://linkedin.com/profile>
>>>
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw>
>>>
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw>>>
>>>
>>> github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>>> <http://github.com/alexeyknyshev
>>> <http://github.com/alexeyknyshev>>
>>> <http://github.com/alexeyknyshev
>>> <http://github.com/alexeyknyshev>
>>> <http://github.com/alexeyknyshev
>>> <http://github.com/alexeyknyshev>>>
>>> bitbucket.org/alexeyknyshev <http://bitbucket.org/alexeyknyshev>
>>> <http://bitbucket.org/alexeyknyshev
>>> <http://bitbucket.org/alexeyknyshev>>
>>> <https://bitbucket.org/alexeyknyshev/
>>> <https://bitbucket.org/alexeyknyshev/>
>>> <https://bitbucket.org/alexeyknyshev/
>>> <https://bitbucket.org/alexeyknyshev/>>>
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>> <mailto:cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>>
>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>
>>>
>>>
>>>
>>> -- Best regards,
>>> Aleksei Sidorin,
>>> SRR, Samsung Electronics
>>>
>>>
>>> _______________________________________________
>>> cfe-dev mailing list
>>> cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>> <mailto:cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>>
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>>
>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>>
>>>
>>>
>>>
>>>
>>>
>>> -- linkedin.com/profile <http://linkedin.com/profile>
>>>
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw>>
>>>
>>> github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>>> <http://github.com/alexeyknyshev
>>> <http://github.com/alexeyknyshev>>
>>> bitbucket.org/alexeyknyshev
>>> <http://bitbucket.org/alexeyknyshev>
>>> <https://bitbucket.org/alexeyknyshev/
>>> <https://bitbucket.org/alexeyknyshev/>>
>>>
>>>
>>>
>>>
>>> -- linkedin.com/profile <http://linkedin.com/profile>
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw
>>> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBtei
>>> QnWsYm-S9yxT7wQkfWhSw>>
>>>
>>> github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>>> <http://github.com/alexeyknyshev
>>> <http://github.com/alexeyknyshev>>
>>> bitbucket.org/alexeyknyshev
>>> <http://bitbucket.org/alexeyknyshev>
>>> <https://bitbucket.org/alexeyknyshev/
>>> <https://bitbucket.org/alexeyknyshev/>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> linkedin.com/profile <https://www.linkedin.com/prof
>>> ile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>>>
>>> github.com/alexeyknyshev <http://github.com/alexeyknyshev>
>>> bitbucket.org/alexeyknyshev <https://bitbucket.org/alexeyknyshev/>
>>>
>>
>>
>
>
> --
> linkedin.com/profile
> <https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
>
> github.com/alexeyknyshev
> bitbucket.org/alexeyknyshev
>
--
linkedin.com/profile
<https://www.linkedin.com/profile/view?id=AAMAABn6oKQBDhBteiQnWsYm-S9yxT7wQkfWhSw>
github.com/alexeyknyshev
bitbucket.org/alexeyknyshev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180328/c0343064/attachment.html>
More information about the cfe-dev
mailing list