[cfe-dev] [GSoC] Interested in idea: "Find null smart pointer dereferences with the Static Analyzer"

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Mon Mar 30 11:33:47 PDT 2020


The proposal looks good to me with your latest fixes! I encourage you to 
upload it to the GSoC website; their deadline is tomorrow. Thanks again 
for your interest :)

On 3/27/20 2:45 AM, Nithin Vadukkumchery Rajendrakumar wrote:
> Hi Artem,
>
> Thank you very much for the help and explanation.
> I have created a draft for my GSoC proposal and I am sharing the link 
> with this mail.
> Could you please have a look and let me know the feedback for the 
> draft proposal?
> Draft proposal: 
> https://docs.google.com/document/d/1HORDm6cq3YolYTPGFLCF5gLjIyf6bygHeWzG1kd1T-g/edit?usp=sharing
>
> ---
> Thanks & Regards,
> Nithin
>
>
> On Mon, Mar 23, 2020 at 5:18 PM Artem Dergachev <noqnoqneo at gmail.com 
> <mailto:noqnoqneo at gmail.com>> wrote:
>
>
>
>     On 3/20/20 7:56 PM, Nithin Vadukkumchery Rajendrakumar wrote:
>     > Hello Artem,
>     >
>     > I went through the checkers you suggested. I found this project
>     > seems interesting to me and I got a very basic idea about it.
>     >
>     > I tried to find out few cases where unique_ptr::operator->()
>     returns
>     > null apart from default constructed unique_ptr.
>     > *Case 1: *Use of std::move on std::unique_ptr
>     > It seems its already covered in the MoveChecker.
>     > *Case 2:* Use after calling release() on std::unique_ptr
>     > When I ran the analyzer for this scenario, it did produce any
>     warnings
>     > *Case 3: *Use up.reset() or up.reset(nullptr)
>     > Similar to release() case it seems this case also not covered.
>     > *Case 4:* Get raw pointer via std::unique_ptr.get() then delete
>     > I am not sure about this case. It seems user explicitly trying to
>     > break the code.
>
>     No-no, that's not how C++ works :) the smart pointer wouldn't be
>     aware
>     that the raw pointer is deleted, so it'll keep hosting the pointer
>     and
>     cause a use-after-free instead. We could warn about those as well
>     though; it might turn out to be an easy addition once you get the
>     checker running.
>
>     > *Case 5:* Use after swap(std::unique_ptr, null)
>     > In case we swap a std::unique_ptr with another std::unique_ptr with
>     > pointing null.
>     >
>     > I am guessing the list is not complete and this will be a first
>     task,
>     > to figure out all possible cases.
>     > And some what same we have to come up with for other smart pointers.
>     >
>     > Regarding the implementation part, similar to move checker we
>     have to
>     > keep a map for memory region and state (whether it is null or not).
>     > States should be updated based on the changes in MemRegion. I was
>     > wondering is this the right way? (I know I still have to figure out
>     > lot of details regarding concrete implementations)
>
>     Yup, I think that's the most solid and straightforward solution. Note
>     that you will have to not only enumerate all situations when the
>     smart
>     pointer becomes null, but also all the situations when the smart
>     pointer
>     becomes non-null.
>
>     > In case of default-constructed std::unique_ptr object, why can't we
>     > get symbolic value as null and do a check same as what we are doing
>     > for raw pointer?
>     > Is it because some limitations on tracking the symbolic values
>     > of std::unique_ptr objects?
>
>     Manipulating symbolic values inside the smart pointer is indeed
>     another
>     possible solution. The annoying limitation that we run into here
>     is that
>     our memory model ("RegionStore") doesn't currently allow setting a
>     "default" binding to a whole object when it's a part (say, a
>     field) of a
>     bigger object. This basically means that we have to understand how
>     does
>     the smart pointer work internally (which field corresponds to
>     what) in
>     order to manipulate its symbolic value, which ties us to a specific
>     implementation of the C++ standard library. This might still work
>     for a
>     unique pointer which probably always has exactly one field, but for
>     shared pointers things get complicated.
>
>     You can try to overcome this limitation of RegionStore if you're
>     eager
>     enough but that'll be challenging and potentially a lot of work. And
>     even if there wasn't this limitation, this approach isn't necessarily
>     much easier than the first approach.
>
>     > ----
>     > Thanks & Regards,
>     > Nithin
>     >
>     >
>     > On Tue, Mar 10, 2020 at 1:13 AM Nithin Vadukkumchery Rajendrakumar
>     > <vrnithinkumar at gmail.com <mailto:vrnithinkumar at gmail.com>
>     <mailto:vrnithinkumar at gmail.com <mailto:vrnithinkumar at gmail.com>>>
>     wrote:
>     >
>     >     Hi Artem,
>     >
>     >     Thank you very much for this detailed information and help.
>     >     I will checkout the existing checkers you mentioned and try
>     to get
>     >     a better understanding of the problem.
>     >
>     >     ----
>     >     Regards,
>     >     Nithin.VR
>     >
>     >
>     >     On Mon, Mar 9, 2020 at 2:30 AM Artem Dergachev
>     >     <noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>
>     <mailto:noqnoqneo at gmail.com <mailto:noqnoqneo at gmail.com>>> wrote:
>     >
>     >         Hey!
>     >
>     >         Welcome. Let's see.
>     >
>     >         Nullability checker isn't the one that you're looking
>     for. It's a
>     >         different beast that governs hunt for null dereferences via
>     >         so-called
>     >         "nullability annotations". Like, a language extension is
>     provided
>     >         through which the programmer can tell the analyzer which
>     >         variables /
>     >         functions may or may not hold / produce null pointers,
>     and the
>     >         analyzer
>     >         checks whether it makes sense how these nullable and
>     non-null
>     >         values
>     >         propagate from one function to another. So it's the same
>     >         problem but a
>     >         different technique. It is targeted mostly at finding
>     crashes in
>     >         Objective-C apps that pass a lot of pointers around
>     across many
>     >         user-defined functions.
>     >
>     >         The proposed GSoC project is of a different nature: we
>     want to
>     >         teach the
>     >         static analyzer about a very specific C++ API, but we
>     want to
>     >         teach it
>     >         much more thoroughly. It's not enough to know that
>     >         std::unique_ptr::operator->() may occasionally return a null
>     >         pointer;
>     >         we'd much rather know when exactly does it return a null
>     >         pointer (eg.,
>     >         if the smart pointer is freshly default-constructed).
>     >
>     >         If you want to study existing checkers, check out:
>     >         - MoveChecker - the use-after-move checker which already
>     finds
>     >         *some*
>     >         null smart pointer dereferences, given that they're
>     guaranteed
>     >         to be
>     >         null after move.
>     >         - SmartPtrChecker currently does almost nothing, but that's
>     >         probably
>     >         where you put your code into :)
>     >         - IteratorChecker is a large ongoing pioneer project to find
>     >         iterator
>     >         and container related bugs such as dereferencing
>     vector.end().
>     >         It's the
>     >         closest thing to what you'll be implementing, but its
>     handling
>     >         of C++
>     >         objects is outdated and overly complicated because some new
>     >         facilities
>     >         for C++ support (mostly the ones explained in the second
>     half of
>     > https://www.youtube.com/watch?v=4n3l-ZcDJNY) weren't in place
>     >         yet when
>     >         it all started.
>     >
>     >         Once you understand the project a bit better and like
>     it, the
>     >         next step
>     >         is to discuss here (in this mailing list) what is the
>     best way to
>     >         implement the checker. The ultimate outcome of this
>     discussion
>     >         will be a
>     >         so-called "GSoC proposal". It's a few pages of text that you
>     >         write, post
>     >         here for more discussion, and eventually upload to the GSoC
>     >         website.
>     >         According to the GSoC timeline, the proposal should be
>     >         submitted by the
>     >         end of March. The proposal summarizes how *you*
>     understand the
>     >         project
>     >         and how *you* plan to tackle it during the summer.
>     >
>     >         Good luck on your GSoC path!
>     >         Artem.
>     >
>     >
>     >         On 3/7/20 3:40 PM, Nithin Vadukkumchery Rajendrakumar via
>     >         cfe-dev wrote:
>     >         >
>     >         > Greetings,
>     >         >
>     >         >
>     >         > I am interested to participate in GSoC 2020. I am
>     particularly
>     >         > interested in the project idea "Find null smart pointer
>     >         dereferences
>     >         > with the Static Analyzer". I am doing my masters in
>     computer
>     >         science
>     >         > and interested in program analysis and verification. I
>     thought
>     >         > GSoC2020 will be a wonderful opportunity to learn more
>     about
>     >         Clang
>     >         > Static Analyzer and contribute.
>     >         >
>     >         >
>     >         > I have started reading about smart pointers in C++ to
>     get a
>     >         good grasp
>     >         > of the concepts. Also, has some experience in implementing
>     >         Clang
>     >         > Static Analyzer simple checks(similar to
>     >         SimpleStreamChecker) from the
>     >         > tutorials. I read through few available tutorials and have
>     >         some basic
>     >         > idea about Control Flow Graph, Exploded Graph and Symbolic
>     >         Values. I
>     >         > have read the paper "A memory model for static
>     analysis of C
>     >         programs"
>     >         > to get some theoretical background. I also started
>     looking into
>     >         > NullabilityChecker.cpp
>     >         >
>     >       
>      <https://github.com/llvm/llvm-project/blob/master/clang/lib/StaticAnalyzer/Checkers/NullabilityChecker.cpp> to
>     >
>     >         > understand the codebase.
>     >         >
>     >         > I would like to know is this the right place to look?
>     >         >
>     >         > Could anyone please help me on what should I do next?
>     >         >
>     >         > ----
>     >         > Thanks & Regards,
>     >         > Nithin
>     >         >
>     >         > _______________________________________________
>     >         > cfe-dev mailing list
>     >         > cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     <mailto:cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>>
>     >         > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>     >
>



More information about the cfe-dev mailing list