[cfe-dev] [analyzer][GSoC] Implementing a dataflow framework for the Clang Static Analyzer

Kristóf Umann via cfe-dev cfe-dev at lists.llvm.org
Thu Mar 14 06:02:58 PDT 2019


Artem Dergachev <noqnoqneo at gmail.com> ezt írta (időpont: 2019. márc. 7.,
Cs, 1:38):

> An unrelated humble/funny project is now officially up on
> http://llvm.org/OpenProjects.html#analyze-llvm
>
> On 3/2/19 6:50 AM, Gábor Horváth wrote:
>
> Hi!
>
> I think I could have the same level of commitment I had last summer. If
> you find that sufficient, I am happy to help with the mentoring!
>
>
> Yay! I mean, regardless of how the final setup would look, i'd be happy to
> see you intervene :)
>
> I do agree with Artem, if you just started to familiarize yourself with
> the topic, I would suggest starting with implementing some algorithms from
> scratch (or tweak existing ones). While improving existing algorithms is
> always nice, and it is good to get rid of some technical debt, it is
> sometimes harder to write a good report in the end.
>
> Some random ideas that I would love to see implemented:
> * Points to analysis would have a number of uses like more precise loop
> widening, call graph construction, ability to experiment with other
> invalidation schemes
> * Static backward slicing could help us shorten bug paths and filter events
>
>
> Whoa, this one sounds really interesting and also very impactful. We're
> right now pretty bad at figuring out which events are relevant and which
> aren't, and it constantly bothers me (one of the latest examples is
> https://bugs.llvm.org/show_bug.cgi?id=40913#c4). Maybe we could even poke
> that as a separate smaller project? Another good thing about this smaller
> project is that i don't know how to implement that, but it sounds like you
> do - which means that we could co-mentor it as part of GSoC (:
>

Currently, this is the one topic that interests me the most, it sounds like
an achievable, but "tough enough" task for a summer.


>
> * For some checks even just a set of dominators/post dominators for each
> basic block would be useful, see Kristof's uninitialized value check's
> "guarded member" heuristic
>
> Before I make a former proposal however, I'll implement this as an
exercise, because (according to Gábor) this is a project that could be
implemented in a week(ish). First, I'll add some code to my checker, and
see what it takes to make this functionality available to every checker --
this should give me a good idea on how this entire project could look like.

Thank you so much for the encouragement by the way, I'm super excited to
work on this! :)


> * Maybe some better measure of function complexity for the analyzer's
> inlining decision? As far as I remember we use the number of basic blocks
> now.
> * Maybe some static edge frequency estimate for the CFG to be able to
> experiment with other exploration strategies? Taking the most or least
> frequently taken edge leading to an unvisited node sounds like a fun
> experiment, but nothing guarantee that the end result will actually be
> useful.
>
> I think it is also ok to write multiple proposals, but you should aim for
> quality over quantity.
>
> Regards,
> Gabor
>
> On Sat, 2 Mar 2019 at 00:11, Artem Dergachev <noqnoqneo at gmail.com> wrote:
>
>> Hey!
>>
>> First of all, yup, we're totally doing GSoC this year! I'll do my best
>> to finally force myself to update the open projects list soon with a
>> funny random idea that some folks around there are interested in :)
>> Now, also i'm alone these days since George changed jobs; Gabor, are
>> you interested in / capable of helping out again? 'Cause they'll most
>> likely only let me pick one student, while we already have a few
>> people looking into participating :)
>>
>> Data flow... Well, this one's definitely wanted, but that's a big one,
>> definitely bigger than any of the projects that we've tried so far :)
>> I guess such project would aim at allowing data-flow checkers that
>> don't have to write down transfer functions for every kind of
>> statement in every checker, but instead rely on the engine to handle
>> common basic effects of statements. If done properly, it pretty much
>> means writing a new Static Analyzer, which is more complicated than
>> the old one because it also needs its state merge operations defined.
>>
>> There may be "poor-man's" solutions to this, which would allow us to
>> have a bit of progress without developing too much complex machinery.
>> One such solution is to introduce careful (conservative) tracking of
>> dropped execution paths in the Static Analyzer; it would allow finding
>> bugs that require analysis of all paths as long as the Analyzer is
>> known to have explored the function in its entirety. In practice
>> that'd be a lot of dropped coverage (much more than a proper
>> solution), but it'd still probably find *something*, while allowing us
>> to re-use most of the infrastructure. This, of course, is a dead-end
>> in the long run - if we ever want to regain the lost coverage, we'd
>> have to start from scratch.
>>
>> So my overall feel here is that i'd love to hear a specific proposal,
>> but i suspect that if you go for doing this properly, one summer would
>> only be enough to scratch the surface. If you want to have a feel of
>> what it'd take, i'd recommend poking our existing data flow analyses.
>> Eg., what would it take to remove my C++ "object under construction"
>> liveness hacks by improving live variables analysis? You might also
>> want to write a few analyses of your own and see if you can generalize
>> something out of them. Eg., if you write an analysis that checks
>> whether a function is pure, we can instantly use it to improve our
>> conservative evaluation of pure functions (eg., produce the same
>> return value symbol every time it's called with the same parameters
>> and skip invalidations - we may even try to do it across translation
>> units because such summary is easy to serialize).
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20190314/14a69dfa/attachment.html>


More information about the cfe-dev mailing list