[cfe-dev] Working on open projects

Thu Sep 14 07:19:36 PDT 2017

Hello,

These are analyzer projects: they improve the symbolic-execution-based
bug finding behind clang's --analyze option, not compilation or code
generation. At the same time, these projects require relatively little
understanding of the analyzer's internals (compared to other projects).

* The body farm project does not require much knowledge about the
analyzer, and mostly requires knowledge of the AST. The idea of the
project is to synthesize ASTs of functions in order to help the analyzer
understand what they do when their bodies are not available in the
current translation unit (which is a problem because clang only
compiles, and therefore analyzes, one translation unit at a time).
Having an AST for an external function automagically allows the analyzer
to "inline" it during analysis; without the AST, the analyzer has to
assume that anything can happen when such a function is called, which
reduces the precision of the analysis.

Body-farmed ASTs are useful for system library functions that are simple
enough. The AST does not necessarily need to do exactly what the
function does, because the analyzer does not model everything exactly.
For example, any atomic operations on integers may be replaced with
regular integer operations because the analyzer would naturally do all
its symbolic calculations atomically. You can see which functions are
already there and follow their example to add the functions you're
interested in. There are very few: I guess we only have a couple of
libdispatch functions that are modeled to immediately call their
callback, and George has recently farmed a body for std::call_once
similarly in https://reviews.llvm.org/D37840, which turned out to be
harder than usual. Various compiler builtins (e.g., again, atomics?)
might make a nice addition, and as far as I remember, Devin may have a
couple of ideas as well.
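
To make the idea more concrete, here is a rough source-level sketch of
the kind of simplified behavior a farmed body could express. This is
not the AST-building code the body farm actually contains; the names
model_compare_and_swap and model_call_once are hypothetical and only
illustrate the two cases mentioned above (an atomic operation modeled
with plain integer operations, and a function modeled to immediately
call its callback).

```cpp
#include <functional>

// Hypothetical model of an atomic compare-and-swap on an integer.
// Plain loads and stores are used instead of real atomics; for analysis
// purposes that is assumed to be precise enough.
bool model_compare_and_swap(int old_value, int new_value, int *the_value) {
  if (*the_value == old_value) {
    *the_value = new_value;
    return true;
  }
  return false;
}

// Hypothetical model of a "run this callback once" library function.
// The callback is invoked right away on the first call, and the
// predicate is marked so that later calls do nothing.
void model_call_once(long *predicate, const std::function<void()> &callback) {
  if (*predicate != ~0L) {
    *predicate = ~0L;
    callback();
  }
}
```

With a body like this available, the analyzer could follow the call into
the callback instead of treating the library function as opaque.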

There is another mechanism in the analyzer, "evalCall", that allows
analyzer checkers to compute the effects of the function directly,
without consulting any sort of AST. The evalCall mechanism is older and
in many (but not all) cases more powerful, but it is probably overly
powerful and scales poorly with the number of checkers, so body farms
are preferred whenever possible.

Finally, there is an effort to allow the analyzer to import stuff from 
other translation units through ASTImporter 
(https://reviews.llvm.org/D30691); if successful, as a neat side effect 
this may allow us to replace manual AST construction in body farms with 
simply feeding raw source code to the analyzer, which might be easier.

* The C++ operator-new project is about constructing clang's CFG
more accurately. Because most of the compilation relies on LLVM's
CFG, the clang CFG is essentially used only by the analyzer and a couple
of analysis-based compiler warnings, not for compilation, and as such
it is not entirely finished. I haven't looked deeply into this problem
yet, but it seems that by the time the analyzer sees the object
construction element in the CFG, it has not been informed that it needs
to allocate symbolic memory to hold the newly constructed object, which
needs fixing.

While fixing the CFG is the first step, the ultimate goal of this
project is to enable the "-analyzer-config c++-allocator-inlining=true"
option by default, which means that work would also need to be done on
the analyzer side in order to understand the new CFG items and act
accordingly.
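
For readers less familiar with what a new-expression involves, the
sketch below spells out its two steps by hand: the allocator call that
obtains storage, then the constructor call that runs on that storage.
The ordering matters for the CFG because the construction step needs to
know about the memory produced by the allocation step. The Widget type
and both helper functions are hypothetical and exist only for this
illustration.

```cpp
#include <new>
#include <string>
#include <utility>

// Hypothetical type used only for illustration.
struct Widget {
  int id;
  std::string name;
  Widget(int i, std::string n) : id(i), name(std::move(n)) {}
};

// Roughly what `new Widget(id, name)` does, written as explicit steps.
// A real new-expression also releases the storage if the constructor
// throws; that cleanup is omitted here for brevity.
Widget *make_widget_in_two_steps(int id, const std::string &name) {
  // Step 1: the allocator call returns raw storage; no Widget exists yet.
  void *storage = ::operator new(sizeof(Widget));
  // Step 2: the constructor runs on that storage (placement new).
  return new (storage) Widget(id, name);
}

// The matching teardown: destroy the object, then release the storage.
void destroy_widget(Widget *w) {
  w->~Widget();
  ::operator delete(w);
}
```

To experiment with the option itself, an invocation along the lines of
"clang --analyze -Xclang -analyzer-config -Xclang
c++-allocator-inlining=true test.cpp" should work, assuming a clang
build that still accepts this config option.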

As an example of recent CFG work, I could recommend
https://reviews.llvm.org/D15031, which is not related to operator new
but gives an impression of how this area of our code looks.

* As for contacts, this mailing list is the right place to discuss what
you want to do, and our Phabricator (reviews.llvm.org) is the right
place to publish your patches. I've also CC'd the analyzer's code owner
Anna and other potentially interested people.

On 9/14/17 2:27 AM, Jiten Thakkar via cfe-dev wrote:
> Hi All,
> I was going through open projects page 
> (https://clang-analyzer.llvm.org/open_projects.html) and wondering if 
> that page is up to date or not. I found 'Explicitly model standard 
> library functions with BodyFarm' and 'Enhance CFG to model C++ new 
> more precisely' interesting to work on. I have some experience with 
> LLVM API and modeling functions for verification as part of my masters 
> project. So if anyone can let me know whom should I contact for those 
> projects or how should I get started then it would be very helpful.
>
> Thanks,
> Jiten