[cfe-dev] General query : Alpha security checkers and taint analysis

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 5 10:12:36 PDT 2016

 > since I don't have any prior knowledge about clang ,
 > do i need to go through any other tutorials to
 > completely understand the code of various experimental
 > checkers and also to write one of my own?

Emm, no, we don't yet have a single good tutorial for everything. Some 
useful reading includes:

- lib/StaticAnalyzer/README.txt is a veeery brief introduction.

- The link [2] from lib/StaticAnalyzer/README.txt is a good detailed 
description of the memory model (MemRegion class hierarchy).

- See docs/analyzer/IPA.txt for a quick introduction to how 
inter-procedural analysis works.

- See docs/analyzer/RegionStore.txt for a shorter introduction to the 
memory model, with some implementation caveats.

You may want to get familiar with the clang abstract syntax tree, just 
to know how clang represents types etc.; there is a good video here: 
http://clang.llvm.org/docs/IntroductionToTheClangAST.html .

Also, checker code is usually relatively simple. And the API is also 
relatively easy and intuitive - well, in most places. Just dump things 
often - or read the exploded graphs - and try to understand what's going 
on. Learning by example is what everybody does, I guess, even though not 
all examples are as good as I wish they were.

 > Another specific question I have is that , suppose i
 > have a statement var = read_value() . can I directly
 > add read_value function to be one of the taint sources
 > by adding a line in addSourcesPost function of
 > GenericTaintChecker ?

It should work. Though if you want to share your work later, it would 
probably be inconvenient to have very specific functions in the 
generic taint checker, and we'd have to think about how to separate them.
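
To make the "adding a line" idea concrete, here is a minimal standalone 
sketch, assuming the checker picks taint sources by the callee's name. 
It is plain C++ and does not use the clang API; TaintSources, addSource, 
isSource and makeSources are hypothetical names, and the real 
addSourcesPost may be organized differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Standalone model of name-based taint-source lookup.
// NOT the clang API: every name in this block is hypothetical.
class TaintSources {
public:
  // Registers a function whose return value should be treated as tainted.
  void addSource(const std::string &Name) {
    assert(!Name.empty() && "source name must not be empty");
    Names.insert(Name);
  }

  // True if a call to Callee should taint the value it returns.
  bool isSource(const std::string &Callee) const {
    return Names.count(Callee) != 0;
  }

private:
  std::unordered_set<std::string> Names;
};

// Builds the table: a few library sources plus one project-specific entry.
inline TaintSources makeSources() {
  TaintSources S;
  S.addSource("scanf");
  S.addSource("getchar");
  S.addSource("read_value"); // the single added line from the question
  return S;
}
```

With this table, isSource("read_value") is true and isSource("strlen") 
is false. It also shows why project-specific names sit awkwardly next to 
the libc ones: they all end up in the same list.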

 > And after changing the file , do i need to
 > necessarily run 'make clang' inside build directory
 > or is there any simple way to reflect the changes
 > ,since the former takes way too much time.

You do. There are some usual tricks to speed up compilation - use the 
shared libraries option, use a faster compiler (clang?), use a faster 
linker (gold?), maybe use a release build if you don't need a 
debugger. Try to reduce the number of link jobs running in parallel, 
otherwise they may eat up all the RAM and start swapping.

For developing new analyzer checkers, there's one more option: load them 
as a clang plugin (e.g. 'clang -cc1 -load checker.so <...>'), see 
examples/analyzer-plugin/ for an example. In this case you don't need to 
rebuild clang, just the checker, but running becomes a bit more tricky - 
not sure if, say, the scan-build script supports this method.

So probably it's a good idea for you to copy GenericTaintChecker, change 
it to a plugin, and go ahead extending it.
