<div dir="ltr">Great, thanks for the detailed explanation. I started out directly with this tutorial, <a href="http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf">http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf</a>, and also read the basics in the Clang Static Analyzer developer manual. But since I don't have any prior knowledge of Clang, do I need to go through any other tutorials to fully understand the code of the various experimental checkers and to write one of my own?<div><br></div><div>Another specific question: suppose I have a statement var = read_value(). Can I directly make read_value a taint source by adding a line to the addSourcesPost function of GenericTaintChecker? And after changing the file, do I necessarily need to run 'make clang' inside the build directory, or is there a simpler way to pick up the changes? The former takes far too much time.</div><div><br></div><div>Regards,</div><div>Ashwin</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Apr 5, 2016 at 3:26 PM, Artem Dergachev via cfe-dev <span dir="ltr"><<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">> For example,<br>

>   x = getchar();<br>

>   char y = x + 1;<br>

> Which part of the code taints y?<br>

<br></span>

Propagation of taint through the symbol hierarchy is done by the core automatically. In fact, no propagation is done in any continuous manner - the core just looks at the symbol that you're interested in and finds tainted sub-symbols inside it. This mechanism is implemented in the ProgramState::isTainted() methods, and it relies on the assumption that symbols on which the taint originally appears are always atomic (of SymbolData class).<br>

<br>
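To make the sub-symbol search concrete, here is a minimal, self-contained C++ sketch of the idea. It is a toy model under stated assumptions, not the analyzer's actual code: Sym, AtomSym, AddIntSym, TaintSet, Example and buildGetcharExample are hypothetical names invented for illustration, the free function isTainted only mimics the behaviour described for ProgramState::isTainted(), and the printed symbol names omit the type suffix that a real dump() shows.<br>

```cpp
// Toy model only: these types are hypothetical stand-ins, not Clang's classes.
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

struct Sym {
  virtual ~Sym() = default;
  virtual std::string str() const = 0;
};
using SymRef = std::shared_ptr<const Sym>;

// Atomic symbol, playing the role of a conjured return value.
struct AtomSym final : Sym {
  unsigned Id;
  explicit AtomSym(unsigned Id) : Id(Id) {}
  std::string str() const override { return "conj_$" + std::to_string(Id); }
};

// Composite symbol "symbol + integer constant".
struct AddIntSym final : Sym {
  SymRef LHS;
  int RHS;
  AddIntSym(SymRef L, int R) : LHS(std::move(L)), RHS(R) {}
  std::string str() const override {
    return LHS->str() + " + " + std::to_string(RHS);
  }
};

// Taint marks are attached to atomic symbols only.
using TaintSet = std::set<const Sym *>;

// Nothing is propagated eagerly: the query walks the sub-symbols on demand.
bool isTainted(const SymRef &S, const TaintSet &Tainted) {
  if (Tainted.count(S.get()))
    return true;
  if (const auto *Add = dynamic_cast<const AddIntSym *>(S.get()))
    return isTainted(Add->LHS, Tainted); // the constant RHS is not a symbol
  return false;
}

struct Example {
  SymRef X, Y, Unrelated;
  TaintSet Tainted;
};

// Models: x = getchar(); char y = x + 1;
Example buildGetcharExample() {
  Example E;
  E.X = std::make_shared<AtomSym>(0);         // return value of getchar()
  E.Tainted.insert(E.X.get());                // the checker marks it as tainted
  E.Y = std::make_shared<AddIntSym>(E.X, 1);  // value stored in 'y'
  E.Unrelated = std::make_shared<AtomSym>(1); // some other, untainted value
  return E;
}
```

With Example E = buildGetcharExample(), the query isTainted(E.Y, E.Tainted) succeeds even though only E.X was ever marked, while isTainted(E.Unrelated, E.Tainted) fails. The composite symbol carries no taint mark of its own; the answer comes entirely from finding the marked atomic symbol inside it.<br>

<br>
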

In your particular example, the following happens:<br>

1. getchar() returns a SymbolConjured - an atomic symbol that represents the return value. Technically, it returns an SVal of nonloc::SymbolVal class, but it is a simple wrapper around the symbol, so there isn't much difference. If you dump() the program state, you'd see it as something like "conj_$0<int>".<br>

2. The conjured symbol is stored in the memory region (VarRegion) that represents AST variable 'x' in the analyzer's memory model. If you dump() the program state, you'd see a binding in the Store: "(x, 0, direct): conj_$0<int>".<br>

3. In order to compute 'x + 1', the conjured symbol "conj_$0<int>" is loaded as the (r)value of the expression 'x'.<br>

4. A SymIntExpr, the symbolic expression 'conj_$0<int> + 1', is stored in the memory region of variable 'y'.<br>

5. Suppose you then ask whether the value of 'y' is tainted. The symbol 'conj_$0<int> + 1' is taken to represent the value of 'y'.<br>

6. In order to see if the value is tainted, ProgramState::isTainted(SymbolRef Sym, ...) iterates over all sub-symbols of the symbolic expression.<br>

7. It finds 'conj_$0<int>' as one such sub-symbol (in fact the only one; '1' is not a symbol).<br>

8. Seeing that 'conj_$0<int>' was marked as tainted by the TaintPropagation checker, it decides that the whole symbol is tainted. It therefore reports that the value of the expression 'y' is tainted.<div class="HOEnZb"><div class="h5"><br>


</div></div></blockquote></div><br></div>