[cfe-dev] Pointers as SVals

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Thu Jun 18 04:20:50 PDT 2020


If a symbol (SymExpr object) `$p` is an unknown numeric value of a 
memory address, then a symbolic region (i.e., SymbolicRegion object) 
`SymRegion{$p}` represents the segment of memory that starts at address 
`$p` and ends at another unknown position, and a pointer value 
(loc::MemRegionVal object) `&SymRegion{$p}` represents, well, a value of 
a pointer to the beginning of symbolic region `SymRegion{$p}`.

All three are basically the same thing. `SymRegion{$p}` is slightly 
different because it implies the existence of the other end of the 
segment (even if it's unknown) but `&SymRegion{$p}` is basically the 
same thing as `$p`, just represented as an object of a different type 
(SVal as opposed to SymExpr).

Think of SymbolicRegion and loc::MemRegionVal as adaptors; they don't 
change the meaning behind the object, they only represent it in a 
different manner, like a different point of view on the same entity. The 
important technical difference between `&SymRegion{$p}` and `$p` is that 
the former is Loc and the latter is NonLoc.
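
To make the adaptor idea concrete, here is a toy C++ model. It is only a 
sketch: the `Toy*` types and their `dump()` helpers are made up for 
illustration and are not the analyzer's real classes; only the printed 
notation follows what `clang_analyzer_dump()` shows. All three objects 
wrap the same symbol, and only the static type differs.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a SymExpr: an unknown numeric value, e.g. an address.
struct ToySymbol {
  std::string Name; // e.g. "$p"
  std::string dump() const { return Name; }
};

// Toy stand-in for SymbolicRegion: the memory segment that starts at the
// address denoted by the symbol.
struct ToySymbolicRegion {
  const ToySymbol *Sym;
  explicit ToySymbolicRegion(const ToySymbol *S) : Sym(S) {
    assert(S && "a symbolic region needs a symbol");
  }
  std::string dump() const { return "SymRegion{" + Sym->dump() + "}"; }
};

// Toy stand-in for loc::MemRegionVal: a pointer to the start of a region.
struct ToyMemRegionVal {
  const ToySymbolicRegion *Region;
  explicit ToyMemRegionVal(const ToySymbolicRegion *R) : Region(R) {
    assert(R && "a pointer value needs a region");
  }
  std::string dump() const { return "&" + Region->dump(); }
  // Unwrapping both adaptors gives back the very same symbol object.
  const ToySymbol *getAsSymbol() const { return Region->Sym; }
};
```

Wrapping a `ToySymbol` named `$p` this way dumps as `$p`, `SymRegion{$p}` 
and `&SymRegion{$p}` respectively, and `getAsSymbol()` on the pointer 
value returns the original symbol.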

There's another such adaptor, nonloc::SymbolVal, that represents 
SymExprs as SVals directly. For any symbol `$p` of pointer type, 
nonloc::SymbolVal of `$p` is ill-formed; it is always going to be 
canonically represented as loc::MemRegionVal `&SymRegion{$p}` instead. 
So nonloc::SymbolVal can only be used on regular integers. This ensures 
that Loc values are always used for representing pointers (or 
references, or values of glvalue expressions) and NonLoc values are 
always used for representing integers and other prvalues of non-pointer 
type.
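
The canonicalization rule can be sketched with another toy model. Again, 
this is an assumption-laden illustration, not the real API: `ToySVal`, 
`toyMakeLoc`, `toyMakeNonLoc` and `toyWrapSymbol` are hypothetical names, 
and the `IsPointerTyped` flag stands in for the symbol's actual type.

```cpp
#include <cassert>
#include <string>

// Toy symbol; IsPointerTyped stands in for the symbol's real type.
struct ToySymbol {
  std::string Name;
  bool IsPointerTyped;
};

// Toy SVal that is either a Loc (pointer-like) or a NonLoc (integer-like).
struct ToySVal {
  enum Kind { Loc, NonLoc } K;
  const ToySymbol *Sym;
  std::string dump() const {
    if (K == Loc)
      return "&SymRegion{" + Sym->Name + "}";
    return Sym->Name;
  }
};

// Hypothetical factory for the Loc wrapper (models loc::MemRegionVal).
ToySVal toyMakeLoc(const ToySymbol &S) {
  assert(S.IsPointerTyped && "only pointer-typed symbols become Loc");
  return ToySVal{ToySVal::Loc, &S};
}

// Hypothetical factory for the NonLoc wrapper (models nonloc::SymbolVal).
// Wrapping a pointer-typed symbol here would be ill-formed.
ToySVal toyMakeNonLoc(const ToySymbol &S) {
  assert(!S.IsPointerTyped && "pointer-typed symbols must be wrapped as Loc");
  return ToySVal{ToySVal::NonLoc, &S};
}

// Picks the canonical wrapper based on the symbol's type.
ToySVal toyWrapSymbol(const ToySymbol &S) {
  return S.IsPointerTyped ? toyMakeLoc(S) : toyMakeNonLoc(S);
}
```

With this sketch a pointer-typed `$p` always comes out as the Loc 
`&SymRegion{$p}`, while an integer-typed `$n` comes out as the NonLoc 
`$n`; the assertions play the role of the type-safety check.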

This entire system of adaptors might seem unnecessarily complicated, and 
it probably is, but I can't say we suffer too much from its existence. I 
don't have anything better in mind, and I believe it adds a bit of type 
safety that helps us avoid introducing bugs in the code.

See also http://lists.llvm.org/pipermail/cfe-dev/2017-June/054084.html


 > `clang_analyzer_dump()` says it is an element region

It doesn't. It says "&Element", not "Element". This should be read as 
"address of element" and indicates that the dumped value is a 
loc::MemRegionVal, i.e. a pointer value. That's exactly how the explainer 
works as well, which is why it says "pointer to".
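
The difference between a region and a pointer to it can be sketched with 
a toy dumper and explainer. This is a simplified model, not the code in 
`SValExplainer.h`: `ToyRegion` and the four functions are hypothetical, 
the `S64b` index suffix is hard-coded, and the rule that a pointer to a 
plain symbolic region is described as the symbol itself is an assumption 
chosen to match the first output quoted below.

```cpp
#include <cassert>
#include <string>

// Toy region: either a symbolic region or an element of a super-region.
struct ToyRegion {
  enum Kind { Symbolic, Element } K;
  std::string Symbol;     // used when K == Symbolic, e.g. "$p"
  const ToyRegion *Super; // used when K == Element
  int Index;              // used when K == Element
  std::string ElemType;   // used when K == Element, e.g. "int"
};

// Dumps the region itself: no leading '&'.
std::string dumpRegion(const ToyRegion &R) {
  if (R.K == ToyRegion::Symbolic)
    return "SymRegion{" + R.Symbol + "}";
  assert(R.Super && "an element region needs a super-region");
  return "Element{" + dumpRegion(*R.Super) + "," + std::to_string(R.Index) +
         " S64b," + R.ElemType + "}";
}

// Dumps a pointer value to the region: the leading '&' marks "address of".
std::string dumpPointer(const ToyRegion &R) { return "&" + dumpRegion(R); }

// Explains the region itself.
std::string explainRegion(const ToyRegion &R) {
  if (R.K == ToyRegion::Symbolic)
    return "pointee of symbol " + R.Symbol;
  assert(R.Super && "an element region needs a super-region");
  return "element of type '" + R.ElemType + "' with index " +
         std::to_string(R.Index) + " of " + explainRegion(*R.Super);
}

// Explains a pointer value to the region.
std::string explainPointer(const ToyRegion &R) {
  // Assumed special case: avoid the awkward "pointer to pointee of ...".
  if (R.K == ToyRegion::Symbolic)
    return "symbol " + R.Symbol;
  return "pointer to " + explainRegion(R);
}
```

For element 1 of `SymRegion{$p}`, `dumpPointer` yields 
`&Element{SymRegion{$p},1 S64b,int}` and `explainPointer` yields a string 
starting with "pointer to element of type 'int'", whereas `dumpRegion` 
and `explainRegion` describe the element itself.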



On 6/18/20 12:57 PM, Ádám Balogh via cfe-dev wrote:
>
> Hello,
>
> I am trying to understand how to distinguish the value of the pointer 
> itself and the pointed region. However, I experience some 
> contradictions while testing. Look at the following piece of code:
>
> ```
> const int* get_ptr();
>
> void f() {
>   const int *p = get_ptr();
>   clang_analyzer_dump(p);
>   clang_analyzer_explain(p);
> }
> ```
>
> The output of this code:
>
> ```
> ptr_dump_explain.c:8:3: warning: &SymRegion{conj_$2{const int *, LC1, S715, #1}} [debug.ExprInspection]
>   clang_analyzer_dump(p);
>   ^~~~~~~~~~~~~~~~~~~~~~
> ptr_dump_explain.c:9:3: warning: symbol of type 'const int *' conjured at statement 'get_ptr()' [debug.ExprInspection]
>   clang_analyzer_explain(p);
>   ^~~~~~~~~~~~~~~~~~~~~~~~~
> ```
>
> Is `p` a region or a symbol? `clang_analyzer_dump()` says it is a 
> region, more specifically a symbolic region, but still a region. 
> However, `clang_analyzer_explain()` says it is a symbol, which I think 
> is wrong. According to `SValExplainer.h` it should print something 
> like `object at…` or `pointee of …` but not explain the raw symbol 
> without mentioning the region.
>
> I tried to change the code to the following:
>
> ```
> void f() {
>   const int *p = get_ptr();
>   ++p;
>   clang_analyzer_dump(p);
>   clang_analyzer_explain(p);
> }
> ```
>
> The output changes:
>
> ```
> ptr_dump_explain.c:9:3: warning: &Element{SymRegion{conj_$2{const int *, LC1, S715, #1}},1 S64b,int} [debug.ExprInspection]
>   clang_analyzer_dump(p);
>   ^~~~~~~~~~~~~~~~~~~~~~
> ptr_dump_explain.c:10:3: warning: pointer to element of type 'int' with index 1 of pointee of symbol of type 'const int *' conjured at statement 'get_ptr()' [debug.ExprInspection]
>   clang_analyzer_explain(p);
>   ^~~~~~~~~~~~~~~~~~~~~~~~~
> ```
>
> This is even stranger, because here `clang_analyzer_dump()` says it is 
> an element region, thus a region of the array element. However, here 
> `clang_analyzer_explain()` says it is a pointer to the element, thus 
> not the element itself. According to `SValExplainer.h` the output for 
> an element region should begin with `element of type…`. What is wrong 
> here? Both functions take the same type of parameter:
>
> ```
> void clang_analyzer_dump(const int*);
> void clang_analyzer_explain(const int*);
> ```
>
> What do I misunderstand here?
>
> Regards,
>
> Ádám
