[cfe-dev] [analyzer] Binding address-of globals

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Tue Jun 12 14:51:10 PDT 2018



On 6/12/18 7:05 AM, Rafael Stahl wrote:
>
> Alright, thanks for the info. As I see it, number 2 should already be 
> solved, but number 1 is still not clear to me.
>
> The issue is that there is no direct binding available, as there is in 
> the non-global case.
>
> - Non-global: getBindingForField returns a direct binding. The 
> initialization earlier in main caused this direct binding.
>
> - Global: getBindingForField does not find a direct binding and 
> cannot resolve FieldInit to a constant.
>

Well, there is never a direct binding for anything unless it was put 
there during analysis, which is not the case for global initializers. 
That's exactly the problem you're solving.
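
To illustrate, here is a minimal toy sketch of that situation. It is not 
the real RegionStore: ToyStore, bind, getDirectBinding and analyzeMain are 
hypothetical names made up for this example, and regions are reduced to 
plain strings. It only shows that a direct binding exists when the 
analysis itself modeled the write (the local "subs" in main) and is absent 
for a global whose initializer was never executed during analysis.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for the analyzer's store: region name -> bound value.
// Hypothetical names; the real RegionStore is far more involved.
struct ToyStore {
  std::map<std::string, std::uint64_t> Bindings;

  // A store write, as performed while the analysis models a statement.
  void bind(const std::string &Region, std::uint64_t Value) {
    Bindings[Region] = Value;
  }

  // Returns the directly bound value, or nullptr if nothing was ever bound.
  const std::uint64_t *getDirectBinding(const std::string &Region) const {
    auto It = Bindings.find(Region);
    return It == Bindings.end() ? nullptr : &It->second;
  }
};

// Models analyzing main() from the example in this thread: the local
// initializer is executed by the analysis, the global one is not.
inline ToyStore analyzeMain() {
  ToyStore Store;
  Store.bind("subs.p", 0x80008000ULL); // 'subs' is initialized inside main
  assert(Store.getDirectBinding("subs.p") != nullptr);
  // 'gsubs' is initialized before main runs, so no bind() happens for it.
  return Store;
}
```

With this model, looking up "subs.p" after analyzeMain() yields a value, 
while looking up "gsubs.p" yields nothing, so the global case needs a 
fallback that inspects the initializer.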

I guess the difference here is that you can't evaluate the initializer 
expression at compile time (because the actual numeric value of the 
address of the global is not known before the program is run), but during 
analysis we don't care about the precise numeric value of the address. 
The SVal that represents the address of a global variable 
(loc::MemRegionVal that wraps a VarRegion) says exactly that: "it's the 
address of that global variable" without specifying what this address is.

So you'd have to step away from the constant folding methods used by the 
compiler (e.g. EvaluateAsInt) and implement analyzer-specific constant 
folding that works similarly but collapses the expression to a concrete 
value in the analyzer's sense rather than to a compile-time constant 
value. That way DeclRefExpr(VarDecl) would collapse to a 
loc::MemRegionVal(VarRegion), which is State->getLValue(VarDecl, LCtx), 
where LCtx is ignored for global variables.
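
A rough sketch of the idea, as a self-contained toy model rather than 
actual analyzer code: ToyExpr, ToySVal, foldAsInt, foldAsSVal, globalInits 
and loadGlobalField are all hypothetical names invented for this example, 
and the table of initializers is hard-coded from the test case in this 
thread. Compiler-style folding gives up on the address of a global, while 
analyzer-style folding returns a symbolic "address of this variable" 
value, loosely in the spirit of loc::MemRegionVal.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy initializer expression: an integer literal or '&Var' of a global.
struct ToyExpr {
  enum Kind { IntLiteral, AddrOfGlobal } K;
  std::uint64_t Value; // meaningful when K == IntLiteral
  std::string VarName; // meaningful when K == AddrOfGlobal
};

// Toy analyzer value: a concrete integer, or "the address of this
// variable" with no numeric value attached.
struct ToySVal {
  enum Kind { Unknown, ConcreteInt, RegionAddr } K;
  std::uint64_t Value; // meaningful when K == ConcreteInt
  std::string VarName; // meaningful when K == RegionAddr
};

// Compiler-style folding: succeeds only when a number is available.
inline bool foldAsInt(const ToyExpr &E, std::uint64_t &Result) {
  if (E.K != ToyExpr::IntLiteral)
    return false; // the address of a global has no number before run time
  Result = E.Value;
  return true;
}

// Analyzer-style folding: '&global' collapses to a symbolic address.
inline ToySVal foldAsSVal(const ToyExpr &E) {
  if (E.K == ToyExpr::IntLiteral)
    return ToySVal{ToySVal::ConcreteInt, E.Value, ""};
  assert(E.K == ToyExpr::AddrOfGlobal);
  return ToySVal{ToySVal::RegionAddr, 0, E.VarName};
}

// Initializers of the const globals from the example, keyed "var.field".
inline const std::map<std::string, ToyExpr> &globalInits() {
  static const std::map<std::string, ToyExpr> Inits = {
      {"gs.sub", ToyExpr{ToyExpr::AddrOfGlobal, 0, "gsubs"}},
      {"gsubs.p", ToyExpr{ToyExpr::IntLiteral, 0x80008000ULL, ""}}};
  return Inits;
}

// Load a field of a const global by folding its initializer.
inline ToySVal loadGlobalField(const std::string &Var,
                               const std::string &Field) {
  auto It = globalInits().find(Var + "." + Field);
  if (It == globalInits().end())
    return ToySVal{ToySVal::Unknown, 0, ""};
  return foldAsSVal(It->second);
}
```

In this model, loadGlobalField("gs", "sub") yields a RegionAddr naming 
"gsubs", and a second, independent load of field "p" through that name 
yields the concrete 0x80008000, which matches the two separate loads 
described in the quoted message below.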

> Now I could add some code to the case where getConstantVal fails to 
> look at the FieldInit Expr and return a new FieldRegion in a 
> loc::MemRegionVal if I find UnaryOp(&) -> DeclRefExpr(FieldDecl). The 
> issue is that this is very tailored to the example and does not work 
> in general. I feel like the SVal for the FieldInit Expr should be 
> available somewhere but I cannot figure out where.
>
> There is ProgramState::getSVal(const Stmt*, const LocationContext*), 
> but I'm not sure if this is applicable here, also because the RegionStore 
> doesn't seem to have any ProgramState or LocationContext.
>
> Rafael
>
>
> On 12.06.2018 01:59, Artem Dergachev wrote:
>> Hmm. It sounds as if we need to fix both things here, and both of 
>> them are something that you already know how to solve:
>>
>> 1. Be able to constant-fold "gs.sub" to "&gsubs",
>> 2. Be able to constant-fold "(&gsubs)->p" to "0x80008000".
>>
>> I guess the confusion arises because steps 1 and 2 are separated in 
>> time; they are in fact two independent loads. They interact through 
>> the Environment: we compute the sub-expression, put its value into 
>> the Environment, then later when we need to perform the second load 
>> we can retrieve the value from the Environment. Once we perform the 
>> first load correctly, it becomes irrelevant that such a load ever 
>> happened; ExprEngine, like checkers, is stateless. The problem 
>> becomes as easy as loading "gsubs.p" because the analyzer knows, in 
>> a path-sensitive manner, that the sub-expression "gs.sub" has evaluated 
>> to "&gsubs"; that'd be already encoded in the MemRegion structure.
>>
>> So I think we don't need to retroactively create anything. Instead, 
>> we simply need to perform every step precisely. Which is anyway a 
>> good thing because there's always code that never gets to the second 
>> step.
>>
>> Sorry if the answer is not spot-on; I'm not sure I fully understood 
>> the question.
>>
>> On 6/7/18 1:52 AM, Rafael Stahl via cfe-dev wrote:
>>> Hi,
>>>
>>> continuing my effort to make the analyzer understand more constants, 
>>> I took a look at the following case:
>>>
>>>
>>> struct SubS {
>>>     int *p;
>>> };
>>>
>>> struct S {
>>>     struct SubS *sub;
>>> };
>>>
>>> struct SubS const gsubs = {
>>>     .p = 0x80008000
>>> };
>>> struct S const gs = {
>>>     .sub = &gsubs
>>> };
>>>
>>> int main() {
>>>     struct SubS subs = {
>>>         .p = 0x80008000
>>>     };
>>>     struct S s = {
>>>         .sub = &subs
>>>     };
>>>
>>>     *s.sub->p;
>>>     *gs.sub->p;
>>> }
>>>
>>> Here, the analyzer recognizes the dereference via s, but not gs. 
>>> This seems to be the case because region information will be stored 
>>> for subs, but not for gsubs.
>>>
>>> I'm not sure how to solve this issue. Could we retroactively create 
>>> the region information whenever we encounter constants like this? Or 
>>> rather add something to the getBinding functions that manually 
>>> resolves this case? For the latter it seems like the analyzer should 
>>> already understand what is happening without many additions, but 
>>> it's unclear to me how it connects.
>>>
>>> Best regards
>>> Rafael
>>>
>>>
>>>
>>
>
