[cfe-dev] [analyzer] Conjuring symbols in checkBeginFunction()

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Fri Jun 22 16:02:06 PDT 2018



On 6/21/18 12:21 AM, Julian Ganz wrote:
> Hi Artem,
>
> thanks for the reply. In retrospect what I wrote might have been a little
> confusing. I hope the following does explain the problem a bit better.
> Things should get clearer especially towards the end of the mail.
>
> Also, apparently you replied to me directly. It seems the mail did not hit
> the list (yet). Was this on purpose?

Mmm, nope, the message is in fact there on the list: 
http://lists.llvm.org/pipermail/cfe-dev/2018-June/058256.html

Also, I just remembered that we've already had a conversation on this 
subject, where I did actually forget to reply-all: 
http://lists.llvm.org/pipermail/cfe-dev/2018-March/057297.html

>
>>> and, finally, getting a symbol representing this region's value:
>>>       auto symbol = C.getSymbolManager().getRegionValueSymbol(fieldRegion);
>> You shouldn't be doing this. You can obtain the same result by using
>> State->getSVal(fieldRegion), which would simply retrieve the value
>> that's already bound to the region before the analysis begins. It's not
>> printed or explicitly written out in the RegionStore's maps, but it's
>> "just there".
> That's exactly my problem: apparently there _is_ no value bound to the
> region.

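It _is_ there, merely optimized out: when nothing has been explicitly bound 
to the region, State->getSVal() produces the region's initial-value symbol 
on demand instead of storing it in the map.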
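
As a toy illustration of that laziness (plain C++, not the analyzer's 
actual RegionStore; ToyStore, its methods and the "reg<...>" spelling are 
made-up names for this sketch), a lookup can fall back to a value derived 
from the region itself, so nothing ever needs to be written into the map:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy stand-in for a store. Hypothetical names, NOT the analyzer's API.
class ToyStore {
  // Only explicit bindings live here: region name -> bound value.
  std::map<std::string, std::string> Bindings;

public:
  void bind(const std::string &Region, const std::string &Value) {
    Bindings[Region] = Value;
  }

  // Return the explicit binding if there is one. Otherwise return the
  // region's initial-value symbol, computed from the region on demand
  // and never inserted into the map.
  std::string getSVal(const std::string &Region) const {
    auto It = Bindings.find(Region);
    if (It != Bindings.end())
      return It->second;
    return "reg<" + Region + ">";
  }

  std::size_t numExplicitBindings() const { return Bindings.size(); }
};

void demoLazyInitialValue() {
  ToyStore S;
  // Two lookups agree on the same symbol although the map stays empty.
  assert(S.getSVal("this->field") == S.getSVal("this->field"));
  assert(S.numExplicitBindings() == 0);
}
```

A dump of such a store would show no entry for the field, and yet every 
lookup returns the same well-defined value.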

>>> In callbacks executed afterwards, I indeed see that some symbol is bound to
>>> the field. However, it appears not to be the one I created. E.g.
>>> `state->dump()` prints:
>>>       Expressions:
>>>         (0xf5e7470,0x931a828) this->field :
>>> &base{SymRegion{derived_$224{conj_$221{int},base{base{base{base{SymRegion{
>>> reg_$0<class ObjType * this>}->field,something}, something }, something else},
>>> something else }-> something else }}, something else }
>> It means that the field was overwritten during analysis and now contains
>> a different unknown value that's denoted by a different symbol.
> Yes, that's what I thought.
>
>> From your dump, you should be able to retrieve the &base{...} region
>> from the expression 0x931a828 with location context 0xf5e7470.
> Yep, this would require a (potentially custom) symbol-/memory-visitor.
> However, as you already pointed out yourself, I'd be unable to find the
> symbol I created anyway.
>
>> It might be that you're supplying a wrong LocationContext (which
>> represents a stack frame, eg. during recursion the same active
>> expression may have different values on different stack frames in the
>> backtrace).
> I thought so myself. But the location context is the correct one.
> It's the expression that's somehow different.
>
>> It might also be that MemberExpr is not an active expression.
>> Environment maintains only expressions that are currently under
>> evaluation, because you can't even define what a value of an expression
>> is if you aren't in the process of evaluating it.
> That's exactly what I'm trying to do: attaching a symbol to a member of an
> object _in_advance_

I mean, that's not how the language works. One does not simply prevent a 
variable from being overwritten. Instead you should be making sure that 
contents of the memory are correct, so that the expression automatically 
evaluates to the value that you expect to see.
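
Again as a toy sketch in plain C++ (not analyzer code; ToyMemory and its 
methods are hypothetical names), the idea is that evaluating the member 
expression is just a load from memory at that moment, so you control the 
result by controlling what is in memory, and a later write by the analyzed 
code legitimately changes it:

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: memory contents plus an evaluator for `this->field`.
// Hypothetical names, NOT the analyzer's API.
struct ToyMemory {
  std::map<std::string, std::string> Contents; // region name -> value

  void write(const std::string &Region, const std::string &Value) {
    Contents[Region] = Value;
  }

  // Evaluating a member expression loads whatever the field's region
  // holds at the time of evaluation; untouched regions yield a value
  // derived from the region itself.
  std::string evalMemberExpr(const std::string &FieldRegion) const {
    auto It = Contents.find(FieldRegion);
    return It != Contents.end() ? It->second : "reg<" + FieldRegion + ">";
  }
};

void demoMemoryDrivesExpressions() {
  ToyMemory M;
  // Set up memory once, before any expression is evaluated.
  M.write("this->field", "my_symbol");
  assert(M.evalMemberExpr("this->field") == "my_symbol");

  // The analyzed code overwrites the field: the expression now
  // evaluates to the new value, as it should.
  M.write("this->field", "conj_221");
  assert(M.evalMemberExpr("this->field") == "conj_221");
}
```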

> , i.e. before any expression is processed by the
> analyzer. Because I _do_ know some facts about that field that the analyzer
> doesn't. I do this because doing a potentially expensive operation on every
> single instance of a `MemberExpr` wasn't an option for me for multiple
> reasons. It also greatly reduces the precision of the analysis since, of
> course, the member might be modified, resulting in the symbol being
> removed/altered. This, in turn, would result in the additional information
> being wrongly re-attached the next time I see it because I have no way to
> tell whether I already processed the member without either keeping some
> additional data-structure or looking at every single predecessor of the
> current `ExplodedNode`.
>
> Frankly, by now I have reached the conclusion that some of the design decisions
> made for the static analyzer are rather questionable, but that's another
> topic.

I mean, it's strange how you come to that conclusion by desperately 
trying to do something that "greatly reduces the precision of the analysis".

>> That's why we have symbols and regions that are distinct from expressions
>> and actually have any sort of meaning in more than just one moment of
>> time.
> I totally understand; and that's not the problem I'm facing.
>
>> If you need to recover the value of an expression that was evaluated a
>> long time ago, you most likely should use program state traits to track
>> the information your checker needs.
> The problem is not that the information was collected "a long time ago" but
> _before_ the analysis even began.
>
>>> Is there actually any way to make this work with the (current) clang static
>>> analyzer? E.g. to create symbols which are not (yet) "backed" by an
>>> expression encountered in the AST?
>> I didn't fully understand what your final goals with this are, but it
>> definitely sounds like you're on the wrong track with these attempts. I
>> might be able to help if you explain what sort of check you are trying
>> to implement and what sort of code you're running it on (based on your
>> dump, it's a large piece of code, so you might have to reduce it).
> Basically, I have to run some sort of iterative taint analysis which relies
> on additional information provided from outside the (current) analyzer. The
> problem I face is that I need to introduce known tainting information for an
> object from the configuration and/or previous runs. I also came up with
> another solution which circumvents the analyzer's limitations, but it's
> rather awful.
>
> Regards,
> Julian
>



