[cfe-dev] Question on inspecting information across program paths

Venugopal Raghavan via cfe-dev cfe-dev at lists.llvm.org
Mon Feb 13 04:57:55 PST 2017


Hi Aleksei,

Thanks, I have re-written the checker based on your suggestion.

The custom program state that a checker adds to a program state gets
propagated to succeeding states also unless they are explicitly removed. Is
there a way to write some data into the custom state which is valid only
for that state and does not get propagated to succeeding states? I did
something like this:

ProgramStateRef state = C.getState();   // C is CheckerContext
const LocationContext *lctx = C.getLocationContext();
SVal val = state->getSVal(S, lctx);
SymbolRef Sym = val.getAsSymbol();  // Symbol I want to be added to this
state and to this state alone

// Remove existing custom state data
const CustomStateTy &Syms = state->get<CustomState>();
 for (auto I = Syms.begin(), E = Syms.end(); I != E; ++I) {
     SymbolRef Sym1 = *I;
    state = state->remove<CustomState>(Sym1);
 }
state = state->add<CustomState>(Sym);  // Add the symbol that is needed for
the current state

The idea was to remove custom state information from a state and then write
the information (symbol) corresponding to that state. Thus I write the
symbol to a state S and that creates a new state S1 with that symbol. S
itself is not modified since states are immutable. Now, the state which
succeeds S1 (which I call S2) also gets the same symbol through state
propagation which should however not be associated with that state. Hence I
try remove the symbol from S2 but it does not actually remove it from that
S2 but creates a new state S3 with the symbol removed (due to the immutable
nature of states).

Finally, I would still be left with states such as S2 which has information
which do not belong to them. Can I fix this in some way?

Not sure if this is a requirement outside the support provided by the
analyzer or I have created a requirement which is unnatural.

Thanks.

Regards,
Venu.

On Thu, Feb 9, 2017 at 8:54 PM, Aleksei Sidorin <a.sidorin at samsung.com>
wrote:

> Hello Venugopal,
>
> Unfortunately, there is no a simple way to get "symbol diff". Your
> approach with sequential iteration seems right to me in general.
> About comparison of symbols. If you want just ensure that some symbols are
> same or different (like reg_$0<x> and reg_$1<y> which are different), you
> can just compare their pointers (SymbolRefs). Symbols are not modifiable,
> every symbol is allocated in the pool and lives here, SymbolRefs just point
> on it. If these pointers are different, symbols are different too, and
> equal otherwise.
>
>
> 09.02.2017 18:20, Aleksei Sidorin пишет:
>
> Hello Venugopal,
>
> If you want just ensure that some symbols are same (like reg_$0<x> and
> reg_$1<y>), you can just compare their pointers (SymbolRefs). Symbols are
> not modifiable, every symbol is allocated in the pool and lives here,
> SymbolRefs just point on it. If these pointers are different, symbols are
> different too, and equal otherwise.
>
>
> 09.02.2017 13:03, Venugopal Raghavan пишет:
>
> Hi,
>
> Apologies, but, I am back again.
>
> I have two SymbolRef pointers (SymExpr *). One SymbolRef is (reg_$0<x>) +
> 3$117. The other SymbolRef is (reg_$0<x>) + 2$130. x is a variable in my
> program and these SymbolRefs respresent an assignment of x + y to a
> variable "a". y has the value 3 along one path to the program point where
> "a" is assigned and it has the value 2 along a second path to the same
> point, hence the two SymbolRefs differ in the constants.
>
> I want to compare these two SymbolRefs and find out where they differ. In
> this case, I want to be able to say that they differ in the constant value.
> I thought of using symbol_iterator to iterate over each SymbolRef or
> SymExpr * in lock-step fashion and find the first point where it differs.
> But these is no operator== defined in the SymExpr class and hence I am
> unable to check the equality or otherwise of two symbols in the SymbolRefs.
>
> Is there a way to compare two SymbolRefs and check for the differences?
> Otherwise, is there a better way to check whether the symbolic value at the
> same program point but in two different program states are the same or not,
> and if they are different, is there a way to identify the difference?
> Maybe, my approach of comparing two symbolic expressions at a program point
> is not the right approach in the first place.
>
> Thanks.
>
> Regards,
> Venu.
>
> On Mon, Feb 6, 2017 at 2:42 PM, Venugopal Raghavan <venur2005 at gmail.com>
> wrote:
>
>> Hi Aleksei,
>>
>> Thanks. That seems to work.
>>
>> Regards,
>> Venu.
>>
>> On Mon, Feb 6, 2017 at 2:05 PM, Aleksei Sidorin <a.sidorin at samsung.com>
>> wrote:
>>
>>> Hello,
>>>
>>> On the point where PreStmt callbacks are called, the expression value is
>>> still not computed so the result of getSVal() is always UnknownVal and not
>>> a symbol. You should probably use PostStmt callback instead.
>>>
>>>
>>> 06.02.2017 08:21, Venugopal Raghavan пишет:
>>>
>>> Hi Aleksei,
>>>
>>> Thanks for the reply. Unfortunately, I am having further trouble with my
>>> checker implementation. I have the following code in my checker (called
>>> ConstantChecker):
>>>
>>>  void ConstantChecker::checkPreStmt(const Stmt *S, CheckerContext &C)
>>> const {
>>>         ProgramStateRef state = C.getState();
>>>         const LocationContext *lctx = C.getLocationContext();
>>>         SVal val = state->getSVal(S, lctx);
>>>         SymbolRef Sym = val.getAsSymbol();
>>>         // Stmt 1
>>>                       .
>>>                       .
>>>                       .
>>> }
>>>
>>> My input test case for the checker is test1.c and I run the
>>> command clang -cc1 -analyze -analyzer-checker=optin.performance.PathSpecificConstants
>>> test1.c where PathSpecificConstants is the name of my checker:
>>>
>>> test(int x) {
>>>    int y;
>>>    int a;
>>>    if (x > 0) {
>>>      y = 2;
>>>    }
>>>    else {
>>>      y = 3;
>>>    }
>>>    a = x + y;
>>>    printf("a = %d\n", a);
>>> }
>>>
>>> When I print "Sym" in gdb at the break point Stmt 1, I always get Sym as
>>> 0x0. This probably means that val is never a symbol for my test case and
>>> hence val.getAsSymbol() fails. I do not understand this. I would have
>>> thought that at least the expression x + y would be represented by a
>>> symbolic value such as $0. I tried doing const MemRegion *mem =
>>> val.getAsRegion() also at that  point and tried to print out mem in gdb.
>>> That also has a value 0x0 except for the printf statement. I cannot
>>> understand this too. I would have expected variables such as "x", "y" and
>>> "a" to be associated with a memory region and hence mem to have a non-null
>>> value at least at the points where y and a are being assigned to.
>>>
>>> My plan was to use Sym as the index into a custom map that I have
>>> created where I am storing the values of constants seen in the test case. I
>>> am unable to get this plan working since Sym is always 0x0.
>>>
>>> I have gone through the Checker Developer Manual. I am sorry to be
>>> asking specific questions about my code in this fashion, but,
>>> unfortunately, I am stuck at this point.
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Venugopal Raghavan.
>>>
>>> On Wed, Feb 1, 2017 at 11:13 PM, Aleksei Sidorin <a.sidorin at samsung.com>
>>> wrote:
>>>
>>>> Hello Venugopal,
>>>>
>>>> You will need to extract all the information from ExplodedNodes.
>>>> For example, state is a part of ExplodedNode. So, you can iterate over
>>>> ExplodedNodes of graph using smth like:
>>>> for (auto Iter = G.nodes_begin(), E = G.nodes_end(); I != E; ++I) {
>>>>   ProgramStateRef State = I->getState();
>>>>   // Do the stuff with State
>>>> }
>>>>
>>>>
>>>>
>>>>
>>>> 01.02.2017 12:19, Venugopal Raghavan пишет:
>>>>
>>>> Hi,
>>>>
>>>> I am registering my own map in the program state using the the
>>>> following:
>>>>
>>>> REGISTER_MAP_WITH_PROGRAMSTATE(ConstantMap, SymbolRef,
>>>> VariableConstVal)
>>>>
>>>> where VariableConstVal is the structure containing the information I
>>>> want in this map.
>>>>
>>>> I am populating this map in the callback checkPostStmt() using state =
>>>> state->set<ConstantMap>(sym, ...) while the state graph is being
>>>> constructed.
>>>>
>>>> At the end of analysis, I want to now read the information stored in
>>>> this map in the callback checkEndAnalysis(ExplodedGraph &G,
>>>> BugReporter &B, ExprEngine &Eng). However, I am not sure how to get the
>>>> SymbolRef associated with an ExplodedNode which is what I get when I
>>>> iterate over the ExplodedGraph at the end of the analysis.
>>>>
>>>> How can I get the SymbolRef that I can use to index into my map? In
>>>> checkPostStmt(stmt* S, CheckerContext &C), I used the following code to get
>>>> the SymbolRef that I can use to store the information in the map:
>>>>
>>>>   SVal val = state->getSVal(S, lctx);
>>>>   SymbolRef Sym = val.getAsSymbol();
>>>>
>>>> but in checkEndAnalysis(ExplodedGraph &G, BugReporter &B, ExprEngine
>>>> &Eng) I do not seem to have a handle to stmt* to get the SVal.
>>>>
>>>> I guess my question is probably not phrased very well, but I hope some
>>>> one can make some sense out of it. Also, I guess I have a long way to go
>>>> before I understand the data structures in the analyzer well. I checked the
>>>> other checkers, but none of them seem to exhibit exactly my requirement.
>>>>
>>>> Thanks.
>>>>
>>>> Regards,
>>>> Venugopal Raghavan.
>>>>
>>>> On Wed, Jan 25, 2017 at 3:19 PM, Venugopal Raghavan <
>>>> venur2005 at gmail.com> wrote:
>>>>
>>>>> Hi Aleksei,
>>>>>
>>>>> Thanks. I will check this.
>>>>>
>>>>> Regards,
>>>>> Venu.
>>>>>
>>>>> On Wed, Jan 25, 2017 at 3:15 PM, Aleksei Sidorin <
>>>>> a.sidorin at samsung.com> wrote:
>>>>>
>>>>>> Hello Venugopal,
>>>>>>
>>>>>> During analysis, you have access to a single path only. But after the
>>>>>> analysis is done, you can summarize information across different paths
>>>>>> using checkEndAnalysis() callback. You will get full ExplodedGraph built
>>>>>> for the function so you will be able to look into its every node.
>>>>>>
>>>>>>
>>>>>> 25.01.2017 09:25, Venugopal Raghavan via cfe-dev пишет:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Firstly, apologies for the long email below.
>>>>>>
>>>>>> I am new to the Clang Static Analyzer (CLA). My interest in CLA was
>>>>>> more to see if I can solve quickly path-sensitive data-flow analysis
>>>>>> problems than as a vehicle to build bug-finding checkers. I needed the
>>>>>> solutions to such data flow problems to extract some behavioral properties
>>>>>> from some test cases I have. For example, consider the following piece of
>>>>>> code:
>>>>>>
>>>>>> if (cond) {
>>>>>>    x = 4;
>>>>>> }
>>>>>> else {
>>>>>>    x = 5;
>>>>>> }
>>>>>>
>>>>>> L: .... = ...x..; // Use of variable x
>>>>>>
>>>>>> I want to identify program locations such as "L" in the code above
>>>>>> where x is not a constant if you aggregate data-flow information across
>>>>>> paths, but, on the other hand, is actually a constant if you have
>>>>>> path-specific data flow information. Since CFA does path-specific analysis,
>>>>>> I was curious to know if it would help me with such tasks. The "bug-report"
>>>>>> I want at location L is that x has a value 4, given the path constraint for
>>>>>> the path including the "then-path" of the if statement and that x has a
>>>>>> value 5 along the else-path.
>>>>>>
>>>>>> I started writing a checker to accomplish the above task, but was
>>>>>> quickly blocked by some basic doubts. My rough plan for the checker was to
>>>>>> maintain a map in ProgramState which maps ProgramPoints (or, maybe program
>>>>>> symbols) to constants when those variables indeed have constant values at
>>>>>> the ProgramPoint. For example, when CFA expands states along the then-path
>>>>>> and reaches "L", my map would say that at ProgramPoint "L", variable x has
>>>>>> a constant value 4. Then, when CFA expands nodes along the else-path, I
>>>>>> guess it would again create a state corresponding to L which now says that
>>>>>> x has the constant value 5. I want to be able to catch this scenario where
>>>>>> the same variable at the same ProgramPoint has two different constant
>>>>>> values depending on the path taken.
>>>>>>
>>>>>> However, the issue is that, when the state graph along the else-path
>>>>>> is expanded, it would no longer have any of the map contents that was built
>>>>>> along the disjoint then-path. How then can I get access to the then-path
>>>>>> information when I am expanding the path along the else-path? Since the
>>>>>> checker is also required to be stateless, I cannot maintain auxiliary
>>>>>> information in the checker that is persistent across multiple calls to the
>>>>>> call-back function.
>>>>>>
>>>>>> Can you help me with this question? Maybe, I am trying to use CFA in
>>>>>> ways it was not meant to be used viz. as a vehicle to solve data flow
>>>>>> problems which consults information cutting across paths. Maybe, the CFA is
>>>>>> meant to handle problems which require the examination of information
>>>>>> restricted to single paths. Is that the case? Or, am I missing something?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Regards,
>>>>>> Venugopal Raghavan.
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> cfe-dev mailing listcfe-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Aleksei Sidorin,
>>>>>> SRR, Samsung Electronics
>>>>>>
>>>>>> --
>>>> Best regards,
>>>> Aleksei Sidorin,
>>>> SRR, Samsung Electronics
>>>>
>>>> --
>>> Best regards,
>>> Aleksei Sidorin,
>>> SRR, Samsung Electronics
>>>
>>> --
> Best regards,
> Aleksei Sidorin,
> SRR, Samsung Electronics
>
> --
> Best regards,
> Aleksei Sidorin,
> SRR, Samsung Electronics
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170213/8869c6f0/attachment.html>


More information about the cfe-dev mailing list