[cfe-dev] Question on inspecting information across program paths

Venugopal Raghavan via cfe-dev cfe-dev at lists.llvm.org
Wed Feb 22 01:34:43 PST 2017


Hi Aleksei,

I am getting false positives in my checker when I try test cases with
loops. Consider the following test case:

test(int x) {
   int y = 0;
   int a = 0;
   int b;

   while (y++ < 10) {   // S1
     if (x < 0) {            // S2
       a = 1;
       break;
     }
     a++;
   }

   if (a < 10) {
     b = 0;
   }
   else {
     b = 1;                 // S3
   }
   printf("%d\n", b);
}

When the condition in S2 is false, it appears that the exploded graph does
not follow all 10 iterations of the while loop and hence that branch in the
exploded graph does not even hit S3 because the exploration stops with a
message "Block count exceeded". Hence, my checker thinks that S3 is
unreachable from S1 and I get a false positive.

Is there a way to increase the block count? Is the handling of
constant-trip count loops in a issue in the analyzer or is there is a fix
for this?

Thanks.

Regards,
Venu.

On Thu, Feb 16, 2017 at 10:51 AM, Venugopal Raghavan <venur2005 at gmail.com>
wrote:

> Hi Aleksei,
>
> Once again thanks a lot for your inputs. I have modified the checker to
> create a chain of ExplodedNodes as you suggested and will test it.
>
> BTW, I used C.addTransition(state, E) instead of C.generateNonFatalError
> Node(InitialState, MarkedNode)) because I did not find a version of
> generateNonFatalErrorNode() whose second argument is the predecessor
> ExplodedNode to which the current node needs to be attached.
>
> Regards,
> Venu.
>
> On Tue, Feb 14, 2017 at 5:20 PM, Aleksei Sidorin <a.sidorin at samsung.com>
> wrote:
>
>> Hi Venugopal,
>>
>> There is a way to do it. The simplest way, as I guess, is to create a
>> chain of ExplodedNodes with your states. The code will look similar to this:
>>
>> ProgramStateRef InitialState = C.getState();
>> ProgramStateRef MarkedState = InitialState->set<...>(...);
>> if (ExplodedNode *MarkedNode = C.generate[NonFatalError?]Node(MarkedState))
>> {
>>   ExplodedNode *NonMarkedNode = C.generate[NonFatalError?]Node(InitialState,
>> MarkedNode))
>> }
>>
>> So, the chain will look like this:
>> <InitialState, InitialNode> => <MarkedState, MarkedNode> =>
>> <InitialState, NonMarkedNode>
>>
>> But if you want to just add some mark to a newly-created node, you can
>> not use state at all. Instead, you can just create a node with a specified
>> tag.
>>
>> static SimpleProgramPointTag CustomTag("My marked node!");
>> ExplodedNode *NodeWithSpecialTag = C.generate[NonFatalError?]Node(NewState,
>> C.getPredecessor(), &CustomTag);
>>
>>
>>
>> 13.02.2017 15:57, Venugopal Raghavan пишет:
>>
>> Hi Aleksei,
>>
>> Thanks, I have re-written the checker based on your suggestion.
>>
>> The custom program state that a checker adds to a program state gets
>> propagated to succeeding states also unless they are explicitly removed. Is
>> there a way to write some data into the custom state which is valid only
>> for that state and does not get propagated to succeeding states? I did
>> something like this:
>>
>> ProgramStateRef state = C.getState();   // C is CheckerContext
>> const LocationContext *lctx = C.getLocationContext();
>> SVal val = state->getSVal(S, lctx);
>> SymbolRef Sym = val.getAsSymbol();  // Symbol I want to be added to this
>> state and to this state alone
>>
>> // Remove existing custom state data
>> const CustomStateTy &Syms = state->get<CustomState>();
>>  for (auto I = Syms.begin(), E = Syms.end(); I != E; ++I) {
>>      SymbolRef Sym1 = *I;
>>     state = state->remove<CustomState>(Sym1);
>>  }
>> state = state->add<CustomState>(Sym);  // Add the symbol that is needed
>> for the current state
>>
>> The idea was to remove custom state information from a state and then
>> write the information (symbol) corresponding to that state. Thus I write
>> the symbol to a state S and that creates a new state S1 with that symbol. S
>> itself is not modified since states are immutable. Now, the state which
>> succeeds S1 (which I call S2) also gets the same symbol through state
>> propagation which should however not be associated with that state. Hence I
>> try remove the symbol from S2 but it does not actually remove it from that
>> S2 but creates a new state S3 with the symbol removed (due to the immutable
>> nature of states).
>>
>> Finally, I would still be left with states such as S2 which has
>> information which do not belong to them. Can I fix this in some way?
>>
>> Not sure if this is a requirement outside the support provided by the
>> analyzer or I have created a requirement which is unnatural.
>>
>> Thanks.
>>
>> Regards,
>> Venu.
>>
>> On Thu, Feb 9, 2017 at 8:54 PM, Aleksei Sidorin <a.sidorin at samsung.com>
>> wrote:
>>
>>> Hello Venugopal,
>>>
>>> Unfortunately, there is no a simple way to get "symbol diff". Your
>>> approach with sequential iteration seems right to me in general.
>>> About comparison of symbols. If you want just ensure that some symbols
>>> are same or different (like reg_$0<x> and reg_$1<y> which are different),
>>> you can just compare their pointers (SymbolRefs). Symbols are not
>>> modifiable, every symbol is allocated in the pool and lives here,
>>> SymbolRefs just point on it. If these pointers are different, symbols are
>>> different too, and equal otherwise.
>>>
>>>
>>> 09.02.2017 18:20, Aleksei Sidorin пишет:
>>>
>>> Hello Venugopal,
>>>
>>> If you want just ensure that some symbols are same (like reg_$0<x> and
>>> reg_$1<y>), you can just compare their pointers (SymbolRefs). Symbols are
>>> not modifiable, every symbol is allocated in the pool and lives here,
>>> SymbolRefs just point on it. If these pointers are different, symbols are
>>> different too, and equal otherwise.
>>>
>>>
>>> 09.02.2017 13:03, Venugopal Raghavan пишет:
>>>
>>> Hi,
>>>
>>> Apologies, but, I am back again.
>>>
>>> I have two SymbolRef pointers (SymExpr *). One SymbolRef is (reg_$0<x>)
>>> + 3$117. The other SymbolRef is (reg_$0<x>) + 2$130. x is a variable in my
>>> program and these SymbolRefs respresent an assignment of x + y to a
>>> variable "a". y has the value 3 along one path to the program point where
>>> "a" is assigned and it has the value 2 along a second path to the same
>>> point, hence the two SymbolRefs differ in the constants.
>>>
>>> I want to compare these two SymbolRefs and find out where they differ.
>>> In this case, I want to be able to say that they differ in the constant
>>> value. I thought of using symbol_iterator to iterate over each SymbolRef or
>>> SymExpr * in lock-step fashion and find the first point where it differs.
>>> But these is no operator== defined in the SymExpr class and hence I am
>>> unable to check the equality or otherwise of two symbols in the SymbolRefs.
>>>
>>> Is there a way to compare two SymbolRefs and check for the differences?
>>> Otherwise, is there a better way to check whether the symbolic value at the
>>> same program point but in two different program states are the same or not,
>>> and if they are different, is there a way to identify the difference?
>>> Maybe, my approach of comparing two symbolic expressions at a program point
>>> is not the right approach in the first place.
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Venu.
>>>
>>> On Mon, Feb 6, 2017 at 2:42 PM, Venugopal Raghavan <venur2005 at gmail.com>
>>> wrote:
>>>
>>>> Hi Aleksei,
>>>>
>>>> Thanks. That seems to work.
>>>>
>>>> Regards,
>>>> Venu.
>>>>
>>>> On Mon, Feb 6, 2017 at 2:05 PM, Aleksei Sidorin <a.sidorin at samsung.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> On the point where PreStmt callbacks are called, the expression value
>>>>> is still not computed so the result of getSVal() is always UnknownVal and
>>>>> not a symbol. You should probably use PostStmt callback instead.
>>>>>
>>>>>
>>>>> 06.02.2017 08:21, Venugopal Raghavan пишет:
>>>>>
>>>>> Hi Aleksei,
>>>>>
>>>>> Thanks for the reply. Unfortunately, I am having further trouble with
>>>>> my checker implementation. I have the following code in my checker (called
>>>>> ConstantChecker):
>>>>>
>>>>>  void ConstantChecker::checkPreStmt(const Stmt *S, CheckerContext &C)
>>>>> const {
>>>>>         ProgramStateRef state = C.getState();
>>>>>         const LocationContext *lctx = C.getLocationContext();
>>>>>         SVal val = state->getSVal(S, lctx);
>>>>>         SymbolRef Sym = val.getAsSymbol();
>>>>>         // Stmt 1
>>>>>                       .
>>>>>                       .
>>>>>                       .
>>>>> }
>>>>>
>>>>> My input test case for the checker is test1.c and I run the
>>>>> command clang -cc1 -analyze -analyzer-checker=optin.performance.PathSpecificConstants
>>>>> test1.c where PathSpecificConstants is the name of my checker:
>>>>>
>>>>> test(int x) {
>>>>>    int y;
>>>>>    int a;
>>>>>    if (x > 0) {
>>>>>      y = 2;
>>>>>    }
>>>>>    else {
>>>>>      y = 3;
>>>>>    }
>>>>>    a = x + y;
>>>>>    printf("a = %d\n", a);
>>>>> }
>>>>>
>>>>> When I print "Sym" in gdb at the break point Stmt 1, I always get Sym
>>>>> as 0x0. This probably means that val is never a symbol for my test case and
>>>>> hence val.getAsSymbol() fails. I do not understand this. I would have
>>>>> thought that at least the expression x + y would be represented by a
>>>>> symbolic value such as $0. I tried doing const MemRegion *mem =
>>>>> val.getAsRegion() also at that  point and tried to print out mem in gdb.
>>>>> That also has a value 0x0 except for the printf statement. I cannot
>>>>> understand this too. I would have expected variables such as "x", "y" and
>>>>> "a" to be associated with a memory region and hence mem to have a non-null
>>>>> value at least at the points where y and a are being assigned to.
>>>>>
>>>>> My plan was to use Sym as the index into a custom map that I have
>>>>> created where I am storing the values of constants seen in the test case. I
>>>>> am unable to get this plan working since Sym is always 0x0.
>>>>>
>>>>> I have gone through the Checker Developer Manual. I am sorry to be
>>>>> asking specific questions about my code in this fashion, but,
>>>>> unfortunately, I am stuck at this point.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Regards,
>>>>> Venugopal Raghavan.
>>>>>
>>>>> On Wed, Feb 1, 2017 at 11:13 PM, Aleksei Sidorin <
>>>>> a.sidorin at samsung.com> wrote:
>>>>>
>>>>>> Hello Venugopal,
>>>>>>
>>>>>> You will need to extract all the information from ExplodedNodes.
>>>>>> For example, state is a part of ExplodedNode. So, you can iterate
>>>>>> over ExplodedNodes of graph using smth like:
>>>>>> for (auto Iter = G.nodes_begin(), E = G.nodes_end(); I != E; ++I) {
>>>>>>   ProgramStateRef State = I->getState();
>>>>>>   // Do the stuff with State
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 01.02.2017 12:19, Venugopal Raghavan пишет:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am registering my own map in the program state using the the
>>>>>> following:
>>>>>>
>>>>>> REGISTER_MAP_WITH_PROGRAMSTATE(ConstantMap, SymbolRef,
>>>>>> VariableConstVal)
>>>>>>
>>>>>> where VariableConstVal is the structure containing the information I
>>>>>> want in this map.
>>>>>>
>>>>>> I am populating this map in the callback checkPostStmt() using state
>>>>>> = state->set<ConstantMap>(sym, ...) while the state graph is being
>>>>>> constructed.
>>>>>>
>>>>>> At the end of analysis, I want to now read the information stored in
>>>>>> this map in the callback checkEndAnalysis(ExplodedGraph &G,
>>>>>> BugReporter &B, ExprEngine &Eng). However, I am not sure how to get the
>>>>>> SymbolRef associated with an ExplodedNode which is what I get when I
>>>>>> iterate over the ExplodedGraph at the end of the analysis.
>>>>>>
>>>>>> How can I get the SymbolRef that I can use to index into my map? In
>>>>>> checkPostStmt(stmt* S, CheckerContext &C), I used the following code to get
>>>>>> the SymbolRef that I can use to store the information in the map:
>>>>>>
>>>>>>   SVal val = state->getSVal(S, lctx);
>>>>>>   SymbolRef Sym = val.getAsSymbol();
>>>>>>
>>>>>> but in checkEndAnalysis(ExplodedGraph &G, BugReporter &B, ExprEngine
>>>>>> &Eng) I do not seem to have a handle to stmt* to get the SVal.
>>>>>>
>>>>>> I guess my question is probably not phrased very well, but I hope
>>>>>> some one can make some sense out of it. Also, I guess I have a long way to
>>>>>> go before I understand the data structures in the analyzer well. I checked
>>>>>> the other checkers, but none of them seem to exhibit exactly my requirement.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Regards,
>>>>>> Venugopal Raghavan.
>>>>>>
>>>>>> On Wed, Jan 25, 2017 at 3:19 PM, Venugopal Raghavan <
>>>>>> venur2005 at gmail.com> wrote:
>>>>>>
>>>>>>> Hi Aleksei,
>>>>>>>
>>>>>>> Thanks. I will check this.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Venu.
>>>>>>>
>>>>>>> On Wed, Jan 25, 2017 at 3:15 PM, Aleksei Sidorin <
>>>>>>> a.sidorin at samsung.com> wrote:
>>>>>>>
>>>>>>>> Hello Venugopal,
>>>>>>>>
>>>>>>>> During analysis, you have access to a single path only. But after
>>>>>>>> the analysis is done, you can summarize information across different paths
>>>>>>>> using checkEndAnalysis() callback. You will get full ExplodedGraph built
>>>>>>>> for the function so you will be able to look into its every node.
>>>>>>>>
>>>>>>>>
>>>>>>>> 25.01.2017 09:25, Venugopal Raghavan via cfe-dev пишет:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Firstly, apologies for the long email below.
>>>>>>>>
>>>>>>>> I am new to the Clang Static Analyzer (CLA). My interest in CLA was
>>>>>>>> more to see if I can solve quickly path-sensitive data-flow analysis
>>>>>>>> problems than as a vehicle to build bug-finding checkers. I needed the
>>>>>>>> solutions to such data flow problems to extract some behavioral properties
>>>>>>>> from some test cases I have. For example, consider the following piece of
>>>>>>>> code:
>>>>>>>>
>>>>>>>> if (cond) {
>>>>>>>>    x = 4;
>>>>>>>> }
>>>>>>>> else {
>>>>>>>>    x = 5;
>>>>>>>> }
>>>>>>>>
>>>>>>>> L: .... = ...x..; // Use of variable x
>>>>>>>>
>>>>>>>> I want to identify program locations such as "L" in the code above
>>>>>>>> where x is not a constant if you aggregate data-flow information across
>>>>>>>> paths, but, on the other hand, is actually a constant if you have
>>>>>>>> path-specific data flow information. Since CFA does path-specific analysis,
>>>>>>>> I was curious to know if it would help me with such tasks. The "bug-report"
>>>>>>>> I want at location L is that x has a value 4, given the path constraint for
>>>>>>>> the path including the "then-path" of the if statement and that x has a
>>>>>>>> value 5 along the else-path.
>>>>>>>>
>>>>>>>> I started writing a checker to accomplish the above task, but was
>>>>>>>> quickly blocked by some basic doubts. My rough plan for the checker was to
>>>>>>>> maintain a map in ProgramState which maps ProgramPoints (or, maybe program
>>>>>>>> symbols) to constants when those variables indeed have constant values at
>>>>>>>> the ProgramPoint. For example, when CFA expands states along the then-path
>>>>>>>> and reaches "L", my map would say that at ProgramPoint "L", variable x has
>>>>>>>> a constant value 4. Then, when CFA expands nodes along the else-path, I
>>>>>>>> guess it would again create a state corresponding to L which now says that
>>>>>>>> x has the constant value 5. I want to be able to catch this scenario where
>>>>>>>> the same variable at the same ProgramPoint has two different constant
>>>>>>>> values depending on the path taken.
>>>>>>>>
>>>>>>>> However, the issue is that, when the state graph along the
>>>>>>>> else-path is expanded, it would no longer have any of the map contents that
>>>>>>>> was built along the disjoint then-path. How then can I get access to the
>>>>>>>> then-path information when I am expanding the path along the else-path?
>>>>>>>> Since the checker is also required to be stateless, I cannot maintain
>>>>>>>> auxiliary information in the checker that is persistent across multiple
>>>>>>>> calls to the call-back function.
>>>>>>>>
>>>>>>>> Can you help me with this question? Maybe, I am trying to use CFA
>>>>>>>> in ways it was not meant to be used viz. as a vehicle to solve data flow
>>>>>>>> problems which consults information cutting across paths. Maybe, the CFA is
>>>>>>>> meant to handle problems which require the examination of information
>>>>>>>> restricted to single paths. Is that the case? Or, am I missing something?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Venugopal Raghavan.
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> cfe-dev mailing listcfe-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>> Aleksei Sidorin,
>>>>>>>> SRR, Samsung Electronics
>>>>>>>>
>>>>>>>> --
>>>>>> Best regards,
>>>>>> Aleksei Sidorin,
>>>>>> SRR, Samsung Electronics
>>>>>>
>>>>>> --
>>>>> Best regards,
>>>>> Aleksei Sidorin,
>>>>> SRR, Samsung Electronics
>>>>>
>>>>> --
>>> Best regards,
>>> Aleksei Sidorin,
>>> SRR, Samsung Electronics
>>>
>>> --
>>> Best regards,
>>> Aleksei Sidorin,
>>> SRR, Samsung Electronics
>>>
>>> --
>> Best regards,
>> Aleksei Sidorin,
>> SRR, Samsung Electronics
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170222/1bfddc21/attachment.html>


More information about the cfe-dev mailing list