[cfe-dev] Question on inspecting information across program paths

Wed Feb 15 21:21:19 PST 2017

Hi Aleksei,

Once again thanks a lot for your inputs. I have modified the checker to
create a chain of ExplodedNodes as you suggested and will test it.

BTW, I used C.addTransition(state, E) instead of
C.generateNonFatalErrorNode(InitialState,
MarkedNode)) because I did not find a version of generateNonFatalErrorNode()
whose second argument is the predecessor ExplodedNode to which the current
node needs to be attached.

Regards,
Venu.

On Tue, Feb 14, 2017 at 5:20 PM, Aleksei Sidorin <a.sidorin at samsung.com>
wrote:

> Hi Venugopal,
>
> There is a way to do it. The simplest way, as I guess, is to create a
> chain of ExplodedNodes with your states. The code will look similar to this:
>
> ProgramStateRef InitialState = C.getState();
> ProgramStateRef MarkedState = InitialState->set<...>(...);
> if (ExplodedNode *MarkedNode = C.generate[NonFatalError?]Node(MarkedState))
> {
>   ExplodedNode *NonMarkedNode = C.generate[NonFatalError?]Node(InitialState,
> MarkedNode))
> }
>
> So, the chain will look like this:
> <InitialState, InitialNode> => <MarkedState, MarkedNode> => <InitialState,
> NonMarkedNode>
>
> But if you want to just add some mark to a newly-created node, you can not
> use state at all. Instead, you can just create a node with a specified tag.
>
> static SimpleProgramPointTag CustomTag("My marked node!");
> ExplodedNode *NodeWithSpecialTag = C.generate[NonFatalError?]Node(NewState,
> C.getPredecessor(), &CustomTag);
>
>
>
> 13.02.2017 15:57, Venugopal Raghavan пишет:
>
> Hi Aleksei,
>
> Thanks, I have re-written the checker based on your suggestion.
>
> The custom program state that a checker adds to a program state gets
> propagated to succeeding states also unless they are explicitly removed. Is
> there a way to write some data into the custom state which is valid only
> for that state and does not get propagated to succeeding states? I did
> something like this:
>
> ProgramStateRef state = C.getState();   // C is CheckerContext
> const LocationContext *lctx = C.getLocationContext();
> SVal val = state->getSVal(S, lctx);
> SymbolRef Sym = val.getAsSymbol();  // Symbol I want to be added to this
> state and to this state alone
>
> // Remove existing custom state data
> const CustomStateTy &Syms = state->get<CustomState>();
>  for (auto I = Syms.begin(), E = Syms.end(); I != E; ++I) {
>      SymbolRef Sym1 = *I;
>     state = state->remove<CustomState>(Sym1);
>  }
> state = state->add<CustomState>(Sym);  // Add the symbol that is needed
> for the current state
>
> The idea was to remove custom state information from a state and then
> write the information (symbol) corresponding to that state. Thus I write
> the symbol to a state S and that creates a new state S1 with that symbol. S
> itself is not modified since states are immutable. Now, the state which
> succeeds S1 (which I call S2) also gets the same symbol through state
> propagation which should however not be associated with that state. Hence I
> try remove the symbol from S2 but it does not actually remove it from that
> S2 but creates a new state S3 with the symbol removed (due to the immutable
> nature of states).
>
> Finally, I would still be left with states such as S2 which has
> information which do not belong to them. Can I fix this in some way?
>
> Not sure if this is a requirement outside the support provided by the
> analyzer or I have created a requirement which is unnatural.
>
> Thanks.
>
> Regards,
> Venu.
>
> On Thu, Feb 9, 2017 at 8:54 PM, Aleksei Sidorin <a.sidorin at samsung.com>
> wrote:
>
>> Hello Venugopal,
>>
>> Unfortunately, there is no a simple way to get "symbol diff". Your
>> approach with sequential iteration seems right to me in general.
>> About comparison of symbols. If you want just ensure that some symbols
>> are same or different (like reg_$0<x> and reg_$1<y> which are different),
>> you can just compare their pointers (SymbolRefs). Symbols are not
>> modifiable, every symbol is allocated in the pool and lives here,
>> SymbolRefs just point on it. If these pointers are different, symbols are
>> different too, and equal otherwise.
>>
>>
>> 09.02.2017 18:20, Aleksei Sidorin пишет:
>>
>> Hello Venugopal,
>>
>> If you want just ensure that some symbols are same (like reg_$0<x> and
>> reg_$1<y>), you can just compare their pointers (SymbolRefs). Symbols are
>> not modifiable, every symbol is allocated in the pool and lives here,
>> SymbolRefs just point on it. If these pointers are different, symbols are
>> different too, and equal otherwise.
>>
>>
>> 09.02.2017 13:03, Venugopal Raghavan пишет:
>>
>> Hi,
>>
>> Apologies, but, I am back again.
>>
>> I have two SymbolRef pointers (SymExpr *). One SymbolRef is (reg_$0<x>) +
>> 3$117. The other SymbolRef is (reg_$0<x>) + 2$130. x is a variable in my
>> program and these SymbolRefs respresent an assignment of x + y to a
>> variable "a". y has the value 3 along one path to the program point where
>> "a" is assigned and it has the value 2 along a second path to the same
>> point, hence the two SymbolRefs differ in the constants.
>>
>> I want to compare these two SymbolRefs and find out where they differ. In
>> this case, I want to be able to say that they differ in the constant value.
>> I thought of using symbol_iterator to iterate over each SymbolRef or
>> SymExpr * in lock-step fashion and find the first point where it differs.
>> But these is no operator== defined in the SymExpr class and hence I am
>> unable to check the equality or otherwise of two symbols in the SymbolRefs.
>>
>> Is there a way to compare two SymbolRefs and check for the differences?
>> Otherwise, is there a better way to check whether the symbolic value at the
>> same program point but in two different program states are the same or not,
>> and if they are different, is there a way to identify the difference?
>> Maybe, my approach of comparing two symbolic expressions at a program point
>> is not the right approach in the first place.
>>
>> Thanks.
>>
>> Regards,
>> Venu.
>>
>> On Mon, Feb 6, 2017 at 2:42 PM, Venugopal Raghavan <venur2005 at gmail.com>
>> wrote:
>>
>>> Hi Aleksei,
>>>
>>> Thanks. That seems to work.
>>>
>>> Regards,
>>> Venu.
>>>
>>> On Mon, Feb 6, 2017 at 2:05 PM, Aleksei Sidorin <a.sidorin at samsung.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> On the point where PreStmt callbacks are called, the expression value
>>>> is still not computed so the result of getSVal() is always UnknownVal and
>>>> not a symbol. You should probably use PostStmt callback instead.
>>>>
>>>>
>>>> 06.02.2017 08:21, Venugopal Raghavan пишет:
>>>>
>>>> Hi Aleksei,
>>>>
>>>> Thanks for the reply. Unfortunately, I am having further trouble with
>>>> my checker implementation. I have the following code in my checker (called
>>>> ConstantChecker):
>>>>
>>>>  void ConstantChecker::checkPreStmt(const Stmt *S, CheckerContext &C)
>>>> const {
>>>>         ProgramStateRef state = C.getState();
>>>>         const LocationContext *lctx = C.getLocationContext();
>>>>         SVal val = state->getSVal(S, lctx);
>>>>         SymbolRef Sym = val.getAsSymbol();
>>>>         // Stmt 1
>>>>                       .
>>>>                       .
>>>>                       .
>>>> }
>>>>
>>>> My input test case for the checker is test1.c and I run the
>>>> command clang -cc1 -analyze -analyzer-checker=optin.performance.PathSpecificConstants
>>>> test1.c where PathSpecificConstants is the name of my checker:
>>>>
>>>> test(int x) {
>>>>    int y;
>>>>    int a;
>>>>    if (x > 0) {
>>>>      y = 2;
>>>>    }
>>>>    else {
>>>>      y = 3;
>>>>    }
>>>>    a = x + y;
>>>>    printf("a = %d\n", a);
>>>> }
>>>>
>>>> When I print "Sym" in gdb at the break point Stmt 1, I always get Sym
>>>> as 0x0. This probably means that val is never a symbol for my test case and
>>>> hence val.getAsSymbol() fails. I do not understand this. I would have
>>>> thought that at least the expression x + y would be represented by a
>>>> symbolic value such as $0. I tried doing const MemRegion *mem =
>>>> val.getAsRegion() also at that  point and tried to print out mem in gdb.
>>>> That also has a value 0x0 except for the printf statement. I cannot
>>>> understand this too. I would have expected variables such as "x", "y" and
>>>> "a" to be associated with a memory region and hence mem to have a non-null
>>>> value at least at the points where y and a are being assigned to.
>>>>
>>>> My plan was to use Sym as the index into a custom map that I have
>>>> created where I am storing the values of constants seen in the test case. I
>>>> am unable to get this plan working since Sym is always 0x0.
>>>>
>>>> I have gone through the Checker Developer Manual. I am sorry to be
>>>> asking specific questions about my code in this fashion, but,
>>>> unfortunately, I am stuck at this point.
>>>>
>>>> Thanks.
>>>>
>>>> Regards,
>>>> Venugopal Raghavan.
>>>>
>>>> On Wed, Feb 1, 2017 at 11:13 PM, Aleksei Sidorin <a.sidorin at samsung.com
>>>> > wrote:
>>>>
>>>>> Hello Venugopal,
>>>>>
>>>>> You will need to extract all the information from ExplodedNodes.
>>>>> For example, state is a part of ExplodedNode. So, you can iterate over
>>>>> ExplodedNodes of graph using smth like:
>>>>> for (auto Iter = G.nodes_begin(), E = G.nodes_end(); I != E; ++I) {
>>>>>   ProgramStateRef State = I->getState();
>>>>>   // Do the stuff with State
>>>>> }
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> 01.02.2017 12:19, Venugopal Raghavan пишет:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am registering my own map in the program state using the the
>>>>> following:
>>>>>
>>>>> REGISTER_MAP_WITH_PROGRAMSTATE(ConstantMap, SymbolRef,
>>>>> VariableConstVal)
>>>>>
>>>>> where VariableConstVal is the structure containing the information I
>>>>> want in this map.
>>>>>
>>>>> I am populating this map in the callback checkPostStmt() using state =
>>>>> state->set<ConstantMap>(sym, ...) while the state graph is being
>>>>> constructed.
>>>>>
>>>>> At the end of analysis, I want to now read the information stored in
>>>>> this map in the callback checkEndAnalysis(ExplodedGraph &G,
>>>>> BugReporter &B, ExprEngine &Eng). However, I am not sure how to get the
>>>>> SymbolRef associated with an ExplodedNode which is what I get when I
>>>>> iterate over the ExplodedGraph at the end of the analysis.
>>>>>
>>>>> How can I get the SymbolRef that I can use to index into my map? In
>>>>> checkPostStmt(stmt* S, CheckerContext &C), I used the following code to get
>>>>> the SymbolRef that I can use to store the information in the map:
>>>>>
>>>>>   SVal val = state->getSVal(S, lctx);
>>>>>   SymbolRef Sym = val.getAsSymbol();
>>>>>
>>>>> but in checkEndAnalysis(ExplodedGraph &G, BugReporter &B, ExprEngine
>>>>> &Eng) I do not seem to have a handle to stmt* to get the SVal.
>>>>>
>>>>> I guess my question is probably not phrased very well, but I hope some
>>>>> one can make some sense out of it. Also, I guess I have a long way to go
>>>>> before I understand the data structures in the analyzer well. I checked the
>>>>> other checkers, but none of them seem to exhibit exactly my requirement.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Regards,
>>>>> Venugopal Raghavan.
>>>>>
>>>>> On Wed, Jan 25, 2017 at 3:19 PM, Venugopal Raghavan <
>>>>> venur2005 at gmail.com> wrote:
>>>>>
>>>>>> Hi Aleksei,
>>>>>>
>>>>>> Thanks. I will check this.
>>>>>>
>>>>>> Regards,
>>>>>> Venu.
>>>>>>
>>>>>> On Wed, Jan 25, 2017 at 3:15 PM, Aleksei Sidorin <
>>>>>> a.sidorin at samsung.com> wrote:
>>>>>>
>>>>>>> Hello Venugopal,
>>>>>>>
>>>>>>> During analysis, you have access to a single path only. But after
>>>>>>> the analysis is done, you can summarize information across different paths
>>>>>>> using checkEndAnalysis() callback. You will get full ExplodedGraph built
>>>>>>> for the function so you will be able to look into its every node.
>>>>>>>
>>>>>>>
>>>>>>> 25.01.2017 09:25, Venugopal Raghavan via cfe-dev пишет:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Firstly, apologies for the long email below.
>>>>>>>
>>>>>>> I am new to the Clang Static Analyzer (CLA). My interest in CLA was
>>>>>>> more to see if I can solve quickly path-sensitive data-flow analysis
>>>>>>> problems than as a vehicle to build bug-finding checkers. I needed the
>>>>>>> solutions to such data flow problems to extract some behavioral properties
>>>>>>> from some test cases I have. For example, consider the following piece of
>>>>>>> code:
>>>>>>>
>>>>>>> if (cond) {
>>>>>>>    x = 4;
>>>>>>> }
>>>>>>> else {
>>>>>>>    x = 5;
>>>>>>> }
>>>>>>>
>>>>>>> L: .... = ...x..; // Use of variable x
>>>>>>>
>>>>>>> I want to identify program locations such as "L" in the code above
>>>>>>> where x is not a constant if you aggregate data-flow information across
>>>>>>> paths, but, on the other hand, is actually a constant if you have
>>>>>>> path-specific data flow information. Since CFA does path-specific analysis,
>>>>>>> I was curious to know if it would help me with such tasks. The "bug-report"
>>>>>>> I want at location L is that x has a value 4, given the path constraint for
>>>>>>> the path including the "then-path" of the if statement and that x has a
>>>>>>> value 5 along the else-path.
>>>>>>>
>>>>>>> I started writing a checker to accomplish the above task, but was
>>>>>>> quickly blocked by some basic doubts. My rough plan for the checker was to
>>>>>>> maintain a map in ProgramState which maps ProgramPoints (or, maybe program
>>>>>>> symbols) to constants when those variables indeed have constant values at
>>>>>>> the ProgramPoint. For example, when CFA expands states along the then-path
>>>>>>> and reaches "L", my map would say that at ProgramPoint "L", variable x has
>>>>>>> a constant value 4. Then, when CFA expands nodes along the else-path, I
>>>>>>> guess it would again create a state corresponding to L which now says that
>>>>>>> x has the constant value 5. I want to be able to catch this scenario where
>>>>>>> the same variable at the same ProgramPoint has two different constant
>>>>>>> values depending on the path taken.
>>>>>>>
>>>>>>> However, the issue is that, when the state graph along the else-path
>>>>>>> is expanded, it would no longer have any of the map contents that was built
>>>>>>> along the disjoint then-path. How then can I get access to the then-path
>>>>>>> information when I am expanding the path along the else-path? Since the
>>>>>>> checker is also required to be stateless, I cannot maintain auxiliary
>>>>>>> information in the checker that is persistent across multiple calls to the
>>>>>>> call-back function.
>>>>>>>
>>>>>>> Can you help me with this question? Maybe, I am trying to use CFA in
>>>>>>> ways it was not meant to be used viz. as a vehicle to solve data flow
>>>>>>> problems which consults information cutting across paths. Maybe, the CFA is
>>>>>>> meant to handle problems which require the examination of information
>>>>>>> restricted to single paths. Is that the case? Or, am I missing something?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Venugopal Raghavan.
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> cfe-dev mailing listcfe-dev at lists.llvm.orghttp://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Aleksei Sidorin,
>>>>>>> SRR, Samsung Electronics
>>>>>>>
>>>>>>> --
>>>>> Best regards,
>>>>> Aleksei Sidorin,
>>>>> SRR, Samsung Electronics
>>>>>
>>>>> --
>>>> Best regards,
>>>> Aleksei Sidorin,
>>>> SRR, Samsung Electronics
>>>>
>>>> --
>> Best regards,
>> Aleksei Sidorin,
>> SRR, Samsung Electronics
>>
>> --
>> Best regards,
>> Aleksei Sidorin,
>> SRR, Samsung Electronics
>>
>> --
> Best regards,
> Aleksei Sidorin,
> SRR, Samsung Electronics
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170216/dbd750d6/attachment.html>