[cfe-dev] Question on inspecting information across program paths
Aleksei Sidorin via cfe-dev
cfe-dev at lists.llvm.org
Tue Feb 14 03:50:45 PST 2017
Hi Venugopal,
There is a way to do it. The simplest way, as I guess, is to create a
chain of ExplodedNodes with your states. The code will look similar to this:
ProgramStateRef InitialState = C.getState();
ProgramStateRef MarkedState = InitialState->set<...>(...);
if (ExplodedNode *MarkedNode =
C.generate[NonFatalError?]Node(MarkedState)) {
ExplodedNode *NonMarkedNode =
C.generate[NonFatalError?]Node(InitialState, MarkedNode))
}
So, the chain will look like this:
<InitialState, InitialNode> => <MarkedState, MarkedNode> =>
<InitialState, NonMarkedNode>
But if you want to just add some mark to a newly-created node, you can
not use state at all. Instead, you can just create a node with a
specified tag.
static SimpleProgramPointTag CustomTag("My marked node!");
ExplodedNode *NodeWithSpecialTag =
C.generate[NonFatalError?]Node(NewState, C.getPredecessor(), &CustomTag);
13.02.2017 15:57, Venugopal Raghavan пишет:
> Hi Aleksei,
>
> Thanks, I have re-written the checker based on your suggestion.
>
> The custom program state that a checker adds to a program state gets
> propagated to succeeding states also unless they are explicitly
> removed. Is there a way to write some data into the custom state which
> is valid only for that state and does not get propagated to succeeding
> states? I did something like this:
>
> ProgramStateRef state = C.getState(); // C is CheckerContext
> const LocationContext *lctx = C.getLocationContext();
> SVal val = state->getSVal(S, lctx);
> SymbolRef Sym = val.getAsSymbol(); // Symbol I want to be added to
> this state and to this state alone
>
> // Remove existing custom state data
> const CustomStateTy &Syms = state->get<CustomState>();
> for (auto I = Syms.begin(), E = Syms.end(); I != E; ++I) {
> SymbolRef Sym1 = *I;
> state = state->remove<CustomState>(Sym1);
> }
> state = state->add<CustomState>(Sym); // Add the symbol that is
> needed for the current state
>
> The idea was to remove custom state information from a state and then
> write the information (symbol) corresponding to that state. Thus I
> write the symbol to a state S and that creates a new state S1 with
> that symbol. S itself is not modified since states are immutable. Now,
> the state which succeeds S1 (which I call S2) also gets the same
> symbol through state propagation which should however not be
> associated with that state. Hence I try remove the symbol from S2 but
> it does not actually remove it from that S2 but creates a new state S3
> with the symbol removed (due to the immutable nature of states).
>
> Finally, I would still be left with states such as S2 which has
> information which do not belong to them. Can I fix this in some way?
>
> Not sure if this is a requirement outside the support provided by the
> analyzer or I have created a requirement which is unnatural.
>
> Thanks.
>
> Regards,
> Venu.
>
> On Thu, Feb 9, 2017 at 8:54 PM, Aleksei Sidorin <a.sidorin at samsung.com
> <mailto:a.sidorin at samsung.com>> wrote:
>
> Hello Venugopal,
>
> Unfortunately, there is no a simple way to get "symbol diff". Your
> approach with sequential iteration seems right to me in general.
> About comparison of symbols. If you want just ensure that some
> symbols are same or different (like reg_$0<x> and reg_$1<y> which
> are different), you can just compare their pointers (SymbolRefs).
> Symbols are not modifiable, every symbol is allocated in the pool
> and lives here, SymbolRefs just point on it. If these pointers are
> different, symbols are different too, and equal otherwise.
>
>
> 09.02.2017 18:20, Aleksei Sidorin пишет:
>> Hello Venugopal,
>>
>> If you want just ensure that some symbols are same (like
>> reg_$0<x> and reg_$1<y>), you can just compare their pointers
>> (SymbolRefs). Symbols are not modifiable, every symbol is
>> allocated in the pool and lives here, SymbolRefs just point on
>> it. If these pointers are different, symbols are different too,
>> and equal otherwise.
>>
>>
>> 09.02.2017 13:03, Venugopal Raghavan пишет:
>>> Hi,
>>>
>>> Apologies, but, I am back again.
>>>
>>> I have two SymbolRef pointers (SymExpr *). One SymbolRef
>>> is (reg_$0<x>) + 3$117. The other SymbolRef is (reg_$0<x>) +
>>> 2$130. x is a variable in my program and these SymbolRefs
>>> respresent an assignment of x + y to a variable "a". y has the
>>> value 3 along one path to the program point where "a" is
>>> assigned and it has the value 2 along a second path to the same
>>> point, hence the two SymbolRefs differ in the constants.
>>>
>>> I want to compare these two SymbolRefs and find out where they
>>> differ. In this case, I want to be able to say that they differ
>>> in the constant value. I thought of using symbol_iterator to
>>> iterate over each SymbolRef or SymExpr * in lock-step fashion
>>> and find the first point where it differs. But these is no
>>> operator== defined in the SymExpr class and hence I am unable to
>>> check the equality or otherwise of two symbols in the SymbolRefs.
>>>
>>> Is there a way to compare two SymbolRefs and check for the
>>> differences? Otherwise, is there a better way to check whether
>>> the symbolic value at the same program point but in two
>>> different program states are the same or not, and if they are
>>> different, is there a way to identify the difference? Maybe, my
>>> approach of comparing two symbolic expressions at a program
>>> point is not the right approach in the first place.
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Venu.
>>>
>>> On Mon, Feb 6, 2017 at 2:42 PM, Venugopal Raghavan
>>> <venur2005 at gmail.com <mailto:venur2005 at gmail.com>> wrote:
>>>
>>> Hi Aleksei,
>>>
>>> Thanks. That seems to work.
>>>
>>> Regards,
>>> Venu.
>>>
>>> On Mon, Feb 6, 2017 at 2:05 PM, Aleksei Sidorin
>>> <a.sidorin at samsung.com <mailto:a.sidorin at samsung.com>> wrote:
>>>
>>> Hello,
>>>
>>> On the point where PreStmt callbacks are called, the
>>> expression value is still not computed so the result of
>>> getSVal() is always UnknownVal and not a symbol. You
>>> should probably use PostStmt callback instead.
>>>
>>>
>>> 06.02.2017 08:21, Venugopal Raghavan пишет:
>>>> Hi Aleksei,
>>>>
>>>> Thanks for the reply. Unfortunately, I am having
>>>> further trouble with my checker implementation. I have
>>>> the following code in my checker (called ConstantChecker):
>>>>
>>>> void ConstantChecker::checkPreStmt(const Stmt *S,
>>>> CheckerContext &C) const {
>>>> ProgramStateRef state = C.getState();
>>>> const LocationContext *lctx =
>>>> C.getLocationContext();
>>>> SVal val = state->getSVal(S, lctx);
>>>> SymbolRef Sym = val.getAsSymbol();
>>>> // Stmt 1
>>>> .
>>>> .
>>>> .
>>>> }
>>>> My input test case for the checker is test1.c and I run
>>>> the command clang -cc1 -analyze
>>>> -analyzer-checker=optin.performance.PathSpecificConstants
>>>> test1.c where PathSpecificConstants is the name of my
>>>> checker:
>>>>
>>>> test(int x) {
>>>> int y;
>>>> int a;
>>>> if (x > 0) {
>>>> y = 2;
>>>> }
>>>> else {
>>>> y = 3;
>>>> }
>>>> a = x + y;
>>>> printf("a = %d\n", a);
>>>> }
>>>>
>>>> When I print "Sym" in gdb at the break point Stmt 1, I
>>>> always get Sym as 0x0. This probably means that val is
>>>> never a symbol for my test case and hence
>>>> val.getAsSymbol() fails. I do not understand this. I
>>>> would have thought that at least the expression x + y
>>>> would be represented by a symbolic value such as $0. I
>>>> tried doing const MemRegion *mem = val.getAsRegion()
>>>> also at that point and tried to print out mem in gdb.
>>>> That also has a value 0x0 except for the printf
>>>> statement. I cannot understand this too. I would have
>>>> expected variables such as "x", "y" and "a" to be
>>>> associated with a memory region and hence mem to have a
>>>> non-null value at least at the points where y and a are
>>>> being assigned to.
>>>>
>>>> My plan was to use Sym as the index into a custom map
>>>> that I have created where I am storing the values of
>>>> constants seen in the test case. I am unable to get
>>>> this plan working since Sym is always 0x0.
>>>>
>>>> I have gone through the Checker Developer Manual. I am
>>>> sorry to be asking specific questions about my code in
>>>> this fashion, but, unfortunately, I am stuck at this point.
>>>>
>>>> Thanks.
>>>>
>>>> Regards,
>>>> Venugopal Raghavan.
>>>>
>>>> On Wed, Feb 1, 2017 at 11:13 PM, Aleksei Sidorin
>>>> <a.sidorin at samsung.com <mailto:a.sidorin at samsung.com>>
>>>> wrote:
>>>>
>>>> Hello Venugopal,
>>>>
>>>> You will need to extract all the information from
>>>> ExplodedNodes.
>>>> For example, state is a part of ExplodedNode. So,
>>>> you can iterate over ExplodedNodes of graph using
>>>> smth like:
>>>> for (auto Iter = G.nodes_begin(), E =
>>>> G.nodes_end(); I != E; ++I) {
>>>> ProgramStateRef State = I->getState();
>>>> // Do the stuff with State
>>>> }
>>>>
>>>>
>>>>
>>>>
>>>> 01.02.2017 12:19, Venugopal Raghavan пишет:
>>>>> Hi,
>>>>>
>>>>> I am registering my own map in the program state
>>>>> using the the following:
>>>>>
>>>>> REGISTER_MAP_WITH_PROGRAMSTATE(ConstantMap,
>>>>> SymbolRef, VariableConstVal)
>>>>>
>>>>> where VariableConstVal is the structure containing
>>>>> the information I want in this map.
>>>>>
>>>>> I am populating this map in the
>>>>> callback checkPostStmt() using state =
>>>>> state->set<ConstantMap>(sym, ...) while the state
>>>>> graph is being constructed.
>>>>>
>>>>> At the end of analysis, I want to now read the
>>>>> information stored in this map in the
>>>>> callback checkEndAnalysis(ExplodedGraph &G,
>>>>> BugReporter &B, ExprEngine &Eng). However, I am
>>>>> not sure how to get the SymbolRef associated with
>>>>> an ExplodedNode which is what I get when I iterate
>>>>> over the ExplodedGraph at the end of the analysis.
>>>>>
>>>>> How can I get the SymbolRef that I can use to
>>>>> index into my map? In checkPostStmt(stmt* S,
>>>>> CheckerContext &C), I used the following code to
>>>>> get the SymbolRef that I can use to store the
>>>>> information in the map:
>>>>>
>>>>> SVal val = state->getSVal(S, lctx);
>>>>> SymbolRef Sym = val.getAsSymbol();
>>>>>
>>>>> but in checkEndAnalysis(ExplodedGraph &G,
>>>>> BugReporter &B, ExprEngine &Eng) I do not seem to
>>>>> have a handle to stmt* to get the SVal.
>>>>>
>>>>> I guess my question is probably not phrased very
>>>>> well, but I hope some one can make some sense out
>>>>> of it. Also, I guess I have a long way to go
>>>>> before I understand the data structures in the
>>>>> analyzer well. I checked the other checkers, but
>>>>> none of them seem to exhibit exactly my requirement.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Regards,
>>>>> Venugopal Raghavan.
>>>>>
>>>>> On Wed, Jan 25, 2017 at 3:19 PM, Venugopal
>>>>> Raghavan <venur2005 at gmail.com
>>>>> <mailto:venur2005 at gmail.com>> wrote:
>>>>>
>>>>> Hi Aleksei,
>>>>>
>>>>> Thanks. I will check this.
>>>>>
>>>>> Regards,
>>>>> Venu.
>>>>>
>>>>> On Wed, Jan 25, 2017 at 3:15 PM, Aleksei
>>>>> Sidorin <a.sidorin at samsung.com
>>>>> <mailto:a.sidorin at samsung.com>> wrote:
>>>>>
>>>>> Hello Venugopal,
>>>>>
>>>>> During analysis, you have access to a
>>>>> single path only. But after the analysis
>>>>> is done, you can summarize information
>>>>> across different paths using
>>>>> checkEndAnalysis() callback. You will get
>>>>> full ExplodedGraph built for the function
>>>>> so you will be able to look into its every
>>>>> node.
>>>>>
>>>>>
>>>>> 25.01.2017 09:25, Venugopal Raghavan via
>>>>> cfe-dev пишет:
>>>>>> Hi,
>>>>>>
>>>>>> Firstly, apologies for the long email below.
>>>>>>
>>>>>> I am new to the Clang Static Analyzer
>>>>>> (CLA). My interest in CLA was more to see
>>>>>> if I can solve quickly path-sensitive
>>>>>> data-flow analysis problems than as a
>>>>>> vehicle to build bug-finding checkers. I
>>>>>> needed the solutions to such data flow
>>>>>> problems to extract some behavioral
>>>>>> properties from some test cases I have.
>>>>>> For example, consider the following piece
>>>>>> of code:
>>>>>>
>>>>>> if (cond) {
>>>>>> x = 4;
>>>>>> }
>>>>>> else {
>>>>>> x = 5;
>>>>>> }
>>>>>>
>>>>>> L: .... = ...x..; // Use of variable x
>>>>>>
>>>>>> I want to identify program locations such
>>>>>> as "L" in the code above where x is not a
>>>>>> constant if you aggregate data-flow
>>>>>> information across paths, but, on the
>>>>>> other hand, is actually a constant if you
>>>>>> have path-specific data flow information.
>>>>>> Since CFA does path-specific analysis, I
>>>>>> was curious to know if it would help me
>>>>>> with such tasks. The "bug-report" I want
>>>>>> at location L is that x has a value 4,
>>>>>> given the path constraint for the path
>>>>>> including the "then-path" of the if
>>>>>> statement and that x has a value 5 along
>>>>>> the else-path.
>>>>>>
>>>>>> I started writing a checker to accomplish
>>>>>> the above task, but was quickly blocked
>>>>>> by some basic doubts. My rough plan for
>>>>>> the checker was to maintain a map in
>>>>>> ProgramState which maps ProgramPoints
>>>>>> (or, maybe program symbols) to constants
>>>>>> when those variables indeed have constant
>>>>>> values at the ProgramPoint. For example,
>>>>>> when CFA expands states along the
>>>>>> then-path and reaches "L", my map would
>>>>>> say that at ProgramPoint "L", variable x
>>>>>> has a constant value 4. Then, when CFA
>>>>>> expands nodes along the else-path, I
>>>>>> guess it would again create a state
>>>>>> corresponding to L which now says that x
>>>>>> has the constant value 5. I want to be
>>>>>> able to catch this scenario where the
>>>>>> same variable at the same ProgramPoint
>>>>>> has two different constant values
>>>>>> depending on the path taken.
>>>>>>
>>>>>> However, the issue is that, when the
>>>>>> state graph along the else-path is
>>>>>> expanded, it would no longer have any of
>>>>>> the map contents that was built along the
>>>>>> disjoint then-path. How then can I get
>>>>>> access to the then-path information when
>>>>>> I am expanding the path along the
>>>>>> else-path? Since the checker is also
>>>>>> required to be stateless, I cannot
>>>>>> maintain auxiliary information in the
>>>>>> checker that is persistent across
>>>>>> multiple calls to the call-back function.
>>>>>>
>>>>>> Can you help me with this question?
>>>>>> Maybe, I am trying to use CFA in ways it
>>>>>> was not meant to be used viz. as a
>>>>>> vehicle to solve data flow problems which
>>>>>> consults information cutting across
>>>>>> paths. Maybe, the CFA is meant to handle
>>>>>> problems which require the examination of
>>>>>> information restricted to single paths.
>>>>>> Is that the case? Or, am I missing something?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Regards,
>>>>>> Venugopal Raghavan.
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> cfe-dev mailing list
>>>>>> cfe-dev at lists.llvm.org
>>>>>> <mailto:cfe-dev at lists.llvm.org>
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>> <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Aleksei Sidorin,
>>>>> SRR, Samsung Electronics
>>>>>
>>>> --
>>>> Best regards,
>>>> Aleksei Sidorin,
>>>> SRR, Samsung Electronics
>>>>
>>> --
>>> Best regards,
>>> Aleksei Sidorin,
>>> SRR, Samsung Electronics
>>>
>> --
>> Best regards,
>> Aleksei Sidorin,
>> SRR, Samsung Electronics
>
> --
> Best regards,
> Aleksei Sidorin,
> SRR, Samsung Electronics
>
--
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170214/a5531441/attachment.html>
More information about the cfe-dev
mailing list