[cfe-dev] Question on inspecting information across program paths

Aleksei Sidorin via cfe-dev cfe-dev at lists.llvm.org
Thu Feb 9 07:24:17 PST 2017


Hello Venugopal,

Unfortunately, there is no a simple way to get "symbol diff". Your 
approach with sequential iteration seems right to me in general.
About comparison of symbols. If you want just ensure that some symbols 
are same or different (like reg_$0<x> and reg_$1<y> which are 
different), you can just compare their pointers (SymbolRefs). Symbols 
are not modifiable, every symbol is allocated in the pool and lives 
here, SymbolRefs just point on it. If these pointers are different, 
symbols are different too, and equal otherwise.


09.02.2017 18:20, Aleksei Sidorin пишет:
> Hello Venugopal,
>
> If you want just ensure that some symbols are same (like reg_$0<x> and 
> reg_$1<y>), you can just compare their pointers (SymbolRefs). Symbols 
> are not modifiable, every symbol is allocated in the pool and lives 
> here, SymbolRefs just point on it. If these pointers are different, 
> symbols are different too, and equal otherwise.
>
>
> 09.02.2017 13:03, Venugopal Raghavan пишет:
>> Hi,
>>
>> Apologies, but, I am back again.
>>
>> I have two SymbolRef pointers (SymExpr *). One SymbolRef 
>> is (reg_$0<x>) + 3$117. The other SymbolRef is (reg_$0<x>) + 2$130. x 
>> is a variable in my program and these SymbolRefs respresent an 
>> assignment of x + y to a variable "a". y has the value 3 along one 
>> path to the program point where "a" is assigned and it has the value 
>> 2 along a second path to the same point, hence the two SymbolRefs 
>> differ in the constants.
>>
>> I want to compare these two SymbolRefs and find out where they 
>> differ. In this case, I want to be able to say that they differ in 
>> the constant value. I thought of using symbol_iterator to iterate 
>> over each SymbolRef or SymExpr * in lock-step fashion and find the 
>> first point where it differs. But these is no operator== defined in 
>> the SymExpr class and hence I am unable to check the equality or 
>> otherwise of two symbols in the SymbolRefs.
>>
>> Is there a way to compare two SymbolRefs and check for the 
>> differences? Otherwise, is there a better way to check whether the 
>> symbolic value at the same program point but in two different program 
>> states are the same or not, and if they are different, is there a way 
>> to identify the difference? Maybe, my approach of comparing two 
>> symbolic expressions at a program point is not the right approach in 
>> the first place.
>>
>> Thanks.
>>
>> Regards,
>> Venu.
>>
>> On Mon, Feb 6, 2017 at 2:42 PM, Venugopal Raghavan 
>> <venur2005 at gmail.com <mailto:venur2005 at gmail.com>> wrote:
>>
>>     Hi Aleksei,
>>
>>     Thanks. That seems to work.
>>
>>     Regards,
>>     Venu.
>>
>>     On Mon, Feb 6, 2017 at 2:05 PM, Aleksei Sidorin
>>     <a.sidorin at samsung.com <mailto:a.sidorin at samsung.com>> wrote:
>>
>>         Hello,
>>
>>         On the point where PreStmt callbacks are called, the
>>         expression value is still not computed so the result of
>>         getSVal() is always UnknownVal and not a symbol. You should
>>         probably use PostStmt callback instead.
>>
>>
>>         06.02.2017 08:21, Venugopal Raghavan пишет:
>>>         Hi Aleksei,
>>>
>>>         Thanks for the reply. Unfortunately, I am having further
>>>         trouble with my checker implementation. I have the following
>>>         code in my checker (called ConstantChecker):
>>>
>>>          void ConstantChecker::checkPreStmt(const Stmt *S,
>>>         CheckerContext &C) const {
>>>                 ProgramStateRef state = C.getState();
>>>                 const LocationContext *lctx = C.getLocationContext();
>>>                 SVal val = state->getSVal(S, lctx);
>>>                 SymbolRef Sym = val.getAsSymbol();
>>>                 // Stmt 1
>>>                               .
>>>                               .
>>>                               .
>>>         }
>>>         My input test case for the checker is test1.c and I run the
>>>         command clang -cc1 -analyze
>>>         -analyzer-checker=optin.performance.PathSpecificConstants
>>>         test1.c where PathSpecificConstants is the name of my checker:
>>>
>>>         test(int x) {
>>>            int y;
>>>            int a;
>>>            if (x > 0) {
>>>              y = 2;
>>>            }
>>>            else {
>>>              y = 3;
>>>            }
>>>            a = x + y;
>>>            printf("a = %d\n", a);
>>>         }
>>>
>>>         When I print "Sym" in gdb at the break point Stmt 1, I
>>>         always get Sym as 0x0. This probably means that val is never
>>>         a symbol for my test case and hence val.getAsSymbol() fails.
>>>         I do not understand this. I would have thought that at least
>>>         the expression x + y would be represented by a symbolic
>>>         value such as $0. I tried doing const MemRegion *mem =
>>>         val.getAsRegion() also at that  point and tried to print out
>>>         mem in gdb. That also has a value 0x0 except for the printf
>>>         statement. I cannot understand this too. I would have
>>>         expected variables such as "x", "y" and "a" to be associated
>>>         with a memory region and hence mem to have a non-null value
>>>         at least at the points where y and a are being assigned to.
>>>
>>>         My plan was to use Sym as the index into a custom map that I
>>>         have created where I am storing the values of constants seen
>>>         in the test case. I am unable to get this plan working since
>>>         Sym is always 0x0.
>>>
>>>         I have gone through the Checker Developer Manual. I am sorry
>>>         to be asking specific questions about my code in this
>>>         fashion, but, unfortunately, I am stuck at this point.
>>>
>>>         Thanks.
>>>
>>>         Regards,
>>>         Venugopal Raghavan.
>>>
>>>         On Wed, Feb 1, 2017 at 11:13 PM, Aleksei Sidorin
>>>         <a.sidorin at samsung.com <mailto:a.sidorin at samsung.com>> wrote:
>>>
>>>             Hello Venugopal,
>>>
>>>             You will need to extract all the information from
>>>             ExplodedNodes.
>>>             For example, state is a part of ExplodedNode. So, you
>>>             can iterate over ExplodedNodes of graph using smth like:
>>>             for (auto Iter = G.nodes_begin(), E = G.nodes_end(); I
>>>             != E; ++I) {
>>>               ProgramStateRef State = I->getState();
>>>               // Do the stuff with State
>>>             }
>>>
>>>
>>>
>>>
>>>             01.02.2017 12:19, Venugopal Raghavan пишет:
>>>>             Hi,
>>>>
>>>>             I am registering my own map in the program state using
>>>>             the the following:
>>>>
>>>>             REGISTER_MAP_WITH_PROGRAMSTATE(ConstantMap, SymbolRef,
>>>>             VariableConstVal)
>>>>
>>>>             where VariableConstVal is the structure containing the
>>>>             information I want in this map.
>>>>
>>>>             I am populating this map in the
>>>>             callback checkPostStmt() using state =
>>>>             state->set<ConstantMap>(sym, ...) while the state graph
>>>>             is being constructed.
>>>>
>>>>             At the end of analysis, I want to now read the
>>>>             information stored in this map in the
>>>>             callback checkEndAnalysis(ExplodedGraph &G, BugReporter
>>>>             &B, ExprEngine &Eng). However, I am not sure how to get
>>>>             the SymbolRef associated with an ExplodedNode which is
>>>>             what I get when I iterate over the ExplodedGraph at the
>>>>             end of the analysis.
>>>>
>>>>             How can I get the SymbolRef that I can use to index
>>>>             into my map? In checkPostStmt(stmt* S, CheckerContext
>>>>             &C), I used the following code to get the SymbolRef
>>>>             that I can use to store the information in the map:
>>>>
>>>>               SVal val = state->getSVal(S, lctx);
>>>>               SymbolRef Sym = val.getAsSymbol();
>>>>
>>>>             but in checkEndAnalysis(ExplodedGraph &G, BugReporter
>>>>             &B, ExprEngine &Eng) I do not seem to have a handle to
>>>>             stmt* to get the SVal.
>>>>
>>>>             I guess my question is probably not phrased very well,
>>>>             but I hope some one can make some sense out of it.
>>>>             Also, I guess I have a long way to go before I
>>>>             understand the data structures in the analyzer well. I
>>>>             checked the other checkers, but none of them seem to
>>>>             exhibit exactly my requirement.
>>>>
>>>>             Thanks.
>>>>
>>>>             Regards,
>>>>             Venugopal Raghavan.
>>>>
>>>>             On Wed, Jan 25, 2017 at 3:19 PM, Venugopal Raghavan
>>>>             <venur2005 at gmail.com <mailto:venur2005 at gmail.com>> wrote:
>>>>
>>>>                 Hi Aleksei,
>>>>
>>>>                 Thanks. I will check this.
>>>>
>>>>                 Regards,
>>>>                 Venu.
>>>>
>>>>                 On Wed, Jan 25, 2017 at 3:15 PM, Aleksei Sidorin
>>>>                 <a.sidorin at samsung.com
>>>>                 <mailto:a.sidorin at samsung.com>> wrote:
>>>>
>>>>                     Hello Venugopal,
>>>>
>>>>                     During analysis, you have access to a single
>>>>                     path only. But after the analysis is done, you
>>>>                     can summarize information across different
>>>>                     paths using checkEndAnalysis() callback. You
>>>>                     will get full ExplodedGraph built for the
>>>>                     function so you will be able to look into its
>>>>                     every node.
>>>>
>>>>
>>>>                     25.01.2017 09:25, Venugopal Raghavan via
>>>>                     cfe-dev пишет:
>>>>>                     Hi,
>>>>>
>>>>>                     Firstly, apologies for the long email below.
>>>>>
>>>>>                     I am new to the Clang Static Analyzer (CLA).
>>>>>                     My interest in CLA was more to see if I can
>>>>>                     solve quickly path-sensitive data-flow
>>>>>                     analysis problems than as a vehicle to build
>>>>>                     bug-finding checkers. I needed the solutions
>>>>>                     to such data flow problems to extract some
>>>>>                     behavioral properties from some test cases I
>>>>>                     have. For example, consider the following
>>>>>                     piece of code:
>>>>>
>>>>>                     if (cond) {
>>>>>                        x = 4;
>>>>>                     }
>>>>>                     else {
>>>>>                        x = 5;
>>>>>                     }
>>>>>
>>>>>                     L: .... = ...x..; // Use of variable x
>>>>>
>>>>>                     I want to identify program locations such as
>>>>>                     "L" in the code above where x is not a
>>>>>                     constant if you aggregate data-flow
>>>>>                     information across paths, but, on the other
>>>>>                     hand, is actually a constant if you have
>>>>>                     path-specific data flow information. Since CFA
>>>>>                     does path-specific analysis, I was curious to
>>>>>                     know if it would help me with such tasks. The
>>>>>                     "bug-report" I want at location L is that x
>>>>>                     has a value 4, given the path constraint for
>>>>>                     the path including the "then-path" of the if
>>>>>                     statement and that x has a value 5 along the
>>>>>                     else-path.
>>>>>
>>>>>                     I started writing a checker to accomplish the
>>>>>                     above task, but was quickly blocked by some
>>>>>                     basic doubts. My rough plan for the checker
>>>>>                     was to maintain a map in ProgramState which
>>>>>                     maps ProgramPoints (or, maybe program symbols)
>>>>>                     to constants when those variables indeed have
>>>>>                     constant values at the ProgramPoint. For
>>>>>                     example, when CFA expands states along the
>>>>>                     then-path and reaches "L", my map would say
>>>>>                     that at ProgramPoint "L", variable x has a
>>>>>                     constant value 4. Then, when CFA expands nodes
>>>>>                     along the else-path, I guess it would again
>>>>>                     create a state corresponding to L which now
>>>>>                     says that x has the constant value 5. I want
>>>>>                     to be able to catch this scenario where the
>>>>>                     same variable at the same ProgramPoint has two
>>>>>                     different constant values depending on the
>>>>>                     path taken.
>>>>>
>>>>>                     However, the issue is that, when the state
>>>>>                     graph along the else-path is expanded, it
>>>>>                     would no longer have any of the map contents
>>>>>                     that was built along the disjoint then-path.
>>>>>                     How then can I get access to the then-path
>>>>>                     information when I am expanding the path along
>>>>>                     the else-path? Since the checker is also
>>>>>                     required to be stateless, I cannot maintain
>>>>>                     auxiliary information in the checker that is
>>>>>                     persistent across multiple calls to the
>>>>>                     call-back function.
>>>>>
>>>>>                     Can you help me with this question? Maybe, I
>>>>>                     am trying to use CFA in ways it was not meant
>>>>>                     to be used viz. as a vehicle to solve data
>>>>>                     flow problems which consults information
>>>>>                     cutting across paths. Maybe, the CFA is meant
>>>>>                     to handle problems which require the
>>>>>                     examination of information restricted to
>>>>>                     single paths. Is that the case? Or, am I
>>>>>                     missing something?
>>>>>
>>>>>                     Thanks.
>>>>>
>>>>>                     Regards,
>>>>>                     Venugopal Raghavan.
>>>>>
>>>>>
>>>>>                     _______________________________________________
>>>>>                     cfe-dev mailing list
>>>>>                     cfe-dev at lists.llvm.org
>>>>>                     <mailto:cfe-dev at lists.llvm.org>
>>>>>                     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>>                     <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>>>
>>>>                     -- 
>>>>                     Best regards,
>>>>                     Aleksei Sidorin,
>>>>                     SRR, Samsung Electronics
>>>>
>>>             -- 
>>>             Best regards,
>>>             Aleksei Sidorin,
>>>             SRR, Samsung Electronics
>>>
>>         -- 
>>         Best regards,
>>         Aleksei Sidorin,
>>         SRR, Samsung Electronics
>>
> -- 
> Best regards,
> Aleksei Sidorin,
> SRR, Samsung Electronics

-- 
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20170209/c7670ebd/attachment.html>


More information about the cfe-dev mailing list