[cfe-dev] PR16833 - Analyzer hangs trying to analyze large case ranges

Aleksei Sidorin a.sidorin at samsung.com
Thu Sep 4 23:13:09 PDT 2014


Thanks Jordan.

I have participated in this conversation and it was the thing that 
pointed me to the source of analyzer misbehavior I face on some files. 
This problem was critical for me and, considering your ideas, I have 
tried to implement a solution that fixes a problem (and it seems to work 
in my case).

The source of the problem we faced was that we can correctly handle "LOW 
<= $sym <= HIGH" assumptions but cannot handle their opposite in the 
_one_ ProgramState as it needed to get correct assumptions for 'default' 
case. Using given assumes, we can handle "$sym < LOW" or "$sym > HIGH" 
but not both "$sym < LOW || $sym > HIGH" because there is a non-constant 
symbolic expression in both parts of an expression so we cannot reason 
about it now. I have implemented a helper method that returns a pair of 
states: "LOW <= $sym <= HIGH" in the first state and its opposite in the 
second one. The first state is used for the current case, second state 
is used to check another cases until 'default' case.

Zach, sorry for not CC'ing you in review message.

> Looks like Aleksei's taken some of these ideas and run with them in 
> http://reviews.llvm.org/D5102. Aleksei, I haven't had time to look at 
> your implementation yet, but if you haven't seen it yet here's some of 
> the issues Zach and I were discussing.
>
> Jordan
>
>
> On Aug 21, 2014, at 20:13 , Jordan Rose <jordan_rose at apple.com 
> <mailto:jordan_rose at apple.com>> wrote:
>
>> Hm, okay. Here's some basic structure for what happens now:
>>
>> 1. SimpleConstraintManager::canReasonAbout decides whether or not a 
>> symbolic expression is even worth processing.
>> 2. SimpleConstraintManager::assumeAux (the one that takes a NonLoc) 
>> does some preprocessing and canonicalizing so that it can identify 
>> the parts of a constraint.
>> 3. SimpleConstraintManager::assumeSymRel makes sure all the numbers 
>> involved are compatible, and then dispatches to the various 
>> RangeConstraintManager methods
>> 4. RangeConstraintManager::assumeSym* handle the six possible 
>> relational operators and the minute tricky differences between them, 
>> applying the new constraints as intersections with the existing range 
>> sets.
>>
>> If you really want to put this into the constraint manager, I think I 
>> would /still/ do this the "sneaky way", by canonicalizing your "LOW < 
>> $sym < HIGH" into "$sym - LOW < HIGH - low" in 
>> SimpleConstraintManager::assumeAux. That achieves maximal code reuse. 
>> If you weren't to do it this way, I think you'd have to introduce a 
>> new assumeInRange method that could be overridden; this would 
>> probably be easier to get right but I'm not sure it would actually 
>> add value.
>>
>> As a side note, I would consider writing your disjunction with & 
>> rather than &&. Because of &&'s short-circuiting, you'd never 
>> actually see that expression show up in assume() otherwise, and 
>> handling & would make it possible to handle similar cases in user 
>> code in the future.
>>
>> Please continue to e-mail with any questions you may have!
>> Jordan
>>
>>
>> On Aug 18, 2014, at 7:41 , Zach Davis <zdavkeos at gmail.com 
>> <mailto:zdavkeos at gmail.com>> wrote:
>>
>>> Let me see if I understand what you are saying:
>>>
>>> We are able to symbolically evaluate (correctly) the "x >= lower && x
>>> <= upper" expression with SVal's. However, we can't 'remember' this
>>> result because we don't construct a Constraint on the expression, and
>>> even if we did the existing RangeConstraint manager doesn't support
>>> it.
>>>
>>> I feel like the RangeConstraint manager is really the best way to fix
>>> this, even if it's not easy.  It doesn't seem to be a huge issue
>>> anyway, if someone _really_ wants to analyse a file with such an
>>> expression, they can use the patch i supplied in the mean time.
>>>
>>>
>>> Unfortunately, looking through the Constraint manager(s) code, I have
>>> no idea where to start.
>>>
>>> Zach
>>>
>>> On Tue, Aug 12, 2014 at 9:38 PM, Jordan Rose <jordan_rose at apple.com 
>>> <mailto:jordan_rose at apple.com>> wrote:
>>>> That FIXME, and indeed the code to support case ranges, actually 
>>>> predates
>>>> RangeConstraintManager, which implements most of what the FIXME is 
>>>> referring
>>>> to. (That is, the technical/historical reason we couldn't do it 
>>>> symbolically
>>>> is "it wasn't implemented yet".) So your change is in fact on the right
>>>> track.
>>>>
>>>> However, what you've done is construct the expression "$x >= LOWER 
>>>> && $x <=
>>>> UPPER". This is correct, but it's not a form that 
>>>> RangeConstraintManager
>>>> will actually handle right now:
>>>>
>>>> void clang_analyzer_eval(int);
>>>> void test(int i) {
>>>> switch (i) {
>>>> case 1 ... 5:
>>>> return;
>>>> }
>>>> clang_analyzer_eval(x == 2); // should print FALSE, but will print 
>>>> UNKNOWN
>>>> with your patch.
>>>> }
>>>>
>>>> (clang_analyzer_eval is a special function used to probe the 
>>>> analyzer state,
>>>> handled by the debug.ExprInspection checker.)
>>>>
>>>> The easiest way to fix this would be to use assumeDual on each 
>>>> expression
>>>> individually. Then you end up with three cases: $x < LOWER, $x is 
>>>> in range,
>>>> and $x > UPPER. Unfortunately, you'd then have to restructure the whole
>>>> function, which currently assumes that you end up with exactly one 
>>>> state
>>>> that can end up in the "default" branch. That wouldn't be too hard.
>>>>
>>>> The harder way to solve this is to actually teach 
>>>> [Range]ConstraintManager
>>>> to recognize constraints like this, and do the right thing. The
>>>> implementation is already set up to handle disjoint sets of ranges, 
>>>> so you
>>>> could actually do this without changing everything. It's probably 
>>>> more messy
>>>> than you'd like, though.
>>>>
>>>> The sneaky way to do this is to realize that "$x >= LOWER && $x <= 
>>>> UPPER" is
>>>> actually equivalent to "$x - (LOWER - MIN) <= UPPER - (LOWER - 
>>>> MIN)", where
>>>> MIN is the minimum value for the integer type, for integers with 
>>>> wraparound
>>>> semantics. (RangeConstraintManager actually takes advantage of this in
>>>> reverse to translate "$x - OFFSET <= VALUE" back into a pair of range
>>>> disjunctions.) The only trouble with doing something like this is that
>>>> wraparound is Hardâ„¢ and you/we will want to be very careful that this
>>>> actually handles everything correctly, for all the integer types we 
>>>> care
>>>> about. A definite benefit of this is that we're back to only having one
>>>> state, which prevents potential exponential explosion as analysis 
>>>> continues.
>>>>
>>>> What do you think we should do here?
>>>> Jordan
>>>>
>>>>
>>>> On Aug 12, 2014, at 12:39 , Zach Davis <zdavkeos at gmail.com 
>>>> <mailto:zdavkeos at gmail.com>> wrote:
>>>>
>>>> All-
>>>>
>>>> I was looking over 16833 and had a few questions.
>>>>
>>>> The issue is the analyzer tries to concretize all the values in a case
>>>> statement with a range value:
>>>>
>>>>   switch (a) {
>>>>       case 1 ... 0xfffff: // hang's trying each value in the range
>>>>          return 1;
>>>>       default:
>>>>          return 0;
>>>>   }
>>>>
>>>> Is there any reason we can't symbolically evaluate 1 <= a <= 0xfffff,
>>>> rather than trying each 'a == v for v in 1..0xfffff'? I've attached a
>>>> patch that attempts to do this and it seems to work (it no longer
>>>> hangs anyway).  Is there a technical/historical reason for doing it
>>>> the current way?
>>>>
>>>> The FIXME above the block mentions using '"ranges" for NonLVals'. I'm
>>>> not sure what this means, or if any work has been done toward this.
>>>> If not, would there be any value in pursing a patch like this in the
>>>> meantime to resolve the performance issue?
>>>>
>>>> Thanks,
>>>> Zach
>>>> <16833_hang_on_large_case_range.patch>_______________________________________________
>>>> cfe-dev mailing list
>>>> cfe-dev at cs.uiuc.edu <mailto:cfe-dev at cs.uiuc.edu>
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>>>>
>>>>
>>
>


-- 
Best regards,
Aleksei Sidorin
Software Engineer,
IMSWL-IMCG, SRR, Samsung Electronics

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140905/106561ab/attachment.html>


More information about the cfe-dev mailing list