[cfe-dev] PR16833 - Analyzer hangs trying to analyze large case ranges

Zach Davis zdavkeos at gmail.com
Fri Sep 5 13:32:19 PDT 2014


I'm glad you have had more success than me.  The road I've been going down
has caused much head-scratching and little progress.

Zach


On Fri, Sep 5, 2014 at 1:13 AM, Aleksei Sidorin <a.sidorin at samsung.com>
wrote:

>  Thanks Jordan.
>
> I have participated in this conversation and it was the thing that pointed
> me to the source of analyzer misbehavior I face on some files. This problem
> was critical for me and, considering your ideas, I have tried to implement
> a solution that fixes a problem (and it seems to work in my case).
>
> The source of the problem we faced was that we can correctly handle "LOW
> <= $sym <= HIGH" assumptions but cannot handle their opposite in the _one_
> ProgramState as it needed to get correct assumptions for 'default' case.
> Using given assumes, we can handle "$sym < LOW" or "$sym > HIGH" but not
> both "$sym < LOW || $sym > HIGH" because there is a non-constant symbolic
> expression in both parts of an expression so we cannot reason about it now.
> I have implemented a helper method that returns a pair of states: "LOW <=
> $sym <= HIGH" in the first state and its opposite in the second one. The
> first state is used for the current case, second state is used to check
> another cases until 'default' case.
>
> Zach, sorry for not CC'ing you in review message.
>
>  Looks like Aleksei's taken some of these ideas and run with them in
> http://reviews.llvm.org/D5102. Aleksei, I haven't had time to look at
> your implementation yet, but if you haven't seen it yet here's some of the
> issues Zach and I were discussing.
>
>  Jordan
>
>
>  On Aug 21, 2014, at 20:13 , Jordan Rose <jordan_rose at apple.com> wrote:
>
>  Hm, okay. Here's some basic structure for what happens now:
>
>  1. SimpleConstraintManager::canReasonAbout decides whether or not a
> symbolic expression is even worth processing.
> 2. SimpleConstraintManager::assumeAux (the one that takes a NonLoc) does
> some preprocessing and canonicalizing so that it can identify the parts of
> a constraint.
> 3. SimpleConstraintManager::assumeSymRel makes sure all the numbers
> involved are compatible, and then dispatches to the various
> RangeConstraintManager methods
> 4. RangeConstraintManager::assumeSym* handle the six possible relational
> operators and the minute tricky differences between them, applying the new
> constraints as intersections with the existing range sets.
>
>  If you really want to put this into the constraint manager, I think I
> would *still* do this the "sneaky way", by canonicalizing your "LOW <
> $sym < HIGH" into "$sym - LOW < HIGH - low" in
> SimpleConstraintManager::assumeAux. That achieves maximal code reuse. If
> you weren't to do it this way, I think you'd have to introduce a new
> assumeInRange method that could be overridden; this would probably be
> easier to get right but I'm not sure it would actually add value.
>
>  As a side note, I would consider writing your disjunction with & rather
> than &&. Because of &&'s short-circuiting, you'd never actually see that
> expression show up in assume() otherwise, and handling & would make it
> possible to handle similar cases in user code in the future.
>
>  Please continue to e-mail with any questions you may have!
> Jordan
>
>
>  On Aug 18, 2014, at 7:41 , Zach Davis <zdavkeos at gmail.com> wrote:
>
> Let me see if I understand what you are saying:
>
> We are able to symbolically evaluate (correctly) the "x >= lower && x
> <= upper" expression with SVal's. However, we can't 'remember' this
> result because we don't construct a Constraint on the expression, and
> even if we did the existing RangeConstraint manager doesn't support
> it.
>
> I feel like the RangeConstraint manager is really the best way to fix
> this, even if it's not easy.  It doesn't seem to be a huge issue
> anyway, if someone _really_ wants to analyse a file with such an
> expression, they can use the patch i supplied in the mean time.
>
>
> Unfortunately, looking through the Constraint manager(s) code, I have
> no idea where to start.
>
> Zach
>
> On Tue, Aug 12, 2014 at 9:38 PM, Jordan Rose <jordan_rose at apple.com>
> wrote:
>
> That FIXME, and indeed the code to support case ranges, actually predates
> RangeConstraintManager, which implements most of what the FIXME is
> referring
> to. (That is, the technical/historical reason we couldn't do it
> symbolically
> is "it wasn't implemented yet".) So your change is in fact on the right
> track.
>
> However, what you've done is construct the expression "$x >= LOWER && $x <=
> UPPER". This is correct, but it's not a form that RangeConstraintManager
> will actually handle right now:
>
> void clang_analyzer_eval(int);
> void test(int i) {
> switch (i) {
> case 1 ... 5:
> return;
> }
> clang_analyzer_eval(x == 2); // should print FALSE, but will print UNKNOWN
> with your patch.
> }
>
> (clang_analyzer_eval is a special function used to probe the analyzer
> state,
> handled by the debug.ExprInspection checker.)
>
> The easiest way to fix this would be to use assumeDual on each expression
> individually. Then you end up with three cases: $x < LOWER, $x is in range,
> and $x > UPPER. Unfortunately, you'd then have to restructure the whole
> function, which currently assumes that you end up with exactly one state
> that can end up in the "default" branch. That wouldn't be too hard.
>
> The harder way to solve this is to actually teach [Range]ConstraintManager
> to recognize constraints like this, and do the right thing. The
> implementation is already set up to handle disjoint sets of ranges, so you
> could actually do this without changing everything. It's probably more
> messy
> than you'd like, though.
>
> The sneaky way to do this is to realize that "$x >= LOWER && $x <= UPPER"
> is
> actually equivalent to "$x - (LOWER - MIN) <= UPPER - (LOWER - MIN)", where
> MIN is the minimum value for the integer type, for integers with wraparound
> semantics. (RangeConstraintManager actually takes advantage of this in
> reverse to translate "$x - OFFSET <= VALUE" back into a pair of range
> disjunctions.) The only trouble with doing something like this is that
> wraparound is Hardâ„¢ and you/we will want to be very careful that this
> actually handles everything correctly, for all the integer types we care
> about. A definite benefit of this is that we're back to only having one
> state, which prevents potential exponential explosion as analysis
> continues.
>
> What do you think we should do here?
> Jordan
>
>
> On Aug 12, 2014, at 12:39 , Zach Davis <zdavkeos at gmail.com> wrote:
>
> All-
>
> I was looking over 16833 and had a few questions.
>
> The issue is the analyzer tries to concretize all the values in a case
> statement with a range value:
>
>   switch (a) {
>       case 1 ... 0xfffff: // hang's trying each value in the range
>          return 1;
>       default:
>          return 0;
>   }
>
> Is there any reason we can't symbolically evaluate 1 <= a <= 0xfffff,
> rather than trying each 'a == v for v in 1..0xfffff'? I've attached a
> patch that attempts to do this and it seems to work (it no longer
> hangs anyway).  Is there a technical/historical reason for doing it
> the current way?
>
> The FIXME above the block mentions using '"ranges" for NonLVals'. I'm
> not sure what this means, or if any work has been done toward this.
> If not, would there be any value in pursing a patch like this in the
> meantime to resolve the performance issue?
>
> Thanks,
> Zach
>
> <16833_hang_on_large_case_range.patch>_______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
>
>
>
>
> --
> Best regards,
> Aleksei Sidorin
> Software Engineer,
> IMSWL-IMCG, SRR, Samsung Electronics
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20140905/9de6d394/attachment.html>


More information about the cfe-dev mailing list