<div dir="ltr"><div>I'm glad you have had more success than me. The road I've been going down has caused much head-scratching and little progress.<br><br></div>Zach<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Sep 5, 2014 at 1:13 AM, Aleksei Sidorin <span dir="ltr"><<a href="mailto:a.sidorin@samsung.com" target="_blank">a.sidorin@samsung.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>Thanks Jordan.<br>
<br>
I have participated in this conversation and it was the thing that
pointed me to the source of analyzer misbehavior I face on some
files. This problem was critical for me and, considering your
ideas, I have tried to implement a solution that fixes a problem
(and it seems to work in my case).<br>
<br>
The source of the problem we faced was that we can correctly
handle "LOW <= $sym <= HIGH" assumptions but cannot handle
their opposite in the _one_ ProgramState as it needed to get
correct assumptions for 'default' case. Using given assumes, we
can handle "$sym < LOW" or "$sym > HIGH" but not both "$sym
< LOW || $sym > HIGH" because there is a non-constant
symbolic expression in both parts of an expression so we cannot
reason about it now. I have implemented a helper method that
returns a pair of states: "LOW <= $sym <= HIGH" in the first
state and its opposite in the second one. The first state is used
for the current case, second state is used to check another cases
until 'default' case.<br>
<br>
Zach, sorry for not CC'ing you in review message.<br>
<br>
</div><div><div class="h5">
<blockquote type="cite">
<div>Looks like Aleksei's taken some of these ideas and run with
them in <a href="http://reviews.llvm.org/D5102" target="_blank">http://reviews.llvm.org/D5102</a>.
Aleksei, I haven't had time to look at your implementation yet,
but if you haven't seen it yet here's some of the issues Zach
and I were discussing.</div>
<div><br>
</div>
<div>Jordan</div>
<div><br>
</div>
<br>
<div>
<div>On Aug 21, 2014, at 20:13 , Jordan Rose <<a href="mailto:jordan_rose@apple.com" target="_blank">jordan_rose@apple.com</a>>
wrote:</div>
<br>
<blockquote type="cite">
<div style="word-wrap:break-word">
<div>Hm, okay. Here's some basic structure for what happens
now:</div>
<div><br>
</div>
<div>1. SimpleConstraintManager::canReasonAbout decides
whether or not a symbolic expression is even worth
processing.</div>
<div>2. SimpleConstraintManager::assumeAux (the one that
takes a NonLoc) does some preprocessing and canonicalizing
so that it can identify the parts of a constraint.</div>
<div>3. SimpleConstraintManager::assumeSymRel makes sure all
the numbers involved are compatible, and then dispatches
to the various RangeConstraintManager methods</div>
<div>4. RangeConstraintManager::assumeSym* handle the six
possible relational operators and the minute tricky
differences between them, applying the new constraints as
intersections with the existing range sets.</div>
<div><br>
</div>
<div>If you really want to put this into the constraint
manager, I think I would <i>still</i> do this the "sneaky
way", by canonicalizing your "LOW < $sym < HIGH"
into "$sym - LOW < HIGH - low" in
SimpleConstraintManager::assumeAux. That achieves maximal
code reuse. If you weren't to do it this way, I think
you'd have to introduce a new assumeInRange method that
could be overridden; this would probably be easier to get
right but I'm not sure it would actually add value.</div>
<div><br>
</div>
<div>As a side note, I would consider writing your
disjunction with & rather than &&. Because of
&&'s short-circuiting, you'd never actually see
that expression show up in assume() otherwise, and
handling & would make it possible to handle similar
cases in user code in the future.</div>
<div><br>
</div>
<div>Please continue to e-mail with any questions you may
have!</div>
<div>Jordan</div>
<div><br>
</div>
<br>
<div>
<div>On Aug 18, 2014, at 7:41 , Zach Davis <<a href="mailto:zdavkeos@gmail.com" target="_blank">zdavkeos@gmail.com</a>>
wrote:</div>
<br>
<blockquote type="cite">Let me see if I understand what
you are saying:<br>
<br>
We are able to symbolically evaluate (correctly) the "x
>= lower && x<br>
<= upper" expression with SVal's. However, we can't
'remember' this<br>
result because we don't construct a Constraint on the
expression, and<br>
even if we did the existing RangeConstraint manager
doesn't support<br>
it.<br>
<br>
I feel like the RangeConstraint manager is really the
best way to fix<br>
this, even if it's not easy. It doesn't seem to be a
huge issue<br>
anyway, if someone _really_ wants to analyse a file with
such an<br>
expression, they can use the patch i supplied in the
mean time.<br>
<br>
<br>
Unfortunately, looking through the Constraint manager(s)
code, I have<br>
no idea where to start.<br>
<br>
Zach<br>
<br>
On Tue, Aug 12, 2014 at 9:38 PM, Jordan Rose <<a href="mailto:jordan_rose@apple.com" target="_blank">jordan_rose@apple.com</a>>
wrote:<br>
<blockquote type="cite">That FIXME, and indeed the code
to support case ranges, actually predates<br>
RangeConstraintManager, which implements most of what
the FIXME is referring<br>
to. (That is, the technical/historical reason we
couldn't do it symbolically<br>
is "it wasn't implemented yet".) So your change is in
fact on the right<br>
track.<br>
<br>
However, what you've done is construct the expression
"$x >= LOWER && $x <=<br>
UPPER". This is correct, but it's not a form that
RangeConstraintManager<br>
will actually handle right now:<br>
<br>
void clang_analyzer_eval(int);<br>
void test(int i) {<br>
switch (i) {<br>
case 1 ... 5:<br>
return;<br>
}<br>
clang_analyzer_eval(x == 2); // should print FALSE,
but will print UNKNOWN<br>
with your patch.<br>
}<br>
<br>
(clang_analyzer_eval is a special function used to
probe the analyzer state,<br>
handled by the debug.ExprInspection checker.)<br>
<br>
The easiest way to fix this would be to use assumeDual
on each expression<br>
individually. Then you end up with three cases: $x
< LOWER, $x is in range,<br>
and $x > UPPER. Unfortunately, you'd then have to
restructure the whole<br>
function, which currently assumes that you end up with
exactly one state<br>
that can end up in the "default" branch. That wouldn't
be too hard.<br>
<br>
The harder way to solve this is to actually teach
[Range]ConstraintManager<br>
to recognize constraints like this, and do the right
thing. The<br>
implementation is already set up to handle disjoint
sets of ranges, so you<br>
could actually do this without changing everything.
It's probably more messy<br>
than you'd like, though.<br>
<br>
The sneaky way to do this is to realize that "$x >=
LOWER && $x <= UPPER" is<br>
actually equivalent to "$x - (LOWER - MIN) <= UPPER
- (LOWER - MIN)", where<br>
MIN is the minimum value for the integer type, for
integers with wraparound<br>
semantics. (RangeConstraintManager actually takes
advantage of this in<br>
reverse to translate "$x - OFFSET <= VALUE" back
into a pair of range<br>
disjunctions.) The only trouble with doing something
like this is that<br>
wraparound is Hard™ and you/we will want to be very
careful that this<br>
actually handles everything correctly, for all the
integer types we care<br>
about. A definite benefit of this is that we're back
to only having one<br>
state, which prevents potential exponential explosion
as analysis continues.<br>
<br>
What do you think we should do here?<br>
Jordan<br>
<br>
<br>
On Aug 12, 2014, at 12:39 , Zach Davis <<a href="mailto:zdavkeos@gmail.com" target="_blank">zdavkeos@gmail.com</a>>
wrote:<br>
<br>
All-<br>
<br>
I was looking over 16833 and had a few questions.<br>
<br>
The issue is the analyzer tries to concretize all the
values in a case<br>
statement with a range value:<br>
<br>
switch (a) {<br>
case 1 ... 0xfffff: // hang's trying each value
in the range<br>
return 1;<br>
default:<br>
return 0;<br>
}<br>
<br>
Is there any reason we can't symbolically evaluate 1
<= a <= 0xfffff,<br>
rather than trying each 'a == v for v in 1..0xfffff'?
I've attached a<br>
patch that attempts to do this and it seems to work
(it no longer<br>
hangs anyway). Is there a technical/historical reason
for doing it<br>
the current way?<br>
<br>
The FIXME above the block mentions using '"ranges" for
NonLVals'. I'm<br>
not sure what this means, or if any work has been done
toward this.<br>
If not, would there be any value in pursing a patch
like this in the<br>
meantime to resolve the performance issue?<br>
<br>
Thanks,<br>
Zach<br>
<16833_hang_on_large_case_range.patch>_______________________________________________<br>
cfe-dev mailing list<br>
<a href="mailto:cfe-dev@cs.uiuc.edu" target="_blank">cfe-dev@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a><br>
<br>
<br>
</blockquote>
</blockquote>
</div>
<br>
</div>
</blockquote>
</div>
<br>
</blockquote>
<br>
<br>
</div></div><span class="HOEnZb"><font color="#888888"><pre cols="72">--
Best regards,
Aleksei Sidorin
Software Engineer,
IMSWL-IMCG, SRR, Samsung Electronics
</pre>
</font></span></div>
</blockquote></div><br></div>