[PATCH] D82381: [analyzer] Introduce small improvements to the solver infra

Wed Jul 8 09:32:57 PDT 2020

NoQ accepted this revision.
NoQ added inline comments.

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:734
   //        expressions which we currently do not know how to negate.
-  const RangeSet *getRangeForMinusSymbol(ProgramStateRef State, SymbolRef Sym) {
+  Optional<RangeSet> getRangeForInvertedSub(SymbolRef Sym) {
     if (const SymSymExpr *SSE = dyn_cast<SymSymExpr>(Sym)) {
----------------
vsavchenko wrote:
> ASDenysPetrov wrote:
> > vsavchenko wrote:
> > > ASDenysPetrov wrote:
> > > > As for me, I'd call this like `getRangeForNegatedSymSymExpr`, since you do Negate operation inside.
> > > I'm not super happy about my name either, but I feel like it describes it better than the previous name and your version.  That function doesn't work for any `SymSymExpr` and it doesn't simply negate whatever we gave it.  It works specifically for symbolic subtractions and this is the information I want to be reflected in the name.
> > Oh, I just assumed //...Sub// at the end as a //subexpression// but you mean //subtraction//. What I'm trying to say is that we can rename it like `getRangeFor...`//the expression which this function can handle//. E.g. `getRangeForNegatedSubtractionSymSymExpr`. My point is in a speaking name.
> > 
> > I think //invertion// is not something appropriate in terms of applying minus operator. I think invertion of zero should be something opposite but not a zero. Because when you would like to implement the function which turns [A, B] into [MIN, A)U(B, MAX], what would be the name of it? I think this is an //invertion//.
> > 
> > But this is not a big deal, it's just my thoughts.
> My thought process here was that we are trying to get range for `A - B` and there is also information on `B - A`, so we can get something for `A - B` based on that.  So, it doesn't really matter what it does under the hood with ranges, it matters what its semantics are.  Here I called `B - A` //an inverted subtraction//.
> I don't really know what would be the best name, but I thought that this one makes more sense.
> Because when you would like to implement the function which turns [A, B] into [MIN, A)U(B, MAX], what would be the name of it? I think this is an //invertion//.

https://en.wikipedia.org/wiki/Complement_(set_theory)
https://en.wikipedia.org/wiki/Inverse_function

"Negated subtraction" is definitely my favorite so far. "Mirrored" might be a good layman term as well.

================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:467-470
+/// Available representations for the arguments are:
+///   * RangeSet
+///   * Optional<RangeSet>
+///   * RangeSet *
----------------
vsavchenko wrote:
> xazax.hun wrote:
> > NoQ wrote:
> > > Why do we have these representations in the first place?
> > > 
> > > It sounds like you're treating null pointers / empty optionals equivalently to full ranges (i.e., intersecting with them has no effect). How hard would it be to simply construct an actual full range in all the places in which we currently have empty optionals?
> > +1
> > 
> > I also find this confusing. Should I interpret a None as a full or empty range? Does this depend on the context or a general rule? Is there any reason we need to handle nullable pointers or could we actually make call sites more uniform to get rid of some of the complexity here?
> > It sounds like you're treating null pointers / empty optionals equivalently to full ranges (i.e., intersecting with them has no effect)
> It is not actually true.  Empty optionals is the information that "this range information is not available for this symbol".  It is true that intersecting with them has no effect, but they are semantically different in other aspects.  
> 
> Currently solver RELIES on the information that whether the range is available or not (see my previous comment), and we simply couldn't get rid of this behavior as part of NFC.
> 
> > How hard would it be to simply construct an actual full range in all the places in which we currently have empty optionals?
> It is not hard architecturally, but it WILL make the change functional and possibly impact the performance.
> 
> > Should I interpret a None as a full or empty range?
> Neither of those.  That is the problem right now, that we rely on different sources of information for the ranges and behave differently depending on whether one piece of information is available or not.
> 
> > Does this depend on the context or a general rule?
> Semantics are always the same - this info is not available.
> 
> > Is there any reason we need to handle nullable pointers?
> `State->get<XYZ>(abc)` returns a nullable pointer meaning optional object, it is hard to avoid it especially because we currently behave very differently when this information is not available.
> 
> > ...could we actually make call sites more uniform to get rid of some of the complexity here?
> Yes we can, but later on. See the explanations above and in my other comment.
> it WILL make the change functional and possibly impact the performance

Mmm, yeah, i guess the necessity to construct a full range and then do the possibly O(N) intersection when it'll do nothing most of the time is indeed daunting.

Maybe we should implement full ranges as a special case in the `RangeSet` class? Like, don't actually construct the single range, just remember the type; special-case all operations on such sets to perform trivially in O(1), etc.?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82381/new/

https://reviews.llvm.org/D82381