[PATCH] D38662: [BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)

Michael Ferguson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 11 13:16:47 PDT 2017


mppf added a comment.

@hfinkel - I'm obviously not your main reviewer, but I did spend a few minutes looking at this.

First, it appears that this changeset addresses the problem I was having. I'm running into a new problem with 128-bit pointers when using trunk (vs. LLVM 4 or 5), for which I'll suggest a fix or file a bug report once I've got something minimized. I'll be working on that as well as on generating a better test case for this BasicAA issue.

Thanks!



================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:379
+  unsigned ShiftBits = Offset.getBitWidth() - PointerSize;
+  return (Offset << ShiftBits).ashr(ShiftBits);
+}
----------------
This is meant to just take the bottom PointerSize bits. Shouldn't you use APInt.trunc to express it more simply?
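For reference, here's roughly the spelling I had in mind (a rough sketch, untested; the helper name is just a placeholder, and if callers need the result to keep the wider bit width, the sext back is what would keep this equivalent to the shift pair):

  #include "llvm/ADT/APInt.h"
  using namespace llvm;

  // Placeholder name; keep only the low PointerSize bits, then sign-extend
  // back to the original width so callers still see the wide APInt.
  static APInt truncToPointerSize(const APInt &Offset, unsigned PointerSize) {
    if (PointerSize >= Offset.getBitWidth())
      return Offset;
    return Offset.trunc(PointerSize).sext(Offset.getBitWidth());
  }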


================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:387
+
+  return MaxPointerSize;
 }
----------------
I find it surprising that you needed to do this, but I don't have a great understanding of what's going on. In the change description, you pointed out:

>This is because, as noted in the patch, when GetLinearExpression decomposes an expression into C1*V+C2, and we then multiply this by Scale, and distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even though C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can overflow. If this happens, later logic will draw invalid conclusions from the (base) offset value.

Can you explain why it's not sufficient to do these computations in the number ring for the GEP? E.g., if all GEPs used 32-bit pointers, mathematically the distribution should work out if everything is done with 32-bit numbers, including with overflow. And the LLVM spec says that is what these GEPs mean...

Is it just that the alias analysis pass then conservatively throws up its hands if overflow occurred? Or is it the case that the computations done here might cross different pointer sizes?
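For concreteness, here's a toy 8-bit version of what I understand the quoted scenario to be (the numbers are made up and aren't from the patch):

  #include "llvm/ADT/APInt.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  int main() {
    // Decompose an index as C1*V + C2, then multiply by Scale and distribute.
    APInt C1(8, 1), C2(8, 100), Scale(8, 2), V(8, 20);

    // C1*V + C2 = 120: representable as a signed 8-bit value, no overflow.
    APInt Linear = C1 * V + C2;

    // C2*Scale = 200: not representable as a signed 8-bit value, so it wraps
    // to -56 even though C1*V + C2 never overflowed for this V.
    APInt ScaledC2 = C2 * Scale;

    outs() << "C1*V+C2 = " << Linear.getSExtValue()
           << ", C2*Scale (wrapped) = " << ScaledC2.getSExtValue() << "\n";
    return 0;
  }

I.e., I'm trying to understand whether that wrapped value is actually incorrect here, given that the GEP arithmetic itself wraps at the pointer width.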


================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:492
+            CIdx->getValue().sextOrSelf(MaxPointerSize))
+              .sextOrTrunc(MaxPointerSize);
         continue;
----------------
I'm probably missing something, but isn't .sextOrSelf(x).sextOrTrunc(x) going to do the same thing as .sextOrTrunc(x)?
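As a quick sanity check of what I mean (made-up MaxPointerSize and values; not from the patch), both spellings give the same result for me in a standalone test:

  #include "llvm/ADT/APInt.h"
  #include <cassert>
  using namespace llvm;

  int main() {
    const unsigned MaxPointerSize = 128; // made-up value for illustration
    APInt Narrow(32, 42);  // narrower than MaxPointerSize
    APInt Wide(256, 42);   // wider than MaxPointerSize

    for (const APInt &C : {Narrow, Wide}) {
      APInt A = C.sextOrSelf(MaxPointerSize).sextOrTrunc(MaxPointerSize);
      APInt B = C.sextOrTrunc(MaxPointerSize);
      assert(A == B && "both spellings agree");
    }
    return 0;
  }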


================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:553
 
-      if (Scale) {
-        VariableGEPIndex Entry = {Index, ZExtBits, SExtBits,
-                                  static_cast<int64_t>(Scale)};
+      if (!!Scale) {
+        VariableGEPIndex Entry = {Index, ZExtBits, SExtBits, Scale};
----------------
I've never seen !! used like this before. Is it intentional? Is it the same as if (Scale != 0)? If so, wouldn't that be clearer?
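Tiny sketch of the comparison I'm making, assuming Scale is now an APInt (this isn't from the patch):

  #include "llvm/ADT/APInt.h"
  using namespace llvm;

  // Both guard on "Scale is non-zero"; the second spelling seems clearer to me.
  static bool nonZeroBangBang(const APInt &Scale) { return !!Scale; }
  static bool nonZeroCompare(const APInt &Scale)  { return Scale != 0; }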


================
Comment at: lib/Analysis/BasicAliasAnalysis.cpp:1098
+  if (C1->getBitWidth() > 64 || C2->getBitWidth() > 64)
+    return MayAlias;
+
----------------
This addition seems surprising to me. Why wouldn't we just use APInt for the computation below? Is this a workaround for something?
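What I had in mind is something along these lines (purely hypothetical; the name and the surrounding computation are made up, since I haven't looked closely at the code below this check):

  #include "llvm/ADT/APInt.h"
  #include <algorithm>
  using namespace llvm;

  // Sketch: compare two constant GEP offsets without assuming they fit in
  // 64 bits, by widening both to a common width first.
  static APInt offsetDifference(const APInt &O1, const APInt &O2) {
    unsigned W = std::max(O1.getBitWidth(), O2.getBitWidth());
    return O1.sextOrSelf(W) - O2.sextOrSelf(W);
  }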


https://reviews.llvm.org/D38662




