[PATCH] D11212: [SCEV] Apply NSW and NUW flags via poison value analysis

Sat Jul 18 19:33:22 PDT 2015

broune added a comment.

Thanks for the comments, Sanjoy. I'll update the code with name changes Monday.

================
Comment at: include/llvm/Analysis/ScalarEvolution.h:587
@@ +586,3 @@
+    // (e.g. a nuw add) would trigger undefined behavior on overflow.
+    SCEV::NoWrapFlags getNoWrapFlagsFromPoison(const Value *V);
+
----------------
sanjoy wrote:
> Minor & optional: I'd call this `getNoWrapFlagsFromUB`.
SGTM, I'll make the change.

================
Comment at: include/llvm/Analysis/ValueTracking.h:313
@@ +312,3 @@
+  /// the loop header.
+  bool isGuaranteedToExecuteForEveryIteration(const Instruction *I,
+					      const Loop *L);
----------------
sanjoy wrote:
> Is the indentation a little bit off here?
> 
> Actually, I'll just assume you'll run this change through `clang-format` before checking in, and won't make any further whitespace related comments.  Please let me know if you'd prefer otherwise.
Sorry, I need to find a better way to set up my editor. I'll run clang-format before checking in.

================
Comment at: include/llvm/Analysis/ValueTracking.h:327
@@ +326,3 @@
+
+  /// Returns either nullptr or an operand of I such that I will trigger
+  /// undefined behavior if I is executed and that operand has a full-poison
----------------
sanjoy wrote:
> I'd prefer if you were a little more terse -- perhaps `getGuaranteedNonPoisonOp`?
> 
> (IOW, which operand is guaranteed to not be poison since if it were poison then the program is undefined)
I wasn't so happy with it myself and I like your suggestion. Maybe getGuaranteedNonFullPoisonOp ? I'll change it Monday.

================
Comment at: include/llvm/Analysis/ValueTracking.h:339
@@ +338,3 @@
+  /// the basic block that is the parent of I.
+  bool undefinedBehaviorIsGuaranteedIfFullPoison(const Instruction *PoisonI);
+
----------------
sanjoy wrote:
> Similarly, how about calling this `isKnownNotPoison`?
Nice. isKnownNotFullPoison? I'm concerned that it would be too easy to miss the distinction between poison and full-poison and the bugs from that would be hard to shake out.

================
Comment at: lib/Analysis/ScalarEvolution.cpp:4144
@@ +4143,3 @@
+      if (isLoopInvariant(OtherOp, AddRec->getLoop()) &&
+          isGuaranteedToExecuteForEveryIteration(BinOp, AddRec->getLoop()))
+        return Flags;
----------------
sanjoy wrote:
> I'm not sure that this is correct.  I think you need to prove that the instruction you used to justify UB if `V` was poison is what needs to execute on every loop iteration.
> 
> This is not a problem currently because `undefinedBehaviorIsGuaranteedIfFullPoison` only looks at a single basic block, but semantically, the following program will be problematic:
> 
> ```
> 
>  loop_header:
>    %x = add nsw %p, %q
>    ...
> 
>  outside_the_loop:
>    ;; no other uses of %x
>    store i32 42, GEP(@global, %x)
> 
> ```
> 
> You can conclude that the `%x` does not overflow in the last iteration, but that's it -- even if `%x` was poison in all the other iterations your program is well defined.
As you point out, UB is not guaranteed on poison in this example, so even if `undefinedBehaviorIsGuaranteedIfFullPoison` is made stronger by considering more basic blocks, it should still return false here, right?

================
Comment at: lib/Analysis/ValueTracking.cpp:3327
@@ +3326,3 @@
+    !isa<InvokeInst>(I) && // might not terminate
+    !isa<ResumeInst>(I) && // has no successors
+    !isa<ReturnInst>(I); // has no successors
----------------
sanjoy wrote:
> If you're dealing with terminator instructions here (I'm not sure that that is necessary) why is `br` okay?  Can't a `br` form an infinite loop?
Yes, br can form an infinite loop. This function should still return false for br, as each single execution of br does terminate and then transfers execution to its successor (even if it is its own successor), but I suspect that that's not what you're getting at with this question. :)

Zooming out a bit, what I'm really using this function for is as a component of proving that one instruction B strongly post-dominates another instruction A. Part of that is to prove that there are no infinite paths from A to B, since in such a path we'd never actually get to B, even if B (non-strongly) post-dominates A. In the general case, yes, we'd also need to take into account loops between A and B within the same CFG/function and prove that they terminate. The main reason that I'm only considering one basic block at a time is to make things simpler by avoiding the need for that, since I think that this change is already complex enough as it is. Even then, we still need to look out for single instructions where a single invocation can be enough to prevent strong post-dominance (even within a basic block), and this function serves that purpose.

You're right that this over-all feature could work with a function like this that did not consider terminators. I included terminators anyway just because it gives this function a cohesive responsibility hat makes it easier to think about and it's also what would/will be needed when reasoning about strong post-dominance across basic blocks. I don't have any strong objection to taking the terminators out for now, though.

http://reviews.llvm.org/D11212