[PATCH] D71974: [Attributor][WIP] Connect AAIsDead with AAUndefinedBehavior

Tue Dec 31 11:17:47 PST 2019

baziotis marked an inline comment as done.
baziotis added inline comments.

================
Comment at: llvm/lib/Transforms/IPO/Attributor.cpp:3120
+    }
+    AliveSuccessors = std::move(AliveSuccessorsCopy);
+
----------------
jdoerfert wrote:
> baziotis wrote:
> > baziotis wrote:
> > > jdoerfert wrote:
> > > > baziotis wrote:
> > > > > baziotis wrote:
> > > > > > jdoerfert wrote:
> > > > > > > I think something like `make_filter_range` from `llvm/include/llvm/ADT/STLExtras.h` could make this nicer. Or maybe we just need to put it in a helper function, or both.
> > > > > > Yes, using a helper would be better. `make_filter_range` seems cool, I didn't know it. TBH though, I think it gives less control and predictability. It's not as clear to me as in the current what happens under the hood (apart from the fact that we'll query 2 times the `AAUB`). Also, FWIW, the current will have a worst case of one allocation and one free and an average case of no allocation / free (assuming that most instructions have less than 8 successors).
> > > > > > I'll put it in a helper function and update me if you still think it's not that good and I'll change it to `make filter_range`.
> > > > > Btw, I forgot to explain the `AssumedLiveBlocks.erase(BB)` part. Basically, if an alive successor is UB and its parent BB has been put into `AssumedLiveBlocks`, then we should mark that BB dead by removing it. That however seems bad to me. Up to now, we only inserted to it and it feels that something can go badly if we start removing stuff (like, go to an endless loop or the not having a monotone procedure).
> > > > > I'll take a look again, it was a quick change but feel free to update me if we should worry about that.
> > > > I haven't read the code in detail, some observations:
> > > > 
> > > > ```
> > > >  if (AAUB.isKnownToCauseUB(AliveSuccessor) ||
> > > >           (Assumed = AAUB.isAssumedToCauseUB(AliveSuccessor))) {
> > > > ```
> > > > Assumed is always true or uninitialized here. Reading it after results in true or UB. What you want is to check assumed and if true check known to determine if you used assumed information.
> > > > 
> > > > ---
> > > > 
> > > > No need for `count` if you call `erase`. Just erase/insert stuff, the return value even tells you if it was in before.
> > > > 
> > > > ---
> > > > 
> > > > I would like to understand the situation in which we think a block is UB *only after* it was put in `AssumedLiveBlocks`. We should not erase it but add an assertion so we can see if it triggers on the test. I hope it does not (and we can keep the assertion) as that would mean it works properly.
> > > > Assumed is always true or uninitialized here. Reading it after results in true or UB. What you want is to check assumed and if true check known to determine if you used assumed information.
> > > Yes, it should be initialized to `false`. Well, the reason the code has been written in this (maybe weird) way is because if `Known` returns true, we don't have to call assumed (which we will do if we always call assumed first). I think that if we just initialize to `false` it will be ok.
> > > 
> > > ---
> > > 
> > > > No need for count if you call erase. Just erase/insert stuff, the return value even tells you if it was in before.
> > > Ty, I didn't know that.
> > > 
> > > ---
> > > 
> > > > I would like to understand the situation in which we think a block is UB *only after* it was put in AssumedLiveBlocks. We should not erase it but add an assertion so we can see if it triggers on the test. I hope it does not (and we can keep the assertion) as that would mean it works properly.
> > > 
> > > A simple way this can happen is with that:
> > > ```
> > > define void @cond_br_on_undef_interproc() {
> > >   br i1 undef, label %t, label %e
> > > t:
> > >   ret void
> > > e:
> > >   ret void
> > > }
> > > ```
> > > Note that `br i1 undef, ...` is in the entry block, which is assumed live in `initialize()` based on known info (and of course, at this point, it's not yet known it causes UB). So, this becomes:
> > > ```
> > > define void @cond_br_on_undef_interproc() {
> > >   unreachable
> > > t:
> > >   unreachable
> > > e:
> > >   unreachable
> > > }
> > > ```
> > > which is ok I guess, but we could have deleted the whole function (which we didn't since the block is still alive).
> > > When other instructions go to the entry block, that gets more complicated.
> > > I'm not yet sure what happens for non-entry blocks.
> > Nit: Actually, we can remove `Assumed` completely and use `UsedAssumedInformation` which will probably make the code maybe too smart for our own good. :P
> > Note that br i1 undef, ... is in the entry block, which is assumed live in initialize() based on known info (and of course, at this point, it's not yet known it causes UB).
> 
> We need to apply the same logic (via the helper) in the initialize.
> 
> > When other instructions go to the entry block, that gets more complicated.
> 
> That is forbidden.
> 
> > I'm not yet sure what happens for non-entry blocks.
> 
> I think the logic in the initialize will allow us to place an assertion here.
> 
> > Nit: Actually, we can remove Assumed completely and use UsedAssumedInformation which will probably make the code maybe too smart for our own good. :P
> 
> Variables do not cost anything. Make the core readable first.
>> When other instructions go to the entry block, that gets more complicated.
> That is forbidden.
Sorry, I didn't mean to say "go" as "branch to". I meant when we add other instructions to the entry block.
> We need to apply the same logic (via the helper) in the initialize.
Sorry, I didn't get that. Could you please specify what logic?
I was thinking something like (in the initialize):
- If it is known to cause UB, just return.
- Otherwise, insert `Front` in `ToBeExplored` and `assumeLive()` the block only if `!AAUB.isAssumedToCauseUB(Front)`.

---

Unfortunately, I think that the fact that `isAssumedToCauseUB()` is over-optimistic will make all this more and more complicated.
Take for example this:
```
define i32 @example(i1* %alloc) {
  %cond = load i1, i1* %alloc
  br i1 %cond, label %t, label %e
t:
  ret i32 1
e:
  ret i32 2
}
```
(and assuming we're having the current thing in `initialize()`).

  # The entry block will be marked live.
  # `AAIsDeadFunction::updateImpl()` is called. The `br` is successor of `%cond`. //But//, because `isAssumedToCauseUB()` is overoptimistic, it will assume this UB. This in turn will insert `%cond` to `KnownDeadEnds`. Now we could say here that this is based on assumed info and another `updateImpl()` will correct but here's the catch (or you could call it something like a dead-lock):
  # `AAUB::updateImpl()` will be called. But, this uses `checkForAllInstructions()`, which uses liveness, which in turn, uses `isAssumedDead()`. The `br` will be //assumed dead// (because its predecessor, `%cond`, is in `KnownDeadEnds`) and won't make it to the predicate which would eventually add it to `AssumedNoUBInsts` which would correct all this situation.

This means that the `br` never gets live and this procedure continues and we're left with:
```
define i32 @cond_br_on_undef_uninit(i1* nocapture nofree nonnull readonly dereferenceable(1) %alloc) #0 {
  %cond = load i1, i1* %alloc
  br i1 %cond, label %t, label %e

t:                                                ; preds = %0
  unreachable

e:                                                ; preds = %0
  unreachable
}
```

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71974/new/

https://reviews.llvm.org/D71974