[PATCH] D155528: [SimplifyCFG][FIX] Update GlobalsAA after an assumption was created

Tue Jul 18 00:49:57 PDT 2023

nikic requested changes to this revision.
nikic added a comment.
This revision now requires changes to proceed.

Let's look at @aeubanks case first. We have a transform like this:

  define internal void @src(ptr %p) memory(none) {
    %c = icmp eq ptr %p, null
    br i1 %c, label %err, label %ok
  err:
    unreachable
  ok:
    ret void
  }

  define internal void @tgt(ptr %p) memory(none) {
    %c = icmp eq ptr %p, null
    %0 = xor i1 %c, true
    call void @llvm.assume(i1 %0) memory(inaccessiblemem: readwrite)
    ret void
  }

I believe this transform is correct. The `memory` attribute (and other similar attributes) always operate from the perspective of effects observable by the caller. In this case, the caller can legally treat the `@tgt` function as `memory(none)`. The fact that the whole function is `memory(none)` does not imply that there are no memory accesses inside the function -- this is most obviously true for accesses to allocas, but I believe this is essentially the same situation. You can think of the "inaccessible memory" being modified as being function-local inaccessible memory, effects on which cannot be observed beyond the function boundary.

-----

Now for @jdoerfert's case, things are a bit more interesting. Again, I believe it is fine for FunctionAttrs to infer `memory(none)` on the function, but the GlobalsAA interaction could also cause us to ignore the assume effects //inside// the function.

For assumes in particular, I believe that the memory effects exists only to avoid optimizing the assume away. Assumes don't //really// have effects. The property ensuring their correctness is the lack of `speculatable`, which prevents them from being hoisted past control flow. So I believe from a correctness perspective, treating assumes as `memory(none)` is fine as well. However, it may result in assumes being dropped unprofitably (though in practice this probably doesn't happen because we don't use AA to drive DCE).

However, is this problem actually specific to assumes? For example, what if the vectorizer replaces plain load/store with some kind of masked load/store intrinsic -- will that get treated as `memory(none)` by GlobalsAA as well? If yes, that would certainly be wrong and could lead to real miscompilations.

And if yes, then I think it's clear that the approach you're persuing here cannot work. This means that we have to update the call graph every time an intrinsic call is introduced, while we currently widely assume that this is not necessary. In D141190 <https://reviews.llvm.org/D141190> you implemented a change to start including non-nocallback intrinsics in the call graph and made GlobalsAA rely on that -- I think we just found out what the reason for the original design was and probably need to revert that change. As long as arbitrary transforms can insert intrinsics, we cannot require them to be part of the call graph.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155528/new/

https://reviews.llvm.org/D155528