[PATCH] D56720: [MemorySSA] Add caching results of reaching LiveOnEntry MemoryDef to ClobberWalker

George Burgess IV via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 15 12:01:53 PST 2019


george.burgess.iv added a comment.

Thanks for this!

It's unrelated to the general problem here, but for this repro in particular, it looks pretty simple to reduce compile time to ~2 seconds on clang ToT. Clang needs to emit a dynamic initializer for `kTests` because neither `Condition` nor `Register` is constexpr. If you tag `kTests` with `[[clang::require_constant_initialization]]` and address the diagnostics it produces, it looks like just a handful of `constexpr ` additions will let clang elide the dynamic initializer entirely.
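
To make that concrete, here's a rough sketch of the shape I mean; the types and fields below are stand-ins, not the ones from the repro:

    // Stand-in types; the real Condition/Register obviously look different.
    // The point is just that constexpr constructors plus the attribute let
    // clang fold the whole initializer at compile time.
    struct Register {
      int Num;
      constexpr explicit Register(int N) : Num(N) {}
    };

    struct Condition {
      Register Reg;
      int Code;
      constexpr Condition(Register R, int C) : Reg(R), Code(C) {}
    };

    struct TestCase {
      Condition Cond;
      Register Dst;
    };

    // With the attribute, clang errors out on anything that still forces a
    // dynamic initializer, which makes the remaining non-constexpr pieces
    // easy to find.
    [[clang::require_constant_initialization]] static const TestCase kTests[] = {
        {Condition(Register(0), 1), Register(2)},
        {Condition(Register(3), 4), Register(5)},
    };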

Back on topic: since we already keep per-Memory{Use,Def} caches of where things point to, and keeping those up to date has been a constant source of bugs, I'm pretty reluctant to add a second cache here that we'd have to reason about.

We originally had a similar cache embedded inside the clobber walker (https://reviews.llvm.org/D31576). It was removed because it was a net loss (or at best neutral) in cycles and memory in all of the cases we could find. In general, for caches where you can't trivially figure out what points to a cached clobber (as is the case here), we're in the unfortunate position of having to drop the whole thing on basically every meaningful update of any `MemoryAccess`, which makes it hard for such a cache to provide much value.

Looking at the repro, it appears that, after optimizations, we end up with a single basic block of over 20,000 (!) lines of IR, around 19,000 of which are `store`s. So it makes sense that our current walker spins a lot on this. :)

When we first landed MemorySSA, there was talk of flagging pathological blocks like that and simply refusing to walk/optimize them at all. For the reasons detailed above, I'd strongly prefer that over keeping a cache, unless a new cache is a very solid win in general. If you find that idea reasonable: it looks like the `MaxCheckLimit` flag is only respected in our `OptimizeUses` walker, and I'd be happy to have some logic in our `CachingWalker` that says "if we walk more than `MaxCheckLimit` defs in a single block, give up," as sketched below.
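
Very roughly (this isn't tied to the real ClobberWalker's structure -- `walkDefChain` is a hypothetical helper, and I'm ignoring MemoryPhis since the repro is a single huge block):

    #include "llvm/Analysis/MemorySSA.h"
    using namespace llvm;

    // Hypothetical helper; the real walker is shaped differently, but the
    // bail-out would look roughly like this.
    static MemoryAccess *walkDefChain(MemorySSA &MSSA, MemoryAccess *Start,
                                      unsigned MaxCheckLimit) {
      unsigned DefsWalked = 0;
      MemoryAccess *Current = Start;
      while (auto *MD = dyn_cast<MemoryDef>(Current)) {
        if (MSSA.isLiveOnEntryDef(MD))
          break;
        // Give up after MaxCheckLimit defs: conservatively report the def we're
        // standing on as the clobber instead of scanning the other ~19,000.
        if (++DefsWalked > MaxCheckLimit)
          return MD;
        // ... the usual alias check against the query location goes here;
        // if MD clobbers it, return MD ...
        Current = MD->getDefiningAccess();
      }
      return Current;
    }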

What do you think?



================
Comment at: lib/Analysis/MemorySSA.cpp:599
 
-  /// Result of calling walkToPhiOrClobber.
-  struct UpwardsWalkResult {
-    /// The "Result" of the walk. Either a clobber, the last thing we walked, or
-    /// both. Include alias info when clobber found.
-    MemoryAccess *Result;
-    bool IsKnownClobber;
-    Optional<AliasResult> AR;
-  };
+  WalkResultsCache CachedResults;
 
----------------
If we end up going forward with this, please make caching optional, so that we can have an `EXPENSIVE_CHECKS` assert to verify that we get the same result with/without our cache (without dropping our cache), as we did before.
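
Something along these lines (names here are made up; e.g. `findClobberUncached` doesn't exist today):

    MemoryAccess *Result = Walker.findClobber(StartingAccess, Q);
    #ifdef EXPENSIVE_CHECKS
      // Re-run the same query with the cache bypassed (without clearing it)
      // and check that both walks agree.
      assert(Walker.findClobberUncached(StartingAccess, Q) == Result &&
             "Cached and uncached clobber walks disagree");
    #endif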


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D56720/new/

https://reviews.llvm.org/D56720




