[PATCH] D12063: [PM] Port ScalarEvolution to the new pass manager.

hfinkel@anl.gov via llvm-commits llvm-commits at lists.llvm.org
Sun Aug 16 11:32:00 PDT 2015


hfinkel added a comment.

> This change makes ScalarEvolution a stand-alone object and just produces
> one from a pass as needed. Making this work well requires making the
> object movable, using references instead of overwritten pointers in
> a number of places, and other refactorings.
>
> But there is a really big, really scary change here. Prior to this patch
> ScalarEvolution was never *actually* invalidated!!! Re-running the pass
> just re-wired up the various other analyses and didn't remove any of the
> existing entries in the SCEV caches or clear out anything at all.


As we discussed on IRC, this is scarily broken, and we definitely need to move to a correct solution. I think that, as it happens, the passes most likely to indirectly invalidate SCEV (things like LoopVectorize and LoopRotate, which change loop starting values and trip counts around existing instructions) are SCEV-aware and call forgetLoop, and that's why we've not really seen this blow up on us. Regardless, this does not rule out more subtle situations arising from non-SCEV-aware passes.
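
For reference, the pattern those SCEV-aware passes follow is roughly the sketch below. The transform itself is hypothetical; the point is just the explicit forgetLoop call whenever the loop's bounds are changed.

    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/Analysis/ScalarEvolution.h"

    using namespace llvm;

    // Hypothetical transform that changes a loop's starting value or trip
    // count around existing instructions.
    static bool rewriteLoopBounds(Loop *L, ScalarEvolution &SE) {
      // ... rewrite the loop bounds here (omitted) ...

      // Tell SCEV to drop its cached state for this loop (backedge-taken
      // counts and SCEVs whose computation depends on the loop) so that
      // later queries recompute rather than returning stale results.
      SE.forgetLoop(L);
      return true;
    }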

> This might seem OK, as everything in SCEV that can do so uses ValueHandles
> to track updates to the values that serve as SCEV keys. However, this
> still means that as we ran SCEV over each function in the module, we kept
> accumulating more and more SCEVs into the cache. At the end, we would
> have a SCEV cache with every value that we ever needed a SCEV for in the
> entire module!!! Yowzers. The releaseMemory routine would dump all of
> this, but that isn't really called during normal runs of the pipeline as
> far as I can see.
>
> To make matters worse, there *is* actually a key that we don't update
> with value handles -- there is a map keyed off of Loop*s. Because
> LoopInfo *does* release its memory from run to run, it is entirely
> possible to run SCEV over one function, then over another function, and
> then look up a Loop* from the second function but find an entry inserted
> for the first function! Ouch.
>
> To make matters still worse, there are plenty of updates that *don't*
> trip a value handle. It seems incredibly unlikely that today GVN or
> another pass that invalidates SCEV can update values in *just* such
> a way that a subsequent run of SCEV will incorrectly find lookups in
> a cache, but it is theoretically possible and would be a nightmare to
> debug.
>
> With this refactoring, I've fixed all this by actually destroying and
> recreating the ScalarEvolution object from run to run. Technically, this
> could increase the amount of malloc traffic we see, but then again it is
> also technically correct. ;] I don't actually think we're suffering from
> tons of malloc traffic from SCEV because if we were, the fact that we
> never clear the memory would seem more likely to have come up as an
> actual problem before now. So, I've made the simple fix here. If in fact
> there are serious issues with too much allocation and deallocation,
> I can work on a clever fix that preserves the allocations (while
> clearing the data) between each run, but I'd prefer to do that kind of
> optimization with a test case where it helps a lot.


We can certainly do that if it helps; I'd wait, however, until benchmarking suggests it's needed. I suspect that, if we see this as a compile-time hit at all, it will be because we're now actually recomputing SCEVs across invalidations where we previously were not, and the memory allocation will be a secondary concern.
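
If it ever does come up, the "clear the data but keep the allocations" variant would presumably take a shape like the sketch below. This is a hypothetical cache structure, not SCEV's actual internals: DenseMap::clear() resets the buckets in place unless the table is large and nearly empty, and SmallVector::clear() keeps its capacity, so most of the earlier storage gets reused on the next run.

    #include "llvm/ADT/DenseMap.h"
    #include "llvm/ADT/SmallVector.h"
    #include "llvm/IR/Value.h"

    using namespace llvm;

    // Hypothetical per-function caches that are reset between runs rather
    // than destroyed and recreated.
    struct DemoSCEVCaches {
      DenseMap<const Value *, unsigned> ValueToExpr;
      SmallVector<unsigned, 16> Scratch;

      void clearForNextRun() {
        ValueToExpr.clear(); // entries dropped; bucket storage mostly kept
        Scratch.clear();     // size -> 0; capacity retained
      }
    };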

> Any concerns with this approach? If folks are generally happy, I'll add
> the actual new pass manager wiring and submit, thanks!


Please proceed.
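
For what it's worth, new-PM wiring for a function analysis generally has the shape sketched below (using the AnalysisInfoMixin-style idiom). The class name is illustrative rather than the patch's actual code; the constructor arguments are just the analyses ScalarEvolution depends on, and the result object is movable thanks to this change.

    #include "llvm/Analysis/AssumptionCache.h"
    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/Analysis/ScalarEvolution.h"
    #include "llvm/Analysis/TargetLibraryInfo.h"
    #include "llvm/IR/Dominators.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/PassManager.h"

    using namespace llvm;

    // Illustrative new-PM analysis: run() builds a fresh ScalarEvolution
    // from its dependencies, and the analysis manager owns and invalidates
    // the result object.
    class DemoSCEVAnalysis : public AnalysisInfoMixin<DemoSCEVAnalysis> {
      friend AnalysisInfoMixin<DemoSCEVAnalysis>;
      static AnalysisKey Key;

    public:
      using Result = ScalarEvolution;

      ScalarEvolution run(Function &F, FunctionAnalysisManager &AM) {
        return ScalarEvolution(F, AM.getResult<TargetLibraryAnalysis>(F),
                               AM.getResult<AssumptionAnalysis>(F),
                               AM.getResult<DominatorTreeAnalysis>(F),
                               AM.getResult<LoopAnalysis>(F));
      }
    };

    AnalysisKey DemoSCEVAnalysis::Key;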


http://reviews.llvm.org/D12063




