[PATCH] D84108: [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline

Sun Jul 26 15:31:49 PDT 2020

lebedev.ri added a comment.

In D84108#2174232 <https://reviews.llvm.org/D84108#2174232>, @lebedev.ri wrote:

> I think we're focusing on the wrong thing here.
>  Is the preexisting memdep perf problems the only issue everyone has with this?
>  Does nobody have any other, perhaps fundamental, concerns here?
>  If not, are memdep perf problems a blocker here?

> In D84108#2174113 <https://reviews.llvm.org/D84108#2174113>, @nikic wrote:
> 
>> Some thoughts:
>>
>> - Ultimately memdep analysis spends most time in alias analysis, and we can make that part faster but using BatchAA. This has a non-trivial positive impact in general: http://llvm-compile-time-tracker.com/compare.php?from=bc79ed7e16003c8550da8710b321d6d5d4243faf&to=1ab8f1b475d2b9d464add10d1a7f87f18c073fb0&stat=instructions For this particular test case it drops instructions from ~4.2M to ~3.8M.
> 
> 
> Looking forward to such a patch.
> 
>> - I think fundamentally, what memdep does and the way it does it, cannot be easily improved. This is probably something that MemorySSA-based NewGVN improves on (at least in theory -- in practice MemorySSA has a not so stellar compile-time record).
> 
> Unless there's some global cache missing that is, but i'd think it would already be there, so yeah.
> 
>> - What can be improved though are the cutoffs. Non-local memdep analysis has two primary cutoffs: The BlockScanLimit=100, which limits the amount of instructions inspected in a single block, and the BlockNumberLimit=1000, which limits the amount of inspected blocks. Taken together, this means that a **single** memdep query can inspect up to 100000 instructions, which is frankly insane. A quick test with -memdep-block-number-limit=100 drops the instructions for the test case to ~1.8M.
> 
> Right.
> 
>> It may be reasonable to either reduce the default block limit here, or to introduce an additional "total instruction limit" for non-local queries.
> 
> On the first glance, it doesn't look right to me to lower the existing limits, they don't seem *that* high.
>  What indeed i think we are missing, is some more global budget. I'll take a look.

After doing some stats, D84609 <https://reviews.llvm.org/D84609>

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84108/new/

https://reviews.llvm.org/D84108