[PATCH] D84108: [SimplifyCFG][LoopRotate] SimplifyCFG: disable common instruction hoisting by default, enable late in pipeline
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Jul 26 15:31:49 PDT 2020
lebedev.ri added a comment.
In D84108#2174232 <https://reviews.llvm.org/D84108#2174232>, @lebedev.ri wrote:
> I think we're focusing on the wrong thing here.
> Is the preexisting memdep perf problems the only issue everyone has with this?
> Does nobody have any other, perhaps fundamental, concerns here?
> If not, are memdep perf problems a blocker here?
> In D84108#2174113 <https://reviews.llvm.org/D84108#2174113>, @nikic wrote:
>
>> Some thoughts:
>>
>> - Ultimately memdep analysis spends most time in alias analysis, and we can make that part faster but using BatchAA. This has a non-trivial positive impact in general: http://llvm-compile-time-tracker.com/compare.php?from=bc79ed7e16003c8550da8710b321d6d5d4243faf&to=1ab8f1b475d2b9d464add10d1a7f87f18c073fb0&stat=instructions For this particular test case it drops instructions from ~4.2M to ~3.8M.
>
>
> Looking forward to such a patch.
>
>> - I think fundamentally, what memdep does and the way it does it, cannot be easily improved. This is probably something that MemorySSA-based NewGVN improves on (at least in theory -- in practice MemorySSA has a not so stellar compile-time record).
>
> Unless there's some global cache missing that is, but i'd think it would already be there, so yeah.
>
>> - What can be improved though are the cutoffs. Non-local memdep analysis has two primary cutoffs: The BlockScanLimit=100, which limits the amount of instructions inspected in a single block, and the BlockNumberLimit=1000, which limits the amount of inspected blocks. Taken together, this means that a **single** memdep query can inspect up to 100000 instructions, which is frankly insane. A quick test with -memdep-block-number-limit=100 drops the instructions for the test case to ~1.8M.
>
> Right.
>
>> It may be reasonable to either reduce the default block limit here, or to introduce an additional "total instruction limit" for non-local queries.
>
> On the first glance, it doesn't look right to me to lower the existing limits, they don't seem *that* high.
> What indeed i think we are missing, is some more global budget. I'll take a look.
After doing some stats, D84609 <https://reviews.llvm.org/D84609>
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84108/new/
https://reviews.llvm.org/D84108
More information about the llvm-commits
mailing list