[PATCH] D22778: Add Loop Sink pass to reverse the LICM based of basic block frequency.
Dehao Chen via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 19 08:39:14 PDT 2016
danielcdh added a comment.
Thanks for the reviews.
I also added an overall algorithm description at file level per Madhur's suggest.
================
Comment at: lib/Transforms/Scalar/LoopSink.cpp:159
+ BlockFrequencyInfo &BFI) {
+ if (I.getNumUses() > MaxNumberOfUsesForSinking)
+ return false;
----------------
chandlerc wrote:
> The number of user *instructions* isn't really the right thing to apply the threshold to as that doesn't directly change the cost.
>
> The idea is that we need the size of `BBsToSinkInto` to be a small constant in order for the search for the coldest dominating set to be "just" linear in the number of blocks in the loop.
>
> So while a threshold of "40" may make sense for number of user instructions, I suspect the threshold should be much smaller when applied to the size of `BBsToSinkInto`.
>
> I also think you should add two comments about this. One, you should comment to the `findBBsToSinkInto` function clarifying the algorithmic complexity (That it O(N * M) or O(M^2) where N is SortedLoopBBs.size() and M is BBsToSinkInto.size()), and you should mention where you check this threshold that the reason is because we're going to call `findBBsToSinkInto` which would go quadratic if we didn't set a cap.
>
> The reason for all of this is that I'm worried some future maintainer will come along and not really understand how risk it is to adjust these thresholds so I think it is important to document the implications.
>
> I still think we will long-term need a better algorithmic approach here as I suspect we'll find annoying cases where the threshold blocks an important optimization (such as when there are way too many initial BBsToInsertInto but there are a small number of common dominating blocks). But I understand this is one of the really hard problems (its the same as optimal PHI placement and a bunch of other ones), and I don't want to hold up your patch on a comprehensive approach here.
>
>
> On an unrelated note, you should also document that this threshold has a secondary function: it places an upper bound on how much code growth we may trigger here. I'd document this in particular as that seems somewhat accidental and I suspect we may long-term want a better threshold for that. I would in fact encourage you to leave a FIXME to adjust this for min_size and opt_size.
>
>
>
You actually mean the size of BBs to be small constant? Because computing BBsToSinkInto is the most computation intensive part of the algorithm.
================
Comment at: lib/Transforms/Scalar/LoopSink.cpp:194
+
+ for (++BI; BI != BBsToSinkInto.end(); ++BI) {
+ BasicBlock *N = *BI;
----------------
chandlerc wrote:
> You didn't actually switch to the sorted list here.
>
> Also, you can just use a range based for loop here.
The reason I used iterator here is because we need to handle the first entry in a different way.
https://reviews.llvm.org/D22778
More information about the llvm-commits
mailing list