[PATCH] D22778: Add Loop Sink pass to reverse the LICM based of basic block frequency.

Dehao Chen via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 19 08:39:14 PDT 2016


danielcdh added a comment.

Thanks for the reviews.

I also added an overall algorithm description at file level per Madhur's suggest.



================
Comment at: lib/Transforms/Scalar/LoopSink.cpp:159
+                            BlockFrequencyInfo &BFI) {
+  if (I.getNumUses() > MaxNumberOfUsesForSinking)
+    return false;
----------------
chandlerc wrote:
> The number of user *instructions* isn't really the right thing to apply the threshold to as that doesn't directly change the cost.
> 
> The idea is that we need the size of `BBsToSinkInto` to be a small constant in order for the search for the coldest dominating set to be "just" linear in the number of blocks in the loop.
> 
> So while a threshold of "40" may make sense for number of user instructions, I suspect the threshold should be much smaller when applied to the size of `BBsToSinkInto`.
> 
> I also think you should add two comments about this. One, you should comment to the `findBBsToSinkInto` function clarifying the algorithmic complexity (That it O(N * M) or O(M^2) where N is SortedLoopBBs.size() and M is BBsToSinkInto.size()), and you should mention where you check this threshold that the reason is because we're going to call `findBBsToSinkInto` which would go quadratic if we didn't set a cap.
> 
> The reason for all of this is that I'm worried some future maintainer will come along and not really understand how risk it is to adjust these thresholds so I think it is important to document the implications.
> 
> I still think we will long-term need a better algorithmic approach here as I suspect we'll find annoying cases where the threshold blocks an important optimization (such as when there are way too many initial BBsToInsertInto but there are a small number of common dominating blocks). But I understand this is one of the really hard problems (its the same as optimal PHI placement and a bunch of other ones), and I don't want to hold up your patch on a comprehensive approach here.
> 
> 
> On an unrelated note, you should also document that this threshold has a secondary function: it places an upper bound on how much code growth we may trigger here. I'd document this in particular as that seems somewhat accidental and I suspect we may long-term want a better threshold for that. I would in fact encourage you to leave a FIXME to adjust this for min_size and opt_size.
> 
> 
> 
You actually mean the size of BBs to be small constant? Because computing BBsToSinkInto is the most computation intensive part of the algorithm.


================
Comment at: lib/Transforms/Scalar/LoopSink.cpp:194
+
+  for (++BI; BI != BBsToSinkInto.end(); ++BI) {
+    BasicBlock *N = *BI;
----------------
chandlerc wrote:
> You didn't actually switch to the sorted list here.
> 
> Also, you can just use a range based for loop here.
The reason I used iterator here is because we need to handle the first entry in a different way.


https://reviews.llvm.org/D22778





More information about the llvm-commits mailing list