[PATCH] D22778: Add Loop Sink pass to reverse the LICM based of basic block frequency.

Madhur Amilkanthwar via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 19 01:12:49 PDT 2016


Dahao,
I think the basic algorithm should be captured somewhere (preferably in
patch's summary)
It would be useful to know the algorithm quickly rather than going through
source code.

On Wed, Oct 19, 2016 at 1:19 PM, Chandler Carruth via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> chandlerc added a comment.
>
> This is really close. Some minor nit picks and a few more interesting
> comments below.
>
>
>
> ================
> Comment at: include/llvm/Transforms/Utils/LoopUtils.h:472-474
> +/// If SafetyInfo is nullptr, we are checking for sinking instructions
> from
> +/// preheader to loop body (no speculation).
> +/// If SafetyInfo is not nullptr, we are checking for hoisting/sinking
> ----------------
> chandlerc wrote:
> > Here and elsewhere in comments, I would just say "is null" rather than
> "is nullptr".
> You changed one 'nullptr' to 'null' but missed the other.
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:11-12
> +// This pass does the inverse transformation of what LICM does.
> +// It traverses all of the instructions in the loop's preheader and sink
> +// them/ to the loop body where frequency is lower than the loop's
> preheader.
> +// This pass is a reverse-transformation of LICM. It differs from the Sink
> ----------------
> "and sink them/ to" -> "and sinks them to"
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:101
> +///
> +/// /p SortedLoopBBs is used to help find the optimal sinking locations.
> +/// It stores a list of BBs that is:
> ----------------
> "/p" -> "\p"
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:159
> +                            BlockFrequencyInfo &BFI) {
> +  if (I.getNumUses() > MaxNumberOfUsesForSinking)
> +    return false;
> ----------------
> The number of user *instructions* isn't really the right thing to apply
> the threshold to as that doesn't directly change the cost.
>
> The idea is that we need the size of `BBsToSinkInto` to be a small
> constant in order for the search for the coldest dominating set to be
> "just" linear in the number of blocks in the loop.
>
> So while a threshold of "40" may make sense for number of user
> instructions, I suspect the threshold should be much smaller when applied
> to the size of `BBsToSinkInto`.
>
> I also think you should add two comments about this. One, you should
> comment to the `findBBsToSinkInto` function clarifying the algorithmic
> complexity (That it O(N * M) or O(M^2) where N is SortedLoopBBs.size() and
> M is BBsToSinkInto.size()), and you should mention where you check this
> threshold that the reason is because we're going to call
> `findBBsToSinkInto` which would go quadratic if we didn't set a cap.
>
> The reason for all of this is that I'm worried some future maintainer will
> come along and not really understand how risk it is to adjust these
> thresholds so I think it is important to document the implications.
>
> I still think we will long-term need a better algorithmic approach here as
> I suspect we'll find annoying cases where the threshold blocks an important
> optimization (such as when there are way too many initial BBsToInsertInto
> but there are a small number of common dominating blocks). But I understand
> this is one of the really hard problems (its the same as optimal PHI
> placement and a bunch of other ones), and I don't want to hold up your
> patch on a comprehensive approach here.
>
>
> On an unrelated note, you should also document that this threshold has a
> secondary function: it places an upper bound on how much code growth we may
> trigger here. I'd document this in particular as that seems somewhat
> accidental and I suspect we may long-term want a better threshold for that.
> I would in fact encourage you to leave a FIXME to adjust this for min_size
> and opt_size.
>
>
>
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:178-179
> +      findBBsToSinkInto(L, BBs, SortedLoopBBs, DT, BFI);
> +  if (BBsToSinkInto.size() == 0)
> +    return false;
> +
> ----------------
> We generally prefer calling `.empty()` to testing `.size()` against zero.
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:184
> +                             BBsToSinkInto.end());
> +  std::stable_sort(SortedBBsToSinkInto.begin(),
> SortedBBsToSinkInto.end(),
> +                   [&](BasicBlock *A, BasicBlock *B) {
> ----------------
> Here you don't need stable sort since this is a total ordering. You should
> just use std::sort and mention that this is a known total ordering in a
> comment. You could do that in an overall comment htat explains what you're
> doing here:
>
>   // Copy the final BBs into a vector and sort them using the total
> ordering
>   // of the loop block numbers as iterating the set doesn't give a useful
>   // order. No need to stable sort as the block numbers are a total
> ordering.
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:194
> +
> +  for (++BI; BI != BBsToSinkInto.end(); ++BI) {
> +    BasicBlock *N = *BI;
> ----------------
> You didn't actually switch to the sorted list here.
>
> Also, you can just use a range based for loop here.
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:218
> +/// sum frequency of inserted copy is smaller than preheader's frequency.
> +bool sinkLoop(Loop &L, AAResults &AA, LoopInfo &LI, DominatorTree &DT,
> +              BlockFrequencyInfo &BFI, ScalarEvolution *SE) {
> ----------------
> This routine doesn't sink the loop, so a name `sinkLoop` seems confusing.
> Maybe `sinkLoopInvariantInstructions`?
>
> Also, I think this should be a static function.
>
>
> ================
> Comment at: lib/Transforms/Scalar/LoopSink.cpp:241-246
> +  SmallVector<BasicBlock *, 10> SortedLoopBBs;
> +  DenseMap<BasicBlock *, int> LoopBlockNumber;
> +  int i = 0;
> +  for (BasicBlock *B : L.blocks())
> +    if (BFI.getBlockFreq(B) <= BFI.getBlockFreq(L.getLoopPreheader())) {
> +      SortedLoopBBs.push_back(B);
> ----------------
> It wasn't obvious to me reading this that `SortedLoopBBs` only contained
> the oop BBs that are less hot than the preheader. I think it might be nice
> to clue the reader in that this isn't *all* the loop BBs. Maybe
> `SortedColdLoopBBs`? Or just `ColdLoopBBs`? If you make this change, I'd
> keep the name consistent throughout of course.
>
> Also, you use `<=` here, but `<` everywhere else I see, any particular
> reason to include BBs in this list with the same frequency?
>
>
> https://reviews.llvm.org/D22778
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>



-- 
*Disclaimer: Views, concerns, thoughts, questions, ideas expressed in this
mail are of my own and my employer has no take in it. *
Thank You.
Madhur D. Amilkanthwar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161019/ba95262f/attachment.html>


More information about the llvm-commits mailing list