[PATCH] D43776: [SLP] Fix PR36481: vectorize reassociated instructions.

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Apr 1 00:31:08 PDT 2018


Ayal accepted this revision.
Ayal added a comment.
This revision is now accepted and ready to land.

Looks good to me, thanks for addressing the issues, have only a few last minor suggestions.

In https://reviews.llvm.org/D43776#1032937, @ABataev wrote:

> In https://reviews.llvm.org/D43776#1031044, @Ayal wrote:
>
> > This patch addresses the following TODO, plus handles extracts:
> >
> >   // Check if the loads are consecutive, reversed, or neither.
> >   // TODO: What we really want is to sort the loads, but for now, check
> >   // the two likely directions.
> >
> >
> > At some point it's worth documenting that the best order is set once, at the root of the tree, and then gets propagated to its leaves. Would be good to do so w/o having to rebuild the tree (introduced before this patch)?
>
>
> Yes, but this must be in a different patch, not this one.


Sure, please add a TODO. This patch makes such rebuilds more frequent.



================
Comment at: include/llvm/Analysis/LoopAccessAnalysis.h:673
+///
+/// Returns 'false' if sorting is not legal, otherwise returns 'true'.
+///
----------------
Sounds better to specify "Returns 'true' if ..., otherwise returns 'false'" ?


================
Comment at: lib/Analysis/LoopAccessAnalysis.cpp:1113
+    const auto *Diff =
+        dyn_cast<SCEVConstant>(SE.getMinusSCEV(SE.getSCEV(Ptr), Scev0));
+    // The pointers may not have a constant offset from each other, or SCEV
----------------
ABataev wrote:
> Ayal wrote:
> > ABataev wrote:
> > > Ayal wrote:
> > > > ABataev wrote:
> > > > > Ayal wrote:
> > > > > > Use computeConstantDifference() instead of computing it explicitly? It should compare GetUnderlyingObject()'s, if worthwhile, rather than doing so here.
> > > > > Tried, it does not work in many cases.
> > > > Then that should be fixed, worth opening a PR?
> > > I'm not sure that this is a bug, it is a feature :) . computeConstantDifference() is used to get the difference between the pointer returned by the `GetUnderlyingObject()` and  kind of a `GEP %ptr, 0, n`, where `n` is constant. But this function does not work if the first load element is from `GEP %ptr, 0, m`, not from `%ptr`, and others are from `GEP %ptr, 0, m+1`, `GEP %ptr, 0, m+2` etc.
> > ok, right; `computeConstantDifference()` is lightweight. But it would be good to refactor a more time consuming variant which checks if `getMinusSCEV` returns a constant, and first compares UnderlyingObjects; and also AddressSpaces, following LoopVectorize.
> Added checks for addressspace, comparison for underlying objects already were there.
Right, the comment was about refactoring it altogether into a separate, more time consuming variant of `computeConstantDifference()` that checks everything; can alternatively leave a TODO for now?


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:2074
+    }
   }
 
----------------
Ayal wrote:
> It may be easier to read if instead of clearing CurrentOrder at each early exit, we break from the loop, and right after the loop check if it exited early and if (`I < E`) clear CurrentOrder and return false.
The suggestion for easier reading was for something like:

```
  for (unsigned I = 0; I < E; ++I) {
    auto *Inst = cast<Instruction>(VL[I]);
    if (Inst->getOperand(0) != Vec)
      break;
    Optional<unsigned> Idx = getExtractIndex(Inst);
    if (!Idx)
      break;
    const unsigned ExtIdx = *Idx;
    if (ExtIdx >= E || CurrentOrder[ExtIdx] != E + 1)
      break;
    CurrentOrder[ExtIdx] = I;
    if (ExtIdx != I)
      ShouldKeepOrder = false;
  }

  if (I < E) {
    CurrentOrder.clear();
    return false;
  }
  return ShouldKeepOrder;
```


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1252
+  /// reordering, if reordering is not required it is counted using \a
+  /// DirectOrderNum.
+  DenseMap<OrdersType, unsigned, OrdersTypeDenseMapInfo> NumOpsWantToKeepOrder;
----------------
`/// DirectOrderNum.` >> `/// NumOpsWantToKeepOriginalOrder.`


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1629
+        newTreeEntry(VL, /*Vectorized=*/true, UserTreeIdx, ReuseShuffleIndicies,
+                     I->getFirst());
+        return;
----------------
Would the following work and be easier to read?

>   ++NumOpsWantToKeepOrder[CurrentOrder];
> 
>   auto &StoredCurrentOrder = NumOpsWantToKeepOrder.find(CurrentOrder)->getFirst();
>   newTreeEntry(VL, /*Vectorized=*/true, UserTreeIdx, ReuseShuffleIndicies,
>                StoredCurrentOrder);

or alternatively rename `I` to something more meaningful like `StoredCurrentOrderAndNum`.


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:2054
+  bool WrongSequence = false;
+  CurrentOrder.assign(VL.size(), E + 1);
+  for (unsigned I = 0; I < E; ++I) {
----------------
May as well do `CurrentOrder.assign(E, E+1);`


Repository:
  rL LLVM

https://reviews.llvm.org/D43776





More information about the llvm-commits mailing list