[PATCH] D101109: [SLP]Improve multinode analysis.

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 8 07:30:20 PST 2021


ABataev added a comment.

In D101109#3113468 <https://reviews.llvm.org/D101109#3113468>, @vporpo wrote:

> Perhaps the score changes could be split into a separate patch?

Not alone, they cause regressions. Will try to separate cost-model changes.



================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1337
+      unsigned SameOpNumber = 0;
+      MapVector<unsigned, std::pair<unsigned, unsigned>> HashMap;
+      for (int I = getNumLanes(); I > 0; --I) {
----------------
vporpo wrote:
> Could you add a comment what these `unsigned` values are for? (or perhaps use a struct instead?). 
> Could you also describe the the purpose of `HashMap` and what the keys and values are? 
Yes, just forgot to add some extra comments, will add them after updates.
`std::pair<unsigned, unsigned>` is used to implement a simple voting algorithm and choose the lane with the least number of operands that can freely move about or less profitable because it already has the most optimal set of operands.
The first unsigned is a counter for voting, the second unsigned is the counter of lanes with instructions with same/alternate opcodes and same parent basic block.



================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1338
+      MapVector<unsigned, std::pair<unsigned, unsigned>> HashMap;
+      for (int I = getNumLanes(); I > 0; --I) {
+        unsigned Lane = I - 1;
----------------
vporpo wrote:
> Is there any reasoning behind the iteration in reverse? If so could you please add a comment?
This is just to be closer to the original results, before this patch, nothing else, if we have multiple lanes with same cost.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1379
+      /// parent.
+      unsigned SameCodeParentOps = 0;
+      /// Hash for the actual operands ordering.
----------------
vporpo wrote:
> NIT: I find `Code` a bit confusing, also perhaps there is no need to refer to `Parent` in the variable name?  Perhaps rename to something like `NumOpsWithSameOpcode`?
Will rename it but I'd rather keep `Parent`, because I compare not only opcodes but the parent too.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1380
+      unsigned SameCodeParentOps = 0;
+      /// Hash for the actual operands ordering.
+      unsigned Hash = 0;
----------------
vporpo wrote:
> Could you add a bit more text in the comment what the hashed is used for? I can see that it is used as a key in the `HashMap` above, but could you explain how it is being used?
It is used to count operands, actually their position id and opcode value. It is used in the voting mechanism to find the lane with the least number of operands that can freely move about or less profitable because it already has the most optimal set of operands. I can use `SmallVector<unsigned>` instead but to use hash code, it is faster and requires less memory.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1396-1397
       // the other by subtracting it from the total number of operands.
-      for (unsigned OpIdx = 0; OpIdx != NumOperands; ++OpIdx)
-        if (getData(OpIdx, Lane).APO)
+      // Operands with the same instruction opcode and parent are more
+      // profitable since we don't need to move them in many cases.
+      bool AllUndefs = true;
----------------
vporpo wrote:
> Could you elaborate a bit on this? If I understand correctly the more similar opcodes we can find, the easier it is to reorder them, therefore this can act as a tie-breaker when the NumOfAPOs is equal?
> 
If the lane already has operands with the same opcode and same parent, no need to swap the operands in this lane, with a high probability such lane already can be vectorized effectively.


================
Comment at: llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:1408
+        if (auto *I = dyn_cast<Instruction>(OpData.V)) {
+          if (!OpcodeI || !getSameOpcode({OpcodeI, I}).getOpcode() ||
+              I->getParent() != Parent) {
----------------
vporpo wrote:
> If I am not mistaken this code will count the consecutive operands with same opcode and BB. Is it because this is a good enough approximation?
Yes, exactly, in most cases it results in the optimal values in the lane.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101109/new/

https://reviews.llvm.org/D101109



More information about the llvm-commits mailing list