[PATCH] D28907: [SLP] Fix for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops.

Alexey Bataev via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 30 07:13:56 PST 2018


ABataev added inline comments.


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:3657
+  for (Value *V : VL)
+    if (!extendSchedulingRegion(V, S.OpValue))
       return false;
----------------
dtemirbulatov wrote:
> ABataev wrote:
> > Seems to me, we should extend scheduling region only for instructions with the same or alternate opcode. Maybe, it will resolve all your problems.
> No, in that case, we would not be able to extend scheduling region with non-alternative opcodes.
Did you try that? Actually, we're not going to vectorize instructions with the different opcodes. So, maybe, we should exclude them from the scheduling region?


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4079
+          LastScheduledInst = pickedInst;
+          std::set<Instruction *> Reschedule;
+          Reschedule.insert(pickedInst);
----------------
dtemirbulatov wrote:
> ABataev wrote:
> > 1. Use `SmallSet` instead
> > 2. Why you can't use `SmallVector` instead?
> Replaced to SmallSet, We could not use Vector here since it is not possible to have multiple entities to a single Instruction.
The main problem with this code that you're checking only one level of dependency. What if you have dependency deeper, in 2, 3 or more level? Will it work? The code itself is very complex and, if we're going to keep this solution, must be outlined in a separate function.


================
Comment at: test/Transforms/SLPVectorizer/X86/internal-dep.ll:10
+; We are tring to arrange this bundle with an internal dependancy:
+; [  %sub.i = add nsw i32 undef, -1;  %shr.i.i = lshr i32 %conv31.i, 1;  %shr.1.i.i = lshr i32 %conv31.i, 2;  %shr.2.i.i = lshr i32 %conv31.i, 3;  %shr.3.i.i = lshr i32 %conv31.i, 4;  %shr.4.i.i = lshr i32 %conv31.i, 5;  %shr.5.i.i = lshr i32 %conv31.i, 6;  %shr.6.i.i = lshr i32 %conv31.i, 7;  %shr.7.i.i = lshr i32 %conv31.i, 8;  %shr.8.i.i = lshr i32 %conv31.i, 9;  %shr.9.i.i = lshr i32 %conv31.i, 10;  %shr.10.i.i = lshr i32 %conv31.i, 11;  %shr.11.i.i = lshr i32 %conv31.i, 12;  %shr.12.i.i = lshr i32 %conv31.i, 13;  %shr.13.i.i = lshr i32 %conv31.i, 14;  %shr.14.i.i = lshr i32 %conv31.i, 15]
+; There is a data dependancy for conv31.i that uses sub.i and other operations that use conv31.i.
----------------
That's why I suggest to try to exclude this instruction from the scheduling. Actually, we should not schedule `%sub.i`, we should schedule pseudo instruction `lshr i32 %sub.i, 0`. Shall we include it into scheduling?


https://reviews.llvm.org/D28907





More information about the llvm-commits mailing list