[PATCH] D28907: [SLP] Fix for PR30787: Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Alexey Bataev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 30 07:13:56 PST 2018
ABataev added inline comments.
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:3657
+ for (Value *V : VL)
+ if (!extendSchedulingRegion(V, S.OpValue))
return false;
----------------
dtemirbulatov wrote:
> ABataev wrote:
> > Seems to me, we should extend scheduling region only for instructions with the same or alternate opcode. Maybe, it will resolve all your problems.
> No, in that case, we would not be able to extend scheduling region with non-alternative opcodes.
Did you try that? Actually, we're not going to vectorize instructions with the different opcodes. So, maybe, we should exclude them from the scheduling region?
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:4079
+ LastScheduledInst = pickedInst;
+ std::set<Instruction *> Reschedule;
+ Reschedule.insert(pickedInst);
----------------
dtemirbulatov wrote:
> ABataev wrote:
> > 1. Use `SmallSet` instead
> > 2. Why you can't use `SmallVector` instead?
> Replaced to SmallSet, We could not use Vector here since it is not possible to have multiple entities to a single Instruction.
The main problem with this code that you're checking only one level of dependency. What if you have dependency deeper, in 2, 3 or more level? Will it work? The code itself is very complex and, if we're going to keep this solution, must be outlined in a separate function.
================
Comment at: test/Transforms/SLPVectorizer/X86/internal-dep.ll:10
+; We are tring to arrange this bundle with an internal dependancy:
+; [ %sub.i = add nsw i32 undef, -1; %shr.i.i = lshr i32 %conv31.i, 1; %shr.1.i.i = lshr i32 %conv31.i, 2; %shr.2.i.i = lshr i32 %conv31.i, 3; %shr.3.i.i = lshr i32 %conv31.i, 4; %shr.4.i.i = lshr i32 %conv31.i, 5; %shr.5.i.i = lshr i32 %conv31.i, 6; %shr.6.i.i = lshr i32 %conv31.i, 7; %shr.7.i.i = lshr i32 %conv31.i, 8; %shr.8.i.i = lshr i32 %conv31.i, 9; %shr.9.i.i = lshr i32 %conv31.i, 10; %shr.10.i.i = lshr i32 %conv31.i, 11; %shr.11.i.i = lshr i32 %conv31.i, 12; %shr.12.i.i = lshr i32 %conv31.i, 13; %shr.13.i.i = lshr i32 %conv31.i, 14; %shr.14.i.i = lshr i32 %conv31.i, 15]
+; There is a data dependancy for conv31.i that uses sub.i and other operations that use conv31.i.
----------------
That's why I suggest to try to exclude this instruction from the scheduling. Actually, we should not schedule `%sub.i`, we should schedule pseudo instruction `lshr i32 %sub.i, 0`. Shall we include it into scheduling?
https://reviews.llvm.org/D28907
More information about the llvm-commits
mailing list