[PATCH] Using Masked Load / Store intrinsics in Loop Vectorizer
Elena Demikhovsky
elena.demikhovsky at intel.com
Sun Dec 7 05:17:20 PST 2014
I'm adding comments, changing function names, joining 4 tests in one.
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:778
@@ +777,3 @@
+ }
+ bool setMaskedOp(const Instruction* I) {
+ return (MaskedOp.find(I) != MaskedOp.end());
----------------
aschwaighofer wrote:
> This is a predicate. Could you rename it to isMaskedOperation?
do you mean isLegalMaskedLoad()? isLegalMaskedStore()?
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:889
@@ -874,1 +888,3 @@
+ /// call to an appropriate masked intrinsic
+ std::set<const Instruction*> MaskedOp;
};
----------------
aschwaighofer wrote:
> Should this be a SmallPtrSet?
What size should I give in this case? I don't know how many masked operations will be in this block.
Or it is not so important?
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5386-5387
@@ +5385,4 @@
+ !SI->getParent()->getSinglePredecessor()) {
+ if (canPredicateStore(SI->getValueOperand()->getType(),
+ SI->getPointerOperand())) {
+ MaskedOp.insert(SI);
----------------
mzolotukhin wrote:
> Could this condition be simplified? E.g. we can store result of canPredicateStore(..) in a bool variable and use it here.
> Also, I think it might be incorrect - shouldn't we insert SI to MaskedOp even if NumPredStores hasn't exceeded NumberOfStoresToPredicate?
If NumPredStores hasn't exceeded NumberOfStoresToPredicate I'll not generate masked store. I want to keep the current behavior.
Something like this?
bool isSafePtr = (SafePtrs.count(SI->getPointerOperand()) !=0);
bool isSinglePredecessor = SI->getParent()->getSinglePredecessor();
if (++NumPredStores > NumberOfStoresToPredicate || !isSafePtr ||
!isSinglePredecessor) {
// Build a masked store if it is legal for the target, otherwise scalarize
// the block.
bool isLegalMaskedOp = canPredicateStore(SI->getValueOperand()->getType(),
SI->getPointerOperand());
if (isLegalMaskedOp) {
--NumPredStores;
MaskedOp.insert(SI);
continue;
}
return false;
}
================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:5457
@@ -5389,3 +5456,3 @@
- assert(MaxVectorSize <= 32 && "Did not expect to pack so many elements"
+ assert(MaxVectorSize <= 64 && "Did not expect to pack so many elements"
" into one vector!");
----------------
aschwaighofer wrote:
> Should this be in a separate patch? This seems like an unrelated change.
My tests don't work without this patch because 512 / 8 = 64
================
Comment at: test/Transforms/LoopVectorize/X86/mask1.ll:1
@@ +1,2 @@
+; RUN: opt < %s -O3 -mcpu=corei7-avx -S | FileCheck %s -check-prefix=AVX1
+; RUN: opt < %s -O3 -mcpu=core-avx2 -S | FileCheck %s -check-prefix=AVX2
----------------
aschwaighofer wrote:
> Could the mask1.ll, mask2.ll, etc be in one file? We try to not create to many .ll files when test cases are related.
>
> Could you use
>
> CHECK-LABEL: @function_name
> CHECK: ... // rest of the test for function_name test
>
> CHECK-LABEL: @function_name2
> CHECK: ... // rest of the test for function_name2 test
>
> Could you modify the test so that it tests more of the expected IR?
>
no problem
http://reviews.llvm.org/D6527
More information about the llvm-commits
mailing list