[PATCH] D13722: Scalarization for masked gather/scatter intrinsics
Elena Demikhovsky via llvm-commits
llvm-commits at lists.llvm.org
Sun Oct 25 03:17:58 PDT 2015
delena marked an inline comment as done.
delena added a comment.
Hi Michael, not committing the code. I want to hear your opinion about rejection of non-power-of-2 elements.
================
Comment at: ../lib/CodeGen/CodeGenPrepare.cpp:1361
@@ +1360,3 @@
+ "Ptr" + Twine(Idx));
+ LoadInst* Load = Builder.CreateAlignedLoad(Ptr, AlignVal,
+ "Load" + Twine(Idx));
----------------
mkuper wrote:
> Don't you need AlignVal = std::min(AlignVal, VecType->getScalarSizeInBits()), like for the load/store case?
No. Gather / scatter alignment means alignment for one element, not for a vector. If the specified alignment is bigger than element size, we can't handle it properly on X86. The only one option that I see here is to scalarize masked-gather-scatter to a chain of scalar loads with required alignment.
================
Comment at: ../lib/CodeGen/CodeGenPrepare.cpp:1367
@@ +1366,3 @@
+ }
+ Value *NewI = Builder.CreateSelect(Mask, VResult, Src0);
+ CI->replaceAllUsesWith(NewI);
----------------
mkuper wrote:
> This gets eventually cleaned up is Src0 is undef, right?
Yes, the codegen with clean it anyway.
================
Comment at: ../lib/Target/X86/X86TargetTransformInfo.cpp:1212
@@ +1211,3 @@
+}
+
+bool X86TTIImpl::isLegalMaskedScatter(Type *DataType) {
----------------
mkuper wrote:
> So, say, a <32 x i32> gather will get split into two <16 x i32> gathers by the legalizer?
> In any case:
> a) We should probably reject the not-power-of-2 case, unless we know the legalizer can handle it. If we don't reject it, then I think there should be a test for it.
> b) I guess the name is a bit confusing - it's not really "isLegal" (because some things it accepts may not actually be legal on the target), it's more like "isLegalizeable". I don't think we should change the name, but it may be worth to note this in the declaration in TargetTransformInfo.h.
a) I can add rejection of the not-power-of-2 cases. And a test.
But I want to know your opinion about vector and scalar types in this function:
(This comment I want to put inside)
// This function is called now in two cases: from the Loop Vectorizer
// and from the Scalarizer.
// When the Loop Vectorizer asks about legality of the feature,
// the vectorization factor is not calculated yet. The Loop Vectorizer
// sends a scalar type and the decision is based on the width of the
// scalar element.
// Later on, the cost model will estimate usage this intrinsic based on
// the vector type.
// The Scalarizer asks again about legality. It sends a vector type.
// In this case we can reject non-power-of-2 vectors.
if (isa<VectorType>(DataTy) && !isPowerOf2_32(DataTy->getVectorNumElements()))
return false;
Type *ScalarTy = DataTy->getScalarType();
...
b) "IsLegal" comes from TypeLegalizer and it is equal to "IsSupported". I wrote in the comments in TargetTransformInfo.h "Return true if the target supports.."
Repository:
rL LLVM
http://reviews.llvm.org/D13722
More information about the llvm-commits
mailing list