[llvm-dev] Unsafe floating point operation (FDiv & FRem) in LoopVectorizer
Nema, Ashutosh via llvm-dev
llvm-dev at lists.llvm.org
Tue Sep 25 00:23:30 PDT 2018
Hi,
Consider the following test case:
int foo(float *A, float *B, float *C, int len, int VSMALL) {
  for (int i = 0; i < len; i++)
    if (C[i] > VSMALL)
      A[i] = B[i] / C[i];
}
In this test the division is conditional, but LLVM generates an unconditional vector fdiv for it:
vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %0 = getelementptr inbounds float, float* %C, i64 %index
  %1 = bitcast float* %0 to <8 x float>*
  %wide.load = load <8 x float>, <8 x float>* %1, align 4, !tbaa !2, !alias.scope !6
  %2 = fcmp ogt <8 x float> %wide.load, %broadcast.splat30
  %3 = getelementptr inbounds float, float* %B, i64 %index
  %4 = bitcast float* %3 to <8 x float>*
  %wide.masked.load = call <8 x float> @llvm.masked.load.v8f32.p0v8f32(<8 x float>* %4, i32 4, <8 x i1> %2, <8 x float> undef), !tbaa !2, !alias.scope !9
  %5 = fdiv <8 x float> %wide.masked.load, %wide.load
  %6 = getelementptr inbounds float, float* %A, i64 %index
  %7 = bitcast float* %6 to <8 x float>*
  call void @llvm.masked.store.v8f32.p0v8f32(<8 x float> %5, <8 x float>* %7, i32 4, <8 x i1> %2), !tbaa !2, !alias.scope !11, !noalias !13
  %index.next = add i64 %index, 8
  %8 = icmp eq i64 %index.next, %n.vec
  br i1 %8, label %middle.block, label %vector.body, !llvm.loop !14
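The generated IR seems unsafe because the fdiv does not respect the compare mask (%2): it is executed for every lane, and only the load of B and the store to A are masked.
Since division is a potentially unsafe operation, LLVM should generate predicated divs here.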
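To make the concern concrete, here is a minimal C++ sketch of one way the difference could become visible, assuming the floating-point status flags are treated as observable. The names guardedDiv, unguardedDiv, raisesDivByZero and demoFlagDifference are made up for this illustration and are not part of LLVM. The volatile qualifiers are only there to keep the compiler from folding or removing the divisions.

```cpp
#include <cassert>
#include <cfenv>
#include <cstddef>

// Scalar semantics of the test case: divide only where the guard holds.
void guardedDiv(volatile float *A, const volatile float *B,
                const volatile float *C, std::size_t len, int vsmall) {
  for (std::size_t i = 0; i < len; ++i)
    if (C[i] > vsmall)
      A[i] = B[i] / C[i];
}

// Models the vectorized body: every lane divides, only the store is masked.
void unguardedDiv(volatile float *A, const volatile float *B,
                  const volatile float *C, std::size_t len, int vsmall) {
  for (std::size_t i = 0; i < len; ++i) {
    volatile float q = B[i] / C[i]; // runs even when the guard is false
    if (C[i] > vsmall)
      A[i] = q;
  }
}

// Runs `kernel` and reports whether it raised the divide-by-zero flag.
template <typename Kernel> bool raisesDivByZero(Kernel kernel) {
  std::feclearexcept(FE_ALL_EXCEPT);
  kernel();
  return std::fetestexcept(FE_DIVBYZERO) != 0;
}

void demoFlagDifference() {
  float A[4] = {0, 0, 0, 0};
  float B[4] = {1, 1, 1, 1};
  float C[4] = {2, 0, 4, 0}; // lanes 1 and 3 fail the guard C[i] > 0
  assert(!raisesDivByZero([&] { guardedDiv(A, B, C, 4, 0); }));
  assert(raisesDivByZero([&] { unguardedDiv(A, B, C, 4, 0); }));
}
```

Both versions store the same values to A; the only difference in this sketch is the exception flag raised by the lanes that the source program never divides.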
If I change the element type of A, B and C to an integer type, LLVM generates the right code: the div is predicated on the mask, and a scalar div is generated for each lane.
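As a rough picture of what "predicated and scalarized per lane" means for the integer case, the following C++ sketch emulates one vector iteration by hand. It is an illustration only, not the code LLVM emits; predicatedDivBlock and demoPredicatedDiv are hypothetical names, and VF = 8 is taken from the vector width in the IR above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 8; // vector width, matching the <8 x ...> IR above

// One vector iteration over VF consecutive elements, with the division
// scalarized and executed only for lanes whose mask bit is set.
void predicatedDivBlock(int *A, const int *B, const int *C, int vsmall) {
  std::array<bool, VF> mask{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    mask[lane] = C[lane] > vsmall; // models the vector compare
  for (std::size_t lane = 0; lane < VF; ++lane)
    if (mask[lane])
      A[lane] = B[lane] / C[lane]; // scalar div, active lanes only
}

void demoPredicatedDiv() {
  int A[VF] = {-1, -1, -1, -1, -1, -1, -1, -1};
  int B[VF] = {10, 10, 10, 10, 10, 10, 10, 10};
  int C[VF] = {2, 0, 5, 0, 1, 0, 10, 0}; // zero divisors fail C > 0
  predicatedDivBlock(A, B, C, 0);
  assert(A[0] == 5 && A[2] == 2 && A[4] == 10 && A[6] == 1);
  assert(A[1] == -1 && A[3] == -1 && A[5] == -1 && A[7] == -1);
}
```

Because the division is skipped for inactive lanes, the zero divisors in C are never used, which is what makes the integer version safe.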
The floating-point case looks like a problem in the predicated-instruction detection part of the LoopVectorizer (LV), which currently considers only UDiv, SDiv, URem and SRem:
bool LoopVectorizationCostModel::isScalarWithPredication(Instruction *I, unsigned VF) {
  if (!Legal->blockNeedsPredication(I->getParent()))
    return false;
  switch (I->getOpcode()) {
  default:
    break;
  // Only the integer division/remainder opcodes are listed here;
  // the floating-point operations FDiv and FRem are not considered.
  case Instruction::UDiv:
  case Instruction::SDiv:
  case Instruction::SRem:
  case Instruction::URem:
    return mayDivideByZero(*I);
  }
  return false;
}
I don't know the history of this function, but I think it should consider the FDiv and FRem instructions as well.
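The change I have in mind can be modelled roughly as follows. This is a standalone sketch and not the actual patch: Opcode is a hypothetical stand-in for LLVM's instruction opcodes, and the two bool parameters stand in for Legal->blockNeedsPredication(...) and mayDivideByZero(*I). How mayDivideByZero should treat floating-point divisors is left open here.

```cpp
#include <cassert>

// Hypothetical stand-in for LLVM's instruction opcodes (not the real enum).
enum class Opcode { Add, FAdd, UDiv, SDiv, URem, SRem, FDiv, FRem };

// Simplified model of the check. The two bool parameters stand in for
// Legal->blockNeedsPredication(...) and mayDivideByZero(*I).
bool needsScalarPredication(Opcode op, bool blockNeedsPredication,
                            bool mayDivideByZero) {
  if (!blockNeedsPredication)
    return false;
  switch (op) {
  default:
    break;
  case Opcode::UDiv:
  case Opcode::SDiv:
  case Opcode::SRem:
  case Opcode::URem:
  case Opcode::FDiv: // proposed addition
  case Opcode::FRem: // proposed addition
    return mayDivideByZero;
  }
  return false;
}

void demoProposedCheck() {
  assert(needsScalarPredication(Opcode::FDiv, true, true));
  assert(needsScalarPredication(Opcode::SDiv, true, true));
  assert(!needsScalarPredication(Opcode::FDiv, false, true));
  assert(!needsScalarPredication(Opcode::FAdd, true, true));
}
```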
If there are no objections, I will prepare a patch.
Thanks,
Ashutosh