[llvm-dev] llvm 10: Why is float experimental_vector_reduce_fmin not tried?

Mark Schimmel via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 24 13:16:20 PST 2020


LLVM 10 vectorizes this same function for floating-point addition just fine (it uses experimental_vector_reduce_v2_fadd), but refuses to do the same for minf(). Does anyone have any insight into why that would be? I'm compiling with -ffast-math, but that doesn't seem to help.

From grepping the sources, the best I can figure is that some logic exists for Instruction::FCmp but perhaps not for Intrinsic::minnum. Is that the case?

; Function Attrs: norecurse nounwind readonly
define float @f(float addrspace(4)* noalias nocapture readonly %a, float addrspace(4)* noalias nocapture readonly %b, float %m) local_unnamed_addr #0 {
entry:
  br label %for.body

for.cond.cleanup:                                 ; preds = %for.body
  ret float %3

for.body:                                         ; preds = %entry, %for.body
  %m.addr.024 = phi float [ %m, %entry ], [ %3, %for.body ] ; [#uses=1 type=float]
  %i.023 = phi i32 [ 0, %entry ], [ %inc, %for.body ] ; [#uses=3 type=i32]
  %arrayidx = getelementptr inbounds float, float addrspace(4)* %a, i32 %i.023 ; [#uses=1 type=float addrspace(4)*]
  %0 = load float, float addrspace(4)* %arrayidx, align 4, !tbaa !3 ; [#uses=1 type=float]
  %arrayidx1 = getelementptr inbounds float, float addrspace(4)* %b, i32 %i.023 ; [#uses=1 type=float addrspace(4)*]
  %1 = load float, float addrspace(4)* %arrayidx1, align 4, !tbaa !3 ; [#uses=1 type=float]
  %2 = tail call fast float @llvm.minnum.f32(float %0, float %1) ; [#uses=1 type=float]
  %3 = tail call fast float @llvm.minnum.f32(float %m.addr.024, float %2) ; [#uses=2 type=float]
  %inc = add nuw nsw i32 %i.023, 1                ; [#uses=2 type=i32]
  %cmp = icmp ult i32 %inc, 8192                  ; [#uses=1 type=i1]
  br i1 %cmp, label %for.body, label %for.cond.cleanup, !llvm.loop !7
}

LV: Checking a loop in "f" from /path/to/x.c
LV: Loop hints: force=enabled width=0 unroll=0 optspace=0
LV: Found a loop: for.body
LV: Not vectorizing: Found an unidentified PHI   %m.addr.024 = phi float [ %m, %entry ], [ %3, %for.body ] ; [#uses=1 type=float]
LV: Interleaving disabled by the pass manager
LV: Can't vectorize the instructions or CFG
LV: Not vectorizing: Cannot prove legality.