[PATCH] D95690: [LoopVectorize] improve IR fast-math-flags propagation in reductions

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 29 11:22:38 PST 2021


spatel added inline comments.


================
Comment at: llvm/test/Transforms/LoopVectorize/X86/reduction-fastmath.ll:302-303
 ; CHECK:       middle.block:
-; CHECK-NEXT:    [[RDX_MINMAX_CMP:%.*]] = fcmp fast ogt <4 x float> [[TMP10]], [[TMP11]]
-; CHECK-NEXT:    [[RDX_MINMAX_SELECT:%.*]] = select fast <4 x i1> [[RDX_MINMAX_CMP]], <4 x float> [[TMP10]], <4 x float> [[TMP11]]
-; CHECK-NEXT:    [[TMP13:%.*]] = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> [[RDX_MINMAX_SELECT]])
----------------
This is a miscompile - we created `fast` ops from the more strict original code.


================
Comment at: llvm/test/Transforms/LoopVectorize/X86/reduction-fastmath.ll:304
-; CHECK-NEXT:    [[RDX_MINMAX_SELECT:%.*]] = select fast <4 x i1> [[RDX_MINMAX_CMP]], <4 x float> [[TMP10]], <4 x float> [[TMP11]]
-; CHECK-NEXT:    [[TMP13:%.*]] = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> [[RDX_MINMAX_SELECT]])
 ; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[WIDE_TRIP_COUNT]], [[N_VEC]]
----------------
Yet we created a non-FMF reduction intrinsic...


================
Comment at: llvm/test/Transforms/LoopVectorize/X86/reduction-fastmath.ll:345
   %0 = load float, float* %arrayidx, align 4
   %cmp1.inv = fcmp nnan ninf oge float %0, %max.013
   %max.0. = select i1 %cmp1.inv, float %0, float %max.013
----------------
The `ninf` is not actually necessary, but `nsz` should be required.


================
Comment at: llvm/test/Transforms/LoopVectorize/float-minmax-instruction-flag.ll:107-108
   %t4 = load float, float* %t3, align 4
   %t5 = fcmp olt float %t2, %t4
   %t6 = select i1 %t5, float %t2, float %t4
   %t7 = add i64 %t1, 1
----------------
This test demonstrates that we are not even relying on the instruction-level FMF, and it's another miscompile. The function only has the equivalent of `nnan`. I think we should require `nsz` too.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95690/new/

https://reviews.llvm.org/D95690



More information about the llvm-commits mailing list