[PATCH] D106239: [AArch64] Expand the SVE min/max reduction costs to NEON
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 20 01:23:38 PDT 2021
david-arm added inline comments.
================
Comment at: llvm/test/Analysis/CostModel/AArch64/reduce-minmax.ll:190
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V8f16 = call half @llvm.vector.reduce.fmax.v8f16(<8 x half> undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 73 for instruction: %V16f16 = call half @llvm.vector.reduce.fmax.v16f16(<16 x half> undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %V2f32 = call float @llvm.vector.reduce.fmax.v2f32(<2 x float> undef)
----------------
dmgreen wrote:
> david-arm wrote:
> > Hi @dmgreen, something strange is going on for v16f16 here with a cost of 73. I ran llc for this intrinsic and got:
> >
> > fmaxnm v0.8h, v0.8h, v1.8h
> > fmaxnmv h0, v0.8h
> >
> > so a cost of 3, in line with umax.v16i16, seems reasonable here.
> Yeah, I saw that. It is an odd one. This test is run without fullfp16, so I think the costs of any half min/max should be higher. The original version of this patch didn't include FP, and I hadn't noticed when rebasing over the tests.
>
> I'll look at correcting that properly.
OK thanks. Yeah I see now. I ran the llc command with "-mattr=+sve", which enabled fullfp16 automatically. That explains the efficient codegen. :)
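
For reference, the arithmetic behind the "cost of 3" expectation can be sketched as below. This is only an illustrative model, not the code in the patch: the function name, the 128-bit register width, and the per-instruction costs (1 for a pairwise fmaxnm, 2 for the horizontal fmaxnmv) are assumptions chosen to match the numbers quoted above, and it only applies when the element type is legal (i.e. with fullfp16 for half).

```python
# Hypothetical model of the expected reduction cost; the names and constants
# here are assumptions for illustration, not LLVM's actual implementation.
REDUCTION_COST = 2  # assumed cost of one horizontal reduction (e.g. fmaxnmv)
PAIRWISE_COST = 1   # assumed cost of one vector min/max (e.g. fmaxnm)


def expected_minmax_reduce_cost(num_elts: int, elt_bits: int,
                                reg_bits: int = 128) -> int:
    """Estimate the cost of a min/max reduction over a legal element type."""
    total_bits = num_elts * elt_bits
    # Number of registers the vector is split into (ceiling division, at least 1).
    num_regs = max(1, -(-total_bits // reg_bits))
    # Combine the registers pairwise, then do one horizontal reduction.
    return (num_regs - 1) * PAIRWISE_COST + REDUCTION_COST


if __name__ == "__main__":
    print("v8f16 :", expected_minmax_reduce_cost(8, 16))
    print("v16f16:", expected_minmax_reduce_cost(16, 16))
    print("v2f32 :", expected_minmax_reduce_cost(2, 32))
```

Under these assumptions v16f16 is split into two v8f16 halves, giving one fmaxnm plus one fmaxnmv. Without fullfp16 the half operations are not legal, so this model does not apply and a higher cost is expected.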
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D106239/new/
https://reviews.llvm.org/D106239