[PATCH] D97961: [Cost]Canonicalize the cost for logical or/and reductions.
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 5 00:45:00 PST 2021
david-arm added inline comments.
================
Comment at: llvm/include/llvm/CodeGen/BasicTTIImpl.h:1902
+ // Or reduction for i1 is represented as:
+ // %val = bitcast <ReduxWidth x i1> to iReduxWidth
+ // %res = cmp ne iReduxWidth %val, 0
----------------
ABataev wrote:
> david-arm wrote:
> > I'm not sure this is always true because some backends (e.g. AArch64) promote i1 to larger integers. The costs for AArch64 still look a bit odd to be honest. I tried them out manually and I observe about 8 instructions for AND reductions using <4 x i1> vectors since we have lots of bytewise moves of -1 into the vector lanes of a <4 x i32> vector.
> This is a known problem, see
> https://bugs.llvm.org/show_bug.cgi?id=41636
> https://bugs.llvm.org/show_bug.cgi?id=41635
> https://bugs.llvm.org/show_bug.cgi?id=41634
>
> Looks like the construct is not lowered properly on some targets
Sure, I totally agree that the codegen for ARM and AArch64 is awful, and I take your point. I was just wondering whether this assumption is a problem:
%val = bitcast <ReduxWidth x i1> to iReduxWidth
as I don't think it holds for targets that promote i1 to i32 or something like that. In the bug linked above (https://bugs.llvm.org/show_bug.cgi?id=41636), even the optimal code still operates on vectors of i8.
I guess targets that do promote i1->iX can come up with their own cost in the target-specific getArithmeticReductionCost, so maybe this isn't really a problem?
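To make the assumption concrete, here is a rough C++ sketch (not taken from the patch) of the scalar semantics that the quoted comment describes: the <ReduxWidth x i1> mask is viewed as one ReduxWidth-bit integer, and the OR reduction becomes a compare against zero. The helper names packMask, orReduce and andReduce are hypothetical, the 64-lane limit is an artifact of the sketch, and the AND form (compare against all-ones) is my assumption about the counterpart, since only the OR form is quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: models "bitcast <N x i1> to iN" for N <= 64 by
// placing lane i of the mask in bit i of the result.
uint64_t packMask(const std::vector<bool> &Lanes) {
  assert(Lanes.size() <= 64 && "sketch only handles up to 64 lanes");
  uint64_t Val = 0;
  for (std::size_t I = 0; I < Lanes.size(); ++I)
    if (Lanes[I])
      Val |= uint64_t{1} << I;
  return Val;
}

// OR reduction: "%res = cmp ne iN %val, 0".
bool orReduce(const std::vector<bool> &Lanes) {
  return packMask(Lanes) != 0;
}

// AND reduction (assumed counterpart): compare against an all-ones N-bit value.
bool andReduce(const std::vector<bool> &Lanes) {
  const std::size_t N = Lanes.size();
  const uint64_t AllOnes = (N == 64) ? ~uint64_t{0} : ((uint64_t{1} << N) - 1);
  return packMask(Lanes) == AllOnes;
}
```

This only models the result, not the cost. On a target that promotes i1 to a wider element type, the mask lanes do not sit in adjacent bits, so the packing step is not a free bitcast, and that is the part of the cost model I was asking about.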
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D97961/new/
https://reviews.llvm.org/D97961