[PATCH] D103820: [X86] Prefer vpmovq2m over vpternlogd + vpcmpgtq
Dávid Bolvanský via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 7 09:59:08 PDT 2021
xbolva00 added a comment.
In D103820#2803120 <https://reviews.llvm.org/D103820#2803120>, @davezarzycki wrote:
> Actually, wait. Something weird is going on at the mid-level. These two functions should generate the same optimized IR, right?
>
> typedef int V __attribute__((vector_size(64)));
>
> V lt_zero_x_y(V mask, V x, V y) { return mask < 0 ? x : y; }
> V ge_zero_y_x(V mask, V x, V y) { return mask >= 0 ? y : x; }
Just curious why, with
typedef int V __attribute__((vector_size(4)));
we produce
define dso_local i32 @_Z11lt_zero_x_yDv1_iS_S_(i32 %0, i32 %1, i32 %2) local_unnamed_addr #0 {
  %4 = insertelement <1 x i32> poison, i32 %0, i32 0
  %5 = insertelement <1 x i32> poison, i32 %1, i32 0
  %6 = insertelement <1 x i32> poison, i32 %2, i32 0
  %7 = icmp sgt <1 x i32> %4, <i32 -1>
  %8 = select <1 x i1> %7, <1 x i32> %6, <1 x i32> %5
  %9 = extractelement <1 x i32> %8, i32 0
  ret i32 %9
}
Why isn't this scalarized at the IR level?
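For illustration, here is a rough sketch of what a scalarized form of the function above would compute, written as plain C rather than IR. The function names are hypothetical and not part of the patch. The second function mirrors the `icmp sgt ... -1` followed by `select` from the IR dump (operands swapped), which should be equivalent to the `mask < 0` form for every 32-bit input.

```c
/* Hypothetical scalar counterpart of lt_zero_x_y for a one-element vector. */
int lt_zero_x_y_scalar(int mask, int x, int y) {
    return mask < 0 ? x : y;
}

/* Same selection written the way the IR above expresses it:
 * compare "signed greater than -1", then pick y when true, x otherwise. */
int sgt_minus_one_y_x_scalar(int mask, int x, int y) {
    return mask > -1 ? y : x;
}
```

Both functions pick `x` for a negative `mask` and `y` otherwise, so a scalarized version would presumably need only a single `icmp` and `select` on `i32` values, without the `insertelement`/`extractelement` pairs.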
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103820/new/
https://reviews.llvm.org/D103820