[PATCH] D50992: [InstCombine] try to fold insertelt + vector op into scalar op + insertelt
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 20 16:35:10 PDT 2018
spatel added a comment.
In https://reviews.llvm.org/D50992#1206612, @efriedma wrote:
> I'm concerned the backend won't reliably reverse the transform. For integer operations, SelectionDAG heavily depends on IR types to decide whether to perform an operation in integer or SIMD registers, and transferring values between the two register files is slow. Yes, the second operand of an insertelement is a scalar, but many backends pattern-match a load+insertelement to a vector register load.
If we add a load to the sequence, we have something like this:
define <4 x i32> @vector_add_constant(i32* %p) {
  %x = load i32, i32* %p
  %ins = insertelement <4 x i32> undef, i32 %x, i32 0
  %bo = add <4 x i32> %ins, <i32 42, i32 42, i32 42, i32 42>
  ret <4 x i32> %bo
}

define <4 x i32> @scalar_add_constant(i32* %p) {
  %x = load i32, i32* %p
  %b = add i32 %x, 42
  %bo = insertelement <4 x i32> undef, i32 %b, i32 0
  ret <4 x i32> %bo
}
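
For reference, here is a rough Python model of why the two functions above are interchangeable at the IR level. It is only a sketch: the names MASK32 and undef_lanes are made up for illustration, i32 wraparound is modeled with a mask, and the undef lanes 1-3 are filled with arbitrary placeholder values because neither function defines them. It says nothing about codegen cost, which is the actual concern raised in the quoted comment.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper: i32 add wraps modulo 2**32


def vector_add_constant(x, undef_lanes=(0, 0, 0)):
    """Model of @vector_add_constant: insert x into lane 0, then add 42 per lane."""
    # insertelement <4 x i32> undef, i32 %x, i32 0
    ins = [x & MASK32, *undef_lanes]
    # add <4 x i32> %ins, <42, 42, 42, 42>
    return [(lane + 42) & MASK32 for lane in ins]


def scalar_add_constant(x, undef_lanes=(0, 0, 0)):
    """Model of @scalar_add_constant: scalar add first, then insert into lane 0."""
    # add i32 %x, 42
    b = (x + 42) & MASK32
    # insertelement <4 x i32> undef, i32 %b, i32 0
    return [b, *undef_lanes]


# Only lane 0 carries a defined value, so that is the only lane compared.
for x in (0, 1, 0x7FFFFFFF, 0xFFFFFFFF):
    assert vector_add_constant(x)[0] == scalar_add_constant(x)[0]
```

Lanes 1-3 are deliberately left out of the comparison: they are undef in both versions, so any value is acceptable there.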
And for x86, that's:
  vmovd (%rdi), %xmm0 ## xmm0 = mem[0],zero,zero,zero
  vpaddd LCPI4_0(%rip), %xmm0, %xmm0

vs.

  movl (%rdi), %eax
  addl $42, %eax
  vmovd %eax, %xmm0
Let me see about adding a DAG reversal...
https://reviews.llvm.org/D50992