[PATCH] D38316: [InstCombine] replace bitcast to scalar + insertelement with widening shuffle + vector bitcast
Sanjay Patel via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 27 13:30:11 PDT 2017
spatel added a comment.
In https://reviews.llvm.org/D38316#882545, @efriedma wrote:
> I meant, "how do we fix x86 in the general case"? Consider the following (with -mtriple=x86_64 -mattr=+xop):
Ah! Sorry, I misread the question.
>
>
> define <8 x i64> @test(i32 %x0, i32 %x1, <8 x i64> %v) {
> %1 = insertelement <2 x i32> undef, i32 %x0, i32 0
> %2 = insertelement <2 x i32> %1, i32 %x1, i32 1
> %3 = bitcast <2 x i32> %2 to i64
> %4 = insertelement <8 x i64> %v, i64 %3, i32 0
> ret <8 x i64> %4
> }
>
>
> We currently generate a five-instruction sequence for something which can be done in two instructions. And the instcombine here won't trigger.
OK, so this example extends a different instcombine that I was originally considering for this case (https://bugs.llvm.org/show_bug.cgi?id=34716#c1). We could trade an insert for a bitcast:
define <8 x i64> @test_not_undef_bc(i32 %x0, i32 %x1, <8 x i64> %v) {
%bc = bitcast <8 x i64> %v to <16 x i32>
%i1 = insertelement <16 x i32> %bc, i32 %x0, i32 0
%i2 = insertelement <16 x i32> %i1, i32 %x1, i32 1
%bc2 = bitcast <16 x i32> %i2 to <8 x i64>
ret <8 x i64> %bc2
}
I think we'd get that by applying a fold like:
(ins (bitcast (ins ))) --> (bitcast (ins (bitcast )))
...twice. That's still 3 instructions though?
vpinsrd $0, %edi, %xmm0, %xmm2
vpinsrd $1, %esi, %xmm2, %xmm2
vblendps $15, %ymm2, %ymm0, %ymm0 ## ymm0 = ymm2[0,1,2,3],ymm0[4,5,6,7]
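As a sanity check that @test and @test_not_undef_bc compute the same value, here is a minimal Python sketch that models the vector lanes as raw bytes. It assumes little-endian lane layout (as on x86), so element 0 of the <2 x i32> becomes the low half of the i64. The helper names are made up for illustration and are not part of LLVM:

```python
import struct

MASK32 = 0xFFFFFFFF  # truncate Python ints to i32


def insert_via_scalar_bitcast(v, x0, x1):
    """Model of @test: build <2 x i32>, bitcast to i64, insert into lane 0."""
    # Assumption: little-endian, so i32 element 0 is the low 32 bits of the i64.
    scalar = (x0 & MASK32) | ((x1 & MASK32) << 32)
    out = list(v)
    out[0] = scalar
    return out


def insert_via_vector_bitcast(v, x0, x1):
    """Model of @test_not_undef_bc: bitcast to <16 x i32>, insert twice, bitcast back."""
    # Reinterpret eight 64-bit lanes as sixteen 32-bit lanes (little-endian bytes).
    lanes = list(struct.unpack("<16I", struct.pack("<8Q", *v)))
    lanes[0] = x0 & MASK32
    lanes[1] = x1 & MASK32
    # Reinterpret back as eight 64-bit lanes.
    return list(struct.unpack("<8Q", struct.pack("<16I", *lanes)))
```

Both helpers take the <8 x i64> operand as a list of eight unsigned 64-bit integers and return the updated list. Only lane 0 changes in either form, which is what lets the two inserts plus the blend cover the whole operation.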
https://reviews.llvm.org/D38316