[llvm-commits] [AVX] Add EXTRACT_SUBVECTOR to DAGCombine
Bruno Cardoso Lopes
bruno.cardoso at gmail.com
Tue Sep 20 16:23:03 PDT 2011
On Mon, Sep 19, 2011 at 11:42 PM, Rackover, Zvi <zvi.rackover at intel.com> wrote:
> Bruno, thanks for reviewing and investing effort in this.
> The undef is only a special case of what this patch addresses.
>
> Take for example one of the tests in the patch:
> define <8 x i32> @test(<8 x i32> %v1, <8 x i32> %v2) {
> %1 = add <8 x i32> %v1, %v2
> %2 = add <8 x i32> %1, %v1
> ret <8 x i32> %2
> }
>
> Running on TOT (top of tree) gives:
> vextractf128 $1, %ymm1, %xmm3
> vextractf128 $1, %ymm0, %xmm2
> vpaddd %xmm3, %xmm2, %xmm3
> vpaddd %xmm1, %xmm0, %xmm1
> vinsertf128 $1, %xmm3, %ymm1, %ymm3 <--------
> vextractf128 $1, %ymm3, %xmm1 <---------
> vpaddd %xmm2, %xmm1, %xmm1
> vpaddd %xmm0, %xmm3, %xmm0
> vinsertf128 $1, %xmm1, %ymm0, %ymm0
>
> Running with the patch applied gives:
> vextractf128 $1, %ymm1, %xmm3
> vextractf128 $1, %ymm0, %xmm2
> vpaddd %xmm3, %xmm2, %xmm3
> vpaddd %xmm2, %xmm3, %xmm2
> vpaddd %xmm1, %xmm0, %xmm1
> vpaddd %xmm0, %xmm1, %xmm0
> vinsertf128 $1, %xmm2, %ymm0, %ymm0
>
> We can optimize away redundant insert_subvector/extract_subvector pairs by applying the following transforms:
> EXTRACT_SV( INSERT_SV( V1, V2, I ), I) ----> V2
> EXTRACT_SV( INSERT_SV( V1, V2, I1 ), I2) ----> EXTRACT_SV( V1, I2 )   (for I1 != I2, where the extracted range does not overlap the inserted one)
>
> I thought it would be right to make this optimization target-independent, but if it should be X86-specific, where should it be located?
Cool! I've applied with some tweaks in r140204!
Thanks Zvi!
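
The two folds quoted above can be sketched on a toy node model. This is only an illustration, not the code that went into r140204: Node, Op, Fold and foldExtractOfInsert are hypothetical names and do not use LLVM's SDNode/SDValue API. The sketch assumes that indices are element offsets and that the second fold is only safe when the extracted range lies entirely outside the inserted one.

```cpp
#include <cassert>

// Toy stand-in for a DAG node (hypothetical, not LLVM's SDNode).
enum class Op { Leaf, InsertSubvector, ExtractSubvector };

struct Node {
  Op op;
  const Node *vec;   // V1 for an insert, the source vector for an extract
  const Node *sub;   // V2 for an insert, unused otherwise
  unsigned idx;      // starting element index of the insert/extract
  unsigned numElts;  // number of elements in this node's result
};

// Result of trying to fold EXTRACT_SV(INSERT_SV(V1, V2, I1), I2).
struct Fold {
  enum Kind { None, UseInserted, ExtractFromBase };
  Kind kind;
  const Node *node;  // V2 for UseInserted, V1 for ExtractFromBase
  unsigned idx;      // index to extract at (ExtractFromBase only)
};

Fold foldExtractOfInsert(const Node &ext) {
  const Fold noFold = {Fold::None, nullptr, 0};
  if (ext.op != Op::ExtractSubvector || !ext.vec ||
      ext.vec->op != Op::InsertSubvector)
    return noFold;

  const Node &ins = *ext.vec;
  assert(ins.vec && ins.sub && "insert needs a base vector and a subvector");

  const unsigned extBegin = ext.idx, extEnd = ext.idx + ext.numElts;
  const unsigned insBegin = ins.idx, insEnd = ins.idx + ins.sub->numElts;

  // EXTRACT_SV(INSERT_SV(V1, V2, I), I) -> V2
  if (extBegin == insBegin && extEnd == insEnd)
    return Fold{Fold::UseInserted, ins.sub, 0};

  // EXTRACT_SV(INSERT_SV(V1, V2, I1), I2) -> EXTRACT_SV(V1, I2)
  // Assumed guard: the two element ranges must be disjoint.
  if (extEnd <= insBegin || insEnd <= extBegin)
    return Fold{Fold::ExtractFromBase, ins.vec, ext.idx};

  return noFold;  // partial overlap: leave the node alone
}
```

In this model the marked vinsertf128 $1 / vextractf128 $1 pair from the TOT output corresponds to an insert and an extract that both start at element 4 of an <8 x i32> value, so the first fold applies and the extract is replaced by the inserted 128-bit half.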
--
Bruno Cardoso Lopes
http://www.brunocardoso.cc