[llvm-commits] [AVX] Add EXTRACT_SUBVECTOR to DAGCombine

Rackover, Zvi zvi.rackover at intel.com
Mon Sep 19 23:42:52 PDT 2011


Bruno, thanks for reviewing and investing effort in this.
The undef is only a special case of what this patch addresses.

Take for example one of the tests in the patch:
define <8 x i32> @test(<8 x i32> %v1, <8 x i32> %v2) {
  %1 = add <8 x i32> %v1, %v2
  %2 = add <8 x i32> %1, %v1
  ret <8 x i32> %2
}

Running on TOT gives:
   vextractf128    $1, %ymm1, %xmm3
   vextractf128    $1, %ymm0, %xmm2
   vpaddd  %xmm3, %xmm2, %xmm3
   vpaddd  %xmm1, %xmm0, %xmm1
   vinsertf128     $1, %xmm3, %ymm1, %ymm3  <--------
   vextractf128    $1, %ymm3, %xmm1         <--------
   vpaddd  %xmm2, %xmm1, %xmm1
   vpaddd  %xmm0, %xmm3, %xmm0
   vinsertf128     $1, %xmm1, %ymm0, %ymm0

Running with the patch applied gives:
   vextractf128    $1, %ymm1, %xmm3
   vextractf128    $1, %ymm0, %xmm2
   vpaddd  %xmm3, %xmm2, %xmm3
   vpaddd  %xmm2, %xmm3, %xmm2
   vpaddd  %xmm1, %xmm0, %xmm1
   vpaddd  %xmm0, %xmm1, %xmm0
   vinsertf128     $1, %xmm2, %ymm0, %ymm0

We can optimize away redundant insert_subvector/extract_subvector pairs by applying the following transforms:
EXTRACT_SV( INSERT_SV( V1, V2, I ), I )       ----> V2
EXTRACT_SV( INSERT_SV( V1, V2, I1 ), I2 )     ----> EXTRACT_SV( V1, I2 )   (for I1 != I2, non-overlapping subvectors)
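To make the two rules concrete, here is a minimal standalone sketch in C++. It is a toy model and not the code from the patch: Node, leaf, insertSV, extractSV and foldExtract are hypothetical names invented for illustration, and none of this uses the real SelectionDAG API. Indices are assumed to be element offsets, and the second rule is applied only when the extracted range does not overlap the inserted one.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Toy expression node (hypothetical; NOT LLVM's SDNode/SDValue).
struct Node {
    enum Kind { Leaf, InsertSV, ExtractSV };
    Kind kind = Leaf;
    std::string name;      // Leaf only: a label such as "V1"
    NodePtr vec;           // InsertSV: V1; ExtractSV: the source vector
    NodePtr sub;           // InsertSV only: V2, the inserted subvector
    unsigned index = 0;    // first element written (insert) or read (extract)
    unsigned numElts = 0;  // number of elements in the value this node produces
};

NodePtr leaf(const std::string &name, unsigned numElts) {
    auto n = std::make_shared<Node>();
    n->name = name;
    n->numElts = numElts;
    return n;
}

NodePtr insertSV(const NodePtr &v1, const NodePtr &v2, unsigned idx) {
    assert(idx + v2->numElts <= v1->numElts && "inserted subvector must fit");
    auto n = std::make_shared<Node>();
    n->kind = Node::InsertSV;
    n->vec = v1;
    n->sub = v2;
    n->index = idx;
    n->numElts = v1->numElts;  // result has the width of V1
    return n;
}

NodePtr extractSV(const NodePtr &v, unsigned idx, unsigned numElts) {
    assert(idx + numElts <= v->numElts && "extracted range must fit");
    auto n = std::make_shared<Node>();
    n->kind = Node::ExtractSV;
    n->vec = v;
    n->index = idx;
    n->numElts = numElts;
    return n;
}

// Fold EXTRACT_SV(INSERT_SV(V1, V2, I1), I2) where possible.
// Returns the input node unchanged when neither rule applies.
NodePtr foldExtract(const NodePtr &n) {
    if (n->kind != Node::ExtractSV || n->vec->kind != Node::InsertSV)
        return n;
    const NodePtr &ins = n->vec;
    const NodePtr &v2 = ins->sub;

    // Rule 1: same index and same width, so the extract reads back V2.
    if (n->index == ins->index && n->numElts == v2->numElts)
        return v2;

    // Rule 2: the extracted range misses V2 entirely, so read V1 directly.
    bool disjoint = n->index + n->numElts <= ins->index ||
                    ins->index + v2->numElts <= n->index;
    if (disjoint)
        return extractSV(ins->vec, n->index, n->numElts);

    // Partial overlap: leave the node alone.
    return n;
}
```

With an 8-element V1 and a 4-element V2 inserted at element 4, extracting 4 elements at index 4 folds to V2, and extracting at index 0 folds to an extract from V1. A partially overlapping extract is deliberately left untouched in this sketch.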

I thought it would be right to make this optimization target-independent, but if it should be X86-specific, where should it be located?

Thanks, Zvi

-----Original Message-----
From: Bruno Cardoso Lopes [mailto:bruno.cardoso at gmail.com] 
Sent: Tuesday, September 20, 2011 02:41
To: Rackover, Zvi
Cc: llvm-commits at cs.uiuc.edu
Subject: Re: [llvm-commits] [AVX] Add EXTRACT_SUBVECTOR to DAGCombine

Hi Zvi,

On Mon, Sep 19, 2011 at 11:31 AM, Rackover, Zvi <zvi.rackover at intel.com> wrote:
> Hi Bruno and other codegen people,
>
>
>
> Please review the attached patch and commit if acceptable.
>
> I categorized this patch as AVX, since I am not aware of other cases in X86
> where it shows any benefit.

You should be looking at INSERT_SUBVECTOR instead of extract, because it
is the undef being inserted in the upper part that you want to catch. I
provided another fix for this in r140097.

Thanks

-- 
Bruno Cardoso Lopes
http://www.brunocardoso.cc




