[PATCH][AVX] Lower v4i64->v4i32 ISD::TRUNCATE for minimal shuffles

Wed Mar 5 08:32:39 PST 2014

Hi Cameron,

the change in X86ISelLowering.cpp looks good to me.
However, I am not the code owner, so please wait for more feedback
before submitting your patch :-).

In my opinion, test avx-trunc.ll should be improved as follows:
 - add a CHECK-NOT for trunc_64_32 to verify that with your fix the
backend no longer emits two shuffle instructions (we expect a single
shuffle instruction now).
 - I think you should probably also check that no movlhps is
introduced when lowering that truncate (in trunc_64_32). That movlhps
was expected before your change, and now it is no longer needed, I
think.

While at it, you can improve the test and replace CHECK: with
CHECK-LABEL when needed.
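
As a rough sketch of the checks suggested above (the RUN line, the
function body and the exact mnemonics are assumptions for illustration,
not copied from the patch or from the existing avx-trunc.ll):

```llvm
; Assumed RUN line; the real test may use a different triple or CPU.
; RUN: llc < %s -mtriple=x86_64-apple-darwin -mcpu=corei7-avx -mattr=+avx | FileCheck %s

define <4 x i32> @trunc_64_32(<4 x i64> %A) nounwind {
; CHECK-LABEL restricts the following checks to this function's output.
; CHECK-LABEL: trunc_64_32:
; No pshufd or movlhps may appear before the single shuffle...
; CHECK-NOT: pshufd
; CHECK-NOT: movlhps
; CHECK: vshufps
; ...and none may appear between it and the return.
; CHECK-NOT: pshufd
; CHECK-NOT: movlhps
; CHECK: ret
  %B = trunc <4 x i64> %A to <4 x i32>
  ret <4 x i32> %B
}
```

The CHECK-NOT lines are repeated on both sides of the vshufps match
because each CHECK-NOT only covers the region between the neighbouring
positive matches.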

Andrea

On Wed, Mar 5, 2014 at 3:44 PM, Cameron McInally
<cameron.mcinally at nyu.edu> wrote:
> Hey guys,
>
> For AVX, v4i64->v4i32 truncates are currently lowered into two
> shuffles plus a movlh:
>
>> vpshufd $8, %xmm1, %xmm1        ## xmm1 = xmm1[0,2,0,0]
>> vpshufd $8, %xmm0, %xmm0        ## xmm0 = xmm0[0,2,0,0]
>> vmovlhps %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm1[0]
>
> This could also be done using a vshufps:
>
>> vshufps $-120, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[0,2],xmm1[0,2]
>
> Please note that this does change the execution domain of the shuffle,
> but as far as I can tell this should be okay. My understanding, from
> looking at Fog's tables, is that the shuffles should be a wash and
> avoiding the movlh is a win.
>
> Any insights into whether this change is a good idea or not?
>
> Tia,
> Cameron