[PATCH][X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'.

Tue Jul 15 23:57:30 PDT 2014

Looks good.

-  Elena

-----Original Message-----
From: Andrea Di Biagio [mailto:andrea.dibiagio at gmail.com] 
Sent: Tuesday, July 15, 2014 21:21
To: llvm-commits at cs.uiuc.edu for LLVM
Cc: Nadav Rotem; Jim Grosbach; Demikhovsky, Elena
Subject: [PATCH][X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'.

Hi,

This patch adds a check for 'isMOVHLPSMask' inside method 'X86TargetLowering::isShuffleMaskLegal'.

The x86 backend already knows how to lower vector shuffles into X86ISD::MOVHLPS dag nodes. Method 'LowerVECTOR_SHUFFLE' explicitly calls function 'isMOVHLPSMask' to check if a shuffle dag node in input can be safely expanded into a single 'movhlps' instruction.

I noticed however that method 'isShuffleMaskLegal' doesn't know that shuffles implementing a 'movhlps' operation are perfectly legal for SSE targets.

The reason why we should update that method is because the DAGCombiner conservatively avoids combining two shuffles if the resulting shuffle mask would be illegal for the target. At the moment, shuffles with a MOVHLPS mask are wrongly considered not to be legal by 'isShuffleMaskLegal'.This is the root cause of some poor-code generation bugs.

This patch fixes method 'isShuffleMaskLegal' adding a check for 'isMOVHLPSMask'.

Given the following IR:

define <4 x i32> @test_movhl_1(<4 x i32> %a, <4 x i32> %b) {
  %1 = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 2, i32 7, i32 5, i32 3>
  %2 = shufflevector <4 x i32> %1, <4 x i32> %b, <4 x i32> <i32 6, i32 1, i32 0, i32 3>
  ret <4 x i32> %2
}

Before this patch, llc (-mcpu=corei7 -march=x86-64) generated the following sub-optimal assembly sequence:
    shufps $126, %xmm1, %xmm0
    pshufd $120, %xmm0, %xmm0
    shufps $18, %xmm0, %xmm1
    shufps $-56, %xmm0, %xmm1
    movapd %xmm1, %xmm0

With this patch, the backend recognizes that the two shuffles could be combined into a single legal shuffle which is then lowered as a single movhlps instruction:
    movhlps %xmm1, %xmm0

I added some other examples in test 'combine-vec-shuffle-3.ll'

Please let me know if ok to submit.

Thanks,
Andrea
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.