[PATCH] D77152: [SelectionDAG] Better legalization for FSHL and FSHR

Tue Jun 16 05:31:43 PDT 2020

spatel added inline comments.

================
Comment at: llvm/test/CodeGen/X86/vector-fshl-rot-128.ll:1670-1675
 ; X32-SSE-NEXT:    movdqa %xmm0, %xmm1
 ; X32-SSE-NEXT:    psrlq $50, %xmm1
+; X32-SSE-NEXT:    movsd {{.*#+}} xmm1 = xmm1[0,1]
 ; X32-SSE-NEXT:    psllq $14, %xmm0
-; X32-SSE-NEXT:    por %xmm1, %xmm0
+; X32-SSE-NEXT:    movsd {{.*#+}} xmm0 = xmm0[0,1]
+; X32-SSE-NEXT:    orpd %xmm1, %xmm0
----------------
foad wrote:
> Can anyone suggest what might have caused this regression? We get two presumably redundant movsd instructions:
> ```
> 	psrlq	$50, %xmm1
> 	movsd	%xmm1, %xmm1            # xmm1 = xmm1[0,1]
> 	psllq	$14, %xmm0
> 	movsd	%xmm0, %xmm0            # xmm0 = xmm0[0,1]
> 	orpd	%xmm1, %xmm0
> ```
I looked at the debug output for this case.
It only shows up for x86-32 because we legalize the shift amount constant there with:
      t4: v2i64 = BUILD_VECTOR Constant:i64<14>, Constant:i64<14>
-->
        t15: v4i32 = BUILD_VECTOR Constant:i32<14>, undef:i32, Constant:i32<14>, undef:i32
      t13: v2i64 = bitcast t15

And that then leads to a big mess when legalizing the 'rotl' node.
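
As a rough illustration (a sketch, not taken from the debug output): the psllq $14 / psrlq $50 pair in the test is the usual expansion of a 64-bit rotate-left by 14. Assuming the rotate amount is interpreted modulo the 64-bit element width, the undef upper halves of the legalized amount should not change the result. The helper names below are hypothetical, and the garbage values simply stand in for undef lanes:

```python
MASK64 = (1 << 64) - 1

def rotl64(x, amt):
    """Rotate a 64-bit value left; only amt modulo 64 is used (assumed semantics)."""
    amt %= 64
    if amt == 0:
        return x & MASK64
    return ((x << amt) | (x >> (64 - amt))) & MASK64

def bitcast_v4i32_to_v2i64(lanes):
    """Little-endian bitcast: i32 lanes 0,1 form element 0; lanes 2,3 form element 1."""
    return [lanes[0] | (lanes[1] << 32), lanes[2] | (lanes[3] << 32)]

x = 0x0123456789ABCDEF

# rotl by 14 written as the shift pair seen in the test: psllq $14 | psrlq $50.
shift_pair = ((x << 14) | (x >> 50)) & MASK64

# Replace the two undef i32 lanes with arbitrary values and bitcast to v2i64.
garbage_hi = 0xDEADBEEF
amounts = bitcast_v4i32_to_v2i64([14, garbage_hi, 14, garbage_hi])

# Every element still rotates by 14, because the upper 32 bits vanish modulo 64.
results = [rotl64(x, amt) for amt in amounts]
print(all(r == shift_pair for r in results))
```

So the legalized constant is still a valid rotate amount; the trouble is only in how the later combines see it.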

We convert an x86 vector-shift-by-variable node into an x86 vector-shift-by-constant node very late, and that changes an x86 MOVSD node so that both of its operands are the same value.

But we've already processed that node in the last combining stage, so it survives to isel. We could probably zap the redundant MOVSD in DAGToDAG pre-processing.
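
To spell out why a MOVSD with identical operands is redundant, here is a minimal model of the register-to-register form on a two-lane register (a sketch; `movsd` and `bitwise_or` are hypothetical helpers, not LLVM or assembler APIs). It also shows that switching from por to orpd does not change the computed bits; my guess is that the switch is just the execution-domain choice following the FP-domain movsd, but that is an assumption:

```python
def movsd(dst, src):
    """Register form of movsd: take the low 64-bit lane from src, keep dst's high lane."""
    return [src[0], dst[1]]

def bitwise_or(a, b):
    """por and orpd both compute a lane-wise bitwise OR of two 128-bit registers."""
    return [p | q for p, q in zip(a, b)]

xmm0 = [0x00000000AAAA0000, 0x00000000BBBB0000]
xmm1 = [0x0000000000001111, 0x0000000000002222]

# With distinct operands, movsd blends the two registers.
blended = movsd(xmm0, xmm1)

# With the same register on both sides (xmm1 = xmm1[0,1]), nothing changes.
same = movsd(xmm1, xmm1)
print(same == xmm1)

# The OR result is identical with or without the two self-moves.
with_moves = bitwise_or(movsd(xmm1, xmm1), movsd(xmm0, xmm0))
without_moves = bitwise_or(xmm1, xmm0)
print(with_moves == without_moves)
```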

This seems like an unlikely pattern/problem, and there are no existing tests that show it, so I don't think we need to hold this patch up for this regression.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D77152