[PATCH] D60214: [DAGCombiner] move splat-shuffle after binop with splat constant

Wed Apr 17 12:50:22 PDT 2019

efriedma added inline comments.

================
Comment at: llvm/test/CodeGen/ARM/reg_sequence.ll:280
+; CHECK-NEXT:  vext.32  [[Q8]], [[Q8]], [[Q8]], #2
+; CHECK-NEXT:  vmul.f32  [[Q8]], [[Q8]], [[Q9]]
 ; CHECK-NEXT:  vmul.f32	[[Q8]], [[Q8]], [[Q8]]
----------------
I think the reason you're seeing a regression here is that ARM has a combined "splat-and-multiply" operation: a multiply that takes one of its operands from a single vector lane.  If the splat isn't an operand of the multiply, you end up with an extra instruction.  If DAGCombine is going to reorder the operations, we might need some ARM/AArch64-specific pattern-matching code for multiplies.

Granted, this isn't really a great testcase to demonstrate that issue; it's sort of degenerate.  Instead, consider something like the following:

```
define <4 x i32> @x(<4 x i32> %a) {
entry:
  %splat = shufflevector <4 x i32> %a, <4 x i32> undef, <4 x i32> zeroinitializer
  %mul = mul <4 x i32> %splat, <i32 3, i32 3, i32 3, i32 3>
  ret <4 x i32> %mul
}
```

On AArch64, this currently compiles to something like "movi v1.4s, #3; mul v0.4s, v1.4s, v0.s[0]".  If you reorder the operations, I think you end up with three instructions.
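
For reference, the reordered form of that example would presumably look something like the sketch below.  The name @x_reordered is made up for illustration, and this is hand-written IR, not output captured from the combine.  The multiply runs on the whole vector first and the splat is applied to the result, so the splat is no longer an operand of the multiply.

```
; Hypothetical reordered version of @x: multiply first, then splat lane 0.
define <4 x i32> @x_reordered(<4 x i32> %a) {
entry:
  ; Multiply every lane of %a by the splat constant 3.
  %mul = mul <4 x i32> %a, <i32 3, i32 3, i32 3, i32 3>
  ; Broadcast lane 0 of the product to all four lanes.
  %splat = shufflevector <4 x i32> %mul, <4 x i32> undef, <4 x i32> zeroinitializer
  ret <4 x i32> %splat
}
```

Both functions produce lane 0 of %a times 3 in every lane; they differ only in whether the shuffle feeds the mul or consumes its result.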

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60214/new/

https://reviews.llvm.org/D60214