[PATCH] D49499: [X86] Prefer unpckhpd over movhlps in isel for fake unary cases

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 18 10:57:20 PDT 2018


craig.topper created this revision.
craig.topper added a reviewer: RKSimon.

In r337348, I changed lowering to prefer X86ISD::UNPCKL/UNPCKH opcodes over MOVLHPS/MOVHLPS for v2f64 {0,0} and {1,1} shuffles when we have SSE2. This enabled the removal of a bunch of weirdly bitcasted isel patterns in r337349. To avoid changing the tests, I placed a gross hack in isel to still emit movhlps instructions for fake unary unpckh nodes. A similar hack was not needed for unpckl and movlhps because we do execution domain switching for those. That does not carry over directly to unpckh and movhlps, which have swapped operand order.
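
To make the operand-order point concrete, here is a minimal value-level sketch of the two instructions. It is not LLVM code: the V2 type and the function names are assumptions made for illustration, and each XMM register is modeled as two 64-bit lanes {low, high}.

```cpp
#include <array>
#include <cassert>

// Illustrative model: a 128-bit XMM register as two 64-bit lanes {low, high}.
using V2 = std::array<double, 2>;

// unpckhpd dst, src: result = {dst.high, src.high}
inline V2 unpckhpd(V2 dst, V2 src) { return V2{dst[1], src[1]}; }

// movhlps dst, src: result = {src.high, dst.high}
// (src's high half moves into dst's low half; dst's high half is kept)
inline V2 movhlps(V2 dst, V2 src) { return V2{src[1], dst[1]}; }

inline bool sameLanes(V2 a, V2 b) { return a[0] == b[0] && a[1] == b[1]; }

inline void checkFakeUnary() {
  V2 x{1.0, 2.0};
  V2 y{3.0, 4.0};
  // Fake unary case (both inputs are the same value): the results agree.
  assert(sameLanes(unpckhpd(x, x), movhlps(x, x)));
  // Distinct inputs: the same operand order gives different results...
  assert(!sameLanes(unpckhpd(x, y), movhlps(x, y)));
  // ...and the operands have to be swapped to get a match.
  assert(sameLanes(unpckhpd(x, y), movhlps(y, x)));
}
```

In this model the two instructions are interchangeable only when both inputs hold the same value, or when the operands are swapped.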

This patch removes the hack.

This increases code size, since unpckhpd requires a 0x66 prefix and movhlps does not. If that is a big concern, we should use movhlps for all unpckhpd opcodes and let commuteInstruction turn it into unpckhpd when that is an advantage.
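
As a rough illustration of the size difference, the sketch below builds the register-to-register encodings by hand. It assumes the legacy SSE (non-VEX) forms and registers xmm0 to xmm7 only, so no REX prefix is involved. The helper names are invented for this example and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// ModRM byte for a register-to-register form: mod=11, reg=dst, rm=src.
// Assumption: only xmm0..xmm7, so no REX prefix is needed.
inline std::uint8_t modrmRegReg(unsigned dst, unsigned src) {
  assert(dst < 8 && src < 8);
  return static_cast<std::uint8_t>(0xC0 | (dst << 3) | src);
}

// movhlps xmm(dst), xmm(src): 0F 12 /r (no mandatory prefix)
inline std::vector<std::uint8_t> encodeMovhlps(unsigned dst, unsigned src) {
  return {0x0F, 0x12, modrmRegReg(dst, src)};
}

// unpckhpd xmm(dst), xmm(src): 66 0F 15 /r (0x66 mandatory prefix)
inline std::vector<std::uint8_t> encodeUnpckhpd(unsigned dst, unsigned src) {
  return {0x66, 0x0F, 0x15, modrmRegReg(dst, src)};
}
```

Under these assumptions, encodeMovhlps(0, 0) produces the 3 bytes 0F 12 C0 and encodeUnpckhpd(0, 0) produces the 4 bytes 66 0F 15 C0, so each replaced instruction costs one extra byte.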

Alternatively, we could try to enable execution domain switching when both inputs are identical. I have a prototype patch where I tried this, but it doesn't catch all cases. When the two-address instruction pass inserts a copy in front of an instruction to satisfy the tied operand constraint on the dest, it doesn't propagate the copy to other inputs that use the same input register. This leads to things like "movaps %xmm0, %xmm1; unpckhpd %xmm0, %xmm1" with %xmm0 being used by a later instruction.
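
The missed case can be sketched with a toy model. The ToyInst struct and both helpers below are hypothetical and greatly simplified. They are not LLVM's MachineInstr or its two-address pass, and they only mirror the behavior described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction: dst = opcode(src1, src2), with registers named by number.
// src1 is the operand tied to dst; src2 == -1 means "no second input".
struct ToyInst {
  std::string opcode;
  int dst;
  int src1;
  int src2;
};

// Register-level check for the fake unary case.
inline bool hasIdenticalInputs(const ToyInst &inst) {
  return inst.src1 == inst.src2;
}

// Toy two-address rewrite: copy src1 into dst, then make the tied operand
// read dst. src2 is deliberately left alone, as the text describes.
inline std::vector<ToyInst> satisfyTiedOperand(const ToyInst &inst) {
  ToyInst copy{"movaps", inst.dst, inst.src1, -1};
  ToyInst rewritten{inst.opcode, inst.dst, inst.dst, inst.src2};
  return {copy, rewritten};
}
```

Starting from ToyInst{"unpckhpd", 1, 0, 0}, the inputs are identical before the rewrite. Afterwards the unpckhpd reads registers 1 and 0, so a check that compares register numbers no longer sees a fake unary instruction, even though both registers hold the same value.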


https://reviews.llvm.org/D49499

Files:
  lib/Target/X86/X86InstrSSE.td
  test/CodeGen/X86/buildvec-insertvec.ll
  test/CodeGen/X86/cast-vsel.ll
  test/CodeGen/X86/combine-fcopysign.ll
  test/CodeGen/X86/complex-fastmath.ll
  test/CodeGen/X86/fma.ll
  test/CodeGen/X86/fp128-extract.ll
  test/CodeGen/X86/ftrunc.ll
  test/CodeGen/X86/haddsub-2.ll
  test/CodeGen/X86/haddsub-3.ll
  test/CodeGen/X86/haddsub-undef.ll
  test/CodeGen/X86/half.ll
  test/CodeGen/X86/nontemporal-2.ll
  test/CodeGen/X86/pr11334.ll
  test/CodeGen/X86/sse-schedule.ll
  test/CodeGen/X86/sse2-intrinsics-fast-isel.ll
  test/CodeGen/X86/sse3-avx-addsub-2.ll
  test/CodeGen/X86/sse_partial_update.ll
  test/CodeGen/X86/var-permute-128.ll
  test/CodeGen/X86/vec_extract.ll
  test/CodeGen/X86/vec_fp_to_int.ll
  test/CodeGen/X86/vector-reduce-fadd-fast.ll
  test/CodeGen/X86/vector-reduce-fadd.ll
  test/CodeGen/X86/vector-reduce-fmax-nnan.ll
  test/CodeGen/X86/vector-reduce-fmax.ll
  test/CodeGen/X86/vector-reduce-fmin-nnan.ll
  test/CodeGen/X86/vector-reduce-fmin.ll
  test/CodeGen/X86/vector-reduce-fmul-fast.ll
  test/CodeGen/X86/vector-reduce-fmul.ll
  test/CodeGen/X86/vector-rem.ll
  test/CodeGen/X86/vector-shuffle-128-v2.ll
  test/CodeGen/X86/vector-shuffle-combining.ll
  test/CodeGen/X86/widen_conv-3.ll
  test/CodeGen/X86/widen_conv-4.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D49499.156109.patch
Type: text/x-patch
Size: 148575 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180718/3ab523fb/attachment-0001.bin>

