[PATCH] [X86][SSE] Avoid scalarization of v2i64 vector shifts

Wed Mar 18 08:09:37 PDT 2015

Hi qcolombet, mkuper, andreadb, spatel,

Currently, v2i64 vector shifts with non-equal shift amounts are scalarized, costing 4 x extract, 2 x x86-shift and 2 x insert instructions - and it gets even more awkward on 32-bit targets.

This patch instead shifts the whole vector once by each of the two shift amounts and then shuffles the partial results back together, costing 2 x shuffle and 2 x sse-shift instructions (+ 2 movs on pre-AVX hardware).
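
To illustrate the idea, here is a minimal scalar model in C++. It is not the lowering code from the patch. The names V2i64, shlUniform, blendLoHi and shlPerLane are hypothetical, and shlUniform only approximates a uniform SSE2 shift such as PSLLQ (same count for both lanes, zero result for counts above 63).

```cpp
#include <cassert>
#include <cstdint>

// Two 64-bit lanes, standing in for a v2i64 value held in an XMM register.
struct V2i64 {
  uint64_t lane[2];
};

// Models a uniform vector shift: both lanes are shifted left by the same
// count, and a count above 63 yields zero in every lane.
inline V2i64 shlUniform(V2i64 v, uint64_t amt) {
  if (amt > 63)
    return {{0, 0}};
  return {{v.lane[0] << amt, v.lane[1] << amt}};
}

// Models the final shuffle: lane 0 comes from lo, lane 1 comes from hi.
inline V2i64 blendLoHi(V2i64 lo, V2i64 hi) {
  return {{lo.lane[0], hi.lane[1]}};
}

// Per-lane shift built from two uniform shifts and one blend, with no
// extraction of individual elements into scalar registers.
inline V2i64 shlPerLane(V2i64 v, V2i64 amt) {
  V2i64 byAmt0 = shlUniform(v, amt.lane[0]); // only lane 0 is kept
  V2i64 byAmt1 = shlUniform(v, amt.lane[1]); // only lane 1 is kept
  return blendLoHi(byAmt0, byAmt1);
}

// Small self-check: <1, 1> shifted left by <3, 5> gives <8, 32>.
inline void checkShlPerLane() {
  V2i64 r = shlPerLane({{1, 1}}, {{3, 5}});
  assert(r.lane[0] == 8 && r.lane[1] == 32);
}
```

The same structure applies to LSHR by swapping the left shift for a logical right shift. Each uniform shift computes one lane that is thrown away, which is the price paid for avoiding the extract/insert sequence.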

Note - this patch only improves the SHL / LSHR shifts, as ASHR v2i64 shifts aren't currently supported in hardware - I'm looking at fixing this in a future patch.

REPOSITORY
  rL LLVM

http://reviews.llvm.org/D8416

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/vshift-4.ll
  test/CodeGen/X86/x86-shifts.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D8416.22186.patch
Type: text/x-patch
Size: 6234 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150318/8008df26/attachment.bin>