[PATCH] [X86][SSE] Avoid scalarization of v2i64 vector shifts
llvm-dev at redking.me.uk
Wed Mar 18 08:09:37 PDT 2015
Hi qcolombet, mkuper, andreadb, spatel,
Currently, v2i64 vector shifts with non-equal per-element shift amounts are scalarized, costing 4 x extract, 2 x x86 shift and 2 x insert instructions, and it gets even more awkward on 32-bit targets.
This patch instead shifts the whole vector by each of the two shift amounts separately and then shuffles the two partial results back together, costing 2 x shuffle and 2 x SSE shift instructions (+ 2 movs on pre-AVX hardware).
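As a rough illustration of the idea (not the patch itself, which works on SelectionDAG nodes), here is a portable scalar model in C. The type v2i64 and the helpers shl_uniform, lshr_uniform and blend_lo_hi are hypothetical names introduced only for this sketch: the uniform helpers stand in for an SSE shift that applies one count to both lanes, and the blend stands in for the shuffle that recombines the partial results.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 2 x i64 vector, for illustration only. */
typedef struct { uint64_t lane[2]; } v2i64;

/* Models a uniform SSE left shift: one count applied to both lanes.
 * Counts above 63 produce zero, and also avoid undefined behaviour in C. */
static v2i64 shl_uniform(v2i64 v, uint64_t count) {
    v2i64 r;
    r.lane[0] = count > 63 ? 0 : v.lane[0] << count;
    r.lane[1] = count > 63 ? 0 : v.lane[1] << count;
    return r;
}

/* Models a uniform SSE logical right shift, same conventions as above. */
static v2i64 lshr_uniform(v2i64 v, uint64_t count) {
    v2i64 r;
    r.lane[0] = count > 63 ? 0 : v.lane[0] >> count;
    r.lane[1] = count > 63 ? 0 : v.lane[1] >> count;
    return r;
}

/* Models the shuffle: lane 0 taken from a, lane 1 taken from b. */
static v2i64 blend_lo_hi(v2i64 a, v2i64 b) {
    v2i64 r;
    r.lane[0] = a.lane[0];
    r.lane[1] = b.lane[1];
    return r;
}

/* Per-lane SHL: shift the whole vector by each amount, then recombine. */
static v2i64 shl_v2i64(v2i64 v, v2i64 amt) {
    v2i64 by_amt0 = shl_uniform(v, amt.lane[0]); /* lane 0 is correct */
    v2i64 by_amt1 = shl_uniform(v, amt.lane[1]); /* lane 1 is correct */
    return blend_lo_hi(by_amt0, by_amt1);
}

/* Per-lane LSHR, built the same way. */
static v2i64 lshr_v2i64(v2i64 v, v2i64 amt) {
    v2i64 by_amt0 = lshr_uniform(v, amt.lane[0]);
    v2i64 by_amt1 = lshr_uniform(v, amt.lane[1]);
    return blend_lo_hi(by_amt0, by_amt1);
}
```

Each uniform shift computes one lane that is discarded, but both lanes stay in the vector unit throughout, which is what removes the extract / insert traffic of the scalarized form.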
Note: this patch only improves the SHL / LSHR shifts, as ASHR v2i64 shifts aren't currently supported in hardware. I'm looking at fixing this in a future patch.