[PATCH][X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it.

Thu Feb 6 14:26:55 PST 2014

Hi Andrea,

This is a very nice improvement, but I believe it should do a bit more.

AVX2 adds 256-bit wide vector versions of these instructions, so if AVX2 is available, the same transformation should be applied to v16i16 and v8i32 shifts. Worth looking to see if AVX512 extends 

The test cases should check that when compiling for AVX, the VEX-prefixed forms of the instructions are generated instead of the SSE versions.

Thanks,
-Jim

On Feb 6, 2014, at 11:45 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:

> Hi,
> 
> This patch teaches the backend how to efficiently lower a packed
> vector shift left into a packed vector multiply if the vector of shift
> counts is known to be constant (i.e. a constant build_vector).
> 
> Instead of expanding a packed shift into a sequence of scalar shifts,
> the backend should try (when possible) to convert the vector shift
> into a vector multiply.
> 
> Before this patch, a shift of a MVT::v8i16 vector by a build_vector of
> constants was always scalarized into a long sequence of "vector
> extracts + scalar shifts + vector insert".
> With this patch, if there is SSE2 support, we emit a single vector multiply.
> 
> The new x86 test 'vec_shift6.ll' contains some examples of code that
> are affected by this patch.
> 
> Please let me know if ok to submit.
> 
> Thanks,
> Andrea Di Biagio
> SN Systems - Sony Computer Entertainment Group
> <patch.diff>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
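
For reference, the transformation described in the quoted patch relies on the per-lane identity x << c == x * (1 << c) (modulo the lane width) when every shift count c is a known constant. The following is only a scalar C model of that idea for a v8i16 value, not the actual lowering code from the patch; the function names and the LANES constant are made up for illustration. It assumes every shift count is less than 16, and it mimics a multiply that keeps only the low 16 bits of each product (which is what the SSE2 pmullw instruction does).

```c
#include <stdint.h>

enum { LANES = 8 }; /* models a v8i16 vector; hypothetical name */

/* Scalarized form: one shift per lane, as in the
 * "extract + scalar shift + insert" sequence described above.
 * Assumes count[i] < 16 for every lane. */
void shl_v8i16_scalarized(const uint16_t x[LANES],
                          const uint16_t count[LANES],
                          uint16_t out[LANES]) {
    for (int i = 0; i < LANES; ++i)
        out[i] = (uint16_t)((uint32_t)x[i] << count[i]);
}

/* Multiply form: since the counts are constants, the vector of
 * factors (1 << count[i]) can be computed ahead of time, and the
 * shift becomes a single lane-wise multiply that keeps the low
 * 16 bits of each product. */
void shl_v8i16_as_mul(const uint16_t x[LANES],
                      const uint16_t count[LANES],
                      uint16_t out[LANES]) {
    uint16_t factor[LANES];
    for (int i = 0; i < LANES; ++i)
        factor[i] = (uint16_t)(1u << count[i]);
    for (int i = 0; i < LANES; ++i)
        out[i] = (uint16_t)((uint32_t)x[i] * factor[i]);
}
```

Both functions produce the same result for any input as long as the shift counts stay below the lane width; in the real backend the factor vector would be a constant and the second loop would correspond to one vector multiply.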