[PATCH][X86] Teach the backend how to lower vector shift left into multiply rather than scalarizing it.
Jim Grosbach
grosbach at apple.com
Thu Feb 6 14:26:55 PST 2014
Hi Andrea,
This is a very nice improvement, but I believe it should do a bit more.
AVX2 adds 256-bit wide vector versions of these instructions, so if AVX2 is available, the same transformation should be applied to v16i16 and v8i32 shifts. Worth looking to see if AVX512 extends
The test cases should check that, when compiling for AVX, the VEX-prefixed forms of the instructions are generated instead of the SSE versions.
Thanks,
-Jim
On Feb 6, 2014, at 11:45 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote:
> Hi,
>
> This patch teaches the backend how to efficiently lower a packed
> vector shift left into a packed vector multiply if the vector of shift
> counts is known to be constant (i.e. a constant build_vector).
>
> Instead of expanding a packed shift into a sequence of scalar shifts,
> the backend should try (when possible) to convert the vector shift
> into a vector multiply.
>
> Before this patch, a shift of a MVT::v8i16 vector by a build_vector of
> constants was always scalarized into a long sequence of "vector
> extracts + scalar shifts + vector insert".
> With this patch, if there is SSE2 support, we emit a single vector multiply.
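
For reference, the identity this lowering relies on is that a left shift by a constant count c gives the same result as a multiply by (1 << c), taken modulo the lane width. Below is a minimal scalar sketch of that equivalence for eight 16-bit lanes. It is plain C++ and not the backend code from the patch; the V8i16 alias and the three helper names are hypothetical, and the shift counts are assumed to be smaller than 16.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a v8i16 value: eight 16-bit lanes.
using V8i16 = std::array<std::uint16_t, 8>;

// Reference semantics: shift each lane left by its own count.
// Assumption: every count is < 16.
V8i16 shiftLeftPerLane(const V8i16 &v, const V8i16 &counts) {
  V8i16 out{};
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = static_cast<std::uint16_t>(static_cast<std::uint32_t>(v[i])
                                        << counts[i]);
  return out;
}

// Turn each constant shift count c into the constant multiplier (1 << c).
V8i16 countsToMultipliers(const V8i16 &counts) {
  V8i16 out{};
  for (std::size_t i = 0; i < counts.size(); ++i)
    out[i] = static_cast<std::uint16_t>(1u << counts[i]);
  return out;
}

// Lane-wise multiply that keeps only the low 16 bits of each product,
// modelling what a packed 16-bit "multiply low" produces.
V8i16 mulLowPerLane(const V8i16 &v, const V8i16 &m) {
  V8i16 out{};
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = static_cast<std::uint16_t>(static_cast<std::uint32_t>(v[i]) *
                                        m[i]);
  return out;
}
```

For any input v and any counts below 16, shiftLeftPerLane(v, counts) and mulLowPerLane(v, countsToMultipliers(counts)) should compare equal, which is why a constant build_vector of shift counts can be folded into a constant vector of multipliers.
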
>
> The new x86 test 'vec_shift6.ll' contains some examples of code that
> are affected by this patch.
>
> Please let me know if ok to submit.
>
> Thanks,
> Andrea Di Biagio
> SN Systems - Sony Computer Entertainment Group
> <patch.diff>