[PATCH][InstCombine][X86] Improve the folding of calls to X86 packed shifts intrinsics.

Thu May 8 02:18:00 PDT 2014

Hi Jim,
Thanks for the feedback!

I'll try to move these changes to the backend.

Cheers,
Andrea
On 8 May 2014 03:18, "Jim Grosbach" <grosbach at apple.com> wrote:

> Hi Andrea,
>
> I’m really excited to see these patches continuing. Our vector codegen has
> been needing exactly this sort of detail-oriented tuning for a long time
> now.
>
> These are both good improvements, but would be better as DAGCombines in
> the X86 backend. The main argument for doing these intrinsic combines at
> the IR level is when the input expression is likely to be split across
> multiple basic blocks by the time the backend sees it and would thus not be
> recognized by a DAG combiner. Both of these transforms should avoid that
> problem, though, and so can be dealt with there.
>
> -Jim
>
> On May 7, 2014, at 8:42 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com>
> wrote:
>
> > Hi,
> >
> > This patch teaches InstCombine how to fold a packed SSE2/AVX2 shift
> > intrinsic into its first operand if the shift count is a zero vector
> > (i.e. a 'ConstantAggregateZero').
> > Also, this patch teaches InstCombine how to lower a packed arithmetic
> > shift intrinsic into an 'ashr' instruction if the shift count is
> > known to be smaller than the vector element size.
> >
> > Please let me know if ok to submit.
> >
> > Thanks,
> > Andrea Di Biagio
> > SN Systems - Sony Computer Entertainment Group
> > <patch-instcombine-vshifts.diff>
>
>
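For readers following the thread, the two folds described in the quoted patch can be illustrated with a small scalar model. This is only a sketch, not code from the patch: the names `V4`, `asr32`, `psra`, `ashr` and `foldsHold` are hypothetical, and the model assumes 32-bit lanes with the usual packed arithmetic shift behaviour, where a count of 32 or more fills each lane with its sign bit.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4 x i32 vector type used only for this illustration.
using V4 = std::array<int32_t, 4>;

// Portable arithmetic shift right for count in [0, 31].
// Avoids right-shifting a negative value directly.
constexpr int32_t asr32(int32_t x, uint32_t count) {
  return x < 0 ? ~(~x >> count) : (x >> count);
}

// Model of a packed arithmetic shift intrinsic on 32-bit lanes:
// counts larger than 31 behave like a shift by 31 (sign fill).
V4 psra(const V4 &v, uint64_t count) {
  const uint32_t c = count > 31 ? 31u : static_cast<uint32_t>(count);
  V4 r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = asr32(v[i], c);
  return r;
}

// Model of a plain 'ashr'; only meaningful for count < 32.
V4 ashr(const V4 &v, uint32_t count) {
  V4 r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = asr32(v[i], count);
  return r;
}

// Checks both folds on one input vector:
//  1. a zero shift count yields the first operand unchanged;
//  2. for every count below the element size, the intrinsic
//     model agrees with a plain 'ashr'.
bool foldsHold(const V4 &v) {
  if (psra(v, 0) != v)
    return false;
  for (uint32_t c = 0; c < 32; ++c)
    if (psra(v, c) != ashr(v, c))
      return false;
  return true;
}
```

The clamp in `psra` is the reason the second fold needs the "known to be smaller than the vector element size" condition: for larger counts the intrinsic still has a defined result, while a plain 'ashr' by that amount does not.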