[PATCH] Optimize insertqi when we copy all the lower 64 bits.

Filipe Cabecinhas filcab at gmail.com
Tue Apr 15 18:30:19 PDT 2014


On Tue, Apr 15, 2014 at 11:04 AM, Nadav Rotem <nrotem at apple.com> wrote:

> Hi Filipe,
>
> Why is this an IR-level transform? Could you implement this in
> SelectionDAG ?
>
> Thanks,
> Nadav
>
> On Apr 15, 2014, at 10:48 AM, Jim Grosbach <grosbach at apple.com> wrote:
>
> > +Nadav
> >
> > Hi Filipe,
> >
> > I like the idea of this transform. Nadav will have a better idea than I
> > about whether this is the right place to go about it.
> >
> > -Jim
> >
> > On Apr 14, 2014, at 7:52 PM, Filipe Cabecinhas <filcab+llvm.phabricator at gmail.com> wrote:
> >
> >> Updated the patch to add range merging, generating fewer insertqi when
> >> possible.
> >>
> >> This also allows us to find more places to do the first opt.
> >>
> >> Hi grosbach,
> >>
> >> http://reviews.llvm.org/D3357
> >>
> >> CHANGE SINCE LAST DIFF
> >> http://reviews.llvm.org/D3357?vs=8482&id=8521#toc
> >>
> >> Files:
> >> lib/Transforms/InstCombine/InstCombineCalls.cpp
> >> test/Transforms/InstCombine/vec_demanded_elts.ll
> >> <D3357.2.patch>
> >
>
>

Hi Nadav,

I'm not as familiar with SelectionDAG, but would it be able to fold N uses
of insertqi easily? Doing this in InstCombine allows us to, for example,
easily fold these into one insertqi:

-------------------------
declare <2 x i64> @llvm.x86.sse4a.insertqi(<2 x i64>, <2 x i64>, i8, i8)

define <2 x i64> @testInsert64v_2(<2 x i64> %v, <2 x i64> %i) {
  %1 = tail call <2 x i64> @llvm.x86.sse4a.insertqi(<2 x i64> %v, <2 x i64> %i, i8 16, i8 0)
  %2 = tail call <2 x i64> @llvm.x86.sse4a.insertqi(<2 x i64> %1, <2 x i64> %i, i8 20, i8 5)
  %3 = tail call <2 x i64> @llvm.x86.sse4a.insertqi(<2 x i64> %2, <2 x i64> %i, i8 32, i8 25)
  %4 = tail call <2 x i64> @llvm.x86.sse4a.insertqi(<2 x i64> %3, <2 x i64> %i, i8 10, i8 52)
  ret <2 x i64> %4
}
-------------------------

If you add 2 to the 10 on the %4 call, you even get them all folded to a
ret %i.
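
To make the range arithmetic behind that folding concrete, here is a minimal
standalone sketch. It is not the code from the patch: it assumes each call can
be modelled as covering the bit range [index, index + length) and that
consecutive calls reading from the same vector merge whenever their ranges
overlap or touch. The names BitRange, mergeChain and coversLow64 are
hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one insertqi call. The fields follow the intrinsic's
// operand order (length, then index) and describe bits [index, index + length).
struct BitRange {
  unsigned length;
  unsigned index;
  unsigned end() const { return index + length; }
};

// Tries to collapse a chain of inserts, listed in the order they are applied
// and all reading from the same source vector, into one range.
// Returns false if the chain is empty or if some step leaves a gap.
bool mergeChain(const std::vector<BitRange> &chain, BitRange &merged) {
  if (chain.empty())
    return false;
  BitRange acc = chain.front();
  assert(acc.end() <= 64 && "range must fit in the low 64 bits");
  for (std::size_t i = 1; i < chain.size(); ++i) {
    const BitRange &next = chain[i];
    assert(next.end() <= 64 && "range must fit in the low 64 bits");
    // Only overlapping or adjacent ranges are merged in this model.
    if (next.index > acc.end() || acc.index > next.end())
      return false;
    unsigned start = std::min(acc.index, next.index);
    unsigned stop = std::max(acc.end(), next.end());
    acc = BitRange{stop - start, start};
  }
  merged = acc;
  return true;
}

// True when the merged range spans all of the lower 64 bits.
bool coversLow64(const BitRange &r) { return r.index == 0 && r.length == 64; }
```

Under this model the four calls above, written as (length, index) pairs
(16, 0), (20, 5), (32, 25), (10, 52), merge into a single range of length 62
at index 0. Changing the last pair to (12, 52) extends it to length 64 at
index 0, which is the case where the whole lower 64 bits are copied.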

If you still think it's preferable to implement it in SelectionDAG, let me
know and I'll change the patch.

Thanks,

  Filipe