[PATCH] fix PR15296

Tue Feb 19 11:21:21 PST 2013

As explained in the log message, the root cause of inefficient code
generation for 256-bit shift on AVX is that the necessary preparation of
256-bit integer shift is split into regular DAG lowering and DAG
combining in existing code. The lowering will expand 256-bit into
128-bit ones and hence DAG combining has no chance to recognize that.

Another point is that, if a transform could be in lowering, we should do
that in lowering to minimize DAG combination overhead.

Yours
- Michael

On Tue, 2013-02-19 at 11:16 -0800, Craig Topper wrote:
> We shouldn't need to move this much code to fix this bug. Are you just
> trying to avoid having duplicate/similar code in the DAG combine and
> in the lowering path?
> 
> On Tue, Feb 19, 2013 at 10:33 AM, Michael Liao
> <michael.liao at intel.com> wrote:
>         Hi Nadav
>         
>         The issue for me to split this patch is that it will break
>         lots of
>         regression tests. The new lowering is moved from the original
>         shift
>         combining plus a special handling in 32-bit mode.
>         
>         please check the latest patch following your other suggestion.
>         The patch
>         is revised to handle the canonicalized form of broadcast.
>         Could you
>         point me to more documents on the canonical forms of some
>         useful
>         patterns, such as broadcast.
>         
>         BTW, the current code style suggests 'funcName'. In that file,
>         there're
>         many existing functions still following the previous one but
>         there're
>         some following the new one. Do you need change to new one? I
>         could take
>         that if needed.
>         
>         Yours
>         - Michael
>         
>         On Tue, 2013-02-19 at 09:42 -0800, Nadav Rotem wrote:
>         > Hi Michael,
>         >
>         > Thanks for working on this.  In this patch you moved a lot
>         of code around. Can you split it into two patches: refactoring
>         and the new lowering ?
>         >
>         > +static SDValue lowerScalarImmediateShift(SDValue Op,
>         SelectionDAG &DAG,
>         >
>         > Please match the coding style of the existing code. In this
>         file most static functions start with an upper case (for
>         example PromoteSplat, LowerBuildVector, etc).
>         >
>         > +    // Check remaiing shift amounts.
>         > +    for (unsigned i = Ratio; i != Amt.getNumOperands(); i
>         += Ratio) {
>         >
>         > Typo.
>         >
>         > +  %smear.0 = insertelement <8 x i32> undef, i32 %shiftval,
>         i32 0
>         > +  %smear.1 = insertelement <8 x i32> %smear.0, i32 %
>         shiftval, i32 1
>         > +  %smear.2 = insertelement <8 x i32> %smear.1, i32 %
>         shiftval, i32 2
>         > +  %smear.3 = insertelement <8 x i32> %smear.2, i32 %
>         shiftval, i32 3
>         > +  %smear.4 = insertelement <8 x i32> %smear.3, i32 %
>         shiftval, i32 4
>         > +  %smear.5 = insertelement <8 x i32> %smear.4, i32 %
>         shiftval, i32 5
>         > +  %smear.6 = insertelement <8 x i32> %smear.5, i32 %
>         shiftval, i32 6
>         > +  %smear.7 = insertelement <8 x i32> %smear.6, i32 %
>         shiftval, i32 7
>         > +  %bitop = lshr <8 x i32> %input, %smear.7
>         > +  ret <8 x i32> %bitop
>         >
>         > The canonicalized form of broadcast is insert+shuffle.
>         >
>         > Thanks,
>         > Nadav
>         >
>         > On Feb 19, 2013, at 12:32 AM, Michael Liao
>         <michael.liao at intel.com> wrote:
>         >
>         > > Hi All,
>         > >
>         > > This patch fixes PR15296 by lowering SRA/SRL/SHL into
>         PSRA/PSRL/PSHL
>         > > during DAG lowering instead of DAG combining, because
>         256-bit integer
>         > > vector will be expanded in lowering where DAG combining
>         will lose the
>         > > chance to transform it.
>         > >
>         > > Thanks for your review
>         > > - Michael
>         > >
>         > >
>         <0001-Fix-PR15296.patch>_______________________________________________
>         > > llvm-commits mailing list
>         > > llvm-commits at cs.uiuc.edu
>         > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>         >
>         
>         
>         
>         _______________________________________________
>         llvm-commits mailing list
>         llvm-commits at cs.uiuc.edu
>         http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>         
> 
> 
> 
> 
> -- 
> ~Craig