[PATCH] fix PR15296

Michael Liao michael.liao at intel.com
Tue Feb 19 13:56:31 PST 2013


Hi Nadav

Here are the split patches. 0001 and 0002 are just code refactoring to
prepare the fix. 0003 is the fix by moving logic from DAG combining into
DAG lowering.

Test cases are enhanced to run in 32-bit mode and add 4 x i64 vshift
test case.

Yours
- Michael

On Tue, 2013-02-19 at 13:13 -0800, Nadav Rotem wrote:
> +  EVT ShAmtVT = ShAmt.getValueType();
> +  // The shift amount is an i32.
> +  if (ShAmtVT.bitsGT(MVT::i32))
> +    ShAmt = DAG.getNode(ISD::TRUNCATE, dl, MVT::i32, ShAmt);
> +  else if (ShAmtVT.bitsLT(MVT::i32))
> +    ShAmt = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, ShAmt);
> 
> 
> You can use  DAG.getZExtOrTrunc().
> 
> 
> Thanks,
> Nadav
> 
> 
> 
> On Feb 19, 2013, at 1:08 PM, Craig Topper <craig.topper at gmail.com>
> wrote:
> 
> > I have attached a patch that fixes the 256-bit problem in lowering
> > without this much change. It leaves everything else in DAG combine.
> > Not sure if I have a 32-bit mode problem or not though.
> > 
> > On Tue, Feb 19, 2013 at 11:21 AM, Michael Liao
> > <michael.liao at intel.com> wrote:
> >         As explained in the log message, the root cause of
> >         inefficient code
> >         generation for 256-bit shift on AVX is that the necessary
> >         preparation of
> >         256-bit integer shift is split into regular DAG lowering and
> >         DAG
> >         combining in existing code. The lowering will expand 256-bit
> >         into
> >         128-bit ones and hence DAG combining has no chance to
> >         recognize that.
> >         
> >         Another point is that, if a transform could be in lowering,
> >         we should do
> >         that in lowering to minimize DAG combination overhead.
> >         
> >         Yours
> >         - Michael
> >         
> >         On Tue, 2013-02-19 at 11:16 -0800, Craig Topper wrote:
> >         > We shouldn't need to move this much code to fix this bug.
> >         Are you just
> >         > trying to avoid having duplicate/similar code in the DAG
> >         combine and
> >         > in the lowering path?
> >         >
> >         > On Tue, Feb 19, 2013 at 10:33 AM, Michael Liao
> >         > <michael.liao at intel.com> wrote:
> >         >         Hi Nadav
> >         >
> >         >         The issue for me to split this patch is that it
> >         will break
> >         >         lots of
> >         >         regression tests. The new lowering is moved from
> >         the original
> >         >         shift
> >         >         combining plus a special handling in 32-bit mode.
> >         >
> >         >         please check the latest patch following your other
> >         suggestion.
> >         >         The patch
> >         >         is revised to handle the canonicalized form of
> >         broadcast.
> >         >         Could you
> >         >         point me to more documents on the canonical forms
> >         of some
> >         >         useful
> >         >         patterns, such as broadcast.
> >         >
> >         >         BTW, the current code style suggests 'funcName'.
> >         In that file,
> >         >         there're
> >         >         many existing functions still following the
> >         previous one but
> >         >         there're
> >         >         some following the new one. Do you need change to
> >         new one? I
> >         >         could take
> >         >         that if needed.
> >         >
> >         >         Yours
> >         >         - Michael
> >         >
> >         >         On Tue, 2013-02-19 at 09:42 -0800, Nadav Rotem
> >         wrote:
> >         >         > Hi Michael,
> >         >         >
> >         >         > Thanks for working on this.  In this patch you
> >         moved a lot
> >         >         of code around. Can you split it into two patches:
> >         refactoring
> >         >         and the new lowering ?
> >         >         >
> >         >         > +static SDValue
> >         lowerScalarImmediateShift(SDValue Op,
> >         >         SelectionDAG &DAG,
> >         >         >
> >         >         > Please match the coding style of the existing
> >         code. In this
> >         >         file most static functions start with an upper
> >         case (for
> >         >         example PromoteSplat, LowerBuildVector, etc).
> >         >         >
> >         >         > +    // Check remaiing shift amounts.
> >         >         > +    for (unsigned i = Ratio; i !=
> >         Amt.getNumOperands(); i
> >         >         += Ratio) {
> >         >         >
> >         >         > Typo.
> >         >         >
> >         >         > +  %smear.0 = insertelement <8 x i32> undef, i32
> >         %shiftval,
> >         >         i32 0
> >         >         > +  %smear.1 = insertelement <8 x i32> %smear.0,
> >         i32 %
> >         >         shiftval, i32 1
> >         >         > +  %smear.2 = insertelement <8 x i32> %smear.1,
> >         i32 %
> >         >         shiftval, i32 2
> >         >         > +  %smear.3 = insertelement <8 x i32> %smear.2,
> >         i32 %
> >         >         shiftval, i32 3
> >         >         > +  %smear.4 = insertelement <8 x i32> %smear.3,
> >         i32 %
> >         >         shiftval, i32 4
> >         >         > +  %smear.5 = insertelement <8 x i32> %smear.4,
> >         i32 %
> >         >         shiftval, i32 5
> >         >         > +  %smear.6 = insertelement <8 x i32> %smear.5,
> >         i32 %
> >         >         shiftval, i32 6
> >         >         > +  %smear.7 = insertelement <8 x i32> %smear.6,
> >         i32 %
> >         >         shiftval, i32 7
> >         >         > +  %bitop = lshr <8 x i32> %input, %smear.7
> >         >         > +  ret <8 x i32> %bitop
> >         >         >
> >         >         > The canonicalized form of broadcast is insert
> >         +shuffle.
> >         >         >
> >         >         > Thanks,
> >         >         > Nadav
> >         >         >
> >         >         > On Feb 19, 2013, at 12:32 AM, Michael Liao
> >         >         <michael.liao at intel.com> wrote:
> >         >         >
> >         >         > > Hi All,
> >         >         > >
> >         >         > > This patch fixes PR15296 by lowering
> >         SRA/SRL/SHL into
> >         >         PSRA/PSRL/PSHL
> >         >         > > during DAG lowering instead of DAG combining,
> >         because
> >         >         256-bit integer
> >         >         > > vector will be expanded in lowering where DAG
> >         combining
> >         >         will lose the
> >         >         > > chance to transform it.
> >         >         > >
> >         >         > > Thanks for your review
> >         >         > > - Michael
> >         >         > >
> >         >         > >
> >         >
> >         <0001-Fix-PR15296.patch>_______________________________________________
> >         >         > > llvm-commits mailing list
> >         >         > > llvm-commits at cs.uiuc.edu
> >         >         > >
> >         http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >         >         >
> >         >
> >         >
> >         >
> >         >         _______________________________________________
> >         >         llvm-commits mailing list
> >         >         llvm-commits at cs.uiuc.edu
> >         >
> >         http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >         >
> >         >
> >         >
> >         >
> >         >
> >         > --
> >         > ~Craig
> >         
> >         
> >         
> > 
> > 
> > 
> > 
> > -- 
> > ~Craig <vshift.patch>
> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Move-length-scalar-immediate-shift-lowering-into-a-d.patch
Type: text/x-patch
Size: 1972 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130219/1637e752/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Mark-all-variable-shifts-needing-customizing.patch
Type: text/x-patch
Size: 4817 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130219/1637e752/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Fix-PR15296.patch
Type: text/x-patch
Size: 15549 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130219/1637e752/attachment-0002.bin>


More information about the llvm-commits mailing list