I have attached a patch that fixes the 256-bit problem in lowering without this much change. It leaves everything else in DAG combine. Not sure if I have a 32-bit mode problem or not though.<br><br><div class="gmail_quote">

On Tue, Feb 19, 2013 at 11:21 AM, Michael Liao <span dir="ltr"><<a href="mailto:michael.liao@intel.com" target="_blank">michael.liao@intel.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">As explained in the log message, the root cause of inefficient code<br>

generation for 256-bit shift on AVX is that the necessary preparation of<br>

256-bit integer shift is split into regular DAG lowering and DAG<br>

combining in existing code. The lowering will expand 256-bit into<br>

128-bit ones and hence DAG combining has no chance to recognize that.<br>

<br>

Another point is that, if a transform could be in lowering, we should do<br>

that in lowering to minimize DAG combination overhead.<br>

<br>

Yours<br>

<span><font color="#888888">- Michael<br>

</font></span><div><div><br>

On Tue, 2013-02-19 at 11:16 -0800, Craig Topper wrote:<br>

> We shouldn't need to move this much code to fix this bug. Are you just<br>

> trying to avoid having duplicate/similar code in the DAG combine and<br>

> in the lowering path?<br>

><br>

> On Tue, Feb 19, 2013 at 10:33 AM, Michael Liao<br>

> <<a href="mailto:michael.liao@intel.com" target="_blank">michael.liao@intel.com</a>> wrote:<br>

>         Hi Nadav<br>

><br>

>         The issue for me to split this patch is that it will break<br>

>         lots of<br>

>         regression tests. The new lowering is moved from the original<br>

>         shift<br>

>         combining plus a special handling in 32-bit mode.<br>

><br>

>         please check the latest patch following your other suggestion.<br>

>         The patch<br>

>         is revised to handle the canonicalized form of broadcast.<br>

>         Could you<br>

>         point me to more documents on the canonical forms of some<br>

>         useful<br>

>         patterns, such as broadcast.<br>

><br>

>         BTW, the current code style suggests 'funcName'. In that file,<br>

>         there're<br>

>         many existing functions still following the previous one but<br>

>         there're<br>

>         some following the new one. Do you need change to new one? I<br>

>         could take<br>

>         that if needed.<br>

><br>

>         Yours<br>

>         - Michael<br>

><br>

>         On Tue, 2013-02-19 at 09:42 -0800, Nadav Rotem wrote:<br>

>         > Hi Michael,<br>

>         ><br>

>         > Thanks for working on this.  In this patch you moved a lot<br>

>         of code around. Can you split it into two patches: refactoring<br>

>         and the new lowering ?<br>

>         ><br>

>         > +static SDValue lowerScalarImmediateShift(SDValue Op,<br>

>         SelectionDAG &DAG,<br>

>         ><br>

>         > Please match the coding style of the existing code. In this<br>

>         file most static functions start with an upper case (for<br>

>         example PromoteSplat, LowerBuildVector, etc).<br>

>         ><br>

>         > +    // Check remaiing shift amounts.<br>

>         > +    for (unsigned i = Ratio; i != Amt.getNumOperands(); i<br>

>         += Ratio) {<br>

>         ><br>

>         > Typo.<br>

>         ><br>

>         > +  %smear.0 = insertelement <8 x i32> undef, i32 %shiftval,<br>

>         i32 0<br>

>         > +  %smear.1 = insertelement <8 x i32> %smear.0, i32 %<br>

>         shiftval, i32 1<br>

>         > +  %smear.2 = insertelement <8 x i32> %smear.1, i32 %<br>

>         shiftval, i32 2<br>

>         > +  %smear.3 = insertelement <8 x i32> %smear.2, i32 %<br>

>         shiftval, i32 3<br>

>         > +  %smear.4 = insertelement <8 x i32> %smear.3, i32 %<br>

>         shiftval, i32 4<br>

>         > +  %smear.5 = insertelement <8 x i32> %smear.4, i32 %<br>

>         shiftval, i32 5<br>

>         > +  %smear.6 = insertelement <8 x i32> %smear.5, i32 %<br>

>         shiftval, i32 6<br>

>         > +  %smear.7 = insertelement <8 x i32> %smear.6, i32 %<br>

>         shiftval, i32 7<br>

>         > +  %bitop = lshr <8 x i32> %input, %smear.7<br>

>         > +  ret <8 x i32> %bitop<br>

>         ><br>

>         > The canonicalized form of broadcast is insert+shuffle.<br>

>         ><br>

>         > Thanks,<br>

>         > Nadav<br>

>         ><br>

>         > On Feb 19, 2013, at 12:32 AM, Michael Liao<br>

>         <<a href="mailto:michael.liao@intel.com" target="_blank">michael.liao@intel.com</a>> wrote:<br>

>         ><br>

>         > > Hi All,<br>

>         > ><br>

>         > > This patch fixes PR15296 by lowering SRA/SRL/SHL into<br>

>         PSRA/PSRL/PSHL<br>

>         > > during DAG lowering instead of DAG combining, because<br>

>         256-bit integer<br>

>         > > vector will be expanded in lowering where DAG combining<br>

>         will lose the<br>

>         > > chance to transform it.<br>

>         > ><br>

>         > > Thanks for your review<br>

>         > > - Michael<br>

>         > ><br>

>         > ><br>

>         <0001-Fix-PR15296.patch>_______________________________________________<br>

>         > > llvm-commits mailing list<br>

>         > > <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

>         > > <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

>         ><br>

><br>

><br>

><br>

>         _______________________________________________<br>

>         llvm-commits mailing list<br>

>         <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>

>         <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>

><br>

><br>

><br>

><br>

><br>

> --<br>

> ~Craig<br>

<br>

<br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>~Craig