Thanks for the comments. I forgot to "svn add" the test cases; I'll commit them tonight if I have time. I also forgot to make it work for 256-bit FP. I'll add the asserts and try to clean up the loops as well. I think there's an 80-column violation in there too.<br>

<br><div class="gmail_quote">On Wed, Nov 30, 2011 at 1:47 AM, Duncan Sands <span dir="ltr"><<a href="mailto:baldrick@free.fr">baldrick@free.fr</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Hi Craig,<br>

<div class="im"><br>

> Add instruction selection support for AVX2 horizontal add/sub instructions.<br>

<br>

</div>thanks for doing this.  Please add a testcase.<br>

<div class="im"><br>

> --- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)<br>

> +++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Nov 30 03:10:50 2011<br>

> @@ -14329,7 +14329,9 @@<br>

>       return false;<br>

><br>

>     EVT VT = LHS.getValueType();<br>

> -  unsigned N = VT.getVectorNumElements();<br>

> +  unsigned NumElts = VT.getVectorNumElements();<br>

> +  unsigned NumLanes = VT.getSizeInBits()/128;<br>

<br>

</div>Here you assume that the number of bits is always a multiple of 128.  While this<br>

is currently true (and probably always will be), how about adding an assertion<br>

that checks this?<br>

<div class="im"><br>

> +  for (unsigned l = 0; l != NumLanes; ++l) {<br>

> +    unsigned LaneStart = l*NumLaneElts;<br>

> +    for (unsigned i = 0; i != NumLaneElts/2; ++i) {<br>

<br>

</div>Here you assume that NumLaneElts is even, in particular that it is not 1.  Is it<br>

really impossible for it to be equal to 1?  If it is impossible, please add an<br>

assertion that checks that.  If it is possible, please handle it!<br>
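
For illustration, the two assertions asked for so far could look roughly like this.<br>

This is a standalone C++ sketch, not the code in X86ISelLowering.cpp: the names<br>

LaneInfo and computeLaneInfo are hypothetical, and plain integers stand in for the EVT queries.<br>

```cpp
#include <cassert>

// Hypothetical helper for illustration only; not part of the LLVM patch.
struct LaneInfo {
  unsigned NumLanes;     // number of 128-bit lanes in the vector
  unsigned NumLaneElts;  // number of elements in each lane
};

LaneInfo computeLaneInfo(unsigned SizeInBits, unsigned NumElts) {
  // The lane count is SizeInBits/128, so the size must be a nonzero
  // multiple of 128 bits.
  assert(SizeInBits != 0 && SizeInBits % 128 == 0 &&
         "Expected a whole number of 128-bit lanes");
  unsigned NumLanes = SizeInBits / 128;

  assert(NumElts % NumLanes == 0 && "Elements must divide evenly into lanes");
  unsigned NumLaneElts = NumElts / NumLanes;

  // The inner loop runs to NumLaneElts/2, so a lane with an odd element
  // count (in particular a single element) is not handled.
  assert(NumLaneElts != 0 && NumLaneElts % 2 == 0 &&
         "Expected an even, nonzero number of elements per lane");
  return {NumLanes, NumLaneElts};
}
```

With these in place, a single-element lane trips the assertion instead of silently skipping the loop body.<br>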

<div class="im"><br>

> +      unsigned LIdx = LMask[i+LaneStart];<br>

> +      unsigned RIdx = RMask[i+LaneStart];<br>

> +<br>

> +      // Ignore any UNDEF components.<br>

> +      if (LIdx >= 2*NumElts || RIdx >= 2*NumElts ||<br>

</div>> +          (!A.getNode() && (LIdx < NumElts || RIdx < NumElts)) ||<br>

> +          (!B.getNode() && (LIdx >= NumElts || RIdx >= NumElts)))<br>

<div class="im">> +        continue;<br>

> +<br>

> +      // Check that successive elements are being operated on.  If not, this is<br>

> +      // not a horizontal operation.<br>

</div>> +      if (!(LIdx == 2*i + LaneStart && RIdx == 2*i + LaneStart + 1) &&<br>

> +          !(isCommutative && LIdx == 2*i + LaneStart + 1 && RIdx == 2*i + LaneStart))<br>

> +        return false;<br>

> +    }<br>

<br>

This loop and the one below are essentially doing the same thing.  How about<br>

unifying them by introducing an additional loop that iterates over the values<br>

0 and 1:<br>

   for (unsigned h = 0; h != 2; ++h)<br>

<div class="im">     for (unsigned i = 0; i != NumLaneElts/2; ++i) {<br>

</div>       unsigned LIdx = LMask[i+LaneStart+h*(NumLaneElts/2)];<br>

and so on.<br>
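
Spelled out as a standalone sketch (plain C++ over index vectors, not the LLVM code), the<br>

unified check might look like the following. The function name isHorizontalMask, the use of -1<br>

for UNDEF and the EltBits parameter are assumptions made for illustration, and the A/B node<br>

bookkeeping from the patch is left out. That the upper half of each lane reads from the second<br>

operand is also an assumption about the second loop, which is not quoted here.<br>

```cpp
#include <cassert>
#include <vector>

// Hypothetical standalone helper, not the LLVM function. Each mask entry is
// an index into the concatenation of operands A and B (A: 0..NumElts-1,
// B: NumElts..2*NumElts-1); -1 stands for an UNDEF element.
bool isHorizontalMask(const std::vector<int> &LMask,
                      const std::vector<int> &RMask,
                      unsigned EltBits, bool IsCommutative) {
  assert(EltBits != 0 && 128 % EltBits == 0 && "Unexpected element size");
  unsigned NumElts = static_cast<unsigned>(LMask.size());
  assert(RMask.size() == LMask.size() && "Mask sizes differ");

  unsigned NumLaneElts = 128 / EltBits;
  assert(NumElts != 0 && NumElts % NumLaneElts == 0 &&
         "Expected a whole number of 128-bit lanes");
  assert(NumLaneElts % 2 == 0 && "Expected an even number of lane elements");
  unsigned NumLanes = NumElts / NumLaneElts;
  unsigned HalfLaneElts = NumLaneElts / 2;

  for (unsigned l = 0; l != NumLanes; ++l) {
    unsigned LaneStart = l * NumLaneElts;
    // h == 0: lower half of the lane (assumed to read from A).
    // h == 1: upper half of the lane (assumed to read from B).
    for (unsigned h = 0; h != 2; ++h) {
      for (unsigned i = 0; i != HalfLaneElts; ++i) {
        int LIdx = LMask[i + LaneStart + h * HalfLaneElts];
        int RIdx = RMask[i + LaneStart + h * HalfLaneElts];

        // Ignore any UNDEF components.
        if (LIdx < 0 || RIdx < 0)
          continue;

        // Successive elements must be paired; otherwise this is not a
        // horizontal operation.
        int Even = static_cast<int>(2 * i + LaneStart + h * NumElts);
        bool InOrder = LIdx == Even && RIdx == Even + 1;
        bool Swapped = IsCommutative && LIdx == Even + 1 && RIdx == Even;
        if (!InOrder && !Swapped)
          return false;
      }
    }
  }
  return true;
}
```

The h loop changes only which half of the lane is read and which operand the indices are expected to point into, so one loop body covers both halves.<br>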

<br>

Ciao, Duncan.<br>

<br>

PS: I didn't check the correctness of your multi-lane logic.<br>

<div class="HOEnZb"><div class="h5">

</div></div></blockquote></div><br><br clear="all"><br>-- <br>~Craig<br>